Get all palindromes
Problem Definition:
given a string of length n check for all subsets of this string if the are
palindromes or not.
Problem Tackling Approach:
To solve this problem in a brute force way it will take about O(n^3) ,because it
will be something like this.
Algorithm 1:
For all char ‘a’ belongs to S do
For all char ‘b’ belongs S and after char ‘a’ do
Check_Palindrome(String(a,b))
This algorithm is doomed if it’s part in solving a bigger problem also if n grows large, but this algorithm showed a pattern here let’s look at this example.
Ex1:
Let our string here be “aaabbaaa”, when we apply Algorithm 1 you will notice the pattern when we apply function “Check_Palindromes” first to the string starting at index 0 to index 7 and to string at index 1 to index 6.
The Pattern is we checked if the substring from index 2 to 5 is palindrome twice!.
This pattern tells us we can divide it into small subsets and solve it with straight forward Dynammic Programming with O(n^2).
Algorithm 2:
for ( int spaces = 1; spaces < n; spaces++)
for(int startIndex = 0; startIndex < n && startIndex+spaces < n; startIndex++)
If( String[startIndex] == String[startIndex + spaces] && DP[startIndex+1][spaces-1])
{
DP[startIndex][spaces] = true;
}
Ok, that’s all of it hope you find it challenging.
Problem Reference:
-idea of the problem taken from codechef Octobor 2010 contest.
Introduction to Memory Management on iPhone
Content of this Article:
>Memory Management Model.
>How to allocate, retain and release Objects in Memory.
>Using nil to safe release Objects.
>References.
Memory Management Model in iOS dosen’t have a Garbage Collector inorder to increase the performance of the language on the mobile platform which is constrained to small computational power. Forthat it uses Reference Counting the idea of Reference Couting is very simple every object you define in the memory have a number that counts how many items reference this object, When it reaches zero the object is deallocated from the memory since no other object needs it.
Let’s take it to the Next Step Memory Allocation, to allocate an object we will use the keyword “alloc” and then call “init” to initialize the object, for Example:
NSString *string_1 = [[NSString alloc] init];
Now the reference count of the object is “1”, since alloc will allocate a memory block and makes it’s reference count to “1”. if another object referenced the same object the reference count wil increase by “1” (current reference count = 2) :
NSString *string_2 = [string_1 retain];
But after doing all these alloc and retain operations we are going to be faced with a huge problem which is Memory Leaks, objects tha aren’t needed or out of our scope will resides in the memory and won’t be garbage collected, Why ? Because there reference count isn’t “0”. Calling “release” every time on an object will decrease the reference count of the object by one.
[string_1 release]; //now reference count is “1”.
[string_1 release]; //now reference count is “0”, and the object will be deallocated.
since “nil” can be retained or released, by using this great advantage you can define a setter for your variable which will realese your object and assign it to nil. Why is that even useful ? It’s very helpful to not fail in an invalid pointer operation, for example:
-(void) setStr:(NSString *) input{
[input retain];
[myStr release];
myStr = input;
}
-(void) callFunction:(id)sender {
if(myStr)
//do something …..
else
//pointer assigned to null can't do anything …..
}
-(void) dealloc {
[self setStr:nil];
[super release];
}
References:
- Memory Management Basics Tutorial Video by Mark
Linear Programming Model and it’s Applications
> Contents of the Article:
- What is Linear Programming ?.
- Applications of Linear Programming.
- An Example.
- Geometric Representation.
- Refrences.
What is Linear Programming (LP) ? it is a mathematical method for determining a way to achieve the best outcome (such as maximum profit or lowest cost) in a given mathematical model for some list of requirements represented as linear relationships.
where x represents the vector of variables (to be determined), c and b are vectors of (known) coefficients and A is a (known) matrix of coefficients. The expression to be maximized or minimized is called the objective function (cTx in this case).
Applications of Linear Programming arose as a mathematical model developed during World War II to plan expenditures and returns in order to reduce costs to the army and increase losses to the enemy. It was kept secret until 1947. Postwar, many industries found its use in their daily planning.
The founders of the subject are Leonid Kantorovich, a Russian mathematician who developed linear programming problems in 1939, George B. Dantzig, who published the simplex method in 1947, and John von Neumann, who developed the theory of the duality in the same year.
Application in AI programming and in particular in Planning like LPSAT wich is used in Resource Planing. It is essentially a propositional conjunctive normal form(CNF) that includes linear (in)equalities as propositions. When SAT solver returns the propositions that are satisfiable the linear programming solver is used to get the value of the arithmetic (in)equalities.
An Example taken from Introduction to Algorithms named “A political problem”.
Problem: you are a politician trying to win the election. You have three districts Urban, Sub-Urban, Rureal. Each contains 100, 200, 50, registered vote. To win the elections your primary issues in your program are “build-roads”, “gun-control”, “farm-subsidies” and “gasoline-tax”.
Your researchers gave you a table for the effects of paying 1$ on each issue, how many votes you will win and how many will you lose for each urban.
| Policy | urban | suburban | rural |
| build-roads | -2 | 5 | 3 |
| gun-control | 8 | 2 | -5 |
| farm-subsidies | 0 | 0 | 10 |
| gasoline-tax | 10 | 0 | -2 |
So inorder to win the election you decided that you need at least 50 votes, 100 votes, 25 votes respectively.
Solution Approach: let’s model each inequality
x1: amounts of dollars spent on “build-roads”
x2: amounts of dollars spent on “gun-control”
x3: amounts of dollars spent on “farm-subsidies”
x4: amounts of dollars spent on “gasoline-tax”
minimize x1 + x2 + x3 + x4
subject to
-2 * x1 + 8 * x2 + 0 * x3 + 10 * x4 >= 50 ( 1 )
5 * x1 + 2 * x2 + 0 * x3 + 0 * x4 >= 100 ( 2 )
3 * x1 + -5 * x2 +10 * x3 + -2 * x4 >= 25 ( 3 )
x1, x2, x3, x4 >= 0 ( 4 )
Geometric Representation is very important to understand linear programming more in depth, but inorder to be easy on geometric representation we will define another example with only two variables.
maximize x1 + x2
subject to
5 * x1 – 2 * x2 >= -2 ( 1 )
2 * x1 + x2 <= 10 ( 2 )
4 * x1 – x2 <= 8 ( 3 )
x1, x2 >= 0 ( 4 )
The Feasible area of solution is defined as the set of points that solve the 5 equations defined(shaded region).
References
Introduction to Algorithms(CLRS), 2nd edition (Chapter 29).
Wikipedia.
Introducing – “On-line Planning for Resource Production in RTS”
On-line Planning for Resource Production in Real-Time Strategy Games
Hei Chan, Alan Fern, Soumya Ray, Nick Wilson and Chris Ventura
School of Electrical Engineering and Computer Science
Oregon State University
Corvallis, OR 97330
{chanhe,afern,sray,wilsonic,ventura}@eecs.oregonstate.edu
Goal :
Develop action-selection mechanism in producing certain amount of resources as fast as possible.
Planner :
Computationally efficient “action-selection” mechanism which at each epoch creates a possibly Sub-Optimal concurrent plan from the current state to the goal, then begin executing the set of initial actions.
How it’s done :
formed via a combination of MEA(means-ends-analysis) and Scheduling and Bounded Search over sub-goals that aren’t required for goal achievements but may improve the make span.
Two Key problem domains :
-Resource production and tactical battles.
In Resource Production : player has to produce ( or gather ) varies raw materials, buildings, civilian and military Units to improve their economic and military power.
Tactical Battles : a player uses military units to gain territory and defeat enemy Units ( offensive or defensive actions ).
“Winning the Resource Production race is often a key factor in overall success”.
Uses :
1- either in computer opponent A.I.
2- Human can use it to specify resources needed and the Planner will get the best way to get this “RTS resource production is interesting from a pure A.I. Planning perspective as it encompasses a number of challenging issues”.
First, resource production involves temporal actions with numeric effects.
Second, performing well in this task requires highly concurrent activity.
Third, real-time constraints of the problem require action selection be computational efficient in a practical sense.
Why? :
Most existing planners are :
1- not handling temporal and numerical domains.
2- simply too inefficient to be useful.
3- produce highly Sub-Optimal plans.
The planning component used by online planner is based on a combination of means-ends analysis (MEA) and scheduling.
Given an initial state and resource goal, is used to compute a sequential plan that reaches the goal using the minimum number of actions and resources in sense that any valid plan must include all of the actions in the MEA plan.
Importantly, the special structure of our domain guarantees that MEA will produce such a plan and do so efficiently (linear time in the plan length).
Given the minimum sequential plan, we then reschedule those actions, allowing for concurrency, in an attempt to minimize the make span. This scheduling step is computationally hard, however, we have found that simple worst-case quadratic time heuristic methods work quite well. Thus both the MEA step and scheduling step are both low-order polynomial operations in the minimum number of actions required to achieve the goal, allowing for real-time efficiency.
Refrences :
MEA( means-ends analysis) : http://en.wikipedia.org/wiki/Means-ends_analysis
OCBP – “Introducing Darmok” part4
On-Line Plan Expansion and Execution
During execution, the plan expansion, plan execution and plan adaptation occur in parallel in order to maintain a current partial plan tree that the system is executing. A partial plan tree (or simply the “plan”) is represented as a tree consisting of two types of nodes: goals and snippets.
Initially, the plan consists of a single goal corresponding to the planning task at hand. In particular, in Darmok the initial goal is always “win the game”. Then, the plan expansion module asks the plan retrieval module to retrieve a snippet for that goal. That snippet might have several subgoals, for which the plan expansion module will again ask the plan retrieval module to retrieve snippets, and so on. For instance, at the top of Figure 8 we can see a sample plan, where the top goal is to “win”. The snippet assigned to the “win” goal has three subgoals, namely “build base”, “build army” and “attack”.
Figure 9 shows the algorithm the plan expansion module executes at every execution cycle given a plan p and a game state S. The plan expansion module is constantly querying the current plan to see if there is a ready open goal. When this happens, the open goal is sent to the plan retrieval module to retrieve a snippet for it. Then, that snippet is sent to the plan adaptation module, and then inserted in the current plan, marked as pending.
Function PlanExpansion(p, S)
G = GetReadyOpenGoals(p)
For g ∈ G Do
pg = RetrievePlan(g, S)
pg = AdaptPlan(pg , g, S)
p = InsertSnippetInPlan(p, g, pg )
EndFor
Return p
EndFunction
Figure 9: Plan expansion algorithm used in Darmok, where p is the current plan
Function PlanExecution(p, S)
p = startReadySnippets(p,S)
p = sendActionsToExecution(p,S)
p = updateSnippetStatus(p,S)
p = updateActionStatus(p,S)
Return p
EndFunction
Figure 10: Plan execution algorithm used in Darmok, where p is the current plan and S is the current game state.
Figure 10 shows the algorithm that the plan execution module executes at each execution cycle, composed of four steps:
• startReadySnippets: For each pending snippet, the execution module evaluates the preconditions, and as soon as they are met, the snippet starts its execution. If the current game state has changed since the time the and S is the current game state. plan retrieval module retrieved it, the snippet is handed back to the plan adaptation module to make sure that the plan is adequate for the current game state.
• sendActionsToExecution: If any of the executing snippets have any basic actions, and those actions have all its preconditions satisfied, then they are sent to Wargus to be executed. If the preconditions of the actions are not satisfied, the snippet is sent back to the plan adaptation module to see if the plan can be repaired. If after a certain amount of time t1 (set to 2000 game cycles in our experiments) it cannot, then the snippet is marked as failed, and thus its corresponding goal is open again (thus, the system will have to find another plan for that goal).
• updateSnippetStatus: The execution module periodically evaluates the alive conditions and success conditions of each snippet. If the alive conditions of an executing snippet are not satisfied, the snippet is marked as failed, and its goal is open again. If the success conditions of a snippet are satisfied, the snippet is marked as succeeded.
• updateActionStatus: Whenever a basic action succeeds or fails, the execution module updates the status of the snippet that contained it. When a basic action succeeds, the executing snippet can continue to the next step. When a basic action fails, the snippet is marked as failed, and thus its corresponding goal is open again.
OCBP – “Introducing Darmok” part3
Plan Retrieval
To solve a complex planning task, several subproblems have to be solved. For instance, in our domain, the system has to solve problems such as how to build a proper base, how to gather the necessary resources, or how to destroy each of the units of the enemy.
The propose to organize the case base using two kinds of elements:
• Snippets: a snippet is a procedure composed of a collection of actions, and subgoals composed in sequence or parallel.
• Episodes: an episode is a tuple e = ( p, G, S, O ) where e.p is a snippet, e.G is a goal, e.S is a situation (i.e. a game state), and e.O is the outcome of applying e.p in e.S to achieve e.G. In particular e.O is a real number between 0 and 1 representing how well the snippet achieved its goal in the situation e.S.
Figure. 7 Figure 7 shows an example of a snippet with an associated episode, with the four elements: a goal (in this case, building the resource infrastructure of player “1”) a procedure to achieve the specified goal in the given map in the snippet, a game state (with general features about the map and information about each players) and the outcome for the episode. In particular, in Darmok, the game state is defined by 35 features that represent the state of a Wargus game. Twelve features represent the number of troops (number of fighters, number of peasants, and so on), four represent the resources the player owns (gold, oil, wood and food), fourteen represent the number of the buildings (number of town halls, number of barracks, and so on) and finally, five features represent the map (size in both dimensions, percentage of water, percentage of trees, number of gold mines).
Given a game state and a goal to retrieve the snippet which could have the highest performance in such a goal in the given game state. To predict the performance of a snippet p, the plan retrieval module uses the episodes in the case base associated with p. To retrieve episodes, Darmok uses the episode relevance measure ER(e, S, G) that computes the relevance of a given episode e given the current game state S and goal G, and is defined as:
where GS is the goal similarity of the goal in the snippet p and the goal G, and SS is the state similarity. α is a parameter that has been set to 0.75 in experiments. The distance between two goals g1 = name1 (p1 , …, pn ) and g2 = name2 (q1 , …, qm ) is measured as follows:
where Pi is the maximum value that the parameter i of a goal might take (we assume that all the parameters have positive values). Thus, when name1 = name2 , the two goals will always have the same number of parameters and the distance can be computed using an Euclidean distance among the parameters. The distance is maximum (1) otherwise.
SS(gs1 , gs2 ) computes the similarity between two given game states, and returns a number between 0 and 1 (where 0 means identical, and 1 means totally different). SS is computed as the inverse of a simple Euclidean distance among the game states, where each feature is normalized between 0 and 1.
Finally, to predict the performance of a snippet, we define the set of relevant episodes RE(p, S, G) = {e1 , …, ek }, as the set of k episodes associated with the snippet p, and that have the maximum relevance ER. In our experiments, Darnok have set k ≤ 5, i.e. if there is fewer or equal to 5 episodes associated with p in the case base, then RE(p, S, G) contains all of them, but if there is more than 5, then only the best 5 are included in RE(p, S, G). Using that definition we can now define the predicted performance of a snippet as:
The predicted performance is a weighted average of the outcomes that snippet p has had in similar game states to S for similar goals. We add 1 to the numerator and 2 to the denominator following the Laplace probability estimation rule (which biases the predicted performance towards 0.5 when the number of episodes we have to predict the performance is small.) The result of the retrieval process is the snippet p that has the highest predicted performance P P (p, S, G), i.e. the snippet that has had the best performance in the past for similar goals in a similar game state. The snippet retrieved then needs to go through the adaptation process.
However, real-time domains require delayed adaptation.The game state changes with time and it is important that adaptation is done with the most up to date game state (ideally with the game state just before the snippet starts execution). For that reason, in Darmok, when the plan execution module is just about to start the execution of a particular snippet, the snippet is sent to the plan adaptation module for adaptation. Darmok cannot guarantee that the snippet is adapted with the latest game state since in a real-time domain the domain changes continuously, however, delaying adaptation as much as possible minimizes the adaptation error of Darmok.
OCBP – “Introducing Darmok” part2
Plan Acquisition
The propose to acquire such cases by analyzing game traces. In Darmok, this is done in two processes : trace annotation (revision) and case learning.
1- Trace Annotation(Revision) :
Annotation consists of associating which goals were being pursued by each of the actions executed by the expert. Annotation is needed because if the system was to learn snippets simply by observing a human play, it will need to implement plan recognition techniques in order to identify what the human is intending to do at every moment. Thus, annotations provide a way to avoid complex plan recognition techniques.
In approach, a goal g = name(p1 , …, pn ) consists of a goal name and a set of parameters. For instance, in Wargus, these are some of the goal types defined:
• WinGame(player): representing that the action had the intention of making the player win the game.
• KillUnit(unit): representing that the action had the intention of killing the unit unit.
• Resources(gold, wood, oil): the action had the intention of increasing the resource levels to at least the specified levels in the parameters.
• SetupResourceInfrastructure(player, peasants, farms): indicates that the expert wanted to create a good resource infrastructure for player , that at least included peasants number of peasants and farms number of farms.
The fourth column of Table 1 shows the annotations that the expert specified for his actions. Since the snippet shown corresponds to the beginning of the game, the expert specified that he was trying to create a resource infrastructure and, of course, he was trying to win the game.
2- Case Learning :
The annotated trace is processed by the case learning module, that encodes the strategy of the expert in this particular trace in a series of cases. Traditionally, in the CBR literature cases consist of a problem/solution pair; in Darmok system the case base is composed of two structures: snippets and episodes.
Snippet and Episodes :
A snippet stores just a plan, and an episode stores the outcome of having applied a particular snippet in a particular context to achieve a particular goal.
The case learning module analyzes the annotated trace to determine the temporal relations among the individual goals appearing in the trace. For instance, if we look at the sample annotated trace in Figure 6, we can see that the goal g2 was attempted before the goal g3, and that the goal g3 was attempted in parallel with the goal g4.
In Daemok framework, we want to know if two goals are pursued in sequence, in parallel, or if one is a subgoal of the other. Darmok determines those relations in the following way:
• If most (90%) of the actions associated with a goal g1 happen before the first action of another goal g2 , then g1 and g2 are considered to happen in sequence.
• If all the actions associated with a goal g1 are also associated with another goal g2 and the goal g2 has some action not associated with g1, then g1 is considered to be a subgoal of g2.
• Otherwise, two goals are considered to be in parallel.
The cases learned consist only of the procedural information in the snippets and a goal, game state and outcome in the episodes, as Figure 7 shows.
OCBP – “Introducing Darmok”
Reference Paper That Implemented Darmok System.
Darmok
Using designed OLCBP cycle has been designed with domains such as RTS games in mind. This approach has been implemented in the Darmok system, designed to play the full Wargus game. The only aspect of Wargus still not covered by Darmok is the “fog-of-war”, that has been disabled in during experiments.
The Darmok system learns how to play Wargus by observing how humans play. Darmok learns what we call plan snippets by observing a human play, and stores those snippets in the system in the form of cases. Such snippets are then retrieved and composed together to form plans.
Learning:
During learning Darmok observes a game trace to learn plan snippets that will be stored in the case base. In experiments, an expert plays Wargus to generate a trace. However, notice that the system could learn from any trace, even from traces of itself playing, or observing other systems play. Then, the trace is annotated by the expert, explaining the goals he was pursuing with the actions he took while playing. Using those annotations, a set of snippets are extracted from the trace and stored as a set of cases. For each snippet, the situation in which it was executed, the goal it was pursuing, and its success or failure are stored in the case base.
Execution:
Plan Retrieval, Plan Adaptation, Plan Expansion and Plan Execution are in charge of maintaining a current plan to win the game. The Plan Execution module is in charge of executing the current plan, and update its state (e.g., marking which actions succeeded or failed). The Plan Expansion module is in charge of identifying open goals in the current plan and expand them. In order to do that it relies on the Plan Retrieval module, which given an open goal and the current game state retrieves the most appropriate plan snippet to fulfill the open goal. Finally, we have the Plan Adaptation module in charge of adapting the retrieved snippets according to the current game state.
Darmok requires:
1- a set of goals.
2- a set of primitive actions.
3- a vocabulary for conditions.
4- a set of features to represent the game state (used for plan retrieval).
5- a set of annotated expert traces.
6- and (as to be explained later) a set of rules to help the system perform precondition-postcondition matching.
Plan Representation in Darmok
The plan representation formalism used by Darmok, designed to allow a system to learn plans, represent them, and to reason about them and their intended and actual effects.The basic constituent piece is the snippet. Snippets are composed of three elements:
• A set of preconditions that must be satisfied before the plan can be executed. For instance, a snippet can have as preconditions that a particular peasant exists and that a desired location is empty.
• A set of alive conditions that represent the conditions that must be satisfied during the execution of the plan for it to have chances of success (also known as “maintenance goals” in the planning literature). If at some moment during the execution, the alive conditions are not met, the plan can be stopped, since it will not achieve its intended goal. For instance, the peasant in charge of building a building must remain alive; if he is killed, the building will not be built.
• The plan itself. which can contain the following constructs: sequence, parallel, action, and subgoal, where an action represents the execution of a basic action in the domain of application (a set of basic actions must be defined for each domain), and a subgoal means that the execution engine must find another snippet that has to be executed to satisfy that particular subgoal.
Also, snippets are associated with goals. A goal is a representation of the intended goal of the snippet.
For every domain, an ontology of possible goals has to be defined. For instance, a snippet might have the goal of “having a tower”.
Notice that unlike classical planning approaches, postconditions cannot be specified for snippets, since a snippet is not guaranteed to succeed. Thus, we can only specify the goal a snippet pursues, i.e., its success conditions.
The Difference between a postcondition and a success condition :
A postcondition is a condition that we can ensure is going to be true after the execution of a snippet (or an action), while a success condition is a condition that when satisfied we can consider the snippet (or action) to have completed.
For example, a side effect of an action is a postcondition but not a success condition: “enemy killed” is a success condition of an attack, but not a postcondition since we cannot ensure that after the attack is done the enemy would be killed( because the domain is non-deterministic). Our use of success conditions instead of postconditions defines an abstraction over the notion of a nondeterministic action in planning handled by interleaving planning and execution.
Specifically, three things need to be defined for using Darmok in a particular domain:
• A set of basic actions that can be used in the domain. For instance, in Wargus we define actions such as move, attack, or build. For uniformity, in Darmok actions are treated as standard snippets, and thus have a goal, preconditions and alive conditions (so that the system can reason about them too).
• A set of sensors, that are used to obtain information about the current state of the world, and are used to specify the preconditions, alive conditions and goals of snippets. For instance, in Wargus we might define sensors such as numberOfTroops, or unitExists. A sensor might return any of the standard basic data types, such as boolean or integer.
• A set of goals. Goals can be structured in a specialization hierarchy in order to specify the relations among them.
A goal might have parameters, and for each goal a set of success conditions is defined. For instance, HaveUnits(TOWER) is a valid goal in our gaming domain and it has as success condition: UnitExists(TOWER). Therefore, the goal definition can be used by the system to reason about the intended result of a snippet, while the success conditions are used by the execution engine to verify whether a particular snippet succeeds at run time.
On-line Case-Based Planning
This Paper is Made By (i will only try to summarize it for a general Reader):
Santi Ontanon and Kinshuk Mishra and Neha Sugandh and Ashwin Ram
CCL, Cognitive Computing Lab,
Georgia Institute of Technology,
Atlanta, GA 30322/0280, USA
Abstract
Some domains, such as real-time strategy (RTS) games, pose several challenges to traditional planning and machine learning techniques. In this paper, a novel on-line case-based planning architecture that addresses some of these problems. The architecture addresses issues of plan acquisition, on-line plan execution, interleaved planning and execution and on-line plan adaptation. It also introduces the Darmok system, which implements this architecture in order to play Wargus (an open source clone of the well-known RTS game Warcraft II). Then presents empirical evaluation of the performance of Darmok and show that it successfully learns to play the Wargus game.
There are four refinements with respect to the original CBR cycle:
1. Problems “enter the cycle” through the Expansion process. When a new problems arrives to the system, this problem is set as the current plan, i.e. the current plan consists of a single open problem. Thus, the first thing the Expansion module will do is to send this problem to the retrieve process.
2. The CBR cycle is divided in two parts: a first part composed of retrieval and adaptation (or reuse), and a second part composed of revision and retention. The first part is in charge of finding solutions to new problems (i.e. it corresponds to the problem solving capability of the CBR cycle), and the second part is responsible for learning from experience. Revision takes as input the outcome of executing the solutions provided by the system and revises them to verify they achieve their goals. Revised solutions are handed to retention, that will decide whether to retain new cases or not, and update any internal indexes or similarity metrics when required.
3. The new cycle incorporates the world in it’s design, since it is an important part of any real-time problem solving process.
4. The new cycle features a new delayed adaptation cycle (notice that there is a loop between adaptation and expansion). In a domain where the domain changes dynamically, we want to delay adaptation till the last moment to ensure that plans are adapted with the latest information. Thus, the expansion component may send back plans to adaptation if the environment changed too much since the last time the plan was adapted. Moreover, notice that for real-time domains it is important for adaptation echniques to be efficient.
Expansion:
1- Takes the current solution proposed by the system for a problem (i.e. the current plan), and tries to find open sub-problems (sub-goals) in it. If there are any, these sub-problems are sent to the retrieve process so that they can be solved.
2- Also monitor the world state for changes, and send plans to the adaptation module again in case the world state changes enough so that plans have to be changed. We call this a delayed adaptation, since adaptation is delayed and performed at run-time with the latest game state.
Execution:
This process is in charge of executing the current plan and updating its status according to the result of execution. If a particular step in the plan fails when executed, and that causes a particular sub- problem to fail, then the execution process will update the current plan to reflect this. When that happens the expansion module will be responsible to find an alternative plan for such a sub-problem.
Why Planing in RTS(Real Time Strategy) are Hard ?
Real-time strategy (RTS) games have several characteristics that make the application of traditional planning approaches difficult:
• They have huge decision space (i.e. the set of different actions that can be executed in any given state is huge).
• Huge state space (the combination of the previous bullet and this bullet makes them not suitable for search based AI techniques ( Minimax, Alpha-beta pruning )
• They are non-deterministic.
• They are incomplete information games, where the player can only sense the part of the map he has explored and include unpredictable opponents.
• They are real-time. Thus, while the system is deciding which actions to execute, the game continues executing and the game state changes constantly. They are difficult to represent using classical planning formalisms since postconditions for actions cannot be specified easily.
Structure of CBP part4- “Plan Anticipator and Final Model”
Problem anticipation
This is so the prediction of a problem can be used to find the plans in memory that avoid it. The whole point of an ANTICIPATOR is to provide information about problems that have to be avoided, information that will be used by a RETRIEVER to find a plan that does so, so it makes sense that an ANTICIPATOR has to be called prior to any search for plans . To anticipate a problem on the basis of surface features, the ANTICIPATOR needs the base of information built by the ASSIGNER.
Let us again consider the example of the traveler. The task of the ANTICIPATOR is to take the features of the situation that the planner is dealing with, and recall the problems that have arisen in past situations to handle them. By being reminded of the problems caused by waiting for luggage, a planner can either find a past plan that deals with the problem, or alter another plan in response to the prediction. The problem can be avoided because it can be predicted.
There is an important relationship between plans stored in memory and the tasks of the ANTICIPATOR and the RETRIEVER. Once a problem plan has been repaired, it is stored in memory, indexed by the fact that it deals with a particular problem. At this point, however, there is no way for the planner to look at a new situation and predict that it will have to avoid that problem.
So there is no way for it to find the plan in situations where the problem will come up again. Because of this, it will continue to make the same mistake because it does not know that the planning approach it used before in a similar situation led to problems. If it can figure out the causes of a failure, however, it can use this information to anticipate the problem in similar situations and look for a plan that avoids it. The ASSIGNER figures out the causes of problem.
The ANTICIPATOR then uses the information built up by the ASSIGNER to predict the problem again when similar causes are present. The ANTICIPATOR notices that a problem is going to arise and then tells the RETRIEVER to find a plan that avoids it.
The final package
The basic case-based planner that grows out of the need to reuse plans and adapt them for new goals functions as follows: A set of goals is handed to the planner and sent directly to the ANTICIPATOR.
The ANTICIPATOR, based on the knowledge built up by the ASSIGNER, makes any predictions of planning problems that it thinks will arise out of the current goals. The goals are then handed to the RETRIEVER along with the ANTICIPATOR’s predictions of problems that have to be avoided. The RETRIEVER uses both to search for a plan in memory that best satisfies the goals it is trying to achieve and that avoids any problems that have been anticipated. The result is a past plan that matches some or all of the goals now being planned for.
This plan is sent to the MODIFIER, which adds or substitutes new steps to the plan in order to make it satisfy any of the planner’s goals that it does not yet achieve. Once modified, the plan is run and the results checked against the goals of the planner. If there is a failure, either because a desired goal has not been achieved or because an undesired state has resulted from the plan, the plan is given to the REPAIRER.
The REPAIRER builds a characterization of the failure and uses this to find and apply one of its repair methods. The repaired plan, along with a description of the problems that had to be solved along the way, are then sent to the STORER for placement into memory. The STORER indexes the new plan by the goals it achieves and the problems it avoids. Now the plan can be used again in similar circumstances in the future.
While the REPAIRER is repairing the plan, the ASSIGNER is deciding which features in the original request, that is, which goals, interacted to cause the failure to occur. Once it has done this, it marks these features as predictive of the problem so that the ANTICIPATOR can anticipate the problem if it encounters the goals in a later input.
This planner does two things as it builds a plan: it tries to satisfy a set of goals using its model of what plans are appropriate for different situations and at the same time it is also tests that model of the appropriateness of those plans against the real world, so that later planning will be easier and more reliable.
To serve its first function, the planner has to react to failures by repairing the present plan. To serve its second function, it has to alter its view of the world by adding new plans indexed by the problems they solve and by altering its predictions so that it can anticipate those problems and find the plans that avoid them.
Reference :
Case-Based Planning – A Framework for planning from experience – 1994








