Wednesday, 12 October 2016

The AI Dictionary

A
"ako" signifies "a kind of". It acts as a relation between two types of things, and specifies that one is a subset of the other. For example, chair ako furniture signifies that the concept of chair is a subconcept of the concept of furniture, or to put it another way, the set of all chairs is a subset of the set of all furniture.
"ako" acts like an operator in the iProlog frame implementation. For example, assuming that a furniture frame already existed, chair ako furniture with ... would have the effect of creating a newgeneric frame with all the slots of the furniture generic frame, together with whatever extra slots were specified after the with.
"ako" should be contrasted with isa.
B
Backward chaining is a means of utilizing a set of condition-action rules. In backward chaining, we work back from possible conclusions of the system to the evidence, using the rules backwards. Thus backward chaining behaves in a goal-driven manner.
One needs to know which possible conclusions of the system one wishes to test for. Suppose, for example, in a medical diagnosis expert system, that one wished to know if the data on the patient supported the conclusion that the patient had some particular disease, D.
In backward-chaining, the goal (initially) is to find evidence for disease D. To achieve this, one would search for all rules whose action-part included a conclusion that the patient had disease D. One would then take each such rule and examine, in turn, the condition part of the rule. To support the disease D hypothesis, one has to show that these conditions are true. Thus these conditions now become the goals of the backward-chaining production system. If the conditions are not supported directly by the contents of working memory, we need to find rules whose action-parts include these conditions as their conclusions. And so on, until either we have established a chain of reasoning demonstrating that the patient has disease D, or until we can find no more rules whose action-parts include conditions that are now among our list of goals.
Backward-chaining is to be contrasted with forward chaining.
Best-first search, rather than plunging as deep as possible into the tree (as in depth-first search), or traversing each level of the tree in succession (as in breadth-first search), uses a heuristic to decide at each stage which is the best place to continue the search. Here is a more technical discussion of best-first search.
Breadth-first search is best understood with respect to a tree, though it can be applied to graphs in general, provided a starting node is nominated.
A complete breadth first search traverses every node of the tree or graph, starting from the root node or starting node, first processing, checking, or inspecting the root/starting node. In future we'll just say it "processes" the node. Next it processes the neighbours of the root/starting node (in some order), and then the neighbours of the neighbours, and so on, until all the nodes have been processed.
In the case of a tree, no node will be visited twice - this is a property of trees. In the case of a graph, whether a directed acyclic graph or a general graph, some nodes will be visited twice. On the second and subsequent visits, the node and its neighbours should be ignored. Thus a breadth-first algorithm on such graphs needs to mark each node as visited when it first encounters it, and check each node it comes to process to see if it has already been visited.
For the tree shown below, the order of visiting for a breadth first search would be: A B C D E F G H I J
                    A
                   / \
                  /   \
                 B     C
                / \   / \
               D   E F   G
              /       \   \
             H         I   J
Compare depth first search.
C
A certainty factor is a number, often in the range -1 to +1, which is associated with a condition or an action of a rule. In more detail, each component of a condition may have an certainty factor associated with it - for example if the condition is of the form A and B, then there could be a certainty factor for A and a certainty factor for B.
A certainty factor of 1 means that the fact (or proposition) is highly certain. A certainty factor of 0 means no information about whether the proposition is true or not. A certainty factor of -1 means that the proposition is certainly false. A certainty factor of 0.7 means that the proposition is quite likely to be true, and so on.
The certainty factors of conditions are associated with facts held in working memory. Certainty factors for actions are stored as part of the rules.
Rules for manipulating certainty factors are given in the lecture notes on uncertain reasoning.
However, here is a simple example. Suppose that there is a rule
if P then Q (0.7)
meaning that if P is true, then, with certainty factor 0.7, Q follows. Suppose also that P is stored in working memory with an associated certainty factor of 0.8. Suppose that the rule above fires (see alsomatch-resolve-act cycle). Then Q will be added to working memory with an associated certainty factor of 0.7 * 0.8 = 0.56.
A condition-action rule, also called a production or production rule, is a rule of the form
if condition then action.
The condition may be a compound one using connectives like andor, and not. The action, too, may be compound. The action can affect the value of working memory variables, or take some real world action, or potentially do other things, including stopping the production system.
The knowledge of many expert systems is principally stored in their collections of rules.
See also inference engine.
Conflict resolution in a forward-chaining inference engine decides which of several rules that could be fired (because their condition part matches the contents of working memory should actually be fired. Conflict resolution proceeds by sorting the rules into some order, and then using the rule that is first in that particular ordering. There are quite a number of possible orderings that could be used:
Specificity Ordering
If a rule's condition part is a superset of another, use the first rule since it is more specialised for the current task. For example, you might have two satisfied rules whose condition parts are:
(i) X is a dog and X likes cheese and X chases cars
(ii) X is a dog and X likes cheese
Then (i) is more specific than (ii) so under specificity ordering, you would use (i).
Rule Ordering
Choose the first rule in the text, ordered top-to-bottom.
Data Ordering
Arrange the data in a priority list. Choose the rule that applies to data that have the highest priority.
Size Ordering
Choose the rule that has the largest number of conditions. For example, you might have two satisfied rules whose condition parts are:
(ii) X is a dog and X likes cheese
(iii) X is a dog and X bites postmen and X chases cars and X has rabies
Then (iii) is has larger size (i.e. more conditions) than (ii) so under size ordering, you would use (iii). Note that neither rule is more specific than the other, so this is different from specificity ordering, which can be viewed as a special case of size ordering. Size ordering will work in cases where specificity ordering will not. Conflict resolution by size ordering can of course fail if there is tie for the satisfied rule with the most conditions.
Recency Ordering
The most recently used rule has highest priority. The least recently used rule has highest priority. The most recently used datum has highest priority. The least recently used datum has highest priority.
Context Limiting
Reduce the likelihood of conflict by separating the rules into groups, only some of which are active at any one time. Have a procedure that activates and deactivates groups.
D
In frames, a default (value) is a value to be assigned to a particular slot in an instance frame if a specific value has not be assigned or added to that slot. The default is specified as part of the corresponding generic frame, and thus is inherited by the instance frame when the instance frame is created.
A demon is a facet of a slot in a frame which causes some action to be taken when the frame is accessed in certain types of ways. For example, an if-needed demon is activated or triggered if the value of the slot is required and a value has not yet been stored in the slot, and it should calculate or otherwise obtain a value for the slot, while a range demon is triggered if a new value is added to the slot, to check that the value added is permissible for this particular slot.
Here is a list of the demon types supported by the iProlog frame implementation:
demons are triggered when a new value is put into a slot.
if_removed
demons are triggered when a value is removed from a slot.
if_replaced
is triggered when a slot value is replaced.
if_needed
demons are triggered when there is no value present in an instance frame and a value must be computed from a generic frame.
if_new
is triggered when a new frame is created.
range
is triggered when a new value is added. The value must satisfy the range constraint specified for the slot.
help
is triggered when the range demon is triggered and returns false.
The following are not demons but demon-related slots in a frame.
means that when a value is computed it is stored in the instance frame.
multi_valued
means that the slot may contain more than one value.
For examples, see the lecture notes.
Depth-first search is best understood with respect to a tree, though it can be applied to graphs in general, provided a starting node is nominated.
A complete depth first search traverses every node of the tree or graph, starting from the root node or starting node, first processing, checking, or inspecting the root/starting node. In future we'll just say it "processes" the node. Next it considers (but does not yet process) the neighbours of the root/starting node. The neighbours should be ordered in some way. Suppose that the first neighbour is called F1. Depth-first search proceeds to search first the subtree or subgraph composed of the neighbours of F1. Suppose that the first of F1's neighbours is F2. Depth-first search proceeds by searching the subtree or subgraph composed of the neighbours of F2. And so on, until the bottom of the tree/graph is reached. Thus the algorithm can be expressed recursively as follows (for a tree):
to depthFirstSearch a tree with root R
  if tree is empty
  then % we're finished
  else
     let N1, N2, ..., Nk be the neighbours of R
         depthFirstSearch the subtree with root N1
         depthFirstSearch the subtree with root N2
         ...
         depthFirstSearch the subtree with root Nk

In the case of a tree, no node will be visited twice - this is a property of trees. In the case of a graph, whether a directed acyclic graph or a general graph, some nodes will be visited twice. On the second and subsequent visits, the node and its neighbours should be ignored. Thus a depth-first algorithm on such graphs needs to mark each node as visited when it first encounters it, and check each node it comes to process to see if it has already been visited.
For the tree shown below, the order of first visiting for a depth first search would be: A B D H E C F I G J
                    A
                   / \
                  /   \
                 B     C
                / \   / \
               D   E F   G
              /       \   \
             H         I   J
In the case of binary trees there are 3 common variants of depth-first search called pre-order, in-order, and post-order traversal. The variants distinguish between first visiting a node, and processingthat node, i.e. doing something with the data stored in the node (e.g. printing out the name of the node). There are six possible orders for three objects, of which the three orders commonly used are as follows:
·        process node, visit left subtree, visit right subtree. This variant is called pre-order traversal. In order traversal of the binary tree shown above processes the nodes in the order A B D H E C F I G Jas for simple depth-first search.
·        visit left subtree, process node, visit right subtree. This variant is called in-order traversal. In order traversal of the binary tree shown above processes the nodes in the order H D B E A F I C G J.
·        visit left subtree, visit right subtree, process node. This variant is called post-order traversal. In order traversal of the binary tree shown above processes the nodes in the order H D E B I F J G C A.
Compare breadth first search.
directed acyclic graph
see graph
directed graph
see graph
E
A component of a graph.
An expert system is a computer system intended to perform at the level of a human expert in whatever domain it aspires to deal with. Early expert systems included systems aimed at medical diagnosis. All expert systems must confront the issue of knowledge representation - that is, how the program will encode the knowledge of the human expert.
Knowledge representation methods include production rulesframesripple-down rulessemantic networks. Many or most expert systems are rule-based systems.
F
Facets are the components of a slot in a frame. Most slots would have a value facet and a default facet. Many would also have facets corresponding to various demons such as a range or if-neededdemon - each demon is a facet.
production is said to be ready to fire if its condition part matches the contents of working memory.
Forward chaining is a means of utilizing a set of condition-action rules. In this mode of operation, a rule-based system is data-driven. We use the data to decide which rules can fire, then we fire one of those rules, which may add to the data in working memory and then we repeat this process until we (hopefully) establish a conclusion.
For more details of this method of using rules, see inference engine.
Forward-chaining is to be contrasted with backward chaining.
Frames are a knowledge representation technique. They resemble an extended form of record (as in Pascal and Modula-2) or struct (using C terminology) or class (in Java) in that they have a number ofslots which are like fields in a record or struct, or variable in a class. Unlike a record/struct/class, it is possible to add slots to a frame dynamically (i.e. while the program is executing) and the contents of the slot need not be a simple value. If there is no value present in a slot, the frame system may use a default for frames of that type, or there may be a demon present to help compute a value for the slot. Documentation for frames as implemented in iProlog is available at http://www.cse.unsw.edu.au/~claude/teaching/AI/notes/prolog/Frames/Frames.html. (This link is outside my web space and may break.)
Demons in frames differ from methods in a Java class in that a demon is associated with a particular slot, whereas a Java method is not so linked to a particular variable.
G
frame that serves as a template for building instance frames. For example, a generic frame might describe the "elephant" concept in general, giving defaults for various elephant features (number of legs, ears, presence of trunk and tusks, colour, size, weight, habitat, membership of the class of mammals, etc.), which an instance frame would describe a particular elephant, say "Dumbo", who might have a missing tusk and who would thus have the default for number of tusks overridden by specifically setting number of tusks to 1. Instance frames are said to inherit their slots from the generic frame used to create them. Generic frames may also inherit slots from other generic frames of which they are a subconcept (as with mammal and elephant - elephant inherits all the properties of mammal that are encoded in the mammal generic frame - warm blood, bear young alive, etc.)
See article on operators and states.
This material should be familiar to you if you have studied discrete mathematics, either in a mathematics department, or, for example, in the subject COMP9020 Foundations of Computer Science.
Technically, a (directed) graph is a set V of vertices or nodes, together with a set E of directed edges, which are ordered pairs of vertices. Graphs are widely used in computer science as a modelling tool. A simple example of a graph would be V = {1, 2, 3} and E = {(1,2), (3,1)}, which could be drawn as:
3 -----> 1 -----> 2
Usually it is convenient to have names or "labels" for the nodes - technically this could be expressed as a mapping from V to a set VLabels of labels for vertices. Sometimes it is useful (as in semantic networks), to have labels for the edges. Again, technically this can be expressed as a function from E to a set ELabels of labels for edges.
It is also possible to have undirected graphs, in which the edges are not ordered but rather unordered pairs. Consider the possibility of edges from a node to itself - sometimes these could be useful, sometimes not.
directed cycle in a directed graph is a sequence of edges (v1, v2), (v2, v3), ..., (vn, v1) such that the second vertex of the final edge is the same as the first vertex of the first edge. Here is a simple example:
            1 ------> 3
            ^         |
            |         |
            |         v
            2 <------ 4
The in-degree of a vertex v in a directed graph is the number of directed edges (u, v) such that v is the second vertex of the edge. In the following example, vertex b has in-degree 2 and the other vertices have in-degree 0. In the example above, all vertices have in-degree 1.
            a
            |
            |
            v
            b <------ c
The out-degree of a vertex u in a directed graph is the number of directed edges (u, v) such that u is the first vertex of the edge. In the example above, vertex b has out-degree 0 and the other vertices have out-degree 1.
Directed acyclic graphs are graphs with no directed cycles.
Trees are a special kind of directed graph, in which there is a special vertex, called the root, and which has in-degree 0, and every other vertex has in-degree 1.
Binary trees are a special kind of tree, in which every node has out-degree at most 2.
Graphs can be represented in a variety of ways. One can represent them directly as lists or other collections of (directed) edges:
edge(1, 3).
edge(3, 4).
edge(4, 2).
edge(2, 1).
or
[[1, 3], [3, 4], [4, 2], [2, 1]]
It is also possible to use adjacency matrices, a representation using a matrix of 0s and 1s:
0 0 1 0
1 0 0 0
 .... the 1 in the (2,1)-position tells us (2,1) is an edge
0 0 0 1
0 1 0 0
The structures represented using graphs in Artificial Intelligence (and elsewhere) frequently need to be searched. There are a number of graph search algorithms for this purpose.
See breadth first searchdepth first search, and best first search. There are many other search concepts - see the textbook by Bratko or elsewhere: look for minimax search, and alpha-beta pruning for a start, if interested.
H
A heuristic is a fancy name for a "rule of thumb" - a rule or approach that doesn't always work or doesn't always produce completely optimal results, but which goes some way towards solving a particularly difficult problem for which no optimal or perfect solution is available.
I
A rule-based system requires some kind of program to manipulate the rules - for example to decide which ones are ready to fire. (i.e. which ones have conditions that match the contents of working memory). The program that does this is called an inference engine, because in many rule-based systems, the task of the system is to infer something, e.g. a diagnosis, from the data using the rules. See also match-resolve-act cycle.
Inheritance is a property of semantic networks and frames. It works as follows.
In the case of a semantic network, if I try to use the network to retrieve, say, the number of legs of a node (with name "Dumbo") the system will first look to see if the "Dumbo" node has an explicit "legs" link. If so, it is followed. If not, inheritance is applied, and, since "Dumbo" is an object rather than a type, the isa link is followed. Assuming "Dumbo" isa elephant, we then check the "elephant" node to see if it has a link labelled "legs". If so, we use it. If not, we look to see if "elephant" has a ako link, perhaps to "mammal". If mammal has a link labelled "legs" then we use it. If not, then we look for further "ako" links from either "elephant" or "mammal", and so on, until we either find the "legs" or run out of semantic network to search.
In the case of a frame, the effect is the same, but the details are different - an "elephant" frame would have been constructed using the information in a "mammal" frame, since elephant ako mammal. The "Dumbo" instance frame will have been constructed using the "elephant" generic frame since Dumbo isa elephant. Thus elephant inherits all the properties of mammal and Dumbo inherits all the properties of elephant. At each stage, there is an opportunity to change e.g. the number of legs if Dumbo should happen to an aberrant elephant in the matter of legs.
See article on operators and states.
See generic frames.
"isa" signifies "is a". It acts as a relation between an object and a type, and specifies that the object is a member of the type. For example, Fido isa dog signifies that the object Fido is a member of the set of all dogs.
"isa" acts like an operator in the iProlog frame implementation. For example, X isa dog with ... would have the effect of binding X to a new instance frame with all the slots of the dog generic frame, together with whatever extra information (such as slot values) was provided after the with.
"isa" should be contrasted with ako.
J
K
L
M
The match-resolve-act cycle is the algorithm performed by a forward-chaining inference engine. It can be expressed as follows:
loop
    1. match all condition parts of condition-action rules
       against working memory and collect all the rules that match;
    2. if more than one match, resolve which to use;
    3. perform the action for the chosen rule
until action is STOP or no conditions match
Step 2 is called conflict resolution. There are a number of conflict resolution strategies.
N
A component of a graph.
May also be a component of a neural network (see 
node in a neural network.
O
operators and states
In problem solving, the term operator is used to describe one of the procedures that can be used to move from one state to another. One starts at an initial state and uses the allowable operators to move towards the or a goal state.
The sequence of states and operators (or sometimes just the states) that lead from the initial state to the goal state is referred to as a path. The trick lies in choosing operators that do in fact lie along a path, and preferably a short or cheap path towards the or a goal state. Sometimes it is possible to navigate intelligently through state space, but sometimes blind backtracking search through state space is the only possibility.
In most cases, it is necessary to check, at each state, which operators are feasible at this particular point. If, for example, the states were physical positions in some real or simulated terrain, the operators moved one in different physical directions, it would be necessary to check for example that there were no barriers in some directions, or that there were no bad consequences for steps in certain directions. For example, a step to the West might land lead to a state which could not be escaped from with any of the available operators.
P
path
See article on operators and states.
Procedural attachment refers to the practice of attaching small procedures called demons to slots in frames. Demons in frames resemble methods in object-oriented languages such as Java, and historically precede them.
production or production rule
See condition-action rule.
Q
R
Not covered in COMP9414. Discussed in COMP9416.
A rule-based system is one based on condition-action rules. See also backward chainingforward chaininginference engineworking memory, and match-resolve-act cycle.
S
Search is a prevalent metaphor in artificial intelligence. Many types of problems that do not immediately present themselves as requiring search can be transformed into search problems. An example is problem solving, which can be viewed in many cases as search a state space, using operators to move from one state to the next.
Particular kinds of search are described under the headings breadth-first searchdepth-first search, and best-first search
Semantic networks are a knowledge representation technique. More specifically, it is a way of recording all the relevant relationships between members of set of objects and types. "Object" means an individual (a particular person, or other particular animal or object, such as a particular cat, tree, chair, brick, etc.). "Type" means a set of related objects - the set of all persons, cats, trees, chairs, bricks, mammals, plants, furniture, etc. Possible relationships include the special set-theoretic relationships isa (set membership) and ako(the subset relation), and also general relationships like likeschild-of. Technically a semantic network is a node- and edge-labelled directed graph, and they are frequently depicted that way. Here is a pair of labelled nodes and a single labelled edge (relationship) between them (there could be more than one relationship between a single pair):
http://www.cse.unsw.edu.au/~billw/dictionaries/pix/semnet1.gif
Here is a larger fragment of a semantic net, showing 4 labelled nodes (Fifi, cat, mammal, milk) and three labelled edges (isa, ako, likes) between them.
http://www.cse.unsw.edu.au/~billw/dictionaries/pix/semnet2.gif
A slot in a frame is like a field in a record or struct in languages like Pascal, Modula-2 and C. However, slots can be added dynamically to frames, and slots contain substructure, called facets. The facets would normally include a value, perhaps a default, quite likely some demons, and possibly some flags like the iProlog frame system's cache and multi_valued facets.
state
See the article on operators and states.
T
tree
see graph
U
V
A component of a graph. The irregular plural of vertex is vertices.
W
The working memory of a rule-based system is a store of information used by the system to decide which of the condition-action rules is able to be fired. The contents of the working memory when the system was started up would normally include the input data - e.g. the patient's symptoms and signs in the case of a medical diagnosis system. Subsequently, the working memory might be used to store intermediate conclusions and any other information inferred by the system from the data (using the condition-action rules.

The term "working memory" is also used in cognitive psychology, where it refers to the limited store of "chunks" (roughly, items in memory) available at the same time for conscious processing.

No comments:

Post a Comment