OpenCogPrime:IntegrativeInference


Integrative Inference

The PLN inference framework is not discussed extensively in this wikibook because an entire separate book, Probabilistic Logic Networks, is devoted to it. However, that book is about PLN itself rather than OCP, and does not go into depth on the integration of PLN with OCP.

The purpose of this page and its children is to discuss issues specifically pertinent to the integration of PLN with other OCP processes. Some such issues are already dealt with in other pages, such as OpenCogPrime:EvolutionBasedInferenceControl, which discusses PEL as a form of inference control. The issues not dealt with in other pages are gathered here.

In the main, the issues discussed here pertain to inference control rather than to the nature of individual inference steps. This is because individual inference steps are the same in PLN whether PLN is utilized within or outside of OCP. Inference control, on the other hand, is done within OCP in ways that wouldn't necessarily make sense in a standalone-PLN context. The main theme of the chapter is adaptive inference control, but this is only a viable option if there is some method besides PLN that is studying patterns among inferences, and/or studying inference-relevant patterns among Atoms. If one were doing complex PLN inference outside the context of an integrative AI framework containing appropriate non-inferential cognitive processes, one would need to take a quite different approach. In fact, our suggestion is that integrative AI is essentially the only workable approach to the control of higher-order inference in complex cases. Certainly, attempts to solve the problem with a narrow-AI approach have been made many times before and have not succeeded.

Types of PLN Query

The PLN implementation in OCP is complex and lends itself to utilization via many different methods. However, a convenient way to think about it is in terms of three basic backward-focused query operations:

  • findtv, which takes in an expression and tries to find its truth value.
  • findExamples, which takes an expression containing variables and tries to find concrete terms to fill in for the variables.
  • createExamples, which takes an expression containing variables and tries to create new Atoms to fill in for the variables, using concept creation heuristics as discussed in a later chapter, coupled with inference for evaluating the products of concept creation.

and one forward-chaining operation:

  • findConclusions, which takes a set of Atoms and seeks to draw the most interesting possible set of conclusions via combining them with each other and with other knowledge in the AtomTable.

These inference operations may of course call themselves and each other recursively, thus creating lengthy chains of diverse inference.

(In an earlier version of PLN, these operations actually had the above names. These precise command names may not have survived into the current implementation, but the same functionality is there.)
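To make the division of labor concrete, here is a minimal sketch of the four operations as Python stubs. All of the names and signatures below (the engine argument, the TruthValue fields, and so on) are assumptions made for illustration, not the actual OCP API.

  from dataclasses import dataclass

  @dataclass
  class TruthValue:
      strength: float    # probability-like strength in [0, 1]
      confidence: float  # weight of evidence behind the strength

  def findtv(engine, expr) -> TruthValue:
      """Backward-chain to estimate the truth value of a variable-free
      expression, via whatever inference rules match it."""
      raise NotImplementedError

  def findExamples(engine, expr_with_vars) -> list:
      """Search, and if necessary infer, to find concrete terms that can
      fill in the variables in expr_with_vars."""
      raise NotImplementedError

  def createExamples(engine, expr_with_vars) -> list:
      """Use concept-creation heuristics to build new Atoms that fill in
      the variables, then vet the creations via findtv."""
      raise NotImplementedError

  def findConclusions(engine, atoms) -> list:
      """Forward-chain: combine the given Atoms with each other and with
      the rest of the AtomTable to draw interesting conclusions."""
      raise NotImplementedError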

Findtv is quite straightforward, at the high level of discussion adopted here. Various inference rules may match the Atom in question; in our current PLN implementation, loosely described below, these inference rules are executed by objects called Evaluators. In the course of executing findtv, a decision must be made regarding how much attention to allocate to each of these Evaluator objects, and some choices must be made by the objects themselves; these issues involve processes beyond pure inference and will be discussed later in this chapter. Depending on the inference rules chosen, findtv may lead to the construction of inferences involving variable expressions, which may then be evaluated via findExamples or createExamples queries.
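As a hedged illustration of the kind of decision involved, the sketch below picks the next Evaluator to run in proportion to a score derived from its past usefulness. The scoring scheme and the stats bookkeeping are purely hypothetical, standing in for whatever inference-relevant statistics the integrative system actually gathers.

  import math
  import random

  def choose_evaluator(evaluators, stats, temperature=1.0):
      """Pick the next Evaluator to try (illustrative only). stats[e] is
      assumed to hold (total_payoff, num_trials), where payoff measures
      how often e has produced confident conclusions cheaply."""
      def score(e):
          payoff, trials = stats.get(e, (0.0, 0))
          return payoff / trials if trials else 1.0  # optimism for untried rules
      # Softmax weighting: better-scoring Evaluators get more attention,
      # but weaker ones are still explored occasionally.
      weights = [math.exp(score(e) / temperature) for e in evaluators]
      return random.choices(evaluators, weights=weights, k=1)[0]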

The findExamples operation sometimes reduces to a simple search through the AtomSpace; however, it can also be carried out in a subtler way. If the findExamples Evaluator wants to find examples of $X so that F($X), but can't find any, then its next step is to run another findExamples query, looking for $G so that

Implication $G F

and then running findExamples on $G rather than F. But what if this findExamples query doesn't come up with anything? Then it needs to run a createExamples query on the same implication, trying to build a $G satisfying the implication.
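Continuing the illustrative Python sketch from above, this fallback chain might look roughly as follows; Var, Implication, and the engine.atomspace.search call are further assumptions, and a real implementation would bound the recursion via inference control rather than recursing blindly.

  from dataclasses import dataclass

  @dataclass(frozen=True)
  class Var:
      name: str

  @dataclass(frozen=True)
  class Implication:
      antecedent: object
      consequent: object

  def findExamples(engine, F):
      """A fuller sketch of the findExamples stub given earlier."""
      # 1. Direct search: concrete terms in the AtomSpace satisfying F.
      examples = engine.atomspace.search(F)
      if examples:
          return examples
      # 2. Indirect: find predicates $G already known to imply F ...
      candidates = findExamples(engine, Implication(Var("$G"), F))
      # 3. ... or, failing that, try to *create* such a $G.
      if not candidates:
          candidates = createExamples(engine, Implication(Var("$G"), F))
      # Then seek examples of each candidate $G in place of F.
      for G in candidates:
          examples = findExamples(engine, G)
          if examples:
              return examples
      return []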

Finally, forward-chaining inference (findConclusions) may be conceived of as a special heuristic for handling a particular kind of findExamples problem. Suppose we have K Atoms and want to find out what consequences logically ensue from these K Atoms, taken together. We can form the conjunction of the K Atoms (let's call it C), and then look for $D so that

Implication C $D

Conceptually, this can be approached via findExamples, which defaults to createExamples in cases where nothing is found. However, this sort of findExamples problem is special, involving appropriate heuristics for combining the conjuncts contained in the expression C, heuristics that embody the basic logic of forward-chaining rather than backward-chaining inference.
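In the same illustrative vein, this reduction can be written out directly; the And constructor is another assumption:

  class And:
      def __init__(self, *conjuncts):
          self.conjuncts = conjuncts  # the K premise Atoms

  def findConclusions(engine, atoms):
      """Sketch: forward chaining reduced to a findExamples problem."""
      C = And(*atoms)
      # Seek consequents $D with (Implication C $D). A real forward
      # chainer would apply heuristics that combine the conjuncts of C
      # directly, rather than blind backward search, and would rank the
      # resulting conclusions by interestingness.
      return findExamples(engine, Implication(C, Var("$D")))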

A Toy Example

To exemplify the above ideas, I will now give a detailed log of a PLN backward chainer as applied to resolve simple findtv and findExamples queries over a small test database, consisting of a few dozen relationships regarding a half-dozen people. The example shows the actual output from a prior version of the OCP/PLN testing framework, which we used in mid-2005 to qualitatively assess PLN performance. The style of output from the current version differs (due to the greater generality of the current version): at the moment we no longer use the English-like output shown here, because it became confusing in the context of highly complex examples.

In this example we begin with the findtv query

findtv friendOf(Amir,Osama) 

The heuristic then searches in the knowledge base to find what relationships are known involving the terms in the query. It first finds the following relationship involving friendOf, which indicates the (probabilistic, not certain) symmetry of the friendOf relationship:

friendOf is symmetricRelation (0.65,0.72)

Note that symmetry is stored as a relationship like any other. This allows PLN to deal nicely with the cases of relationships that are only partially, probabilistically symmetric, such as friendOf and enemyOf.

The heuristic then searches for relations involving symmetricRelation, finding the following relationship, an Equivalence relationship that embodies the definition of symmetry:

EQUIVALENCE TO IMPLICATION

if R000 is symmetricRelation then if and only if R000(X007,X008) then R000(X008,X007) (0.995,0.985)

(Here the notation such as X008 refers to VariableNodes).

The system then applies this definition to the relationship friendOf, which involves a step of deduction:

VARIABLE INSTANTIATION

if friendOf is symmetricRelation then if and only if friendOf(X007,X008) then friendOf(X008,X007) (0.995, 0.985)

DEDUCTION (NODE TRUTH VALUE)

if and only if friendOf(X007,X008) then friendOf(X008,X007) (0.654, 0.713)

Now it applies a step of variable instantiation, seeking to match the new link it has learned regarding friendOf with the other terms that were there in the original query.

VARIABLE INSTANTIATION

if and only if friendOf(Osama,Amir) then friendOf(Amir,Osama) (0.654,0.713)

This gives it a way to assess the truth value of friendOf(Amir, Osama). Namely: it now realizes that, since friendOf is symmetric, it suffices to evaluate friendOf(Osama, Amir). Thus it submits a findtv query of its own creation.

DEDUCTION (NODE TRUTH VALUE)

The truth value of friendOf(Amir, Osama) is unknown:
findtv friendOf(Osama, Amir)
no answer

But unfortunately, this query finds no answer in the system's knowledge base. The exploitation of the symmetry of friendOf was a dead end, and the inference heuristic must now backtrack and try something else. Going back to the start, it looks for another relationship involving the terms in the original query, and finds this one, which indicates that friendOf is transitive:

friendOf is transitiveRelation (0.4, 0.8)

It also finds the Equivalence relationship that embodies the definition of transitivity:

if and only if R001 is transitiveRelation then if exists X010 such that
AND(R001(X009,X010),R001(X010,X011)) then R001(X009,X011) (0.99,0.99)

EQUIVALENCE TO IMPLICATION

if R001 is transitiveRelation then if exists X010 such that
AND(R001(X009,X010),R001(X010,X011)) then R001(X009,X011) (0.995,0.985)

VARIABLE INSTANTIATION

if friendOf is transitiveRelation then if exists X010 such that
AND(friendOf(Amir,X010),friendOf(X010,Osama)) then friendOf(Amir, Osama) (0.995,0.985)

DEDUCTION (NODE TRUTH VALUE)

if exists X010 such that AND(friendOf(Amir,X010),friendOf(X010,Osama))
then friendOf(Amir, Osama) (0.410, 0.808)

In this case, we end up with a findExamples query rather than a findtv query:

FIND INSTANCES

findExamples AND(friendOf(Amir,X010),friendOf(X010,Osama))

X010={Britney}

friendOf(Amir, Britney) (0.8, 0.3)

friendOf(Britney, Osama) (0.7, 0.6) 

To resolve this find query, it must use the truth value formula for the AND operator:

AND RULE

AND(friendOf(Amir, Britney),friendOf(Britney,Osama)) (0.56,0.3)
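
(In this trace, the conjunction's strength is consistent with the independence-assumption product 0.8 × 0.7 = 0.56, and its confidence matches the lower of the two premise confidences; the exact AND rule formulas are given in the PLN book.)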

Since this evaluation yields a reasonably high truth value for the find query, the system decides it can plug in the variable assignment

X010={Britney}

Along the way it also evaluates

exists X010 such that AND(friendOf(Amir,X010),friendOf(X010,Osama)) (0.56,0.3)

And, more pertinently, it can use the transitivity of friendOf to assess the degree to which Amir is a friend of Osama:

DEDUCTION (NODE TRUTH VALUE)

friendOf(Amir, Osama) (0.238, 0.103)
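
(The resulting strength lies near the naive product 0.56 × 0.410 ≈ 0.23; PLN's deduction formula contributes correction terms beyond simple multiplication, and the low output confidence of 0.103 reflects the weak evidence behind the conjunction.)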

There is nothing innovative or interesting about the inference trajectory followed here — the simple inference control heuristic is doing just what one would expect based on the knowledge and inference rules supplied to it. For dealing with a reasonably large-sized knowledge base, more sophisticated inference control heuristics are needed, because a simplistic strategy leads to combinatorial explosion.

What is important to observe here is how the probabilistic truth values are propagated naturally and sensibly through the chain of inferences. This requires the coordination of a number of different PLN inference rules with appropriately tuned parameters.

Because the knowledge base is so small, the revision rule wasn't invoked in the above example. It's very easy to see how it would come up in the same example with a slightly larger knowledge base, however. If there were more knowledge about the friends of Amir and Osama, then the system could carry out a number of inferences based on the symmetry and transitivity of friendOf, and revise the results together at the end.
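
In its simplest form, revision merges two independently derived estimates of the same truth value by taking a confidence-weighted average of their strengths, with the merged confidence increased to reflect the pooled evidence. For instance, merging estimates (0.8, 0.4) and (0.5, 0.2) would yield strength (0.4 × 0.8 + 0.2 × 0.5) / (0.4 + 0.2) = 0.7; the exact formulas are given in the PLN book.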

PLN and Bayes Nets

Some comments on the relationship between PLN and Bayes nets may be useful. We have not yet implemented such an approach, but it may well be that Bayes net methods can be a useful augmentation to PLN for certain sorts of inference (specifically, for inference on networks of knowledge that are relatively static in nature).

We can't use standard Bayes nets as the primary way of structuring reasoning in OCP, because OCP's knowledge network is loopy. The peculiarities that allow belief propagation to work in standard loopy Bayes nets don't hold up in OCP, because of the way probabilities must be updated when managing a very large network in interaction with a changing world, a network whose different parts receive different amounts of focus. So in PLN we use a different mechanism (the "inference trail" mechanism) to avoid "repeated evidence counting", whereas loopy Bayes nets rely on the fact that, in the standard configuration, extra evidence counting occurs in a fairly constant way across the network.
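As a rough illustration of the trail idea (the data structures here are hypothetical, not the actual PLN code), each estimate of an Atom's truth value can carry the set of premises used to derive it, and revision is applied only when two estimates rest on disjoint evidence:

  def revise_if_independent(tv1, trail1, tv2, trail2):
      """Merge two estimates of the same Atom's truth value while guarding
      against repeated evidence counting (sketch; reuses the illustrative
      TruthValue type from earlier). Trails are sets of premise Atoms."""
      if trail1 & trail2:
          # Shared premises: revising would double-count evidence, so
          # keep the more confident estimate instead.
          return (tv1, trail1) if tv1.confidence >= tv2.confidence else (tv2, trail2)
      c1, c2 = tv1.confidence, tv2.confidence
      if c1 + c2 == 0.0:
          return tv1, trail1 | trail2
      merged = TruthValue(
          strength=(c1 * tv1.strength + c2 * tv2.strength) / (c1 + c2),
          confidence=min(1.0, c1 + c2),  # crude stand-in for evidence pooling
      )
      return merged, trail1 | trail2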

However, when you have within the AtomTable a set of interrelated knowledge items that you know are going to be static for a while, and you want to be able to query them probabilistically, then building a Bayes net (i.e., "freezing" part of OCP's knowledge network and mapping it into a Bayes net) may be useful. That is, one way to accelerate some PLN inference would be:

  1. Freeze a subnetwork of the AtomTable which is expected not to change a lot in the near future
  2. Interpret this subnetwork as a loopy Bayes net, and use standard Bayesian belief propagation to calculate probabilities based on it

This would be a highly efficient form of "background inference" in certain contexts. (Note that this requires an "indefinite Bayes net" implementation that propagates indefinite probabilities through the standard Bayes-net local belief propagation algorithms, but this is not problematic.)
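
As a toy but runnable illustration of step 2, the sketch below performs sum-product loopy belief propagation over a pairwise network of binary variables. The step of deriving variables and potentials from frozen Atoms, and the indefinite-probability version mentioned above, are omitted; all names are illustrative.

  def loopy_bp(variables, priors, pairwise, iters=50):
      """variables: list of names; each variable is binary (0 or 1).
      priors[v]: [phi_v(0), phi_v(1)] unary potentials.
      pairwise[(u, v)]: 2x2 table psi(u_val, v_val).
      Returns approximate marginal beliefs for each variable."""

      def neighbors(i):
          return [v for (u, v) in pairwise if u == i] + \
                 [u for (u, v) in pairwise if v == i]

      def pot(i, j, xi, xj):
          # Look up the pairwise potential regardless of edge orientation.
          return (pairwise[(i, j)][xi][xj] if (i, j) in pairwise
                  else pairwise[(j, i)][xj][xi])

      # One message per directed edge, initialized uniformly.
      msgs = {}
      for (u, v) in pairwise:
          msgs[(u, v)] = [1.0, 1.0]
          msgs[(v, u)] = [1.0, 1.0]

      for _ in range(iters):
          new_msgs = {}
          for (i, j) in msgs:
              out = [0.0, 0.0]
              for xj in (0, 1):
                  for xi in (0, 1):
                      prod = priors[i][xi] * pot(i, j, xi, xj)
                      for k in neighbors(i):
                          if k != j:
                              prod *= msgs[(k, i)][xi]
                      out[xj] += prod
              s = sum(out)
              new_msgs[(i, j)] = [o / s for o in out]
          msgs = new_msgs

      beliefs = {}
      for v in variables:
          b = [priors[v][0], priors[v][1]]
          for k in neighbors(v):
              b = [b[x] * msgs[(k, v)][x] for x in (0, 1)]
          s = sum(b)
          beliefs[v] = [x / s for x in b]
      return beliefs

  # Toy usage: a "frozen" three-node loop with potentials favoring agreement.
  agree = [[0.9, 0.1], [0.1, 0.9]]
  print(loopy_bp(["A", "B", "C"],
                 {"A": [0.3, 0.7], "B": [0.5, 0.5], "C": [0.8, 0.2]},
                 {("A", "B"): agree, ("B", "C"): agree, ("A", "C"): agree}))

In practice, the potentials would be derived from the frozen Atoms' truth values, and the converged marginals read back into the AtomTable as updated strengths.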
