PLN Reasoning on MOSES Output


Theory

A program learner like MOSES attempts to find candidates (programs) maximizing some fitness function, such as precision and recall against a data set. Reasoning can be introduced in multiple ways; two of them, among certainly many others, are briefly presented below.

Guiding the search over the program space

The estimation of distribution algorithm (EDA), used during the course of learning to bias the search towards better candidates, is a form of inductive reasoning. Often a Bayesian network is used, but in principle other probabilistic inference methods could be applied. The advantage of using a general-purpose reasoning system like PLN is that the distribution over the promising candidates can incorporate background knowledge, knowledge that is not necessarily present in the training data set, or very hard to extract from it. Such guidance, obtained via reasoning over the learning history as well as domain background knowledge, should be particularly effective for cross-problem transfer learning.

Improving fitness estimate

Often the fitness estimate is uncertain. That would be the case of a fitness computed over a small data set, for instance, which ultimately produces overfit candidates. Here again background knowledge can be used to overcome the lack of training data and reassess the scores of promising candidates via reasoning. Since we live in a world where smaller things tend to determine the behavior of bigger things, and smaller things are more abundant and thus known with greater certainty, using background knowledge may be an effective way to reveal about a candidate what the data set hides.

This tutorial focuses on the latter use, that is, how to improve the fitness estimate of a candidate using background knowledge.

Practice

This tutorial is based on the MOSES-PLN synergy demo.

We will

  1. Learn a simplistic model based on a very small data set using MOSES.
  2. Reassess the true value of the learned model via reasoning over background knowledge, thus supposedly overcoming overfitting.

Preparation

Learn MOSES model

First make sure you have MOSES installed on your system.

which moses

should display something like

/usr/local/bin/moses

If it isn't installed, follow the instructions in Building_OpenCog. Otherwise go to

<OPENCOG_ROOT>/examples/pln/moses-pln-synergy/scm

We're gonna learn a predictive model of the recovery speed of some injury given some treatments, based on the following data set: dataset.csv

For that run the following command

 moses \
      --input-file ../dataset.csv \
      --target-feature recovery-speed-of-injury-alpha \
      --output-with-labels 1 \
      --problem pre \
      -q 0.7 \
      -p 0.5 \
      --result-count 1 \
      --output-format scheme

Let's explain some of the command line arguments

  • --problem pre indicates that we want to maximize the precision.
  • -q 0.7 indicates that we want to maintain a recall of at least 0.7 while maximizing the precision.
  • -p 0.5 controls the Occam's razor (the penalty on candidate complexity).
  • --output-format scheme outputs the model in Scheme format, so that it can be imported into the AtomSpace.
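
As a reminder, writing TP, FP and FN for the numbers of true positives, false positives and false negatives of the model over the data set, precision = TP / (TP + FP) and recall = TP / (TP + FN), so --problem pre maximizes the former under a lower bound on the latter.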

It should output

0.875 (OrLink (PredicateNode "take-treatment-1") (PredicateNode "eat-lots-fruits-vegetables"))

Meaning that the best model MOSES found, the disjunction of the first and third input features, has precision 0.875.
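
Note that 0.875 = 7/8: as we will see below, 8 entries of the data set satisfy the model, so presumably 7 of those 8 also have the target feature true.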

Now let's build the knowledge that represents that. Fire up guile

guile

and enter the following

(ImplicationLink (stv 0.875 0.0099)
  (OrLink
    (PredicateNode "take-treatment-1")
    (PredicateNode "eat-lots-fruits-vegetables")
  )
  (PredicateNode "recovery-speed-of-injury-alpha")
)

meaning that the model can predict the target feature with precision 0.875 and confidence 0.0099. An ImplicationLink can be used because the prediction is atemporal in this particular knowledge representation. The confidence 0.0099 is obtained by applying the formula in http://wiki.opencog.org/w/TruthValue#Confidence with N=8, because the data set has 8 entries satisfying the model, and K=800, because that value is hardcoded in OpenCog.
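
You can check that computation directly in guile

(exact->inexact (/ 8 (+ 8 800))) ; confidence = N / (N + K), N = 8, K = 800

which returns 0.009900990099009901, rounded to 0.0099 above.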

Load background knowledge

Now that the model is loaded, let's load the background knowledge that is gonna be used to reason on that model.

(load "background-knowledge.scm")

Take a few minutes to explore it; it contains knowledge about what the treatments contain and such. Notice how most atoms in the background knowledge have a much greater confidence than the model obtained via learning, as they are supposedly based on more evidence.
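
For instance, it contains implications of roughly the following kind (an illustrative sketch; the predicate name "take-compound-A" and the truth value here are made up for illustration, see background-knowledge.scm for the real atoms)

(ImplicationLink (stv 1 0.99) ; illustrative, not the actual atom
  (PredicateNode "take-treatment-1")
  (PredicateNode "take-compound-A")
)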

Load PLN

Now load the PLN configuration file

(load "pln-fc-config.scm")

This file loads PLN rules, associates them with the PLN rule-base and sets a few parameters.
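
To give an idea, such a configuration typically amounts to something like the following simplified sketch (the rule alias and the parameter value are illustrative, see pln-fc-config.scm for the actual content)

; Illustrative sketch of a PLN rule-base configuration
; Define the rule base
(define pln-rbs (ConceptNode "PLN"))
; Associate previously loaded rules, referred to by their aliases
(ure-add-rules pln-rbs (list deduction-rule-name))
; Set URE parameters, e.g. the maximum number of chainer iterations
(ure-set-num-parameter pln-rbs "URE:maximum-iterations" 100)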

Inference

We are gonna use inference to reassess the truth value of

(ImplicationLink (stv 0.875 0.0099)
  (OrLink
    (PredicateNode "take-treatment-1")
    (PredicateNode "eat-lots-fruits-vegetables")
  )
  (PredicateNode "recovery-speed-of-injury-alpha")
)

The optimal reasoning to do that is informally:

  1. Treatment-1 contains compound-A that is known to speed up recovery of this specific injury.
  2. Fruits and vegetables contain a lot of water, which is known to speed up recovery in general.
  3. Then 1 and 2 are combined to form the model.

There are 3 ways to do that:

  1. Running the forward chainer till the TV of the model gets updated.
  2. Running the backward chainer with the model as target.
  3. Running the inference by applying the rules step-by-step with the pattern matcher.
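
For the first two ways, one would define the target and hand it to the chainer. Since the exact cog-fc/cog-bc calling conventions have varied across OpenCog versions, the chainer calls below are left as comments (pln-rbs stands for the rule base defined in pln-fc-config.scm)

(define model
  (ImplicationLink
    (OrLink
      (PredicateNode "take-treatment-1")
      (PredicateNode "eat-lots-fruits-vegetables")
    )
    (PredicateNode "recovery-speed-of-injury-alpha")
  )
)
; Depending on the version, something like
; (cog-fc pln-rbs model) ; forward chaining from the model
; (cog-bc pln-rbs model) ; backward chaining with the model as target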

Running the forward or backward chainer can take a few hours to come up with the best answer, so instead we're gonna run the inference step-by-step. By far the easiest way to do that is to load moses-pln-synergy-pm.scm, which will apply all the rules in the right order. Feel free to explore the README.md, which contains a description of what each rule application is doing (see Section Apply rules).

(load "moses-pln-synergy-pm.scm")

This should take a few seconds. Once done, it should print all the atoms that have been inferred. Type the model again, but without the TV

(ImplicationLink
  (OrLink
    (PredicateNode "take-treatment-1")
    (PredicateNode "eat-lots-fruits-vegetables")
  )
  (PredicateNode "recovery-speed-of-injury-alpha")
)

and you should get as a result

(ImplicationLink (stv 0.60357847 0.65475)
   (OrLink
      (PredicateNode "take-treatment-1" (stv 0.1 0.8))
      (PredicateNode "eat-lots-fruits-vegetables" (stv 0.07 0.8))
   )
   (PredicateNode "recovery-speed-of-injury-alpha" (stv 0.3 0.8))
)

You may see that both the strength and the confidence have changed. The strength is lower but the confidence is much higher, meaning that the strength (the precision of the model) should be a lot less overfit.

Notes

Maintained by Nil