ImportanceDiffusionAgent

The ImportanceDiffusionAgent treats the spread of importance as a diffusive process.

Let the variable $p_i$ equal the "probability that Atom $i$ is selected", which is proportional to its STI. STI is linearly scaled between 0 (the minimum recently seen STI) and 1 (the maximum recently seen STI).
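The linear scaling can be sketched as follows. This is a minimal illustration, not the agent's actual code: the function name `scale_sti` and the handling of the case where all STI values are equal are assumptions.

```python
def scale_sti(sti_values):
    """Linearly scale raw STI values to 0..1 using the min and max seen."""
    lo, hi = min(sti_values), max(sti_values)
    if hi == lo:
        # All Atoms have the same STI, so the scaling is undefined;
        # returning zeros is an arbitrary choice made for this sketch.
        return [0.0 for _ in sti_values]
    return [(s - lo) / (hi - lo) for s in sti_values]


# Minimum STI maps to 0, maximum STI maps to 1.
scaled = scale_sti([-10, 0, 30])
```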

((( Note: perhaps 0 should actually equal the attentional focus boundary, and Atoms not in the AF should have their final STI increased by the equivalent amount at the end. In this case we would be unable to directly use the vector of $p_i$'s to set the STI after diffusion, because "0" would mean "0 and below". Instead, if an Atom has a scaled STI of 0, then its final unscaled STI would have to be set relative to its previous STI.

For example: AF boundary = 0.0, max STI = 10, X.sti = -1. Thus X.scaled_sti = 0.0.

Say that after diffusion X.scaled_sti = 0.2. This translates to an unscaled STI of 2. Because we conserve STI, we can't set X.sti directly to 2, since its previous STI was -1. Instead, work out the absolute increase corresponding to the scaled STI going from 0 to 0.2, and add this to X.sti:

X.sti after diffusion = -1 + 2 = 1 )))
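The arithmetic in the note can be written as a small helper. This is a sketch under the note's own assumptions (scaled 0 corresponds to the AF boundary, scaled 1 to the maximum STI); the function and parameter names are hypothetical.

```python
def unscaled_sti_after_diffusion(old_sti, old_scaled, new_scaled,
                                 af_boundary, max_sti):
    """Apply a change in scaled STI to the raw STI, relative to its old value."""
    # Raw STI covered by the whole 0..1 scale (assumes 0 == AF boundary).
    sti_per_unit = max_sti - af_boundary
    return old_sti + (new_scaled - old_scaled) * sti_per_unit


# The example from the note: X.sti = -1, scaled STI goes from 0.0 to 0.2.
x_sti = unscaled_sti_after_diffusion(old_sti=-1, old_scaled=0.0,
                                     new_scaled=0.2,
                                     af_boundary=0.0, max_sti=10)
```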

The HebbianLinks and InverseHebbianLinks determine the transition probabilities $p_{ij}$ (the probability that Atom $i$ is selected, given that Atom $j$ was selected).

Then, we have a Markov matrix $M$ with entries $M_{ij} = p_{ij}$,

but with the caveat that transition probabilities of HebbianLinks are only added to $M$ if the source Atom's STI is sufficient for the selected spread decider to return true. This is influenced by the parameter diffusionThreshold, which by default is the attentional focus boundary.

This is necessary because InverseHebbianLinks are included in $M$ by reversing the indices $i$ and $j$. Thus, if a HebbianLink from Atom $j$ to Atom $i$ has its weight included as $M_{ij}$, then the weight of an InverseHebbianLink between the same Atoms would be included as $M_{ji}$ instead of $M_{ij}$. If there were no diffusionThreshold, then Atoms below the attentional focus boundary could steal importance from Atoms above it whenever they had InverseHebbianLinks to them. Conceptually, an InverseHebbianLink should only have an effect when its source Atom is above the attentional focus boundary: for an Atom it says "when I'm important, then this other atom isn't", rather than the more presumptuous "I'm always more important than this other atom".
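A minimal sketch of how the matrix might be filled, before any normalisation. Several things here are assumptions made for illustration: the `(source, target, weight, is_inverse)` tuple format, the use of a plain step test against the threshold as the spread decision, and the use of link weights directly as matrix entries. Entry `M[i][j]` is read as "STI flows from Atom j to Atom i".

```python
def build_matrix(n, links, scaled_sti, diffusion_threshold):
    """Build an unnormalised n x n transition matrix from link tuples."""
    M = [[0.0] * n for _ in range(n)]
    for source, target, weight, is_inverse in links:
        if scaled_sti[source] < diffusion_threshold:
            # The source Atom is not important enough: the link is ignored.
            continue
        if is_inverse:
            # Indices reversed: STI flows from the target to the source.
            M[source][target] += weight
        else:
            # Column = source Atom, row = target Atom.
            M[target][source] += weight
    return M


sti = [0.9, 0.5, 0.1]
links = [
    (0, 1, 0.4, False),  # HebbianLink 0 -> 1
    (2, 0, 0.5, True),   # InverseHebbianLink from a low-STI source: skipped
    (1, 2, 0.3, True),   # InverseHebbianLink 1 -> 2: Atom 1 takes from Atom 2
]
M = build_matrix(3, links, sti, diffusion_threshold=0.3)
```

Without the threshold test, Atom 2 (scaled STI 0.1) would take importance from Atom 0 through its InverseHebbianLink.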

To conserve STI, the Markov matrix then has to be normalised to a left stochastic matrix, such that each column sums to 1. Here another parameter, maxSpreadPercentage, is introduced. It indicates the maximum fraction (from 0..1) of an Atom's STI that the Atom will give up (perhaps this is unnecessary, in which case it would equal 1).

Then the diagonal of $M$ is set to (1 - maxSpreadPercentage) and the off-diagonal entries of each column are totalled, i.e. let $n$ be the vector whose entries are:

$$n_j = \sum_{i \neq j} M_{ij}$$

Then for each entry $n_j$ of $n$:

  • if $n_j$ is more than maxSpreadPercentage, then each entry of column $j$ of $M$ (apart from $M_{jj}$) is multiplied by maxSpreadPercentage / $n_j$.
  • if $n_j$ is less than maxSpreadPercentage, then $M_{jj}$ is set to $1 - n_j$.

Now each column of $M$ should sum to 1.
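The normalisation steps above can be sketched with plain nested lists. The function name `normalise_columns` and the example values are illustrative assumptions, and `M[i][j]` is taken to be the entry in row i, column j.

```python
def normalise_columns(M, max_spread):
    """Turn M into a left stochastic matrix (every column sums to 1), in place."""
    n = len(M)
    for j in range(n):
        M[j][j] = 1.0 - max_spread
        # Total of column j, leaving out the diagonal entry.
        total = sum(M[i][j] for i in range(n) if i != j)
        if total > max_spread:
            # Too much would be given away: shrink the off-diagonal entries.
            factor = max_spread / total
            for i in range(n):
                if i != j:
                    M[i][j] *= factor
        else:
            # Less than the maximum is given away: the Atom keeps the rest.
            M[j][j] = 1.0 - total
    return M


M = normalise_columns([[0.0, 0.1, 0.0],
                       [0.6, 0.0, 0.0],
                       [0.6, 0.1, 0.0]], max_spread=0.6)
column_sums = [sum(M[i][j] for i in range(3)) for j in range(3)]
```

In this example column 0 is scaled down (its off-diagonal total is 1.2), column 1 keeps 0.8 on its diagonal, and column 2, which has no outgoing links, keeps everything.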

We then compute $p' = M p$, where

$$p = (p_1, \ldots, p_N)^T$$

is the vector of scaled STI values, and set the STI levels relative to the entries of $p'$.

If the interaction of the Atoms with the world doesn't change the HebbianLink strengths or systematically perturb the Atom STI levels, the network will eventually settle to the fixed point of the Markov matrix.
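A small sketch of the diffusion step and of the settling behaviour, using a hand-picked 2 x 2 left stochastic matrix. The matrix values and the function name `diffuse` are assumptions for illustration only.

```python
def diffuse(M, p):
    """One diffusion step: multiply the column vector p by the matrix M."""
    n = len(p)
    return [sum(M[i][j] * p[j] for j in range(n)) for i in range(n)]


# Each column sums to 1, so the total STI is conserved at every step.
M = [[0.5, 0.25],
     [0.5, 0.75]]
p = [1.0, 0.0]
for _ in range(100):
    p = diffuse(M, p)
# p is now close to the fixed point of M, which satisfies M p = p.
```

For this particular matrix the fixed point with total 1 is (1/3, 2/3), since solving M p = p gives the second entry as twice the first.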

Spread Decision

The spread decision can be made in several ways, and can also be stochastic so that Hebbian links from a source with given STI will sometimes be included and sometimes not.

Hyperbolic

[Figure: ImpDiffAgent decision hyperbolic.png]

The hyperbolic spread decider is stochastic, and the probability of spread is:

where $i$ is the 0..1 normalised STI, $b$ is the attentional focus boundary, also normalised to 0..1, and $s$ is a shape parameter (higher $s$ means a more abrupt transition between 0 and 1 probability of spread as $i$ increases).
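As an illustration of a stochastic decider with these properties, the sketch below assumes a tanh-based curve. The exact formula is an assumption chosen to match the name "hyperbolic" and the described roles of the parameters, not something the text confirms; `i` is the normalised STI, `b` the normalised boundary (a name chosen here) and `s` the shape parameter.

```python
import math
import random


def spread_probability(i, b, s):
    """Assumed tanh curve: 0.5 at i == b, near 0 well below b, near 1 well above."""
    return (math.tanh(s * (i - b)) + 1.0) / 2.0


def decide_spread(i, b, s, rng=random):
    """Stochastic decision: return True with probability spread_probability."""
    return rng.random() < spread_probability(i, b, s)


at_boundary = spread_probability(0.5, 0.5, 10)
well_above = spread_probability(0.9, 0.5, 10)
well_below = spread_probability(0.1, 0.5, 10)
```

With a larger `s` the curve approaches a step at the boundary, so the decision becomes nearly deterministic.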

Step

Implementation

The initial implementation uses the gsl::matrix class from GSLWrap. This can only work for a small AtomSpace, since gsl::matrix is dense rather than sparse, so its storage grows with the square of the number of Atoms.

In future

Boost (uBLAS) provides several sparse matrix types and simple operations on them.

Alternatively, work out how to carry out the same procedure without using a matrix, instead calculating the values directly from the AtomSpace. This may be slower but would have lower memory requirements.
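One way the matrix-free variant might look is sketched below: each source Atom's outgoing weights are normalised and applied on the fly, which corresponds to processing one matrix column at a time. The dictionary-based data layout and the function name are hypothetical, and `out_links` is assumed to already reflect the spread decision and the index reversal for InverseHebbianLinks.

```python
def diffuse_without_matrix(sti, out_links, max_spread):
    """sti: dict atom -> scaled STI; out_links: dict source -> {target: weight}."""
    new_sti = {atom: 0.0 for atom in sti}
    for source, amount in sti.items():
        targets = out_links.get(source, {})
        total = sum(targets.values())
        # Scale down only if the source would give away more than allowed.
        factor = max_spread / total if total > max_spread else 1.0
        given = 0.0
        for target, weight in targets.items():
            share = amount * weight * factor
            new_sti[target] += share
            given += share
        # The source keeps whatever it did not give away.
        new_sti[source] += amount - given
    return new_sti


sti = {"a": 1.0, "b": 0.5, "c": 0.0}
out_links = {"a": {"b": 0.6, "c": 0.6}, "b": {"c": 0.2}}
result = diffuse_without_matrix(sti, out_links, max_spread=0.6)
```

Memory use here is proportional to the number of links rather than to the square of the number of Atoms.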