Attention Allocation


Attention allocation within OpenCog weights pieces of knowledge relative to one another, based on what has been important to the system in the past and what is currently important. Attention allocation has several purposes:

  1. To guide the process of working out what knowledge should be stored in memory, what should be stored locally on disk, and what can be stored in a distributed fashion on other machines.
  2. To guide the forgetting process: deleting knowledge that is deemed no longer useful (or that has been integrated into the system in other ways, e.g. raw perceptual data).
  3. To guide reasoning carried out by PLN. During the inference process, the combinatorial explosion of potential inference paths becomes a bane to effective and efficient reasoning. By providing an ordering of which paths should be tried first, and potentially also a cutoff (knowledge with too low an importance is ignored), inference should become more tractable.


Currently, attention allocation is implemented for keeping track of the importance of atoms. The overall design of OpenCog calls for keeping track of MindAgent importance as well: MindAgents confer attention on the atoms they use, and are then rewarded with importance funds when they achieve system goals. MindAgents can currently store attention, but the mechanism for rewarding them for fulfilling system goals is not yet implemented (in part because the goal system is not yet implemented).

Entities involved


This section presents how the flow of attention allocation works.

  1. Rewarding "useful" atoms:
    1. Atoms are given stimulus by a MindAgent if they've been useful in achieving the MindAgent's goals.
    2. This stimulus is then converted into Short-Term Importance (STI) and Long-Term Importance (LTI) by the ImportanceUpdatingAgent.
  2. STI is spread between atoms along HebbianLinks, either by the ImportanceDiffusionAgent or the ImportanceSpreadingAgent.
  3. The HebbianLinkUpdatingAgent updates the HebbianLink truth values, based on whether linked atoms are in the Attentional Focus or not.
  4. The ForgettingAgent removes atoms whose LTI falls below a given threshold (or, depending on configuration, rises above it).
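The flow above can be sketched as a toy update cycle. All class names, functions, and constants here are illustrative stand-ins, not the real OpenCog API:

```python
# Toy sketch of one attention-allocation cycle (hypothetical names/values).

class Atom:
    def __init__(self, name):
        self.name = name
        self.sti = 0.0       # Short-Term Importance
        self.lti = 0.0       # Long-Term Importance
        self.stimulus = 0.0

def convert_stimulus(atoms, sti_rate=1.0, lti_rate=0.1):
    """Step 1b: convert accumulated stimulus into STI and LTI."""
    for a in atoms:
        a.sti += sti_rate * a.stimulus
        a.lti += lti_rate * a.stimulus
        a.stimulus = 0.0

def diffuse(hebbian_links, rate=0.2):
    """Step 2: spread a fraction of STI along (source, target, strength) links."""
    for src, dst, strength in hebbian_links:
        amount = rate * strength * src.sti
        src.sti -= amount
        dst.sti += amount

def forget(atoms, lti_threshold=1.0):
    """Step 4: drop atoms whose LTI is below the threshold."""
    return [a for a in atoms if a.lti >= lti_threshold]

a, b = Atom("A"), Atom("B")
a.stimulus = 10.0               # step 1a: a MindAgent rewards atom A
convert_stimulus([a, b])
diffuse([(a, b, 0.5)])          # A and B share a HebbianLink
atoms = forget([a, b])          # B never earned LTI, so it is forgotten
```

(Step 3, updating HebbianLink truth values from attentional-focus membership, is omitted for brevity.)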

Specification for future work

ECAN Parameter Tuning Workbench

To facilitate independent ECAN parameter tuning, we should set up an ECAN Parameter Tuning workbench. This workbench should consist of an ECAN Parameter Tuning agent whose only goal is to provide known, time-varying stimuli to sets of Atoms. For example, we could begin with two sets of Atoms, A and B. Set A Atoms could be initialized with mean importance mu_A and standard deviation sigma_A, and set B Atoms similarly initialized with mean importance mu_B and standard deviation sigma_B. As time progresses, the ECAN parameter tuning agent should begin to change the stimuli provided to these two sets of Atoms.

For example, suppose initially mu_A >> mu_B, but as time progresses these mean values change so that eventually mu_B >> mu_A. We would expect that initially many set A Atoms and few set B Atoms would appear in the attentional focus, but that as time moves forward, some set A Atoms begin to be replaced by Atoms from set B. To keep things simple, we should begin with two distinct "bands" of Atoms by setting, for example, mu_A - 2*sigma_A > mu_B + 2*sigma_B, so that initially there is little overlap between the two sets of Atoms. We also need to set the focus boundary and the time-varying stimuli in such a manner that we know the desired behavior of the Atoms, and can then vary the parameters until the desired behavior is achieved.
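A minimal sketch of this two-band setup, assuming hypothetical parameter values and a simple linear crossover schedule for the stimuli:

```python
import random

# Hypothetical workbench initialization: two "bands" of Atom importances
# plus a stimulus schedule that crosses over time. All values illustrative.

random.seed(0)

mu_A, sigma_A = 100.0, 5.0
mu_B, sigma_B = 50.0, 5.0
assert mu_A - 2 * sigma_A > mu_B + 2 * sigma_B   # bands start well separated

set_A = [random.gauss(mu_A, sigma_A) for _ in range(20)]
set_B = [random.gauss(mu_B, sigma_B) for _ in range(20)]

def stimulus_schedule(t, t_max=100):
    """Linearly swap the mean stimuli so that set B eventually dominates."""
    frac = t / t_max
    stim_A = (1 - frac) * 10.0 + frac * 1.0
    stim_B = (1 - frac) * 1.0 + frac * 10.0
    return stim_A, stim_B
```

Running the schedule from t = 0 to t = t_max and feeding the stimuli through the normal ECAN machinery should then show set B Atoms gradually displacing set A Atoms in the attentional focus.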

Even with an ECAN Parameter Tuning workbench, we will obviously still need to tune the parameters again once ECAN begins interacting with other agents. The hope is that the experiments carried out within the workbench will provide a baseline from which to better explore the parameter space.

ECAN Simplification Redesign Spring 2015

As part of the ECAN simplification process, we should update the updating equations using these New ECAN Equations.

MindAgent specific stimulus

Stimulus (the quantity with which MindAgents reward useful atoms) should be converted to short-term importance (STI) after each MindAgent runs.

This change of when conversion occurs is made because the amount of STI to reward per unit of stimulus depends on the amount of STI the MindAgent has to distribute. This prevents a MindAgent that isn't particularly effective (in achieving system goals) from inflating the importance of the atoms it uses, since they are not important to the overall OpenCog instance.

At the moment, the stimulus -> STI/LTI conversion is done by the ImportanceUpdatingAgent (along with collecting rent and taxation when necessary). For the above stimulus -> STI/LTI scheme, it would probably be better to make it a separate process.

I'm thinking:

  • Add an AttentionValue and a stimulus map to the MindAgent class.
  • Split the ImportanceUpdatingAgent into a RentCollectionAgent and a WageAgent. The WageAgent needs to be called after every other MindAgent has completed its cycle (assuming the MindAgent has distributed any stimulus or has STI to pay its atoms with).
  • Move the atom stimulus map from the AtomSpace to a separate map per MindAgent.
  • In the MindAgent base class, provide a method rewardAtoms() which calls the WageAgent. This method would also signal to the WageAgent who the caller was so that the correct stimulus map and amount of STI for reward could be worked out.
  • It's up to the MindAgent to figure out the appropriate time to reward the atoms it uses. Usually at the end of a run cycle.
  • This scheme would prevent us from batching/preprocessing the importance updates -- which would be left as one of the tasks of the WageAgent. Alternatively, we could let the server itself collect the stimulus/importance data from each agent and pass it to the Wage agent.
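In outline, the proposed wage scheme might look roughly like this. The class and method names mirror the bullets above (reward_atoms ~ rewardAtoms()), but the bodies and numbers are purely illustrative:

```python
# Sketch of the proposed per-MindAgent wage scheme (illustrative only).

class MindAgent:
    def __init__(self, name, sti_funds):
        self.name = name
        self.sti_funds = sti_funds    # AttentionValue held by this agent
        self.stimulus_map = {}        # per-agent atom -> stimulus map

    def stimulate(self, atom, amount):
        self.stimulus_map[atom] = self.stimulus_map.get(atom, 0.0) + amount

    def reward_atoms(self, wage_agent, atom_sti):
        """Called by the agent itself, typically at the end of a run cycle."""
        wage_agent.pay(self, atom_sti)

class WageAgent:
    def pay(self, agent, atom_sti):
        """Convert stimulus to STI, scaled by the calling agent's funds, so
        an ineffective (poor) agent cannot inflate its atoms' importance."""
        total = sum(agent.stimulus_map.values())
        if total == 0:
            return
        for atom, stim in agent.stimulus_map.items():
            atom_sti[atom] = atom_sti.get(atom, 0.0) + agent.sti_funds * stim / total
        agent.sti_funds = 0.0         # all funds paid out this cycle
        agent.stimulus_map.clear()

atom_sti = {}
wage = WageAgent()
rich = MindAgent("effective", 100.0)   # earned lots of STI from system goals
poor = MindAgent("ineffective", 5.0)
rich.stimulate("cat", 1.0)
poor.stimulate("dog", 1.0)
rich.reward_atoms(wage, atom_sti)
poor.reward_atoms(wage, atom_sti)
# equal stimulus, very different wages: "cat" ends up far more important
```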

Launchpad bug report.


Importance diffusion

  • Make it possible for importance to spread beyond a single Hebbian link in a single diffusion of importance, e.g. by implementing a Lévy flight for importance spread (use a Lévy distribution to work out the proportion of importance that gets spread beyond a single link). Note: depending on the density of HebbianLinks, this could be computationally impractical. Also, PLN may be used to do inference on HebbianLinks, which would make this redundant.
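A rough sketch of the idea, using a heavy-tailed Pareto sample as a stand-in for the Lévy distribution; the graph structure and all parameters are hypothetical:

```python
import random

# Sketch of multi-hop importance diffusion. random.paretovariate gives a
# heavy-tailed hop count standing in for the Lévy distribution mentioned
# above; the toy graph and parameters are illustrative.

def levy_hops(alpha=1.5, max_hops=5):
    """Sample how many HebbianLinks a packet of importance travels."""
    return min(int(random.paretovariate(alpha)), max_hops)

def spread(sti, neighbours, rate=0.2, alpha=1.5):
    """Diffuse `rate` of each atom's STI, letting each packet hop a
    heavy-tailed number of links rather than stopping after one."""
    delta = {atom: 0.0 for atom in sti}
    for atom in sti:
        packet = rate * sti[atom]
        delta[atom] -= packet
        current = atom
        for _ in range(levy_hops(alpha)):
            current = random.choice(neighbours[current])
        delta[current] += packet        # packet lands where the walk ends
    for atom in sti:
        sti[atom] += delta[atom]
    return sti

random.seed(1)
sti = {"a": 10.0, "b": 0.0, "c": 0.0}
neighbours = {"a": ["b"], "b": ["c"], "c": ["a"]}
spread(sti, neighbours)   # total STI is conserved, only redistributed
```

Note that this conserves total STI by construction; the computational concern above is that each packet now requires a random walk over the HebbianLink graph instead of a single link lookup.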

Rent updates

  • Gradually update rent when AtomSpace funds go out of homeostatic bounds. Use time since last rent change to work out the weighting combining the new value with the old.
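One possible shape for such a gradual update, following the bullet above; the homeostatic band, scaling rule, and halflife constant are all illustrative:

```python
# Sketch of gradual rent adjustment (hypothetical constants and names).

def target_rent(funds, min_funds, max_funds, base_rent):
    """Pick a target rent when AtomSpace funds leave the homeostatic band."""
    if funds < min_funds:    # funds too low: charge more rent
        return base_rent * min_funds / max(funds, 1.0)
    if funds > max_funds:    # funds too high: charge less rent
        return base_rent * max_funds / funds
    return base_rent

def update_rent(current_rent, new_rent, ticks_since_last_change, halflife=10.0):
    """Blend toward the new rent, weighting it more heavily the longer it
    has been since the last rent change."""
    w = 1.0 - 0.5 ** (ticks_since_last_change / halflife)
    return (1.0 - w) * current_rent + w * new_rent
```

With this weighting, a rent change applied immediately after a previous one has little effect, while one applied after a long quiet period moves the rent most of the way to the target.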

Ideas for simple ECAN tests

Here is a suggestion for a test case for ECAN dynamics that is “simple but not too simple” …

1) Consider a limited dictionary of N words (start with N=100 perhaps, then try bigger N)

2) Identify K words as “special” (say, a few peoples’ names: Bob, John, Jane, Amen, whatever…). Start with K=5 perhaps

3) Generate a series of fake “sentences”, each one of which is a series of (say) 2-10 WordInstanceNodes corresponding to words from the limited dictionary. Generate these sentences so that:

  • 3/4 of the sentences have NONE of the special words in them
  • 1/4 of the sentences have 2 of the special words in them, chosen at random

So then we have a situation where the special words tend to co-occur with each other

4) Feed a series of fake sentences, generated from the above, into the Atomspace (not using the NLP pipeline; just feed the Atoms in each sentence into the Atomspace). Don’t worry about word order.

5) When a sentence gets fed in, boost the STI and LTI of each Atom in the sentence…

6) Now, set the memory capacity of the Atomspace at M Atoms (where M>>N). So we want to trigger forgetting when the Atomspace gets bigger than M. In this experiment, this will cause the Atomspace to fill up with WordInstanceNodes….
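Steps 1-3 can be sketched as a small generator. The dictionary size, special-word count, and the (3/4, 1/4) split follow the description above; the seeding and data structures are illustrative:

```python
import random

# Sketch of the fake-sentence generator from steps 1-3 (illustrative).

random.seed(42)

N, K = 100, 5                        # dictionary size, number of special words
dictionary = [f"word{i}" for i in range(N)]
special = dictionary[:K]             # stand-ins for Bob, John, Jane, ...
ordinary = dictionary[K:]

def fake_sentence():
    """Return 2-10 words; 1/4 of sentences contain exactly two special words."""
    length = random.randint(2, 10)
    if random.random() < 0.25:       # 1/4: two special words, chosen at random
        words = random.sample(special, 2)
        words += random.choices(ordinary, k=max(length - 2, 0))
    else:                            # 3/4: no special words at all
        words = random.choices(ordinary, k=length)
    random.shuffle(words)            # word order is irrelevant (step 4)
    return words

sentences = [fake_sentence() for _ in range(1000)]
```

Feeding these sentences into the Atomspace and boosting STI/LTI per sentence (steps 4-5) then sets up the forgetting experiment in step 6.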


Soooo -- what we want to see happen here is:

  • 1) the special WordNodes don’t get forgotten, and special WordInstanceNodes are less likely to get forgotten than non-special WordInstanceNodes
  • 2) the AttentionalFocus generally stays filled up with the special WordNodes and recent special WordInstanceNodes
  • 3) We get HebbianLinks between the special WordNodes (since they co-occur)

Also, we can graph the changes of STI and LTI over time for

  • the special WordNodes
  • randomly selected special WordInstanceNodes
  • randomly selected non-special nodes

This should help us understand what’s going on…

We can also vary things a bit…

E.g. — Suppose we take a certain pair of special words, say Bob and Tom, and we change the distribution of fake sentences so that Bob and Tom never co-occur in the same sentence. Will we still get a HebbianLink between the WordNodes for Bob and Tom? I would guess we should, as they will both be in the AF at the same time. But it should have a lower strength than the HebbianLinks between other special WordNodes.

or — suppose we use (19/20, 1/20) instead of (3/4, 1/4) in the fake sentence generating model. Then what happens? Do the special words get kicked out of the AF sometimes? I would think so, depending on how the parameters are tuned…

See also