Rule of Choice
The "Rule of Choice" is a high-level PLN rule governing what to do when we encounter two different truth values associated with "the same Atom". This may occur, for instance, when loading an Atom from the backing store and finding that the same Atom already exists in the Atomspace with a different TV.
The rule is discussed in the original PLN book, and is modeled on a similar rule in the NARS inference system.
By "the same Atom" what is meant is "two nodes with the same type and name" or "two links with the same type and outgoing set." This is an informal usage. A more careful language would call these not "the same Atom" but "corresponding Atoms in two different Atom domains", where an Atom domain may -- in the present implementation -- be either an Atomspace object or a backing store.
Theory versus Current Code
As of Oct 2015, it would seem that the way the current code deals with Atom "merging" is conceptually wrong, though OK in many common cases...
The PLN design says that there should be a "Rule of Choice" that chooses what to do when there are two instances of the same logical Atom present...
So, according to this theoretical approach, one thing the Atomspace's addAtom method should do, when it finds it is adding an Atom-instance corresponding to an Atom-instance that already exists in the Atomspace, is invoke the Rule of Choice, not just reflexively trigger a TV revision (TV revision is only one option that the Rule of Choice might select).
Or as an alternate design: addAtom could reject attempts to add a new Atom-instance corresponding to any Atom-instance already in the Atomspace.... Then, before adding an Atom A to the Atomspace, the external code would need to check if any Atom-instance A1 corresponding to A already exists in the Atomspace, and deal with this situation however it wants. It could choose not to insert A, or it could choose to remove A1 and insert some new combination of A and A1, or it could choose to modify A1 according to A.
A compromise between these two approaches might be to have two methods, say addAtomStrict and addAtomWithRuleOfChoice. The former would reject attempts to add anything already there. The latter would be a convenience method that invokes a Rule of Choice associated with the Atomspace to handle different instances of the same logical Atom. One could also make it more like
addAtomWithRuleOfChoice( Atom A, RuleOfChoice R)
where the call to the addAtom... method explicitly specifies which choice rule to use in mediating possible conflicts.
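As a rough illustration of the compromise design, the following Python sketch models an Atomspace as a dictionary keyed by logical Atom. The class `ToyAtomSpace`, the method names `add_atom_strict` and `add_atom_with_rule_of_choice`, and the two example rules are all hypothetical; they are not the real Atomspace API, only one way the proposal above could look.

```python
from typing import Callable, Dict, Tuple

TV = Tuple[float, float]                 # (strength, confidence)
RuleOfChoice = Callable[[TV, TV], TV]    # (existing, incoming) -> TV to keep


class ToyAtomSpace:
    """Hypothetical stand-in for an Atomspace; not the real OpenCog API."""

    def __init__(self) -> None:
        self.atoms: Dict[str, TV] = {}

    def add_atom_strict(self, key: str, tv: TV) -> None:
        # Reject anything whose logical Atom is already present.
        if key in self.atoms:
            raise KeyError(f"{key} already exists in this Atomspace")
        self.atoms[key] = tv

    def add_atom_with_rule_of_choice(self, key: str, tv: TV, rule: RuleOfChoice) -> TV:
        # Let the caller-supplied rule mediate any conflict.
        if key in self.atoms:
            tv = rule(self.atoms[key], tv)
        self.atoms[key] = tv
        return tv


def keep_existing(existing: TV, incoming: TV) -> TV:
    return existing


def replace_with_incoming(existing: TV, incoming: TV) -> TV:
    return incoming


space = ToyAtomSpace()
space.add_atom_strict("Inheritance Ben crazy", (0.8, 0.6))
kept = space.add_atom_with_rule_of_choice(
    "Inheritance Ben crazy", (0.6, 0.2), keep_existing)
replaced = space.add_atom_with_rule_of_choice(
    "Inheritance Ben crazy", (0.6, 0.2), replace_with_incoming)
```

Passing the rule as an argument keeps the container itself free of policy, which is the point of the last variant above: different callers can mediate the same conflict differently.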
Terminology sometimes becomes confusing in discussions of situations involving multiple instances of the same logical Atom.
Let us use the term "Atom domain" to denote an Atomspace, a backing store, or any other container of Atoms.
Let us use the term "logical Atom" to indicate
- the set of nodes, in various Atom domains, with the same name and type
- the set of links, in various Atom domains, with the same type and target list
Let us use the term "Atom instance" to denote a specific instance of a logical Atom, existing within a specific Atom domain.
"Two versions of the same Atom instance" is meaningless; "two instances of the same logical Atom" makes perfect sense. Some of the confusion in the recent part of this thread has been due to not making this distinction clearly.
So, in this terminology,
- merging two Atom-instances corresponding to the same logical Atom is a general process that is not only about Atomspaces
- doing "belief revision" on two TruthValue objects should be done on the level of the TV object
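To make the terminology concrete, here is a small Python sketch. The class names `LogicalAtom` and `AtomInstance` and the string labels for Atom domains are invented for illustration and do not correspond to classes in the actual code base.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LogicalAtom:
    """Identity shared across Atom domains: type plus name (node) or targets (link)."""
    atom_type: str
    name: str = ""
    targets: Tuple["LogicalAtom", ...] = ()


@dataclass
class AtomInstance:
    """One instance of a logical Atom, living in one Atom domain."""
    logical: LogicalAtom
    domain: str          # e.g. "atomspace" or "backing store"
    strength: float
    confidence: float


ben = LogicalAtom("ConceptNode", name="Ben")
crazy = LogicalAtom("ConceptNode", name="crazy")

# Two Atom instances, built independently in two Atom domains.
in_atomspace = AtomInstance(
    LogicalAtom("InheritanceLink", targets=(ben, crazy)), "atomspace", 0.8, 0.6)
in_store = AtomInstance(
    LogicalAtom("InheritanceLink", targets=(ben, crazy)), "backing store", 0.6, 0.2)

# Same logical Atom (equal type and target list), but distinct instances.
same_logical_atom = in_atomspace.logical == in_store.logical
same_instance = in_atomspace is in_store
```

Here `same_logical_atom` is true while `same_instance` is false, which is exactly the distinction between "two instances of the same logical Atom" and "the same Atom instance".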
Right now (Oct. 2015),
- the Atom-instance merger process is invoked via addAtom, without consideration of other "Rule of Choice" options
- the belief revision process is done via a special method associated with each type of TV object
Let's look at a specific example.
Inheritance Ben crazy <.8 , .6>
is already in the Atomspace.
Then consider two cases. In the first case,
Inheritance Ben crazy <.6, .2>
is loaded from the backing store.
What this probably means is that, since the last sync with the backing store, the Atomspace has been the locus of additional reasoning or learning about (Inheritance Ben crazy) -- specifically there has been reasoning/learning worth .6 confidence, as opposed to the previous reasoning/learning worth only .2 confidence.
So the likely right step is to do a weighted average, and assign
Inheritance Ben crazy <s, c>
s = (.8*.6 + .6*.2)/(.6+.2) = .75
c = .6 + f*.2
where f is a parameter in [0,1] representing an estimate of how much the evidence underlying the two strength estimates overlaps. This is what the code does now for simple truth values (Oct. 2015).
There are fancier revision rules one can use also. But the weighted average is a start...
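The arithmetic of the weighted average can be checked with a short Python sketch. The function name `revise`, the default value of f, the clamp at 1.0, and the zero-confidence guard are assumptions made for this example. The confidence line also assumes that the base term is the Atomspace instance's confidence and that f scales the incoming confidence; it is not taken from the actual SimpleTruthValue code.

```python
from typing import Tuple

TV = Tuple[float, float]  # (strength, confidence)


def revise(existing: TV, incoming: TV, f: float = 0.5) -> TV:
    """Confidence-weighted average of strengths (illustrative sketch only)."""
    s1, c1 = existing
    s2, c2 = incoming
    if c1 + c2 == 0.0:
        # No evidence on either side: nothing to average (assumed behaviour).
        return s1, 0.0
    s = (s1 * c1 + s2 * c2) / (c1 + c2)
    # Assumed reading of the confidence formula, clamped to stay in [0, 1].
    c = min(1.0, c1 + f * c2)
    return s, c


# Atomspace holds <.8, .6>; the backing store supplies <.6, .2>.
s, c = revise((0.8, 0.6), (0.6, 0.2), f=0.5)
# s is about 0.75 and, with f = 0.5, c is about 0.7
```

Because the weights are the confidences, the result stays much closer to the Atomspace's strength of .8 than to the backing store's .6.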
In the second case,
Inheritance Ben crazy <.1, .85>
is loaded from the backing store.
Ok, now this is confusing. The backing store and the Atomspace have two very different estimates of the strength, and both are reasonably confident. It may not be a good idea to just merge them and ignore the conflict.
There is a PLN solution which would require more advanced truth values than we currently have in the system, which would be to merge the two into
Inheritance Ben crazy <D,c>
where D is not a single strength number, but rather a probability distribution (e.g. a beta distribution with appropriate parameters). In this case we'd perhaps want a bimodal distribution. It's possible the backing store was formed only from cases where Ben was not at all crazy, whereas the Atomspace was formed only from cases where Ben was very crazy...
Another solution would be to flag the situation as something that needs to be thought about more. So, one would put both possibilities in the Atomspace, and then tell some inference agent to think about them.
Representationally, in this case, one could enter the version from the BackingStore as something like
EmbeddedTruthValueLink <.1, .85>
    ConceptNode "Prior Knowledge 15"
    InheritanceLink Ben crazy
and if we want we could add extra annotation like
EvaluationLink
    PredicateNode "source"
    ConceptNode "Prior Knowledge 15"
    ConceptNode "Postgres Backing Store 5555"
InheritanceLink
    ConceptNode "Postgres Backing Store 5555"
    ConceptNode "Postgres Backing store"
AtTimeLink
    ConceptNode "Postgres Backing Store 5555"
    TimeNode "5/5/15 12:10:11"
Note that this representation is similar to the one used for contextual knowledge in the "An Example of Moderately Complex Semantic Embedding" section of the Claims and contexts page.
We could also boost the ShortTermImportance value of this EmbeddedTruthValueLink -- which in a system with STI-guided inference, would tell the system to think about this link, i.e. to try to understand why it has the truth value it does.
Currently PLN has no rules for dealing with EmbeddedTruthValueLinks, so this wouldn't work. Such rules need to be created.
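The "flag it for later" option could be sketched in Python roughly as follows. The function names, the thresholds `max_gap` and `min_conf`, and the tuple used to stand in for an EmbeddedTruthValueLink are all hypothetical; nothing here reflects existing PLN or Atomspace code.

```python
from typing import List, Tuple

TV = Tuple[float, float]  # (strength, confidence)


def is_conflict(a: TV, b: TV, max_gap: float = 0.3, min_conf: float = 0.5) -> bool:
    """True when strengths disagree strongly and both sides are fairly confident."""
    return abs(a[0] - b[0]) > max_gap and min(a[1], b[1]) >= min_conf


def handle_loaded_tv(atom: str, existing: TV, incoming: TV,
                     source: str, to_think_about: List[tuple]) -> str:
    """Decide between plain revision and flagging the pair for later inference."""
    if is_conflict(existing, incoming):
        # Keep the Atomspace TV untouched and park the incoming one,
        # tagged with its source, for some inference agent to examine.
        to_think_about.append(("EmbeddedTruthValueLink", incoming, source, atom))
        return "flagged"
    return "revise"


pending: List[tuple] = []
first = handle_loaded_tv("Inheritance Ben crazy", (0.8, 0.6), (0.6, 0.2),
                         "Prior Knowledge 15", pending)
second = handle_loaded_tv("Inheritance Ben crazy", (0.8, 0.6), (0.1, 0.85),
                          "Prior Knowledge 15", pending)
```

With these assumed thresholds, the first case from the example is simply revised, while the second case is flagged and leaves one entry in `pending`.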
Some additional examples suggested by Linas in an email thread:
The problem is that sometimes TV revision needs to be idempotent, and sometimes not.
For example: Suppose that, due to some choice of algorithm, the same atom keeps getting added over and over. Should the TV change or stay the same?
Perhaps the algorithm is some sort of recursive search and it's inserting the same atom over and over simply because that is the easiest way to design the algo. In this case, the TV should never change.
In other cases, perhaps the atom is being added over and over because some event is being witnessed over and over, in which case the count on the atom should be incremented each time.
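The difference between the two behaviours can be shown with a minimal Python sketch in which a plain dictionary of counts stands in for the Atomspace. The function names `add_idempotent` and `add_observation` are hypothetical.

```python
from typing import Dict


def add_idempotent(space: Dict[str, int], key: str) -> None:
    # Re-adding the same atom leaves its count untouched.
    space.setdefault(key, 1)


def add_observation(space: Dict[str, int], key: str) -> None:
    # Each add records one more witnessed event.
    space[key] = space.get(key, 0) + 1


search_space: Dict[str, int] = {}
event_space: Dict[str, int] = {}
for _ in range(3):
    add_idempotent(search_space, "Inheritance Ben crazy")
    add_observation(event_space, "Inheritance Ben crazy")

search_count = search_space["Inheritance Ben crazy"]   # stays at 1
event_count = event_space["Inheritance Ben crazy"]     # grows to 3
```

The same sequence of three adds produces different results, so the container cannot pick the right behaviour without knowing what the caller intends.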
In the backing-store case, there are multiple possibilities:
A) this atomspace is holding a copy of an atom whose TV is simply 'obsolete'. It's 'obsolete' because PLN ran on some other remote machine, and updated the TV and pushed the TV into the database. When fetching this atom, the correct behavior is to replace the 'obsolete' value with the freshly-created value.
B) this atomspace is holding a copy of an atom whose TV was recently recomputed by PLN. However, just before this new TV is stored in the database, some other thread (completely unrelated to PLN) fetches a copy of that atom from the DB, along with the old, 'obsolete' TV that the DB holds. Should that old, obsolete TV be merged into the newly-created TV? Almost surely not.
C) Re-imagine cases A) and B) above, but replace 'PLN' by "counting of observed events", and imagine that the database holds a count of "20" and that, locally, we saw the event once, so the new count should be "21". If the merge is done wrong, you might end up with 20+21=41 instead.
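Case C can be worked through numerically. In the Python sketch below, `naive_merge` reproduces the double-counting mistake, and `delta_merge` is one possible remedy that assumes the Atomspace remembers the count it last read from the store (`base_count`). That bookkeeping is an assumption of this example, not something the current code is known to do.

```python
def naive_merge(local_count: int, stored_count: int) -> int:
    # Treats both totals as independent evidence, so shared history is counted twice.
    return local_count + stored_count


def delta_merge(local_count: int, stored_count: int, base_count: int) -> int:
    # Adds only what was observed locally since the last sync with the store.
    return stored_count + (local_count - base_count)


# The store holds 20; locally we started from 20 and saw the event once more.
wrong = naive_merge(21, 20)          # 41
right = delta_merge(21, 20, 20)      # 21

# If another machine pushed the stored count up to 25 in the meantime,
# the single local observation is still added exactly once.
concurrent = delta_merge(21, 25, 20)  # 26
```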
You don't even need the backing store to get this kind of craziness. Imagine you had some algo Z that computed TVs in some way. This algo can do one of two things:
E) if it has a pointer to the atom, it can just change the TV directly.
F) if it does not have a pointer to the atom, it can create a new atom, set the TV on it, and insert it into the atomspace.
Case E) seems unambiguous.
Case F) -- if the atom already exists in the atomspace, then should there be a merge, or should it be a replace? Don't you think that depends on the algorithm Z? Different algos may want/expect different behaviors.