OpenCogPrime:PLNBookErrata
The PLN book really has more errors than it should, and sometime in early 2009 we plan to fix the manuscript and send the revised version to the publisher for future printings.
Entering errata you find here will be much appreciated.
Relatively Substantial Errors
page 73-74
As noted by Jeremy Zucker:
I was working through the Heuristic Derivation of the Independence Assumption-based deduction rule ..., and found 3 typos. They don't affect the main results ...
On pages 73 and 74 of the derivation, you replaced B with U-B and cranked through the algebra:
P((C\intersect A)|(U-B)) = P(A|(U-B)) P(C|(U-B)) = |A\intersect (U-B)||C\intersect(U-B)|/|U-B|^2
so far, so good, but in the next step, you replace |C\intersect (U-B)| with [P(C)-P(A\intersect B)] instead of [P(C)-P(C\intersect B)]. It is no big deal, because in the subsequent step, you go back to [P(C)-P(C\intersect B)].
However, later on in the derivation, when you expanded P(C|A) = P(C\intersect A)/P(A) you divided the second term by P(A) incorrectly, resulting in the expression
= P(A|B)P(C|B)P(B)/P(A) + [1-P(A\intersect B)][P(C) -P(C\intersect B)]/(1-P(B))
instead of:
= P(A|B)P(C|B)P(B)/P(A) + [1-P(A\intersect B)/P(A)][P(C) -P(C\intersect B)]/(1-P(B))
Fortunately, in the next step, you recover by replacing [1-P(A \intersect B)] with [1-P(B|A)], so the two mistakes cancel each other out.
Now you would be home free here, except that at the end, when you translate the final equation
P(B|A)P(C|B) + [1-P(B|A)][P(C) - P(B)P(C|B)]/(1-P(B))
into:
s_AC = s_AB s_BC + (1-s_AB)(s_C - s_C s_BC)/(1-s_B)
you claim that this "is the formula mentioned above", when it is actually:
s_AC = s_AB s_BC + (1-s_AB)(s_C - s_B s_BC)/(1-s_B)
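As a quick sanity check on the erratum above, here is a minimal Python sketch of the two versions of the final formula. The function names are made up for illustration; the arguments stand for s_AB = P(B|A), s_BC = P(C|B), s_B = P(B) and s_C = P(C). The check assumes that when B and C are independent (s_BC = s_C), the deduction rule should simply return s_C, which the corrected formula does and the misprinted one does not.

```python
def deduction_strength(s_ab, s_bc, s_b, s_c):
    """Corrected formula: s_AC = s_AB s_BC + (1 - s_AB)(s_C - s_B s_BC)/(1 - s_B)."""
    if not 0.0 <= s_b < 1.0:
        raise ValueError("s_b must lie in [0, 1) to avoid division by zero")
    return s_ab * s_bc + (1.0 - s_ab) * (s_c - s_b * s_bc) / (1.0 - s_b)


def deduction_strength_misprinted(s_ab, s_bc, s_b, s_c):
    """Misprinted variant, with s_C s_BC in place of s_B s_BC."""
    if not 0.0 <= s_b < 1.0:
        raise ValueError("s_b must lie in [0, 1) to avoid division by zero")
    return s_ab * s_bc + (1.0 - s_ab) * (s_c - s_c * s_bc) / (1.0 - s_b)


# Hypothetical inputs in which C is independent of B, i.e. P(C|B) = P(C) = 0.4.
corrected = deduction_strength(0.3, 0.4, 0.2, 0.4)
misprinted = deduction_strength_misprinted(0.3, 0.4, 0.2, 0.4)
print(corrected, misprinted)
```

With these inputs the corrected formula gives back s_C (up to floating-point rounding), while the misprinted one gives a smaller value.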
page 233
The formulas describing the semantics of the Context relationship, given at the bottom of the page, only hold when R is a form of inheritance/implication or similarity/equivalence relationship; they don't hold for relationships R generally.
page 255-256
This is an IMPORTANT one...
The text says
"Formally, we may introduce the ASSOC operator, defined as
ExtensionalEquivalence
  Member $X (ExOut ASSOC $C)
  AND
    ExtensionalImplication
      Subset $Y $E
      Subset $Y $C
    NOT
      ExtensionalImplication
        Subset $Y $E
        NOT [Subset $Y $C]"
but it should actually be
ExtensionalEquivalence
  Member $E (ExOut ASSOC $C)
  ExOut
    Func
    List
      Inheritance $E $C
      Inheritance
        NOT $E
        $C
where
Func(x,y) = [x-y]^+
and ^+ denotes the positive part ...
and where if $E and $C are relationships, Inheritance is replaced with Implication.
This one was a bad cut and paste error that somehow crept into the final version from a very old version of the manuscript ;-(
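To make the corrected definition concrete, here is a minimal Python sketch of the membership degree of E in ASSOC(C), computed as Func applied to the strengths of the two Inheritance relationships. The function names and the example strengths are hypothetical; only Func(x,y) = [x-y]^+ comes from the text above.

```python
def positive_part(x):
    """[x]^+ : x if x is positive, otherwise 0."""
    return max(0.0, x)


def assoc_membership(s_e_c, s_not_e_c):
    """Degree to which E belongs to ASSOC(C).

    s_e_c     -- strength of  Inheritance $E $C
    s_not_e_c -- strength of  Inheritance (NOT $E) $C
    """
    return positive_part(s_e_c - s_not_e_c)


# E is strongly associated with C: C is much more likely given E than given NOT E.
print(assoc_membership(0.8, 0.1))
# E is negatively associated with C: the positive part clips the degree to 0.
print(assoc_membership(0.1, 0.8))
```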
Note that a better notation is
AttractionLink E C
which is introduced at
http://www.opencog.org/wiki/OpenCogPrime:PredictiveAttraction
Generally speaking, the PLN book is unacceptably fuzzy on the interpretation of ASSOC in the initial discussion, and the formal definition that is supposed to clarify it is itself wrong, as noted above ...
The best way to interpret
P(cat | E)
in the discussion in this section would be as
P(Situation S involving cat | Situation S involving E)
This is how we actually did it when using intensional inference of this sort for NL corpus analysis (work that Izabela and Ben Goertzel did in 2005); for instance, we looked at
P(sentence involving word "cat" | sentence involving word "fur")
So then, the intensional similarity of two words was basically a measure of whether the two words tended to co-occur-in-sentences with the same other words...
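The sentence-level interpretation can be illustrated with a small Python sketch. This is not the code used in the 2005 work; the toy corpus, the representation of sentences as sets of words, and the function name are all assumptions made for illustration.

```python
def cond_prob(sentences, target, context, negate=False):
    """Estimate P(sentence involves `target` | sentence involves `context`).

    With negate=True the condition becomes "sentence does NOT involve `context`".
    Each sentence is represented as a set of words.
    """
    pool = [s for s in sentences if (context in s) != negate]
    if not pool:
        return 0.0
    return sum(1 for s in pool if target in s) / len(pool)


# Hypothetical toy corpus.
corpus = [
    {"the", "cat", "has", "soft", "fur"},
    {"cat", "fur", "sheds"},
    {"the", "dog", "has", "fur"},
    {"the", "cat", "sat"},
    {"rain", "fell"},
    {"the", "dog", "barked"},
]

p_cat_given_fur = cond_prob(corpus, "cat", "fur")
p_cat_given_not_fur = cond_prob(corpus, "cat", "fur", negate=True)

# Degree to which "fur" belongs to the association set of "cat", using [x - y]^+.
assoc_fur_cat = max(0.0, p_cat_given_fur - p_cat_given_not_fur)
print(p_cat_given_fur, p_cat_given_not_fur, assoc_fur_cat)
```

Two words would then count as intensionally similar to the extent that the same other words get high association degrees with both of them.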
I think there was a more recent write-up of this stuff, but somehow what got into the book finally was something older and less clear, with some errors ... I look forward to correcting this in subsequent printings ;-p
Furthermore, Nil Geisweiller has suggested a modified approach which may be better in practice.
In the modified approach, we would replace
ExtensionalEquivalence
  Member $E (ExOut ASSOC $C)
  ExOut
    Func
    List
      Inheritance $E $C
      Inheritance
        NOT $E
        $C
with
ExtensionalEquivalence
  Member $E (ExOut ASSOC_int $C)
  ExOut
    Func
    List
      IntensionalInheritance $E $C
      IntensionalInheritance
        NOT $E
        $C
and
ExtensionalEquivalence
  Member $E (ExOut ASSOC_ext $C)
  ExOut
    Func
    List
      ExtensionalInheritance $E $C
      ExtensionalInheritance
        NOT $E
        $C
and then define
ASSOC = ASSOC_int OR ASSOC_ext
where OR is an appropriate fuzzy OR (most simply, fuzzy OR is just max).
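A minimal Python sketch of this combination step, assuming the two membership degrees have already been computed. The max version is the one named above; the probabilistic-sum version is a standard alternative fuzzy OR, included only for comparison and not something proposed in the text. All function names are hypothetical.

```python
def fuzzy_or_max(a, b):
    """Simplest fuzzy OR: the maximum of the two degrees."""
    return max(a, b)


def fuzzy_or_prob_sum(a, b):
    """Alternative fuzzy OR (probabilistic sum): a + b - a*b."""
    return a + b - a * b


def assoc_combined(assoc_int, assoc_ext, fuzzy_or=fuzzy_or_max):
    """ASSOC = ASSOC_int OR ASSOC_ext for a single entity E and concept C."""
    return fuzzy_or(assoc_int, assoc_ext)


# Hypothetical degrees: E is weakly associated intensionally, strongly extensionally.
print(assoc_combined(0.25, 0.5))
print(assoc_combined(0.25, 0.5, fuzzy_or=fuzzy_or_prob_sum))
```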
Typos and Formatting Errors and Such
from Charles Griffiths
p 6, rules 3 & 4, problem with < vs <=
p 16/17 repeated line at page boundary
p 17 "And there is a variety of probabilistic approaches..." (there are)
p 19 "there remain many similarities." (many similarities remain.)
p 26 "Chapter 10 The" (Chapter 10. The)
p 50 why do you mix [L, U] and <(L, U], ...> ? p 14 notation is <L, U, ...> and p 57 has <[L, U], ...>
p 80 "I balls in it, I black ones" (N balls in it, b black ones)
p 140 "D2 is the second-order distribution for D2." (for premise 2.)
p 164 red d
p 271 conbinations
page 25
In Section 2.2, the definition of Context is given as follows:
Context
C
R A B <t>
is simply
R (A AND C) (B AND C) <t>
I think this is highly ambiguous; a better representation would be:
Context <t>
C
R A B
is simply
R (A AND C) (B AND C) <t>
page 28
- Relationships representing symmetrized higher-order conditional probabilities:
- ExtensionalEquivalence
- Equivalence (mixed)
- IntensionalInheritance <== should be IntensionalEquivalence
page 108
from Kaj Sotala:
In the calculation of the heuristic formula for G(A,B,C), there is the equation
P_subsume(A,B) = min[0, ([P(A) - P(B)] / [P(A) + P(B)])]
Since this formula would return negative probabilities in cases where P(B) > P(A), and a probability of 0 otherwise, I assume that the "min" should be "max"...
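The behaviour Kaj describes is easy to confirm with a small Python sketch comparing the printed "min" form with the suggested "max" form. The function names and input values are made up for illustration.

```python
def p_subsume_printed(p_a, p_b):
    """Formula as printed: min[0, (P(A) - P(B)) / (P(A) + P(B))]."""
    return min(0.0, (p_a - p_b) / (p_a + p_b))


def p_subsume_suggested(p_a, p_b):
    """Suggested correction: max[0, (P(A) - P(B)) / (P(A) + P(B))]."""
    return max(0.0, (p_a - p_b) / (p_a + p_b))


# P(B) > P(A): the printed form goes negative, the suggested form gives 0.
print(p_subsume_printed(0.2, 0.6), p_subsume_suggested(0.2, 0.6))
# P(A) > P(B): the printed form is stuck at 0, the suggested form is positive.
print(p_subsume_printed(0.6, 0.2), p_subsume_suggested(0.6, 0.2))
```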
page 110
P(B) = P(B|A) + P(B|¬A) P(¬A)
P(A) is missing after P(B|A); the formula should read P(B) = P(B|A) P(A) + P(B|¬A) P(¬A)
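A short numeric check in Python, using exact fractions and hypothetical probabilities, shows why the missing factor matters: without P(A), the right-hand side can exceed 1.

```python
from fractions import Fraction

# Hypothetical values chosen for illustration.
p_a = Fraction(1, 4)
p_b_given_a = Fraction(4, 5)
p_b_given_not_a = Fraction(2, 5)

# Law of total probability, with the P(A) factor in place.
with_p_a = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Formula as printed on page 110, with P(A) missing.
as_printed = p_b_given_a + p_b_given_not_a * (1 - p_a)

print(with_p_a)    # 1/2
print(as_printed)  # 11/10
```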
page 212
In Section 10.6.2.1 "Boolean Operators for Combining Terms", in the subsection "extensional union", the relationship
Equivalence
Subset x (A ANDExt B)
(Subset x A) AND (Subset x B)
is given.
Actually it seems this is best interpreted as a heuristic equivalence, which holds on average and doesn't have strength 1.
In the multiset approach described later on this errata page, we have
P(A & B) = P( mult(A & B) )
P(A) = P( mult(A) )
It follows that
P(X | A & B) = P(X|A) P(X|B)
if we make two big assumptions:
1) A and B are independent
2) A and B are independent in X [meaning A&X and B&X are independent]
[note, this is "if", not "iff" ... the equation could still hold if the dependencies exist but go in opposite directions... e.g. if P(A&B) > P(A)P(B) but P(A&B&X) < P(A&X)P(B&X) ]
So, in the multiset based approach, that equivalence relationship is a heuristic which is "true on average" in the above sense, rather than a strength-1 equivalence...
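The step from the two assumptions to the product formula is P(X | A&B) = P(A&B&X) / P(A&B) = P(A&X) P(B&X) / (P(A) P(B)) = P(X|A) P(X|B). The Python sketch below checks this on a hand-built joint distribution over three events. The numbers are hypothetical and were chosen only so that both assumptions hold exactly; the helper `prob` is an illustrative name, not part of any PLN code.

```python
from fractions import Fraction as F

# Joint distribution over truth assignments (a, b, x); hypothetical numbers.
joint = {
    (True, True, True): F(1, 16),
    (True, False, True): F(3, 16),
    (False, True, True): F(3, 16),
    (False, False, True): F(1, 16),
    (True, True, False): F(3, 16),
    (True, False, False): F(1, 16),
    (False, True, False): F(1, 16),
    (False, False, False): F(3, 16),
}


def prob(a=None, b=None, x=None):
    """Probability of the given constraints; None leaves an event unconstrained."""
    return sum(
        p
        for (va, vb, vx), p in joint.items()
        if (a is None or va == a) and (b is None or vb == b) and (x is None or vx == x)
    )


# Assumption 1: A and B are independent.
assert prob(a=True, b=True) == prob(a=True) * prob(b=True)
# Assumption 2: A&X and B&X are independent.
assert prob(a=True, b=True, x=True) == prob(a=True, x=True) * prob(b=True, x=True)

lhs = prob(a=True, b=True, x=True) / prob(a=True, b=True)  # P(X | A & B)
rhs = (prob(a=True, x=True) / prob(a=True)) * (prob(b=True, x=True) / prob(b=True))
print(lhs, rhs)  # 1/4 1/4
```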
page 252
" Or one can take a probabilistic approach to defining complexity by introducing some reference process H, and defining
c(F;T) = ASSOC(G,H;T) "
G should be F.
page 257
In the following PLN formula
Intensional Inheritance A B
Intensional Inheritance => IntensionalInheritance
page 259
Top of the page, there is a parenthesis problem in :
s = (OR(X,Y).tv.s
page 265-266
Last sentence on page 265, "...until it finds at has a conclusion of a form ..." should be "...until it finds that it has a conclusion of a form ..."
page 267
First sentence, "... some set of known pre to the given target predicate.", should probably be, "some set of known premises to the given target predicate."
page 287
at the top of the page:
SS_Initiation(showering_event_43) => SS_InitiatedAt(showering_event_43)
SS_Initiation(shaving_event_33) => SS_InitiatedAt(shaving_event_33)
at the bottom of the page:
SS_Initiation(B) => SS_InitiatedAt(B)
SS_Initiation(A) => SS_InitiatedAt(A)
the latter is actually at the top of the next page.
index
Many index terms are underlined or shown in Courier font ... index terms really should all be in the same font without underlining
Clarifications and Comments (that aren't errata)
The Nature of Causality
The position taken in the PLN book (and elaborated more in The Hidden Pattern, from a philosophical perspective) is that causality is not an elegant, mathematical thing but rather a part of human "folk psychology" ... part of the way we humans intuitively understand the world ... that is a mixture of different factors. See http://goertzel.org/PLN_causality.pdf for an earlier version of the text from the book....
One psychologically important aspect of causality was left out of the discussion in the book, namely the relationship between causality and action. This factor was mentioned in The Hidden Pattern, but in a less formal way.
The most direct kind of causal relationship perceived by an organism is one that involves the organism's own actions, i.e.
PredictiveImplication I do X Y happens
or more formally
L := PredictiveImplication Execution X Y
Given another implication
M := PredictiveImplication A B
the mind may then assess M as causal if M derives a lot of positive evidence via inference based on L.
In general, a mind may assess a predictive implication relationship as causal if it derives a lot of positive evidence for this implication's truth value from predictive implications whose sources are ExecutionLinks.
The concept here is that: we think A causes B if we can imagine ourselves in a position where we enact A and this results in B happening.
Causation is thus getting reduced to the "feeling of free will."
I'm not positing this as the only, true or ultimate explanation of causality -- but I think it's a significant aspect of the mix of fuzzy intuitions that underlie our folk psychology notion of causality (and an aspect that was left out in the PLN book's discussion, though mentioned in The Hidden Pattern in a less formal way).
Relation between predicate and term logic representations of the same relationships
The same statement can be expressed in both term logic and predicate logic, e.g. "all ravens are black", "Joe is a raven".
Term logic:
raven -> black
Joe -> raven
InheritanceLink
___ ConceptNode: raven
___ ConceptNode: black
InheritanceLink
___ SemeNode: Joe
___ ConceptNode: raven
Predicate logic:
raven(X) -> black(X)
raven(Joe)
To avoid confusion let's write the latter as
isRaven(X) ==> isBlack(X)
isRaven(Joe)
which would be
VariableScopeLink $X
__ImplicationLink
_____ EvaluationLink
________ PredicateNode: isRaven
________ VariableNode: $X
_____ EvaluationLink
________ PredicateNode: isBlack
________ VariableNode: $X
EvaluationLink
___ PredicateNode: isRaven
___ SemeNode: Joe
But then we have that
ForAll $X
__ Equivalence
_____ EvaluationLink
________ PredicateNode: isRaven
________ VariableNode: $X
_____ MemberLink
________ VariableNode: $X
________ SatisfyingSetLink
___________ PredicateNode: isRaven
So the predicate-argument relation is equivalent to a fuzzy membership relation.
So, the relation between predicate and term logic relationships in PLN boils down to the relation between fuzzy membership and probabilistic inheritance relationships.
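The equivalence between evaluating a predicate and fuzzy membership in its satisfying set can be illustrated with a tiny Python sketch. This is only an analogy in ordinary code, not OpenCog's Atom API; the predicate, the entities other than Joe, and the degrees are made up.

```python
def is_raven(entity):
    """Hypothetical fuzzy predicate: degree to which `entity` is a raven."""
    degrees = {"Joe": 1.0, "Tweety": 0.25}
    return degrees.get(entity, 0.0)


def satisfying_set(predicate, entities):
    """Fuzzy set induced by a predicate: maps each entity to its membership degree."""
    return {entity: predicate(entity) for entity in entities}


entities = ["Joe", "Tweety", "Rex"]
ravens = satisfying_set(is_raven, entities)

# For every entity, the predicate's value equals its membership degree in the
# satisfying set, mirroring the ForAll / Equivalence relationship above.
for entity in entities:
    assert is_raven(entity) == ravens[entity]
print(ravens)
```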
About the semantics of SubsetLinks between fuzzy sets
The semantics of the equation at the top of page 29, in section 2.4.1.1, is not adequately clear ... and some heuristic formulas are given there without a clear explanation that they are in fact just heuristics that could be replaced by precise probabilistic formulas.
The following paper explains how to replace those heuristics with something that has a precise probabilistic foundation but is slightly complicated:
http://goertzel.org/MyPapers/FuzzyProbabilistic.pdf
About PLN and NARS
Someone asked Ben Goertzel:
NARS has been reimplemented by Pei four times in four different languages. Since it predates PLN, I wonder, were there ever any plans to just basically change the "truth value" formulas in one of the implementations, making it more "probabilistic"?
He answered...
That is how PLN began, when Pei was working for me at Webmind Inc. for 3 years in the 1990s. Jeff Pressing and I invented PLN originally as a "probabilistic version of NARS". We had a software system then that could do either PLN or NARS reasoning (using a subset of the rules and formulas of each) depending on a parameter setting.
But it eventually became apparent that you can't really do things that way. It's not just about the truth value formulas, it's about the underlying semantics, which are very different for NARS than for any probabilistic reasoning system. The meaning of an inferred frequency value for an inheritance relationship is just very different in NARS and in PLN, and this leads to all sorts of other differences.
For instance, in PLN one can derive higher-order inference rules [using "higher-order" in the NARS sense] from first-order rules... due to the probabilistic semantics. In NARS you can't. In PLN one can attach semantics to variable expressions with unbound variables, using "mean value" semantics. In NARS you can't... The way intension vs. extension is handled in NARS doesn't make sense once you introduce probabilistic truth values ... so if you want to talk about intension separately from extension, you need to do something else (the approach taken in the PLN book being one route...) etc.
Basically, the deeper you get, the more it becomes clear that you can't just paste out NARS truth value formulas and paste in probabilistic ones, even though PLN and NARS do wind up with a lot of similarities...
Intensional Node Probabilities
Nil suggested the following:
In the PLN book it is said that:
A <w>
means
SubSet Universe A <tv>
I'm wondering if we could consider the intensional and mixed inheritance as well, that is:
A <w>
means
Inheritance Universe A <w>
or even
IntensionalInheritance Universe A <w>
Ben said:
Yes, if we look at
ExtensionalEquivalence
  Member $E (ExOut ASSOC $C)
  ExOut
    Func
    List
      Inheritance $E $C
      Inheritance
        NOT $E
        $C
where $C is the universe, then we get
Member $E (ExOut ASSOC $C) = [E.tv - (NOT E).tv]^+ = 2 E.tv -1
so that where
IntensionalInheritance Universe A <w>
then w is a normalized sum of terms of the form
(Member E (ExOut ASSOC A)) * (2 E.tv -1)
so it's pretty much a reweighting of the set A_ASSOC, right?
So what we find is that the intensional node probability of A is a measure of how much probability mass is concentrated in A's association-set.
Of course, E.tv may also be calculated intensionally, extensionally or mixedly in the above...
When doing purely intensional inference, presumably one would use purely intensional node probabilities
When doing mixed inference, one may want to use mixed node probabilities
One of the lessons I learned from Pei is that human commonsense inference mixes up intension and extension in complex ways...
ben
Calculating the truth values of unquantified variable expressions
In the book it says
"The VariableScope link is a kind of "average quantifier": the truth value of
VariableScopeLink $X
F($X)
is defined as the weighted average of the truth value of F($X), i.e. as the sum
w($x) F($x) / normalizer
where w($x) is defined as the truth value of $x in the system."
It seems useful to elaborate a bit on the normalizer.
A simple approach is
normalizer = Sum{ w($x) ; $x goes over all entities in the system's memory that match F's input type restrictions}
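A minimal Python sketch of this weighted average with the simple normalizer follows. The function name, the dictionary-based weights and the type-check callback are assumptions for illustration, not OpenCog's actual data structures.

```python
def variable_scope_strength(atoms, w, f, matches_input_type):
    """Weighted average of f over the atoms that satisfy f's input type restrictions.

    atoms              -- iterable of entities in the system's memory
    w                  -- function giving the truth value (weight) of an entity
    f                  -- function giving the truth value of F applied to an entity
    matches_input_type -- predicate selecting entities that F can be applied to
    """
    candidates = [x for x in atoms if matches_input_type(x)]
    normalizer = sum(w(x) for x in candidates)
    if normalizer == 0:
        raise ValueError("no weighted entities match the input type restrictions")
    return sum(w(x) * f(x) for x in candidates) / normalizer


# Hypothetical memory: three concepts and one entity of another type.
weights = {"cat": 0.5, "dog": 0.25, "rock": 0.25, "eats": 0.5}
f_values = {"cat": 1.0, "dog": 1.0, "rock": 0.0}
concepts = {"cat", "dog", "rock"}

strength = variable_scope_strength(
    weights.keys(),
    w=weights.get,
    f=f_values.get,
    matches_input_type=lambda x: x in concepts,
)
print(strength)  # 0.75
```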
There is a lot of formalism in the OpenCogPrime wikibook about input and output type restrictions of predicates.
We can make distinctions similar to those in a programming language, distinguishing
Concept
Concept --> Concept
(Concept --> Concept) --> Concept
and so forth.
The subtle issue that arises here pertains to probabilistic overlap between Atoms.
Note for instance that in the Atom formalism, we represent
cat
as a ConceptNode, and also represent
{cat, dog}
as a ConceptNode ...
The thing is, to handle this sort of situation *right*, one would need to account for dependencies among the different Atoms in the system in defining the truth value of a VariableScope link...
In other words, rather than just using a weighted average, we'd have to use the inclusion-exclusion formula from set theory, and do stuff like the following
Let
h($x) = w($x) F($x)
Then, if F had domain ConceptNode, and the only ConceptNodes in the system were A and B, we'd need something like
w(A) F(A) + w(B) F(B) - w(A&B) F(A&B)
which bypasses the problem that w(A) and w(B) may both rely on the same pieces of evidence...
Then we still have the funniness of applying F to the logical intersection A&B ... but that funniness is just irreducible... I mean there is no semantic problem there, as if A and B are concepts then A&B is a concept and F is supposed to be applicable to concepts...
But the problem is that applying the inclusion-exclusion formula to a large space of concepts (or other Atoms) is just wholly intractable...
So, as a simple and tractable approximation, I just suggest ignoring all dependencies and using a simple weighted average...
What else can be done?
This leads to all sorts of cognitive biases, but ... well ... so be it ...
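To make the two-atom case concrete, here is a small Python sketch contrasting the plain sum with the inclusion-exclusion expression given above, using the cat and {cat, dog} example. The weights and F values are hypothetical, and since cat is contained in {cat, dog}, the intersection A&B is taken to be cat itself.

```python
def plain_sum(w, f, a, b):
    """Ignores dependencies: w(A) F(A) + w(B) F(B)."""
    return w[a] * f[a] + w[b] * f[b]


def inclusion_exclusion(w, f, a, b, a_and_b):
    """Two-atom inclusion-exclusion: w(A) F(A) + w(B) F(B) - w(A&B) F(A&B)."""
    return w[a] * f[a] + w[b] * f[b] - w[a_and_b] * f[a_and_b]


# Hypothetical weights and predicate values.
w = {"cat": 0.25, "cat_or_dog": 0.5}
f = {"cat": 1.0, "cat_or_dog": 1.0}

naive = plain_sum(w, f, "cat", "cat_or_dog")
corrected = inclusion_exclusion(w, f, "cat", "cat_or_dog", a_and_b="cat")
print(naive, corrected)  # 0.75 0.5
```

The plain sum counts the evidence for cat twice, once directly and once inside {cat, dog}; the corrected expression removes the overlap. The difference between the two is the kind of bias the approximation accepts.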

