OpenCoggy Probabilistic Programming
EXPRESSING OPENCOG COGNITIVE PROCESSES IN TERMS OF PROBABILISTIC SAMPLING
This page contains preliminary notes toward a reformulation of most or all OpenCog cognitive processes in terms of iterated probabilistic-programming-style sampling queries ... via introduction and appropriate use of a new link type called SampleLink.
The motivation is to make OpenCog simpler and more elegant, by abstracting the "hard parts" of the OpenCog cognitive algorithms into the SampleLink construct. Various cognitive algorithms must then be used together to evaluate the SampleLinks -- so we still need the whole "cognitive synergy" complexity of OpenCog, in a way. But the SampleLink construct serves as a focal point for the complexity, which may help simplify the design both theoretically and implementation-wise.
The bulk of the page goes through various OpenCog cognitive processes and attempts to formulate simplistic versions of them in terms of Atomese with SampleLinks.
Aside from SampleLink, some of the other Atomese used also doesn't exist yet; but SampleLink is the only "big deal"; the other new Atomese suggestions are just minor variations on what already works.
(The ideas here were conceived by Ben Goertzel and the bulk of the page was written by Ben, but Nil Geisweiller has reviewed the contents and fixed/improved various things. Thanks are also due to Alexey Potapov for inspiring Ben to think in this direction, though Alexey is not to be blamed for the specifics of these ideas....)
GENERAL PROCESS (Probabilistic Growth and Mining of Combinations)
The conceptual inspiration for the code sketches presented here is the notion of Probabilistic Growth and Mining of Combinations, or PGMC.
Conceptual Idea of PGMC
At the highest level, the PGMC idea is that the essential core of cognition can be summed up as two processes:
- synthesis (forward combination) ... existing combinations of entities are grown by adding new elements onto them to make bigger combinations. The choice of which elements to add is biased via history, i.e. via which combinations have proved useful in the past.
- analysis (backward combination) ... Given an existing entity, an effort is made to form it via combination of 2 or more other entities. Which "backward combinations" of this nature are formed, is biased via history, i.e. via which combinations have proved useful in the past.
This is an extremely general rendition of cognitive process; and in this view, there is a lot of flexibility regarding what kinds of entities in the mind combine with each other.
Also, there is the question of how the biasing works. A key idea is that patterns regarding which combinations have been successful in past history, can themselves be expressed as combinations -- and the pattern recognition process itself can be executed via forward growth and backward combination.
The general philosophy underlying PGMC is discussed a bit here
and in more depth in the book The Hidden Pattern (Ben Goertzel, 2006).
Probabilistic Formulation of PGMC
To phrase this a little more mathematically ... here is a simple summary of the PGMC process as I see it being used in OpenCog in the near future ...
1) Probability Estimate Guided Growth (PEGG)
Given a quality measure F, and one or more combinatory operators C.
Choose x with probability proportional to F(x)
Choose y (and optionally also choose C from a set of options) so that C(x,y) holds, with probability proportional to estimated F(C(x,y))
Calculate actual F(C(x,y)), and assign y and C(x,y) importance proportionately
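To make the PEGG loop concrete, here is a minimal Python sketch of the three steps above. All names here (`pegg_step`, `combine`, `estimate_F`) are hypothetical stand-ins rather than OpenCog APIs, and fitness-proportionate choice is done with `random.choices`:

```python
import random

def pegg_step(population, F, combine, estimate_F):
    """One Probability-Estimate-Guided-Growth step (illustrative sketch)."""
    # Step 1: choose x with probability proportional to F(x)
    x = random.choices(population, weights=[F(p) for p in population])[0]
    # Step 2: choose y with probability proportional to estimated F(C(x, y))
    est = [estimate_F(combine(x, y)) for y in population]
    y = random.choices(population, weights=est)[0]
    z = combine(x, y)
    # Step 3: calculate the actual quality F(z); the caller would then assign
    # importance to y and z proportionately to this value
    return z, F(z)
```

In a real system the population would be the Atomspace and the estimator would itself be learned from history; here both are just function arguments.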
2) Probability Estimation
In many cases, given the nature of C, there will be certain heuristics for choosing C and y that have a decent chance of making F(C(x,y)) large. An example is the greedy heuristic in pattern mining, which chooses new patterns that are direct extensions of existing patterns (instead of just exploring random patterns in the space of possible patterns).
But in some cases PEGG must also be used for Probability Estimation ... in this case the quality measure F for an Atom x measures how well x estimates the quality of some entity x1 according to some measure F1. An example is the use of pattern mining to estimate the surprisingness of a pattern in a set of multi-step inferences; here the goal of the pattern mining is to get a good estimate of the surprisingness of an inference pattern P, because that surprisingness will be the quality measure used for PEGG-driven forward and backward chaining inference.
The guiding idea expressed on this wiki page, is to use the above ideas to reformulate OpenCog cognitive processes in concise and relatively standard Atomese, via relying on a new link type called SampleLink, and on the URE (Unified Rule Engine) to chain various SampleLink invocations together.
Basically, SampleLink is used for the "biasing" involved in choosing which forward or backward combinations to make. The URE is then used to chain together multiple steps that are chosen via the biasing operation.
A Note on DefineLink Syntax
In the remainder of this page I will assume a slightly different syntax for DefineLink than what is currently implemented.
DefineLink, as currently implemented, has the following format only:
   DefineLink
      &lt;alias&gt;
      &lt;definition&gt;
So, typically, in order to define a function you need to wrap the variables and the body in a LambdaLink
   DefineLink
      &lt;alias&gt;
      LambdaLink
         &lt;variables&gt;
         &lt;body&gt;
However, it has previously been suggested to have DefineLink support a different format, which I will use here:
   DefineLink
      &lt;alias&gt;
      &lt;variables&gt;
      &lt;body&gt;
as syntactic sugar for the currently functional format.
The SampleLink (not currently implemented in OpenCog) would need to have the following functionality:
   SampleLink
      PredicateNode F
      PredicateNode C
would need to choose a random sample from the set of Atoms X satisfying

   EvaluationLink &lt;s&gt;
      PredicateNode C
      X

with strength s > 0, according to the distribution F (which may be unnormalized).
Note that in the currently proposed approach, SampleLink only returns one individual, and one needs to repeatedly call SampleLink to get a larger sample. One could also implement a SampleLink variant that returned a ListLink of n samples, where n was specified as an additional argument.
That is, SampleLink is driving "fitness-proportionate sampling" where F is the fitness, subject to the constraint C. This is closely related to "optimization queries" in probabilistic programming.
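As a concrete gloss on these semantics, here is a minimal Python sketch of what evaluating a SampleLink amounts to: filter by the constraint C, then draw one element with probability proportional to F. The function name `sample_link` and its signature are hypothetical, and a real implementation would of course not enumerate the whole Atomspace:

```python
import random

def sample_link(atoms, F, C=None):
    """Return one Atom X drawn with probability proportional to F(X),
    restricted to those X for which C(X) > 0. F may be unnormalized;
    random.choices normalizes the weights internally."""
    candidates = [a for a in atoms if C is None or C(a) > 0]
    if not candidates:
        return None   # nothing satisfies the constraint
    return random.choices(candidates, weights=[F(a) for a in candidates])[0]
```

Repeatedly calling `sample_link` yields a larger sample, matching the one-individual-per-call semantics proposed above.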
The second argument C is not strictly necessary as one could equivalently say
   SampleLink
      AndLink
         PredicateNode F
         PredicateNode C
but the use of a separate argument slot for C allows more convenient and efficient processing for the case where the sampling process is to be restricted to a small set of Atoms, e.g. if C is
   DefineLink
      DefinedPredicate "C"
      VariableNode $X
      PresentLink   <-- probably not necessary
         MemberLink
            VariableNode $X
            SetLink ...
so that C is true only of elements in a certain SetLink ...
In that case processing can proceed first by enumerating the elements permitted by the filter C, and then sampling from this set according to F. Of course, this could be done in the format
   SampleLink
      AndLink
         PredicateNode F
         PredicateNode C
as well, but the format
   SampleLink
      PredicateNode F
      PredicateNode C
explicitly expresses C as a filter... We might also want to allow forms
   SampleLink
      PredicateNode F
      ConceptNode C
(restricting sampling to members of C, according to MemberLinks) and
   SampleLink
      PredicateNode F
      SetLink ...

   SampleLink
      PredicateNode F
      ListLink ...
restricting attention to elements of the given set or list...
A third argument may optionally be added to indicate the sampling algorithm used
   SampleLink
      PredicateNode F
      PredicateNode C
      SchemaNode "indicate sampling method here"
where the SchemaNode must be either a GroundedSchemaNode or a DefinedSchemaNode.
ACTUALLY DOING THE SAMPLING
Everything on this page is based on SampleLink, which raises the question of how to actually do the sampling. But I'm not going to focus on that here; that's the next step. Here my focus is on expressing simplified versions of all the different OpenCog cognitive processes in the language of "Atomese + sampling".
Clearly basic Monte Carlo sampling will be inadequate except in very simple cases. Since SampleLink is doing "fitness proportionate sampling" with an aim toward resolving optimization queries, I believe PLN can be an appropriate tool for doing the sampling. With a history-based-sampling-driven control mechanism, PLN can be made to find Atoms satisfying a specified predicate, which is what SampleLink requires....
Thus it becomes "turtles all the way down", as we rely on SampleLink to drive PLN inference control, and rely on PLN inference control to drive SampleLink. But AGI is always turtles all the way down...
A little more detail
Specifically, how might we use PLN to solve the problem: "Choose a bunch of x, sequentially (or in batches), with probability of choosing x proportional to F(x)?"
One approach is to have PLN able to take, as a target,
Find x so that "F(x) is true to degree around p"
where p is some specific number. Or
Find x so that "F(x) is true with degree in interval I"
If we can run PLN with this sort of target, then
- as we evaluate F(x) for multiple x, we can build a model of the distribution (histogram) of F(x) values for all x in the Atomspace
- Given this distributional model D_F, when we want to choose a new x, we can choose p from D_F, and then search for x so that "F(x) is true with degree p"
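A toy Python sketch of this two-stage scheme, with hypothetical names and a simple linear scan standing in for the actual PLN backward-chaining search toward the target strength:

```python
import random

def sample_target_p(observed_f_values):
    """Draw a target strength p from the empirical distribution model D_F
    (represented here simply as the list of F(x) values seen so far)."""
    return random.choice(observed_f_values)

def find_x_near_p(atoms, F, p, tol=0.05):
    """Stand-in for the PLN query 'find x so that F(x) is true to degree
    around p'. A real implementation would backward-chain toward the
    target; here we just scan for any x whose F-value is within tol of p."""
    for x in atoms:
        if abs(F(x) - p) <= tol:
            return x
    return None   # no Atom found near the sampled target strength
```

The point of the sketch is only the control flow: sample a p-value from the model D_F, then search for an x achieving roughly that value.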
To support queries of the form
Find x so that "F(x) is true to degree around p"
in PLN, we would need to do a little work to propagate the p-values through the backward chainer. In other words, given a target Atom A (associated with a target strength value p) and an inference rule R, what target strength values should be set for the premises of R in backward chaining? This is not hard to answer given the specific quantitative truth value formulas for the rules R, but the backward chainer would have to be modified to incorporate this feature.
Also, to make this work, we would need the distribution model D_F to be stored for ongoing use, meaning that we need F to have a DTV (distributional truth value) rather than a simple truth value.
This is a strange kind of sampling then: What we are sampling are the p-values used as targets for PLN backward chaining.
In the above proposal, PLN is playing the leading/driving role in doing the "sampling" behind a SampleLink. However, for PLN to do its stuff, it will need help from other cognitive processes. Pattern Mining can learn new Atoms that can be used inside PLN inference; ECAN attention allocation can help guide PLN; etc.
The whole "cognitive synergy" mess of OpenCog is needed here, to evaluate SampleLinks. So the SampleLink proposal doesn't especially solve the subtle combinatorial-explosion problems at the heart of AGI, it just channels the combinatorial explosion problems underlying the various cognitive processes in OpenCog into one place (evaluating SampleLinks)....
FORMULATION OF VARIOUS OPENCOG ALGORITHMS IN TERMS OF ATOMESE WITH SAMPLE-LINK
Now comes the "meat" of the page.... What follows are a bunch of rough sketches of OpenCog "cognitive processes" formulated as Atomese making central use of SampleLink.
These are not intended as fully fleshed out, maximally functional AI processes. Rather, they are intended as simple examples, to illustrate how each TYPE of cognitive process considered can be expressed in terms of the style of probabilistic programming presented here.
I will start with a very simplistic version of "something vaguely MOSES-ish", just for illustration...
Given a fitness function F (let's say mapping into [0,1])
Select x with probability proportional to F (or rather try to).
Select z so that z = Cross(x,y) for some y, with probability proportional to estimated F(z) (and also fitness of y? we can leave that out for now...)
Select w so that w = Mutate(z), with probability proportional to estimated F(w)
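The select / crossover / mutate pipeline above can be sketched in Python as follows. The operators `cross`, `mutate`, and the estimator `est_F` are hypothetical placeholders, not MOSES internals:

```python
import random

def evolution_step(population, F, cross, mutate, est_F, n_mutants=5):
    """One 'vaguely MOSES-ish' step (illustrative sketch)."""
    # Select x with probability proportional to actual fitness F
    x = random.choices(population, weights=[F(p) for p in population])[0]
    # Crossover: weight each candidate child Cross(x, y) by estimated fitness
    children = [cross(x, y) for y in population]
    z = random.choices(children, weights=[est_F(c) for c in children])[0]
    # Mutation: sample a mutant of z, again by estimated fitness
    mutants = [mutate(z) for _ in range(n_mutants)]
    return random.choices(mutants, weights=[est_F(m) for m in mutants])[0]
```

Each of the three selections corresponds to one SampleLink in the Atomese below.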
   SignatureLink
      GroundedSchemaNode "Best estimate"
      ArrowLink
         VariableTypeNode "PredicateNode"   \\ exact fitness function
         VariableTypeNode "PredicateNode"   \\ estimated fitness function

   ExecutionOutputLink
      DefinedSchemaNode "mutation step"
      ExecutionOutputLink
         DefinedSchemaNode "crossover step"
         ExecutionOutputLink
            DefinedSchemaNode "initial selection step"

   DefineLink
      DefinedSchemaNode "initial selection step"
      SampleLink
         PredicateNode "FitnessFunction"
         PredicateNode "Population"

   DefineLink
      DefinedSchemaNode "crossover step"
      VariableNode $x
      SampleLink
         ExecutionOutputLink   \\ sampling according to the estimated fitness
            GroundedSchemaNode "Best estimate"
            PredicateNode "FitnessFunction"
         LambdaLink
            VariableNode $z
            MemberLink
               VariableNode $z
               RangeLink
                  LambdaLink
                     VariableNode $y
                     ExecutionOutputLink
                        GroundedSchemaNode "Cross"
                        ListLink
                           VariableNode $y
                           VariableNode $x
where RangeLink denotes the codomain (or a subset of it) of the schema given as input.
   DefineLink
      DefinedSchemaNode "mutation step"
      VariableNode $z
      SampleLink
         AndLink
            ExecutionOutputLink   \\ sampling according to the estimated fitness
               GroundedSchemaNode "Best estimate"
               PredicateNode "FitnessFunction"
            SatisfactionLink   \\ this clause makes really wild mutants less likely
               VariableNode $a
               SimilarityLink
                  VariableNode $z
                  VariableNode $a
         LambdaLink
            VariableNode $a
            MemberLink   \\ $a is produced by mutating $z
               VariableNode $a
               RangeLink
                  ExecutionOutputLink
                     GroundedSchemaNode "mutate"
                     VariableNode $z
assuming similarity(y,z) measures the degree to which y is a mutant of z
FORWARD CHAINING INFERENCE
Given a quality metric Q, such as "interestingness"
Given a source x, selected w/ probability proportionate to importance
Select a rule R and additional source y, with probability proportional to estimated Q( R(x,y))
Give the result z = R(x,y), and the formal representation of "z = R(x,y)", an importance boost proportional to the actual calculated Q( R(x,y))
   DefineLink
      DefinedSchemaNode "forward step"
      SequentialExecutionLink
         ExecutionOutputLink
            DefinedSchemaNode "stimulate Atom"
            ExecutionOutputLink
               DefinedSchemaNode "importance_guided_combination_selection_and_evaluation"
               SampleLink
                  DefinedSchemaNode "get STI"
The above uses SequentialExecutionLink, which doesn't exist in the code currently, and differs from SequentialAndLink in that it doesn't care what its arguments evaluate to.
   DefineLink
      DefinedSchemaNode "importance_guided_combination_selection_and_evaluation"
      VariableNode $A
      ExecutionOutputLink
         GroundedSchemaNode "py: evaluate_and_supply_missing_premises_via_STI"
         ListLink
            SampleLink
               SatisfactionLink   \\ the set of PLN rule combinations that include $A as premise
                  VariableNode $X
                  AndLink
                     MemberLink
                        VariableNode $X
                        ConceptNode "PLNRuleCombinationPatterns"
                     EvaluationLink
                        DefinedPredicateNode "combination includes premise matching"
                        ListLink
                           VariableNode $X
                           VariableNode $A
            VariableNode $A

and

   DefineLink
      DefinedSchemaNode "stimulate Atom"
      ExecutionOutputLink
         DefinedSchemaNode "forward chaining Atom stimulation constant"
BACKWARD CHAINING INFERENCE
Given a quality metric Q, such as "strength * confidence"
Given a target z, selected w/ probability proportionate to importance or via some external query
Select a rule R and sources x, y, with probability proportional to estimated Q( R(x,y))
Give the formal representation of "z = R(x,y)", and the entities x and y, an importance boost proportional to the actual calculated Q( R(x,y))
   DefineLink
      DefinedSchemaNode "backward step from target"
      VariableNode $Z
      SequentialAndLink
         ExecutionOutputLink
            DefinedSchemaNode "boost STI"
            SampleLink
               DefinedSchemaNode "get combination utility"
               SatisfactionLink
                  VariableNode $X
                  AndLink
                     MemberLink
                        VariableNode $X
                        ConceptNode "CombinationPatterns"
                     EvaluationLink
                        DefinedPredicateNode "combination produces"
                        ListLink
                           VariableNode $Z
                           VariableNode $X
See below for the definition of DefinedSchemaNode "get combination utility".
FUZZY MATCHING
Given a cue x, selected w/ probability proportionate to importance or via some external query
Select a match y with probability proportional to estimated Match-Degree(x,y) [i.e. estimated quality of Match(x,y)]
Calculate actual Match-Degree(x,y), and return y as a response if it's high enough
   DefineLink
      DefinedSchemaNode "guess fuzzy match"
      VariableNode $x
      SampleLink
         LambdaLink
            VariableNode $y
            EvaluationLink
               GroundedPredicateNode "fuzzy estimated match degree"
               ListLink
                  VariableNode $x
                  VariableNode $y
To get the actual, accurate degree of the match found, one could do
   EvaluationLink
      GroundedPredicateNode "fuzzy match degree"
      VariableNode $x
      ExecutionOutputLink
         DefinedSchemaNode "guess fuzzy match"
         VariableNode $x
(simple algorithm similar to current SuReal)
Given a cue x
Select a match y from the set of sentence-structures corresponding to known sentences, with probability proportional to estimated Match-Degree(x,y)
Calculate actual Match-Degree(x,y), and return y as a response if it's high enough
The current (crude) SuReal algorithm is just like fuzzy matching, but restricts focus to Atoms that are SetLinks produced from interpreting sentences w/ the OpenCog NLP pipeline...
   DefineLink
      DefinedSchemaNode "guess fuzzy sentence match"
      VariableNode $x
      SampleLink
         LambdaLink
            VariableNode $y
            EvaluationLink
               GroundedPredicateNode "estimated fuzzy match degree"
               VariableNode $x
               VariableNode $y
         PredicateNode "is sentence interpretation"

   EvaluationLink
      GroundedPredicateNode "fuzzy sentence match degree"
      VariableNode $x
      ExecutionOutputLink
         DefinedSchemaNode "guess fuzzy sentence match"
         VariableNode $x
CLUSTERING
Select x that isn't already a member of some larger cluster (or, if one is building overlapping clusters, x that isn't already a member of too many existing clusters)
Select z so that z = Merge(x,y) , with probability proportional to estimated Cluster-Quality(z)
Calculate actual Cluster-Quality(z), and reject z if the value is too low
   \\ Basic clustering step

   DefineLink
      DefinedSchemaNode "clustering step"
      ExecutionOutputLink
         DefinedSchemaNode "agglomerate onto"
         SampleLink
            PredicateNode "getSTI"
            SatisfactionLink
               TypedVariableLink
                  VariableNode $x
                  TypeChoice
                     TypeNode "ConceptNode"
                     TypeNode "SetLink"
               VariableNode $x

   \\ The above assumes that clusters are concepts or sets ... of course the assumption could be
   \\ made more flexible, this is just an illustration

   \\ Given x, take a stab at trying to expand x into a bigger cluster

   DefineLink
      DefinedSchemaNode "agglomerate onto"
      VariableNode $x
      ExecutionOutputLink
         DefinedSchemaNode "optionally make new cluster"
         ExecutionOutputLink
            DefinedSchemaNode "make cluster pair"
            VariableNode $x

   \\ Given x, guesses some y to glom onto x to make a bigger cluster

   DefineLink
      DefinedSchemaNode "merge step"
      VariableNode $x
      SampleLink
         ExecutionOutputLink   \\ sampling according to the estimated cluster quality
            GroundedSchemaNode "Estimated cluster quality"
         LambdaLink
            VariableNode $z
            MemberLink
               VariableNode $z
               RangeLink
                  LambdaLink
                     VariableNode $y
                     ExecutionOutputLink
                        SchemaNode "Merge"
                        ListLink
                           VariableNode $y
                           VariableNode $x

   \\ Makes a pair (proto-cluster, proto-cluster quality)

   DefineLink
      DefinedSchemaNode "make cluster pair"
      VariableNode $x
      ExecutionOutputLink
         DefinedSchemaNode "pair builder"
         ListLink
            VariableNode $x
            DefinedSchemaNode "merge step"
            DefinedSchemaNode "calculate cluster quality"

   DefineLink
      DefinedSchemaNode "pair builder"
      ListLink
         VariableNode $x
         VariableNode $F
         VariableNode $C
      ListLink
         ExecutionOutputLink
            VariableNode $F
            VariableNode $x
         ExecutionOutputLink
            VariableNode $C
            ExecutionOutputLink
               VariableNode $F
               VariableNode $x

   DefineLink "cluster creation threshold" NumberNode "0.2"
   DefineLink "new cluster STI" NumberNode "0.6"

   \\ Makes a new cluster from the proto-cluster if it's good enough

   DefineLink
      DefinedSchemaNode "optionally make new cluster"
      VariableNode $pair
      ExecutionOutputLink
         DefinedSchemaNode "optionally make new Atom"
         ListLink
            VariableNode $pair
            DefinedSchemaNode "cluster creation threshold"
            DefinedSchemaNode "new cluster STI"

   \\ Helper function used by the above, also useful in pattern mining as discussed later

   DefineLink
      DefinedSchemaNode "optionally make new Atom"
      ListLink
         VariableNode $pair
         VariableNode $threshold
         VariableNode $startingSTI
      IfElseLink
         GreaterThanLink
            ElementAtLink
               VariableNode $pair
               NumberNode "2"
            VariableNode $threshold
         SetSTILink
            ElementAtLink
               VariableNode $pair
               NumberNode "2"
            VariableNode $startingSTI
         SetSTILink
            ElementAtLink
               VariableNode $pair
               NumberNode "2"
            NumberNode "0"
PATTERN MINING
Given a concision measure Concision(x), which decreases with the size of x; i.e. Concision(0)=1, Concision(infinity)=0, and Concision(x) is monotone decreasing in the size of x.
Given a quality measure Q(x), such as frequency or surprisingness of the pattern x
Let WQ(x) = Concision(x) * Q(x)
Choose x with a probability proportional to WQ(x)
Select y so that Extends(x,y) is true, with a probability proportional to the estimate of WQ(y)
(this is very similar to agglomerative clustering, structurally, though of course very different in effect)
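The WQ-guided steps above can be sketched in Python. The particular Concision formula, and the names `pattern_mining_step`, `extensions`, and `est_wq`, are illustrative assumptions:

```python
import random

def wq(x, Q, size):
    # WQ(x) = Concision(x) * Q(x); Concision(x) = 1 / (1 + size(x)) is one
    # simple choice with Concision(0) = 1 and Concision -> 0 as size grows.
    return Q(x) / (1.0 + size(x))

def pattern_mining_step(patterns, Q, size, extensions, est_wq):
    """One greedy pattern-growth step (illustrative sketch)."""
    # Choose a pattern x with probability proportional to WQ(x) ...
    x = random.choices(patterns, weights=[wq(p, Q, size) for p in patterns])[0]
    # ... then choose an extension y (with Extends(x, y) true) with
    # probability proportional to the estimated WQ(y)
    cands = extensions(x)
    if not cands:
        return x
    return random.choices(cands, weights=[est_wq(y) for y in cands])[0]
```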
   \\ Pattern mining step

   DefineLink
      DefinedSchemaNode "pattern mining step"
      ExecutionOutputLink
         DefinedSchemaNode "grow pattern"
         SampleLink
            PredicateNode "getSTI"
            PredicateNode "isPattern"   \\ returns true for Atoms containing variables bound by Lambda?

   \\ Given x, take a stab at trying to expand x into a bigger pattern

   DefineLink
      DefinedSchemaNode "grow pattern"
      VariableNode $x
      ExecutionOutputLink
         DefinedSchemaNode "optionally make new pattern"
         ExecutionOutputLink
            DefinedSchemaNode "make pattern pair"
            VariableNode $x

   \\ Takes a pattern x and chooses a possible extension of it (y)

   DefineLink
      DefinedSchemaNode "expand pattern"
      VariableNode $x
      SampleLink
         LambdaLink
            VariableNode $y
            AndLink
               PredicateNode "Extends"
                  VariableNode $y
                  VariableNode $x
               PredicateNode "estimated WQ"
                  VariableNode $y

   \\ Make a pair (pattern, pattern quality)

   DefineLink
      DefinedSchemaNode "make pattern pair"
      VariableNode $x
      ExecutionOutputLink
         DefinedSchemaNode "pair builder"
         ListLink
            VariableNode $x
            DefinedSchemaNode "expand pattern"
            DefinedSchemaNode "calculate WQ"

   DefineLink "pattern creation threshold" NumberNode "0.2"
   DefineLink "new pattern STI" NumberNode "0.6"

   DefineLink
      DefinedSchemaNode "optionally make new pattern"
      VariableNode $pair
      ExecutionOutputLink
         DefinedSchemaNode "optionally make new Atom"
         ListLink
            VariableNode $pair
            DefinedSchemaNode "pattern creation threshold"
            DefinedSchemaNode "new pattern STI"
ATTENTION ALLOCATION
A simple STI-spreading process could be implemented like this:
   DefineLink
      DefinedSchemaNode "spread STI to chosen Atom"
      VariableNode $sourceAtom
      SequentialExecutionLink
         ExecutionOutputLink
            DefinedSchemaNode "spread STI among"
            ExecutionOutputLink
               DefinedSchemaNode "get STI"
               VariableNode $sourceAtom
            ExecutionOutputLink
               DefinedSchemaNode "get links to neighbors"
               VariableNode $sourceAtom
         ExecutionOutputLink
            DefinedSchemaNode "decrement STI"
            VariableNode $sourceAtom
where the source of spreading could be chosen w/ probability proportional to STI
   DefineLink
      DefinedSchemaNode "choose Atom and spread importance from it"
      ExecutionOutputLink
         DefinedSchemaNode "spread STI to chosen Atom"
         SampleLink
            DefinedSchemaNode "get STI"
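The same spreading step can be sketched in Python. The names, the spread fraction, and the even division among neighbors are assumptions for illustration, not ECAN's actual update rule:

```python
def spread_sti(sti, source, neighbors, fraction=0.5):
    """Move a fraction of the source Atom's STI evenly to its neighbors,
    then decrement the source accordingly (illustrative sketch).
    sti: dict mapping atom -> STI value; neighbors: dict atom -> [atoms]."""
    nbrs = neighbors.get(source, [])
    if not nbrs:
        return sti                      # nothing to spread to
    amount = sti[source] * fraction     # total STI to spread
    share = amount / len(nbrs)
    for n in nbrs:
        sti[n] = sti.get(n, 0.0) + share
    sti[source] -= amount               # the "decrement STI" step
    return sti
```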
SIMPLE (TOY) REINFORCEMENT LEARNING
For this simple toy example, let's consider the case of an OpenCog system with only one goal. Also, let us consider that forward and backward chaining are both going on continually. Perhaps multiple instances of each are going on continually. Balancing between forward and backward chaining activity can be done adaptively but we won't consider that here as it's not critical to the points we want to make.
Let us assume that importance (STI) propagation is happening in the OpenCog Atomspace, concurrently with the backward and forward chaining processes, and that a large fraction of the importance in the Atomspace is being distributed via the single goal. That is, the single goal is being allocated a substantial amount of importance at each point in time, and is spreading it along its links and throughout the Atomspace. In this scenario, importance is a reasonable proxy for "importance for the single goal."
A forward step could then be chosen via
   DefineLink
      DefinedSchemaNode "forward step"
      SampleLink
         DefinedSchemaNode "get combination utility"
         SatisfactionLink
            $X
            MemberLink
               $X
               ConceptNode "CombinationPatterns"
where the utilities may be stored in the Atomspace,
   DefineLink
      DefinedSchemaNode "get combination utility"
      VariableList
         VariableNode $C   // Atom whose utility is being assessed
         VariableNode $G   // The goal
      GetLink
         TypedVariableLink
            VariableNode $N
            TypeNode "NumberNode"
         ExecutionLink
            SchemaNode "utility"
            ListLink
               VariableNode $C
               VariableNode $G
            VariableNode $N
What the "forward step" above does, then, is to choose a random Atom according to the utility distribution, and then execute it.
Similarly -- but slightly more complexly -- a backward step could then be chosen via
   DefineLink
      DefinedSchemaNode "backward step"
      SequentialAndLink
         ExecutionOutputLink
            DefinedSchemaNode "boost importance"
            SampleLink
               DefinedSchemaNode "get combination utility"
               SatisfactionLink
                  $X
                  AndLink
                     MemberLink
                        $X
                        ConceptNode "CombinationPatterns"
                     EvaluationLink
                        DefinedPredicateNode "combination produces"
                        ListLink
                           ExecutionOutputLink
                              DefinedSchemaNode "get bc target"
                           $X
Here, as compared to the forward step, we need the SampleLink to fulfill an additional requirement: the set of Atoms it samples from is restricted to Atoms that are able to produce the desired target of the backward chaining step. The desired target itself is to be chosen from a different distribution, which in the simplest case could just be
   DefineLink
      DefinedSchemaNode "get bc target"
      SampleLink
         DefinedSchemaNode "get importance"
That is, in the simplest case, we can choose the target for backward chaining according to the "goal importance" values associated with Atoms (and note that, syntactically, we can assume if a SampleLink only has one argument, the second argument is implicitly the entire Atomspace).
The DefinedSchemaNode "boost importance", invoked in the "backward step" Schema, is assumed to boost the goal importance values of the Atoms involved in the combination that has been produced by the SampleLink and then executed. To make this work sensibly, either there needs to be some importance decay process in the system (such as OpenCog's ECAN module has built in), or the "boost goal importance" function has to explicitly take goal importance from other Atoms to give it to the Atoms in the current combination.
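The second option, explicitly conserving total goal importance, can be sketched in Python (function and argument names are hypothetical):

```python
def boost_goal_importance(sti, winners, boost):
    """Give `boost` importance to the Atoms in the winning combination,
    taking it proportionally from all other Atoms so that total goal
    importance is conserved (illustrative sketch)."""
    others = [a for a in sti if a not in winners]
    available = sum(sti[a] for a in others)
    take = min(boost, available)        # can't take more than exists
    if available > 0:
        for a in others:
            sti[a] -= take * sti[a] / available   # proportional deduction
    for a in winners:
        sti[a] += take / len(winners)   # split the boost among the winners
    return sti
```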
Sugaring the Syntax A Bit
Just for fun, let's see what the above "Attention Allocation" example would look like if we were using a more concise Atomese syntax... along the general lines suggested at Atomese_Syntax_Musings , but even more so...
The simple STI spreading algorithm given above would be written like this in a fairly well sugared Atomese:
   Define "spread STI to chosen Atom" $sourceAtom
      SequentialExecution
         @ "spread STI among"
            @ "get STI" $sourceAtom
            @ "get links to neighbors" $sourceAtom
         @ "decrement STI" $sourceAtom

   Define "choose Atom and spread importance from it"
      @ "spread STI to chosen Atom"
         Sample
            DefinedSchema "get STI"
(The new thing I've experimentally introduced here is @ as a shorthand for (ExecutionOutputLink (DefinedSchemaNode ...)).)
These syntactic sugars are not currently implemented, and will probably be revised or replaced with other sugars before implementation happens. But the above example may give some flavor of what a more elegantly syntaxed "Atomese as probabilistic programming language" might look like...
Videos on Probabilistic Programming
Nil gives a tutorial on Probabilistic Programming with examples at the Cogathon (at Robotics Garage, Science Park in Hong Kong) https://www.youtube.com/watch?v=CvUDMvMnFVc
Probability & Pattern Mining for Inference Control - Video Lecture with Ben Goertzel https://www.youtube.com/watch?v=5t_XxFWSoWo