Ideas
From OpenCog
GSoC Students: See also StudentApplicationTemplate, Development, Publications, GSoC Projects for 2009 and GSoCProjects2008 for background reading. Note that we welcome any interested volunteers to work on these projects and not only GSoC applicants.
This page outlines many project ideas for the further development of OpenCog and related projects. Some of these will eventually be attempted by the core contributors, but others are projects or avenues of research we'd likely only investigate if we had the resources available (namely time and people). If you, dear reader, would like to become involved, and one of these projects catches your eye, then please contact us and we'll give you all the support we can!
There are a lot of projects here. The projects that some experienced OpenCog developer or scientist believe are especially interesting for GSoC 2012 are tagged with GSOC_2012_FLAG before the name in the list below. These are not the only projects with a chance of being accepted for GSoC 2012, but they are projects that somebody qualified is particularly eager to mentor.
See also: Not so quick tasks
Most of the projects here are are given difficulty labels:
- Relatively straightforward AI R&D, though not easy
- Pretty difficult
- Rather difficult
Note that these labels refer only to the AI aspects of the task; there may also be difficult software engineering or systems integration aspects, which haven't been labeled. Best to ask us directly if you are concerned about the potential challenge a project presents.
OpenCog Framework
- Language: C++ (uses Boost and templates)
- Code: https://code.launchpad.net/opencog
- List: http://groups.google.com/group/opencog-developers?hl=en
- License: AGPLv3 + linking exception
OpenCog consists of a core framework with several modules. This section outlines projects relating to improving the framework that supports the rest of the modules and MindAgents built on it. Unless otherwise specified, these projects don't require advanced AI related background knowledge.
GSOC_2012_FLAG -- Packaging for Ubuntu and other *nix, Cygwin and OSX port
Something that is important for community uptake of OpenCog is the ability to easily install and experiment with the system. Many people get stuck or put off by having to compile OpenCog and its dependencies. OpenCog does have Cogbuntu, a customised Ubuntu image, but this is overkill for many people and it would be preferable to allow users/developers to use the familiar "apt-get install opencog" (assuming you're using a Debian-derivative).
In addition, the OSX port recently initiated by Joel, is not complete. The port only includes a subset of the components within OpenCog and would ideally conforms to to the ".app" packaging of programs in OSX. Another alternative is building a port file for Macports so that users can run "port install opencog".
And last, but definitely not least, a Cygwin port would be very valuable as it would make demoing as well development dramatically simpler on Windows platforms. For development one would still need to install Cygwin but for demo only a couple of Cygwin dlls bundled with the package would work. OpenCog implements an virtual agent that lives and learn in a small virtual world coded with Unity3D game engine. That engine runs only on Windows so when one wants to run that virtual agent he/she needs to either have 2 machines, one with Linux running the "brain", the other with Windows running the virtual world, or run Linux on a virtual machine within Windows. Neither solution are convenient. The proxy (the software connecting OpenCog to Unity3D) will soon be released in open source, so achieving a complete Cygwin port + packaging the whole thing (including the proxy and the virtual world) would make demoing OpenCog on Windows accessible to anyone.
See Packaging for project requirements and steps.
Porting OpenCog to Windows
Cygwin is likely the easiest way to port the OpenCog codebase to Windows. But the Windows platform has some downsides, the main one being loss in speed. It's not clear whether OpenCog would suffer from that since most of its resources are spent on scientific calculations rather than system calls but in case the Cygwin version is indeed slower than it should be a Windows port might be worth considering. But this would only work in the mid/long term if someone is always willing to maintain the compatibility over time. Given that OpenCog is in active development it would quickly become out of date otherwise.
Distributed and persistent AtomSpace
Provide distributed AtomSpace persistence and synchronization.
This project requires little of no acquaintance with general AI or NLP concepts; instead, it requires strong knowledge of distributed processing, network, locking, transactional and kernel-type operating system principles. This is an ideal project for students with strong coding skills and theoretical comp-sci knowledge and who have ideally also have experience with one or several NoSQL databases. Interested should be familiar with the CAP Theorem and the ACID acronym.
In GSoC 2009, Jeremy Schlatter worked on connecting HyperTable to OpenCog as a distributed and persistent AtomSpace. This is a good prototype to base further development on, but other scalable NoSQL databases may be a better fit. In particular Riak looks promising due to it's in built link-walking on the database side.
In particular we need to also provide indexes that can hopefully mirror the planned pluggable AtomSpace indexes, and come up with a workable way of having multiple OpenCog instances deal with conflicting changes, etc.
There is also a need for redesigning the AtomSpace API to allow clients/MindAgents to specify the depth of their queries. The ability to focus on only those atoms in memory before resorting to searching the distributed AtomSpace is an important requirement to allow AI algorithms to scale.
- Distributed Architecture RFC provides an outline of one possible approach and could be used as a starting point in the design.
GSOC_2012_FLAG --Language Bindings
We would like to relegate C++ programming to the internals of OpenCog and have most development occur in other languages via language bindings.
Python bindings are implemented and in use, but other languages would be great to have too.
Haskell bindings would be great, for instance... but other possibilities are also worth exploring!
GSOC_2012_FLAG -- AtomSpace Visualizer
It can be hard to understand what is going on within OpenCog, and visualizing the hypergraph can be a big help. Unfortunately, real-world hypergraphs run from 100K to 10M or more atoms, and thus overwhelm not only most graphing/visualization software but often the user as well!
- There is a working visualization system using Gephi which works quite well. It would still be very useful to find ways to limit the number/type of Atoms displayed, to make it more manageable. There is some Python code on the OpenCog side which streams data to Gephi.
- Ubigraph was close to being useful, but it seems to be an essentially dead project and is also NOT open source. See the existing ubigraph dynamic library written by Jared Wigmore and Joel Pitt to see how the AtomSpace is currently connected to Ubigraph.
- Joel started building an embedded javascript viewer using an HTML5 canvas element - this would supplement the existing web UI. A demo web interface with a visualiser was available but is no longer maintained. See http://localhost:17034/index2.html if you run OpenCog locally and have the ocweb module loaded (enabled by default). ProcessingJS is used to simplify the graph drawing process and calls to the REST interface could be used expand the AtomSpace. Using The Javascript InfoVis Toolikit would probably simplify the visualisation code even further.
User defined indexes
The AtomTable (AtomSpace) currently has some hand-crafted indexes for quickly locating all atoms of some particular type, or having some particular property. The idea is to remove these and replace them by a generalized, user-defined index. (see graph of existing indexes).
The purpose of an index is to very quickly find an atom of some particular type or kind, without having to trawl over the whole atomspace to find it -- one just asks for all atoms in the index. To make this work, when an atom is added to the atomtable, it is also added to any index which it might match.
A general purpose extension would be to allow user-defined indexes. The "user" provides an index name, and a template pattern. Whenever an atom is added to the atom table, we check to see if it matches the template. If so, its added to the named index. When I say "atom" here, I really mean "maybe a whole graph of stuff": the template might be some fairly complicated hypergraph: the template might say "there must be two nested links of type X and Y, and these nodes here could be any type, but that there must be type Z", etc.
For instance, we might have many links in the AtomTable matching the "atom structure template"
Pred($X, $Y) = BindLink $X $Y AND ___ EvaluationLink isHuman $X ___ EvaluationLink ________ livesIn ________ List $X $Y
... and then we might want a special index that basically indexes "where people live". So, you could look up any location, and find who lives there ...
The hardest part of implementing this is already done: the pattern matcher can already match on any arbitrary hypergraph structure: so, we'd invoke it on any incoming atoms, to see if they match the desired template. User:Linas will tutor anyone on how to use this, or write a little convenience wrapper, just to make it easy.
In fancier language: the pattern matcher provides a Structured Query interface to the atomspace (vaguely similar to SQL or to SPARQL). In functional programming, an index is called memoization. Alternately, they can be understood as Sinot's Director strings for OpenCog's term algebra.
Who needs this? Why should we bother?:
- The embodiment code could probably use this (to be confirmed). It already uses the pattern matcher to filter input from sensors.
- NLP/chatbot processing could use this (but there are no definite plans at this time).
Difficulty: mostly straight-ahead coding, little/no AI research needed. The hard part will be understanding the existing code base, to understand where/how to slot this in.
OpenCog Workbench
The AtomSpace visualiser only addresses one aspect of a UI for understanding OpenCog dynamics. A generic GUI workbench, which can handle the parsing of log files and dynamic configuration of a running OpenCog instance, would be useful.
Other features that such a workbench might have:
- graphing dynamics through time:
- total atoms of various types
- distribution of truthvalues
- distribution of atom importance (STI/LTI)
- average coordination number (number of links atoms are connected with)
- network diameter
- # of disjoint clusters
- visualising and controlling expansion of the backward inference tree of PLN.
- active MindAgents and their resource usage (memory, CPU)
This can be part of with the OpenPSI GUI that Zhenhua Cai has developed.
The UI uses:
- zeromq messages for monitoring state changes from OpenCog.
- implementation in Python.
- pyqt for widgets.
Performance measurement suite
Many contemplated changes to the opencog infrastructure have a real impact on execution time, and the amount of memory used. A performance suite would collect into one place a large variety of different test cases, instrument them properly so as to measure speed and memory usage, and then report the results, in a completely automated fashion. An existing but simplistic benchmark utility for the AtomSpace exists in opencog/benchmark.
Support higher-order types in Atoms
Suppose you have a SchemaNode S with behavior like (using a Haskell-like notation, where A-->B means a function mapping A to B)
[[Number --> Concept] --> [Concept --> Concept] ] --> Number
This SchemaNode S has output type SchemaNode. However, this is not that precise a statement. S actually outputs SchemaNodes that have particular input restrictions and a particular output type...
So, ideally, a SchemaNode or PredicateNode would come with an optional Type object indicating what higher-order type they embody.
The Type object could be a dag indicating a higher-order type. This could utilise existing code in comboreduct or be based off some old C code Luke Kaiser wrote for Novamente LLC. Both used DAGs for type representation and inheritance on higher-order types. Or in principle one could introduce a TypeMapLink, and then store higher-order type objects as subgraphs in the Atomspace.
Right now type restrictions are left implicit in the Atomspace, which is generally OK since we're mostly dealing with simple cases at the moment. e.g. an AndLink between two or more ConceptNodes outputs a ConceptNode, so AndLink has type
List(Concept) --> Concept
However, to extend PLN, concept creation and other functions to deal with higher-order types effectively, explicit typing will probably be needed. This will be useful for abstract reasoning, not just mathematical reasoning but e.g. for analogical reasoning between significantly different domains, or experiential learning of symbol groundings or linguistic rules.
The question of when these types should be stored in the Atom versus calculated dynamically is not clear. In some cases, the type could be expensive to calculate, but it may be that such calculations are infrequent with most higher order type calculations being simple. A case where the type would need to be stored in the Atom would be when the SchemaNode in question carried out its calculation using a BuiltInSchema (coded, e.g., in C++); in this case there would be no simple way of inferring the type if it weren't specified. If a ComboSchema were involved instead, the type could be inferred, though this could be expensive.
OpenCog modules
The ideas listed in this section are all related to parts of OpenCog which are to some extent projects in their own right.
MOSES
- Language: advanced C++
- Code: https://code.launchpad.net/opencog (opencog/learning/moses)
- License: Apache 2.0
Meta-Optimizing Semantic Evolutionary Search (MOSES) is a new approach to program evolution, based on representation-building and probabilistic modeling. MOSES has been successfully applied to solve hard problems in domains such as computational biology, sentiment evaluation, and agent control. Results tend to be more accurate, and require less objective function evaluations, in comparison to other program evolution systems. Best of all, the result of running MOSES is not a large nested structure or numerical vector, but a compact and comprehensible program written in a simple Lisp-like mini-language. More at http://metacog.org/doc.html.
GSOC_2012_FLAG -- Extend MOSES to encompass primitive recursive functions using fold
Within Reduct/MOSES, implement the algebra of fold, as Moshe hints at the bottom of page 5 of "A Representation of Programs for Learning and Reasoning" (Looks, Goertzel).
and as explained in detail in the paper
G. Hutton (1999). A tutorial on the universality and expressiveness of fold. Journal of Functional Programming.
More specifically what needs to be done is
- upgrade combo to support lists (data structure, constructors, etc),
- add the builtin fold (update the combo interpreter, etc),
- design and code reduction rules for that new programmatic construct,
- upgrade representation building so MOSES can evolve programs using lists and fold.
From the perspective of a GSoC project, the great thing is that steps 1 and 2 are surely within reach, and this would already be useful in itself (i.e. the code would definitely not be lost). Then anything beyond that is just plain bonus.
Difficulty: 1-2-3
Other references:
- SRFI-1 shows fold and other list primitives, as implemented in scheme (lisp)
- Forrest Briggs, Melissa O'Neill, (2006), "Functional Genetic Programming with Combinators"
GSOC_2012_FLAG -- Higher-order Programmatic Constructs
An important project, appropriate for a student with some functional programming background, is to extend MOSES to handle higher-order programmatic constructs, including variable expression.
Our design for this involves Sinot's formalism of "director strings as combinators", and there is opportunity to assist with working out the details of the design as well as the implementation. This can be done many ways, including using combinatory logic or lambda calculus. The route that seems best at the moment would be to use Sinot's formalism of "director strings as combinators." Much of the work here is in Reduct and representation-building, which would be useful for both MOSES and Pleasure.
Difficulty: 2
Library of Known Routines
Currently, MOSES constructs programs out of a very minimal collection of functions and operators; for example, boolean expressions are constructed entirely from and, or, not. However, learning of programs could happen much faster if more complex primitives were provided: for example, xor, a half-adder, a multiplexer, etc. Rather than hard-code such functions into combo, this project calls for creating a generic library that could hold such functions, and the infrastructure to allow MOSES to sample from the library when building exemplars.
This is a needed step, on the way to implementing the pleasure algorithm below, as well as for transfer learning, etc.
This might be doable over the summer, although it would still need a very strong programmer. Proof that it is working would be defining xor in some file (as combo, viz. that xor($1,$2) := and(or($1, $2), not (and($1, $2))) and then show that this definition was actually used in solving the k-parity problem.
Pleasure
Complete implementation (or starts a new one from scratch) and then test and explore the Pleasure Algorithm for program learning started last year by Alesis Novik. http://opencog.org/wiki/MOSES:_the_Pleasure_Algorithm
Pre-requisite: Library of Known Routines, above.
Difficulty: 3
GSOC_2012_FLAG -- Generalize reduct and representation building to on operator properties
One problem with both Reduct and representation building is that they are pretty much hard wired for a set of operators, and, or, not, +, *, etc. While what we would want is that they would rely on operator properties instead.
For instance instead of having a rule
x + 0 -> x
or a rule
x or false -> x
we could have a single rule
x OP NE -> x
where NE is the neutral element of operator OP.
This same idea could be applied to representation building as well though is more complicated. The obvious advantage of that is that when one needs to expand reduct with new operators, no need to add new reduction rules (if the operator doesn't involve new properties), it would suffice to set the properties of the operator and we're done.
In more abstract terms: add support for algebraic structures and their interpretations, in the sense of model theory. The thing above that is called an "operator" is really an "algebraic structure", and "x+0 -> x" and "x or F ->x" are two different "interpretations" of that structure. Such systems are called "theories"; an example where this abstraction is already used is in SMT (Satisfiability Modulo Theories) solvers. The wikipedia article on SMT gives a list of "theories" that MOSES could support.
For other discussion, see this email thread
Difficulty: 2 for reduct, 3 for representation building
Use SVM for program tree selection
Try replacing the Bayes-net modeling in MOSES with SVM. SVM with tree kernels have now matured somewhat, see e.g.
http://www.xialuo.net/2010/11/07/svm-light-tk-1-2/
So the idea would roughly be to:
- use SVM to learn a model distinguishing fit from unfit program trees.
- use this model as a filter on generated program trees, to determine
which ones are worth fitness-evaluating.
Difficulty: 2
Transfer Learning
Causing MOSES to generalize across problem instances, so what it has learned across multiple problem instances can be used to help prime its learning in new problem instances. This can be done by extending the probabilistic model building step to span multiple generations, but this poses a number of subsidiary problems, and requires integration of some sort of sophisticated attention allocation method into MOSES to tell it which patterns observed in which prior problem instances to pay attention to.
Pre-requisite: Library of Known Routines, above.
Difficulty: 2-3
Arbitrarily Complex Program Learning
More on the previous project suggestion: The motivation for the above is to allow MOSES to learn arbitrarily complex programs. For instance, we would like it to be able to easily learn nlogn sorting algorithms without any fancy data preparation or other "cheating." It is possible that integrating Sinot's formalism into MOSES will allow effective learning of moderately complex programs using recursive control, which is something no one has achieved before and which is of critical importance in automated program learning.
Pre-requisite: Library of Known Routines, above.
Difficulty: 3
Action-Sequences Handling
The current version of MOSES does not elegantly or efficiently handle the learning of programs involving long sequences of actions. This is problematic for applications involving the control of robots or virtual agents. So, an important project is the extension of the Reduct and representation-building components of MOSES to effectively handle action-sequences. This work will be testable via using MOSES to control agents in virtual worlds such as Multiverse or Unity.
GSOC_2012_FLAG -- Integrate MOSES with the current OpenCog-controlled game agent
The MOSES program learning algorithm was used for imitation and reinforcement learning by virtual pets in the Multiverse virtual world, in a previous version of OpenCog. Now we are using OpenCog to control game characters in a Unity3D based game world inspired by Minecraft, and overall this is much more sophisticated than the previous virtual pet system. It uses PLN for reasoning, OpenPsi for goals and motivations, frequent pattern mining, and lots of other cool stuff. But it doesn't yet use MOSES.
Simply porting the virtual-pet MOSES approach to the new game-agent control system isn't a good idea, because lots of changes to the architecture have been made since then. But integrating MOSES within the new architecture should not be extremely problematic.
This project is good for someone who wants to become very familiar with the use of OpenCog to control game agents in Unity3D.
Improved hBOA
MOSES consists of four critical aspects: deme management, program tree reduction, representation-building, and population modeling. For the latter, the hBOA algorithm (invented by Martin Pelikan in his 2002 PhD thesis) is currently used, but we've found it not to be optimal in this context. So there is room for experimentation in replacing hBOA with a different algorithm; for instance, a variant of simulated annealing has been suggested, as has been a pattern-recognition approach similar to LZ compression. A student with some familiarity with evolutionary learning, probability theory and machine learning may enjoy experimenting with alternatives to hBOA so as to help turn MOSES into a super-efficient automated program learning framework. It already works quite well, dramatically outperforming GP, but we believe that with some attention to improving the hBOA component it can be improved dramatically.
Dimensional Embedding for Improved Program Learning
Suppose one is using MOSES (or some related technique) to learn a program tree that contains nodes referring to semantic knowledge in a large knowledge base (e.g. a program tree that contains terms like "cat", "walk" etc. that represent concepts in OpenCog's AtomTable).
Then, mutating these nodes (for "representation building", in MOSES lingo) requires some special mechanism -- for instance, one wants to mutate "cat" into "some other concept that is drawn from a Gaussian of specified variance, centered around 'cat'". One straightforward way to do this is to embed the concepts in the semantic knowledge base in an n-dimensional space, and then use Gaussian distributions in this dimensional space to do mutation.
We have identified some good algorithms for dimensional embedding, but the coding needs to be done, and a bunch of fiddling will likely be required!
MOSES Evolution of Recurrent Neural Nets
In 2009 Joel Lehman worked on Extending MOSES to evolve Recurrent Neural Networks. He did a great job but due to a limitation in MOSES' capability in handling continuous knobs he could not fully experiment with his work. Continuous knobs are now fully supported but the performance of NN evolutionary learning has only shown a slight increase and does not currently out-perform other more classical approaches. We don't know why, could be a bug in the code or a design error or something more fundamental.
So the goal of this project would be to pursue Joel Lehman's work, to first try and understand the low performance and fix/improve it, and then if time permits implement and experiment with other classes of RNN.
PLN
- Language: advanced C++
- Code: TBD
- List: TBD
- License: TBD
Probabilistic Logic Networks (PLN) are a novel conceptual, mathematical and computational approach to uncertain inference. In order to carry out effective reasoning in real-world circumstances, AI software must robustly handle uncertainty. However, previous approaches to uncertain inference do not have the breadth of scope required to provide an integrated treatment of the disparate forms of cognitively critical uncertainty as they manifest themselves within the various forms of pragmatic inference. Going beyond prior probabilistic approaches to uncertain inference, PLN is able to encompass within uncertain logic such ideas as induction, abduction, analogy, fuzziness and speculation, and reasoning about time and causality.
Planning via Temporal Logic & Consistency-Checking
A general and elegant way to do planning using PLN would be
- Use explicit temporal logic, so that the temporal dependencies between different actions are explicitly encoded in Atoms using relations like Before, After, During, etc.
- Improve the backward chainer so that, when it spawns a new Link in its chaining process, it checks whether this link is logically consistent with other promising Links occurring elsewhere in the backward chaining tree... (where if A and B are uncertain, logical inconsistency means that (A and B) has a much lower truth value strength than one would expect via an independence assumption.)
The search for logical consistency can be done heuristically, via starting at the top of the back-inference tree and working down. If quick inference says that a certain Link L is consistent with the link at a certain node N in the BIT, then consistency-checking with the children of N is probably not necessary.
This approach would subsume the heuristics generally used in planning algorithms, into a generally sensible method for inference control...
But, while conceptually straightforward, this would be significant work to implement.... So, for the near term, Jade is working on a simpler state-space planner extending the current, crude planner....
Intensional reasoning
Intensional inheritence has been implemented but more tests are needed, like comparing the results to data regarding human intensional inference (2)
Spatial and temporal reasoning
Implement spatial and temporal reasoning into PLN, using the Region Connection Calculus (for space) and Allen's Interval algebra (for time). (3)
Combining PLN and MOSES
Integrate MOSES-based supervised categorization into PLN, so that when PLN chaining hits a confusing point, it can launch MOSES to learn patterns in the members of the Atoms at the current end of the chain (which may then provide additional information useful in pruning). (1)
History-Guided Inference
Cause PLN's backward and forward chaining inference to utilize history -- so that an inference step is more likely to be taken if similar steps have been taken in similar instances. (1)
Some notes on how to do this, written in early May 2012 in the context of Dingjie Wang potentially starting work on the topic...
Here's an example inference tree pattern that might be mined from a data store of inference trees:
Implication
AND
Evaluation navigating me
Evaluation stuck me
Evaluation near (me $X)
SimilarityLink $X $Y
Evaluation
UsefulForInference
SimilarityLink $X $Y
Let's call this pattern "P1", just for simplicity...
What this pattern P1 says, informally, is "If I'm stuck, and I'm near $X, and $X is similar to $Y ... then inferences involving (SimilarityLink $X $Y) may be useful.."
If we had ContextLinks, a nicer version of the above would be
ContextLink
navigation
Implication
AND
Evaluation stuck me
Evaluation near (me $X)
SimilarityLink $X $Y
Evaluation
UsefulForInference
SimilarityLink $X $Y
but for the moment, the ContextLink-free version P1 given previously will be OK...
For an initial study in the context of navigating, the relevant form of the pattern would just be
Implication
AND
Evaluation stuck me
Evaluation near (me $X)
SimilarityLink $X $Y
Evaluation
UsefulForInference
SimilarityLink $X $Y
The conceptual definition of the predicate UsefulForInference (which does not yet exist in the code or in the PLN book), is:
Evaluation UsefulForInference W
means that doing inference using W is more likely than not to increase the speed or quality of one's inference conclusion.
The practical definition of UsefulForInference is:
Evaluation UsefulForInference W
means that W occurs more often in "good" sub-dags than "bad" sub-dags, in the historical inference dags that have been stored for analysis.
The goal of history-based inference control is to learn patterns like P1 from a historical database of pairs of the firm
( inference dag, context of inference dag)
The "context" of an inference dag consists of Atoms referenced in that dag, and other Atoms linked to them.
In order to make the learning tractable, we will initially want to specify the predicates involved. Spatial predicates are a good start, if the initial context of study is PLN navigation. The above example used "stuck" -- we may want to invent some other predicates representing practical contexts that a system may get into while navigating.
The use of context here complexifies things, yet, also makes them more realistic and powerful. In humans, inference control really does depend on context, and on patterns learned via analyzing the relationships between inferences and their contexts.
It seems to me that the pattern learning could reasonably be done using MOSES, acting on program trees of restricted form. For starters maybe we could look at patterns of the restricted form
Implication
AND
Link_1 targ_11 targ_12
...
Link_k targ_k1 targ_k2
Evaluation
UsefulForInference
AND
Link_(k+1) targ_(k+1)1 targ_(k+1)2
...
Link_(k+r) targ_(k+r)1 targ_(k+r)2
where the targets could be concrete Atoms or variables... and initially k and r would be kept pretty small...
Steps toward implementing this would seem to be:
1) Build the mechanism for storing a bunch of (inference dag, context) pairs [probably in the Atomspace?]
2) Run a bunch of PLN navigation experiments in different situations, and save the inference dags & contexts
3) Do some basic statistical analysis on this datastore of inference dags & contexts, so as to understand it
4) Customize MOSES to use the inputs, and the restricted functional form, needed to learn the above sorts of patterns (like P1)
5) Apply MOSES and see what patterns it learns
Concept Formation
GSOC_2012_FLAG: Blending
Implement conceptual blending as a heuristic for combining concepts, with a fitness function for a newly formed concept incorporating the quality of inferential conclusions derived from the concepts, and the quality of the MOSES classification rules learned using the concept. (1)
Map formation
Implement "map formation," a process that uses frequent subgraph mining methods to find frequently co-occurring subnetworks of the AtomTable, and then embodies these as new nodes. This requires some extension and adaptation of current algorithms for frequent subgraph mining. It also requires functional attention allocation. (1)
Context formation
Implement context formation, wherein an Atom C is defined as a context if it is important, and restricting other Atoms A to the context of C leads to conclusions that are significantly different from A's default truth value. (2)
Attention Allocation and Psychopathology
Some intriguing parallels may be drawn between the dynamics of attention allocation and various human mental disorders. The analogies to human psychological issues shouldn't be taken very seriously; it was more a way of thinking about the various dynamical phenomena that could be observed by appropriately fiddling with the parameters of attention allocation and its relationship to OpenPsi...
...
- Obsession. Focusing too long on just one thing. This could be achieved via not having the STI of Atoms in the AttentionalFocus delay rapidly enough. This should result in more and more HebbianLinks being built btw the Atoms in the AF, which causes the obsession to be reinforced yet further.
- Attention deficit disorder. Inability to pay attention to anything. This could be achieved via having the STI of Atoms in the AF decay too fast. Not many strong associations are learned because nothing stays in the AF that long. However, if the mind is in a situation where a lot of stimuli are constantly streaming in, from some coherent context, then the mind will pay attention to that same context for a while and build associations regarding it; this is similar to how folks with ADD can often be good at focusing on video games or other contexts involving rich fast streams of data and total interaction.
- Hypernania.* This would happen if a mind became very happy with each new thing that came into its AF, but then rapidly became unhappy with it afterwards. So, novelty must be a very highly weighted goal; perhaps the goal would be formulated as fulfilled if things with high novelty have high STI.
- Depression.* One kind of depression involves dwelling on unhappy thoughts. One way this could happen is if: 1) certain unhappy thoughts (thoughts linked to low satisfaction) are in the AF for a while, 2) everything that passes through the AF gets HebbianLinks built to these unhappy thoughts, 3) PLN inference builds yet more links between Atoms and the unhappy thoughts; 4) all these HebbianLinks then pull the unhappy thoughts back into the AF, whenever other things linked to the unhappy thoughts are in the AF...
OpenCog affiliated projects
Natural Language Processing (NLP)
The overall OpenCog NLP pipeline is described here. The pipeline has many pieces, including the Link Grammar parser and RelEx as pre-processing stages. Output is fed to OpenCog for reasoning. The projects below address various aspects of the pipeline.
Although NLP and language comprehension are core foci of OpenCog, many of the language pipeline tools are not currently within the core OpenCog framework and instead do a lot of pre-processing before feeding their output into the OpenCog AtomSpace for reasoning and learning. Eventually we plan to move this NLP processing into the core so that language rules can be updated based on experience.
RelEx
- Code: https://code.launchpad.net/relex/
- List: http://groups.google.com/group/link-grammar (mentors: Linas and Murilo)
- License: BSD, Apache 2.0
- Languages: C, C++, Java, Scheme, Perl
RelEx is an English-language semantic relationship extractor, built on the Carnegie-Mellon link parser. It can identify subject, object, indirect object and many other relationships between words in a sentence. It can also provide part-of-speech tagging, noun-number tagging, verb tense tagging, gender tagging, and so on. Relex includes a basic implementation of the Hobbs anaphora (pronoun) resolution algorithm. Optionally, it can use GATE for entity detection. RelEx also provides semantic relationship framing, similar to that of FrameNet.
The output from RelEx is a hypergraph of Nodes and Links, which are input into OpenCog, and may then be processed in various ways. In addition, the LexAt (lexical attaction) package provides many scripts and database tools to collect statistical information on parsed sentences.
PLN Inference on extracted semantic relationships
Currently RelEx takes in a sentence, and outputs a set of logical relationships expressing the semantics of the sentence. It is possible to take the logical relationships extracted from multiple sentences, and combine them using a logical reasoning engine, to see what conclusions can be derived. Thus, for example, the English-language input Aristotle is a man. Men are mortal should allow the deduction of Aristotle is mortal.
Some prototype experiments along these lines were performed in 2006, using sentences contained in PubMed abstracts (see paper). But no systematic software approach was ever implemented.
These experiments could be done in a variety of rule engines, including Probabilistic Logic Networks, as well as more standard crisp rule engines such as SWIRLS.
This project is appropriate for a student who is interested in both computational linguistics and logical inference, and has some knowledge of predicate logic.
A simple but detailed example of language-based inference using RelEx and PLN is given here: File:RelEx PLN Example Inference.pdf.
Related working notes can be found in the OpenCog source code tree, in the directory opencog/nlp/seme/README and opencog/nlp/triples/README.
See also NLP-PLN-NLGen pipeline.
Link Grammar Parser
A number of improvements to the Link Grammar parser would be useful.
- Implement left-to-right, limited-window search through parse space. The parser currently examines sentences as a whole, which means that the parsing of long sentences becomes very slow (approximately as N^3, with N the number of words in the sentence). Implementing a "window" to limit searches for connections between distant words should dramatically improve parse performance. Not only that, but it makes the parser more "neurologically plausible", by limiting difficult, long-range correlations between words. The window algorithm can be thought of as a kind of Viterbi decoder. This project is technically challenging, and requires a reasonable grounding in the theory of context-free grammars and at least a passing acquaintance with the idea of chart parsing, the backward-forward algorithm, and the Viterbi algorithm. This project requires a deep dive into the C code that implements the parse algorithms of Link Grammar. This project is the most critical, outstanding fix needed for Link Grammar; the slowness of long-sentence performance is the biggest thing holding back link grammar at this time.
- Implement support for "speech registers", such as newspaper headlines, medical notes, children's books, conversational speech. The style of linguistic expression varies depending on context: for example, in newspaper headlines, articles are frequently omitted. Children's books often sometimes use archaic grammatical constructions not commonly used in adult writing. During conversational speech, the rules of grammar are significantly relaxed; run-on sentences can predominate. Thus, a "one-size-fits-all" set of grammar rules has trouble with these situations, because the kind of grammar rules appropriate in one context might not be appropriate in another. Loosening parse rules sufficiently to work well for newspaper headlines or chat will result in rules that are too loose for book or newspaper text, generating an over-abundance of incorrect parses. Thus, a better approach is to apply a different set of grammatical rules, based on the context. This project requires modifications to the Link Grammar parser internals to allow for a different parse scoring mechanism for different speech situations, as well as the development of altered grammar rules that can begin to tackle some of these different "speech registers".
- Currently, Link Grammar is unable to properly handle quoted text and dialogue. A mechanism is needed to disentangle the quoting from the quoted text, so that each can be parsed appropriately. This might be done with some amount of pre-processing (e.g. in RelEx), or possibly within Link Grammar itself.
Learning simple grammars
RelEx uses the CMU Link Grammar as its underlying English-language parser. Link Grammar's weak point is short sentences, simple commands, directives and the like; sentences which typically occur in chat rooms. The project is to tinker with with the automatic acquisition of language via some learning mechanism or another.
Jianfeng Gao,Hisami Suzuki, (2003) "Unsupervised Learning of Dependency Structure for Language Modeling" describe a method for learning dependency grammars. The parser currently used by OpenCog, the Link-Grammar parser, has many similarities to dependency grammars. The paper John Lafferty, Daniel Sleator, and Davy Temperley. 1992. "Grammatical Trigrams: A Probabilistic Model of Link Grammar." Proceedings of the AAAI Conference on Probabilistic Approaches to Natural Language, October, 1992. describes statistical learning algos to generate Link Grammar rules. See also the link grammar bibliography for details.
Yuret's (1998) algorithm tries to create a dependency tree by computing mutual information (MI) between word pairs. The tree is discovered by computing the maximum spanning tree of the MI between all word pairs. There is an alternate approach: true, hierarchical clustering. In hierarchical clustering, one creates an MI-based metric (MI alone is not a metric), and applies cluster analysis techniques. To get hierarchy, one also looks for and computes metric measures between clusters i.e. one computes the MI between a third word, and a word-pair, instead of just the third word, and the head-word of the pair phrase.
It is not clear how adaptable these algorithms would be to the short, frequently ungrammatical sentences seen in chatrooms; and what mechanisms they suggest for keeping such learning from garbaging up the correct parses of more complex sentences. It is quite possible that link grammar needs new link types which would be used only in short sentences. (!) The idea of learning new link types seems unexplored.
The project is then to learn new Link Grammar links and rules, suitable for parsing the kind of speech often seen in chat rooms.
Statistically similar expressions for question answering
Proper language-based inference requires that similar expressions be recognized as such. For example, X is the author of Y and X wrote Y are more or less synonymous phrases. Lin and Pantel describe how to automatically discover such similar phrases (See Dekang Lin and Patrick Pantel. 2001. "Discovery of Inference Rules for Question Answering." Natural Language Engineering 7(4):343-360). A similar, but more sophisticated approach is given by Hoifung Poon & Pedro Domingos, 2009, "Unsupervised Semantic Parsing".
Both of these systems use statistical techniques to find commonly occurring sub-expressions. Both start with text that has been processed with a dependency parser. While the first searches for chains of grammatical dependency relations that can be taken as synonymous, the second searches for arbitrary patterns that can be clustered into the same "synonym" or "concept". The two papers also differ in the statistical measures used to assign synonymous patterns into a common cluster, but it is not clear which statistical technique is better, or if that even matters very much. Lin uses measures based on mutual information, while Domingos applies the theory of Markov Logic Networks.
The goal of this task would be to implement a frequent-pattern-mining system within the OpenCog infrastructure. That is, given a large number of OpenCog hypergraphs, common subgraphs are to be identified, and then, using some statistical measure (such as mutual information, or possibly MLN, or another distance measure), such hypergraphs would be clustered together. Each such discovered cluster can be thought of as being a "concept"; the elements of the cluster are "synonymous expressions for the concept".
Ideally, by searching not just for synonyms, but for common patterns in general, the system will hopefully automatically find "speech formula", or perhaps "lexical functions" as described in Melcuk's Meaning-Text Theory. Presumably, such clustering techniques should also be able to group similar things, such as a cluster containing names of musical instruments, etc.
The discovery of such clusters should be useful both for question answering, as well as for natural-language generation (as clusters would contain different ways of expressing the same "idea".)
Classifying words by grammatical usage, learning new grammar rules.
One of the datasets associated with OpenCog is a large database associating words with their grammatical usage; another database contains triples of (word, grammatical usage, word sense). These databases contain millions of entries, and are largely unexplored. They could be data-mined for clusters of words that are used in grammatically similar ways. This project would require evaluating and selecting appropriate open-source clustering software. The result of applying clustering would include identifying how new, unknown words should be treated grammatically (is the unknown word being used as a noun, or a verb?)
Clusters might distinguish between types of words that link-grammar treats as being the same (such as most adjectives, adverbs), but are, in fact, used quite differently by English speakers. For example, the parser treats acetylene and adroitness as nouns belonging to the same class; yet clearly one would never say the adroitness exploded. Perhaps clustering could help split this class of words into smaller, more refined classes. Perhaps clustering might also distinguish different semantic senses: I tend to sheep and I tend to agree use the word tend in two dramatically different senses. Can grammatical clustering distinguish between these senses?
Given that Google has already released N-gram data from their book scanning, it would be worth investigating the use of this too.
Some words may be in classes that are too narrow: parses fail because the parser does not know that the word can be used in a broader grammatical context. By examining how a word is used, perhaps clustering could identify such cases as well.
Natural Language Generation
We have a Java software system called NLGen that, given a set of RelEx-like relations, generate a syntactically correct English-language sentence.
It works OK on short sentences but needs to be improved significantly to be able to handle sentences with complex phrasal or clausal structure. Also, some statistical linguistics should be inserted to allow it to use observed word frequencies to guide its sentence generation.
The basic idea underlying NLGen's algorithm is described here: SegSim
Other language generation approaches include Markus Guhe's (2003) "Incremental conceptualisation for language production"
Explore Landmark Transitivity in Link Grammar
The Link Grammar uses a constraint of "planar graphs" (i.e. no link crossings) to rule out unreasonable parses. It seems that it might be possible to replace this rule by the notion of "Landmark Transitivity" taken from Hudson's Word Grammar. The basic idea is this:
Each Link Grammar link is given a parent-child relationship: one end of the link is the parent, the other the child. Thus, for example, given a noun to noun-modifier link, the noun is the parent of the link. Then, parents are landmarks for children. Transitivity (in the mathematical sense of "transitive relation") is applied to these parent-child relationships. Specifically, the no-links-cross rule is replaced by two landmark transitivity rules:
- If B is a landmark for C, then A is also a type-L landmark for C
- If A is a landmark for C, then B is also a landmark for C
where type-L means either a right-going or left-going link.
Ben hypothesizes that adding Landmark transitivity might be able to eliminate most or all of Link Grammar's post-processing rules. See Ben's PROWL grammar for details. See also Natural Language Processing, below.
Word Grammar Parsing
Implement a Word Grammar parser that utilizes the link grammar dictionary, and utilizes PLN inference for semantic biasing of the parsing process. This gives a strong impression of being an approach to NL comprehension that is suitable for general intelligence.
Inferring Semantic Mapping Rules
Use PLN to combine the hand-coded semantic normalization rules that exist in the RelExToFrame rule-base, to form new rules. Also, to generalize the rules to other words not currently covered by them, via combining the rules with semantically-based concept-similarity measures.
Improved reference resolution
Reference resolution (anaphora resolution) is the problem of determining what words like "it", "him" and "her" refer to. There are several ways in which this problem might be attacked.
- Statistical methods: The current RelEx implementation uses the Hobbs algorithm for this, which is an intelligent but crude mechanism that achieves about 60% accuracy. By combining Hobbs algorithm with statistical measures (one of which is sometimes called the Hobbs score), it should be possible (according to literature results) to get up to 85% accuracy or so. Appropriate for anyone who is interested in getting some hands-on experience with statistical corpus linguistics.
- Reasoning: Sentences commonly make assertions, which can be checked for accuracy. So, for example: Jules went to Paris. He saw many things there. e.g.
- Does he refer to Jules, or to Paris?
- Can Jules see things?
- Can Paris see things?
- Does there refer to Paris, or to Jules?
- Can one see many things in Jules?
- Can one see many things in Paris?
This approach is more powerful than a statistical approach, but requires establishing a large knowledge-base, and performing PLN reasoning there-on.
OpenBiomind
- Language: Java
- Code: http://code.google.com/p/openbiomind
- List: http://groups.google.com/group/openbiomind
- License: GPLv2
OpenBiomind contains code for applying genetic programming to analyze gene expression microarray data and SNP data. This approach has been successfully used to learn diagnostic rules for cancer, Alzheimer's, Parkinson's and other diseases, as reflected in several publications.
Neurobiological data analysis
Add necessary data types preprocessors (e.g. MEG, fMRI, EEG, PET, etc.), analysis algorithm tweaks, documented methodologies and other facilities for analysis of neurobiological data. Many public neurobiological databases exist (falling under the umbrella of the Human Cognome Project), for example the Allen Brain Atlas and others.
DeSTIN
DeSTIN stands for Deep SpatioTemporal Inference Network and is a scalable deep learning architecture that relies on a combination of unsupervised learning and Bayesian inference. A paper by the inventors of this method is available. It is well suited to become the vision module of OpenCog and has been adopted by the project to tidy up the code for release as OSS. DeSTIN is implemented in CUDA.
Several people are already working on DeSTIN, but having someone involved to assist with the packaging, implement a testing framework using CTest, and/or create Python language bindings will help develop it into a worthwhile project in its own right.
GSOC_2012_FLAG -- Connecting DeSTIN and OpenCog
Here is one AI-intensive idea for a project involving DeSTIN (by Ben Goertzel)
I want to feed input from a webcam into DeSTIN, and then (in abstracted form) into OpenCog. Toy problems like MNIST are not really interesting to me, because my research goal in this context is robot vision. So scalability is important.
There will be a "smart interface" between DeSTIN and OpenCog, that will basically
-- record a database of DeSTIN state vectors
-- identify patterns in that database using machine learning algorithms
-- output these patterns into OpenCog's probabilistic/symbolic knowledge base
Here are some detailed ideas about how to potentially do this
That is enough for this summer. Later I also would like to take information from OpenCog and send it back to DeSTIN to bias DeSTIN's search, but that's a more advanced topic for later....
Operationally, what we need for this is version of DeSTIN that is scalable enough to take input at a reasonable rate of frames per second from a webcam, and keep on adapting its state based on what it sees. Michel's DeSTIN version did this, but Itamar's lab has other versions that may also, I'm not sure.
This may require implementing some sort of forgetting mechanism inside DeSTIN, so that it can get rid of some old centroids when the scene changes a lot.
I don't aim to get any fancy practical tasks accomplished this summer, using this integrated system. I just hope to get really simple stuff to work, like
-- feed DeSTIN pictures of cars, motorcycles, bicycles, unicycles and trucks (videos would be great, but let's start with pictures)
-- see if the whole system results in OpenCog getting ConceptNodes for car, motorcycle, bicycle, unicycle and truck, with appropriate logical links between them like
Inheritance car motor_vehicle <.9>
Inheritance truck motor_vehicle <.9>
Inheritance bicycle motor_vehicle <.1>
Similarity bicycle unicycle <.7>
Similarity bicycle motorcycle <.7>
Similarity motorcycle car <.6>
Similarity bicycle car <.3>
Note that while I'm using English terms here, in this experiment the OpenCog ConceptNodes would have no English names and English words would not be involved at all -- I'm just using the English terms as shorthands.
This would be a first example of the bridging of DeSTIN's subsymbolic knowledge and OpenCog's symbolic knowledge, which would be pretty cool and a good start toward further work on intelligent vision ;)
OpenCog Applications
This section lists projects related to using OpenCog components for specific applications.
GSOC_2012_FLAG -- Reasoning about Biological Interactions
There are many databases denoting interactions between genes, proteins, chemical compounds and other biological entities. These can be imported into OpenCog's AtomSpace format (after some scripting to reformat them) and then reasoned on using PLN.
An example of the value of this kind of reasoning would be: using analogical inference to approximatively transfer knowledge from one organism to another, e.g. flies or mice to humans.
Sokoban
Sokoban would be a good toy domain for experimenting with various OpenCog methods. So hooking up OpenCog to a simple Sokoban server would seem worthwhile. MOSES and PLN could be used for Sokoban on their own, but it's more interesting of course to take an integrative approach. (1)
Keyvan has written code to represent Sokoban levels in the AtomSpace, but it would also be useful to implement Sokoban in the Unity virtual world.
Embodiment
These projects are related to supporting the embodiment of OpenCog in real and virtual environments.
GSOC_2012_FLAG -- Inference regarding skeletons and surfaces of objects in a 3D game world
How can an OpenCog controlled game character, in a 3D game world implemented in Unity3D, reason about solid objects (represented either as 3D models or as Minecraft-like assemblages of bricks)?
One way is to use existing OSS code to map solid objects into skeletons, and to identify the parts of the surface of the objects. This information about skeletons and surfaces can then be fed into the PLN (Probabilistic Logic Networks) engine for reasoning.
This is good for someone with experience or very strong interest related to 3D modeling, graphics or game programming, as well as OpenCog and logical inference.
Implement new adaptors
These were formerly, but somewhat eroneously called "proxies", and are used to connect OpenCog with an environment, real or virtual. These adaptors need to interpret the XML actions passed by OpenCog into avatar actions within the environment as well as generate perceptual updates as the environment changes.
One core aspect this involves is building a NetworkElement on the world-side (whether it's in a robot or game engine) so that the adaptor can join the Embodiment System Architecture.
Some potential candidates:
- Create an adaptor to allow OpenCog to join Minecraft games or other online gaming worlds.
- Create an adaptor that lets OpenCog act as though it is embodied while browsing the web.
- Create an IRC or IM embodiment.
Convert message format from XML to protocol buffers
XML is slow and verbose. Google protocol buffers are better choice for passing perceptual updates and actions between OpenCog's "mind" and the environment it's interacting with. This project would involve a conversion of XML messages into protocol buffer objects, and given that some of the XML messages are built using string concatenation it may require some refactoring of the messaging system.
GSOC_2012_FLAG -- Robotics
Extend OpenCog to communicate with physical robots, using a toolkit like [ROS: Robot Operating System], PyRo, the Player-Stage Framework, or something similar. As mechanisms for communicating with agents in virtual worlds will be provided, this is not a huge conceptual leap, but will doubtless lead to many practical complexities. (1-3 depending on how far you go)
Work has been done toward this project by the Artificial Brain Lab at Xiamen University in Fujian, China.
- XIA-MAN: An Extensible, Integrative Architecture for Intelligent Humanoid Robotics PDF, Ben Goertzel, Hugo de Garis