FOR GSoC 2017 Students
Please see the page Early 2017 Ideas_Page
See also StudentApplicationTemplate, Development. OpenCog By Way of Wikipedia and Publications for background reading. Note that we welcome any interested volunteers to work on these projects and not only GSoC applicants!
SOME PARTLY OBSOLETE CONTENT BELOW
The rest of this page outlines many project ideas for the further development of OpenCog and related projects. It's kind of a mess -- e.g. some of the project are already done.... But the material is left here for the moment because there are some useful bits here and there; it will be cleaned up "soon"....
Most of the projects here are are given difficulty labels:
- Relatively straightforward AI R&D, though not easy
- Pretty difficult
- Rather difficult
Note that these labels refer only to the AI aspects of the task; there may also be difficult software engineering or systems integration aspects, which haven't been labeled. Best to ask us directly if you are concerned about the potential challenge a project presents.
- Language: C++ (uses Boost and templates), plus Python, Haskell and Scheme in some parts
- Code: https://github.com/opencog/opencog
- Lists: Discussion
- License: AGPLv3 + linking exception
OpenCog consists of a core framework with several modules. This section outlines projects relating to improving the framework that supports the rest of the modules and MindAgents built on it. Unless otherwise specified, these projects don't require advanced AI related background knowledge.
Packaging for Debian, Cygwin, OS X
Packaging OpenCog components like CogServer and Embodiment for popular operating systems could help to expand our user and developer communities.
Currently OpenCog compiles on Linux and is not packaged for any OS. A porting and packaging project would involve working with CMake files, gcc compiler, linker, and shared object loader specifics, and packaging tools for Debian/Ubuntu, Cygwin or MinGW, and OS X (.pkg installer packages, ports or homebrew).
See Packaging for project requirements and steps.
Porting OpenCog to Windows
Cygwin is likely the easiest way to port the OpenCog codebase to Windows. But the Windows platform has some downsides, the main one being loss in speed. It's not clear whether OpenCog would suffer from that since most of its resources are spent on scientific calculations rather than system calls but in case the Cygwin version is indeed slower than it should be a Windows port might be worth considering. But this would only work in the mid/long term if someone is always willing to maintain the compatibility over time. Given that OpenCog is in active development it would quickly become out of date otherwise.
Distributed and Persistent AtomSpace Performance
Provide distributed AtomSpace persistence and synchronization workloads and benchmarks.
- Obtaining a "typical" workload so that performance can be understood.
- Developing scripts to run that workload across multiple systems, booted with Vagrant containers.
- Expanding the existing benchmark to be able to run that workload, and measure performance.
- Perform actual benchmark measurements on the existing back-end.
- Possibly expand the existing backend to support additional indexes that might be required by the workload.
- Possibly create a new backend for a different database system.
The current best-idea for doing this is being tracked in github issue #1502. It requires first removing most uses of SetLink and replacing it by MemberLink. An atom can then be made to belong to an atomspace by using a new AtomSpaceLink. This removes much or most of the implementation complixity, opening the door to a port of the atomspace to Apache Ignite, which currently seems like the best bet for a distributed platform.
Design and implement initial framework towards a cognitive API which can scale up to satisfy the highlevel overview described @ http://wiki.opencog.org/w/A_Cognitive_API
Improved Layout for AtomSpace Visualizer
Scott Jones has written a nice AtomSpace visualizer, web-based and interacting with the AtomSpace via a REST API. However, the graph layouts provided with the Atomspace visualizer are often suboptimal given the topology of the AtomSpace subgraphs. For instance, when the subgraph being visualized is tree-like, it would be good to use a tree-like layout manager.
- Looks good for Semester of Code; Ben or Cosmo could mentor
OpenCog Visualization Workbench
The AtomSpace visualiser only addresses one aspect of a UI for understanding OpenCog dynamics. A generic GUI workbench, which can handle the parsing of log files and dynamic configuration of a running OpenCog instance, would be useful.
Other features that such a workbench might have:
- graphing dynamics through time:
- total atoms of various types
- distribution of truthvalues
- distribution of atom importance (STI/LTI)
- average coordination number (number of links atoms are connected with)
- network diameter
- # of disjoint clusters
- visualising and controlling expansion of the backward inference tree of PLN.
- active MindAgents and their resource usage (memory, CPU)
This could also be integrated with, or made to incorporate the functions of, the OpenPsi GUI that Zhenhua Cai has developed. That UI uses:
- zeromq messages for monitoring state changes from OpenCog.
- implementation in Python.
- pyqt for widgets.
- Looks good for Semester of Code; Ben or Cosmo could mentor
User defined indexes
The AtomSpace is essentially a certain kind of graphical database. Like other databases, it should allow users to define the kinds of structures that they need to be able to quickly locate. This could be done with a user-defined index; the index would be maintained by running the pattern matcher against newly-created atom configurations.
This requires work to be done in conjunction with the type checking proposal, above.
Usefulness: We've had the ability to do this for a decade now, but over the last 10 years, no one has wanted this. Is this actually interesting to anyone?
Difficulty: mostly straight-ahead coding, little/no AI research needed. The hard part will be understanding the existing code base, to understand where/how to slot this in.
- Looks good for Semester of Code; likely Linas could mentor
Currently in the Atomspace, nothing stops you from creating
InheritanceLink NumberNode "9" GroundedSchemaNode "print"
even though this is nonsense....
In the AtomSpace, type checking is needed so as to prevent users from adding inappropriate atoms to the atomspace. It is also a vital part of being able to provide user-defined indexes. (See below). The type subsystem is already rich enough to fully describe atom types; its just not set up to monitor the creation of specific atoms. Not that Some/many atom types have C++ classes that have constructors that do validite the structure of the atom. For the above example, it would not be hard to create an InheritanceLink class to perform that checking.
Usefulness: Is this really useful? There's a performance impact of doing this everywhere, and the important classes e.g. BindLink, already do this.
Difficulty: mostly straight-ahead coding, little/no AI research needed. The hard part will be understanding the existing code base, to understand where/how to slot this in.
- Looks good for Semester of Code; likely Linas could mentor
Type-checking for PLN
The ArrowLink is a function-type constructor, allowing the type A->B to be specified, where A->B means a function accepting A as input, and returning B. For example, a function accepting a number as an input and a Concept as an output can be written as
The TypedAtomLink can be used to associate the above function signature to a particular atom; for example:
TypedAtomLink SchemaNode "foobar schema" ArrowLink TypeNode "NumberNode" TypeNode "ConceptNode"
The above states that the Schema "foobar schema" is a function that takes a number as an input, and returns a concept as an output.
Right now type restrictions are left implicit in the Atomspace, which is generally OK since we're mostly dealing with simple cases at the moment.
However, to extend PLN, concept creation and other functions to deal with higher-order types effectively, explicit typing will probably be needed. This will be useful for abstract reasoning, not just mathematical reasoning but e.g. for analogical reasoning between significantly different domains, or experiential learning of symbol groundings or linguistic rules.
See also above, the "type checking" idea.
MOSES - Genetic Program Learning
- Language: advanced C++
- Code: https://code.launchpad.net/opencog (opencog/learning/moses)
- License: Apache 2.0
Meta-Optimizing Semantic Evolutionary Search (MOSES) is a new approach to program evolution, based on representation-building and probabilistic modeling. MOSES has been successfully applied to solve hard problems in domains such as computational biology, sentiment evaluation, and agent control. Results tend to be more accurate, and require less objective function evaluations, in comparison to other program evolution systems. Best of all, the result of running MOSES is not a large nested structure or numerical vector, but a compact and comprehensible program written in a simple Lisp-like mini-language. More at MOSES.
See also the MOSES Ideas page. The contents of this entire section should be moved to there ... and should be cleaned up, expanded.
Add time-series support
Right now, MOSES has no support at all for time series.
This is important for inferring differential equations (from finite differences) which is central for learning motion and movement, which is in-turn central for robot movement and sports (the ball is coming at you; you see it in your video feed. Where does your robot gripper need to be to catch it? When should the gripper close? The ball is arcing due to gravity, and maybe its a nerf ball and slowing due to air drag, so you need two neighboring time-series points to discover the 2nd-order difference equation.)
So for example, from a video feed, you have ball position x,y at times t1, t2, t3 ...
Right now, moses takes as input a collection of rows; these rows could be in any order; they are unrelated to each other. Sure you could give it rows:
t1, x1, y1 t2, x2, y2 t3, x3, y3
and maybe it could infer something, but this is a much harder problem, because it is an integral problem, rather than a differential problem. You want to be able to look at differences between neighboring rows, and right now MOSES cannot do that. Difference equations are much easier to spot than integral equations.
You could hack around this in various ugly ways e.g. putting two or three rows into one, but these are .. hacks. e.g
t1 x1 y1 x2 y2 t2 x2 y3 x3 y3 t3 x3 y3 x4 y4
Create an analogue of OpenBiomind using MOSES
- Language: C++ and R
- Code: http://code.google.com/p/openbiomind
- List: http://groups.google.com/group/openbiomind
- License: GPLv2
OpenBiomind is a Java toolkit contains code for applying genetic programming to analyze gene expression microarray data and SNP data. This approach has been successfully used to learn diagnostic rules for cancer, Alzheimer's, Parkinson's and other diseases, as reflected in several publications.
MOSES is a more powerful tool for microarray and SNP data analysis, than the GP system used in OpenBiomind. Also, the Java code for OpenBiomind is not as flexible and extensible as it should be.
It would be desirable to implement an analogue of OpenBiomind utilizing MOSES. A lot of headway has already been made in this direction using the R language, but more parts still need to be implemented.
- Looks good for Semester of Code; Mike Duncan could mentor
Use kernel methods to pre-filter continuous data
Continuous-valued data is often very noisy, and thus difficult to fit (i.e. difficult to converge to a well-fitting function). One can remove (filter out) a lot of the noise by binning it into discrete bins; and replacing a continuous-valued feature by a discrete-valued one (or as N binary-valued features for some small N).
The problem with binning is that if there are multiple continuous-valued features, they are often strongly correlated with each other. Binning them will smother some of this correlation, which is probably not a good thing. Thus, it would be better if the continuous-valued features were diagonalized into orthogonal components before being binned. The SVM, PCA and perceptron algos are all good ways to perform this diagonalization. They all implement a common, basic idea of diagonalization, refered to as "kernel methods".
So: the idea is to pass continous-valued features through some diagonalization stage before feeding them to MOSES. Note: this requires finding a good dataset which will demonstrate the utility of this approach.
Enable MOSES to learn recursive functions using fold
Within Reduct/MOSES, make use of the algebra of fold, as Moshe hints at the bottom of page 5 of "A Representation of Programs for Learning and Reasoning" (Looks, Goertzel).
Combo currently supports the fold primitive and lists. However, there is currently no clear/practical way of deploying these within the context of MOSES.
Thus, what needs to be done is
- design a way of presenting recursive data to MOSES
- design and code reduction rules for that new programmatic construct,
- upgrade representation building so MOSES can evolve programs using lists and fold.
- test this on some appropriate test cases,
One appropriate test case might be learning an nlogn sorting routine
For instance mergesort can be implemented in Haskell as
mergesort xs = foldt merge  [[x] | x <- xs]
with the function merge a duplicates-preserving variant of union.
XXX: It is not clear to me how we would even present list data to MOSES. Its not clear to me where this kind of data occurs naturally .. I know of no robotic sensors, financial data, biological data, or text/nlp algorithms that naturally create lists of data. So even if we had this function, where would it be applied? How would it be applied?
XXX Given a list, what sort of recursive processing would we want to do on it (besides sorting the list)? Or rather, what sort of recursive structure might some input data possess, such that we would need to automatically learn it? Recall, the point of MOSES is to find hidden, undetected, unknown patterns. What sort of primitive recursive patterns might we plausibly expect to find? Having found them, what would we do with them?
- A detailed explanation of fold is given in G. Hutton (1999). A tutorial on the universality and expressiveness of fold. Journal of Functional Programming.
- SRFI-1 shows fold and other list primitives, as implemented in scheme (lisp)
- Forrest Briggs, Melissa O'Neill, (2006), "Functional Genetic Programming with Combinators"
Generalize reduct and representation building to on operator properties
One problem with both Reduct and representation building is that they are pretty much hard wired for a set of operators, and, or, not, +, *, etc. While what we would want is that they would rely on operator properties instead (i.e. in the sense of category theory, model theory).
For instance instead of having the rule
x + 0 -> x
as well as the rule
x or false -> x
we could have a single rule
x OP NE -> x
where NE is the unit element of operator OP.
XXX -- are there really enough examples of this in combo to warrant this work? So, for example, we have:
- unit element: x+0 -> x, x or F -> x, x and T = x
- generic distributive law
- generic associative law
- generic commutative law
- generic division (??)
But we already have these implemented for boolean and contin (continuous) variables; its not clear that there's much benefit to doing this in general, since we have no plans at this time to use moses to learn groups, ideals, rings, integer domains, modules, vector spaces, magmas, categories, elliptic curves, monoids, schemes, sheaves, fibre bundles, homologies, etc., where these abstract category-theoretic concepts are actually useful and important.
XXX Also, without the work that makes combo/reduct extensible (see proposal above), this work would not be terribly useful, as there is no practical way of adding new operators to the system at this time.
This same idea could be applied to representation building as well though is more complicated. The obvious advantage of that is that when one needs to expand reduct with new operators, no need to add new reduction rules (if the operator doesn't involve new properties), it would suffice to set the properties of the operator and we're done.
In more abstract terms: add support for algebraic structures and their interpretations, in the sense of model theory. The thing above that is called an "operator" is really an "algebraic structure", and "x+0 -> x" and "x or F ->x" are two different "interpretations" of that structure. Such systems are called "theories"; an example where this abstraction is already used is in SMT (Satisfiability Modulo Theories) solvers. The wikipedia article on SMT gives a list of "theories" that MOSES could support.
For other discussion, see this email thread
Difficulty: 2 for reduct, 3 for representation building
Higher-order Programmatic Constructs
An important project, appropriate for a student with some functional programming background, is to extend MOSES to handle higher-order programmatic constructs, including variable expression.
Our design for this involves Sinot's formalism of "director strings as combinators", and there is opportunity to assist with working out the details of the design as well as the implementation. This can be done many ways, including using combinatory logic or lambda calculus. The route that seems best at the moment would be to use Sinot's formalism of "director strings as combinators." Much of the work here is in Reduct and representation-building, which would be useful for both MOSES and Pleasure.
- Looks good for Semester of Code; Nil could mentor
Library of Known Routines
Currently, MOSES constructs programs out of a very minimal collection of functions and operators; for example, boolean expressions are constructed entirely from and, or, not. However, learning of programs could happen much faster if more complex primitives were provided: for example, xor, a half-adder, a multiplexer, etc. Rather than hard-code such functions into combo, this project calls for creating a generic library that could hold such functions, and the infrastructure to allow MOSES to sample from the library when building exemplars.
This is a needed step, on the way to implementing the pleasure algorithm below, as well as for transfer learning, etc.
This might be doable over the summer, although it would still need a very strong programmer. Proof that it is working would be defining xor in some file (as combo, viz. that xor($1,$2) := and(or($1, $2), not (and($1, $2))) and then show that this definition was actually used in solving the k-parity problem.
- Looks good for Semester of Code; Nil could mentor
Complete implementation (or starts a new one from scratch) and then test and explore the Pleasure Algorithm for program learning started in 2011 by Alesis Novik. See MOSES: the Pleasure Algorithm
Pre-requisite: Library of Known Routines, above.
Causing MOSES to generalize across problem instances, so what it has learned across multiple problem instances can be used to help prime its learning in new problem instances. This can be done by extending the probabilistic model building step to span multiple generations, but this poses a number of subsidiary problems, and requires integration of some sort of sophisticated attention allocation method into MOSES to tell it which patterns observed in which prior problem instances to pay attention to.
A much simpler approach is to exchange the MOSES inner loop needs with the outer loop. Unfortunately, this would be a major tear-up of the code base. Exchanging the order of the loops would allow an endless supply of new stimulus to be continuously processed by MOSES, in essence performing continuous, ongoing transfer learning. This should be contrasted to the current design: the input is a fixed-size, fixed-content table, and during training, it is iterated over and over, within the inner loop. If the order of the loops were reversed, then equivalent function could be obtained simply by replaying the same input endlessly. However, there would now be the option to NOT replay the same input over and over, but rather, it could slowly mutate, and the learned structures would mutate along with it. This is a much more phenomenology, biologically correct approach to learning from input stimulus.
Arbitrarily Complex Program Learning
More on the previous project suggestion: The motivation for the above is to allow MOSES to learn arbitrarily complex programs. For instance, we would like it to be able to easily learn nlogn sorting algorithms without any fancy data preparation or other "cheating." It is possible that integrating Sinot's formalism into MOSES will allow effective learning of moderately complex programs using recursive control, which is something no one has achieved before and which is of critical importance in automated program learning.
Pre-requisite: Library of Known Routines, above.
The current version of MOSES does not elegantly or efficiently handle the learning of programs involving long sequences of actions. This is problematic for applications involving the control of robots or virtual agents. So, an important project is the extension of the Reduct and representation-building components of MOSES to effectively handle action-sequences. This work will be testable via using MOSES to control agents in virtual worlds such as Multiverse or Unity.
Use SVM for program tree selection
Try replacing the Bayes-net modeling in MOSES with SVM. SVM with tree kernels have now matured somewhat, see e.g.
So the idea would roughly be to:
- use SVM to learn a model distinguishing fit from unfit program trees.
- use this model as a filter on generated program trees, to determine
which ones are worth fitness-evaluating.
MOSES consists of four critical aspects: deme management, program tree reduction, representation-building, and population modeling. For the latter, the hBOA algorithm (invented by Martin Pelikan in his 2002 PhD thesis) is currently used, but we've found it not to be optimal in this context. So there is room for experimentation in replacing hBOA with a different algorithm; for instance, a variant of simulated annealing has been suggested, as has been a pattern-recognition approach similar to LZ compression. A student with some familiarity with evolutionary learning, probability theory and machine learning may enjoy experimenting with alternatives to hBOA so as to help turn MOSES into a super-efficient automated program learning framework. It already works quite well, dramatically outperforming GP, but we believe that with some attention to improving the hBOA component it can be improved dramatically.
- Looks good for Semester of Code; Nil or Linas could mentor
Dimensional Embedding for Improved Program Learning
Suppose one is using MOSES (or some related technique) to learn a program tree that contains nodes referring to semantic knowledge in a large knowledge base (e.g. a program tree that contains terms like "cat", "walk" etc. that represent concepts in OpenCog's AtomTable).
Then, mutating these nodes (for "representation building", in MOSES lingo) requires some special mechanism -- for instance, one wants to mutate "cat" into "some other concept that is drawn from a Gaussian of specified variance, centered around 'cat'". One straightforward way to do this is to embed the concepts in the semantic knowledge base in an n-dimensional space, and then use Gaussian distributions in this dimensional space to do mutation.
We have identified some good algorithms for dimensional embedding, but the coding needs to be done, and a bunch of fiddling will likely be required!
MOSES Evolution of Recurrent Neural Nets
In 2009 Joel Lehman worked on Extending MOSES to evolve Recurrent Neural Networks. He did a great job but due to a limitation in MOSES' capability in handling continuous knobs he could not fully experiment with his work. Continuous knobs are now fully supported but the performance of NN evolutionary learning has only shown a slight increase and does not currently out-perform other more classical approaches. We don't know why, could be a bug in the code or a design error or something more fundamental.
So the goal of this project would be to pursue Joel Lehman's work, to first try and understand the low performance and fix/improve it, and then if time permits implement and experiment with other classes of RNN.
PLN - Probabilistic Logical Reasoning
- Language: advanced C++
- Code: TBD
- List: TBD
- License: TBD
Probabilistic Logic Networks (PLN) are a novel conceptual, mathematical and computational approach to uncertain inference. In order to carry out effective reasoning in real-world circumstances, AI software must robustly handle uncertainty. However, previous approaches to uncertain inference do not have the breadth of scope required to provide an integrated treatment of the disparate forms of cognitively critical uncertainty as they manifest themselves within the various forms of pragmatic inference. Going beyond prior probabilistic approaches to uncertain inference, PLN is able to encompass within uncertain logic such ideas as induction, abduction, analogy, fuzziness and speculation, and reasoning about time and causality.
Reasoning about Biological Interactions
There are many databases denoting interactions between genes, proteins, chemical compounds and other biological entities. Some of these have recently be imported into OpenCog's AtomSpace format (after some scripting to reformat them) so they are ready to be reasoned on using PLN.
An example of the value of this kind of reasoning would be: using analogical inference to approximatively transfer knowledge from one organism to another, e.g. flies or mice to humans.
- Looks good for Semester of Code; Eddie Monroe could mentor
Planning via Temporal Logic & Consistency-Checking
A general and elegant way to do planning using PLN would be
- Use explicit temporal logic, so that the temporal dependencies between different actions are explicitly encoded in Atoms using relations like Before, After, During, etc.
- Improve the backward chainer so that, when it spawns a new Link in its chaining process, it checks whether this link is logically consistent with other promising Links occurring elsewhere in the backward chaining tree... (where if A and B are uncertain, logical inconsistency means that (A and B) has a much lower truth value strength than one would expect via an independence assumption.)
The search for logical consistency can be done heuristically, via starting at the top of the back-inference tree and working down. If quick inference says that a certain Link L is consistent with the link at a certain node N in the BIT, then consistency-checking with the children of N is probably not necessary.
This approach would subsume the heuristics generally used in planning algorithms, into a generally sensible method for inference control...
But, while conceptually straightforward, this would be significant work to implement.... So, for the near term, Jade is working on a simpler state-space planner extending the current, crude planner....
Spatial and temporal reasoning
Code exists for doing temporal reasoning within PLN, using a fuzzy/probabilistic version of Allen Interval Algebra. However, this code has not been used much. Building a good test suite for temporal reasoning is important.
Building an analogous system for spatial reasoning using Region Connection Calculus is also a pending task.
First, however, we need a workable/working SpaceServer.
Combining PLN and MOSES
Integrate MOSES-based supervised categorization into PLN, so that when PLN chaining hits a confusing point, it can launch MOSES to learn patterns in the members of the Atoms at the current end of the chain (which may then provide additional information useful in pruning). (1)
Cause PLN's backward and forward chaining inference to utilize history -- so that an inference step is more likely to be taken if similar steps have been taken in similar instances. (1)
Some notes on how to do this, written in early May 2012 in the context of Dingjie Wang potentially starting work on the topic...
Here's an example inference tree pattern that might be mined from a data store of inference trees:
Implication AND Evaluation navigating me Evaluation stuck me Evaluation near (me $X) SimilarityLink $X $Y Evaluation UsefulForInference SimilarityLink $X $Y
Let's call this pattern "P1", just for simplicity...
What this pattern P1 says, informally, is "If I'm stuck, and I'm near $X, and $X is similar to $Y ... then inferences involving (SimilarityLink $X $Y) may be useful.."
If we had ContextLinks, a nicer version of the above would be
ContextLink navigation Implication AND Evaluation stuck me Evaluation near (me $X) SimilarityLink $X $Y Evaluation UsefulForInference SimilarityLink $X $Y
but for the moment, the ContextLink-free version P1 given previously will be OK...
For an initial study in the context of navigating, the relevant form of the pattern would just be
Implication AND Evaluation stuck me Evaluation near (me $X) SimilarityLink $X $Y Evaluation UsefulForInference SimilarityLink $X $Y
The conceptual definition of the predicate UsefulForInference (which does not yet exist in the code or in the PLN book), is:
Evaluation UsefulForInference W
means that doing inference using W is more likely than not to increase the speed or quality of one's inference conclusion.
The practical definition of UsefulForInference is:
Evaluation UsefulForInference W
means that W occurs more often in "good" sub-dags than "bad" sub-dags, in the historical inference dags that have been stored for analysis.
The goal of history-based inference control is to learn patterns like P1 from a historical database of pairs of the firm
( inference dag, context of inference dag)
The "context" of an inference dag consists of Atoms referenced in that dag, and other Atoms linked to them.
In order to make the learning tractable, we will initially want to specify the predicates involved. Spatial predicates are a good start, if the initial context of study is PLN navigation. The above example used "stuck" -- we may want to invent some other predicates representing practical contexts that a system may get into while navigating.
The use of context here complexifies things, yet, also makes them more realistic and powerful. In humans, inference control really does depend on context, and on patterns learned via analyzing the relationships between inferences and their contexts.
It seems to me that the pattern learning could reasonably be done using MOSES, acting on program trees of restricted form. For starters maybe we could look at patterns of the restricted form
Implication AND Link_1 targ_11 targ_12 ... Link_k targ_k1 targ_k2 Evaluation UsefulForInference AND Link_(k+1) targ_(k+1)1 targ_(k+1)2 ... Link_(k+r) targ_(k+r)1 targ_(k+r)2
where the targets could be concrete Atoms or variables... and initially k and r would be kept pretty small...
Steps toward implementing this would seem to be:
1) Build the mechanism for storing a bunch of (inference dag, context) pairs [probably in the Atomspace?]
2) Run a bunch of PLN navigation experiments in different situations, and save the inference dags & contexts
3) Do some basic statistical analysis on this datastore of inference dags & contexts, so as to understand it
4) Customize MOSES to use the inputs, and the restricted functional form, needed to learn the above sorts of patterns (like P1)
5) Apply MOSES and see what patterns it learns
Natural Language Processing (NLP)
NLP is used in OpenCog in two different ways. First, an existing infrastructure, consisting of Link Grammar and RelEx, is used to provide raw data for higher-level cognitive processing. One of the current focus projects is using Relex2Logic output with PLN to perform reasoning.
The other primary focus area is unsupervised language learning. A proposal for this is described in the paper: Learning Language from a Large (Unannotated) Corpus on ArXiv. This project is described in greater detail on the language learning wiki page. There are several GSOC-2015 Candidate Projects described on that page.
Although NLP and language comprehension are core foci of OpenCog, many of the language pipeline tools are not currently within the core OpenCog framework and instead do a lot of pre-processing before feeding their output into the OpenCog AtomSpace for reasoning and learning. Eventually we plan to move this NLP processing into the core so that language rules can be updated based on experience. The current, defacto NLP pipeline is described here.
Below are some additional, related NLP projects that are worthy of pursuit.
- Code: https://code.launchpad.net/relex/
- List: http://groups.google.com/group/link-grammar (mentors: Linas)
- License: BSD, Apache 2.0
- Languages: Java, Scheme, Perl
RelEx is an English-language semantic relationship extractor, built on the Carnegie-Mellon link parser. It can identify subject, object, indirect object and many other relationships between words in a sentence. It can also provide part-of-speech tagging, noun-number tagging, verb tense tagging, gender tagging, and so on. Relex includes a basic implementation of the Hobbs anaphora (pronoun) resolution algorithm. Optionally, it can use GATE for entity detection. RelEx also provides semantic relationship framing, similar to that of FrameNet.
The output from RelEx is a hypergraph of Nodes and Links, which are input into OpenCog, and may then be processed in various ways. In addition, the LexAt (lexical attaction) package provides many scripts and database tools to collect statistical information on parsed sentences.
PLN Inference on extracted semantic relationships
Currently RelEx takes in a sentence, and outputs a set of logical relationships expressing the semantics of the sentence. It is possible to take the logical relationships extracted from multiple sentences, and combine them using a logical reasoning engine, to see what conclusions can be derived. Thus, for example, the English-language input Aristotle is a man. Men are mortal should allow the deduction of Aristotle is mortal.
Some prototype experiments along these lines were performed in 2006, using sentences contained in PubMed abstracts (see paper). But no systematic software approach was ever implemented.
These experiments could be done in a variety of rule engines, including Probabilistic Logic Networks, as well as more standard crisp rule engines such as SWIRLS.
This project is appropriate for a student who is interested in both computational linguistics and logical inference, and has some knowledge of predicate logic.
A simple but detailed example of language-based inference using RelEx and PLN is given here: File:RelEx PLN Example Inference.pdf.
See also NLP-PLN-NLGen pipeline.
- Looks good for Semester of Code; Amen or Cosmo could mentor
Improved Reference Resolution
Reference resolution (anaphora resolution) is the problem of determining what words like "it", "him" and "her" refer to. There are several ways in which this problem might be attacked.
1. Statistical methods: The current RelEx implementation uses the Hobbs algorithm for this, which is an intelligent but crude mechanism that achieves about 60% accuracy. By combining Hobbs algorithm with statistical measures (one of which is sometimes called the Hobbs score), it should be possible (according to literature results) to get up to 85% accuracy or so. Appropriate for anyone who is interested in getting some hands-on experience with statistical corpus linguistics.
2. Reasoning: Sentences commonly make assertions, which can be checked for accuracy. So, for example: Jules went to Paris. He saw many things there. e.g.
- Does he refer to Jules, or to Paris?
- Can Jules see things?
- Can Paris see things?
- Does there refer to Paris, or to Jules?
- Can one see many things in Jules?
- Can one see many things in Paris?
This approach is more powerful than a statistical approach, but requires establishing a large knowledge-base, and performing PLN reasoning there-on.
For example: The current RelEx code "knows" that "he" cannot refer to "Paris", because it knows that "he" is a masculine pronoun, and that Paris is a location. It does know that "Jules" is masculine, and can thus propose this as the resolution.
By contrast, Relex does not know that who or what can "see" things. Considerable work would be needed to construct a database of "see-able" things. Such a database ca be constructed by collecting certain simple grammatical constructions ... e.g. parsing a large corpus, and looking for all occurences of "X saw Y", and memorizing the X's and Y's, to be later used for reasoning.
3. Discourse theory: In discourse theory, a distinction is made between the theme and the rheme: what is being talked about, and what is being said. In the above example, "Jules" is the theme, and his trip is the rheme: the conversation is centered on Jules. Thus, it is appropriate to assume that "he" must surely refer to "Jules", and not "Paris", simply because "Jules" is the topic of the conversation. Word order and other syntactic phenomena can help clarify what the topic of the conversation is. Thus, discourse theory provides an indirect but strong way of assisting in reference resolution, and, in particular, disambiguating between otherwise equals choices.
The task here is then to discover the attentional focus of of the conversation, and use that to guide reference resolution. See, for example: Grosz, Joshi, Weinstein (1994) "Centering: A Framework for Modelling the Coherence of Discourse"
OpenCog, not RelEx: In any of these cases, the work would be done within the OpenCog AtomSpace, using scheme or python MindAgents. There would be little or no effort placed into extending the current Java code in RelEx. The reason for this is that OpenCog has (will have) access to a large database of knowledge, and the reasoning tools to work with it; RelEx does not.
Link Grammar Parser
A number of improvements to the Link Grammar parser would be useful.
Explore Landmark Transitivity in Link Grammar
The Link Grammar uses a constraint of "planar graphs" (i.e. no link crossings) to rule out unreasonable parses. It seems that it might be possible to replace this rule by the notion of "Landmark Transitivity" taken from Hudson's Word Grammar. The basic idea is this:
Each Link Grammar link is given a parent-child relationship: one end of the link is the parent, the other the child. Thus, for example, given a noun to noun-modifier link, the noun is the parent of the link. Then, parents are landmarks for children. Transitivity (in the mathematical sense of "transitive relation") is applied to these parent-child relationships. Specifically, the no-links-cross rule is replaced by two landmark transitivity rules:
- If B is a landmark for C, then A is also a type-L landmark for C
- If A is a landmark for C, then B is also a landmark for C
where type-L means either a right-going or left-going link.
Ben hypothesizes that adding Landmark transitivity might be able to eliminate most or all of Link Grammar's post-processing rules. See Ben's PROWL grammar for details. See also Natural Language Processing, below.
- Looks good for Semester of Code; Ben and Linas could mentor
Word Grammar Parsing
Implement a Word Grammar parser that utilizes the link grammar dictionary. This gives a strong impression of being an approach to NL comprehension that is suitable for general intelligence.
A next step would be utilization of PLN for semantic biasing of parsing.
- Looks good for Semester of Code; Ben or Linas could mentor
Implement support for "speech registers", such as newspaper headlines, medical notes, children's books, conversational speech. The style of linguistic expression varies depending on context: for example, in newspaper headlines, articles are frequently omitted. Children's books often sometimes use archaic grammatical constructions not commonly used in adult writing. During conversational speech, the rules of grammar are significantly relaxed; run-on sentences can predominate. Thus, a "one-size-fits-all" set of grammar rules has trouble with these situations, because the kind of grammar rules appropriate in one context might not be appropriate in another. Loosening parse rules sufficiently to work well for newspaper headlines or chat will result in rules that are too loose for book or newspaper text, generating an over-abundance of incorrect parses. Thus, a better approach is to apply a different set of grammatical rules, based on the context.
One way that this can be done is by adjusting the "cost" of using certain links in different situations. Thus, for example, omitting determiners such as "the" or "a" could be very high, when normal text is being parsed, but the cost would be low when newspaper headlines are parsed.
This project requires the development of altered grammar rules that can begin to tackle some of these different "speech registers". t also requires an extension of the scoring system, so that different sets of costs are used in different situations.
A more interesting variant would be perhaps the recognition of the context, given input text. Thus, for example, a block of Middle-English text would parse very poorly using the modern-English parse costs; the same text, assigned costs appropriate for Middle English, would result in a low cost: in effect, the different scoring systems act as a classifier for the text being parsed.
This is a difficult and time-consuming task, mostly because creating link-grammar dictionary entries by hand is very difficult, time-consuming and error-prone. This task might be better tackled after the unsupervised language learning task is providing results.
Quoted text, long-range structure, hierarchical structure
Currently, Link Grammar is unable to properly handle quoted text and dialogue. A mechanism is needed to disentangle the quoting from the quoted text, so that each can be parsed appropriately. This might be done with some amount of pre-processing (e.g. in RelEx), or possibly within Link Grammar itself.
Its somewhat unclear how to handle this within link-grammar. This is somewhat related to the problem of morphology (parsing words as if they were "mini-sentences",) idioms (phrases that are treated as if they were singe words), set-phrase structures (if ... then... not only... but also ...) which have a long-range structure similar to quoted text (he said ...)
- Looks good for Semester of Code; Linas or Ruiting or Rodas could mentor
Statistically similar expressions for question answering
Proper language-based inference requires that similar expressions be recognized as such. For example, X is the author of Y and X wrote Y are more or less synonymous phrases. Lin and Pantel describe how to automatically discover such similar phrases (See Dekang Lin and Patrick Pantel. 2001. "Discovery of Inference Rules for Question Answering." Natural Language Engineering 7(4):343-360). A similar, but more sophisticated approach is given by Hoifung Poon & Pedro Domingos, 2009, "Unsupervised Semantic Parsing".
Both of these systems use statistical techniques to find commonly occurring sub-expressions. Both start with text that has been processed with a dependency parser. While the first searches for chains of grammatical dependency relations that can be taken as synonymous, the second searches for arbitrary patterns that can be clustered into the same "synonym" or "concept". The two papers also differ in the statistical measures used to assign synonymous patterns into a common cluster, but it is not clear which statistical technique is better, or if that even matters very much. Lin uses measures based on mutual information, while Domingos applies the theory of Markov Logic Networks.
The goal of this task would be to implement a frequent-pattern-mining system within the OpenCog infrastructure. That is, given a large number of OpenCog hypergraphs, common subgraphs are to be identified, and then, using some statistical measure (such as mutual information, or possibly MLN, or another distance measure), such hypergraphs would be clustered together. Each such discovered cluster can be thought of as being a "concept"; the elements of the cluster are "synonymous expressions for the concept".
Ideally, by searching not just for synonyms, but for common patterns in general, the system will hopefully automatically find "speech formula", or perhaps "lexical functions" as described in Melcuk's Meaning-Text Theory. Presumably, such clustering techniques should also be able to group similar things, such as a cluster containing names of musical instruments, etc.
The discovery of such clusters should be useful both for question answering, as well as for natural-language generation (as clusters would contain different ways of expressing the same "idea".)
Natural Language Generation
Other language generation approaches include Markus Guhe's (2003) "Incremental conceptualisation for language production"
DeSTIN - Vision Processing
DeSTIN stands for Deep SpatioTemporal Inference Network and is a scalable deep learning architecture that relies on a combination of unsupervised learning and Bayesian inference. A paper by the inventors of this method is available. It is well suited to become the vision module of OpenCog and has been adopted by the project to tidy up the code for release as OSS. DeSTIN is implemented in CUDA.
Several people are already working on DeSTIN, but having someone involved to assist with the packaging, implement a testing framework using CTest, and/or create Python language bindings will help develop it into a worthwhile project in its own right.
Using DeSTIN and SVM or Neural Nets to Effectively Classify Images
DeSTIN can be used as a feature generator for supervised image classification using tools like SVM, Neural Nets or MOSES.
Some work has been done on this, using e.g. the CiFAR image classification corpus. But the accuracy achieved so far is not that high. It would be good to have someone focus on tweaking and tuning DeSTIN, and doing appropriate image preprocessing, to get the classification accuracy higher. This is useful because it is a way of teaching us how to adjust DeSTIN to make it work better in general.
Connecting DeSTIN and OpenCog
Here is one AI-intensive idea for a project involving DeSTIN (by Ben Goertzel)
I want to feed input from a webcam into DeSTIN, and then (in abstracted form) into OpenCog. Toy problems like MNIST are not really interesting to me, because my research goal in this context is robot vision. So scalability is important.
There will be a "smart interface" between DeSTIN and OpenCog, that will basically
-- record a database of DeSTIN state vectors
-- identify patterns in that database using machine learning algorithms
-- output these patterns into OpenCog's probabilistic/symbolic knowledge base
The goal would to get really simple stuff to work, like
-- feed DeSTIN pictures of cars, motorcycles, bicycles, unicycles and trucks (videos would be great, but let's start with pictures)
-- see if the whole system results in OpenCog getting ConceptNodes for car, motorcycle, bicycle, unicycle and truck, with appropriate logical links between them like
Inheritance car motor_vehicle <.9>
Inheritance truck motor_vehicle <.9>
Inheritance bicycle motor_vehicle <.1>
Similarity bicycle unicycle <.7>
Similarity bicycle motorcycle <.7>
Similarity motorcycle car <.6>
Similarity bicycle car <.3>
Note that while I'm using English terms here, in this experiment the OpenCog ConceptNodes would have no English names and English words would not be involved at all -- I'm just using the English terms as shorthands.
This would be a first example of the bridging of DeSTIN's subsymbolic knowledge and OpenCog's symbolic knowledge, which would be pretty cool and a good start toward further work on intelligent vision ;)
Implement conceptual blending as a heuristic for combining concepts, with a fitness function for a newly formed concept incorporating the quality of inferential conclusions derived from the concepts, and the quality of the MOSES classification rules learned using the concept.
Attention Allocation and Psychopathology
Some intriguing parallels may be drawn between the dynamics of attention allocation and various human mental disorders. The analogies to human psychological issues shouldn't be taken very seriously; it was more a way of thinking about the various dynamical phenomena that could be observed by appropriately fiddling with the parameters of attention allocation and its relationship to OpenPsi...
- Obsession. Focusing too long on just one thing. This could be achieved via not having the STI of Atoms in the AttentionalFocus delay rapidly enough. This should result in more and more HebbianLinks being built btw the Atoms in the AF, which causes the obsession to be reinforced yet further.
- Attention deficit disorder. Inability to pay attention to anything. This could be achieved via having the STI of Atoms in the AF decay too fast. Not many strong associations are learned because nothing stays in the AF that long. However, if the mind is in a situation where a lot of stimuli are constantly streaming in, from some coherent context, then the mind will pay attention to that same context for a while and build associations regarding it; this is similar to how folks with ADD can often be good at focusing on video games or other contexts involving rich fast streams of data and total interaction.
- Hypermania.* This would happen if a mind became very happy with each new thing that came into its AF, but then rapidly became unhappy with it afterwards. So, novelty must be a very highly weighted goal; perhaps the goal would be formulated as fulfilled if things with high novelty have high STI.
- Depression.* One kind of depression involves dwelling on unhappy thoughts. One way this could happen is if: 1) certain unhappy thoughts (thoughts linked to low satisfaction) are in the AF for a while, 2) everything that passes through the AF gets HebbianLinks built to these unhappy thoughts, 3) PLN inference builds yet more links between Atoms and the unhappy thoughts; 4) all these HebbianLinks then pull the unhappy thoughts back into the AF, whenever other things linked to the unhappy thoughts are in the AF...
Implement "map formation," a process that uses frequent subgraph mining methods to find frequently co-occurring subnetworks of the AtomTable, and then embodies these as new nodes. This requires some extension and adaptation of current algorithms for frequent subgraph mining. It also requires functional attention allocation.
Implement context formation, wherein an Atom C is defined as a context if it is important, and restricting other Atoms A to the context of C leads to conclusions that are significantly different from A's default truth value.
Embodiment - Controlling Avatars and Robots
These projects are related to supporting the embodiment of OpenCog in real and virtual environments.
Build a ROS proxy for OpenCog
Extend OpenCog's Embodiment module to communicate with physical robots, using ROS: Robot Operating System. ROS nodes perform two functions: they specify the connectivity between one-another, and they also stream real-time data between one-another. In OpenCog, this is split into two: Atoms are used to specify connectivity information, while Values store rapidly-changing real-time data. Thus, a ROS proxy might consist of two parts: a new ROSNode atom type, and a corresponding C++ class for it, that indicates where the ROS node is, what its called, and how it can be found, and a ROSValue, again, with a corresponding C++ class, that provides the data access methods to work with the data stream -- e.g. to grab the latest value, and return it to PLN, to be used during reasoning.
- XIA-MAN: An Extensible, Integrative Architecture for Intelligent Humanoid Robotics PDF, Ben Goertzel, Hugo de Garis
Build a Minecraft interface for OpenCog
Build an integration between OpenCog and a stable, well-documented open-source Minecraft client API (such as Mineflayer or Spock) in order to allow use of any server that implements the Minecraft standard (including the official Minecraft, as well as any open-source implementations), so that this existing standard environment could be utilized to build simulation environments for Artificial General Intelligence experiments without having to spend extensive time engineering new software for simulation.
The project would involve writing interface code between a Minecraft client API and OpenCog and would also involve replacing some of the existing OpenCog embodiment code that implements messaging with external systems with a simplified and modernized implementation. The project would be an opportunity to utilize the software design concepts of message queues, loose coupling and object serialization for external consumers. Ideally, a candidate should have experience in both Python and C++.
For more ideas about what this would look like, refer to this thesis paper on Minecraft as a simulation environment for Artificial General Intelligence: http://tmi.pst.ifi.lmu.de/thesis_results/0000/0073/thesis_final.pdf Note that this thesis was written in the context of a different project, and many of the implementation decisions contained within will be different from this project proposal; however, the document does include many nice explanations and examples that give a good general overview of the problem space.
This project has started and is currently looking for volunteers. If interested, please visit the Minecraft Bot Development Roadmap.
Inference regarding skeletons and surfaces of objects in a 3D game world
How can an OpenCog controlled game character, in a 3D game world implemented in Unity3D, reason about solid objects (represented either as 3D models or as Minecraft-like assemblages of bricks)?
One way is to use existing OSS code to map solid objects into skeletons, and to identify the parts of the surface of the objects. This information about skeletons and surfaces can then be fed into the PLN (Probabilistic Logic Networks) engine for reasoning.
This is good for someone with experience or very strong interest related to 3D modeling, graphics or game programming, as well as OpenCog and logical inference.
- Looks good for Semester of Code; Lake or Shujing could mentor
New Embodiment Module
This section lists projects related to using OpenCog components for specific applications.
Sokoban would be a good toy domain for experimenting with various OpenCog methods. So hooking up OpenCog to a simple Sokoban server would seem worthwhile. MOSES and PLN could be used for Sokoban on their own, but it's more interesting of course to take an integrative approach. (1)