Ideas

FOR GSoC 2015 Students

Please see the page GSoC 2015 Project Ideas

See also StudentApplicationTemplate, Development and Publications for background reading. Note that we welcome any interested volunteers to work on these projects and not only GSoC applicants!

SOME PARTLY OBSOLETE CONTENT BELOW

The rest of this page outlines many project ideas for the further development of OpenCog and related projects. It's kind of a mess -- e.g. some of the projects are already done. The material is left here for the moment because there are useful bits here and there; it will be cleaned up "soon".


Most of the projects here are given difficulty labels:

  1. Relatively straightforward AI R&D, though not easy
  2. Moderately difficult
  3. Very difficult

Note that these labels refer only to the AI aspects of the task; there may also be difficult software engineering or systems integration aspects, which haven't been labeled. Best to ask us directly if you are concerned about the potential challenge a project presents.

OpenCog Framework

OpenCog consists of a core framework with several modules. This section outlines projects relating to improving the framework that supports the rest of the modules and MindAgents built on it. Unless otherwise specified, these projects don't require advanced AI related background knowledge.

Packaging for Debian, Cygwin, OS X

Packaging OpenCog components like CogServer and Embodiment for popular operating systems could help to expand our user and developer communities.

Currently OpenCog compiles on Linux and is not packaged for any OS. A porting and packaging project would involve working with CMake files, gcc compiler, linker, and shared object loader specifics, and packaging tools for Debian/Ubuntu, Cygwin or MinGW, and OS X (.pkg installer packages, ports or homebrew).

See Packaging for project requirements and steps.

Porting OpenCog to Windows

Cygwin is likely the easiest way to port the OpenCog codebase to Windows, but it has some downsides, the main one being a loss in speed. It's not clear how much OpenCog would suffer from that, since most of its resources are spent on computation rather than system calls; but if the Cygwin version turns out to be noticeably slower than it should be, a native Windows port might be worth considering. This would only work in the mid/long term if someone is willing to maintain the compatibility over time; given that OpenCog is in active development, the port would quickly become out of date otherwise.

Performance Measurement Suite

Many contemplated changes to the OpenCog infrastructure have a real impact on execution time and the amount of memory used. A performance suite would collect into one place a large variety of test cases, instrument them properly so as to measure speed and memory usage, and then report the results in a completely automated fashion. A simplistic benchmark utility for the AtomSpace already exists in opencog/benchmark.

Distributed and Persistent AtomSpace Performance

Provide distributed AtomSpace persistence and synchronization workloads and benchmarks.

Tasks include:

  • Obtaining a "typical" workload so that performance can be understood.
  • Developing scripts to run that workload across multiple systems, booted with Vagrant containers.
  • Expanding the existing benchmark to be able to run that workload, and to measure performance.
  • Performing actual benchmark measurements on the existing back-end.
  • Possibly expanding the existing backend to support additional indexes that might be required by the workload.
  • Possibly creating a new backend for a different database system.

Based on past experience, the last item may be a bit too challenging to tackle over the summer; it requires strong knowledge of distributed processing, networking, locking, transactions and kernel-style operating-system principles. This is an ideal project for students with strong coding skills and theoretical comp-sci knowledge who ideally also have experience with one or several NoSQL databases. Interested students should be familiar with the CAP theorem and the ACID acronym.

RocksDB looks appealing, though more study will be needed to validate this. Also refer to http://wiki.opencog.org/w/Neo4j_Backing_Store

Cognitive API

Design and implement an initial framework for a cognitive API which can scale up to satisfy the high-level overview described at http://wiki.opencog.org/w/A_Cognitive_API

Improved Layout for AtomSpace Visualizer

Scott Jones has written a nice AtomSpace visualizer, web-based and interacting with the AtomSpace via a REST API. However, the graph layouts provided with the Atomspace visualizer are often suboptimal given the topology of the AtomSpace subgraphs. For instance, when the subgraph being visualized is tree-like, it would be good to use a tree-like layout manager.

  • Looks good for Semester of Code; Ben or Cosmo could mentor

OpenCog Visualization Workbench

The AtomSpace visualiser only addresses one aspect of a UI for understanding OpenCog dynamics. A generic GUI workbench, which can handle the parsing of log files and dynamic configuration of a running OpenCog instance, would be useful.

Other features that such a workbench might have:

  • graphing dynamics through time:
    • total atoms of various types
    • distribution of truthvalues
    • distribution of atom importance (STI/LTI)
    • average coordination number (number of links atoms are connected with)
    • network diameter
    • # of disjoint clusters
  • visualising and controlling expansion of the backward inference tree of PLN.
  • active MindAgents and their resource usage (memory, CPU)

This could also be integrated with, or made to incorporate the functions of, the OpenPsi GUI that Zhenhua Cai has developed. That UI uses:

  • zeromq messages for monitoring state changes from OpenCog.
  • implementation in Python.
  • pyqt for widgets.


  • Looks good for Semester of Code; Ben or Cosmo could mentor

Haskell Bindings for OpenCog

Python bindings are implemented and in use in OpenCog, so that cognitive processes can be written in Python, but interact freely with the C++ OpenCog core.

We would like to have comparable Haskell bindings. This should not be extremely hard for someone with strong knowledge of both Haskell and C++.
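
For reference, the snippet below shows roughly what the existing Python bindings allow; the Haskell bindings would expose the same kind of operations against the C++ core. Module and function names follow the Python bindings but may differ between OpenCog versions.

  # Sketch of AtomSpace access via the existing Python bindings.
  from opencog.atomspace import AtomSpace, types

  atomspace = AtomSpace()
  cat = atomspace.add_node(types.ConceptNode, "cat")
  animal = atomspace.add_node(types.ConceptNode, "animal")
  # Link the two concepts; equivalent operations would be exposed in Haskell.
  link = atomspace.add_link(types.InheritanceLink, [cat, animal])
  print(link)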

  • Looks good for Semester of Code; Ben could mentor

Type checking

Currently in the Atomspace, nothing stops you from creating

InheritanceLink

  NumberNode "9"
  GroundedSchemaNode "print"

even though this is nonsense....

In the AtomSpace, type checking is needed so as to prevent users from adding inappropriate atoms to the atomspace. It is also a vital part of being able to provide user-defined indexes. (See below).
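
As a rough illustration of what such a check might look like (purely hypothetical -- neither the table nor the function below exists in the codebase), a per-link-type signature table could be consulted before an atom is added:

  # Hypothetical sketch of an outgoing-set signature check.
  ALLOWED_OUTGOING = {
      "InheritanceLink": ("ConceptNode", "ConceptNode"),
  }

  def check_outgoing(link_type, outgoing_types):
      expected = ALLOWED_OUTGOING.get(link_type)
      if expected is None:
          return True                       # no restriction declared
      return tuple(outgoing_types) == expected

  # The nonsense example above would be rejected:
  assert not check_outgoing("InheritanceLink",
                            ["NumberNode", "GroundedSchemaNode"])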

Difficulty: mostly straight-ahead coding, little/no AI research needed. The hard part will be understanding the existing code base, to understand where/how to slot this in.

  • Looks good for Semester of Code; likely Linas could mentor

User defined indexes

The AtomSpace is essentially a certain kind of graph database. Like other databases, it should allow users to define the kinds of structures that they need to be able to locate quickly. This could be done with a user-defined index.

This requires work to be done in conjunction with the type checking proposal, above.

Difficulty: mostly straight-ahead coding, little/no AI research needed. The hard part will be understanding the existing code base, to understand where/how to slot this in.

  • Looks good for Semester of Code; likely Linas could mentor

Support higher-order types in Atoms

Suppose you have a SchemaNode S with behavior like (using a Haskell-like notation, where A-->B means a function mapping A to B)

[[Number --> Concept] --> [Concept --> Concept] ] --> Number

Within the AtomSpace as it stands, all one can say about S is that it is a SchemaNode, which is not a very precise statement. S actually accepts arguments with particular type restrictions and produces outputs of a particular type...

So, ideally, a SchemaNode or PredicateNode would come with an optional Type object indicating what higher-order type they embody.

The Type object could be a dag indicating a higher-order type. This could utilise existing code in comboreduct or be based off some old C code Luke Kaiser wrote for Novamente LLC. Both used DAGs for type representation and inheritance on higher-order types. Or in principle one could introduce a TypeMapLink, and then store higher-order type objects as subgraphs in the Atomspace.
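
To make the idea concrete, here is one illustrative (not proposed) way a higher-order type could be encoded as a small dag of arrow expressions and used to compute the result type of an application:

  # Illustrative encoding: a higher-order type as nested ("->", in, out) tuples.
  NUMBER, CONCEPT = "Number", "Concept"

  def arrow(a, b):
      return ("->", a, b)

  # [[Number --> Concept] --> [Concept --> Concept]] --> Number
  s_type = arrow(arrow(arrow(NUMBER, CONCEPT), arrow(CONCEPT, CONCEPT)),
                 NUMBER)

  def apply_type(fn_type, arg_type):
      # Result type of applying fn_type to an argument of type arg_type.
      kind, expected_in, out = fn_type
      if kind != "->" or expected_in != arg_type:
          raise TypeError("argument type does not match")
      return out

  # Applying S to a schema of the expected type yields a Number, which is
  # more informative than just saying "SchemaNode".
  assert apply_type(s_type, arrow(arrow(NUMBER, CONCEPT),
                                  arrow(CONCEPT, CONCEPT))) == NUMBER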

Right now type restrictions are left implicit in the Atomspace, which is generally OK since we're mostly dealing with simple cases at the moment. e.g. an AndLink between two or more ConceptNodes outputs a ConceptNode, so AndLink has type

List(Concept) --> Concept

However, to extend PLN, concept creation and other functions to deal with higher-order types effectively, explicit typing will probably be needed. This will be useful for abstract reasoning, not just mathematical reasoning but e.g. for analogical reasoning between significantly different domains, or experiential learning of symbol groundings or linguistic rules.

The question of when these types should be stored in the Atom versus calculated dynamically is not clear. In some cases, the type could be expensive to calculate, but it may be that such calculations are infrequent with most higher order type calculations being simple. A case where the type would need to be stored in the Atom would be when the SchemaNode in question carried out its calculation using a BuiltInSchema (coded, e.g., in C++); in this case there would be no simple way of inferring the type if it weren't specified. If a ComboSchema were involved instead, the type could be inferred, though this could be expensive.

MOSES - Genetic Program Learning

Meta-Optimizing Semantic Evolutionary Search (MOSES) is a new approach to program evolution, based on representation-building and probabilistic modeling. MOSES has been successfully applied to solve hard problems in domains such as computational biology, sentiment evaluation, and agent control. Results tend to be more accurate, and require fewer objective function evaluations, than other program evolution systems. Best of all, the result of running MOSES is not a large nested structure or numerical vector, but a compact and comprehensible program written in a simple Lisp-like mini-language. More at MOSES.

See also the MOSES Ideas page. The contents of this entire section should be moved to there ... and should be cleaned up, expanded.

Add time-series support

Right now, MOSES has no support at all for time series.

This is important for inferring differential equations (from finite differences), which is central to learning motion and movement, which is in turn central to robot movement and sports (the ball is coming at you; you see it in your video feed. Where does your robot gripper need to be to catch it? When should the gripper close? The ball is arcing due to gravity, and maybe it's a nerf ball slowing due to air drag, so you need three neighboring time-series points to discover the 2nd-order difference equation).

So for example, from a video feed, you have ball position x,y at times t1, t2, t3 ...

Right now, MOSES takes as input a collection of rows; these rows could be in any order; they are unrelated to each other. Sure, you could give it rows:

t1, x1, y1
t2, x2, y2
t3, x3, y3

and maybe it could infer something, but this is a much harder problem, because it is an integral problem, rather than a differential problem. You want to be able to look at differences between neighboring rows, and right now MOSES cannot do that. Difference equations are much easier to spot than integral equations.

You could hack around this in various ugly ways, e.g. by putting two or three rows into one, but these are ... hacks. E.g.

t1 x1 y1 x2 y2
t2 x2 y2 x3 y3
t3 x3 y3 x4 y4

yuck.
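
To see why access to neighboring rows matters, here is a toy version of the falling-ball example (synthetic data, not MOSES code): with uniform time steps, the second difference of y is essentially constant, which is exactly the kind of simple difference equation a learner could spot if it were allowed to compare adjacent rows.

  # Synthetic data: y(t) = 10 + 3t + 0.5*g*t^2 sampled at uniform steps dt.
  dt = 0.1
  g = -9.8
  ys = [10.0 + 3.0 * (k * dt) + 0.5 * g * (k * dt) ** 2 for k in range(6)]

  # Second difference of neighboring samples: y[k+1] - 2*y[k] + y[k-1]
  second_diffs = [ys[k + 1] - 2 * ys[k] + ys[k - 1] for k in range(1, 5)]
  print(second_diffs)        # each entry is ~ g * dt**2 = -0.098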

Provide extensible Combo and Reduct API

The Combo programming language is a small, lisp-like language used inside of MOSES for representing the programs that MOSES learns. Combo is used instead of lisp, scheme, python or any other popular language because the most important step of evolving program trees is being able to reduce them, to find simpler, smaller but otherwise equivalent program trees. None of the existing, popular programming languages have a publicly available, usable API to perform program reduction (even though this is, de facto, a part of what compilers and byte-code machines do). Thus, in order to be able to perform reduction, Combo was invented.

Combo expressions are trees consisting of operators that combine terms; terms may be constants, variables or trees. Combo currently supports four kinds of algebras: boolean algebra, where all terms are true-or-false valued; the "contin" algebra, where all terms are real-valued; an action-perception algebra, which is useful for evolving small programs that control robot actions in response to perception stimuli; and a neural-net algebra used to represent simple neural nets.

A core issue within the current combo/reduct infrastructure is that there is no easy way to add new kinds of algebras into the system, without hard-coding them into assorted overly-fat C++ structures and header files. This makes extending combo difficult, and the resulting system fragile and over-weight. What is needed is a way of creating pluggable extensions to combo/reduct. This requires making the type system pluggable, the vertex object pluggable, and the assorted reduction subsystems pluggable.

A successful refactoring would move the neural-net and perception-action parts of the code into their own subsystems/extensions. A good place to start might be to finish abstracting away the "ant" example -- this is a demo where a virtual ant learns navigation patterns that are effective for foraging (finding virtual food).

Integrate MOSES with the current OpenCog-controlled game agent

The MOSES program learning algorithm was used for imitation and reinforcement learning by virtual pets in the Multiverse virtual world, in a previous version of OpenCog. Now we are using OpenCog to control game characters in a Unity3D based game world inspired by Minecraft, and overall this is much more sophisticated than the previous virtual pet system. It uses PLN for reasoning, OpenPsi for goals and motivations, frequent pattern mining, and lots of other cool stuff. But it doesn't yet use MOSES.

Simply porting the virtual-pet MOSES approach to the new game-agent control system isn't a good idea, because lots of changes to the architecture have been made since then. But integrating MOSES within the new architecture should not be extremely problematic.

This project is good for someone who wants to become very familiar with the use of OpenCog to control game agents in Unity3D.

Create an analogue of OpenBiomind using MOSES

OpenBiomind is a Java toolkit containing code for applying genetic programming to analyze gene expression microarray data and SNP data. This approach has been successfully used to learn diagnostic rules for cancer, Alzheimer's, Parkinson's and other diseases, as reflected in several publications.

MOSES is a more powerful tool for microarray and SNP data analysis than the GP system used in OpenBiomind. Also, the Java code for OpenBiomind is not as flexible and extensible as it should be.

It would be desirable to implement an analogue of OpenBiomind utilizing MOSES. A lot of headway has already been made in this direction using the R language, but more parts still need to be implemented.

  • Looks good for Semester of Code; Mike Duncan could mentor

Use kernel methods to pre-filter continuous data

Continuous-valued data is often very noisy, and thus difficult to fit (i.e. difficult to converge to a well-fitting function). One can remove (filter out) a lot of the noise by binning it into discrete bins, replacing a continuous-valued feature by a discrete-valued one (or by N binary-valued features for some small N).

The problem with binning is that if there are multiple continuous-valued features, they are often strongly correlated with each other. Binning them will smother some of this correlation, which is probably not a good thing. Thus, it would be better if the continuous-valued features were diagonalized into orthogonal components before being binned. The SVM, PCA and perceptron algorithms are all good ways to perform this diagonalization; they all implement a common, basic idea of diagonalization, referred to as "kernel methods".

So: the idea is to pass continuous-valued features through some diagonalization stage before feeding them to MOSES. Note: this requires finding a good dataset which will demonstrate the utility of this approach.
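
A minimal sketch of the pre-filtering stage, using PCA from scikit-learn as a stand-in for whichever kernel method is eventually chosen (the input data here is a random placeholder):

  import numpy as np
  from sklearn.decomposition import PCA

  X = np.random.rand(100, 5)                        # raw continuous features
  components = PCA(n_components=3).fit_transform(X)

  # Bin each decorrelated component into 4 discrete levels at its quartiles.
  binned = np.column_stack([
      np.digitize(components[:, j],
                  np.quantile(components[:, j], [0.25, 0.5, 0.75]))
      for j in range(components.shape[1])])
  # 'binned' is an integer table suitable as discrete MOSES input features.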

Enable MOSES to learn recursive functions using fold

Within Reduct/MOSES, make use of the algebra of fold, as Moshe hints at the bottom of page 5 of "A Representation of Programs for Learning and Reasoning" (Looks, Goertzel).

Combo currently supports the fold primitive and lists. However, there is currently no clear/practical way of deploying these within the context of MOSES.

Thus, what needs to be done is

  1. design a way of presenting recursive data to MOSES,
  2. design and code reduction rules for that new programmatic construct,
  3. upgrade representation building so MOSES can evolve programs using lists and fold,
  4. test this on some appropriate test cases.

One appropriate test case might be learning an nlogn sorting routine

For instance mergesort can be implemented in Haskell as

mergesort xs = foldt merge [] [[x] | x <- xs]

with the function merge a duplicates-preserving variant of union.

XXX: It is not clear to me how we would even present list data to MOSES. It's not clear to me where this kind of data occurs naturally ... I know of no robotic sensors, financial data, biological data, or text/nlp algorithms that naturally create lists of data. So even if we had this function, where would it be applied? How would it be applied?

XXX Given a list, what sort of recursive processing would we want to do on it (besides sorting the list)?  Or rather, what sort of recursive structure might some input data possess, such that we would need to automatically learn it?  Recall, the point of MOSES is to find hidden, undetected, unknown patterns.  What sort of primitive recursive patterns might we plausibly expect to find?  Having found them, what would we do with them?

Difficulty: 1-2-3

Generalize reduct and representation building to rely on operator properties

One problem with both Reduct and representation building is that they are pretty much hard-wired for a fixed set of operators (and, or, not, +, *, etc.), whereas what we would want is for them to rely on operator properties instead (i.e. in the sense of category theory or model theory).

For instance instead of having the rule

x + 0 -> x

as well as the rule

x or false -> x

we could have a single rule

x OP NE -> x

where NE is the unit element of operator OP.
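
A toy illustration of such a property-driven rule (not Reduct code): a single reduction function, parameterized by a table of unit elements, covers both the arithmetic and the boolean case.

  # Trees are ("op", left, right) tuples or leaf values/variable names.
  UNIT_ELEMENT = {"+": 0, "*": 1, "or": False, "and": True}

  def reduce_unit(tree):
      if not isinstance(tree, tuple):
          return tree
      op, left, right = tree[0], reduce_unit(tree[1]), reduce_unit(tree[2])
      unit = UNIT_ELEMENT.get(op)
      if unit is not None:
          if right == unit:
              return left
          if left == unit:
              return right
      return (op, left, right)

  assert reduce_unit(("+", "x", 0)) == "x"          # x + 0 -> x
  assert reduce_unit(("or", "x", False)) == "x"     # x or false -> x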

XXX -- are there really enough examples of this in combo to warrant this work? So, for example, we have:

  • unit element: x+0 -> x, x or F -> x, x and T -> x
  • generic distributive law
  • generic associative law
  • generic commutative law
  • generic division (??)

But we already have these implemented for boolean and contin (continuous) variables; it's not clear that there's much benefit to doing this in general, since we have no plans at this time to use MOSES to learn groups, ideals, rings, integer domains, modules, vector spaces, magmas, categories, elliptic curves, monoids, schemes, sheaves, fibre bundles, homologies, etc., where these abstract category-theoretic concepts are actually useful and important.

XXX Also, without the work that makes combo/reduct extensible (see proposal above), this work would not be terribly useful, as there is no practical way of adding new operators to the system at this time.

This same idea could be applied to representation building as well, though that is more complicated. The obvious advantage is that when one needs to expand Reduct with new operators, there is no need to add new reduction rules (if the operator doesn't involve new properties); it would suffice to declare the properties of the operator and we're done.

In more abstract terms: add support for algebraic structures and their interpretations, in the sense of model theory. The thing above that is called an "operator" is really an "algebraic structure", and "x+0 -> x" and "x or F ->x" are two different "interpretations" of that structure. Such systems are called "theories"; an example where this abstraction is already used is in SMT (Satisfiability Modulo Theories) solvers. The wikipedia article on SMT gives a list of "theories" that MOSES could support.

For other discussion, see this email thread

Difficulty: 2 for reduct, 3 for representation building

Higher-order Programmatic Constructs

An important project, appropriate for a student with some functional programming background, is to extend MOSES to handle higher-order programmatic constructs, including variable expression.

Our design for this involves Sinot's formalism of "director strings as combinators", and there is opportunity to assist with working out the details of the design as well as the implementation. In principle this could be done in many ways, including combinatory logic or lambda calculus, but the director-strings route seems best at the moment. Much of the work here is in Reduct and representation-building, which would be useful for both MOSES and Pleasure.

Difficulty: 2

  • Looks good for Semester of Code; Nil could mentor

Library of Known Routines

Currently, MOSES constructs programs out of a very minimal collection of functions and operators; for example, boolean expressions are constructed entirely from and, or, not. However, learning of programs could happen much faster if more complex primitives were provided: for example, xor, a half-adder, a multiplexer, etc. Rather than hard-code such functions into combo, this project calls for creating a generic library that could hold such functions, and the infrastructure to allow MOSES to sample from the library when building exemplars.

This is a needed step, on the way to implementing the pleasure algorithm below, as well as for transfer learning, etc.

This might be doable over the summer, although it would still need a very strong programmer. Proof that it is working would be defining xor in some file (as combo, viz. xor($1,$2) := and(or($1, $2), not(and($1, $2)))) and then showing that this definition was actually used in solving the k-parity problem.

  • Looks good for Semester of Code; Nil could mentor

Pleasure

Complete the existing implementation (or start a new one from scratch), then test and explore the Pleasure Algorithm for program learning, started in 2011 by Alesis Novik. See MOSES: the Pleasure Algorithm

Pre-requisite: Library of Known Routines, above.

Difficulty: 3

Transfer Learning

Causing MOSES to generalize across problem instances, so what it has learned across multiple problem instances can be used to help prime its learning in new problem instances. This can be done by extending the probabilistic model building step to span multiple generations, but this poses a number of subsidiary problems, and requires integration of some sort of sophisticated attention allocation method into MOSES to tell it which patterns observed in which prior problem instances to pay attention to.

A much simpler approach is to exchange the MOSES inner loop with the outer loop. Unfortunately, this would be a major tear-up of the code base. Exchanging the order of the loops would allow an endless supply of new stimulus to be continuously processed by MOSES, in essence performing continuous, ongoing transfer learning. This should be contrasted to the current design: the input is a fixed-size, fixed-content table, and during training it is iterated over and over, within the inner loop. If the order of the loops were reversed, then equivalent function could be obtained simply by replaying the same input endlessly. However, there would now be the option to NOT replay the same input over and over; rather, it could slowly mutate, and the learned structures would mutate along with it. This is a much more phenomenologically and biologically correct approach to learning from input stimulus.

Difficulty: 2-3

Arbitrarily Complex Program Learning

More on the previous project suggestion: The motivation for the above is to allow MOSES to learn arbitrarily complex programs. For instance, we would like it to be able to easily learn nlogn sorting algorithms without any fancy data preparation or other "cheating." It is possible that integrating Sinot's formalism into MOSES will allow effective learning of moderately complex programs using recursive control, which is something no one has achieved before and which is of critical importance in automated program learning.

Pre-requisite: Library of Known Routines, above.

Difficulty: 3

Action-Sequences Handling

The current version of MOSES does not elegantly or efficiently handle the learning of programs involving long sequences of actions. This is problematic for applications involving the control of robots or virtual agents. So, an important project is the extension of the Reduct and representation-building components of MOSES to effectively handle action-sequences. This work will be testable via using MOSES to control agents in virtual worlds such as Multiverse or Unity.

Use SVM for program tree selection

Try replacing the Bayes-net modeling in MOSES with SVM. SVMs with tree kernels have now matured somewhat; see e.g.

http://www.xialuo.net/2010/11/07/svm-light-tk-1-2/

So the idea would roughly be to:

  • use SVM to learn a model distinguishing fit from unfit program trees.
  • use this model as a filter on generated program trees, to determine which ones are worth fitness-evaluating.
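
A rough sketch of that filtering loop is below; the hand-rolled size/depth features are a deliberately crude stand-in for the tree kernels the proposal actually calls for.

  from sklearn.svm import SVC

  def tree_features(tree):
      # tree is a leaf or an ("op", child, ...) tuple; returns [size, depth].
      if not isinstance(tree, tuple):
          return [1, 0]
      feats = [tree_features(c) for c in tree[1:]]
      return [1 + sum(f[0] for f in feats), 1 + max(f[1] for f in feats)]

  # Trees whose fitness is already known (toy data), labeled fit=1 / unfit=0.
  known_trees = [("and", "x", "y"), ("or", ("and", "x", "y"), "z"), "x", "y"]
  known_labels = [1, 1, 0, 0]
  model = SVC().fit([tree_features(t) for t in known_trees], known_labels)

  # Only fitness-evaluate candidates the model considers promising.
  candidates = [("or", "x", "y"), "z"]
  worth_evaluating = [t for t in candidates
                      if model.predict([tree_features(t)])[0] == 1]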

Difficulty: 2

Improved hBOA

MOSES consists of four critical aspects: deme management, program tree reduction, representation-building, and population modeling. For the latter, the hBOA algorithm (invented by Martin Pelikan in his 2002 PhD thesis) is currently used, but we've found it not to be optimal in this context. So there is room for experimentation in replacing hBOA with a different algorithm; for instance, a variant of simulated annealing has been suggested, as has a pattern-recognition approach similar to LZ compression. A student with some familiarity with evolutionary learning, probability theory and machine learning may enjoy experimenting with alternatives to hBOA so as to help turn MOSES into a super-efficient automated program learning framework. MOSES already works quite well, dramatically outperforming GP, but we believe that with some attention to the population-modeling component it can be improved substantially.

  • Looks good for Semester of Code; Nil or Linas could mentor

Dimensional Embedding for Improved Program Learning

Suppose one is using MOSES (or some related technique) to learn a program tree that contains nodes referring to semantic knowledge in a large knowledge base (e.g. a program tree that contains terms like "cat", "walk" etc. that represent concepts in OpenCog's AtomTable).

Then, mutating these nodes (for "representation building", in MOSES lingo) requires some special mechanism -- for instance, one wants to mutate "cat" into "some other concept that is drawn from a Gaussian of specified variance, centered around 'cat'". One straightforward way to do this is to embed the concepts in the semantic knowledge base in an n-dimensional space, and then use Gaussian distributions in this dimensional space to do mutation.

We have identified some good algorithms for dimensional embedding, but the coding needs to be done, and a bunch of fiddling will likely be required!
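
A toy sketch of the mutation step, assuming the concepts have already been embedded (the 2-d vectors below are invented purely for illustration):

  import numpy as np

  embedding = {
      "cat":  np.array([0.9, 0.1]),
      "dog":  np.array([0.8, 0.2]),
      "walk": np.array([0.1, 0.9]),
  }

  def mutate_concept(concept, sigma=0.2):
      # Sample a Gaussian around the concept's embedding and return the
      # nearest other concept.
      sample = np.random.normal(embedding[concept], sigma)
      others = [c for c in embedding if c != concept]
      return min(others, key=lambda c: np.linalg.norm(embedding[c] - sample))

  print(mutate_concept("cat"))       # usually "dog", occasionally "walk"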

MOSES Evolution of Recurrent Neural Nets

In 2009 Joel Lehman worked on extending MOSES to evolve Recurrent Neural Networks. He did a great job, but due to a limitation in MOSES' handling of continuous knobs he could not fully experiment with his work. Continuous knobs are now fully supported, but the performance of NN evolutionary learning has only shown a slight increase and does not currently out-perform other, more classical approaches. We don't know why; it could be a bug in the code, a design error, or something more fundamental.

So the goal of this project would be to pursue Joel Lehman's work, to first try and understand the low performance and fix/improve it, and then if time permits implement and experiment with other classes of RNN.

PLN - Probabilistic Logical Reasoning

  • Language: advanced C++
  • Code: TBD
  • List: TBD
  • License: TBD

Probabilistic Logic Networks (PLN) are a novel conceptual, mathematical and computational approach to uncertain inference. In order to carry out effective reasoning in real-world circumstances, AI software must robustly handle uncertainty. However, previous approaches to uncertain inference do not have the breadth of scope required to provide an integrated treatment of the disparate forms of cognitively critical uncertainty as they manifest themselves within the various forms of pragmatic inference. Going beyond prior probabilistic approaches to uncertain inference, PLN is able to encompass within uncertain logic such ideas as induction, abduction, analogy, fuzziness and speculation, and reasoning about time and causality.

Reasoning about Biological Interactions

There are many databases denoting interactions between genes, proteins, chemical compounds and other biological entities. Some of these have recently been imported into OpenCog's AtomSpace format (after some scripting to reformat them), so they are ready to be reasoned on using PLN.

An example of the value of this kind of reasoning would be: using analogical inference to approximately transfer knowledge from one organism to another, e.g. flies or mice to humans.

  • Looks good for Semester of Code; Eddie Monroe could mentor

Planning via Temporal Logic & Consistency-Checking

A general and elegant way to do planning using PLN would be

  • Use explicit temporal logic, so that the temporal dependencies between different actions are explicitly encoded in Atoms using relations like Before, After, During, etc.
  • Improve the backward chainer so that, when it spawns a new Link in its chaining process, it checks whether this link is logically consistent with other promising Links occurring elsewhere in the backward chaining tree... (where if A and B are uncertain, logical inconsistency means that (A and B) has a much lower truth value strength than one would expect via an independence assumption.)

The search for logical consistency can be done heuristically, via starting at the top of the back-inference tree and working down. If quick inference says that a certain Link L is consistent with the link at a certain node N in the BIT, then consistency-checking with the children of N is probably not necessary.
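
As a tiny worked example of the independence-assumption test mentioned above (the threshold and names are arbitrary):

  def consistent(s_a, s_b, s_ab, tolerance=0.5):
      # Flag (A and B) as inconsistent when its strength falls far below
      # the independence estimate s_a * s_b.
      return s_ab >= tolerance * s_a * s_b

  print(consistent(0.8, 0.8, 0.1))   # False -> prune this branch of the BIT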

This approach would subsume the heuristics generally used in planning algorithms, into a generally sensible method for inference control...

But, while conceptually straightforward, this would be significant work to implement.... So, for the near term, Jade is working on a simpler state-space planner extending the current, crude planner....


Spatial and temporal reasoning

Code exists for doing temporal reasoning within PLN, using a fuzzy/probabilistic version of Allen Interval Algebra. However, this code has not been used much. Building a good test suite for temporal reasoning is important.

Building an analogous system for spatial reasoning using Region Connection Calculus is also a pending task.

(3)

Combining PLN and MOSES

Integrate MOSES-based supervised categorization into PLN, so that when PLN chaining hits a confusing point, it can launch MOSES to learn patterns in the members of the Atoms at the current end of the chain (which may then provide additional information useful in pruning). (1)

History-Guided Inference

Cause PLN's backward and forward chaining inference to utilize history -- so that an inference step is more likely to be taken if similar steps have been taken in similar instances. (1)

Some notes on how to do this, written in early May 2012 in the context of Dingjie Wang potentially starting work on the topic...


Here's an example inference tree pattern that might be mined from a data store of inference trees:

Implication
   AND
      Evaluation navigating me
      Evaluation stuck me
      Evaluation near (me $X)
      SimilarityLink $X $Y
   Evaluation
       UsefulForInference
       SimilarityLink $X $Y

Let's call this pattern "P1", just for simplicity...

What this pattern P1 says, informally, is "If I'm stuck, and I'm near $X, and $X is similar to $Y ... then inferences involving (SimilarityLink $X $Y) may be useful.."

If we had ContextLinks, a nicer version of the above would be

ContextLink
     navigation
     Implication
        AND
          Evaluation stuck me
          Evaluation near (me $X)
          SimilarityLink $X $Y
       Evaluation
          UsefulForInference
          SimilarityLink $X $Y

but for the moment, the ContextLink-free version P1 given previously will be OK...

For an initial study in the context of navigating, the relevant form of the pattern would just be

Implication
   AND
      Evaluation stuck me
      Evaluation near (me $X)
      SimilarityLink $X $Y
   Evaluation
       UsefulForInference
       SimilarityLink $X $Y

The conceptual definition of the predicate UsefulForInference (which does not yet exist in the code or in the PLN book), is:

Evaluation UsefulForInference W

means that doing inference using W is more likely than not to increase the speed or quality of one's inference conclusion.

The practical definition of UsefulForInference is:

Evaluation UsefulForInference W

means that W occurs more often in "good" sub-dags than "bad" sub-dags, in the historical inference dags that have been stored for analysis.

The goal of history-based inference control is to learn patterns like P1 from a historical database of pairs of the form

( inference dag, context of inference dag)

The "context" of an inference dag consists of Atoms referenced in that dag, and other Atoms linked to them.

In order to make the learning tractable, we will initially want to specify the predicates involved. Spatial predicates are a good start, if the initial context of study is PLN navigation. The above example used "stuck" -- we may want to invent some other predicates representing practical contexts that a system may get into while navigating.

The use of context here complexifies things, yet, also makes them more realistic and powerful. In humans, inference control really does depend on context, and on patterns learned via analyzing the relationships between inferences and their contexts.

It seems to me that the pattern learning could reasonably be done using MOSES, acting on program trees of restricted form. For starters maybe we could look at patterns of the restricted form

Implication
   AND
      Link_1 targ_11 targ_12
      ...
      Link_k targ_k1 targ_k2
   Evaluation
       UsefulForInference
       AND
           Link_(k+1) targ_(k+1)1 targ_(k+1)2
            ...
           Link_(k+r) targ_(k+r)1 targ_(k+r)2

where the targets could be concrete Atoms or variables... and initially k and r would be kept pretty small...

Steps toward implementing this would seem to be:

1) Build the mechanism for storing a bunch of (inference dag, context) pairs [probably in the Atomspace?]

2) Run a bunch of PLN navigation experiments in different situations, and save the inference dags & contexts

3) Do some basic statistical analysis on this datastore of inference dags & contexts, so as to understand it

4) Customize MOSES to use the inputs, and the restricted functional form, needed to learn the above sorts of patterns (like P1)

5) Apply MOSES and see what patterns it learns

Natural Language Processing (NLP)

NLP is used in OpenCog in two different ways. First, an existing infrastructure, consisting of Link Grammar and RelEx, is used to provide raw data for higher-level cognitive processing. One of the current focus projects is using Relex2Logic output with PLN to perform reasoning.

The other primary focus area is unsupervised language learning. A proposal for this is described in the paper: Learning Language from a Large (Unannotated) Corpus on ArXiv. This project is described in greater detail on the language learning wiki page. There are several GSOC-2015 Candidate Projects described on that page.

Although NLP and language comprehension are core foci of OpenCog, many of the language pipeline tools are not currently within the core OpenCog framework and instead do a lot of pre-processing before feeding their output into the OpenCog AtomSpace for reasoning and learning. Eventually we plan to move this NLP processing into the core so that language rules can be updated based on experience. The current, de facto NLP pipeline is described here.

Below are some additional, related NLP projects that are worthy of pursuit.

RelEx

RelEx is an English-language semantic relationship extractor, built on the Carnegie-Mellon link parser. It can identify subject, object, indirect object and many other relationships between words in a sentence. It can also provide part-of-speech tagging, noun-number tagging, verb tense tagging, gender tagging, and so on. Relex includes a basic implementation of the Hobbs anaphora (pronoun) resolution algorithm. Optionally, it can use GATE for entity detection. RelEx also provides semantic relationship framing, similar to that of FrameNet.

The output from RelEx is a hypergraph of Nodes and Links, which are input into OpenCog and may then be processed in various ways. In addition, the LexAt (lexical attraction) package provides many scripts and database tools to collect statistical information on parsed sentences.

PLN Inference on extracted semantic relationships

Currently RelEx takes in a sentence and outputs a set of logical relationships expressing the semantics of the sentence. It is possible to take the logical relationships extracted from multiple sentences and combine them using a logical reasoning engine, to see what conclusions can be derived. Thus, for example, the English-language input "Aristotle is a man. Men are mortal." should allow the deduction of "Aristotle is mortal."
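
For concreteness, the two premises might land in the AtomSpace roughly as below, using the Python bindings (module layout may vary between OpenCog versions); PLN's deduction rule would then be responsible for producing the conclusion link.

  from opencog.atomspace import AtomSpace, types

  a = AtomSpace()
  aristotle = a.add_node(types.ConceptNode, "Aristotle")
  man = a.add_node(types.ConceptNode, "man")
  mortal = a.add_node(types.ConceptNode, "mortal")

  a.add_link(types.InheritanceLink, [aristotle, man])   # "Aristotle is a man"
  a.add_link(types.InheritanceLink, [man, mortal])      # "Men are mortal"
  # Target of the inference: an InheritanceLink from "Aristotle" to "mortal".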

Some prototype experiments along these lines were performed in 2006, using sentences contained in PubMed abstracts (see paper). But no systematic software approach was ever implemented.

These experiments could be done in a variety of rule engines, including Probabilistic Logic Networks, as well as more standard crisp rule engines such as SWIRLS.

This project is appropriate for a student who is interested in both computational linguistics and logical inference, and has some knowledge of predicate logic.

A simple but detailed example of language-based inference using RelEx and PLN is given here: File:RelEx PLN Example Inference.pdf.

Related working notes can be found in the OpenCog source code tree, in the directory opencog/nlp/seme/README and opencog/nlp/triples/README.

See also NLP-PLN-NLGen pipeline.

  • Looks good for Semester of Code; Amen or Cosmo could mentor

Improved Reference Resolution

Reference resolution (anaphora resolution) is the problem of determining what words like "it", "him" and "her" refer to. There are several ways in which this problem might be attacked.

1. Statistical methods: The current RelEx implementation uses the Hobbs algorithm for this, which is an intelligent but crude mechanism that achieves about 60% accuracy. By combining Hobbs algorithm with statistical measures (one of which is sometimes called the Hobbs score), it should be possible (according to literature results) to get up to 85% accuracy or so. Appropriate for anyone who is interested in getting some hands-on experience with statistical corpus linguistics.

2. Reasoning: Sentences commonly make assertions, which can be checked for accuracy. So, for example, given "Jules went to Paris. He saw many things there.", one can ask:

  • Does he refer to Jules, or to Paris?
  • Can Jules see things?
  • Can Paris see things?
  • Does there refer to Paris, or to Jules?
  • Can one see many things in Jules?
  • Can one see many things in Paris?

This approach is more powerful than a statistical approach, but requires establishing a large knowledge-base, and performing PLN reasoning there-on.

For example: The current RelEx code "knows" that "he" cannot refer to "Paris", because it knows that "he" is a masculine pronoun, and that Paris is a location. It does know that "Jules" is masculine, and can thus propose this as the resolution.

By contrast, RelEx does not know who or what can "see" things. Considerable work would be needed to construct a database of "see-able" things. Such a database can be constructed by collecting certain simple grammatical constructions, e.g. parsing a large corpus, looking for all occurrences of "X saw Y", and memorizing the X's and Y's, to be used later for reasoning.
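
A toy sketch of that corpus-harvesting step, using invented RelEx-style relation triples:

  # (relation, verb, argument) triples, invented for illustration.
  parses = [
      ("_subj", "see", "Jules"), ("_obj", "see", "tower"),
      ("_subj", "see", "dog"),   ("_obj", "see", "cat"),
  ]
  can_see = {arg for rel, verb, arg in parses
             if verb == "see" and rel == "_subj"}
  can_be_seen = {arg for rel, verb, arg in parses
                 if verb == "see" and rel == "_obj"}

  # Reference resolution can then ask whether "Paris" is ever seen "seeing":
  print("Paris" in can_see)          # False -> "he saw" probably isn't Paris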

3. Discourse theory: In discourse theory, a distinction is made between the theme and the rheme: what is being talked about, and what is being said. In the above example, "Jules" is the theme, and his trip is the rheme: the conversation is centered on Jules. Thus, it is appropriate to assume that "he" must surely refer to "Jules", and not "Paris", simply because "Jules" is the topic of the conversation. Word order and other syntactic phenomena can help clarify what the topic of the conversation is. Thus, discourse theory provides an indirect but strong way of assisting in reference resolution, and, in particular, disambiguating between otherwise equal choices.

The task here is then to discover the attentional focus of the conversation, and use that to guide reference resolution. See, for example: Grosz, Joshi, Weinstein (1994) "Centering: A Framework for Modelling the Coherence of Discourse"

OpenCog, not RelEx: In any of these cases, the work would be done within the OpenCog AtomSpace, using scheme or python MindAgents. There would be little or no effort placed into extending the current Java code in RelEx. The reason for this is that OpenCog has (will have) access to a large database of knowledge, and the reasoning tools to work with it; RelEx does not.

Link Grammar Parser

A number of improvements to the Link Grammar parser would be useful.

Explore Landmark Transitivity in Link Grammar

The Link Grammar uses a constraint of "planar graphs" (i.e. no link crossings) to rule out unreasonable parses. It seems that it might be possible to replace this rule by the notion of "Landmark Transitivity" taken from Hudson's Word Grammar. The basic idea is this:

Each Link Grammar link is given a parent-child relationship: one end of the link is the parent, the other the child. Thus, for example, given a noun to noun-modifier link, the noun is the parent of the link. Then, parents are landmarks for children. Transitivity (in the mathematical sense of "transitive relation") is applied to these parent-child relationships. Specifically, the no-links-cross rule is replaced by two landmark transitivity rules:

  • If A is a landmark for B, and B is a landmark for C, then A is also a landmark for C
  • If A is a type-L landmark for B, and B is a landmark for C, then A is also a type-L landmark for C

where type-L means either a right-going or left-going link.

Ben hypothesizes that adding Landmark transitivity might be able to eliminate most or all of Link Grammar's post-processing rules. See Ben's PROWL grammar for details. See also Natural Language Processing, below.

  • Looks good for Semester of Code; Ben and Linas could mentor

Word Grammar Parsing

Implement a Word Grammar parser that utilizes the link grammar dictionary. This gives a strong impression of being an approach to NL comprehension that is suitable for general intelligence.

A next step would be utilization of PLN for semantic biasing of parsing.

  • Looks good for Semester of Code; Ben or Linas could mentor

Limited-window parse, Viterbi parsing.

Implement a limited-window search through parse space. The parser currently examines sentences as a whole, which means that the parsing of long sentences becomes very slow (approximately as N^3, with N the number of words in the sentence). Implementing a "window" to limit long-distance connections between distant words should dramatically improve parse performance. Such "windows" also make the parser more "neurologically plausible", by limiting difficult, long-range correlations between words. There are two approaches possible: the "easy way" and the "hard way".

The "easy way" is to replace the "left-wall" of link-grammar by the "state of possible right-going connections from where I last left off". That is, the current "left wall" consists entirely of right-going connectors that are appropriate for the beginning of a sentence. This can be replaced by a different set of right-going connectors: those that might be appropriate after a colon or semi-colon, or even, in extreme cases, a comma. Thus, an extremely long sentences (say, longer than 20 words) could be broken into parts, after a suitable semicolon or comma. The first part could be parsed using the normal algorithm. Any dangling right-moving parts would be memorized (instead of attaching to he "right wall"). Then to parse the second half of the sentence, these dangling links become the new, virtual left-wall.

The "hard way" is to do the above at word-by-word boundaries, instead of at long-phrase boundaries. That is, as each new word is encountered, a check is made to see if it can attach to the current left side. If so, the attachment is made, and any right-going links are added to the possible set of possible right-going state (i.e. becomes the new "left wall"). Such word-by-word wall-state tracking is known as "Viterbi parsing" or "Viterbi decoding" (see wikipedia Viterbi algorithm).

This project is very technically challenging, as it involves changes deep within the link-grammar code base. In terms of theory, it requires a passing understanding of probability theory, hidden Markov models, and the basic idea behind Viterbi decoding.

This project is the most critical, outstanding improvement needed for Link Grammar, as it finally enables continuous cost-based parsing.

This would be a very difficult task, at this time, and requires a master C coder.

Context-specific parsing

Implement support for "speech registers", such as newspaper headlines, medical notes, children's books, and conversational speech. The style of linguistic expression varies depending on context: for example, in newspaper headlines, articles are frequently omitted. Children's books sometimes use archaic grammatical constructions not commonly used in adult writing. During conversational speech, the rules of grammar are significantly relaxed; run-on sentences can predominate. Thus, a "one-size-fits-all" set of grammar rules has trouble with these situations, because the kind of grammar rules appropriate in one context might not be appropriate in another. Loosening parse rules sufficiently to work well for newspaper headlines or chat will result in rules that are too loose for book or newspaper text, generating an over-abundance of incorrect parses. Thus, a better approach is to apply a different set of grammatical rules, based on the context.

One way that this can be done is by adjusting the "cost" of using certain links in different situations. Thus, for example, the cost of omitting determiners such as "the" or "a" could be very high when normal text is being parsed, but low when newspaper headlines are parsed.

This project requires the development of altered grammar rules that can begin to tackle some of these different "speech registers". It also requires an extension of the scoring system, so that different sets of costs are used in different situations.

A more interesting variant would be perhaps the recognition of the context, given input text. Thus, for example, a block of Middle-English text would parse very poorly using the modern-English parse costs; the same text, assigned costs appropriate for Middle English, would result in a low cost: in effect, the different scoring systems act as a classifier for the text being parsed.

This is a difficult and time-consuming task, mostly because creating link-grammar dictionary entries by hand is very difficult, time-consuming and error-prone. This task might be better tackled after the unsupervised language learning task is providing results.

Quoted text, long-range structure, hierarchical structure

Currently, Link Grammar is unable to properly handle quoted text and dialogue. A mechanism is needed to disentangle the quoting from the quoted text, so that each can be parsed appropriately. This might be done with some amount of pre-processing (e.g. in RelEx), or possibly within Link Grammar itself.

It's somewhat unclear how to handle this within link-grammar. This is somewhat related to the problem of morphology (parsing words as if they were "mini-sentences"), idioms (phrases that are treated as if they were single words), and set-phrase structures (if ... then ..., not only ... but also ...), which have a long-range structure similar to quoted text (he said ...).

  • Looks good for Semester of Code; Linas or Ruiting or Rodas could mentor

Statistically similar expressions for question answering

Proper language-based inference requires that similar expressions be recognized as such. For example, X is the author of Y and X wrote Y are more or less synonymous phrases. Lin and Pantel describe how to automatically discover such similar phrases (See Dekang Lin and Patrick Pantel. 2001. "Discovery of Inference Rules for Question Answering." Natural Language Engineering 7(4):343-360). A similar, but more sophisticated approach is given by Hoifung Poon & Pedro Domingos, 2009, "Unsupervised Semantic Parsing".

Both of these systems use statistical techniques to find commonly occurring sub-expressions. Both start with text that has been processed with a dependency parser. While the first searches for chains of grammatical dependency relations that can be taken as synonymous, the second searches for arbitrary patterns that can be clustered into the same "synonym" or "concept". The two papers also differ in the statistical measures used to assign synonymous patterns into a common cluster, but it is not clear which statistical technique is better, or if that even matters very much. Lin uses measures based on mutual information, while Domingos applies the theory of Markov Logic Networks.

The goal of this task would be to implement a frequent-pattern-mining system within the OpenCog infrastructure. That is, given a large number of OpenCog hypergraphs, common subgraphs are to be identified, and then, using some statistical measure (such as mutual information, or possibly MLN, or another distance measure), such hypergraphs would be clustered together. Each such discovered cluster can be thought of as being a "concept"; the elements of the cluster are "synonymous expressions for the concept".

Ideally, by searching not just for synonyms, but for common patterns in general, the system will hopefully automatically find "speech formula", or perhaps "lexical functions" as described in Melcuk's Meaning-Text Theory. Presumably, such clustering techniques should also be able to group similar things, such as a cluster containing names of musical instruments, etc.

The discovery of such clusters should be useful both for question answering, as well as for natural-language generation (as clusters would contain different ways of expressing the same "idea".)

Natural Language Generation

We have a Java software system called NLGen that, given a set of RelEx-like relations, generates a syntactically correct English-language sentence.

It works OK on short sentences but needs to be improved significantly to be able to handle sentences with complex phrasal or clausal structure. Also, some statistical linguistics should be inserted to allow it to use observed word frequencies to guide its sentence generation.

The basic idea underlying NLGen's algorithm is described here: SegSim

Other language generation approaches include Markus Guhe's (2003) "Incremental conceptualisation for language production"

DeSTIN - Vision Processing

DeSTIN stands for Deep SpatioTemporal Inference Network and is a scalable deep learning architecture that relies on a combination of unsupervised learning and Bayesian inference. A paper by the inventors of this method is available. It is well suited to become the vision module of OpenCog and has been adopted by the project; the code is being tidied up for release as OSS. DeSTIN is implemented in CUDA.

Several people are already working on DeSTIN, but having someone involved to assist with the packaging, implement a testing framework using CTest, and/or create Python language bindings will help develop it into a worthwhile project in its own right.


Using DeSTIN and SVM or Neural Nets to Effectively Classify Images

DeSTIN can be used as a feature generator for supervised image classification using tools like SVM, Neural Nets or MOSES.

Some work has been done on this, using e.g. the CIFAR image classification corpus, but the accuracy achieved so far is not that high. It would be good to have someone focus on tweaking and tuning DeSTIN, and doing appropriate image preprocessing, to get the classification accuracy higher. This is useful because it teaches us how to adjust DeSTIN to make it work better in general.
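
A rough outline of the pipeline, with the DeSTIN stage stubbed out (extract_features below is a placeholder, not a real DeSTIN call, and the training data is fake):

  import numpy as np
  from sklearn.svm import SVC

  def extract_features(image):
      # Placeholder: a real pipeline would run DeSTIN over the image and
      # concatenate its top-layer belief/centroid vectors.
      return np.random.rand(128)

  # Tiny fake training set standing in for CIFAR images and labels.
  train_images = [np.zeros((32, 32, 3)) for _ in range(10)]
  train_labels = [0, 1] * 5
  X = np.array([extract_features(img) for img in train_images])
  clf = SVC(kernel="rbf").fit(X, train_labels)

  test_image = np.zeros((32, 32, 3))
  print(clf.predict([extract_features(test_image)]))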

Connecting DeSTIN and OpenCog

Here is one AI-intensive idea for a project involving DeSTIN (by Ben Goertzel)

I want to feed input from a webcam into DeSTIN, and then (in abstracted form) into OpenCog. Toy problems like MNIST are not really interesting to me, because my research goal in this context is robot vision. So scalability is important.

There will be a "smart interface" between DeSTIN and OpenCog, that will basically

-- record a database of DeSTIN state vectors

-- identify patterns in that database using machine learning algorithms

-- output these patterns into OpenCog's probabilistic/symbolic knowledge base

Here are some detailed ideas about how to potentially do this


The goal would be to get really simple stuff to work, like

-- feed DeSTIN pictures of cars, motorcycles, bicycles, unicycles and trucks (videos would be great, but let's start with pictures)

-- see if the whole system results in OpenCog getting ConceptNodes for car, motorcycle, bicycle, unicycle and truck, with appropriate logical links between them like

Inheritance car motor_vehicle <.9>
Inheritance truck motor_vehicle <.9>
Inheritance bicycle motor_vehicle <.1>
Similarity bicycle unicycle <.7>
Similarity bicycle motorcycle <.7>
Similarity motorcycle car <.6>
Similarity bicycle car <.3>

Note that while I'm using English terms here, in this experiment the OpenCog ConceptNodes would have no English names and English words would not be involved at all -- I'm just using the English terms as shorthands.

This would be a first example of the bridging of DeSTIN's subsymbolic knowledge and OpenCog's symbolic knowledge, which would be pretty cool and a good start toward further work on intelligent vision ;)

Concept Formation

Blending

Implement conceptual blending as a heuristic for combining concepts, with a fitness function for a newly formed concept incorporating the quality of inferential conclusions derived from the concepts, and the quality of the MOSES classification rules learned using the concept.
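
As a starting point, the fitness function might just be a weighted combination of those two quality signals; in the sketch below the two scoring callables are placeholders to be supplied by PLN and MOSES respectively:

  def blend_fitness(blend, inference_quality, moses_accuracy, w_inference=0.5, w_moses=0.5):
      # inference_quality(blend): placeholder for a PLN-derived score, e.g. the average
      #   confidence of non-trivial conclusions drawn using the blended concept.
      # moses_accuracy(blend): placeholder for the cross-validated accuracy of MOSES
      #   classification rules that use the blended concept as a feature.
      return w_inference * inference_quality(blend) + w_moses * moses_accuracy(blend)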

Attention Allocation and Psychopathology

Some intriguing parallels may be drawn between the dynamics of attention allocation and various human mental disorders. The analogies to human psychological issues shouldn't be taken too seriously; they are more a way of thinking about the various dynamical phenomena that could be observed by appropriately fiddling with the parameters of attention allocation and its relationship to OpenPsi...

...

  • Obsession. Focusing too long on just one thing. This could be achieved by not having the STI of Atoms in the AttentionalFocus decay rapidly enough. The result would be more and more HebbianLinks being built between the Atoms in the AF, which reinforces the obsession yet further.
  • Attention deficit disorder. Inability to pay attention to anything. This could be achieved by having the STI of Atoms in the AF decay too fast. Not many strong associations are learned because nothing stays in the AF very long. However, if the mind is in a situation where a lot of stimuli are constantly streaming in from some coherent context, then the mind will pay attention to that same context for a while and build associations regarding it; this is similar to how folks with ADD can often be good at focusing on video games or other contexts involving rich, fast streams of data and total interaction. (A toy simulation contrasting these two decay regimes is sketched after this list.)
  • Hypermania. This would happen if a mind became very happy with each new thing that came into its AF, but then rapidly became unhappy with it afterwards. So novelty must be a very highly weighted goal; perhaps the goal would be formulated as being fulfilled if things with high novelty have high STI.
  • Depression. One kind of depression involves dwelling on unhappy thoughts. One way this could happen is if: 1) certain unhappy thoughts (thoughts linked to low satisfaction) are in the AF for a while; 2) everything that passes through the AF gets HebbianLinks built to these unhappy thoughts; 3) PLN inference builds yet more links between Atoms and the unhappy thoughts; 4) all these HebbianLinks then pull the unhappy thoughts back into the AF whenever other things linked to the unhappy thoughts are in the AF...
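
The contrast between the first two bullets boils down, in the simplest possible model, to how fast STI decays relative to the rate of incoming stimuli. The toy simulation below is not ECAN code, just an illustration of the dynamic: with slow decay the same atoms linger in the attentional focus for many cycles (obsession-like), while with fast decay the focus churns constantly (ADD-like):

  import random

  def simulate_af(decay, cycles=200, af_size=5, n_atoms=50, seed=0):
      # Toy dynamic: each cycle one random "atom" receives a stimulus, all STIs decay
      # by the given factor, and the AF is the top-af_size atoms by STI. Returns the
      # average number of consecutive cycles an atom stays in the AF once it enters.
      rng = random.Random(seed)
      sti = [0.0] * n_atoms
      entered, dwell, in_af_prev = {}, [], set()
      for _ in range(cycles):
          sti[rng.randrange(n_atoms)] += 1.0           # stimulus from the environment
          sti = [s * (1.0 - decay) for s in sti]       # STI decay
          af = set(sorted(range(n_atoms), key=lambda i: -sti[i])[:af_size])
          for a in af - in_af_prev:
              entered[a] = 0
          for a in af:
              entered[a] += 1
          for a in in_af_prev - af:
              dwell.append(entered.pop(a))
          in_af_prev = af
      return sum(dwell) / len(dwell) if dwell else float(cycles)

  print("slow decay (obsession-like):", simulate_af(decay=0.01))
  print("fast decay (ADD-like):      ", simulate_af(decay=0.5))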

Map formation

Implement "map formation," a process that uses frequent subgraph mining methods to find frequently co-occurring subnetworks of the AtomTable, and then embodies these as new nodes. This requires some extension and adaptation of current algorithms for frequent subgraph mining. It also requires functional attention allocation.

Context formation

Implement context formation, wherein an Atom C is defined as a context if it is important and restricting other Atoms A to the context of C yields truth values significantly different from A's default truth values.
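
A first heuristic for spotting candidate contexts might simply compare an Atom A's default truth-value strength with its strength restricted to C, flagging C when the divergence is large; in this minimal sketch the numeric inputs stand in for values that would really come from AtomSpace queries:

  def is_candidate_context(c_importance, default_strength, contextual_strength,
                           importance_threshold=0.5, divergence_threshold=0.3):
      # c_importance: (normalized) STI of the candidate context atom C.
      # default_strength: A's strength estimated over all evidence, roughly P(A).
      # contextual_strength: A's strength restricted to C, roughly P(A | C).
      # C is flagged as a context worth forming if it is important enough and
      # conditioning on it shifts A's truth value by more than the divergence threshold.
      return (c_importance >= importance_threshold and
              abs(contextual_strength - default_strength) >= divergence_threshold)

  # Example: "bat" mostly denotes the animal by default, but in the context "baseball"
  # the flying-mammal interpretation drops sharply, so "baseball" is a useful context.
  print(is_candidate_context(0.8, default_strength=0.7, contextual_strength=0.1))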

Clustering

See Clustering

Embodiment - Controlling Avatars and Robots

These projects are related to supporting the embodiment of OpenCog in real and virtual environments.

Build a ROS proxy for OpenCog

Extend OpenCog's Embodiment module to communicate with physical robots, using ROS (Robot Operating System). As mechanisms for communicating with agents in virtual worlds will be provided, this is not a huge conceptual leap, but it will doubtless lead to many practical complexities. (1-3 depending on how far you go)

Work has been done toward this project by the Artificial Brain Lab at Xiamen University in Fujian, China.

  • XIA-MAN: An Extensible, Integrative Architecture for Intelligent Humanoid Robotics PDF, Ben Goertzel, Hugo de Garis
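
On the ROS side, a proxy could start as nothing more than a node that subscribes to perception topics and forwards each message to OpenCog; in the sketch below the topic name and the hand-off to the Embodiment module are placeholders, though the rospy calls themselves are standard:

  #!/usr/bin/env python
  import rospy
  from std_msgs.msg import String

  def perception_callback(msg):
      # Placeholder: a real proxy would translate the ROS message into whatever
      # perceptual-update format OpenCog's Embodiment module expects.
      rospy.loginfo("forwarding perception to OpenCog: %s", msg.data)

  def main():
      rospy.init_node("opencog_ros_proxy")
      rospy.Subscriber("/perception", String, perception_callback)  # topic name is a placeholder
      rospy.spin()

  if __name__ == "__main__":
      main()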

Build a Minecraft interface for OpenCog

Build an integration between OpenCog and a stable, well-documented open-source Minecraft client API (such as Mineflayer or Spock), allowing the use of any server that implements the Minecraft protocol (including the official Minecraft server as well as open-source implementations). This existing, standard environment could then be used to build simulation environments for Artificial General Intelligence experiments without spending extensive time engineering new simulation software.

The project would involve writing interface code between a Minecraft client API and OpenCog, and replacing some of the existing OpenCog embodiment code that implements messaging with external systems with a simplified, modernized implementation. It would be an opportunity to apply the software design concepts of message queues, loose coupling and object serialization for external consumers. Ideally, a candidate should have experience in both Python and C++.
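
To illustrate the loose-coupling idea independently of any particular Minecraft client library, the serialization layer might pass JSON-encoded percepts and actions over queues between the client process and OpenCog; the message fields below are invented for illustration, and in a real deployment the in-process queues would be replaced by a network message queue:

  import json
  import queue

  # Queues decouple the Minecraft client process from the OpenCog side;
  # in practice these would be network message queues rather than in-process objects.
  percepts_to_opencog = queue.Queue()
  actions_to_minecraft = queue.Queue()

  def publish_percept(block_type, position):
      # Serialize a (made-up) percept from the Minecraft client and enqueue it.
      percepts_to_opencog.put(json.dumps({"type": "block_seen",
                                          "block": block_type,
                                          "position": position}))

  def next_action():
      # Deserialize the next action OpenCog has queued for the avatar, if any.
      try:
          return json.loads(actions_to_minecraft.get_nowait())
      except queue.Empty:
          return None

  publish_percept("dirt", [10, 64, -3])
  print(json.loads(percepts_to_opencog.get()))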

For more ideas about what this would look like, refer to this thesis paper on Minecraft as a simulation environment for Artificial General Intelligence: http://tmi.pst.ifi.lmu.de/thesis_results/0000/0073/thesis_final.pdf Note that this thesis was written in the context of a different project, and many of the implementation decisions contained within will be different from this project proposal; however, the document does include many nice explanations and examples that give a good general overview of the problem space.

This project has started and is currently looking for volunteers. If interested, please visit the Minecraft Bot Development Roadmap.

Inference regarding skeletons and surfaces of objects in a 3D game world

How can an OpenCog-controlled game character, in a 3D game world implemented in Unity3D, reason about solid objects (represented either as 3D models or as Minecraft-like assemblages of bricks)?

One way is to use existing OSS code to map solid objects into skeletons, and to identify the parts of the surface of the objects. This information about skeletons and surfaces can then be fed into the PLN (Probabilistic Logic Networks) engine for reasoning.
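
Minecraft-like brick assemblages are already natural voxel grids, so off-the-shelf skeletonization is one plausible starting point; the sketch below uses scikit-image and SciPy on a made-up toy object, with the resulting skeleton and surface voxel coordinates being the sort of data one would then encode as Atoms for PLN:

  import numpy as np
  from scipy.ndimage import binary_erosion
  from skimage.morphology import skeletonize

  # Toy voxel object: a solid "beam" inside a 20^3 occupancy grid, standing in for
  # a Minecraft-like brick assemblage or a voxelized 3D model.
  voxels = np.zeros((20, 20, 20), dtype=bool)
  voxels[8:11, 8:11, 4:16] = True

  skeleton = skeletonize(voxels)                 # 1-voxel-thick skeleton (older scikit-image
                                                 # versions expose this as skeletonize_3d)
  surface = voxels & ~binary_erosion(voxels)     # voxels with at least one empty neighbour

  # Coordinates that could be turned into Atoms and handed to PLN for reasoning.
  skeleton_points = np.argwhere(skeleton)
  surface_points = np.argwhere(surface)
  print(len(skeleton_points), "skeleton voxels,", len(surface_points), "surface voxels")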

This project is a good fit for someone with experience in, or a very strong interest in, 3D modeling, graphics or game programming, as well as OpenCog and logical inference.

  • Looks good for Semester of Code; Lake or Shujing could mentor

Convert message format from XML to protocol buffers

XML is slow and verbose, and Google protocol buffers are a better choice for passing perceptual updates and actions between OpenCog's "mind" and the environment it is interacting with. This project would involve converting the XML messages into protocol buffer objects; given that some of the XML messages are built using string concatenation, it may also require some refactoring of the messaging system.
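
The conversion itself is mechanical once a message schema is settled. Purely for illustration, assuming a hypothetical PerceptionUpdate message compiled by protoc into a module perception_pb2 (the message and field names are invented for this sketch), the XML string concatenation would be replaced by something like:

  # Hypothetical generated module from a perception.proto schema -- illustration only.
  import perception_pb2

  def encode_update(object_name, x, y, z):
      # Build and serialize a perceptual update as a protocol buffer instead of XML.
      msg = perception_pb2.PerceptionUpdate()
      msg.object_name = object_name
      msg.x, msg.y, msg.z = x, y, z
      return msg.SerializeToString()          # compact binary payload sent to the CogServer

  def decode_update(payload):
      msg = perception_pb2.PerceptionUpdate()
      msg.ParseFromString(payload)
      return msg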

New Embodiment Module

See New Embodiment Module, March 2015

OpenCog Applications

This section lists projects related to using OpenCog components for specific applications.

Sokoban

Sokoban would be a good toy domain for experimenting with various OpenCog methods. So hooking up OpenCog to a simple Sokoban server would seem worthwhile. MOSES and PLN could be used for Sokoban on their own, but it's more interesting of course to take an integrative approach. (1)