Talk:Development

THIS IS A TALK PAGE, NOT AN ARTICLE PAGE. PLEASE REMOVE ALL CONTENT FROM THIS PAGE THAT DOES NOT PERTAIN TO TALK, OR IT WILL BE DELETED FOR YOU!

IF ANY OF THIS STUFF IS READY TO BE PUT INTO AN ARTICLE, PLEASE DO SO NOW!

Interactive shell

An interactive shell to manipulate the AtomSpace is needed.

  1. Should we have a foreign function interface, which would therefore need to run locally, or
  2. a remote shell that connects to OpenCog, sends commands and receives output, or
  3. both?

As of mid-2008, there is an easy-to-use Scheme shell in OpenCog; see the cookbook for simple examples, or the scheme command reference for complete documentation.

SWIG can be used to generate the bindings between C++ and Python (as well as Ruby, Lisp, etc.). One can then use Twisted to create a remote Python shell.

Not every C++ object should be exposed. The right approach would be to create Python functions for the following:

  1. Atom (link, node) creation and destruction.
  2. Truth value creation.
  3. Setting and getting atom properties, such as truth values.
  4. Python wrappers for "miscellaneous" OpenCog "mind agents", e.g. possibly certain entry points to attention allocation, taking care to keep the number of entry points to a minimum.
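
To make the first three items concrete, here is a rough pure-Python sketch of how small such an exposed surface could be. Nothing here is a real OpenCog binding: `TruthValue`, `Atom`, `AtomSpaceFacade` and all of their methods are hypothetical names made up for illustration, and a real binding would forward these calls to the C++ AtomSpace rather than keep its own dictionary. Item 4 (mind agent entry points) is left out.

```python
# Hypothetical sketch only: these classes are NOT real OpenCog bindings.
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class TruthValue:
    """Item 2: truth value creation (strength and confidence in [0, 1])."""
    strength: float = 1.0
    confidence: float = 0.0


@dataclass
class Atom:
    """A node has a name and no outgoing set; a link has an outgoing set."""
    atom_type: str
    name: str = ""
    outgoing: Tuple["Atom", ...] = ()
    tv: TruthValue = field(default_factory=TruthValue)


class AtomSpaceFacade:
    """The handful of entry points a Python shell would be allowed to call."""

    def __init__(self) -> None:
        self._atoms: Dict[tuple, Atom] = {}

    @staticmethod
    def _key(atom_type: str, name: str, outgoing: Tuple[Atom, ...]) -> tuple:
        # Links are keyed by the identity of the atoms they point at.
        return (atom_type, name, tuple(id(a) for a in outgoing))

    # Item 1: atom creation and destruction.
    def add_node(self, atom_type: str, name: str) -> Atom:
        key = self._key(atom_type, name, ())
        return self._atoms.setdefault(key, Atom(atom_type, name))

    def add_link(self, atom_type: str, *outgoing: Atom) -> Atom:
        key = self._key(atom_type, "", outgoing)
        return self._atoms.setdefault(key, Atom(atom_type, "", tuple(outgoing)))

    def remove(self, atom: Atom) -> bool:
        key = self._key(atom.atom_type, atom.name, atom.outgoing)
        return self._atoms.pop(key, None) is not None

    # Item 3: setting and getting atom properties.
    def set_tv(self, atom: Atom, tv: TruthValue) -> None:
        atom.tv = tv

    def get_tv(self, atom: Atom) -> TruthValue:
        return atom.tv

    def __len__(self) -> int:
        return len(self._atoms)
```

A shell session against such a facade would then only ever see `add_node`, `add_link`, `remove`, `set_tv` and `get_tv`, which keeps the number of entry points small.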


User:Linas has implemented a Guile LISP interpreter and extended Scheme so there are functions for adding links/atoms. Joel Pitt 04:18, 29 May 2008 (CDT)

Swapping, Distributed Processing

TODO: summarize email thread starting April 29th 2008. (??)

MemcacheDB:

Memcachedb is a distributed key-value storage system designed for persistence. It is NOT a cache solution, but a persistent storage engine for fast and reliable key-value based object storage and retrieval. It conforms to the memcache protocol (though not completely), so any memcached client can connect to it. Memcachedb uses Berkeley DB as its storage backend, so many features, including transactions and replication, are supported.

Upshot: an attempt to use memcachedb was made, but it was a loser. See the code in opencog/memcache for the smoking remains. A robust, scalable atom persistence mechanism was built on top of Postgres instead. It can bulk-save/restore atoms from a database; alternatively, multiple cogserver instances can access the database on the fly (dynamically), and share atoms with one another that way. See opencog/persist/README for details. See opencog/atomspace/BackingStore.h for the (currently very, very simple) dynamic API.

This API is not yet fully integrated with the Economic Attention Allocation (ShortTerm/LongTerm Importance) subsystem, although it is meant to be. The intent is that once an Atom's short-term importance is low enough, the Atom is deleted from RAM but remains on disk. Its current truth value should be saved to disk before it is deleted. In addition, atoms whose LongTerm Importance has vanished should be deleted from disk; at this time, they are not.
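
The intended forgetting policy can be sketched in a few lines of Python. This is only an illustration of the rules described above, not the OpenCog implementation: `StoredAtom`, `DictBackingStore`, `forget_pass` and the two threshold parameters are hypothetical, a plain dictionary stands in for the on-disk store, and "vanished" is assumed to mean "at or below `lti_floor`".

```python
# Hypothetical sketch of the intended eviction rules; not OpenCog code.
from dataclasses import dataclass, replace
from typing import Dict


@dataclass
class StoredAtom:
    name: str
    truth: float  # stand-in for a full truth value
    sti: float    # short-term importance
    lti: float    # long-term importance


class DictBackingStore:
    """In-memory stand-in for the on-disk store."""

    def __init__(self) -> None:
        self.rows: Dict[str, StoredAtom] = {}

    def store(self, atom: StoredAtom) -> None:
        self.rows[atom.name] = replace(atom)  # save a copy of the current state

    def delete(self, name: str) -> None:
        self.rows.pop(name, None)


def forget_pass(ram: Dict[str, StoredAtom], disk: DictBackingStore,
                sti_floor: float = 0.0, lti_floor: float = 0.0) -> None:
    """Apply both forgetting rules once to every atom currently in RAM."""
    for name, atom in list(ram.items()):
        lti_gone = atom.lti <= lti_floor
        if lti_gone:
            # Long-term importance has vanished: drop the on-disk copy.
            disk.delete(name)
        if atom.sti <= sti_floor:
            if not lti_gone:
                # Save the current truth value before leaving RAM.
                disk.store(atom)
            del ram[name]
```

Note that an atom with vanished LTI but high STI stays in RAM here and only loses its disk copy; whether that is the desired behaviour is an open design question.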

Note that the BackingStore API could be used/extended to allow OpenCog server instances to trade atoms on the fly, without rendezvousing with a database.

Distributed Processing

The easiest way to implement distributed processing is to have multiple cog-servers chew on the data in one big giant hypergraph store. This is more or less possible today, and work to fully enable it is proceeding slowly but actively.

Currently, OpenCog server instances can work on the same set of atoms by tapping into the same database -- all servers that are attached to the same database would see the same set of atoms, and can thus manipulate them, etc. This provides a simple, quick-n-easy way of doing distributed processing today. See opencog/persist/README for details.

The interface to the database is currently defined in opencog/atomspace/BackingStore.h. This API is currently dirt-simple but could be extended to provide direct server-to-server communications.
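
As an illustration of the shared-database approach, here is a minimal Python sketch of two servers seeing the same atoms through one store. The method names `store_atom` and `fetch_atom` are assumptions made for this sketch and are not taken from BackingStore.h; `SqlBackingStore` and `Server` are likewise hypothetical, and an in-memory SQLite database stands in for Postgres so that the example runs anywhere.

```python
# Hypothetical sketch: the interface below is NOT the real BackingStore.h API.
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, Optional


class BackingStore(ABC):
    """A deliberately tiny store/fetch interface."""

    @abstractmethod
    def store_atom(self, name: str, strength: float) -> None: ...

    @abstractmethod
    def fetch_atom(self, name: str) -> Optional[float]: ...


class SqlBackingStore(BackingStore):
    """SQL-backed store; SQLite is used here as a stand-in for Postgres."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS atoms (name TEXT PRIMARY KEY, strength REAL)"
        )

    def store_atom(self, name: str, strength: float) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO atoms (name, strength) VALUES (?, ?)",
            (name, strength),
        )
        self.conn.commit()

    def fetch_atom(self, name: str) -> Optional[float]:
        row = self.conn.execute(
            "SELECT strength FROM atoms WHERE name = ?", (name,)
        ).fetchone()
        return None if row is None else row[0]


class Server:
    """Stand-in for one server instance with its own in-RAM atoms."""

    def __init__(self, store: BackingStore) -> None:
        self.store = store
        self.local: Dict[str, float] = {}

    def add(self, name: str, strength: float) -> None:
        self.local[name] = strength
        self.store.store_atom(name, strength)  # write through to the database

    def get(self, name: str) -> Optional[float]:
        if name not in self.local:
            value = self.store.fetch_atom(name)  # fetch on a local miss
            if value is not None:
                self.local[name] = value
        return self.local.get(name)
```

One limitation of this naive version: once a server has an atom locally it never re-reads it, so updates made by another server go unnoticed. Direct server-to-server communication would be one way to address that.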

The current persistence store is built on SQL, but a more natural store would be natively hypergraph-aware. Scalability and performance are issues to watch out for. Possible data store technologies include:

  • BigTable
  • HyperTable
  • Hadoop

Open questions include what sort of indexes to maintain on the master table.

Visualization Tool

The basic idea is to allow AI engineers to more easily visualise OpenCog dynamics, and to provide an interface for manipulating the AtomSpace and debugging high-level processes (i.e. not code errors, but erratic or unwanted system dynamics). It is discussed in more detail at VisualizationToolIdeas.

Multithreading

Currently OpenCog is single threaded. Making it multithreaded is highly desirable.

Some considerations:

  • Will a single instance of a MindAgent be expected to run at any one time? Or will multiple instances be created on the off chance that a MindAgent runs more than once at the same time?
  • Locking the AtomTable. What form should this take? Should each atom have a mutex? Each AtomType? Should MindAgents be expected to hold access only briefly per write, or to obtain locks intelligently, trading the freedom of other agents to make changes against efficiency?
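
To give the "mutex per atom" option something concrete to argue about, here is a small Python sketch using the standard `threading` module. It is not based on the actual AtomTable code: `LockedAtomTable`, `_Cell`, `update` and `read` are hypothetical names, and atoms are reduced to a single number each. A table-wide lock is held only long enough to find or create an atom's cell; the per-atom lock is held briefly for each write.

```python
# Hypothetical sketch of per-atom locking; not the OpenCog AtomTable.
import threading
from typing import Callable, Dict


class _Cell:
    """One atom's value together with its own mutex."""

    def __init__(self, value: float) -> None:
        self.lock = threading.Lock()
        self.value = value


class LockedAtomTable:
    def __init__(self) -> None:
        self._table_lock = threading.Lock()  # guards the dictionary itself
        self._cells: Dict[str, _Cell] = {}

    def _cell(self, name: str) -> _Cell:
        # Short critical section: look up or create the cell, nothing more.
        with self._table_lock:
            cell = self._cells.get(name)
            if cell is None:
                cell = self._cells[name] = _Cell(0)
            return cell

    def update(self, name: str, fn: Callable[[float], float]) -> None:
        """Apply fn to the atom's value while holding only that atom's lock."""
        cell = self._cell(name)
        with cell.lock:
            cell.value = fn(cell.value)

    def read(self, name: str) -> float:
        """Return the atom's value (a missing atom is created with value 0)."""
        cell = self._cell(name)
        with cell.lock:
            return cell.value
```

With this layout, agents working on different atoms never block each other except for the brief dictionary lookup. The cost is one mutex per atom, and any operation that needs several atoms at once would have to take their locks in a fixed order to avoid deadlock.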

(Note from Joel: IANAMTA - I Am Not A Multi-Threaded Architect)

Many, but not all, of the difficulties in making OpenCog multi-threaded could be solved in one swell foop by linking it to the Boehm garbage collector. This would at least solve the problem of one thread referencing atoms that might have been freed in another. Enabling Boehm GC should be pretty easy.

Peer to Peer Communications

Uhh, why do we need direct peer-to-peer communications?

Just listing potential options as they are thought of at the moment:

Peer to peer systems:

  • Using Skype as a p2p platform was discussed generally on the AGI mailing list. [1]
  • n2n [2]
  • Berkeley Open Infrastructure for Network Computing (BOINC) [3]

For message passing:

Indirect communications (peer to storage to peer) can be achieved by having two systems talk to one storage backend.

Optimization

  • CUDA - utilize the power of NVIDIA consumer graphics cards [4].
    • Paper on implementing several graph algorithms using CUDA [5].
    • Programmer's Guide [6]


So, using CUDA could make the PLN equation solving go really fast....

And there is a convenient Python interface for CUDA:

http://mathema.tician.de/software/pycuda

What are the relevant chapters in the PLN book? What are the equations?

What will the overall PLN implementation look like?

Is it ok to pull images out of pdfs and put them on wiki?

Atom roles

Tables of atom properties, such that MindAgents can associate arbitrary properties with atoms.
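
One possible shape for such tables, as a Python sketch. Everything here is hypothetical (`AtomPropertyTables` and its methods are invented for illustration, and atoms are identified by plain strings): each (agent, property) pair gets its own side table keyed by atom, so agents cannot trample each other's properties and the atom itself stays untouched.

```python
# Hypothetical sketch of per-agent atom property tables; not OpenCog code.
from collections import defaultdict
from typing import Any, Dict, Hashable, Tuple


class AtomPropertyTables:
    def __init__(self) -> None:
        # One table per (agent, property name), mapping atom -> value.
        self._tables: Dict[Tuple[str, str], Dict[Hashable, Any]] = defaultdict(dict)

    def set(self, agent: str, atom: Hashable, key: str, value: Any) -> None:
        self._tables[(agent, key)][atom] = value

    def get(self, agent: str, atom: Hashable, key: str, default: Any = None) -> Any:
        # .get() on the outer mapping avoids creating empty tables on reads.
        return self._tables.get((agent, key), {}).get(atom, default)

    def forget_atom(self, atom: Hashable) -> None:
        """Drop every property of an atom, e.g. when the atom is deleted."""
        for table in self._tables.values():
            table.pop(atom, None)
```

For example, `props.set("agent_a", "cat", "last_seen", 42)` stores a value that only lookups under `"agent_a"` will see.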

Documentation

Tutorial

Write a tutorial like the one found in OpenBiomind.

Packaging

For more details, see the Packaging page.

Assuming a very inclusive trunk at lp:opencog (i.e. allowing just about anything that's deemed worthy and is license compatible with the Framework & OpenCog Core), create Debian package build directives in the lp:opencog branch and distribute Ubuntu packages via Launchpad using its native build system (the Personal Package Archive, or PPA).

Where applicable, also create complementary packages of the form opencog-prime-source and opencog-prime-dev (header files only, e.g. so new and independent MindAgents may be developed).

opencog-prime and opencog-collective packages should be usable without the need to compile source code, for example in lab exercises or real-world applications where only configuration, input and output are used.

Approximately in chronological order:

opencog-framework package

A meta-package for developers, aka an SDK (software development kit); proposed packaging scheme:

  • opencog-framework
    • opencog-freecore-dev
      • cogserver-dev
        • libatomspace-dev
    • opencog-framework-doc
      including tutorial with interesting but simple application

opencog-prime package

proposed packaging scheme:

  • opencog-prime
    includes cognitive architecture configuration files
    • opencog-freecore
      • cogserver
        • libatomspace
    • opencog-prime-mindagents
      includes MindAgents for attention allocation, reasoning, creativity, etc.
      • opencog-prime-algorithms
        • libPLN
        • libMOSES
    • opencog-prime-doc
      including OpenCogPrime tutorial

opencog-collective package

aka contrib for ad-hoc MindAgents & logic/NN/whatever libraries

  • opencog-collective
    • example-mindagent
      • example-library

OpenBiomind Integration

Determine the best method (e.g. Java bindings for AtomSpace, CogServer and Core Algorithm APIs, or other methods).

Reference Implementations of Textbook AI Techniques

This section needs volunteers! Implementations of techniques listed here are required for experimentation, for comparative analysis, to enhance OpenCog's reach as a general and universal research tool, to extend OpenCog as an educational platform and as a tool useful in the presentation and demonstration of work for publication, and to attract developers who might not otherwise be knowledgeable about the bleeding-edge of OpenCog's core research.

Other projects & general frameworks

For use as examples and sources of ideas & inspiration.

Weka has many machine learning algorithms implemented in a Java framework for classification and clustering.

Neural Networks

For reference wikipedia:Artificial_neural_network

Feedforward neural networks

Radial basis function networks

Kohonen self-organizing networks

Multilayer perceptrons

Recurrent networks

  • Hopfield networks
  • Echo state networks

Stochastic neural networks

  • Boltzmann machines

Bayesian Networks

For reference wikipedia:Bayesian_networks

Evolutionary Algorithms

Other Classification Algorithms

Kernel methods / Support vector machines