GSoC 2008 - Distributed HyperGraphDB

From OpenCog

Description

The aim is to create a distributed version of the HyperGraphDB database (HGDB). While there are no clear-cut requirements for what a distributed HGDB should do, we intend to develop the basic building blocks from which behavior tailored to specific application needs can be constructed. This includes a basic communication layer for transferring atoms and performing queries between HGDB instances in a plain client-server fashion, as well as an extensible high-level protocol that can be used to implement peer-to-peer HGDB networks.


Communication

We have chosen JXTA as the communication protocol between peers hosting HGDB instances because it:
- abstracts network topologies
- has a C++ version (for when the project is ported to C++)
- allows creating logical network topologies that can help organize the data flow in the network

We tested the communication in different scenarios (from peers on the same computer to peers behind different firewalls and NAT). What remains to be done is some fine-tuning and performance testing of JXTA itself.

To transmit the custom data, we used JSON (as a lightweight replacement for XML).

The structure of the messages was designed according to FIPA standards.

More information about the communication mechanisms used can be found at http://code.google.com/p/hypergraphdb/wiki/Communication
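As a rough illustration of what a FIPA-style message carried as JSON might look like, the sketch below builds an envelope with a few of the standard FIPA ACL parameters (performative, sender, receiver, conversation-id, content) and serializes it by hand. The class name, the peer names, and the format of the content string are assumptions made for this example; the actual HGDB message layout is described on the page linked above and may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative only: a FIPA-ACL-style envelope serialized to JSON by hand. */
final class AclMessageSketch {

    /** Wraps a string in quotes and escapes the characters JSON requires. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Serializes a flat map of string fields as one JSON object, in insertion order. */
    static String toJson(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            joiner.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return joiner.toString();
    }

    /** Builds a query message; "query-ref" is one of the FIPA performatives. */
    static Map<String, String> queryMessage(String sender, String receiver,
                                            String conversationId, String content) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("performative", "query-ref");
        m.put("sender", sender);
        m.put("receiver", receiver);
        m.put("conversation-id", conversationId);
        m.put("content", content); // hypothetical query text, not an HGDB format
        return m;
    }

    public static void main(String[] args) {
        System.out.println(toJson(queryMessage("peer-a", "peer-b", "conv-1", "type = Concept")));
    }
}
```

The conversation-id field is what lets a peer match a reply to the request that caused it, which matters once several exchanges are in flight at the same time.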

Replication

We implemented a basic replication scheme. The idea is that every peer interested in storing certain atoms publishes its interests and receives a message every time such an atom is modified on one of the peers.

The mechanism guarantees that, once the interest is announced, all matching atoms will be delivered, even if the peer is down when the atom is modified.

The mechanism still requires a lot of work in terms of performance and eventual consistency guarantees.
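The interest-based scheme described above can be sketched in a few lines. This is a single-process, in-memory model and not the HGDB implementation: atoms are represented as plain strings, an interest is a predicate over them, and "delivery" just appends to a list. All names here are hypothetical. What the sketch does show is the guarantee stated above: updates that match an announced interest are queued while the peer is offline and handed over when it comes back.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Predicate;

/** Illustrative only: interest-based replication with queued delivery. */
final class InterestReplicationSketch {

    /** Per-subscriber state: the announced interest and undelivered updates. */
    private static final class Subscription {
        final Predicate<String> interest;
        final Queue<String> pending = new ArrayDeque<>();
        final List<String> delivered = new ArrayList<>();
        boolean online = true;

        Subscription(Predicate<String> interest) {
            this.interest = interest;
        }
    }

    private final Map<String, Subscription> subscriptions = new HashMap<>();

    /** A peer publishes which atoms it wants to store. */
    void announceInterest(String peerId, Predicate<String> interest) {
        subscriptions.put(peerId, new Subscription(interest));
    }

    /** Marks a peer as up or down; coming back up drains its queue. */
    void setOnline(String peerId, boolean online) {
        Subscription s = subscriptions.get(peerId);
        s.online = online;
        if (online) {
            flush(s);
        }
    }

    /** Called whenever an atom is modified on some peer. */
    void atomModified(String atom) {
        for (Subscription s : subscriptions.values()) {
            if (s.interest.test(atom)) {
                s.pending.add(atom);   // always queue first, so nothing is lost
                if (s.online) {
                    flush(s);
                }
            }
        }
    }

    /** Moves every queued update to the peer, preserving order. */
    private void flush(Subscription s) {
        while (!s.pending.isEmpty()) {
            s.delivered.add(s.pending.poll());
        }
    }

    List<String> deliveredTo(String peerId) {
        return subscriptions.get(peerId).delivered;
    }
}
```

A real deployment would have to persist the pending queue and acknowledge deliveries, since an in-memory queue does not survive a crash of the sending peer; that is part of the consistency work mentioned above.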

Other

We also modified the replication mechanism to provide a synchronous interface to remote peers. For the time being, a basic API is provided (add/remove/delete/query/batch), but it will be extended as needed.
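One common way to layer a synchronous call on top of asynchronous messaging is to tag each request with an id, park the caller on a future, and complete that future when the matching reply arrives. The sketch below shows that pattern using only the Java standard library. It is an assumption about how such an interface could be built, not a description of the HGDB code; the class name, the string payloads, and the transport callback are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BiConsumer;

/** Illustrative only: a blocking call built on an asynchronous transport. */
final class SyncRemoteCallSketch {

    private final Map<Long, CompletableFuture<String>> inFlight = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();
    // Hypothetical asynchronous send: (requestId, payload) -> fire and forget.
    private final BiConsumer<Long, String> transport;

    SyncRemoteCallSketch(BiConsumer<Long, String> transport) {
        this.transport = transport;
    }

    /** Invoked by the transport layer when a reply message arrives. */
    void onReply(long requestId, String payload) {
        CompletableFuture<String> waiting = inFlight.remove(requestId);
        if (waiting != null) {
            waiting.complete(payload);
        }
    }

    /** Sends a request and blocks until the reply arrives or the timeout expires. */
    String call(String request, long timeoutMillis) {
        long id = nextId.incrementAndGet();
        CompletableFuture<String> reply = new CompletableFuture<>();
        inFlight.put(id, reply);          // register before sending to avoid a race
        try {
            transport.accept(id, request);
            return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for reply", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("remote call failed: " + request, e);
        } finally {
            inFlight.remove(id);          // drop the entry on timeout or failure
        }
    }
}
```

Operations such as add or query could then be expressed as thin wrappers around `call`, each encoding its arguments into the request payload.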