Using a Backing Store

From OpenCog
Jump to: navigation, search

OpenCog uses the opensource Postgres database as a datastore for backing up the AtomSpace

Caution: This wiki page is a bit stale and out of date. The ODBC drivers are deprecated; they are slow. The native Postgres drivers are smaller, faster, and much much easier to set up. Note also that some of the performance-tuning advice given below can result in data loss if you loose electric power while writing to the database. Uninterruptible power supplies are recommended, as are SSD disks for performance.

Installation

Note, since 2015, these instructions have been updated to suit Postgres 9.3 and Ubuntu 14.04

Backing Store Setup

There are four basic steps needed to setup the database: 1) installation of the database engine, 2) device driver setup, 3) user and password setup, and 4) table initialization. Each step discussed below.


On Ubuntu the Postgres database is not installed by default, nor is it installed during the opencog install.

Preliminary checks

Is postgres already installed on your system? Type: which -a psql. Did this command return a path? if so, it's likely to be installed.

If postgres is already installed, check which version is installed. Based on the path found as output to the last command, type this /usr/bin/psql --version - the output should be something like: psql (PostgreSQL) 9.3.15

Is libpg-dev installed yet?

Note, if libpg-dev is not yet compiled into the atomspace build, you need to do so.

Download and install the libpq-dev packages, which provide the C-language bindings to the Postgres client library.

sudo apt-get install libpq-dev
cmake ..
make

A positive confirmation that it was included in the atomspace build is in the output under the 'The following components will be built:', you see 'SQL Postgres bindings - Save/Restore of AtomSpace to Postgres database.'


Note before using apt-get to install postgres, might as well update local apt repositories by entering: sudo apt-get update

Database Install

Download and install Postgres version 9.3 or newer. The current design simply won't work with MySQL, because of a lack of array support. Same holds true for SQLite. Sorry. There is some work in the code-base to support these other databases, but the work-arounds for the missing features are kind-of complicated, and likely to be slow.

Be sure to install the postgres server, the postgres client, and the odbc-postgresql device driver. At the time of writing the current version of postgres was 9.6.1

Note, do not install postgres-xc as the ubuntu bash commandline may suggest after you type postgres before you install it
Note, to find the latest info on the supported version of postgres on ubuntu: apt-cache show postgresql
Note, to find the dependancies of postgres on ubuntu: apt-cache showpkg postgresql

On the bash commandline, enter :

sudo apt-get install postgresql postgresql-contrib

Note if you want to make sure it is postgres 9.3 you are installing try : sudo apt-get install postgresql-9.3

(note that postgresql-contrib may not be required, but the ubuntu postgress install documentation suggests it)

(note that the gid md file suggests installation of a particular version of postgres (9.3) or later, though I'm assuming this is an artifact of the time the md file was written)

For ease of postgres admin you may want to install pgadmin3 as such: sudo apt-get install pgadmin3, but this is not required.

Device Driver Setup

Configure the ODBC driver. After install, verify that /etc/odbcinst.ini contains the stanza below (or something similar). If it is missing, then edit this file (as root) and add the stanza. Notice that it uses the Unicode drivers, and NOT the ANSI (ASCII) drivers. Opencog uses unicode!

    sudo vi /etc/odbcinst.ini &


   [PostgreSQL Unicode]
   Description = PostgreSQL ODBC driver (Unicode version)
   Driver      = psqlodbcw.so
   Setup       = libodbcpsqlS.so
   Debug       = 0
   CommLog     = 0

(Note in my (adam's) install, by default after installing Postres, odbcinst.ini revealed that the ANSI and Unicode version of the drivers were installed

The above stanza associates the name PostgreSQL Unicode with a particular driver. This name is needed for later steps. Notice that this is a quite long name, with spaces! You can change the name, (e.g. to shorten it) if you wish, however, it MUST be consistent with the name given in the .odbc.ini file (explained below in 'ODBC Setup, Part the Second').

MySQL users need the stanza below; the /etc/odbcinst.ini file can safely contain multiple stanzas defining other drivers.

WARNING: MySQL is currently not supported. Anyway, if you wanted to mess with it, then add the below:

   [MySQL]
   Description = MySQL driver
   Driver      = libmyodbc.so
   Setup       = libodbcmyS.so
   CPTimeout   =
   CPReuse     =

Database Performance Tweaks

The Postgres default configuration can be/should be tweaked for performance. The newest version of Postgres seem to be OK (v9.3) but in some cases, performance will be a disaster if the database is not tuned.

Lets begin by editing postgresql.conf (a typical location is /etc/postgresql/9.3/main/postgresql.conf) and make the changes below. The first two changes are recommended on the Postgresql wiki

  shared_buffers = default was 32MB, change to 25% of install RAM
  work_mem = default was 1MB change to 32MB
  effective_cache_size = default was 128MB, change to 50%-75% of installed RAM
  fsync = default on  change to off
  synchronous_commit = default on change to off
  wal_buffers = default 64kB change to 2MB or even 32MB
  commit_delay = default 0 change to 10000 (10K) microseconds
  checkpoint_segments = 32 (each one takes up 16MB disk space)
  ssl = off
  autovacuum = on
  track_counts = on

Note, to find the amount of ram you have, enter: vmstat -s -SM | grep 'total memory'

Once the changes to the postgresql.conf file have been made, restart postgres so they take effect, and to see if there were no errors in your changes: sudo /etc/init.d/postgresql restart <- if there were errors, correct them, and then restart again.
Note, a common mistake is to add ram size in lowercase i.e. mb or gb (Valid units for this parameter are "kB", "MB", and "GB".)

Explanation: A large value for wal_buffers is needed because much of the database traffic consists of updates. Enabling vacuum is very important, for the same reason; performance degrades substantially (by factors of 3x-10x) without regular vacuuming. (Newer versions of postrgres vacuum automatically (your mileage may vary).

Restarting the server might lead to errors stating that max shared mem usage has been exceeded. This can be fixed by telling the kernel to use 6.4 gigabytes (for example):

  vi /etc/sysctl.conf
  kernel.shmmax = 6440100100

save file contents, then:

  sysctl -p /etc/sysctl.conf

Note, if you continue to get errors after that, read this post on 'postgresql shared memory settings' at stackoverflow for help fixing them.

User & Database Setup

A database user needs to be created; the database tables need to be created.

Note, the database user is NOT the same thing as a unix user: the user login is for the database, not the OS. Do NOT use the same login and password!

Multiple databases can be created. In this example, the database name will be "mycogdata". Change this as desired.

So, at the Unix command line: createdb mycogdata

If you get an error message 'createdb: could not connect to database template1: FATAL: role "<USER>" does not exist', then try doing this, replacing '<USER>' with your username.

At the commandline, enter: sudo su - postgres

The look of your commandline prompt should change to postgres@<MACHINENAME>:~$ or something like that.

At the commandline, enter: psql template1 Then enter:

  template1=# CREATE ROLE <USER> WITH SUPERUSER;
  template1=# ALTER ROLE <USER> WITH LOGIN;

Verify that worked out by typing \dg to see:

                             List of roles
 Role name |                   Attributes                   | Member of
-----------+------------------------------------------------+-----------
 <USER>    | Superuser                                      | {}
 postgres  | Superuser, Create role, Create DB, Replication | {}

Then do Ctrl+D to exit the 'template1=#' prompt, ignoring any message about psql_history

Then return to your own account (exiting the postgres commandline prompt) by typing: exit

If you ran into the error above you still need to create the database of course (no output if succesful): createdb mycogdata

Next, create a database user named opencog_user with password cheese. You can pick a different username and password, but it must be consistent with the ~/.odbc.ini file. Do NOT use your unix login password! Pick something else!

Create the user at the shell prompt: psql -c "CREATE USER opencog_user WITH PASSWORD 'cheese'" -d mycogdata

Check that the above worked, by manually logging in: psql mycogdata -U opencog_user -W -h localhost

If you can't login, something up above failed.

Navigate to the atomspace folder you cloned from GitHub:

cd ~/atomspace or cd ~/opencog/atomspace/ or something like that...

So we can create the database tables: cat opencog/persist/sql/multi-driver/atom.sql | psql mycogdata -U opencog_user -W -h localhost .. and enter password (remember your password?)


Verify that the tables were created. Login as before: psql mycogdata -U opencog_user -W -h localhost .. and enter password..

To list the tables, enter \d at the postgres prompt (mycogdata=>). You should see this:

             List of relations
 Schema |   Name    | Type  |    Owner
--------+-----------+-------+--------------
 public | atoms     | table | opencog_user
 public | edges     | table | opencog_user
 public | global    | table | opencog_user
 public | spaces    | table | opencog_user
 public | typecodes | table | opencog_user
(5 rows)


If the above doesn't work, go back, and try again.

Verify that opencog_user has write permissions to the tables. Do this by entering: INSERT INTO TypeCodes (type, typename) VALUES (97, 'SemanticRelationNode');

You should see the appropriate response:

INSERT 0 1

If the above doesn't work, go back, and try again.

Exit the postgres prompt (mycogdata=>) with Ctrl+D.

ODBC Setup

Edit ~/.odbc.ini in your home directory to add a stanza of the form below. You MUST create one of these for EACH repository you plan to use! The name of the stanza, and of the database, can be whatever you wish. The name given for Driver, here PostgreSQL Unicode, must match a stanza name in /etc/odbcinst.ini. Failure to have it match will cause an error message:

Can't perform SQLConnect rc=-1(0) [unixODBC][Driver Manager]Data source name not found, and no default driver specified


Note that below specifies a username and a password. These should NOT be your regular unix username or password! Make up something else, something completely different! These two will be the username and the password that you use when connecting from the cogserver, with the sql-open command (below).

Pay special attention to the name given for the Database. This should correspond to a database created with postgres. In the examples above, it was mycogdata.

IMPORTANT: MAKE SURE THERE ARE NO SPACES AT THE START OF EVERY LINE! IF YOU COPY PASTE THE TEXT BELOW YOU WILL HAVE SPACES! REMOVE THEM! OR YOU WILL ALSO GET THE ERROR ABOVE!

[mycogdata]
Description       = My Favorite Database
Driver            = PostgreSQL Unicode
CommLog           = No
Database          = mycogdata
Servername        = localhost
Port              = 5432
Username          = opencog_user
Password          = cheese
ReadOnly          = No
RowVersioning     = No
ShowSystemTables  = Yes
ShowOidColumn     = Yes
FakeOidIndex      = Yes
ConnSettings      =

Opencog Setup

Edit ~/opencog/build/lib/opencog.conf and set the STORAGE, STORAGE_USERNAME and STORAGE_PASSWD there to the same values as in ~/.odbc.ini.

STORAGE               = "mycogdata"
STORAGE_USERNAME      = "opencog_user"
STORAGE_PASSWD        = "cheese"

Or copy lib/opencog.conf to your build directory, edit the copy.. and start the opencog server from your build folder as: $ ./opencog/server/cogserver -c my.conf

Verify that everything works. Start the cogserver:

, and bulk-save. (Actually this didn't work for me, I had to do sql-open before sql-store worked, as shown below, even though my opencog.conf was correct and when using a custom my.conf file.)

Enter the cogserver from the unix commandline: telnet localhost 17001 Run Commands from the 'opencog>' prompt:

   opencog> sql-store

Some output should be printed in the COGSERVER window:

Max UUID is 57
Finished storing 55 atoms total.

The typical number of atoms stored will differ from this.

You don't have to put the database credentials into the opencog.conf file. In this case, the database would need to be opened manually, using the sql-open command:

   $ telnet localhost 17001
   `opencog> sql-open mycogdata opencog_user cheese

Notice that the user-name and the password are the same as that given in the ~/.odbc.ini file. These are NOT (and should not be) your unix username and password!

NOTE: the telnet cogserver seems broken!!! (2017-01-13) Will try to move forward using the guile shell

Ok enter the scheme shell from the unix prompt by typing:guile

Load the persist-sql module - type: (use-modules (opencog) (opencog persist-sql))

Open the database - type: (sql-open "mycogdata" "opencog_user" "cheese")

Test to see if sql-store works - type: (sql-store)

Close the database - type:(sql-close)

ANOTHER NOTE: on 20 June 2017, from the opencog prompt needed to type 'sql-open localhost:5432/mycogdata <USER> <PASSWD>' Ignored the note above.

Concept

  • Example of saving stuff to BackingStore (using ConceptNet?)
  • Example of loading stuff from BackingStore (using ConceptNet?)

Give same examples with both postgres and Neo4j backing store

Appendix

Check out documentaiton on GitHub (recently updated in 2017): SQL Persist (long version): https://github.com/opencog/atomspace/blob/master/opencog/persist/sql/README.md (short version): https://github.com/opencog/atomspace/blob/master/tests/persist/sql/README.md


UnixODBC Drivers

If you need the UnixODBC drivers, or are just interested in using them - check if they are installed already. If not at the bash commandline, enter sudo apt-get install unixodbc-dev Note: For the most part, ODBC has been deprecated (ODBC has been a thorn, and it also has poor performance) - the extra complexity required by ODBC leads to additional mistakes, which were hard to debug.

Notes

Maintained by : Adam Ford, Mandeep, Linus Priority: High

Neo4J

Also see notes on project to see neo4J as a backing store: http://wiki.opencog.org/w/Neo4j_Backing_Store
See the main website for neo4J: https://neo4j.com/


ToDo: There may be a wiki page on this - backup to postgres... (search)