PostgreSQL

From OpenCog

The AtomSpace can communicate with databases and other remote AtomSpaces using the StorageNode and the API's described in Distributed AtomSpace.

This wiki page describes how to configure a PostgreSQL database so that you can use the PostgresStorageNode to save/restore AtomSpace contents to it.

An up-to-date tutorial, but without the install and config steps, can be found at examples/atomspace/persistence.scm in the AtomSpace github examples directory.

Caution: This page is slightly stale, and some of the notes below might not be quite right. Some of the performance-tuning advice given below can result in data loss if you loose electric power while writing to the database. Uninterruptible power supplies are recommended, as are SSD disks for performance.

Also: Please use RocksStorageNode if you want high performance. Use Postgres only if you feel you need to. The Rocks storage node is 4x to 10x faster than the Postgres node. It is also much easier to use: there is approximately zero configuration that needs to be done, when using RocksDB! Hooray!

Installation

Note, since 2015, these instructions have been updated to suit Postgres 9.3 and Ubuntu 14.04

Backing Store Setup

There are four basic steps needed to setup the database: 1) installation of the database engine, 2) device driver setup, 3) user and password setup, and 4) table initialization. Each step discussed below.

On Ubuntu the Postgres database is not installed by default, nor is it installed during the opencog install.

Preliminary checks

Is postgres already installed on your system? Type: which -a psql. Did this command return a path? if so, it's likely to be installed.

If postgres is already installed, check which version is installed. Based on the path found as output to the last command, type this /usr/bin/psql --version - the output should be something like: psql (PostgreSQL) 9.3.15

Is libpg-dev installed yet?

Note, if libpg-dev is not yet compiled into the atomspace build, you need to do so.

Download and install the libpq-dev packages, which provide the C-language bindings to the Postgres client library.

sudo apt-get install libpq-dev
cmake ..
make

A positive confirmation that it was included in the atomspace build is in the output under the 'The following components will be built:', you see 'SQL Postgres bindings - Save/Restore of AtomSpace to Postgres database.'

Note before using apt-get to install postgres, might as well update local apt repositories by entering: sudo apt-get update

Database Install

Download and install Postgres version 9.3 or newer. The current design simply won't work with MySQL, because of a lack of array support. Same holds true for SQLite. Sorry. There is some work in the code-base to support these other databases, but the work-arounds for the missing features are kind-of complicated, and likely to be slow.

Be sure to install the postgres server, the postgres client, and the odbc-postgresql device driver. At the time of writing the current version of postgres was 9.6.1

Note, do not install postgres-xc as the ubuntu bash commandline may suggest after you type postgres before you install it
Note, to find the latest info on the supported version of postgres on ubuntu: apt-cache show postgresql
Note, to find the dependancies of postgres on ubuntu: apt-cache showpkg postgresql

On the bash commandline, enter :

sudo apt-get install postgresql postgresql-contrib

Note if you want to make sure it is postgres 9.3 you are installing try : sudo apt-get install postgresql-9.3

(note that postgresql-contrib may not be required, but the ubuntu postgress install documentation suggests it)

(note that the gid md file suggests installation of a particular version of postgres (9.3) or later, though I'm assuming this is an artifact of the time the md file was written)

For ease of postgres admin you may want to install pgadmin3 as such: sudo apt-get install pgadmin3, but this is not required.

Database Performance Tweaks

The Postgres default configuration can be/should be tweaked for performance. The newest version of Postgres seem to be OK (v9.3) but in some cases, performance will be a disaster if the database is not tuned.

Lets begin by editing postgresql.conf (a typical location is /etc/postgresql/9.3/main/postgresql.conf) and make the changes below. The first two changes are recommended on the Postgresql wiki

  shared_buffers = default was 32MB, change to 25% of install RAM
  work_mem = default was 1MB change to 32MB
  effective_cache_size = default was 128MB, change to 50%-75% of installed RAM
  fsync = default on  change to off
  synchronous_commit = default on change to off
  wal_buffers = default 64kB change to 2MB or even 32MB
  commit_delay = default 0 change to 10000 (10K) microseconds
  checkpoint_segments = 32 (each one takes up 16MB disk space)
  ssl = off
  autovacuum = on
  track_counts = on

Note, to find the amount of ram you have, enter: vmstat -s -SM | grep 'total memory'

Once the changes to the postgresql.conf file have been made, restart postgres so they take effect, and to see if there were no errors in your changes: sudo /etc/init.d/postgresql restart <- if there were errors, correct them, and then restart again.
Note, a common mistake is to add ram size in lowercase i.e. mb or gb (Valid units for this parameter are "kB", "MB", and "GB".)

Explanation: A large value for wal_buffers is needed because much of the database traffic consists of updates. Enabling vacuum is very important, for the same reason; performance degrades substantially (by factors of 3x-10x) without regular vacuuming. (Newer versions of postrgres vacuum automatically (your mileage may vary).

Restarting the server might lead to errors stating that max shared mem usage has been exceeded. This can be fixed by telling the kernel to use 6.4 gigabytes (for example):

  vi /etc/sysctl.conf
  kernel.shmmax = 6440100100

save file contents, then:

  sysctl -p /etc/sysctl.conf

Note, if you continue to get errors after that, read this post on 'postgresql shared memory settings' at stackoverflow for help fixing them.

User & Database Setup

A database user needs to be created; the database tables need to be created.

Note, the database user is NOT the same thing as a unix user: the user login is for the database, not the OS. Do NOT use the same login and password!

Multiple databases can be created. In this example, the database name will be "mycogdata". Change this as desired.

So, at the Unix command line: createdb mycogdata

If you get an error message 'createdb: could not connect to database template1: FATAL: role "<USER>" does not exist', then try doing this, replacing '<USER>' with your username.

At the commandline, enter: sudo su - postgres

The look of your commandline prompt should change to postgres@<MACHINENAME>:~$ or something like that.

At the commandline, enter: psql template1 Then enter:

  template1=# CREATE ROLE <USER> WITH SUPERUSER;
  template1=# ALTER ROLE <USER> WITH LOGIN;

Verify that worked out by typing \dg to see:

                             List of roles
 Role name |                   Attributes                   | Member of
-----------+------------------------------------------------+-----------
 <USER>    | Superuser                                      | {}
 postgres  | Superuser, Create role, Create DB, Replication | {}

Then do Ctrl+D to exit the 'template1=#' prompt, ignoring any message about psql_history

Then return to your own account (exiting the postgres commandline prompt) by typing: exit

If you ran into the error above you still need to create the database of course (no output if succesful): createdb mycogdata

Next, create a database user named opencog_user with password cheese. You can pick a different username and password, but it must be consistent with the ~/.odbc.ini file. Do NOT use your unix login password! Pick something else!

Create the user at the shell prompt: psql -c "CREATE USER opencog_user WITH PASSWORD 'cheese'" -d mycogdata

Check that the above worked, by manually logging in: psql mycogdata -U opencog_user -W -h localhost

If you can't login, something up above failed.

Navigate to the atomspace folder you cloned from GitHub:

cd ~/atomspace or cd ~/opencog/atomspace/ or something like that...

So we can create the database tables: cat opencog/persist/sql/multi-driver/atom.sql | psql mycogdata -U opencog_user -W -h localhost .. and enter password (remember your password?)

Verify that the tables were created. Login as before: psql mycogdata -U opencog_user -W -h localhost .. and enter password..

To list the tables, enter \d at the postgres prompt (mycogdata=>). You should see this:

             List of relations
 Schema |   Name    | Type  |    Owner
--------+-----------+-------+--------------
 public | atoms     | table | opencog_user
 public | global    | table | opencog_user
 public | spaces    | table | opencog_user
 public | typecodes | table | opencog_user
(4 rows)

If the above doesn't work, go back, and try again.

Verify that opencog_user has write permissions to the tables. Do this by entering: INSERT INTO TypeCodes (type, typename) VALUES (97, 'SemanticRelationNode');

You should see the appropriate response:

INSERT 0 1

If the above doesn't work, go back, and try again.

Exit the postgres prompt (mycogdata=>) with Ctrl+D.

Opencog Setup

Edit ~/opencog/build/lib/opencog.conf and set the STORAGE, STORAGE_USERNAME and STORAGE_PASSWD there to the same values as in ~/.odbc.ini.

STORAGE               = "mycogdata"
STORAGE_USERNAME      = "opencog_user"
STORAGE_PASSWD        = "cheese"

Or copy lib/opencog.conf to your build directory, edit the copy.. and start the opencog server from your build folder as: $ ./opencog/server/cogserver -c my.conf

Verify that everything works. Start the cogserver:

, and bulk-save. (Actually this didn't work for me, I had to do sql-open before sql-store worked, as shown below, even though my opencog.conf was correct and when using a custom my.conf file.)

Enter the cogserver from the unix commandline: telnet localhost 17001 Run Commands from the 'opencog>' prompt:

   opencog> sql-store

Some output should be printed in the COGSERVER window:

Max UUID is 57
Finished storing 55 atoms total.

The typical number of atoms stored will differ from this.

You don't have to put the database credentials into the opencog.conf file. In this case, the database would need to be opened manually, using the sql-open command:

   $ telnet localhost 17001
   `opencog> sql-open mycogdata opencog_user cheese

Notice that the user-name and the password are the same as that given in the ~/.odbc.ini file. These are NOT (and should not be) your unix username and password!

NOTE: the telnet cogserver seems broken!!! (2017-01-13) Will try to move forward using the guile shell

Ok enter the scheme shell from the unix prompt by typing:guile

Load the persist-sql module - type: (use-modules (opencog) (opencog persist-sql))

Open the database - type: (sql-open "mycogdata" "opencog_user" "cheese")

Test to see if sql-store works - type: (sql-store)

Close the database - type:(sql-close)

ANOTHER NOTE: on 20 June 2017, from the opencog prompt needed to type 'sql-open localhost:5432/mycogdata <USER> <PASSWD>' Ignored the note above.

Appendix

Check out documentation on GitHub (recently updated in 2017): SQL Persist (long version): https://github.com/opencog/atomspace/blob/master/opencog/persist/sql/README.md (short version): https://github.com/opencog/atomspace/blob/master/tests/persist/sql/README.md

UnixODBC Drivers

If you need the UnixODBC drivers, or are just interested in using them - check if they are installed already. If not at the bash commandline, enter sudo apt-get install unixodbc-dev Note: For the most part, ODBC has been deprecated (ODBC has been a thorn, and it also has poor performance) - the extra complexity required by ODBC leads to additional mistakes, which were hard to debug.

Notes

Maintained by : Adam Ford, Mandeep, Linus Priority: High