A Cognitive API

From OpenCog

The goal here is to sketch some general ideas regarding an API that might sensibly be used for a “cognitive application toolkit” aimed at enabling software developers to create AI applications with AGI or proto-AGI functionality.

For a developer whose focus is the creation or improvement of AI algorithms, the appropriate interface for an AI system is one that is extremely general and flexible, providing the maximum possible latitude in experimenting with new ideas and increasing the intelligence, scope and/or efficiency of existing code. (For example: While rough around the edges in various places, OpenCog’s current Scheme shell is a reasonable first approximation of an interface of this sort.)

On the other hand, for a developer whose focus is the creation of AI-based application systems, a different sort of interface is appropriate – an Application Programming Interface or API that supplies a limited but powerful set of application functionalities, in a manner providing:

  • Simplicity of access to the AI functionalities commonly required by application developers
  • For each functionality, either a reliable level of intelligence, or a reliable confidence level associated with each output (indicating the system’s assessment of its own confidence in the intelligence of its output in a given instance)
  • Robustness as a software system, when carrying out the specific application functionalities directly accessible by the API
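The second requirement above suggests that every API call should return its result together with the system's self-assessed confidence. A minimal sketch of such a response envelope in Python (the class and field names are illustrative assumptions, not part of any existing OpenCog interface):

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class CognitiveResponse:
    """Result of a Cognitive API call, paired with the system's
    self-assessed confidence in that result (a number in [0, 1])."""
    result: Any
    confidence: float

def with_confidence(result, confidence):
    # Clamp to [0, 1] so callers can always rely on the range.
    return CognitiveResponse(result, max(0.0, min(1.0, confidence)))

r = with_confidence(["pick_up_cup"], 0.82)
```

An application can then choose to act on high-confidence outputs directly and fall back to asking the user (or another subsystem) for low-confidence ones.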

The initial author of this page (Ben Goertzel) has little experience or knowledge of the particulars of API design, so the details on this page should not be taken too seriously. They should be taken as an attempt to fairly precisely specify the content, but not the form, of a potential Cognitive API.

See also: REST API

Example Use Cases

Some use cases kept in mind while formulating this very rough API sketch were:

  • A personal assistant app on a smartphone or computer (like Siri or CALO, but smarter)
  • A natural language search engine (or dialogue system) against a multimedia database
  • A scientist's assistant that helps explore databases and datasets and find new conclusions and hypotheses
  • The cognitive and perceptual (but not motoric) aspects of a toy robot (for movement one could use e.g. ROS; integrating an OpenCog cognitive API with ROS is a separate topic)


The draft API as sketched here focuses more on cognition than on low-level perceptual or motor processing, though it does include the latter in some simple ways.

As a first sketch, it also probably has holes relative to what the above examples would require.

Representation Language

A Cognitive API will need to be associated with a flexible knowledge representation language.

For instance, if a cognitive API were to be realized using OpenCog, then the representation language would be the language of the AtomSpace, e.g. as realized via the Scheme representation now commonly used to load Atoms into the AtomSpace.

In general, one wants a KR (Knowledge Representation) language that is highly general, and is reasonably easily both human and machine readable.

The bulk of the API outlined here constitutes a set of queries to be made of a cognitive system. However, the query API assumes the existence of a KR that can straightforwardly be used to provide descriptions of the following entities:

  • Actions (that Agents can take)
  • Agents, e.g. people, AIs or corporations
  • Bodies (e.g. for a robot or game character) and their instantaneous Positions
  • Categories (sets of other Entities or Expressions)
  • Communication media (e.g. speech, text, video, image)
  • Communications, in relevant media; e.g. a communication might be
    • A chunk of text
    • Some speech, or another sound file
    • An image
    • A video
  • Constraints
  • Datasets (including perceptual datasets from sensors, or datasets from other sources provided for analysis)
  • Expressions, e.g. statements or descriptions in an unambiguous formal language associated with the KR (natural language expressions are included under the category of Communications rather than Expressions)
  • Events
  • Goals; where a goal may be specified as one of
    • A predicate, i.e. a function whose input is a Situation and whose output is a fuzzy truth value in [0,1]
    • A target Situation, in which the implicit “goal function” is to create a situation as similar as possible to the target
  • Maps (geospatial, with annotations); these may be crisp or (as e.g. in SLAM for robotics) probabilistic
  • Movements (for some set of actuators, e.g. a robot body); a special case of Actions
  • Objects (in the physical object sense)
  • Patterns (logical expressions representing combinations of Entities, including expressions bearing variables representing abstract patterns)
  • Situations (a situation being a set of events, objects and agents)

The term Entity will be used below to encompass all of the above except Expressions and Patterns.
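To illustrate the two ways of specifying a Goal listed above, here is a hedged Python sketch; the set-based Situation model and the Jaccard similarity measure are stand-ins invented for illustration, not a prescribed representation:

```python
# A Goal may be (a) a predicate mapping a Situation to a fuzzy truth
# value in [0, 1], or (b) a target Situation, where the implicit goal
# function rewards similarity to that target.

def similarity(s1: set, s2: set) -> float:
    """Jaccard similarity between two situations modeled as sets of
    events/objects/agents -- a placeholder measure for illustration."""
    if not s1 and not s2:
        return 1.0
    return len(s1 & s2) / len(s1 | s2)

def goal_from_predicate(pred):
    return pred  # already a Situation -> [0, 1] function

def goal_from_target(target: set):
    # Implicit goal function: how similar is the current situation
    # to the target situation?
    return lambda situation: similarity(situation, target)

g = goal_from_target({"door_open", "robot_inside"})
```

Either form yields a callable from Situations to [0, 1], so downstream planning queries need not care which way the goal was specified.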


Given a reasonably flexible KR language, any of the above entities can be described in a huge variety of ways. This is not necessarily a problem, but the cognitive system will have a better chance of understanding a user's queries if the user expresses their knowledge in terms of standard ontologies wherever relevant and feasible.

Generally valuable ontologies in this context are, for example, WordNet, ConceptNet, YAGO, and DBpedia. These provide a degree of standardization of common concepts that span various data types and sources.

Loading of Data

The queries described below are, basically, requests that refer to a specific AtomSpace that has already been instantiated and populated.

Additionally, it may also be useful to have requests to load certain data into an AtomSpace.

Then one could have patterns of usage such as

  • query an existing AtomSpace known to be populated with certain types of information already
  • load in some data to an AtomSpace, and then submit some queries based on that data
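These two usage patterns might look as follows with a hypothetical Python client; the CognitiveClient class and its methods are invented for illustration and do not correspond to an existing OpenCog binding:

```python
class CognitiveClient:
    """Toy in-memory stand-in for a Cognitive API client backed by an
    AtomSpace-like store; all names here are illustrative assumptions."""

    def __init__(self):
        self.atomspace = []

    def load(self, expressions):
        # Pattern 2, step 1: load data into the AtomSpace first.
        self.atomspace.extend(expressions)
        return len(expressions)

    def query(self, predicate):
        # Pattern 1: query an AtomSpace already populated with data.
        return [e for e in self.atomspace if predicate(e)]

client = CognitiveClient()
client.load(["(Inheritance cat animal)", "(Inheritance dog animal)"])
cats = client.query(lambda e: "cat" in e)
```

In a real system the load step would of course parse the expressions into Atoms rather than store raw strings.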

Response Formatting

A good question, posed by Cosmo Harrigan, is "should the response contain the recursive information necessary to be usable in a standalone manner, or would it generally require additional requests to the API to assemble related information about the entities contained in it?"

Perhaps that would be an option that could be set as part of each request? It seems there are use cases for either option...
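One way to realize this as a per-request option is a flag such as resolve_references (the name, the handle-keyed store, and the field layout below are all invented for illustration): when true, the response inlines related entities so it is usable standalone; when false, it returns opaque handles to be fetched by follow-up requests.

```python
# Toy knowledge store keyed by entity handle; names are illustrative.
STORE = {
    "e1": {"name": "Alice", "knows": ["e2"]},
    "e2": {"name": "Bob", "knows": []},
}

def get_entity(handle, resolve_references=False):
    """Return an entity description; optionally expand related entities
    recursively so the response is usable in a standalone manner."""
    entity = dict(STORE[handle])
    if resolve_references:
        entity["knows"] = [get_entity(h, True) for h in entity["knows"]]
    return entity

shallow = get_entity("e1")                        # handles only
deep = get_entity("e1", resolve_references=True)  # fully inlined
```

A production design would also want a depth limit and cycle detection, but the flag itself covers both use cases mentioned above.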


The Queries

Now we will describe a set of queries that, we suggest, a cognitive API should support. The set of queries is designed to cover all the key aspects of human-like intelligence. It is assumed that any query can be made contextual, i.e. contextualized to an agent A, a situation S, or constraints C. In the following, each query is described verbally and then in a simple query language. The query language used for presentation here is simply standard function-argument notation, with * prepended to optional arguments. Of course, an actual query language implementing the suggested API may be syntactically different from what is given here.
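To make the notation concrete: in a Python realization, a *-prefixed optional argument maps naturally to a keyword argument with a default, and contextualization can be handled the same way. The function below is a mock written purely to illustrate the calling convention; its return values are invented:

```python
def get_goals(agent, situation, context=None):
    """Mock of: Goal-List GetGoals(Agent A, Situation S).
    The optional context keyword shows how any query could be
    contextualized to an agent, a situation, or constraints."""
    goals = ["stay_safe", "find_food"]  # placeholder inference result
    if context == "night":
        goals.append("find_shelter")
    return goals

goals = get_goals("agent_a", "situation_s")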


Movement and Tool Use

Given a body description, generate the movements needed for body B, starting at position P, to achieve goal G, subject to constraints C

  Movement-List GetMovementScript(Body B, InitialPosition P, Goal G, *Constraints C)

How can tool/object T be used (to achieve goal G)?

  Action-List GetUsageScript(Object T, Goal G, *Agent A, *Constraints C)

Agent Modeling

What does agent A know about situation S?

  Expression-List GetKnowledge(Agent A, Situation S)

What are agent A's goals in situation S?

  Goal-List GetGoals(Agent A, Situation S)
  Goal-List GetSubgoals(Agent A, Goal G, Situation S)

What might agent A do in situation S?

  Action-List PredictActions(Agent A, Situation S)

What are agent A's beliefs / emotions in situation S?

  Expression-List GetBeliefs(Agent A, Situation S)
  Expression-List GetEmotions(Agent A, Situation S)

What are agent A's beliefs / emotions about X in situation S?

  Expression-List GetBeliefsAbout(Agent A, Entity E, Situation S)
  Expression-List GetEmotionsAbout(Agent A, Entity E, Situation S)

What should agent A know, but does not, that would be highly useful for achieving their goals in situation S?

  Expression-List GetMissingKnowledge(Agent A, *Goal G, Situation S)

What are the recurrent behavioral patterns of agent A?

  Expression-List GetBehaviorPatterns(Agent A, Situation S)

How is agent A interacting with agent B?

  Expression-List GetInteractions(Agent A, Agent B, *Situation S)

How does agent A want to be interacted with?

  Expression-List GetDesiredInteractions(Agent A, *Situation S)


Attention

Focus attention on X

  DirectAttention(Entity X, *Time T)

What is in the attention of agent A?

  List GetAttention(Agent A, *Time T)


Construction

Of what parts might object O be constructed?

  Object-List GetParts(Object O)

By what process might object O be constructed (out of parts P)?

  Expression GetProcessToBuild(Object O, *Object-List P)


Communication

How to communicate X to agent A (or to an agent of type T) using medium M

  Communication Express(Expression X, Agent A, Medium M)
  Communication Express(Entity E, Agent A, Medium M)

How to interpret communication X received from agent A (or from an agent of type T) via medium M

  Expression-List Interpret(Communication X, *Agent A)

Provide an answer to a question (a Communication X) posed by agent A

  Answer(Communication X, *Agent A, *Situation S)

Contextual Understanding

Relevant aspects of the context of agent A and/or situation S

  GetContext(Agent A, *Time T)
  GetContext(Situation S, *Time T)

How might situation S most likely change if the context changes? How might a context change most likely affect agent A?


Emotion

Find emotions associated with X

  GetAssociatedEmotions(Entity X, *Situation S, *Time T)

Guess the emotions experienced by agent A (about entity X) in situation S

  GetAgentsAssociatedEmotions(Entity X, Agent A, *Situation S, *Time T)

Interaction Mode

Find content in medium M corresponding to C (which is in some other medium)

  GetCorrespondingContent(Communication C, Medium M)

e.g. the sound associated with an object, the object associated with a sound, etc.


Pattern Mining

What patterns distinguish categories C1, C2, C3, ...?

  Expression-List GetDistinguishingPatterns(Category-List C)

What patterns of temporal, spatial or spatiotemporal change exist in data X?

  Expression-List GetTemporalPatterns(Situation-List S)
  Expression-List GetTemporalPatterns(Dataset D)
  Expression-List GetSpatialPatterns(Situation-List S)
  Expression-List GetSpatialPatterns(Dataset D)
  Expression-List GetSpatioTemporalPatterns(Situation-List S)
  Expression-List GetSpatioTemporalPatterns(Dataset D)

What surprising or frequent patterns exist in data X?

  Expression-List GetSurprisingPatterns(Situation-List S)
  Expression-List GetSurprisingPatterns(Dataset D)
  Expression-List GetFrequentPatterns(Situation-List S)
  Expression-List GetFrequentPatterns(Dataset D)
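At its simplest, a frequent-pattern query could be realized by counting recurring sub-descriptions across the given situations. A toy sketch, assuming situations are modeled as sets and using an invented min_count threshold (real pattern mining over an AtomSpace would of course be far more general):

```python
from collections import Counter

def get_frequent_patterns(situations, min_count=2):
    """Return the elements (toy stand-ins for patterns) that recur in
    at least min_count of the given situations."""
    counts = Counter()
    for situation in situations:
        # Count each pattern at most once per situation.
        counts.update(set(situation))
    return sorted(p for p, c in counts.items() if c >= min_count)

patterns = get_frequent_patterns([
    {"rain", "umbrella"},
    {"rain", "traffic"},
    {"sun", "traffic"},
])
```

A surprisingness query would instead compare observed frequencies against frequencies expected from some background model, returning the largest deviations.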

How might entity E, taken from a source situation, be imitated or realized analogously in a target situation?

  Entity GetAnalogue(Entity E, Situation Source, Situation Target)

Memory / Knowledge

Conversion between types of memory; supported memory types should include episodic, procedural, declarative/semantic, and intentional.

This enables answering questions like: What kind of episodes correspond to fact/belief B? etc.

  Expression-List GetCorrespondingMemories(Entity E, MemoryType T)

Also a generic query like

  Entity-List GetKnowledgeAbout(Entity E)

to find what is known about a person, place, thing or whatever -- no matter what that may be...


  Media-Item Annotate(Media-Item M)

which would take the input media item (e.g. document, image, video) and annotate it with the important bits of the system's knowledge about the entities it contains.


  Expression-List GetRelationshipsBetween(Entity-List E)

which finds the known relationships between the entities on the list...



Goals

To achieve goal G, what subgoals might first be usefully achieved?

  Expression-List GetSubgoals(Goal G, *Situation S, *Agent A)

If G were achieved, what other goals would likely be achieved along the way?

  Expression-List GetImpliedGoals(Goal G, *Situation S, *Agent A)


Perception

Objects in scene/situation S

  Entity-List GetObjects(VisualScene S)
  Entity-List GetObjects(Situation S)

People in scene/situation S

  Entity-List GetPeople(VisualScene S)
  Entity-List GetPeople(Situation S)

Identify events occurring in situation S

  Expression-List GetEvents(VisualScene S)
  Expression-List GetEvents(Situation S)

Identify types of activity occurring in situation S

  Expression-List GetActivities(VisualScene S)
  Expression-List GetActivities(Situation S)

Planning / Action Selection

Suggest an action that agent A might take to achieve goal G in situation S

  Expression SuggestNextAction(Agent A, Goal G, *Situation S)

What might be a plan for achieving G in situation S? (for agent A)

  Expression-List SuggestPlan(Agent A, Goal G, *Situation S)

Given a (perhaps partial) map M, generate movement that is likely to get agent A from initial location I to target location T reasonably rapidly

  Expression-List SuggestRoute(Agent A, Map M, InitialLocation I, TargetLocation T)
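SuggestRoute is essentially path search over the (possibly partial) map M. A minimal breadth-first sketch, with the map modeled as an adjacency dict; this representation, and ignoring edge costs, are simplifying assumptions of the example:

```python
from collections import deque

def suggest_route(graph, start, target):
    """Breadth-first search returning a shortest path (a list of
    locations) from start to target, or None if unreachable."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # target unreachable from start

MAP = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
route = suggest_route(MAP, "A", "D")
```

With a probabilistic SLAM-style map, the search would instead weight edges by traversability estimates, but the interface shape stays the same.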


Prediction

Predict what comes after X (on time scale T)

  Expression PredictNext(Event X, *TimeScale T)

Predict what would be the consequence if X happened

  Expression PredictConsequence(Event X, *Situation S, *TimeScale T)


Comparison

How does X compare to Y by metric M?

  Number Compare(Entity X, Entity Y, Metric M)

What is a good metric M for comparing X versus Y?

  Expression-List GetMetrics(Entity X, Entity Y)
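Compare reduces to applying a caller-supplied metric to the two entity descriptions; GetMetrics could then rank candidate metrics by how informative they are for the given pair. A sketch of the first query, where the dict-based entity descriptions and the size-difference metric are hypothetical examples:

```python
def compare(x, y, metric):
    """Number Compare(Entity X, Entity Y, Metric M): apply a metric
    function to two entity descriptions and return a number."""
    return metric(x, y)

# Hypothetical metric: absolute difference in a numeric attribute.
def size_difference(x, y):
    return abs(x["size"] - y["size"])

d = compare({"size": 10}, {"size": 7}, size_difference)
```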


Inference

Find the (most interesting) conclusions implied by an evidence set (Entity-List E1, Expression-List E2)

  Expression-List GetConclusions(Entity-List E1, Expression-List E2)

Given an evidence set, find lines of reasoning that might support or refute hypothesis H

  Expression-List ExploreHypothesis(Entity-List E1, Expression-List E2, Expression H)

Find hypotheses, perhaps speculative, supported by an evidence set

  Expression-List GetHypotheses(Entity-List E1, Expression-List E2)

Social Interaction

What are the social relationships between the individuals in agent-set A?

  Expression-List GetSocialRelationships(Agent-List A, *Situation S, *Time T)

What are the key social groups among the individuals in agent-set A?

  Expression-List GetSocialGroups(Agent-List A, *Situation S, *Time T)