This page summarizes ideas recently developed regarding the building of a largely-new Embodiment module, with the initial use-case being OpenCog-based control of the Hanson Robotics simulated Eva robot head.
The design should be considered unstable at this moment (March 5, 2015). It's the subject of current discussion.
The following summarizes Ben Goertzel's current (March 2015) understanding of the situation with the Embodiment module of OpenCog:
- The code is large and complex and contains some parts that are pretty good (e.g. 3DSpaceMap, inquery interface, etc.) and some parts that everyone agrees are pretty terrible (action interface, XML messaging layer)
- The code currently has bugs that make it crash sometimes (though sometimes it runs fine on some peoples' machines, etc. etc.)
- The people who wrote the code, or who know the code, aren't working actively on OpenCog at this moment and have other priorities besides debugging or improving the Embodiment code
- The initial, mostly-working functionality of the Embodiment module has been to control a game character in the OpenCog Unity3D game world. However, the code inside the Unity3D game world, while it works without major bugs, is also recognized as in many ways nonoptimal. A majority of this code deals with converting btw XML messages and internal Unity3D stuff and sending/receiving messages, and this seems to be the messiest part of the code. The other parts of the code largely deal with blocks management, terrain generation and so forth and according to my casual impression are better quality....
Initial Use Case
I propose to take interfacing with the Hanson Robotics Eva robot simulation system
as the initial application for experimenting with a new/improved Embodiment, rather than the Unity3D game world.
Once a new/improved Embodiment is gotten working with the simulated Eva, then make it work with the Unity3D game world afterwards.
Part of the point here is that, while the simulated Eva robot head infrastructure is far from perfect, at least it is under active development and being actively maintained by someone (Linas), whereas the Unity3D infrastructure is sorta orphaned at this moment...
ROS Messaging Between OpenCog and Eva
(NOTE from Ben: I am not a ROS guru and would welcome additions, modifications or improvements to this section by someone who knows ROS more thoroughly. If I've gotten some terminology wrong feel free to fix it.)
To be clear, the suggestion here is to totally eliminate the XML messaging code existing in the current Embodiment module. This should be replaced with a new system utilizing ROS messaging.
In this architecture, there should be one or more ROS nodes corresponding to Eva and communicating with OpenCog; and one ROS node corresponding to OpenCog.
ROS nodes do queueing, so that various OpenCog processes can then push messages on to the OpenCog ROS node's queue for sending-out; and messages received by OpenCog's ROS node will be stored in a queue for OpenCog processes to pop off and utilize as they are able to.
Based on a discussion with Nil, it seems it may be best to use python for the OpenCog ROS node.
ROS just handles the messaging mechanics, it doesn't say anything about the content of the messages (other than specifying what format they should take). We will need to create custom ROS messages for the signals to be transmitted back and forth between OpenCog and Eva.
MESSAGES FROM OPENCOG TO BODY-CONTROLLER (EVA)
We would want an action message for each of:
blink', 'blink-micro', 'blink-relaxed', 'blink-sleepy', 'nod-1', 'nod-2', 'nod-3', 'shake-2', 'shake-3', 'yawn-1', 'irritated', 'happy', 'recoil', 'surprised', 'sad', 'confused', 'afraid', 'bored', 'engaged', 'amused', 'comprehending' Glance at location (x,y,z) Turn head toward location (x,y,z) Turn off perception stream, Turn on perception stream Get location of object with ID number N
MESSAGES FROM BODY-CONTROLLER (EVA) TO OPENCOG
Perception messages initially needed would be:
Face has become visible, and has tracking number N Face with tracking number N is no longer visible Face with tracking number N has location (x,y,z)
Triggering Messages from the Atomspace
The sending of a ROS message to a robot (or other embodied entity) from within the Atomspace would be triggered by execution of an appropriate GroundedSchemaNode or by a GroundedPredicateNode. The former would be used if there needs to be a return value that is an atom; the later, if the return value is a TV. If a sequence of several messages need to be strung together, such that the next doesn't take place unless the previous one returned true, then the GroundedPredicateNode would be used to return the true/false values.
The schema execution involved is handled by code the scheme API calls cog-execute! and cog-evaluate!. For programming and usage examples, see ExecutionOutputLink and EvaluationLink. To run a sequence of behaviors, string them together into a SequentialAndLink and use the pattern matcher to run it. Note that, in this case, one may place variables into expressions; the pattern matcher will find groundings for the variables.
For instance we might have:
EvaluationLink GroundedPredicateNode "py:send_blink_message" ConceptNode "Eva"
EvaluationLink GroundedPredicateNode "py:send_point_head_message" ListLink ConceptNode "Eva" NumberNode x_coord NumberNode y_coord NumberNode z_coord
In the convention indicated above, the first argument of these "body control schema" indicates the name of the robot body to be controlled. This is intended to allow the same OpenCog Atomspace to control multiple bodies. So if the same Atomspace was to send a control message to some body besides Eva, it might send something like:
EvaluationLink GroundedPredicateNode "py:send_blink_message" ConceptNode "R2D2"
Example blink message:
import rospy import roslib from opencog.atomspace import TruthValue from blender_api_msgs.msg import SetGesture def __init__(self): self.eva_gesture_pub = rospy.Publisher("/blender_api/set_gesture", SetGesture, queue_size=1) def send_blink_message(atom): if 'eva' == atomspace.getNamee(atom): self.eva_gesture_pub.publish("blink-1") return TruthValue(1.0,1.0) throw (RuntimeError, "Unknown robot name")
That's pretty much it. Notice that using ROS is almost trivial, in this example.
Porting Owyl Behavior Trees into the Atomspace
On the Behavior_tree page, Linas has explained how to implement a behavior tree in OpenCog. I think this may work now (March 5, 2014); ask Linas to know for sure....
The current Eva system relies on a bunch of behavior trees created using a python tool called Owyl. All these Owyl behavior trees can be ported to the Atomspace, using the ideas on the Behavior_tree page. This will enable the high-level control of the Eva robot to be done within the Atomspace -- which will then send ROS messages to the Eva robot, each message specifying a certain primitive action to be taken by the robot.
A Simple Perception Manager for OpenCog
After a perception message is received by OpenCog's ROS node, from the Eva robot, then the PerceptionManagerAgent has got to pop that message off the ROS node's queue and do something with it.
The PerceptionManagerAgent must perform the translation/transformation of each ROS message into appropriate Atom structures.
For instance, suppose one wants the following behavior: When the message "face present" is received at time $T from Eva, then the Atom structure
AtTimeLink $T EvaluationLink PredicateNode "face present" ConceptNode "Eva"
Two ways to achieve this come to mind immediately.
Strategy 1 is simply to put python code into the PerceptionManagerAgent, to create the appropriate Atoms for each incoming ROS message. This is probably the best approach, at least initially. It's simpler and involves less programmer effort, and there are no obvious drawbacks in the short term.
Strategy 2 would be to do the transformation in the Atomspace. Even though I don't advocate doing this immediately, it's educational to see how this would work.
Caution: the example below is misleading/incorrect: ROS messages are not generic, and they don't need to be decoded to figure out what they mean. They already come in a very specific form, from a very specific location. So much of this section doesn't really apply.
For example, our ROS vision system publishes two topics: /vision/face_event and /vision/face_location Think of a "topic" as a URL. When you are subscribed to a topic, you receive all events (messages) sent to that topic (by anyone). Currently, the face_event topic has two events: "lost_face" and "new_face". These indicate that a face is no longer visible, or that a new face has entered the scene. Both events are very simple: they just include a face ID number, and nothing more. The face_location topic consists of events that are a numeric face ID, and an x,y,z position for that face, and nothing more.
Thus, the example below needs to be modified to be more specific. Perhaps it could match on the numeric face ID in some way. But you do not need to make any decisions based on the message type -- the message type was fixed and determined before you received it. There is no way, in ROS, to receive all messages of some indeterminate, generic type. (Well, maybe there is, but it is not how ROS is meant to be used).
But .. if you did want to receive generic ROS messages, the following Atom could be placed in the Atomspace
BindLink $R ANDLink EvaluationLink PredicateNode "ROS_message_received" ListLink ROSMessageNode $R TimeNode $T EvaluationLink PredicateNode "Perception_type" ListLink $R PhraseNode "face present" EvaluationLink PredicateNode "Message_source" ListLink $R PhraseNode "Eva" AtTimeLink $T EvaluationLink PredicateNode "face present" ConceptNode "Eva"
In Strategy 2, PerceptionManagerAgent would then have two jobs:
1) Pop incoming ROS messages off the queue, and use them to create Atom sets such as
ANDLink EvaluationLink PredicateNode "ROS_message_received" ListLink ROSMessageNode "Message1234346" TimeNode "1234" EvaluationLink PredicateNode "Perception_type" ListLink ROSMessageNode "Message1234346" PhraseNode "face present" EvaluationLink PredicateNode "Message_source" ListLink ROSMessageNode "Message1234346" PhraseNode "Eva"
2) Contain the Handle to an Atom which is a SetLink, whose elements are all the perception processing BindLinks, similar to the example given above. When Atoms are created corresponding to a new ROS message, then try to match all the BindLinks in this SetLink to the new message, using the Pattern Matcher. This will result in the new message being used to create appropriate Atoms.
The beauty of this approach is that the logic of transforming incoming messages into Atoms becomes cognitive content. The potential downside of this approach is processing time overhead.
A More Complex Example
Now let's consider a slightly less trivial example -- the perception "face present at coordinates (x,y,z)".
In Strategy 2 (not advocated for immediate implementation), this could be processed by a BindLink such as
BindLink $R, $xcoord, $ycoord, $zcoord ANDLink EvaluationLink PredicateNode "ROS_message_received" ListLink ROSMessageNode $R TimeNode $T EvaluationLink PredicateNode "Perception_type" ListLink $R PhraseNode "face present at location" EvaluationLink PredicateNode "Message_source" ListLink $R PhraseNode "Eva" EvaluationLink PredicateNode "face_location" ListLink $R ListLink NumberNode $xcoord NumberNode $ycoord NumberNode $zcoord ANDLink ExecutionOutputLink GroundedSchemaNode "copy_node_name" $R ObjectNode $O InheritanceLink ObjectNode $O ConceptNode "face" DefiniteLink ObjectNode $O AtTimeLink $T EvaluationLink PredicateNode "face present at location" ListLink ConceptNode "Eva" ObjectNode "$O AtLocationLink ObjectNode $O ListLink ConceptNode "Eva_room" ListLink NumberNode $xcoord NumberNode $ycoord NumberNode $zcoord
To unravel the outputs above and give their intended meanings...
The below assigns the ObjectNode $O to be created to represent the newly recognized face, the same name as the node representing the ROS message indicating the presence and location of the face. (This is not the only way to name the node, obviously.)
ExecutionOutputLink GroundedSchemaNode "copy_node_name" $R ObjectNode $O
The below indicates that this ObjectNode does indeed represent a face:
InheritanceLink ObjectNode $O ConceptNode "face"
The below indicates that this ObjectNode represents a definite, specific entity that is a face, rather than a set of faces or a kind of face:
DefiniteLink ObjectNode $O
The below indicates that the face was recognized by Eva to be at some location, at the time $T$:
AtTimeLink $T EvaluationLink PredicateNode "face present at location" ListLink ConceptNode "Eva" ObjectNode "$O
The below indicates where the face was described to be located. The ConceptNode "Eva_room" indicates what map the location is intended to exist within.
AtLocationLink ObjectNode $O ListLink ConceptNode "Eva_room" ListLink NumberNode $xcoord NumberNode $ycoord NumberNode $zcoord
In the Strategy 1 approach, python code inside the PerceptionManagerAgent would create an ObjectNode with a name such as "object_123456" where the "123456" comes from some ID associated with the ROS message (or could be a time-stamp -- whatever)...
Then the same links as in the Strategy 2 treatment of the example would be created, via python code within the PerceptionManagerAgent:
InheritanceLink ObjectNode "object_123456" ConceptNode "face" DefiniteLink ObjectNode "object_123456" AtTimeLink $T EvaluationLink PredicateNode "face present at location" ListLink ConceptNode "Eva" ObjectNode "object_123456" AtLocationLink ObjectNode "object_123456" ListLink ConceptNode "Eva_room" ListLink NumberNode $xcoord NumberNode $ycoord NumberNode $zcoord
Integrating the 3DSpaceMap
One of the parts of the current Embodiment system that seems well worth retaining for the time being, is the 3DSpaceMap. This structure has some nice query operations associated with it.
Currently the OpenCog code embodies the implicit assumption that there is only one 3DSpaceMap associated with a given OpenCog system. This clearly doesn't make sense. For instance, even an indoor robot may have separate maps of each room it knows about, and then a separate overall map of the house it lives in (because it may not know how the rooms are oriented with respect to each other, even if it knows well how things are arranged in each room). Further, an OpenCog system may want to maintain a map of the surface of Earth, and a map of a certain room, without knowing how the room is oriented relative to the other stuff on the surface of the Earth.
What I suggest is that when a new link such as
AtLocationLink ObjectNode "object_123456" ListLink ConceptNode "Eva_room" ListLink NumberNode $xcoord NumberNode $ycoord NumberNode $zcoord
is created, this should automatically lead to the relevant information being put into the 3DSpaceMap labeled "Eva_room" (in this case the information that the location of object $O is (x,y,z)).
Similarly, when an AtTimeLink is created, this should automatically lead to the relevant information being indexed in the TimeServer (which is not really a server, just an index).
Initially we may deal only with a single 3DSpaceMap. But the assumption that there is only a single map shouldn't be wired into the system -- the single map, even if there is just one, should be referred to by its name (e.g. "Eva_room") and the code should be written assuming multiple maps is a possibility.
Improving the 3DSpaceMap
From a robotics view, the current 3DSpaceMap also has some shortcomings -- it assumes a uniform octree grid over all of space, rather than allowing a grid with variable granularity in different regions; and it assumes each cell of its internal octree is either wholly occupied or wholly unoccupied, rather than probabilistically or fuzzily occupied. To remedy these issues, I've suggested to replace the octree inside the current 3DSpaceMap with an OctoMap (an octree structure that incidentally has been integrated with ROS).
(Update: Octomap is now integrated, and an Octomap-based Visual Perception Buffer is under construction, as of Jan 2016).
Connecting to OpenPsi
OpenPsi maintains a set of goals, and should correspond to a set of links of the conceptual form
CONTEXT & PROCEDURE ==> GOAL
Backward chaining from the goals then leads to contextually appropriate procedures being chosen for activation, with goal achievement in mind.
An example link of this nature would be:
BindLink $X ImplicationLink ANDLink EvaluationLink PredicateNode "face present" ConceptNode $X ExecutionLink GroundedSchemaNode "smile" ConceptNode $X EvaluationLink PredicateNode "please_people" ExecutionOutputLink GroundedSchemaNode "smile" ConceptNode $X
To break down the parts... we have here:
EvaluationLink PredicateNode "face present" ConceptNode $X
ExecutionLink GroundedSchemaNode "smile" ConceptNode $
EvaluationLink PredicateNode "please_people"
In other words, what this says is that if a face is present, then smiling is expected to achieve the goal of pleasing people.
Having validated this conclusion, the action
ExecutionOutputLink GroundedSchemaNode "smile" ConceptNode $X
is then triggered for enaction.
A Simple Execution Manager for OpenCog
What causes actions in the Atomspace to actually get executed?
Currently, the pattern matcher causes things to execute. Here are some working examples:
- Sequential And example -- using functions to decide whether to continue or to stop.
- General computation example -- mixing function calls, variables, pattern matching.
Random action generator
For testing of ROS messaging and other aspects, it could be worthwhile to create a RandomActionGeneratorAgent, which
- points to a SetLink that contains ExecutionOutputLinks corresponding to actions that OpenCog knows how to tell Eva to do
- periodically generates a random action via executing one of the ExecutionOutputLinks in this SetLink
Simple OpenPsi based Execution manager
Given links connecting OpenPsi goals with actions, a simple execution manager would be a tweak of the existing OpenPsi action selection agent . Links connecting goals to ExecutionOutputLinks would be checked in the action selection phase, then action execution would be done via executing the appropriate GroundedSchemaNodes.
The OCPlanner, written by Shujing, contains a quite interesting planning algorithm that integrates navigation and logical reasoning in an unprecedentedly tight way. This is not useful for the current Eva robot which is just a head, but will be useful for lots of other robotics and gaming OpenCog applications.
Ben thinks that, in the medium term, the right course is to reimplement the algorithm inside the OCPlanner in a different way, to utilize the Unified Rule Engine rather than having its own rule engine inside its own C++ code.
For the time being, however, the OCPlanner works interestingly well, so it's worthwhile to quickly integrate it with the new Embodiment framework being outlined here. I think this could be done something like
ExecutionOutputLink GroundedSchemaNode "run_OCPlanner" ListLink ConceptNode "Eva" ConceptNode "Eva_room" ListLink NumberNode $xcoord_start NumberNode $ycoord_start NumberNode $zcoord_start ListLink NumberNode $xcoord_end NumberNode $ycoord_end NumberNode $zcoord_end
The above command would cause the OCPLanner to get invoked, for the body "Eva" relative to the map "Eva_room". It would ask the OCPLanner to find a plan going from the given start coordinates to the given end coordinates.
The output of the above command would be a plan, learned by the OCPlanner. The plan would be represented in the form of a set of Atoms, which could be wrapped in a SequentialANDLink in the manner discussed on the Behavior_tree page.
Once all the above is done in the context of the Eva robot head, the same framework can of course be extended to other cases.
Utilizing the same framework for the modded RoboSapien that Mandeep has designed should not be difficult, as this robot is already being controlled using ROS. The set of actions and perceptions is a bit different but the basic logic of interaction should be the same. The 3DSpaceMap will be more heavily used since the RoboSapien moves around whereas the Eva head does not.
Re-integrating the Unity3D World
To integrate the OpenCog Unity3D world directly into this framework, we would need to make a ROS proxy for Unity3D. It's not yet 100% obvious to me whether this is a good idea, but I don't see any particular problems with the concept.
If we don't want to use ROS in this context it would be best to replace the XML messaging layer used in the current Unity3D world with something lighter-weight, perhaps ZMQ/ProtoBuf. This wouldn't really be a lot of work on the OpenCog side (GSNs can send messages using whatever protocol they want).
Most of the work will be on the Unity3D side in any case. The new Embodiment will make customizing perception and action processing on the OpenCog side for new worlds much easier than is currently the case.