Embodiment (2015 Archive)
This page summarizes ideas developed between Fall 2014 and Spring 2015 regarding the building of a largely new Embodiment module, with the initial use case being OpenCog-based control of the Hanson Robotics Eva avatar robot head.
This page is a historical archive of a now-obsolete design. The ideas discussed below have ended up in one of four states: (a) implemented as described, (b) implemented in some other way, (c) not implemented, but maybe still viable, or (d) turned out to be bad ideas.
Editorial sections were added in 2018, indicating what the final status of each idea was. Please do not attempt to modernize this page! If you need something here (but you don't, not actually), copy it to a new page (really, you don't need anything from here).
The following summarizes Ben Goertzel's current (March 2015) understanding of the situation with the Embodiment module of OpenCog:
- The code is large and complex and contains some parts that are pretty good (e.g. 3DSpaceMap, inquery interface, etc.) and some parts that everyone agrees are pretty terrible (action interface, XML messaging layer)
- The code currently has bugs that make it crash sometimes (though sometimes it runs fine on some people's machines, etc. etc.).
- The people who wrote the code, or who know the code, aren't working actively on OpenCog at this moment and have other priorities besides debugging or improving the Embodiment code.
- The initial, mostly-working functionality of the Embodiment module has been to control a game character in the OpenCog Unity3D game world. However, the code inside the Unity3D game world, while it works without major bugs, is also recognized as in many ways nonoptimal. A majority of this code deals with converting between XML messages and internal Unity3D stuff and sending/receiving messages, and this seems to be the messiest part of the code. The other parts of the code largely deal with blocks management, terrain generation and so forth and, according to my casual impression, are of better quality....
The 3DSpaceMap code was deleted in 2015; it was unusable. See SpaceMap for the current best design proposal. All of the old pet-avatar code was deleted circa summer 2015. So, for example, there seemed to be nothing left of the Unity3D game world that could be preserved.
Initial Use Case
I propose to take interfacing with the Hanson Robotics Eva robot simulation system as the initial application for experimenting with a new/improved Embodiment, rather than the Unity3D game world.
Once a new/improved Embodiment is working with the simulated Eva, it can then be made to work with the Unity3D game world.
Part of the point here is that, while the simulated Eva robot head infrastructure is far from perfect, at least it is under active development and being actively maintained by someone (Linas), whereas the Unity3D infrastructure is sorta orphaned at this moment...
2018 Status: The https://github.com/opencog/docker/indigo github page contains a version of docker that was fully functional in the 2015-2017 time-frame. It worked! It has bit-rotted, due to ongoing changes throughout the various github repos.
ROS Messaging Between OpenCog and Eva
(NOTE from Ben: I am not a ROS guru and would welcome additions, modifications or improvements to this section by someone who knows ROS more thoroughly. If I've gotten some terminology wrong feel free to fix it.)
To be clear, the suggestion here is to totally eliminate the XML messaging code existing in the current Embodiment module. This should be replaced with a new system utilizing ROS messaging.
In this architecture, there should be one or more ROS nodes corresponding to Eva and communicating with OpenCog; and one ROS node corresponding to OpenCog.
ROS nodes do queueing, so that various OpenCog processes can then push messages on to the OpenCog ROS node's queue for sending-out; and messages received by OpenCog's ROS node will be stored in a queue for OpenCog processes to pop off and utilize as they are able to.
Based on a discussion with Nil, it seems it may be best to use python for the OpenCog ROS node.
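The queueing behavior described above can be sketched in plain Python. This is only an illustration of the two-queue idea, not ROS code: the `MessageBridge` class, its method names, and the dictionary message layout are all hypothetical, and the standard-library `queue` module stands in for the ROS node's own queues.

```python
import queue


class MessageBridge:
    """Stand-in for the OpenCog-side ROS node: two FIFO queues."""

    def __init__(self):
        self.outbound = queue.Queue()  # messages waiting to be sent to the body
        self.inbound = queue.Queue()   # messages received from the body

    def push_outgoing(self, message):
        """Called by an OpenCog process that wants to send a message."""
        self.outbound.put(message)

    def deliver(self, message):
        """Called by the transport when a message arrives from the body."""
        self.inbound.put(message)

    def pop_incoming(self):
        """Return the oldest unread message, or None if there is none."""
        try:
            return self.inbound.get_nowait()
        except queue.Empty:
            return None


# Hypothetical messages; the dict layout is an assumption of this sketch.
bridge = MessageBridge()
bridge.deliver({"event": "new_face", "face_id": 7})
bridge.deliver({"event": "lost_face", "face_id": 7})
first = bridge.pop_incoming()
second = bridge.pop_incoming()
third = bridge.pop_incoming()   # queue is now empty
```

Messages come back in arrival order, and an OpenCog process that polls an empty queue simply gets nothing back rather than blocking.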
ROS just handles the messaging mechanics; it doesn't say anything about the content of the messages (other than specifying what format they should take). We will need to create custom ROS messages for the signals to be transmitted back and forth between OpenCog and Eva.
2018 Status: A set of ROS shims were developed and placed in the github 'ros-behavior-scripting' repo. That repo was mostly gutted in 2017, and then finally moved to a new name in 2018. Meanwhile, a new and improved proposal for ROS interfacing can be found on the Values page.
MESSAGES FROM OPENCOG TO BODY-CONTROLLER (EVA)
We would want an action message for each of:
- 'blink', 'blink-micro', 'blink-relaxed', 'blink-sleepy', 'nod-1', 'nod-2', 'nod-3', 'shake-2', 'shake-3', 'yawn-1', 'irritated', 'happy', 'recoil', 'surprised', 'sad', 'confused', 'afraid', 'bored', 'engaged', 'amused', 'comprehending'
- Glance at location (x,y,z)
- Turn head toward location (x,y,z)
- Turn off perception stream
- Turn on perception stream
- Get location of object with ID number N
2018 Status: yes, this got built, and still describes the current system.
MESSAGES FROM BODY-CONTROLLER (EVA) TO OPENCOG
Perception messages initially needed would be:
- Face has become visible, and has tracking number N
- Face with tracking number N is no longer visible
- Face with tracking number N has location (x,y,z)
2018 Status: yes, this got built, and still describes the current system.
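As a rough illustration of what a receiver could do with these three perception messages, the sketch below keeps a table of currently visible faces. The function names and the choice to ignore locations for unannounced faces are assumptions made for this example; they are not taken from the actual system.

```python
# tracking number -> last known (x, y, z), or None if no location seen yet
visible_faces = {}


def face_appeared(face_id):
    """Handle 'face has become visible, and has tracking number N'."""
    visible_faces[face_id] = None


def face_lost(face_id):
    """Handle 'face with tracking number N is no longer visible'."""
    visible_faces.pop(face_id, None)


def face_moved(face_id, x, y, z):
    """Handle 'face with tracking number N has location (x,y,z)'."""
    # Assumption: locations for faces never announced (or already lost)
    # are dropped rather than creating a new entry.
    if face_id in visible_faces:
        visible_faces[face_id] = (x, y, z)


face_appeared(3)
face_moved(3, 0.5, 0.1, 1.2)
face_moved(9, 1.0, 1.0, 1.0)   # unknown face: ignored
face_appeared(4)
face_lost(4)
```

After this sequence only face 3 remains in the table, with its most recent location.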
Triggering Messages from the Atomspace
The sending of a ROS message to a robot (or other embodied entity) from within the Atomspace would be triggered by execution of an appropriate GroundedSchemaNode or by a GroundedPredicateNode. The former would be used if there needs to be a return value that is an atom; the latter, if the return value is a TV. If a sequence of several messages needs to be strung together, such that the next doesn't take place unless the previous one returned true, then the GroundedPredicateNode would be used to return the true/false values.
The schema execution involved is handled by the scheme API calls cog-execute! and cog-evaluate!. For programming and usage examples, see ExecutionOutputLink and EvaluationLink. To run a sequence of behaviors, string them together into a SequentialAndLink and use the pattern matcher to run it. Note that, in this case, one may place variables into expressions; the pattern matcher will find groundings for the variables.
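The "next step runs only if the previous one returned true" behavior can be sketched in ordinary Python. This is a simplified analogy, not the pattern matcher: `run_sequence` and the three step functions are hypothetical names, and plain booleans stand in for the truth values a GroundedPredicateNode would return.

```python
def run_sequence(steps):
    """Run steps in order; stop at the first one that returns False."""
    for step in steps:
        if not step():
            return False
    return True


log = []  # records which steps actually ran


def blink():
    log.append("blink")
    return True


def face_visible():
    log.append("check-face")
    return False   # pretend no face is in view


def nod():
    log.append("nod")
    return True


result = run_sequence([blink, face_visible, nod])
```

Here the nod is never attempted, because the face check before it fails and the sequence stops there.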
For instance we might have:
    EvaluationLink
        GroundedPredicateNode "py:send_blink_message"
        ConceptNode "Eva"

    EvaluationLink
        GroundedPredicateNode "py:send_point_head_message"
        ListLink
            ConceptNode "Eva"
            NumberNode x_coord
            NumberNode y_coord
            NumberNode z_coord
In the convention indicated above, the first argument of these "body control schema" indicates the name of the robot body to be controlled. This is intended to allow the same OpenCog Atomspace to control multiple bodies. So if the same Atomspace was to send a control message to some body besides Eva, it might send something like:
    EvaluationLink
        GroundedPredicateNode "py:send_blink_message"
        ConceptNode "R2D2"
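One way to honor the "first argument names the body" convention is a lookup table from body name to a sender. The sketch below is an assumption-laden illustration: `BODY_SENDERS`, `make_sender`, and the `sent` list are hypothetical, the function takes a plain string instead of an Atom, and no real ROS publishing takes place.

```python
sent = []  # records (body, gesture) pairs in place of real ROS publishing


def make_sender(body_name):
    """Return a function that 'sends' a gesture to the given body."""
    def send(gesture):
        sent.append((body_name, gesture))
    return send


# Hypothetical registry: one sender per known robot body.
BODY_SENDERS = {
    "Eva": make_sender("Eva"),
    "R2D2": make_sender("R2D2"),
}


def send_blink_message(body_name):
    """Route a blink request to whichever body is named."""
    sender = BODY_SENDERS.get(body_name)
    if sender is None:
        raise RuntimeError("Unknown robot name: " + body_name)
    sender("blink")
    return True


send_blink_message("Eva")
send_blink_message("R2D2")
```

Adding another body then means adding one registry entry, while the Atomspace side keeps calling the same predicate with a different name.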
Example blink message:
    import rospy
    import roslib
    from opencog.atomspace import TruthValue
    from blender_api_msgs.msg import SetGesture

    # Publisher for gesture commands sent to the Eva blender API.
    eva_gesture_pub = rospy.Publisher("/blender_api/set_gesture",
                                      SetGesture, queue_size=1)

    def send_blink_message(atom):
        # The atom names the robot body, e.g. ConceptNode "Eva".
        if atom.name == "Eva":
            eva_gesture_pub.publish("blink-1")
            return TruthValue(1.0, 1.0)
        raise RuntimeError("Unknown robot name")
That's pretty much it. Notice that using ROS is almost trivial, in this example.
2018 Status: Yes, this is pretty much exactly what got built, and remains quite close to what is in the current system.
Porting Owyl Behavior Trees into the Atomspace
On the Behavior_tree page, Linas has explained how to implement a behavior tree in OpenCog. I think this may work now (March 5, 2014); ask Linas to know for sure....
The current Eva system relies on a bunch of behavior trees created using a python tool called Owyl. All these Owyl behavior trees can be ported to the Atomspace, using the ideas on the Behavior tree page. This will enable the high-level control of the Eva robot to be done within the Atomspace -- which will then send ROS messages to the Eva robot, each message specifying a certain primitive action to be taken by the robot.
2018 Status: Yes, the Owyl code got ported to Atomese, and was a major spur in clarifying and solidifying the core ideas behind Atomese. However, the actual implementation proved too difficult for any except the most senior programmers to understand. To add injury to insult, it turned out that the perception subsystem was far too primitive for the behavior scripts: the robot was very nearly blind, and more or less totally deaf, and so the behavior scripts were reacting to low-quality, low-bit-rate, almost-white-noise from the sensory subsystem. Scripting proved to be not a very good approach. Despite this, the ghost scripting system was created, to bring scripting usability to a new level, and meet the operational demands of a celebrity robot.
A Simple Perception Manager for OpenCog
After a perception message is received by OpenCog's ROS node from the Eva robot, the PerceptionManagerAgent must pop that message off the ROS node's queue and do something with it.
The PerceptionManagerAgent must perform the translation/transformation of each ROS message into appropriate Atom structures.
For instance, suppose one wants the following behavior: when the message "face present" is received at time $T from Eva, the following Atom structure is created:
    AtTimeLink
        $T
        EvaluationLink
            PredicateNode "face present"
            ConceptNode "Eva"
2018 Status: The above was not implemented. One primary problem is that it pollutes the atomspace with vast quantities of grunge. Creating Atoms is CPU-intensive; placing them in the atomspace, which indexes them, is CPU-expensive; and then, to add insult to injury, the actual time values are more-or-less never used, because they are mostly never needed. We do not need to index or permanently store fleeting values. Ah- there's the word Values: these were designed and implemented in the 2015-2016-2017 time frame as a mechanism for storing fleeting, time-varying values, such as time-stamps and 3D locations. See SpaceServer for a discussion of the current best implementation idea.
Two ways to achieve this come to mind immediately.
Strategy 1 is simply to put python code into the PerceptionManagerAgent, to create the appropriate Atoms for each incoming ROS message. This is probably the best approach, at least initially. It's simpler and involves less programmer effort, and there are no obvious drawbacks in the short term.
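A minimal sketch of what Strategy 1 might look like follows. To keep it runnable without OpenCog installed, nested tuples stand in for Atoms; the function names, the dictionary message layout, and the use of a TimeNode for the time-stamp are all assumptions of this example rather than the design that was proposed or built.

```python
def face_present_structure(timestamp, body):
    """Build the 'face present' structure as nested tuples (Atom stand-ins)."""
    return ("AtTimeLink",
            ("TimeNode", str(timestamp)),
            ("EvaluationLink",
             ("PredicateNode", "face present"),
             ("ConceptNode", body)))


def handle_message(message, atoms):
    """Translate one incoming message into Atom-like tuples."""
    # Only one message type is handled here; anything else is skipped.
    if message["type"] == "face present":
        atoms.append(face_present_structure(message["time"],
                                            message["source"]))


atoms = []  # stands in for the Atomspace
handle_message({"type": "face present", "time": 1234, "source": "Eva"}, atoms)
handle_message({"type": "something else", "time": 1235, "source": "Eva"}, atoms)
```

All of the translation logic lives in Python here, which is what makes this strategy simple and also what keeps that logic outside the Atomspace.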
2018 Status: not done. Yeah, OK, the Visual Perception Buffer got created, but it never worked well, and was never used. Using Values, as described on the SpaceServer page, is a much simpler, and a much better idea.
Strategy 2 would be to do the transformation in the Atomspace. Even though I don't advocate doing this immediately, it's educational to see how this would work.
Caution: the example below is misleading/incorrect: ROS messages are not generic, and they don't need to be decoded to figure out what they mean. They already come in a very specific form, from a very specific location. So much of this section doesn't really apply.
For example, our ROS vision system publishes two topics: /vision/face_event and /vision/face_location Think of a "topic" as a URL. When you are subscribed to a topic, you receive all events (messages) sent to that topic (by anyone). Currently, the face_event topic has two events: "lost_face" and "new_face". These indicate that a face is no longer visible, or that a new face has entered the scene. Both events are very simple: they just include a face ID number, and nothing more. The face_location topic consists of events that are a numeric face ID, and an x,y,z position for that face, and nothing more.
Thus, the example below needs to be modified to be more specific. Perhaps it could match on the numeric face ID in some way. But you do not need to make any decisions based on the message type -- the message type was fixed and determined before you received it. There is no way, in ROS, to receive all messages of some indeterminate, generic type. (Well, maybe there is, but it is not how ROS is meant to be used).
But .. if you did want to receive generic ROS messages, the following Atom could be placed in the Atomspace
    BindLink
        $R
        ANDLink
            EvaluationLink
                PredicateNode "ROS_message_received"
                ListLink
                    ROSMessageNode $R
                    TimeNode $T
            EvaluationLink
                PredicateNode "Perception_type"
                ListLink
                    $R
                    PhraseNode "face present"
            EvaluationLink
                PredicateNode "Message_source"
                ListLink
                    $R
                    PhraseNode "Eva"
        AtTimeLink
            $T
            EvaluationLink
                PredicateNode "face present"
                ConceptNode "Eva"
In Strategy 2, PerceptionManagerAgent would then have two jobs:
1) Pop incoming ROS messages off the queue, and use them to create Atom sets such as
    ANDLink
        EvaluationLink
            PredicateNode "ROS_message_received"
            ListLink
                ROSMessageNode "Message1234346"
                TimeNode "1234"
        EvaluationLink
            PredicateNode "Perception_type"
            ListLink
                ROSMessageNode "Message1234346"
                PhraseNode "face present"
        EvaluationLink
            PredicateNode "Message_source"
            ListLink
                ROSMessageNode "Message1234346"
                PhraseNode "Eva"
2) Contain the Handle to an Atom which is a SetLink, whose elements are all the perception processing BindLinks, similar to the example given above. When Atoms are created corresponding to a new ROS message, then try to match all the BindLinks in this SetLink to the new message, using the Pattern Matcher. This will result in the new message being used to create appropriate Atoms.
The beauty of this approach is that the logic of transforming incoming messages into Atoms becomes cognitive content. The potential downside of this approach is processing time overhead.
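The rule-matching idea behind Strategy 2 can be illustrated with a toy matcher over tuples. This is a deliberately flattened stand-in, not the OpenCog Pattern Matcher: strings beginning with `$` play the role of variables, each rule is a (pattern, template) pair, and the message is collapsed into a single tuple instead of the three EvaluationLinks shown above. All names are hypothetical.

```python
def match(pattern, fact, bindings):
    """Unify pattern with fact; return extended bindings, or None on failure."""
    if isinstance(pattern, str) and pattern.startswith("$"):
        if pattern in bindings:
            return bindings if bindings[pattern] == fact else None
        extended = dict(bindings)
        extended[pattern] = fact
        return extended
    if isinstance(pattern, tuple) and isinstance(fact, tuple):
        if len(pattern) != len(fact):
            return None
        for sub_pattern, sub_fact in zip(pattern, fact):
            bindings = match(sub_pattern, sub_fact, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == fact else None


def substitute(template, bindings):
    """Replace variables in the template with their bound values."""
    if isinstance(template, tuple):
        return tuple(substitute(part, bindings) for part in template)
    return bindings.get(template, template)


# One rule, playing the role of one BindLink in the SetLink.
RULES = [
    (("ROS_message_received", "$R", "$T", "face present", "Eva"),
     ("AtTimeLink", "$T", ("EvaluationLink", "face present", "Eva"))),
]


def apply_rules(fact, rules):
    """Try every rule against the new fact; collect what each one creates."""
    created = []
    for pattern, template in rules:
        bindings = match(pattern, fact, {})
        if bindings is not None:
            created.append(substitute(template, bindings))
    return created


created = apply_rules(
    ("ROS_message_received", "Message1234346", "1234", "face present", "Eva"),
    RULES)
ignored = apply_rules(
    ("ROS_message_received", "Message1234347", "1240", "loud noise", "Eva"),
    RULES)
```

Every incoming message is tried against every rule, which is where the processing-time overhead mentioned above comes from.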
2018 Status: Not implemented. Basically, ROS just doesn't work like that. Plus we have the atomspace-pollution problem mentioned before. The ideas described on the SpaceServer page are much better, and also simpler.
A More Complex Example
Now let's consider a slightly less trivial example -- the perception "face present at coordinates (x,y,z)".
2018 Status: Not implemented. As mentioned above, ROS just doesn't work like that. Plus we have the atomspace-pollution problem mentioned before. The ideas described on the SpaceServer page are much better, and also simpler. For transient, fleeting values, such as streaming ROS data, or streaming position data, or any kind of time stamps, using values is a much, much better idea. Putting atoms into the atomspace is CPU-expensive. Removing them is CPU-expensive. Heck, just creating atoms is CPU-expensive. Just don't do that! Do not pollute the atomspace with temporary data!
In short, none of the below was actually implemented in any shape or form.
In Strategy 2 (not advocated for immediate implementation), this could be processed by a BindLink such as
    BindLink
        $R, $xcoord, $ycoord, $zcoord
        ANDLink
            EvaluationLink
                PredicateNode "ROS_message_received"
                ListLink
                    ROSMessageNode $R
                    TimeNode $T
            EvaluationLink
                PredicateNode "Perception_type"
                ListLink
                    $R
                    PhraseNode "face present at location"
            EvaluationLink
                PredicateNode "Message_source"
                ListLink
                    $R
                    PhraseNode "Eva"
            EvaluationLink
                PredicateNode "face_location"
                ListLink
                    $R
                    ListLink
                        NumberNode $xcoord
                        NumberNode $ycoord
                        NumberNode $zcoord
        ANDLink
            ExecutionOutputLink
                GroundedSchemaNode "copy_node_name"
                $R
                ObjectNode $O
            InheritanceLink
                ObjectNode $O
                ConceptNode "face"
            DefiniteLink
                ObjectNode $O
            AtTimeLink
                $T
                EvaluationLink
                    PredicateNode "face present at location"
                    ListLink
                        ConceptNode "Eva"
                        ObjectNode $O
            AtLocationLink
                ObjectNode $O
                ListLink
                    ConceptNode "Eva_room"
                    ListLink
                        NumberNode $xcoord
                        NumberNode $ycoord
                        NumberNode $zcoord
To unravel the outputs above and give their intended meanings...
The below assigns the ObjectNode $O to be created to represent the newly recognized face, the same name as the node representing the ROS message indicating the presence and location of the face. (This is not the only way to name the node, obviously.)
    ExecutionOutputLink
        GroundedSchemaNode "copy_node_name"
        $R
        ObjectNode $O
The below indicates that this ObjectNode does indeed represent a face:
    InheritanceLink
        ObjectNode $O
        ConceptNode "face"
The below indicates that this ObjectNode represents a definite, specific entity that is a face, rather than a set of faces or a kind of face:
    DefiniteLink
        ObjectNode $O
The below indicates that the face was recognized by Eva to be at some location, at the time $T:
    AtTimeLink
        $T
        EvaluationLink
            PredicateNode "face present at location"
            ListLink
                ConceptNode "Eva"
                ObjectNode $O
The below indicates where the face was described to be located. The ConceptNode "Eva_room" indicates what map the location is intended to exist within.
    AtLocationLink
        ObjectNode $O
        ListLink
            ConceptNode "Eva_room"
            ListLink
                NumberNode $xcoord
                NumberNode $ycoord
                NumberNode $zcoord
In the Strategy 1 approach, python code inside the PerceptionManagerAgent would create an ObjectNode with a name such as "object_123456" where the "123456" comes from some ID associated with the ROS message (or could be a time-stamp -- whatever)...
Then the same links as in the Strategy 2 treatment of the example would be created, via python code within the PerceptionManagerAgent:
    InheritanceLink
        ObjectNode "object_123456"
        ConceptNode "face"
    DefiniteLink
        ObjectNode "object_123456"
    AtTimeLink
        $T
        EvaluationLink
            PredicateNode "face present at location"
            ListLink
                ConceptNode "Eva"
                ObjectNode "object_123456"
    AtLocationLink
        ObjectNode "object_123456"
        ListLink
            ConceptNode "Eva_room"
            ListLink
                NumberNode $xcoord
                NumberNode $ycoord
                NumberNode $zcoord
Integrating the 3DSpaceMap
One of the parts of the current Embodiment system that seems well worth retaining for the time being, is the 3DSpaceMap. This structure has some nice query operations associated with it.
2018 Status: The 3DSpaceMap code was deleted in the summer of 2015, or thereabouts. See SpaceServer for a superior design proposal.
Currently the OpenCog code embodies the implicit assumption that there is only one 3DSpaceMap associated with a given OpenCog system. This clearly doesn't make sense. For instance, even an indoor robot may have separate maps of each room it knows about, and then a separate overall map of the house it lives in (because it may not know how the rooms are oriented with respect to each other, even if it knows well how things are arranged in each room). Further, an OpenCog system may want to maintain a map of the surface of Earth, and a map of a certain room, without knowing how the room is oriented relative to the other stuff on the surface of the Earth.
What I suggest is that when a new link such as
    AtLocationLink
        ObjectNode "object_123456"
        ListLink
            ConceptNode "Eva_room"
            ListLink
                NumberNode $xcoord
                NumberNode $ycoord
                NumberNode $zcoord
is created, this should automatically lead to the relevant information being put into the 3DSpaceMap labeled "Eva_room" (in this case the information that the location of object $O is (x,y,z)).
Similarly, when an AtTimeLink is created, this should automatically lead to the relevant information being indexed in the TimeServer (which is not really a server, just an index).
Initially we may deal only with a single 3DSpaceMap. But the assumption that there is only a single map shouldn't be wired into the system -- the single map, even if there is just one, should be referred to by its name (e.g. "Eva_room") and the code should be written assuming multiple maps is a possibility.
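The "always refer to a map by name" rule can be sketched as a small registry. This is an illustration only: `SpaceMapRegistry`, its methods, and the second map name "house" are hypothetical, and plain dictionaries stand in for real 3DSpaceMap instances.

```python
class SpaceMapRegistry:
    """Holds any number of named maps; each maps object names to locations."""

    def __init__(self):
        self._maps = {}  # map name -> {object name: (x, y, z)}

    def record_location(self, map_name, object_name, xyz):
        """Store a location in the named map, creating the map on first use."""
        self._maps.setdefault(map_name, {})[object_name] = tuple(xyz)

    def locate(self, map_name, object_name):
        """Return the stored location, or None if map or object is unknown."""
        return self._maps.get(map_name, {}).get(object_name)


# What a hook on AtLocationLink creation might do, under these assumptions.
registry = SpaceMapRegistry()
registry.record_location("Eva_room", "object_123456", (1.0, 0.5, 1.6))
registry.record_location("house", "object_123456", (12.0, 3.0, 1.6))
```

The same object can hold different coordinates in different maps, and code written this way works unchanged whether there is one map or many.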
Improving the 3DSpaceMap
From a robotics view, the current 3DSpaceMap also has some shortcomings -- it assumes a uniform octree grid over all of space, rather than allowing a grid with variable granularity in different regions; and it assumes each cell of its internal octree is either wholly occupied or wholly unoccupied, rather than probabilistically or fuzzily occupied. To remedy these issues, I've suggested replacing the octree inside the current 3DSpaceMap with an OctoMap (an octree structure that incidentally has been integrated with ROS).
(Update: Octomap is now integrated, and an Octomap-based Visual Perception Buffer is under construction, as of Jan 2016).
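To make "probabilistically occupied" concrete, here is a small sketch of a log-odds occupancy update of the general kind used in probabilistic occupancy grids. It does not use the OctoMap library and does not model the octree or variable granularity; a flat dictionary of cells stands in for the tree, and the sensor-model probabilities and clamping bounds are illustrative values chosen for this example.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def probability(log_odds):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Illustrative values, assumed for this sketch.
L_HIT = logit(0.7)     # evidence added when a cell is observed occupied
L_MISS = logit(0.4)    # evidence added when a cell is observed free
L_MIN, L_MAX = logit(0.12), logit(0.97)   # clamping bounds

# cell index (i, j, k) -> log-odds; a missing cell is unknown (p = 0.5)
occupancy = {}


def update_cell(cell, hit):
    """Fold one observation of a cell into its occupancy estimate."""
    value = occupancy.get(cell, 0.0) + (L_HIT if hit else L_MISS)
    occupancy[cell] = max(L_MIN, min(L_MAX, value))


for _ in range(3):
    update_cell((4, 2, 7), hit=True)    # seen occupied three times
update_cell((4, 2, 8), hit=False)       # seen free once
```

Repeated observations push a cell's estimate toward occupied or free, and the clamping keeps any cell from becoming so certain that later contrary evidence cannot change it.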
2018 Status: The octomap-based server got implemented. However, just on first principles, both octomap and pointclouds are terrible representations for AGI-style knowledge data. The SpaceServer page provides a superior proposal for representing time and position data in the AtomSpace.
Connecting to OpenPsi
OpenPsi maintains a set of goals, which should be associated with a set of links of the conceptual form
CONTEXT & PROCEDURE ==> GOAL
Backward chaining from the goals then leads to contextually appropriate procedures being chosen for activation, with goal achievement in mind.
An example link of this nature would be:
    BindLink
        $X
        ImplicationLink
            ANDLink
                EvaluationLink
                    PredicateNode "face present"
                    ConceptNode $X
                ExecutionLink
                    GroundedSchemaNode "smile"
                    ConceptNode $X
            EvaluationLink
                PredicateNode "please_people"
        ExecutionOutputLink
            GroundedSchemaNode "smile"
            ConceptNode $X
To break down the parts... we have here:
    EvaluationLink
        PredicateNode "face present"
        ConceptNode $X

    ExecutionLink
        GroundedSchemaNode "smile"
        ConceptNode $X

    EvaluationLink
        PredicateNode "please_people"
In other words, what this says is that if a face is present, then smiling is expected to achieve the goal of pleasing people.
Having validated this conclusion, the action
    ExecutionOutputLink
        GroundedSchemaNode "smile"
        ConceptNode $X
is then triggered for enaction.
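The CONTEXT & PROCEDURE ==> GOAL form can be sketched as a table of rules plus a selection step. This is a simplified illustration, not OpenPsi itself: the `Rule` fields, the numeric strengths, the extra "yawn" and "blink" rules, and the pick-the-strongest policy are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    context: Callable[[dict], bool]   # does the rule apply in this state?
    procedure: str                    # name of the action to run
    goal: str                         # goal the action is expected to serve
    strength: float                   # how strongly the rule is believed


RULES = [
    Rule(lambda s: s.get("face present", False), "smile", "please_people", 0.8),
    Rule(lambda s: s.get("face present", False), "yawn", "please_people", 0.1),
    Rule(lambda s: True, "blink", "stay_comfortable", 0.9),
]


def select_procedure(state, goal, rules):
    """Pick the strongest rule for `goal` whose context holds, or None."""
    candidates = [r for r in rules if r.goal == goal and r.context(state)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.strength).procedure


with_face = select_procedure({"face present": True}, "please_people", RULES)
without_face = select_procedure({}, "please_people", RULES)
```

With a face present, smiling is chosen for the goal of pleasing people; with no face present, no rule for that goal applies and nothing is selected.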
2018 Status: This particular variant of OpenPsi-based behavior and animation control eventually came about to replace the scripted behavior trees. It still lives on, in a way, inside of ghost. The biggest problem with OpenPsi is that it remains a schizophrenic mashup of two different, unrelated concepts: one is a priority-based, class-based rule selection mechanism (which is great, and we need that); the other is an inadequate model of human emotions, affects and moods. These two different parts of OpenPsi need to be cleanly decoupled; they remain confusingly tangled.
A Simple Execution Manager for OpenCog
What causes actions in the Atomspace to actually get executed?
Currently, the pattern matcher causes things to execute. Here are some working examples:
- Sequential And example -- using functions to decide whether to continue or to stop.
- General computation example -- mixing function calls, variables, pattern matching.
2018 Status: The above two demos still work, I believe (they are unit tests, now), and were used in the behavior tree implementation.
Random action generator
For testing of ROS messaging and other aspects, it could be worthwhile to create a RandomActionGeneratorAgent, which
- points to a SetLink that contains ExecutionOutputLinks corresponding to actions that OpenCog knows how to tell Eva to do
- periodically generates a random action via executing one of the ExecutionOutputLinks in this SetLink
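A sketch of such a generator is shown below. It is not an OpenCog agent: plain Python callables stand in for the ExecutionOutputLinks, a simple loop stands in for periodic scheduling, and the class and helper names are hypothetical.

```python
import random


class RandomActionGenerator:
    """Holds a set of actions and executes one at random on each step."""

    def __init__(self, actions, seed=None):
        self.actions = list(actions)    # callables standing in for the links
        self.rng = random.Random(seed)  # private generator, optionally seeded

    def step(self):
        """Execute one randomly chosen action and return its result."""
        action = self.rng.choice(self.actions)
        return action()


performed = []  # records what was 'sent' to the robot


def make_action(name):
    """Build a stand-in action that just records its own name."""
    def action():
        performed.append(name)
        return name
    return action


generator = RandomActionGenerator(
    [make_action(n) for n in ("blink", "nod-1", "yawn-1")], seed=0)
for _ in range(5):   # a loop in place of a periodic timer
    generator.step()
```

Seeding the generator makes a test run repeatable, which is convenient when the point is to exercise the messaging path rather than the behavior.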
2018 Status: not done. We mostly do not use agents; we use threads instead. The agent code is naive, nearly childish, in its implementation.
Simple OpenPsi based Execution manager
Given links connecting OpenPsi goals with actions, a simple execution manager would be a tweak of the existing OpenPsi action selection agent. Links connecting goals to ExecutionOutputLinks would be checked in the action selection phase, then action execution would be done via executing the appropriate GroundedSchemaNodes.
2018 Status: A version of OpenPsi controls the robot today.
The OCPlanner, written by Shujing, contains a quite interesting planning algorithm that integrates navigation and logical reasoning in an unprecedentedly tight way. This is not useful for the current Eva robot which is just a head, but will be useful for lots of other robotics and gaming OpenCog applications.
Ben thinks that, in the medium term, the right course is to reimplement the algorithm inside the OCPlanner in a different way, to utilize the Unified Rule Engine rather than having its own rule engine inside its own C++ code.
For the time being, however, the OCPlanner works interestingly well, so it's worthwhile to quickly integrate it with the new Embodiment framework being outlined here. I think this could be done something like
    ExecutionOutputLink
        GroundedSchemaNode "run_OCPlanner"
        ListLink
            ConceptNode "Eva"
            ConceptNode "Eva_room"
            ListLink
                NumberNode $xcoord_start
                NumberNode $ycoord_start
                NumberNode $zcoord_start
            ListLink
                NumberNode $xcoord_end
                NumberNode $ycoord_end
                NumberNode $zcoord_end
The above command would cause the OCPlanner to get invoked, for the body "Eva" relative to the map "Eva_room". It would ask the OCPlanner to find a plan going from the given start coordinates to the given end coordinates.
The output of the above command would be a plan, learned by the OCPlanner. The plan would be represented in the form of a set of Atoms, which could be wrapped in a SequentialANDLink in the manner discussed on the Behavior tree page.
2018 Status: The OCPlanner code was deleted from github in the summer of 2015, as part of the general obsolescence of the Novamente pet-avatar codebase. Reviving it in its naive conception is probably a bad idea. There are several problems: polluting the atomspace with transient atoms is a bad idea (repeatedly mentioned on this page); use values instead. In addition, there are now some fairly sophisticated off-the-shelf robot-motion planners that would seem to provide a better short/medium term solution.
Once all the above is done in the context of the Eva robot head, the same framework can of course be extended to other cases.
Utilizing the same framework for the modded RoboSapien that Mandeep has designed should not be difficult, as this robot is already being controlled using ROS. The set of actions and perceptions is a bit different but the basic logic of interaction should be the same. The 3DSpaceMap will be more heavily used since the RoboSapien moves around whereas the Eva head does not.
2018 Status: Unknown. There was a RoboSoccer championship in Addis Ababa, Ethiopia, in 2017, with three university teams using modded RoboSapiens. I'm not sure what the code base for that was.
Re-integrating the Unity3D World
To integrate the OpenCog Unity3D world directly into this framework, we would need to make a ROS proxy for Unity3D. It's not yet 100% obvious to me whether this is a good idea, but I don't see any particular problems with the concept.
If we don't want to use ROS in this context it would be best to replace the XML messaging layer used in the current Unity3D world with something lighter-weight, perhaps ZMQ/ProtoBuf. This wouldn't really be a lot of work on the OpenCog side (GSNs can send messages using whatever protocol they want).
Most of the work will be on the Unity3D side in any case. The new Embodiment will make customizing perception and action processing on the OpenCog side for new worlds much easier than is currently the case.
2018 Status: The Unity3D code was removed from github in the summer of 2015, as part of the general cleanup of the various repos.