EmbodimentNotes (2005 Novamente)
THIS PAGE IS OBSOLETE
This document describes a form of embodiment for the Novamente Virtual Pet, circa 2004-2008. It is maintained here as a historical record of things that used to be, but are no more.
Virtual pet, Novamente Cognition Engine
This document summarizes some ideas about how to create an interesting "virtual pet" for the Embodiment project, controlled by the Novamente Cognition Engine.
These notes are highly preliminary and subject to potential massive revision. They are outlined here as part of a collaborative brainstorming process that will eventually (hopefully soon) result in a coherent "vision document" to guide development.
Basic Functionality Envisioned
The basic functionality envisioned, from an end-user perspective, is: a virtual pet that will
- "live" in a body in a sim world
- be initially controlled by scripted rules and goals
- have user-adjustable goals
- be able to learn various simple behaviors
  - via reinforcement learning (where the partial reward signals are given by a human-controlled teacher avatar)
  - by imitation of the teacher
  - possibly, by having the teacher take over its body and show it what to do
  - by spontaneous exploratory learning
- recognize the teacher and other familiar faces
- be able to navigate and manipulate things adeptly in its virtual world
- have an "individual personality" that is not entirely predictable, but is highly influenced by its experiences and its teacher's/master's actions
Body-wise: As a working assumption, let's assume a quadruped with a head and a tail (turtle, dog, bunny) and no terrifying amounts of flexibility.... Monkeys or birds would lead to other complications, which may be dealt with if we need to.
General NCE Considerations
Compared to what we are doing now, where the NCE learns to play fetch and carry out other simple behaviors in the AGISim simulation worlds, a couple of major differences arise with Embodiment:
- We want one NCE server to be able to control a bunch of pets
- We want to be able to supply each pet with an initial set of behavioral rules and goals
The work we are doing with the NCE right now involves a single NCE server controlling a single embodied agent carrying out spontaneous learning and teacher-guided reinforcement learning, based on simple initial goals and no initial behavioral rules.
Note that the behavioral rules supplied to a pet are supposed to be mutable based on the pet's experiential learning.
The rules should be expressible in Novamente's internal node and link format, which is currently achievable via coding them in XML. However, coding things in XML is obviously awkward. As we will want to experiment with variations on the pets' rules, it's going to be important to be able to code rules in something more closely resembling a scripting language than XML.
We used to have a scripting language called Sasha that compiled into Novamente nodes and links, but it is way obsolete. We will want to resurrect or replace it for this project.
Potentially, the pet's goal system may be conceptualized in terms of Maslow's hierarchy of needs.
Note: The Goal system was initially implemented but was replaced by another action selection system based on Rules (see Rule Engine for more details).
Also, we can program in a number of canned subgoals, each one a subgoal of some particular Maslow-ian goal (each one thus living at a specific level in Maslow's hierarchy), e.g.:
- Seek novelty
- Learn stuff
- Please the master
- Please others besides the master
- Increase physical pleasure
- Avoid physical pain
[Physical pain and pleasure may be coded in, i.e. we can make a rule that getting kicked causes pain, getting petted causes pleasure, etc.]
The user may then be able to explicitly adjust the weights of each of these goals, as they are combined to form an overall Satisfaction supergoal.
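As a rough illustration of how user-adjustable weights might combine the subgoals listed above into a single Satisfaction value, here is a minimal sketch. The weighted-average rule, the function name `satisfaction`, and the goal keys are all assumptions made for illustration; the notes do not specify how the NCE actually combines goals.

```python
from typing import Dict


def satisfaction(levels: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-subgoal satisfaction levels (0..1) into one score (0..1).

    Assumption: a plain weighted average. The real combination rule is not
    given in these notes.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one goal weight must be positive")
    # A subgoal with no reported level counts as completely unsatisfied.
    weighted = sum(weights[goal] * levels.get(goal, 0.0) for goal in weights)
    return weighted / total_weight


# Hypothetical weights for a novelty-loving puppy, as a user might set them.
puppy_weights = {"seek_novelty": 3.0, "please_master": 1.0, "avoid_pain": 1.0}
current_levels = {"seek_novelty": 0.2, "please_master": 0.9, "avoid_pain": 1.0}
puppy_score = satisfaction(current_levels, puppy_weights)
```

With these numbers the bored puppy scores only about 0.5 even though its master is pleased and it feels no pain, because novelty carries most of the weight.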
A pet's goals will not change over its lifetime as a result of learning. We may want to program the pet's goals to change over its lifetime gradually as it ages (e.g. puppies are more interested in novelty and learning than dogs), but this isn't certain yet.
Goals may usefully be scripted in the same way as rules, but this is less critical as there will be less experimentation with goal internals than with rule internals, and less variation among organisms in terms of the subgoals (because the inter-organism variation in goals can be done via weight-adjustment).
Each organism will need its own Active Schema Pool, holding the schemata intended to fulfill the organism's Satisfaction supergoal in the current context.
Each organism's private knowledge (declarative and procedural) may be treated as a Novamente.Context (in NM and PLN terms), so that the knowledge of all the pets may be stored in Novamente in a common AtomTable.
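To make the "one store, many contexts" idea concrete, here is a toy sketch in which every fact is keyed by the context it belongs to. This is not the AtomTable API: `ContextStore`, `COLLECTIVE`, and the `use_collective` flag are hypothetical names invented for this illustration, and strengths are plain floats rather than Novamente truth values.

```python
from typing import Dict, Optional, Tuple

COLLECTIVE = "collective"  # hypothetical name for the shared context


class ContextStore:
    """Toy stand-in for a common store holding every pet's private knowledge."""

    def __init__(self) -> None:
        # (context, statement) -> strength in 0..1
        self._facts: Dict[Tuple[str, str], float] = {}

    def add(self, context: str, statement: str, strength: float) -> None:
        self._facts[(context, statement)] = strength

    def lookup(self, context: str, statement: str,
               use_collective: bool = False) -> Optional[float]:
        """Return the strength seen from one context, or None if unknown."""
        key = (context, statement)
        if key in self._facts:
            return self._facts[key]
        if use_collective:
            return self._facts.get((COLLECTIVE, statement))
        return None


store = ContextStore()
store.add("pet_rex", "balls are fun", 0.9)
store.add("pet_fido", "balls are fun", 0.2)
store.add(COLLECTIVE, "kicks hurt", 0.95)
```

Two pets can hold conflicting beliefs about the same statement without interfering, and a pet sees collective knowledge only when the caller asks for it.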
Transfer of knowledge between individual and collective context is a subtle issue to be discussed in a separate section of this document.
Distribution of Collective Knowledge and Cognition
Assuming there are multiple servers controlling pets (each one controlling multiple pets), it seems most sensible for collective knowledge to be stored in a centralized way in a MindDB.
One may then envision a centralized CollectiveKnowledgeMiner server, whose job it is to grab the best knowledge out of all the different agents' individual contexts, and revise it into collective knowledge to store in the MindDB.
We may also want a cognitive engine (a set of servers) devoted only to improving on the collective knowledge store. This would grab the collective knowledge from the MindDB, reason about it, then put the results back....
I assume our current NM scheduler is not capable of scheduling real-time actions among a bunch of simultaneously acting pets, as it has been designed for the one-machine/one-agent scenario.
It will need redesign and revision.
Transfer of Knowledge Between Collective and Individual Mindspaces
Moving Knowledge from Individual to Collective Mindspace
A process should exist to take knowledge out of individual contexts and turn it into collective knowledge.
This is fairly simple: generically speaking, we want to transfer the knowledge that is most confident, most general, and most compact.
Revision will play a major role here: collective knowledge should be the revised version of relevant individual knowledge. So, what we really want is to transfer individual knowledge in such a way that the ultimately obtained collective knowledge (after revision of various individual versions) will be maximally confident, general and compact.
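A minimal sketch of what revising several pets' versions of the same piece of knowledge could look like. It assumes each version carries a strength and an evidence count, that the pets' evidence is independent, and that confidence is computed as n / (n + k) with an arbitrary constant k. The names `TruthValue`, `revise`, and `confidence` are illustrative and are not taken from the Novamente code.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class TruthValue:
    strength: float  # estimated probability, 0..1
    count: float     # amount of evidence behind the estimate


def revise(versions: Iterable[TruthValue]) -> TruthValue:
    """Merge individual versions into one collective version.

    Assumption: evidence is independent, so counts add up and the strength
    is the count-weighted average of the individual strengths.
    """
    versions = list(versions)
    total = sum(v.count for v in versions)
    if total <= 0:
        raise ValueError("need at least one version with positive evidence")
    strength = sum(v.strength * v.count for v in versions) / total
    return TruthValue(strength, total)


def confidence(tv: TruthValue, k: float = 10.0) -> float:
    """Map an evidence count to a 0..1 confidence; k is an assumed constant."""
    return tv.count / (tv.count + k)


# Two pets' versions of the same piece of knowledge.
rex_version = TruthValue(strength=0.8, count=10)
fido_version = TruthValue(strength=0.4, count=30)
collective = revise([rex_version, fido_version])
```

Here the collective strength comes out near 0.5 with a count of 40, so the merged version is more confident than either individual one, which is the property the transfer process would want to maximize.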
An individual should also be able to query others' individual contexts to see if there is useful knowledge there -- and then request that this knowledge be made collective. So, for instance, if my pet encounters Idi Amin, and wants to know how to please Idi Amin, it may fish for this knowledge in the private contextual knowledge of Idi Amin's pet.
Individual Utilization of Collective Knowledge
The question arises: How avidly should each individual pet utilize collective knowledge?
One might argue that maximal intelligence will come from all pets utilizing all available knowledge. Now, this isn't necessarily true, because different pets utilizing different knowledge stores could serve as a sort of "island model" for learning, allowing different pets to explore different parts of knowledge space, sharing only their best discoveries with each other. But even if it IS true, from an end-user perspective, we may also want to ensure that different pets have different personalities, and this impression may be destroyed if each pet has immediate access to all knowledge gained by all other pets.
I can think of various amusing ways to explicitly expose the "collective knowledge infusion" process within a sim world. For instance, we could get psychedelic, and create special objects (say, flowers ... perhaps mushrooms would be a bit too explicit ;-> ... although they certainly feature prominently in Mario games!!) so that when the pet consumes the object, it gets an infusion of collective knowledge relevant to its recent experience....
Or of course one could make it a life-cycle thing: once the pets get old enough they start to tap into the collective knowledge.
Or one could explicitly let users toggle how much collective knowledge their pets get (though this is likely a bad idea).
Or one could introduce collective knowledge automatically, in a limited way, via some adaptive rule intended to prevent the pets from seeming too incredibly stupid....
What AI functionalities should the early-stage virtual pets display?
There should be an in-built navigation routine, so that if a pet wants to go to location A, it can chart a reasonable path to get there.
In principle, learning of navigation is an interesting and worthwhile AI problem. But for the near term, for practical purposes, we should just program it in like everyone else. (Eric Baum and some others believe this is wired into the human brain, btw).
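For a sense of what "just program it in" might amount to, here is a minimal sketch of a built-in path planner. It assumes the world has already been flattened into a small 2D grid of free and blocked cells, which is a simplification of a real sim world, and it uses plain breadth-first search; a production system would more likely use A* over a navigation mesh. The function name `find_path` and the grid encoding are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column)


def find_path(grid: List[str], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Shortest 4-connected path from start to goal; '#' marks an obstacle."""
    rows, cols = len(grid), len(grid[0])
    came_from: Dict[Cell, Optional[Cell]] = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path: List[Cell] = []
            step: Optional[Cell] = cell
            while step is not None:
                path.append(step)
                step = came_from[step]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and nxt not in came_from:
                came_from[nxt] = cell
                queue.append(nxt)
    return None  # goal unreachable


# A wall with a gap at the bottom forces the pet to walk around it.
yard = [".#.",
        ".#.",
        "..."]
route = find_path(yard, (0, 0), (0, 2))
```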
Pets should be able to identify their owner using the owner's ID (in Second Life that should be trivial, since each user has an ID). They should also be able to identify their owner's friends (via a friends list, or via who their owner has interacted with). They should have an initial rule that gives them a differential preference to interact with their owner first, then with the owner's friends, etc.
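The initial preference rule could be as simple as sorting nearby avatars into tiers. The sketch below assumes avatars are identified by string IDs and that the friends list is available as a set; `interaction_priority` and `rank_nearby` are hypothetical helper names, not part of any Second Life or Novamente API.

```python
from typing import Iterable, List, Set


def interaction_priority(avatar_id: str, owner_id: str,
                         friend_ids: Set[str]) -> int:
    """Lower number = stronger initial preference to interact."""
    if avatar_id == owner_id:
        return 0
    if avatar_id in friend_ids:
        return 1
    return 2  # everyone else


def rank_nearby(nearby: Iterable[str], owner_id: str,
                friend_ids: Set[str]) -> List[str]:
    """Order nearby avatars by preference; ties keep their original order."""
    return sorted(
        nearby,
        key=lambda avatar: interaction_priority(avatar, owner_id, friend_ids),
    )


ranked = rank_nearby(
    ["stranger", "friend_b", "owner", "friend_a"],
    owner_id="owner",
    friend_ids={"friend_a", "friend_b"},
)
```

Since this is only an initial rule, a learning system would be free to reorder the tiers as the pet gains experience.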
But beyond that, pets should also be able to identify visual characteristics of players, so that they should display an initial differential preference to interact with folks who LOOK LIKE their owner or owner's friends.
Later on, if the pet comes to hate person X, it should also by default have an aversion to others who associate with X or who look like X ;-)
We should be able to, initially, just implement one of the existing standard approaches for similarity matching between 3D models. This is not trivial stuff, but doesn't require any original science.
Learning Stupid Pet Tricks
This is the place where Novamente's AI code has the chance to be useful. There are really several different things here:
- Learning tricks based on a fitness function consisting of precise (or noisy) reinforcement
- Learning tricks based on imitation
- Learning tricks based on a combination of reinforcement and imitation
- Generalizing tricks already learned
Defining Stupid Pet Tricks
We can define a "stupid pet trick" as follows.
Assume the pet is in the presence of a small set of objects (e.g. some blocks and balls), a small set of agents (some may be humans, some may be other pets), and a small set of predicates, e.g.
- pick up
- put down
- place on top of
- go to
- roll over
Then, a trick can be defined as a certain set of predicates that an action sequence must fulfill.
We don't want to define a trick as an exact action sequence since every pet may carry out a trick a little differently.
So, for instance, for one trick (involving three blocks and no other agents) we might want to specify that, somehow, the trick needs to result in
- first, block 3 is on top of block 2 which is on top of block 1
- then, the pet knocks all the blocks down
all occurring within T time cycles.
There may be many ways for an agent to achieve this, obviously....
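The block-stacking example can be sketched as a check over the sequence of observed world states rather than over the pet's exact actions. The state encoding (each block mapped to whatever it rests on), the predicate names, and `performs_trick` are assumptions made for this illustration. Note that the sketch only checks that the blocks end up down, not who knocked them over.

```python
from typing import Callable, Dict, Sequence

State = Dict[str, str]  # block name -> what it currently rests on
Predicate = Callable[[State], bool]


def stacked(state: State) -> bool:
    """Block 3 is on block 2, which is on block 1, which is on the floor."""
    return (state["block1"] == "floor"
            and state["block2"] == "block1"
            and state["block3"] == "block2")


def all_down(state: State) -> bool:
    """Every block rests directly on the floor."""
    return all(support == "floor" for support in state.values())


def performs_trick(trace: Sequence[State], stages: Sequence[Predicate],
                   max_cycles: int) -> bool:
    """True if the stages are satisfied in order within max_cycles states."""
    stage = 0
    for state in trace[:max_cycles]:
        if stage < len(stages) and stages[stage](state):
            stage += 1
    return stage == len(stages)


floor = {"block1": "floor", "block2": "floor", "block3": "floor"}
trace = [
    floor,                                                           # start
    {"block1": "floor", "block2": "block1", "block3": "floor"},     # 2 on 1
    {"block1": "floor", "block2": "block1", "block3": "block2"},    # 3 on 2
    floor,                                                           # knocked down
]
trick_done = performs_trick(trace, [stacked, all_down], max_cycles=4)
```

Any action sequence that produces a matching trace counts as the trick, which is what allows each pet to perform it a little differently.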
Reinforcement Learning of Stupid Pet Tricks
Given partial reward, many stupid pet tricks should be viably learnable via reinforcement learning using either PLN or MOSES alone, or possibly using simpler learning algorithms.
We can program the pets to understand a command such as "Learn X!" where X is an utterance, say "Learn fetch", "learn juggle", etc.
Then, after the pet has been issued the command "Learn X", it assumes that the commands it receives from its instructor, until it gets the command "Stop Learning X", are part of learning the trick X.
Of course, this may take a little artistry (what happens if the instructor stops teaching for a moment to interact with someone else), but the basic idea should be clear....
The system will then learn to associate a certain behavior pattern with a certain command X.
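The "Learn X" / "Stop Learning X" bracketing amounts to a small state machine over the instructor's utterances. The sketch below assumes utterances arrive as plain strings and simply records what was said during a lesson; the class name `TrickLessonRecorder` and its fields are hypothetical, and the actual learning from the recorded material is left out.

```python
from typing import Dict, List, Optional


class TrickLessonRecorder:
    """Groups instructor commands under the trick currently being taught."""

    def __init__(self) -> None:
        self.current_trick: Optional[str] = None
        self.lessons: Dict[str, List[str]] = {}

    def hear(self, utterance: str) -> None:
        text = utterance.strip().rstrip("!").strip()
        lowered = text.lower()
        if lowered.startswith("stop learning "):
            # Only close the lesson if it names the trick being taught.
            if lowered[len("stop learning "):] == self.current_trick:
                self.current_trick = None
        elif lowered.startswith("learn "):
            self.current_trick = lowered[len("learn "):]
            self.lessons.setdefault(self.current_trick, [])
        elif self.current_trick is not None:
            self.lessons[self.current_trick].append(text)
        # Anything heard outside a lesson is ignored by this recorder.


recorder = TrickLessonRecorder()
for said in ["Learn fetch!", "go to ball", "pick up ball",
             "Stop Learning fetch", "roll over"]:
    recorder.hear(said)
```

After this exchange the recorder holds the two commands heard during the "fetch" lesson, and "roll over" is ignored because no lesson is open.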
Imitative learning of stupid pet tricks
We can also program the pets to understand a command such as "Copy X!" where X is an utterance, say "Copy fetch", "Copy juggle", etc.
Then, after the pet has been issued the command "Copy X", it assumes that the commands it receives from its instructor, until it gets the command "Stop Copying X", are part of learning the trick X by imitation.
We can make some code that solves the math problem of mapping the observed behaviors of the teacher into the first-person perspective, and can hard-code heuristics mapping the teacher's body parts onto the student's body parts. So, imitative learning should be basically hard-codeable....
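The perspective-mapping part is ordinary coordinate geometry. As a minimal sketch, assume a flat 2D world in which each agent has a position and a heading angle in radians, measured counterclockwise from the +x axis. The function names `to_agent_frame` and `from_agent_frame` are hypothetical, and body-part mapping is not covered here.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def to_agent_frame(point: Vec2, agent_pos: Vec2, heading: float) -> Vec2:
    """World point -> (forward, left) offsets as seen by the agent."""
    dx = point[0] - agent_pos[0]
    dy = point[1] - agent_pos[1]
    cos_h, sin_h = math.cos(heading), math.sin(heading)
    forward = dx * cos_h + dy * sin_h
    left = -dx * sin_h + dy * cos_h
    return forward, left


def from_agent_frame(local: Vec2, agent_pos: Vec2, heading: float) -> Vec2:
    """(forward, left) offsets as seen by the agent -> world point."""
    forward, left = local
    cos_h, sin_h = math.cos(heading), math.sin(heading)
    x = agent_pos[0] + forward * cos_h - left * sin_h
    y = agent_pos[1] + forward * sin_h + left * cos_h
    return x, y


# The teacher stands at the origin facing +y and walks to a ball at (0, 2).
target_seen_by_teacher = to_agent_frame((0.0, 2.0), (0.0, 0.0), math.pi / 2)

# The student stands at (5, 5) facing +x and imitates "go two units forward".
student_target = from_agent_frame(target_seen_by_teacher, (5.0, 5.0), 0.0)
```

The teacher's movement becomes "two units straight ahead", which the student then replays from its own position and heading, ending up near (7, 5).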
Merging imitative and reinforcement learning
Merging the results of the two kinds of learning shouldn't require any special mechanisms; it should be taken care of via inference in the AtomTable.
A simple instance of trick generalization, suggested for initial focus, is: Once the agent knows a trick T that involves a certain object set, you give it a different object set and ask it to do the trick T. Then you reward it based on how well it seems to have fulfilled the task of mapping the old trick to the new object set.
This may be automated in cases where one can program in some predicates describing what a "correct generalization" should look like. Of course in many cases there may be multiple correct generalizations....
For instance, in the above example trick, the correct generalization to 4 blocks instead of 3 is pretty obvious....
-- Main.BenGoertzel - 30 Apr 2007