Goals and Time
The OCP system maintains an explicit list of "ubergoals". As a later section explains, these receive attentional currency, which they may then allocate to their subgoals according to a particular mechanism.
It is quite possible to have an OCP system with multiple ubergoals. However, in our work so far we have assumed a single ubergoal named "Satisfaction," so that the ubergoal is: Maximize Satisfaction, as measured by the Satisfaction FeelingNode.
In our practical work so far, the Satisfaction ubergoal is then split into subgoals, each corresponding to a particular subcomponent of Satisfaction:
- in part, by the system's learning processes
- in part, via explicit programmer specification, carried out by defining the Satisfaction FeelingNode's internal evaluator in terms of other complex predicates, such as GainingUnderstanding (a grounded SchemaNode that measures the amount of new learning occurring in the system) or Reward (a FeelingNode corresponding to the receipt of reward from human teachers, e.g. from Reward perceptions coming in from OpenSim).
(Domain-specific applications of the OCP system may involve the creation of application-specific goals, such as creating a particular type of predicate, recognizing patterns in a certain type of biological data, giving humans satisfactory answers to their questions, etc. These may be wired into the basic SatisfactionMaximization goal, and also given a substantial importance on their own.)
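As a rough illustration of the programmer-specified route, the following Python sketch defines a Satisfaction evaluator in terms of component evaluators. It is not OCP code: the function `make_satisfaction_evaluator`, the weighted-average combination rule, the weights, and the constant readings are all assumptions made for the example. Only the names GainingUnderstanding and Reward come from the text.

```python
from typing import Callable, Dict

# An evaluator returns a truth-value-like strength in [0, 1].
Evaluator = Callable[[], float]


def make_satisfaction_evaluator(
    components: Dict[str, Evaluator],
    weights: Dict[str, float],
) -> Evaluator:
    """Build a Satisfaction evaluator from component evaluators.

    Assumption of this sketch: the components are combined by a
    weighted average. The text does not specify the combination rule.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    def evaluate() -> float:
        return sum(weights[name] * components[name]() for name in weights) / total

    return evaluate


# Hypothetical stand-ins for the component predicates (constant readings).
components = {
    "GainingUnderstanding": lambda: 0.8,
    "Reward": lambda: 0.2,
}
# Hypothetical weights chosen by the programmer.
weights = {"GainingUnderstanding": 3.0, "Reward": 1.0}

satisfaction = make_satisfaction_evaluator(components, weights)
satisfaction_value = satisfaction()
print(round(satisfaction_value, 3))
```

Under the same assumptions, an application-specific goal could be wired in by adding one more entry to `components` and `weights`.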
All this is very simple conceptually, but it leaves out one very important factor: time. The truth value of a FeelingNode is averaged over the relevant past, but the time scale of this averaging can be very important. In very many cases, it may be worthwhile to have separate FeelingNodes measuring exactly the same thing, but doing their truth-value time-averaging over different time scales. In fact this is the case for all the elementary FeelingNodes listed above: InternalNovelty, Health, GainingUnderstanding, ObservedUserSatisfaction etc. Corresponding to each of these FeelingNodes as described above, we may posit a short-term version, leading to such things as CurrentInternalNovelty, CurrentHealth, etc. It is possible that even more flexibility than this may be useful, i.e. more than 2 different variants of the same FeelingNode, all with different time scales.
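The effect of the averaging time scale can be seen in a small Python sketch. It assumes, purely for illustration, that a FeelingNode's truth value is an exponentially weighted moving average of raw readings. The class `TimeAveragedFeeling`, the `time_scale` parameter, and the readings are hypothetical and not part of OCP.

```python
from dataclasses import dataclass


@dataclass
class TimeAveragedFeeling:
    """Hypothetical stand-in for a FeelingNode whose truth value is an
    exponentially weighted average of recent raw readings."""
    name: str
    time_scale: float  # assumed unit: update cycles; larger means longer memory
    value: float = 0.0
    initialized: bool = False

    def __post_init__(self) -> None:
        if self.time_scale < 1.0:
            raise ValueError("time_scale must be at least 1 cycle")

    def update(self, reading: float) -> float:
        if not self.initialized:
            # The first reading seeds the average.
            self.value = reading
            self.initialized = True
        else:
            # Move a fraction 1/time_scale of the way toward the new reading.
            self.value += (reading - self.value) / self.time_scale
        return self.value


# Two nodes measuring the same thing over different time scales.
health = TimeAveragedFeeling("Health", time_scale=100.0)
current_health = TimeAveragedFeeling("CurrentHealth", time_scale=2.0)

# Fifty good cycles followed by five bad ones.
readings = [1.0] * 50 + [0.0] * 5
for reading in readings:
    health.update(reading)
    current_health.update(reading)

print(health.name, round(health.value, 3))
print(current_health.name, round(current_health.value, 3))
```

After the five bad cycles the short-term node has dropped close to zero while the long-term node has barely moved, which is the kind of divergence that makes separate short-term and long-term versions useful.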
The most interesting case, however, is CurrentSatisfaction. What is interesting is that it may be specifically valuable not to have CurrentSatisfaction and Satisfaction constructed the same way. The reason is that, if CurrentSatisfaction is different from Satisfaction, then there can be a CurrentSatisfactionMaximization goal, which seeks specifically to maximize the qualities that have been associated with CurrentSatisfaction.
Of course, all this time-dependence could be left for the system to figure out all by itself. In figuring out how to best achieve Satisfaction, the system will create short-term goals, and reason that achieving these short-term goals may be the best way to achieve its long-term goals. The building-in of Feelings and Goals with particular temporal structure is an assertion that this time-dependent nature of Satisfaction is an extremely fundamental aspect of mind that perhaps should not have to be learned on the individual-mind level (though obviously in human history and prehistory it was learned in the evolutionary sense).
Given a multi-time-scale notion of Satisfaction, one valuable method of system control is for the system to purposefully modify the CurrentSatisfaction FeelingNode. That is, it may judge that in order to fulfill its long-term goals, its short-term goals should be such-and-such, and it may then embody those short-term goals in the CurrentSatisfaction FeelingNode.
An example of a Satisfaction control schema of this type would be the following:
ImplicationLink
    AND
        PredictiveImplicationLink $X CurrentSatisfaction
        PredictiveImplicationLink $X (NOT Satisfaction)
    PredictiveImplicationLink $X (NOT CurrentSatisfaction)
The rule says: if X makes you happy in the short run, but achieving X acts against long-term Satisfaction, then revise your CurrentSatisfaction so that X no longer makes you happy in the short run. Needless to say, humans do not have this kind of explicit control over the drivers of their short-time-scale Satisfaction; but there is no reason that an OCP system cannot. This kind of self-modification does not require source code modification, but merely the learning of relatively compact (but fairly abstract) cognitive schemata.
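The rule above can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: link strengths are plain probabilities, NOT is taken as one minus the strength, AND as the minimum of its arguments, and the rule fires above a fixed threshold. None of these choices is claimed to match OCP's actual inference formulas, and `CurrentSatisfactionNode`, `apply_control_schema`, and the activity names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CurrentSatisfactionNode:
    """Hypothetical stand-in for the CurrentSatisfaction FeelingNode: maps
    each activity X to how strongly X drives short-term satisfaction."""
    drivers: Dict[str, float] = field(default_factory=dict)


def apply_control_schema(
    node: CurrentSatisfactionNode,
    predicts_current: Dict[str, float],
    predicts_satisfaction: Dict[str, float],
    threshold: float = 0.5,
) -> List[str]:
    """Revise the node so that activities which please in the short run
    but work against long-term Satisfaction stop being drivers."""
    revised = []
    for x, p_short in predicts_current.items():
        # Strength of "X predicts NOT Satisfaction" (assumed: 1 - strength).
        # Activities with no long-term estimate default to a neutral 0.5.
        p_not_long = 1.0 - predicts_satisfaction.get(x, 0.5)
        # Strength of the AND antecedent (assumed: minimum of the two).
        antecedent = min(p_short, p_not_long)
        if antecedent > threshold:
            # Consequent: X should no longer predict CurrentSatisfaction.
            node.drivers[x] = 0.0
            revised.append(x)
    return revised


node = CurrentSatisfactionNode(drivers={"eat_candy": 0.9, "study": 0.4})
predicts_current = {"eat_candy": 0.9, "study": 0.6}
predicts_satisfaction = {"eat_candy": 0.1, "study": 0.9}

revised = apply_control_schema(node, predicts_current, predicts_satisfaction)
print(revised, node.drivers)
```

Here only the hypothetical `eat_candy` activity satisfies both antecedent links, so only its short-term driver is removed. Nothing in the program's source changes, only the data held by the node.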