OpenCogPrime:AutomaticParameterAdaptation

System Parameters: Control, Adaptation and Optimization

One important property of OCP is that it is highly configurable via adjustment of a large number of parameters.

This page discusses how parameters are handled in OCP, and how the OCF automatically controls them to prevent the problems that badly chosen parameter values can cause.

Some have argued that any AGI system with a large number of parameters, on which the system's behavior potentially depends in a sensitive way, is necessarily doomed to fail. However, this seems an unrealistically idealistic perspective. The human brain, if modeled as a mathematical dynamical system, contains an incredible number of adjustable parameters, and if any of them gets outside its acceptable range, the functionality of the overall system may be dramatically impacted. Most mind-altering drugs operate by making relatively small tweaks to the levels of specific chemicals in the brain, thus inducing subjectively massive alterations in overall system state. Nudging some parameter of some brain-process 10% out of line may cause madness or death. The truth is that living systems rely on multiple quantitative parameters remaining within their acceptable ranges, and contain complex parameter interdependencies that guide overall system functionality in such a way that when one parameter threatens to move too far out of line, the others adjust themselves to mitigate the threat.

It is valuable to minimize the number of parameters in OCP, especially those on which the system's dynamics depend sensitively and those whose direct and obvious impact extends beyond a particular subsystem. In reality, however, this can never be done completely. It is therefore necessary to have systems in place for automated parameter adaptation, so that OCP can behave like a biological system in keeping its own critical parameters within acceptable bounds and, as a next step, can to some extent auto-adjust its own parameters to improve its efficiency at achieving its goals.

Architecturally, parameter optimization is handled in the current OCP design by three MindAgents:

  • HomeostaticParameterAdaptation MindAgent, which adapts parameters according to certain rules that it contains, every system cycle
  • ParameterAdaptationRuleLearning MindAgent, which learns the rules used by the HomeostaticParameterAdaptation MindAgent
  • LongTermParameterAdaptation MindAgent, which optimizes parameters to maximize system-wide goal-achievement, in a slow way based on sophisticated analysis

As of July 2008 none of these are implemented yet.

System Parameters

What are the parameter values that these parameter-adaptive MindAgents deal with? Each CIM-Dynamic has a number of parameters, some of which may vary between different instances of the same dynamic that are active in the same [[OCP]] system at the same time. These system parameters are complicated in nature. Of the parameter set for a CIM-Dynamic type, we may say that:

  • Some are under control of that CIM-Dynamic type, and will be the same for all instances across all Units.
  • Some are under control of the hosting Unit, and will vary across Units but will be the same for all instances in a single Unit.
  • Some are under control of the particular instance.

Parameters may be considered as members of the abstract space Parameter-Space, and may be specified by a quadruple:

(controller, type, range, value)

where

  • controller within { CIM-Dynamic, Unit, instance}
  • type within {Float, Integer, Boolean, String}
  • range = list of values, if type equals (Boolean or String)
  • range = interval, if type equals (Float or Integer)
  • value within range
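The quadruple above can be sketched as a small data structure. This is only an illustrative sketch, not part of the OCP codebase: the names Controller, Parameter, value_type, value_range, in_range and set are all hypothetical, and the Python types float, int, bool and str stand in for Float, Integer, Boolean and String.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence, Tuple, Union


class Controller(Enum):
    """Who is allowed to set the parameter (hypothetical naming)."""
    CIM_DYNAMIC = "CIM-Dynamic"
    UNIT = "Unit"
    INSTANCE = "instance"


Value = Union[float, int, bool, str]


@dataclass
class Parameter:
    """One (controller, type, range, value) quadruple."""
    controller: Controller
    value_type: type  # float, int, bool or str
    # An interval (low, high) for float/int, a list of values for bool/str.
    value_range: Union[Tuple[float, float], Sequence[Value]]
    value: Value

    def __post_init__(self) -> None:
        # Reject an initial value that lies outside the range.
        self.set(self.value)

    def in_range(self, candidate: Value) -> bool:
        if self.value_type in (float, int):
            low, high = self.value_range
            return low <= candidate <= high
        return candidate in self.value_range

    def set(self, candidate: Value) -> None:
        """Set the value, refusing anything outside the valid range."""
        if not self.in_range(candidate):
            raise ValueError(f"{candidate!r} is outside {self.value_range!r}")
        self.value = candidate
```

Validating in set keeps the invariant "value within range" in one place, so any controller that changes a parameter goes through the same check.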

The type and range are the same for a parameter across all Units in an [[OCP]] instance. The value can be set by the controller, inside the valid range. Thus, the parameter set of an [[OCP]] instance with k types of CIM-Dynamics, j Units and n particular instances of those dynamics can be defined as

CIM-Dynamic-Parameter-Set = Union_k Parameter-Space

Unit-Parameter-Set = Union_j Parameter-Space

Instance-Parameter-Set = Union_n Parameter-Space

[[OCP]]-Parameter-Set = CIM-Dynamic-Parameter-Set x Unit-Parameter-Set x Instance-Parameter-Set

Some semi-random, hopefully evocative examples of important parameters for OCP:

  • k, the default personality parameter of the inference engine
  • The rate of importance decay for each kind of Node and Link.
  • The maximum sizes or expected sizes of compound PredicateNodes and SchemaNodes, during learning processes.

Homeostasis

We have seen that the values of the multiple system parameters can influence system behavior very drastically. This influence has two main aspects, which we call intelligence and health.

The influence of parameter values on intelligence can be seen, for example, when incorrect parameters cause the system to make wrong inferences (as with the k parameter mentioned above) or to waste resources (by scheduling its CIM-Dynamics in sub-optimal ways).

The influence of parameter values on system health has more serious consequences. When wrong parameters influence the system's health, it can crash, as we've already mentioned, or it can proceed very slowly due to spending all its resources on unnecessary OCF-level tasks.

Ideally, we would expect an intelligent system to be able to tune its own parameters, for maximum health and intelligence. The former is easier than the latter. This section presents our approach to homeostatic control, the OCF component responsible for the system's health. Automated parameter tuning for increased intelligence is an aspect of intelligent, goal-directed self-modification, discussed in a later chapter.

The health of an [[OCP]] system can be measured by a number of HealthIndicators, which are formally the same as FeelingNodes, but have a different orientation than other FeelingNodes. A HealthIndicator embodies a formula that assesses some aspect of the system's state, and the overall feeling of Health is then a combination of HealthIndicator values. Maximizing Health should be one of the system's conscious goals; therefore the overall Satisfaction FeelingNode that comprises one of the system's ubergoals (top-level goals) should contain Health as a component. However, at least in the early stages of development, the system will not want to rely on its explicit goal-achieving function to ensure Health, and some Health-maintenance rules may be hard-coded into the system, inside MindAgents dealing with "homeostatic control."
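As a rough illustration of a HealthIndicator as "a formula over system state" and of Health as a combination of indicator values, consider the sketch below. The text does not say how the values are combined; the weighted average, the clamping to [0, 1] (with 1 meaning fully healthy), and the names SystemState, formula, weight and overall_health are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical snapshot of measurable system quantities.
SystemState = Dict[str, float]


@dataclass
class HealthIndicator:
    name: str
    formula: Callable[[SystemState], float]  # assumed to aim for [0, 1]
    weight: float = 1.0

    def evaluate(self, state: SystemState) -> float:
        # Clamp so that one runaway formula cannot dominate the total.
        return min(1.0, max(0.0, self.formula(state)))


def overall_health(indicators: List[HealthIndicator], state: SystemState) -> float:
    """Weighted average of indicator values (one possible combination)."""
    total_weight = sum(i.weight for i in indicators)
    return sum(i.weight * i.evaluate(state) for i in indicators) / total_weight
```

A different combination, such as taking the minimum over all indicators, would make Health track the single worst indicator instead of the average.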

Homeostatic control (HC), as a general process, acts by keeping track of the system's health and the past values of each HealthIndicator. HC acts in two modes: long term optimization and firefighting. The two modes are handled by different MindAgents.

Firefighting

Firefighting mode is executed by the HomeostaticParameterAdaptation MindAgent. This MindAgent checks, each system cycle, if any of the system's HealthIndicators have come close to a dangerous value. This can happen when the system is running out of memory, or when the time it takes to answer queries from user applications becomes too long, or when the amount of new knowledge generated by the system's cognitive dynamics drops too low, for example.

In this case, the HC will identify the problematic HealthIndicators, and change the parameters responsible for these indicators. How does it know which changes need to be made? This is a combination of a set of built-in rules with automated mining of past information.

The built-in rules are hard-coded into the system, and are variations of the form:

if HealthIndicator > threshold change Parameter by amount

An example would be: if the system's free memory is below x%, increase i1 by y%. The parameters x and y would then be tuned by experimentation. Multiple rules may be created to address the same problem. The above rule will cause more Atoms to be removed from RAM and frozen. However, it may be that the rate at which new Atoms are being generated is so fast that this wouldn't solve the problem. In that case, the system can activate other rules that can:

  • Increase the rate of importance decay for all Atoms
  • Decrease the rate of Atom creation for some CIM-Dynamics

Therefore, the HC should keep track of the rules it has applied in the recent past to address a given HealthIndicator, so it can use alternatives when it fails to mitigate the risky situation in the first attempt.
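The rule form and the bookkeeping described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the HomeostaticParameterAdaptation MindAgent itself (which was not implemented when this page was written): the class names, the multiplicative factor, and the policy of trying rules in list order and cycling once all have been tried are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    indicator: str                         # HealthIndicator the rule watches
    is_dangerous: Callable[[float], bool]  # threshold test on its value
    parameter: str                         # parameter the rule changes
    factor: float                          # multiplicative change (assumed form)


@dataclass
class Firefighter:
    rules: List[Rule]
    # Rules already applied to each indicator during the current emergency.
    tried: Dict[str, List[str]] = field(default_factory=dict)

    def step(self, indicators: Dict[str, float],
             parameters: Dict[str, float]) -> List[str]:
        """Run one system cycle; return the names of the rules applied."""
        applied = []
        for indicator, value in indicators.items():
            triggered = [r for r in self.rules
                         if r.indicator == indicator and r.is_dangerous(value)]
            if not triggered:
                self.tried.pop(indicator, None)  # danger has passed
                continue
            tried = self.tried.setdefault(indicator, [])
            untried = [r for r in triggered if r.name not in tried]
            if not untried:  # every alternative was used: start over
                tried.clear()
                untried = triggered
            rule = untried[0]
            parameters[rule.parameter] *= rule.factor
            tried.append(rule.name)
            applied.append(rule.name)
        return applied
```

If the same indicator stays dangerous on consecutive cycles, each call to step applies a rule not yet tried for that indicator, which is the "use alternatives" behaviour described above.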

Coding enough rules to fight all possible fires would be a complex task. Not only does one need a large number of rules — a bigger problem is that the correct set of rules may be very difficult to identify. OCP has numerous parameters, and numerous health indicators. Since each parameter can influence multiple health indicators, there is ample space for emergent behavior in the space of HealthIndicators, and changes in the parameters can have results that are very hard to predict. Also, different OCPs with different AtomSpaces may react differently to the same rules.

The solution is to hard-code only very rigorous rules, ones that will definitely prevent the system from crashing, but at a potentially large cost. Other rules have to be learned.

The process by which this learning takes place is the mining of temporal information, which will be discussed in detail later on. Here, a brief outline will suffice. The system keeps track of changes made to its parameters, and measures how they have affected HealthIndicators. This information is stored in a DB that's optimized for the following operation.

When faced with a potentially dangerous situation, HC will look in the DB for situations in which the HealthIndicators had values close to the present ones, which were followed by healthier outlooks. It will then identify the differences in the parameter space, and apply those differences.
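The lookup just described can be sketched as a nearest-neighbour search over past episodes. Several things here are assumptions made for illustration: the Episode record layout, the Euclidean distance between HealthIndicator vectors, a plain linear scan in place of the optimized DB, and the reading of "differences in the parameter space" as the parameter change that preceded the healthier outlook.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Episode:
    indicators: Dict[str, float]     # HealthIndicator values before the change
    params_before: Dict[str, float]
    params_after: Dict[str, float]
    health_before: float
    health_after: float


def distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Euclidean distance between two indicator vectors with the same keys."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))


def suggest_change(history: List[Episode], current: Dict[str, float],
                   max_distance: float) -> Optional[Dict[str, float]]:
    """Parameter deltas from the closest past episode that improved health."""
    candidates = [e for e in history
                  if e.health_after > e.health_before
                  and distance(e.indicators, current) <= max_distance]
    if not candidates:
        return None  # no comparable experience yet
    best = min(candidates, key=lambda e: distance(e.indicators, current))
    return {k: best.params_after[k] - best.params_before[k]
            for k in best.params_before
            if best.params_after[k] != best.params_before[k]}


def apply_change(params: Dict[str, float],
                 delta: Dict[str, float]) -> Dict[str, float]:
    """Return a copy of params with the suggested deltas added."""
    return {k: v + delta.get(k, 0.0) for k, v in params.items()}
```

When suggest_change returns None the system has no similar past situation to draw on, which is where the hard-coded rules would have to take over.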

This is a continuously improving process; the more experience the system has, the better it will behave under these circumstances. A critical mass is necessary, if this method is to work well, and this will be provided under controlled circumstances during the early life of a new OCP, being part of Experiential Learning.

Long term optimization

Complementary to firefighting mode, we have the LongTermParameterAdaptation MindAgent. This is the slow mode in which the HC process is always seeking to improve the system's health, over a very long timescale. It does so by deliberately changing system parameters in a controlled environment. When resources are available, the LTPA MindAgent will allocate one or more Units for this process, where the effects of these changes can be observed without any harm to the other parts of the system. This is a very elementary kind of introspection.

This speculative experimentation, and its careful monitoring, will immediately provide more data for the data mining based firefighter. It will also slowly allow the system to crystallize knowledge about good and bad values for each system parameter. This process will eventually give rise to a more proactive HC function, which will alter the range of valid parameter values according to this long term learning system.
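The idea of experimenting on a parameter and then adjusting its valid range can be sketched as below. This is only a toy illustration: measure_health is a hypothetical callable standing in for "run a sandboxed Unit with this parameter value and report the resulting Health", and uniform random sampling and the min_health cut-off are assumptions rather than anything the design specifies.

```python
import random
from typing import Callable, List, Optional, Tuple


def explore_parameter(measure_health: Callable[[float], float],
                      low: float, high: float, trials: int,
                      rng: random.Random) -> List[Tuple[float, float]]:
    """Try random values in [low, high]; return (value, health) pairs."""
    values = [rng.uniform(low, high) for _ in range(trials)]
    return [(v, measure_health(v)) for v in values]


def narrowed_range(observations: List[Tuple[float, float]],
                   min_health: float) -> Optional[Tuple[float, float]]:
    """Smallest interval covering every value observed to be healthy enough."""
    good = [v for v, health in observations if health >= min_health]
    if not good:
        return None  # nothing acceptable was observed; keep the old range
    return (min(good), max(good))
```

Each (value, health) pair could also be logged as experience for the data-mining firefighter, matching the point that experimentation feeds both processes.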