Type constructor

From OpenCog
Jump to: navigation, search

The AtomSpace is equipped with several formally-defined type constructors which can be used to create algebraic data types. These are SignatureLink, ArrowLink, TypeChoice and TypeNode. In addition to these, almost all other Link types can be implicitly used as type constructors, simply by using them underneath one of the first three. Roughly speaking, most link types are "duck typed" into type constructors when they appear under one of these.

So, for example, all Atom types are the primitive data types of the AtomSpace. In order to use one of these primitive types, to specify the type of a variable or the type of an atom, one uses the TypeNode, and simply gives the name of the primitive type, as a string. Thus, the TypeNode is like a type constructor, but it can only create primitive types.

The ArrowLink can be used to define function types -- that is, functions that accept some sort of type as input, and yeild some other type as output.

To define a product type, one uses a SignatureLink together with a ListLink/AndLink. An AndLink all by itself functions as a normal set-theory boolean-and. An AndLink that appears underneath a SignatureLink should be understood as defining a product of types. Since almost any link can appear under the SignatureLink, there appears to be a very rich variety of product-like or composition-like types available on the AtomSpace, having various properties. Note, however, for function arguments, it is customary to use the ListLink instead of AndLink for the product type.

The dual to the product type is the sum type; it is explicitly given by the TypeChoiceLink. In principle, one might have simply considered using a plain OrLink under a SignatureLink to accomplish this task. However, that would be more verbose that TypeChoice. In addition, the OrLink has the taint of first-order logic; it is often more convenient to approach problems in OpenCog from the intuitionist logic viewpoint, thus eschewing "logical-or" for "choice".

At this time, there is nothing that corresponds to a quotient type or setoid because there is no way of specifying an equivalence between types.

At this time, there is no support for type variables, and thus no support for parametric types, and thus, no support for generalized algebraic types (GADTs).

Newly constructed types can be given a name, by placing a string name in a DefinedTypeNode, and using DefineLink to join the name to the definition.

Utility

The above-described type system was not implemented for some sort of woolly theoretical feel-good reasons; rather, it was motivated by the need to solve real problems: for example, to narrow the search that the pattern matcher performs, so that, if you are searching for words, you don't get sentences or concepts or parts of speech back in your query results. The arrow type, and the composability of types is needed to control the stringing-together of rules by the backward chainer and the forward chainer in the unified rule engine. Indeed, the composition of arrow-signatures during chaining closely resembles dependency parsing.

Type constructors are tested in several places; most importantly in /tests/query/DeepTypeUTest.cxxtest.

Examples, scheme interfaces

Examples can be found in /examples/pattern-matcher/type-signature.scm.

The scheme function cog-value-is-type? returns true or false depending on whether the indicated value is compatible with the given type.

The scheme function cog-type-match? returns true or false, to indicate if the left-hand type can be mated with the right-hand type. If the right-hand side is a plain value, it behaves identically to cog-value-is-type?. It can be used, for example to determine if its OK to pass an argument of some given type to a function call that only accepts certain types. In particular, it is useful for determining if two rules can be chained together.

The scheme function cog-type-compose computes the composition of two types. For example, given two functions or two rules, it can be used to compute the function type of the two functions composed together. It will always return a non-empty type, the smallest type satisfying both left and right sides, whenever cog-type-match? returns true. It will always return the empty set (undefined handle) whenever cog-type-match? returns false. (currently implementation is incomplete; not hard, just incomplete.).

The C++ interface is here: /opencog/atomutils/TypeUtils.h

Type inheritance

The OpenCogPrime:TypeInheritance page discusses type inheritance; it has not been implemented yet. It is not clear to me what this is useful for, what we should use it for.

Type variables, type classes

At this time, there is no coherent notion of type variables or type classes or concepts in Atomese.

These concepts might be useful to solve one problem: improving the performance of the pattern matcher for extremely simple searches. The pattern matcher was designed to support very complicated pattern searches, as needed for natural language processing; as a result, various simple searches, such as those needed for PLN and the unified rule engine run quite slowly: most of the pattern matcher infrastructure is not needed or used. It would be nice if, for certain "simple" queries, the pattern matcher could be overloaded (i.e. BindLink and GetLink overloaded) with simpler/faster implementations.

Such overloading requires that the arguments to GetLink, BindLink be identified and type-checked: if the arguments are of a certain simple type, then a simplified search could be conducted. There are two ways to do this. The abstract way is to provide type class support, and then overload GetLink by (for example) "FastGetLink" when that typeclass is seen.

The more realistic way to actually implement this is "under the covers", in the C++ code for the pattern matcher, so that, whenever a particularly simple GetLink is spotted, then equally simple search code is used. It is possible that this could maybe dramatically improve PLN and URE performance!?