Type constructor

From OpenCog
(Redirected from Type system)

The AtomSpace is equipped with several formally-defined type constructors which can be used to create algebraic data types. These are SignatureLink, ArrowLink, TypeChoice, TypeNode and TypeSetLink. In addition to these, almost all other Link types can be implicitly used as type constructors, simply by using them underneath one of the first three. Roughly speaking, most link types are "duck typed" into type constructors when they appear under one of these.

So, for example, all Atom types are the primitive data types of the AtomSpace. In order to use one of these primitive types, to specify the type of a variable or the type of an atom, one uses the TypeNode, and simply gives the name of the primitive type, as a string. Thus, the TypeNode is like a type constructor, but it can only create primitive types.

The ArrowLink can be used to define function types -- that is, functions that accept some sort of type as input, and yield some other type as output.

To define a product type, one uses a SignatureLink together with a ListLink/AndLink. An AndLink all by itself functions as a normal set-theory boolean-and. An AndLink that appears underneath a SignatureLink should be understood as defining a product of types. Since almost any link can appear under the SignatureLink, there appears to be a very rich variety of product-like or composition-like types available on the AtomSpace, having various properties. Note, however, for function arguments, it is customary to use the ListLink instead of AndLink for the product type.

A related concept is implemented with the TypeSetLink, which is used to specify multiple type properties that must hold simultaneously.

The dual to the product type is the sum type; it is explicitly given by the TypeChoiceLink. In principle, one might have simply considered using a plain OrLink under a SignatureLink to accomplish this task. However, that would be more verbose than TypeChoice. In addition, the OrLink has the taint of first-order logic; it is often more convenient to approach problems in OpenCog from the intuitionist logic viewpoint, thus eschewing "logical-or" for "choice".

At this time, there is nothing that corresponds to a quotient type or setoid because there is no way of specifying an equivalence between types.

At this time, there is no support for type variables, and thus no support for parametric types, and thus, no support for generalized algebraic types (GADTs), dependent types or type families. There has been some experimental prototyping in this direction, but it has remained inconclusive due to a lack of user demand. Please ask for it, if you need it.

Note, however, there is a system resembling dependent types, called Connectors and Sections. It generalizes the idea of lambdas, function types and tensor types into a more general sheaf-theoretic framework. Individual connectors can be viewed as a kind of dependent type.

Newly constructed types can be given a name, by placing a string name in a DefinedTypeNode, and using DefineLink to join the name to the definition.


The above-described type system was not implemented for some sort of woolly theoretical feel-good reasons; rather, it was motivated by the need to solve real problems: for example, to narrow the search that the query engine performs, so that, if you are searching for words, you don't get sentences or concepts or parts of speech back in your query results. The arrow type, and the composability of types is needed to control the stringing-together of rules by the backward chainer and the forward chainer in the unified rule engine. Indeed, the composition of arrow-signatures during chaining closely resembles dependency parsing.

Type constructors are tested in several places; most importantly in /tests/query/DeepTypeUTest.cxxtest.

Examples, scheme interfaces

Examples can be found in /examples/pattern-matcher/type-signature.scm.

The scheme function cog-value-is-type? returns true or false depending on whether the indicated value is compatible with the given type.

The scheme function cog-type-match? returns true or false, to indicate if the left-hand type can be mated with the right-hand type. If the right-hand side is a plain value, it behaves identically to cog-value-is-type?. It can be used, for example to determine if its OK to pass an argument of some given type to a function call that only accepts certain types. In particular, it is useful for determining if two rules can be chained together.

The scheme function cog-type-compose computes the composition of two types. For example, given two functions or two rules, it can be used to compute the function type of the two functions composed together. It will always return a non-empty type, the smallest type satisfying both left and right sides, whenever cog-type-match? returns true. It will always return the empty set (undefined handle) whenever cog-type-match? returns false. (currently implementation is incomplete; not hard, just incomplete.).

The C++ interface is here: /opencog/atomutils/TypeUtils.h

Type variables, type classes

At this time, there is no coherent notion of type variables or type classes or concepts in Atomese. Thus, we have no "parametric types" and no "Generalized Algebraic Data Types" (GADTs).

These concepts might be useful to solve one problem: improving the performance of the query engine for extremely simple searches. The query engine was designed to support very complicated pattern searches, as needed for natural language processing; as a result, various simple searches, such as those needed for PLN and the unified rule engine run quite slowly: most of the query engine infrastructure is not needed or used. It would be nice if, for certain "simple" queries, the query engine could be overloaded (i.e. QueryLink and MeetLink overloaded) with simpler/faster implementations.

Such overloading requires that the arguments to MeetLink, QueryLink be identified and type-checked: if the arguments are of a certain simple type, then a simplified search could be conducted. There are two ways to do this. The abstract way is to provide type class support, and then overload MeetLink by (for example) "FastMeetLink" when that typeclass is seen.

The more realistic way to actually implement this is "under the covers", in the C++ code for the query engine, so that, whenever a particularly simple MeetLink is spotted, then equally simple search code is used. It is possible that this could maybe dramatically improve PLN and URE performance!?