SetLink

From OpenCog
Jump to: navigation, search

The SetLink is a type of UnorderedLink used to group its arguments into a set.

For example:

 SetLink
    ConceptNode "x"
    ConceptNode "y"
    ConceptNode "z" 

simply indicates that there is a set {x,y,z}.

Note that, in general, most users are encouraged to use the MemberLink instead of the SetLink. There are some fundamental data representation issues that make the SetLink undesirable and inept at its job. See discussion below, at the end.

Naming Sets

Sets can be given a name, using the EquivalenceLink or the DefineLink:

 EquivalenceLink
    ConceptNode "last three letters of the alphabet"
    SetLink
       ConceptNode "x"
       ConceptNode "y"
       ConceptNode "z"

The EquivalenceLink is used to make declarative statements; the DefineLink is used to make imperative statements; see Link comparison for a discussion of the differences.

Subsets

The SubsetLink can be used to indicate subsets:

 SubsetLink
    SetLink
       ConceptNode "x"
       ConceptNode "y"
    SetLink
       ConceptNode "x"
       ConceptNode "y"
       ConceptNode "z"

which states that {x,y} is a subset of {x,y,z}. If the set is named, then one can write:

 SubsetLink
    SetLink
       ConceptNode "x"
       ConceptNode "y"
    ConceptNode "last three letters of the alphabet"

Set membership

Sets can also be constructed by specifying individual members:

  MemberLink
     ConceptNode "x"
     ConceptNode "last three letters of the alphabet"
  MemberLink
     ConceptNode "y"
     ConceptNode "last three letters of the alphabet"
  MemberLink
     ConceptNode "z"
     ConceptNode "last three letters of the alphabet"

Note, however, it is not exactly equivalent: each MemberLink could have a distinct truth value indicating the degree to which the letters x,y,z, are members. Neither the SetLink nor the SimilarityLink allow this to be expressed. A clearer example is provided by the following:

  MemberLink  <1.0>
     ConceptNode "z"
     ConceptNode "last letters of the alphabet"
  MemberLink  <0.9>
     ConceptNode "w"
     ConceptNode "last letters of the alphabet"
  MemberLink  <0.8>
     ConceptNode "s"
     ConceptNode "last letters of the alphabet"
  MemberLink  <0.2>
     ConceptNode "m"
     ConceptNode "last letters of the alphabet"

which suggests that perhaps its a poor idea to group the letter "m" among the set of the last letters of the alphabet. There is no particularly good way to express this fuzzy-membership notion with a simple SetLink, although one could have that

  SetLink <0.4>
     ConceptNode "z"
     ConceptNode "w"
     ConceptNode "s"
     ConceptNode "m"

is a way of weakly grouping together m,s,w and z into a somewhat ephemeral group, with strength 0.4. However, this cannot indicate that perhaps some of the elements are stronger members than others.

Data Representation Issues

The SetLink can be terribly awkward for certain kinds of data representation problems. The MemberLink can overcome most of these issues. The issues are these:

  • The SetLink, just like any link, is immutable: it cannot be modified, it can only be created or destroyed. That means that set members can not be either added nor removed, except by destroying the SetLink, and creating a new one. If some other link includes the SetLink in it's outgoing set, then the Setlink cannot even be destroyed, until every atom that is holding it is destroyed first. This makes it hard, if not impossible, to alter set membership.
  • The SetLink link is difficult to search over (to pattern match over). It is difficult to write a pattern that says "find me all sets that have A,B,C as members, but not D,E or F." This can be done partly by employing the GlobNode, but checking for the non-inclusion of D,E,F is hard: maybe this can be done with type constraints on the GlobNode, and maybe it can be done by using the AbsentLink as a post-condition in the search. Computationally, this is inefficient, since the SetLink is a kind of UnorderedLink, and globbing then requires examining many (all?) alternative permutations of the set contents.
  • The current implementation of the SQL persistent storage has a hard-coded maximum of 330 atoms in the outgoing set. Fixing this requires a major redesign; fixing this would still not fix the other two issues mentioned above.

Knowledge representation systems can overcome all of these limitations by using MemberLink instead of SetLink.

The above comments also apply to other link types that behave link SetLink, such as AndLink and OrLink. Currently, we do not have have any special equivalents to MemberLink indicating And, Or membership relations.

Procedural vs. Declarative

There are several ways to avoid the above issues, besides using MemberLink. One very wide-spread and conventional way of doing so is to create SetLinks "on the fly", procedurally, by using procedural devices like GetLink and PutLink, to create the needed SetLinks on the fly, when they are invoked. These can then be used once or twice, and then freed when no longer needed.

Thus, the non-modifiability of SetLink is only a problem when it is used in declarative contexts: e.g. when, as the result of sensory input, or as the result of reasoning, that we have obtained the "set of all x", declare it to be true by inserting it into the atomspace, and later discover, to our chagrin, that the "set of all x" is incomplete or wrong. Dissolving it and reforming it becomes a chore, and worse: it fundamentally cannot be dissolved if some other link is holding a reference to it!

We can conclude: SetLink is OK in procedural contexts, but is wholly inadequate in declarative contexts. (See atom relationships for a general review of procedural vs. declarative usage in the AtomSpace).

See also