SetLink

From OpenCog
Jump to: navigation, search

The SetLink is a type of UnorderedLink used to group its arguments into a set.

SetLinks are Bad

Before telling you what a SetLink is, we will tell you: SetLinks are bad! Although they are used in many places in the current implementation, it is only after long hard experience beating us over the head that it's been discovered that SetLinks are bad. Do not use them in new code. Use MemberLinks instead.

Data Representation Issues

The SetLink is terribly awkward for certain kinds of data representation problems. The MemberLink can overcome most of these issues. The issues are these:

  • The SetLink, just like any link, is immutable: it cannot be modified, it can only be created or destroyed. That means that set members can not be either added nor removed, except by destroying the SetLink, and creating a new one. If some other link includes the SetLink in it's outgoing set, then the Setlink cannot even be destroyed, until every atom that is holding it is destroyed first. This makes it hard, if not impossible, to alter set membership.
  • The SetLink link is difficult to search for, to use the pattern matcher to find it. It is difficult to write a pattern that says "find me all sets that have A,B,C as members, but not D,E or F." This can be done partly by employing the GlobNode, but checking for the non-inclusion of D,E,F is hard: maybe this can be done with type constraints on the GlobNode, and maybe it can be done by using the AbsentLink as a post-condition in the search. Computationally, this is inefficient, since the SetLink is a kind of UnorderedLink, and globbing then requires examining many (all?) alternative permutations of the set contents.
  • The current implementation of the SQL persistent storage has a hard-coded maximum of 330 atoms in the outgoing set. Fixing this requires a major redesign; fixing this would still not fix the other two issues mentioned above.

Knowledge representation systems can overcome all of these limitations by using MemberLink instead of SetLink.

The above comments also apply to other link types that behave link SetLink, such as AndLink and OrLink. Currently, we do not have have any special equivalents to MemberLink indicating And, Or membership relations.

Procedural vs. Declarative

There are several ways to avoid the above issues, besides using MemberLink. One very wide-spread and conventional way of doing so is to create SetLinks "on the fly", procedurally, by using procedural devices like GetLink and PutLink, to create the needed SetLinks on the fly, when they are invoked. These can then be used once or twice, and then freed when no longer needed.

Thus, the non-modifiability of SetLink is only a problem when it is used in declarative contexts: e.g. when, as the result of sensory input, or as the result of reasoning, that we have obtained the "set of all x", declare it to be true by inserting it into the atomspace, and later discover, to our chagrin, that the "set of all x" is incomplete or wrong. Dissolving it and reforming it becomes a chore, and worse: it fundamentally cannot be dissolved if some other link is holding a reference to it!

We can conclude: SetLink is OK in procedural contexts, but is wholly inadequate in declarative contexts. (See atom relationships for a general review of procedural vs. declarative usage in the AtomSpace).

The AtomSpace is a giant set

Note that the AtomSpace is more-or-less a giant set. See the discussion in the section titles "Re-imagining the AtomSpace" on the AtomSpace page. It includes a proposal to make atomspaces look even more set-like, and to make atomspace membership declarative using a MemberLink-style link.

Definition

An example of a SetLink is:

 SetLink
    ConceptNode "x"
    ConceptNode "y"
    ConceptNode "z" 

This simply indicates that there is a set {x,y,z}.

Note that, in general, most users are encouraged to use the MemberLink instead of the SetLink. There are some fundamental data representation issues that make the SetLink undesirable and inept at its job. See discussion below, at the end.

The PresentLink is a kind of SetLink that disallows (removes) duplicated members.

Naming Sets

Sets can be given a name, using the EquivalenceLink or the DefineLink:

 EquivalenceLink
    ConceptNode "last three letters of the alphabet"
    SetLink
       ConceptNode "x"
       ConceptNode "y"
       ConceptNode "z"

The EquivalenceLink is used to make declarative statements; the DefineLink is used to make imperative statements; see Link comparison for a discussion of the differences.

Note that this illustrates the immutability problem of SetLinks: the set cannot be modified without first deleting the EquivalenceLink, then creating the new set, then re-attaching the EquivalenceLink. Tis is very awkward in a multi-processing, multi-threaded implementation, where one thread might be accessing the definition/equivalence, while another thread is deleting the equivalence in order to modify it.

Subsets

The SubsetLink can be used to indicate subsets:

 SubsetLink
    SetLink
       ConceptNode "x"
       ConceptNode "y"
    SetLink
       ConceptNode "x"
       ConceptNode "y"
       ConceptNode "z"

which states that {x,y} is a subset of {x,y,z}. If the set is named, then one can write:

 SubsetLink
    SetLink
       ConceptNode "x"
       ConceptNode "y"
    ConceptNode "last three letters of the alphabet"

As above, this highlights the immutability problem that SetLinks have. The SetLink cannot be modified without first deleting the SubSetLinks, and then re-creating them later - a computationally difficult problem. SubSetLinks are also bad.

Set membership

Sets can also be constructed by specifying individual members:

  MemberLink
     ConceptNode "x"
     ConceptNode "last three letters of the alphabet"
  MemberLink
     ConceptNode "y"
     ConceptNode "last three letters of the alphabet"
  MemberLink
     ConceptNode "z"
     ConceptNode "last three letters of the alphabet"

Note, however, it is not exactly equivalent: each MemberLink could have a distinct truth value indicating the degree to which the letters x,y,z, are members. Neither the SetLink nor the SimilarityLink allow this to be expressed. A clearer example is provided by the following:

  MemberLink  <1.0>
     ConceptNode "z"
     ConceptNode "last letters of the alphabet"
  MemberLink  <0.9>
     ConceptNode "w"
     ConceptNode "last letters of the alphabet"
  MemberLink  <0.8>
     ConceptNode "s"
     ConceptNode "last letters of the alphabet"
  MemberLink  <0.2>
     ConceptNode "m"
     ConceptNode "last letters of the alphabet"

which suggests that perhaps its a poor idea to group the letter "m" among the set of the last letters of the alphabet. There is no particularly good way to express this fuzzy-membership notion with a simple SetLink, although one could have that

  SetLink <0.4>
     ConceptNode "z"
     ConceptNode "w"
     ConceptNode "s"
     ConceptNode "m"

is a way of weakly grouping together m,s,w and z into a somewhat ephemeral group, with strength 0.4. However, this cannot indicate that perhaps some of the elements are stronger members than others.

See also