The Pattern Miner mines frequent and interesting patterns from Atomspace. It's a hypergraph pattern miner. It can be seen as the invert function of pattern matching, that is given a dataset of grounding atoms, find a pattern matcher query that would return a frequent and interesting enough collection of matches.
What is a Pattern?
A pattern is a connected set of links and nodes with variables that occurs in the Atomspace repeatedly for multiple times. For example: Americans drink Coke:
(Lambda (Variable "$X") (And (Evaluation (Predicate "drink") (List (VariableNode "$X") (ConceptNode “Coke"))) (InheritanceLink (VariableNode "$var_1") (ConceptNode "American"))))
If many instances can be found in the Atomspace that fit this pattern, then this is a frequent pattern. Such instances would be like:
(EvaluationLink (PredicateNode "drink") (ListLink (VariableNode "Ben") (ConceptNode “Coke"))) (InheritanceLink (VariableNode "Ben") (ConceptNode "American")) ... (EvaluationLink (PredicateNode "drink") (ListLink (VariableNode "Andrew") (ConceptNode “Coke"))) (InheritanceLink (VariableNode "Andrew") (ConceptNode "American")) ...
Pattern Gram or Conjunct
- 1-gram/1-conjunct pattern contains 1 clause.
- 2-gram/2-conjunct pattern contains 2 clauses.
- N-gram/N-conjunct pattern contains n clauses.
- Gram indicates the number of clauses of a pattern.
Pattern Mining Algorithm
The pattern mining is implemented on top of the URE, so finding patterns is framed as a form of reasoning to justify the interestingness of patterns. The algorithm itself is still greedy in nature. First, frequent 1-conjunct patterns are produced
then combined to produce N-conjunct patterns
See the pattern miner README.md for an indept presentation of the algorithm.
Interesting Pattern Mining
Most of frequent patterns are boring. We use a surprisingness measure to find interesting patterns. For now only I-Surprisingness is implemented, see Interesting Pattern Mining and Measuring Surprisingness, in the future a broader measure of surprisingness, realizing the notion of surprisingness as what is contrary to expectation, will be implemented, see 
Usage and Examples
Here's a list of pattern miner use cases. This is handy to drive the pattern miner development, its API, etc. Ideally each of these should be turned into examples put under , as well as unit tests.
- Mining of inference histories, for inference control
- Mining of dialogue histories, for learning dialogue patterns (or more generally, verbal/nonverbal interaction patterns)
- Mining of sets of genomic datasets or medical patient records, to find surprisingly common combinations of features
- Mining of surprising combinations of visual features in the output of a relatively "disentangled" deep NN (such as the pyramid-of-InfoGANs that Ralf, Selameab, Tesfa, Yenat and I are working on)
- Mining of surprising combinations of semantic relationships, in the R2L output of a large number of simple sentences read into Atomspace
- Mining of surprising combinations of syntactic relationships, in an Atomspace containing a set of syntactic relationships corresponding to each word in the dictionary of a given language (to be done iteratively within the language learning algorithm Linas is implementing)
- Mining of surprising (link-parser link combination, Lojban-Atomese-output combination) pairs, in a corpus of (link parses, Lojban-Atomese outputs) obtained from a parallel (English, Lojban) corpus
See Pattern_Miner_Prospective_Examples for detailed suggestions on some use cases.