Interesting Pattern Mining

From OpenCog


Please refer to Pattern Miner if you have not read it. Frequent pattern mining results contain mostly boring / useless patterns. How do we find interesting patterns among all these frequent patterns? The critical issue is to evaluate interestingness / surprisingness. Currently we use two different evaluation methods: 1. Interaction Information 2. Measuring Surprisingness

Evaluating interestingness by Interaction Information


Interaction Information is an elegant way of evaluating multivariate mutual information. It expresses the amount of information (redundancy or synergy) bound up in a set of variables, beyond that which is present in any subset of those variables.

Interaction Information Calculation in Our Pattern Mining

1.Example Corpus description:

-- 20 are human
-- 10 are men, 10 are ugly and 10 are soda drinkers
-- 5 are ugly men
-- 5 are ugly soda drinkers
-- 5 are male soda drinkers
-- 5 are ugly male soda drinkers

You can find the scm file of this corpus in:


2.Example Pattern description:

For this 3-gram pattern (ABC):
A = (InheritanceLink "$var1" "man")
B = (InheritanceLink "$var1" "ugly")
C = (InheritanceLink "$var1" "soda drinker")

Count(A) = 10
Count(B) = 10
Count(C) = 10

Count(AB) = 5
Count(AC) = 5
Count(BC) = 5
Count(ABC) = 5
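The counts above can be reproduced by modeling the corpus as sets of individuals. The specific membership assignment below is a hypothetical reconstruction (the corpus file only fixes the sizes of the sets and their intersections, not which individuals belong to each set), but it is consistent with every count in the description:

```python
# Toy reconstruction of the example corpus: 20 humans, ids 0..19.
# The particular id assignment is an assumption; only the set sizes
# and intersection sizes are given by the corpus description.
men = set(range(0, 10))                      # 10 men
ugly = set(range(5, 15))                     # 10 ugly
soda = set(range(5, 10)) | set(range(15, 20))  # 10 soda drinkers

counts = {
    "A": len(men),                  # Count(A)  = 10
    "B": len(ugly),                 # Count(B)  = 10
    "C": len(soda),                 # Count(C)  = 10
    "AB": len(men & ugly),          # Count(AB) = 5  (ugly men)
    "AC": len(men & soda),          # Count(AC) = 5  (male soda drinkers)
    "BC": len(ugly & soda),         # Count(BC) = 5  (ugly soda drinkers)
    "ABC": len(men & ugly & soda),  # Count(ABC) = 5 (ugly male soda drinkers)
}
print(counts)
```

Note that the inclusion-exclusion sum 10+10+10-5-5-5+5 = 20 matches the 20 humans, so the description is internally consistent.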


3.Interaction Information calculation for example pattern

3.1.Formula

    I(ABC) = H(A) + H (B) + H (C) - H(AB) - H(AC) - H(BC) + H(ABC)

3.2.Entropy

Note that the entropy H(X) in such a corpus has nothing to do with pattern X's probability in the whole corpus compared to other patterns. It is only related to the probability distribution over X's own instances. E.g. H(A), the entropy of the fact "someone is a man":

    H(A) = - Σ P(xi) * log2( P(xi) )

Because the frequency of each instance is not taken into account in such a corpus, every instance has equal probability. For A, there are 10 instances that are men (even if the whole AtomSpace is taken into account, all other objects have 0 probability of being a man, so those terms are all 0 and can be ignored). Therefore, the probability of each instance is P(xi) = 1/10:

    H(A) = - Σ (1/10) * log2(1/10) = -10 * (1/10) * log2(1/10) = log2(10)

Therefore, for such a corpus, H(X) = log2(Count(X)).
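This shortcut can be checked numerically: for a uniform distribution over Count(X) instances, the Shannon entropy collapses to log2(Count(X)). A minimal sketch (the function name `entropy` is just for illustration):

```python
import math

def entropy(probs):
    # Shannon entropy in bits: H = -sum p_i * log2(p_i)
    return -sum(p * math.log2(p) for p in probs if p > 0)

count = 10                          # Count(A): 10 instances are men
uniform = [1.0 / count] * count     # each instance equally likely
h = entropy(uniform)
# For a uniform distribution, H(X) = log2(Count(X))
print(h, math.log2(count))  # both are log2(10) ≈ 3.3219
```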

3.3.Interaction Information

I(ABC) = H(A) + H(B) + H(C) - H(AB) - H(AC) - H(BC) + H(ABC)
       = log2(Count(A)) + log2(Count(B)) + log2(Count(C))
        -log2(Count(AB)) - log2(Count(AC)) - log2(Count(BC))
        +log2(Count(ABC))
       = log2(10) + log2(10) + log2(10)
        -log2(5) - log2(5) - log2(5)
        +log2(5)
       = 3*log2(10) - 2*log2(5)
       = 5.3219
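The derivation above can be verified with a short script that plugs the example counts into the formula, using the H(X) = log2(Count(X)) shortcut from the previous section:

```python
import math

def h(count):
    # Entropy shortcut valid for this corpus: H(X) = log2(Count(X))
    return math.log2(count)

# Counts from the example pattern description
counts = {"A": 10, "B": 10, "C": 10, "AB": 5, "AC": 5, "BC": 5, "ABC": 5}

# I(ABC) = H(A) + H(B) + H(C) - H(AB) - H(AC) - H(BC) + H(ABC)
i_abc = (h(counts["A"]) + h(counts["B"]) + h(counts["C"])
         - h(counts["AB"]) - h(counts["AC"]) - h(counts["BC"])
         + h(counts["ABC"]))
print(round(i_abc, 4))  # → 5.3219
```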

Evaluating interestingness by Surprisingness Measure

Please see Measuring Surprisingness