User talk:Richardbrucebaxter


Reply

Thanks for the note!

I gather it's a kind of question-answering system. You should write a few paragraphs explaining it in greater detail and post them to the OpenCog mailing list.

What would be really interesting would be:

  • A detailed description of how it works (the algorithms it uses, the overall architecture of the thing)
  • A discussion of strengths and limitations
  • Lessons learned: what to do and what not to do.

So, for example, one limitation I ran into was that my system was unable to understand that the following sentences are more or less equivalent:

  • Yarn is used for making cloth.
  • Yarn is used to make cloth.
  • Yarn is used in making cloth.
  • Yarn is used in the making of cloth.
  • Cloth is made from yarn.
  • etc.

It couldn't "normalize" these into the same semantic form. I started writing normalization rules for such things, and I stopped when I realized the task was overwhelming -- I estimated that I needed at least hundreds if not thousands of such rules, and that hand-writing them was stupid: we need the machine to learn them on its own. I'd be curious to see what you discovered. Linas (talk) 22:14, 26 October 2012 (CDT)

Cheers Linas,
Thanks for your quick feedback. I thought a brief summary was in order as well; I will post the following introduction on the OpenCog mailing list when I next get a chance.
GIA is a method for reducing/normalising language into a semantic network, enabling QA (although this is best considered a feature/by-product). GIA is currently implemented by processing NLP output: the dependency relations are first simplified to a form similar to that implemented by RelEx, and then a set of semantic relations is extracted (described below).
Semantic relationships include the following:
  • definitions (generated from "is" statements, i.e. "are"/"is")
  • actions (generated from verbs, e.g. "rode")
  • conditions (generated from prepositions, e.g. "on"/"when")
  • properties (these currently encompass both Aristotelian qualities, e.g. "are blue", and possessive relationships, e.g. "have bikes").
Entity nodes within the network include the following (a rough data-structure sketch follows this list):
  • concept nodes (non-specific concepts; "dog" of "dogs are"/"a dog is")
  • substance nodes (instances; e.g. "dog" of "a dog"/"the dog". NB specific concepts, e.g. "dog" of class "red dogs", are currently implemented as a subtype of substance node, although this is a developmental artefact. Qualities are declared a subtype of substance also.)
  • action nodes (e.g. "rode" of "tom rode the bike", "tom rode", or "the bike was ridden")
  • condition nodes (e.g. "on" of "tom rode the bike on wednesday". NB conditions are currently represented as nodes with input and output links, although this is implementation dependent and they could instead be represented as a single link, as highlighted in BAI's provisional patent, figure 21)
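To make the above taxonomy concrete, here is a minimal data-structure sketch in C++ (the language OpenGIA is written in). All type and member names below are illustrative assumptions, not identifiers from the OpenGIA source.

// Minimal sketch of the GIA taxonomy described above; all names are
// hypothetical, not taken from the OpenGIA source.
#include <string>
#include <vector>

// The four semantic relationship types extracted from the NLP output.
enum class RelationType { Definition, Action, Condition, Property };

// The entity (node) types in the semantic network. Specific concepts
// and qualities are currently folded in as subtypes of substance.
enum class EntityType { Concept, Substance, Action, Condition };

struct Entity;

struct Relation {
    RelationType type;
    Entity* governor;   // e.g. the subject side of an action
    Entity* dependent;  // e.g. the object side of an action
};

struct Entity {
    EntityType type;
    std::string lemma;                // e.g. "dog", "ride", "on"
    std::vector<Relation> relations;  // outgoing links within the network
};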
Queries are performed by matching the temporary semantic network generated by GIA from the query text against (i.e. within) the existing semantic network generated by GIA from a textual context (a text block or an entire database, e.g. Wikipedia).
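As a rough illustration of this matching step, here is a hypothetical subnet-matching skeleton (reusing the illustrative Entity/Relation types from the sketch above; cycle handling and confidence scoring omitted). It is a sketch of the idea, not the actual OpenGIA traversal.

// Hypothetical subnet-matching skeleton; not the actual OpenGIA code.
// Walks the query network depth-first, binding the comparison variable
// to whichever context entity fills its slot.
// Usage (illustrative): matchSubnet(queryRoot, contextNode, &answer);
bool matchSubnet(const Entity* queryNode, const Entity* contextNode,
                 const Entity** binding) {
    if (queryNode->lemma == "_query") {  // the comparison variable ("what")
        *binding = contextNode;          // bind it to the context entity
        return true;
    }
    if (queryNode->lemma != contextNode->lemma) {
        return false;
    }
    // Every relation in the query subnet must be satisfied by some
    // relation of the corresponding context node.
    for (const Relation& qr : queryNode->relations) {
        bool satisfied = false;
        for (const Relation& cr : contextNode->relations) {
            if (qr.type == cr.type &&
                matchSubnet(qr.dependent, cr.dependent, binding)) {
                satisfied = true;
                break;
            }
        }
        if (!satisfied) {
            return false;
        }
    }
    return true;
}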
For more information on the algorithm, see some BAI papers (now white papers) here: https://sourceforge.net/p/opengia/wiki/Files/#algorithm (e.g. BAI_GeneralAIPatentAUProv1cDRAFT_FiguresOnly_21August2012.pdf).
Regarding your question on normalisation capacity: I think I remember coming across a blog post of yours identifying this problem (although perhaps not using this specific example).
I have executed GIA to demonstrate where it stands with respect to this specific normalisation problem. GIA can currently only handle this form of normalisation during queries (it does not store the data in a normalised state, for a number of reasons, simplicity included). It performs such normalisation by not assigning a high priority/match mandate to condition (i.e. preposition) nodes about the comparison variable/query node during the subnet (i.e. query network) comparison. This happens to be one of the first features built into GIA's query system. There may be limitations with this approach, but WordNet synonym detection during subnet matching is currently able to mitigate the most obvious problems resulting from it. Further limitations are evident (e.g. GIA Advanced Referencing failures, such as matching something like "the red dog nearby the park" with "a red dog situated nearby the park"), although I haven't yet encountered these during my testing to date.
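As a sketch of that weighting idea (again using the illustrative types from above, with the WordNet lookup reduced to a stub), per-node comparison can simply decline to place a match mandate on condition nodes, while letting synonyms score as highly as exact lemma matches:

#include <string>

// Stub standing in for the WordNet synonym lookup that GIA applies
// during subnet comparison; a real implementation would query WordNet.
bool isSynonym(const std::string& a, const std::string& b);

// Hypothetical per-node score contribution during subnet comparison.
// Condition (preposition) nodes carry no match mandate, so e.g.
// "used for making" still aligns with "used in making".
double compareNodes(const Entity& queryNode, const Entity& contextNode) {
    if (queryNode.type == EntityType::Condition) {
        return 0.0;   // no requirement, no penalty: prepositions may differ
    }
    if (queryNode.lemma == contextNode.lemma ||
        isSynonym(queryNode.lemma, contextNode.lemma)) {
        return 10.0;  // exact or synonym match on a mandated node
    }
    return -10.0;     // mismatch on a mandated node lowers confidence
}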
It is hoped that further normalisation limitations may be overcome when the offline data organisation/dream state is implemented. NB the dream state is envisaged to encompass, for example: a) the reduction of identical text for the assignment/correction of knowledge base probability; b) the identification of instances/specific concepts so far undetected during real-time advanced referencing (say, if limited database lookup time was available while the system was online); c) the identification of action paths (e.g. x requires y, y requires z, etc.); and d) the detection of matching subnet layouts, enabling for example humour/lateral thinking (where subnet structures may be identical bar one or more completely unassociated words).
NB the GIA NLG output is slightly artificial at the moment (it reuses words from input sentences rather than regenerating them from lemmas); I need a morphology generator. NB NLG2 support is disabled by default, as I have found it unstable; besides, NLG2 still requires a morphology generator anyway, as its "lemmas" are not lemmas (something I recall you pointing out in a markmail archive: http://markmail.org/message/g2jfxuhbc7o3xpl4).
Results Part 1/3 (after applying some minor updates to the NLG system to properly handle the case of actions defined without a subject; OpenGIA 28 October 2012 / 1q6a)

./OpenGIA.exe -itxt inputText.txt -itxtq inputTextQuery.txt -oall semanticNet -nlprelation 2 -nlpfeature 1 -nlprelationq 2 -nlpfeatureq 1 -nlprelexfolder "/home/systemusername/soft/BAISource/relex/relex-1.4.0" -nlpstanfordcorenlpfolder "/home/systemusername/soft/BAISource/stanford/coreNLP/stanford-corenlp-2012-04-03" -nlpstanfordparserfolder "/home/systemusername/soft/BAISource/stanford/parser/stanford-parser-2012-03-09" -notshow
 
	(inputText.txt/inputTextQuery.txt);
	Yarn is used for making cloth.
	Yarn is used to make cloth.
	Yarn is used in making cloth.
	Yarn is used in the making of cloth.
	Cloth is made from yarn.	
	What is yarn used for?
		Answer Found.
		Exact Found Answer: make
		Answer Context: yarn use for make 
		Answer Context (NLG):  making a cloth
		Answer Context (NLG):  Yarn used for making
		confidence: 50.000000
		max confidence: 40.000000
		GIA execution time: 15:36:44 28/10/2012 (finish)
	(inputText.txt/inputTextQuery.txt);
	Yarn is used for making cloth.
	What is yarn used for?
		Answer Found.
		Exact Found Answer: make
		Answer Context: yarn use for make 
		Answer Context (NLG):  making a cloth
		Answer Context (NLG):  Yarn used for making
		confidence: 50.000000
		max confidence: 40.000000
		GIA execution time: 15:36:44 28/10/2012 (finish)						
	(inputText.txt/inputTextQuery.txt);
	Yarn is used to make cloth.	
	What is yarn used for?
		Answer Found.
		Exact Found Answer: make
		Answer Context: yarn use _to-do make 
		Answer Context (NLG):  make a cloth
		Answer Context (NLG):  Yarn used _to-do make
		confidence: 40.000000
		max confidence: 40.000000
		GIA execution time: 15:45:47 28/10/2012 (finish)
	(inputText.txt/inputTextQuery.txt);
	Yarn is used in making cloth.
	What is yarn used for?
		Answer Found.
		Exact Found Answer: make
		Answer Context: yarn use in make 
		Answer Context (NLG):  making a cloth
		Answer Context (NLG):  Yarn used in making
		confidence: 40.000000
		max confidence: 40.000000
		GIA execution time: 15:47:36 28/10/2012 (finish)
	(inputText.txt/inputTextQuery.txt);
	Yarn is used in the making of cloth.
	What is yarn used for?
		Answer Found.
		Exact Found Answer: making
		Answer Context: yarn use in making 
		Answer Context (NLG):  Yarn used in a making
		confidence: 40.000000
		max confidence: 40.000000
		GIA execution time: 15:49:6 28/10/2012 (finish)
	Cloth is made from yarn.
	What is yarn used for making?
		FAIL
		NB* GIA can't handle the normalisation of the last case. For reference, this appears slightly ambiguous: just because yarn is used for making cloth, it doesn't imply that cloth is made from yarn. Eg;

Results Part 2/3 (after applying some minor generalisations to GIATranslatorRedistributeStanfordRelations.cpp to handle the case of "What is ... condition action (condition)?" / redistributeStanfordRelationsCreateQueryVarsAdjustForActionPrepositionAction(); OpenGIA 28 October 2012 / 1q6b)
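For context, the kind of rewrite such a translator pass performs can be pictured roughly as follows; this is a deliberately simplified, hypothetical stand-in for the generalisation described above, not the actual function body. For a query like "What is yarn used for making?", the query variable is reattached as the object of the embedded action ("make"), so that the query subnet mirrors the network built from "Yarn is used for making cloth."

#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the redistribution described
// above; not the actual OpenGIA implementation.
struct DepRelation {
    std::string type;       // e.g. "prep_for", "obj"
    std::string governor;   // e.g. "use"
    std::string dependent;  // e.g. "what", "make"
};

// If a preposition links the main action to an embedded action
// ("use" -> prep_for -> "make"), reattach the query variable as the
// object of the embedded action, so the query subnet can match the
// declarative form "yarn use for make cloth".
void adjustForActionPrepositionAction(std::vector<DepRelation>& relations) {
    std::string embeddedAction;
    for (const DepRelation& r : relations) {
        if (r.type.rfind("prep_", 0) == 0 && r.dependent != "what") {
            embeddedAction = r.dependent;  // e.g. "make"
        }
    }
    if (embeddedAction.empty()) {
        return;
    }
    for (DepRelation& r : relations) {
        if (r.dependent == "what") {
            r.type = "obj";               // query variable becomes the object
            r.governor = embeddedAction;  // of the embedded action
        }
    }
}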

	(inputText.txt/inputTextQuery.txt);
	Yarn is used for making cloth.
	Yarn is used to make cloth.
	Yarn is used in making cloth.
	Yarn is used in the making of cloth.
	Cloth is made from yarn.
	What is yarn used for making?
		Answer Found.
		Exact Found Answer: cloth
		Answer Context: yarn use for make cloth 
		Answer Context (NLG):  making a cloth
		Answer Context (NLG):  Yarn used for making
		confidence: 60.000000
		max confidence: 50.000000			
	What is yarn used to make?
		Answer Found.
		Exact Found Answer: cloth
		Answer Context: yarn use _to-do make cloth 
		Answer Context (NLG):  make a cloth
		Answer Context (NLG):  Yarn used _to-do make
		confidence: 60.000000
		max confidence: 50.000000						
	What is yarn used in making?
		Answer Found.
		Exact Found Answer: cloth
		Answer Context: in make cloth 
		Answer Context (NLG):  making Cloth
		confidence: 60.000000
		max confidence: 50.000000
		GIA execution time: 22:50:54 28/10/2012 (finish)			
	What is yarn used in the making of?
		Answer Found.
		Exact Found Answer: cloth
		Answer Context: in making possessed by cloth 
		confidence: 60.000000
		max confidence: 50.000000
		GIA execution time: 22:49:0 28/10/2012 (finish)			
	What is yarn used in the creation of? (NB the query system utilises WordNet lookups by default)
		FAIL (error in pos tagging?)
		Answer Not Found.
		confidence: 40.000000
		max confidence: 50.000000
		GIA execution time: 23:30:39 28/10/2012 (finish)		
	What is cloth made from?
		Answer Found.
		Exact Found Answer: yarn
		Answer Context: from yarn 
		confidence: 50.000000
		max confidence: 40.000000
		GIA execution time: 23:22:9 28/10/2012 (finish)			

Results Part 3/3

	(inputText.txt/inputTextQuery.txt);
	Yarn is used for making cloth.
	What is yarn used for making?
		Answer Found.
		Exact Found Answer: cloth
		Answer Context: yarn use for make cloth 
		Answer Context (NLG):  making a cloth
		Answer Context (NLG):  Yarn used for making
		confidence: 60.000000
		max confidence: 50.000000
		GIA execution time: 1:0:53 29/10/2012 (finish)	
	Yarn is used for making cloth.			
	What is yarn used to make?
		Answer Found.
		Exact Found Answer: cloth
		Answer Context: make cloth 
		Answer Context (NLG):  making a cloth
		confidence: 30.000000
		max confidence: 50.000000
		GIA execution time: 0:59:40 29/10/2012 (finish)			
	Yarn is used for making cloth.						
	What is yarn used in making?
		Answer Found.
		Exact Found Answer: cloth
		Answer Context: make cloth 
		Answer Context (NLG):  making a cloth
		confidence: 30.000000
		max confidence: 50.000000
		GIA execution time: 1:2:9 29/10/2012 (finish)	
	Yarn is used for making cloth.			
	What is yarn used in the making of?
		Answer Not Found.
		confidence: 30.000000
		max confidence: 50.000000
		GIA execution time: 1:3:48 29/10/2012 (finish)	
		FAILS (NB in GIA, "of" is interpreted as a property link for all concepts/substances. The following is now under consideration: "of" is interpreted as an object for some actions; however, it is difficult to identify "making" as an action, as the Stanford Parser/CoreNLP registers "making" as NN rather than VBG in this instance, "the making of") {UPDATE: I have resolved this issue and will upload the corresponding version of OpenGIA later this evening}
	Yarn is used for making cloth.		
	What is cloth made from?
		Answer Not Found.
		confidence: 30.000000
		max confidence: 40.000000
		GIA execution time: 1:5:1 29/10/2012 (finish)
		FAILS (see above NB* "GIA can't handle the normalisation of the last case")
For reference, I have released a version of OpenGIA with these updates (http://sourceforge.net/projects/opengia/files/release-15October2012).
Thanks again for your feedback - I will write more soon.
All the best till then,
Richardbrucebaxter (talk) 11:05, 28 October 2012 (CDT)

More

I guess I should tell you more: I realized that the only thing to do is to get the machine to learn this stuff itself. I have a project to do this, but it's going very slowly. The first step, I decided, was that I need a real-time parser and a real-time parse-dictionary update, so that as the system learns new parse rules, it can immediately start parsing with those recently learned rules and go from there. Towards that end, I've started writing something called the "viterbi parser" for link-grammar. So far, it can parse sentences with as many as two words! Woot! ... OK, well, I've barely started. Time is short... Linas (talk) 22:20, 26 October 2012 (CDT)