I've been experimenting with Link Grammar for several weeks. It's somewhat akin to a reverse regular expression: you give it an ordinary English (or German or Italian) sentence, and it outputs a pattern parsed from the structure of that sentence.
It is used in the AbiWord grammar checker and in various other NLP software, and it is also one of the core NLP components of OpenCog.
For those of you I've lost, this is what it actually does.
Give Link Grammar a sentence like:
"My dog likes dog food."
It outputs:
    +------------------Xp-----------------+
    +----Wd----+      +------Ou------+    |
    |      +-Ds+--Ss--+      +---AN--+    |
    |      |   |      |      |       |    |
LEFT-WALL my dog.n likes.v dog.n food.n-u .
It can also chunk the input into phrases, much the way your English teacher had you do in school:
(S (NP My dog) (VP likes (NP dog food)) .)
Better still, Link Grammar can take words it doesn't know and make educated guesses about their grammatical role. Let's invent a word, like foofsnarfle: a verb that means "to throw a cow at a rapidly moving target."
My input sentence is "My brother foofsnarfled a truck." The result is correct; Link Grammar determines that foofsnarfled is a verb and generates the proper parse:
(S (NP My brother) (VP foofsnarfled (NP a truck)) .)
LG can also output PostScript, as well as a list pairing each word with its connector types (a trailing + links to the right, a trailing - links to the left), like so:
My.f Ds+
dog.n Wd- Ds- Ss+
likes.v Ss- Ou+
dog.n AN+
food.s Ou- AN-
. Xp-
The latest news and documentation can be found on this page: http://www.abisource.com/projects/link-grammar/
This is the documentation of the English dictionary, and what the symbols mean: http://www.abisource.com/projects/link-grammar/dict/index.html
This is the API documentation, which contains all the functions I have wrapped: http://www.abisource.com/projects/link-grammar/api/index.html
Two additional functions aren't documented there, but they are needed for the other output formats. The first is:
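Func _LG_LinkagePrintDisjuncts($hLinkage)
    ;char * linkage_print_disjuncts(Linkage linkage);
    Local $result = DllCall($_LG_DLL, "str:cdecl", "linkage_print_disjuncts", "ptr", $hLinkage)
    ; DllCall returns no array on failure, so bail out before indexing it
    If @error Then Return SetError(@error, 0, "")
    Return $result[0]
EndFunc   ;==>_LG_LinkagePrintDisjuncts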
This function provides the list of word <-> type pairs.
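If you want to work with that list rather than just print it, a minimal sketch like the one below splits it into words and connectors. It assumes the text has exactly the layout shown earlier (one word per line, followed by space-separated connectors); the sample string is copied from that listing so the snippet runs without the DLL, and the variable names are only illustrative.

```autoit
; Sample text copied from the listing above; in practice this would be the
; string returned by _LG_LinkagePrintDisjuncts($linkage).
Local $sDisjuncts = "My.f Ds+" & @LF & "dog.n Wd- Ds- Ss+" & @LF & _
        "likes.v Ss- Ou+" & @LF & "dog.n AN+" & @LF & _
        "food.s Ou- AN-" & @LF & ". Xp-"

; Drop any carriage returns, then split into lines ($aLines[0] holds the count).
Local $aLines = StringSplit(StringStripCR($sDisjuncts), @LF)

For $i = 1 To $aLines[0]
    ; Flag 7 = strip leading, trailing and repeated inner whitespace.
    Local $sLine = StringStripWS($aLines[$i], 7)
    If $sLine = "" Then ContinueLoop

    ; First token is the word, the rest are its connectors.
    Local $aTokens = StringSplit($sLine, " ")
    ConsoleWrite("Word: " & $aTokens[1] & @CRLF)

    For $j = 2 To $aTokens[0]
        Local $sDir = "left"
        If StringRight($aTokens[$j], 1) = "+" Then $sDir = "right"
        ConsoleWrite("    " & $aTokens[$j] & " (links to the " & $sDir & ")" & @CRLF)
    Next
Next
```

If your build of the DLL prints extra columns in this listing, the token handling would need adjusting.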
The other function is:
Func _LG_LinkagePrintConstituentTree($hLinkage, $iOpt)
    ;char * linkage_print_constituent_tree(Linkage linkage, int iOpt);
    Local $result = DllCall($_LG_DLL, "str:cdecl", "linkage_print_constituent_tree", "ptr", $hLinkage, "int", $iOpt)
    ; DllCall returns no array on failure, so bail out before indexing it
    If @error Then Return SetError(@error, 0, "")
    Return $result[0]
EndFunc   ;==>_LG_LinkagePrintConstituentTree
This outputs the parsed phrase format, explained here: http://www.link.cs.cmu.edu/link/ph-explanation.html
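As a rough illustration of what you can do with the phrase format, here is a small sketch that pulls the innermost noun phrases out of a single-line tree with a regular expression. The tree string is the one from the example above, so this runs without the DLL. The pattern is my own and only matches NP groups that contain no nested brackets.

```autoit
; Tree copied from the example above; normally this would be the string
; returned by _LG_LinkagePrintConstituentTree($linkage, 3).
Local $sTree = "(S (NP My dog) (VP likes (NP dog food)) .)"

; Flag 3 returns an array of all matches of the capture group.
; [^()]+ stops at any bracket, so only innermost NPs are captured.
Local $aNP = StringRegExp($sTree, "\(NP ([^()]+)\)", 3)

If @error Then
    ConsoleWrite("No noun phrases found." & @CRLF)
Else
    For $i = 0 To UBound($aNP) - 1
        ConsoleWrite("NP: " & $aNP[$i] & @CRLF)
    Next
EndIf
```

For the sample tree this prints "NP: My dog" and "NP: dog food". A full parser for nested phrases would need to track bracket depth instead.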
Here's the sample code to get you going:
#include "_LinkGrammar.au3"
$Test = "My dog likes dog food."
; Set up the parser options and load the English dictionary
$options = _LG_ParseOptionsCreate()
$dict = _LG_DictionaryCreateLang("en")
; Tokenize and parse the sentence; the parse returns the number of linkages found
$Sentence = _LG_SentenceCreate($Test, $dict)
_LG_SentenceSplit($Sentence, $options)
$num_linkages = _LG_SentenceParse($Sentence, $options)
If $num_linkages > 0 Then
    ; Take the first linkage and print it in all three formats
    $linkage = _LG_LinkageCreate(0, $Sentence, $options)
    $diagram = _LG_LinkagePrintDiagram($linkage) ; ASCII link diagram
    $diagram2 = _LG_LinkagePrintConstituentTree($linkage, 3) ; phrase tree
    $diagram3 = _LG_LinkagePrintDisjuncts($linkage) ; word/connector list
    ConsoleWrite($diagram & @CRLF)
    ConsoleWrite($diagram2 & @CRLF)
    ConsoleWrite($diagram3 & @CRLF)
EndIf
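The sample only looks at linkage 0, but a sentence can have more than one parse. Here is a hedged variation that prints the phrase tree for the first few linkages. It uses only the wrapper functions shown in this post and assumes linkage indexes run from 0 to the parse count minus one; the cap of 3 is arbitrary.

```autoit
#include "_LinkGrammar.au3"

Local $options = _LG_ParseOptionsCreate()
Local $dict = _LG_DictionaryCreateLang("en")
Local $sentence = _LG_SentenceCreate("My dog likes dog food.", $dict)
_LG_SentenceSplit($sentence, $options)
Local $num_linkages = _LG_SentenceParse($sentence, $options)

If $num_linkages <= 0 Then
    ConsoleWrite("No linkages found." & @CRLF)
Else
    ; Look at no more than three alternatives (arbitrary limit).
    Local $iMax = $num_linkages
    If $iMax > 3 Then $iMax = 3

    For $i = 0 To $iMax - 1
        ; Assumption: index $i is valid for every $i below the parse count.
        Local $linkage = _LG_LinkageCreate($i, $sentence, $options)
        ConsoleWrite("Linkage " & $i & ": " & _LG_LinkagePrintConstituentTree($linkage, 3) & @CRLF)
    Next
EndIf
```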
And finally, here's the AutoIt package, with the DLL and the English and German dictionaries:
http://www.AutoIt.me/_LinkGrammar_au3.zip
Enjoy!