Supported functions, starting from raw text strings, are:
- String tokenization (string -> string)
- Part-of-speech tagging (string -> tokens -> vector-document)
- Phrase chunking (vector-document -> phrases)
- Miscellaneous (other functions like surface form generation and lemmatization can be found on the main page)
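To make the data flow in the list above concrete, here is a minimal, language-neutral sketch of the three stages, written in Python. It is not the langutils API: every name in it (`tokenize`, `tag`, `chunk_noun_phrases`, `VectorDocument`, `TOY_LEXICON`) is hypothetical, the lexicon is a six-word toy, and the chunker is a naive rule that groups runs of determiner, adjective, and noun tags. It only illustrates how each stage's output type feeds the next.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical toy lexicon mapping lowercase words to Penn-style tags.
# A real tagger uses a large lexicon plus contextual rules.
TOY_LEXICON = {
    "the": "DT", "a": "DT", "dog": "NN",
    "cat": "NN", "chased": "VBD", "quickly": "RB",
}
PUNCTUATION = set(".,!?;:")


def tokenize(text: str) -> str:
    """string -> string: split punctuation away from words."""
    spaced = re.sub(r"([.,!?;:])", r" \1 ", text)
    return " ".join(spaced.split())


@dataclass
class VectorDocument:
    """Parallel sequences of words and their tags (assumed layout)."""
    words: List[str]
    tags: List[str]


def tag(tokenized: str) -> VectorDocument:
    """string -> tokens -> vector-document: look up a tag per token."""
    words = tokenized.split()
    tags = []
    for word in words:
        if word in PUNCTUATION:
            tags.append(".")
        else:
            # Unknown words default to noun in this toy version.
            tags.append(TOY_LEXICON.get(word.lower(), "NN"))
    return VectorDocument(words, tags)


def chunk_noun_phrases(doc: VectorDocument) -> List[str]:
    """vector-document -> phrases: group runs of DT/JJ/NN tags."""
    phrases, current = [], []
    for word, word_tag in zip(doc.words, doc.tags):
        if word_tag in ("DT", "JJ", "NN"):
            current.append(word)
        elif current:
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


document = tag(tokenize("The dog chased a cat."))
noun_phrases = chunk_noun_phrases(document)
```

With the toy lexicon above, the sentence is tokenized to `The dog chased a cat .`, and the chunker yields the phrases `The dog` and `a cat`. For the library's actual function names and signatures, refer to the distribution README.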
The main page for the distribution can currently be found at: http://common-lisp.net/project/langutils/
A detailed guide to the main parts of the API can be found in the distribution README file. Also included is a paper, presented at the 2005 Lisp Conference, that discusses aspects of the library's implementation.