Regular expression libraries: (main topic: regular expression)
- cl-irregsexp - A fast regular expression library with a lispy alternative to traditional syntax for text matching
- cl-ppcre - A portable, Perl-compatible regular expression library by Edi Weitz
- cl-string-match - Provides implementations of substring (subsequence) search and text processing algorithms, including regular expressions, prefix/suffix tree data structures, etc
- lol-re - Tiny wrapper around CL-PPCRE, inspired by the #~M and #~S read-macros from Let Over Lambda, making use of regular expressions more perly
- pregexp - Portable Regular Expressions for Scheme and Common Lisp
- re - The re package is a small, portable, lightweight, and quick regular expression library for Common Lisp
- recursive-regex - A library to extend CL-PPCRE to make regular expression named capture groups dispatch to custom matcher functions and named-expression patterns
- Regex - Regex is a full-featured regular expression compiler and matching engine written by Michael Parker
- regex (library by asciian) - A relatively incomplete (as of Jan 2018), relatively concise, backtracking, POSIX-compatible regular expression library
- terse-ppcre - TERSE-PPCRE aims to make manipulating CL-PPCRE regular expression parse trees easier and more succinct
- The Regex Coach - A graphical Common Lisp application which can be used to experiment with (Perl-compatible) regular expressions interactively
Parser generators: (main topic: parser generator)
- bintype - BINTYPE is a specification-driven parser generator for binary formats
- cl-opossum - CL-Opossum is a Common Lisp implementation of a Parsing Expression Grammar parser generator
- cl-peg - The cl-peg package (cl-peg_0.03.tar.gz) is a PEG packrat parser generator by John Leuner
- CL-Yacc - CL-Yacc is a LALR(1) parser generator for Common Lisp, somewhat like Zebu or lalr.cl.
- com.nklein.parser-generator - Generate SAX-based XML parsers for Lisp or Objective-C
- de.setf.atn-parser - de.setf.atn-parser is an ATN-based BNF -> Common Lisp LR(*) parser generator
- ebnf-parser - An EBNF (ISO/IEC 14977) parser generator
- esrap-liquid - ESRAP-LIQUID is a packrat parser generator for Common Lisp that is more lispy than ESRAP
- esrap-peg - Esrap-PEG is a parser generator; it takes files with portable (language-agnostic) PEG notation and produces Esrap rules to parse this grammar
- hh-parse - hh-parse is an LALR(1) parser generator written in Common Lisp
- LALR - LALR is a LALR(1) parser generator available at the CMU AI repository.
- MaxPC - Max’s Parser Combinators is a simple and pragmatic library for writing parsers and lexers based on combinatory parsing
- Meta - A recursive-descent parser DSL that is a simpler alternative to parser generators
- meta-sexp - meta-sexp is a META parser generator using LL(1) grammars with s-expressions
- metapeg - Metapeg is a PEG parser generator created by John Leuner
- monkeylib-parser - monkeylib-parser is a parser generator loosely based on Henry Baker's META paper
- NPG - NPG is a Naïve Parser Generator
- parse - The parse package is a simple token parsing library for Common Lisp
- parseq - A parser generator for Common Lisp, inspired by ESRAP
- proc-parse - Tools for parsing both strings and octet vectors efficiently
- rdp - com.informatimago.rdp is a simple Recursive Descent parser generator
- yid - yid (Yacc Is Dead) is a parser generator based on extending Brzozowski's derivative from regular expressions to context-free grammars
- Zebu - A Tool for Specifying Reversible LALR(1) Parsers
Lexers: (main topic: lexer)
- cl-lex - cl-lex is a set of Common Lisp macros for generating lexical analyzers automatically
- cl-shlex - A simple lexer for shell-like minilanguages
- DEFLEXER - The LEXER package implements a lexical-analyzer-generator called DEFLEXER, which is built on top of both REGEX and CLAWK
- dso-lex - Allows lexers to be defined using regular expressions a la cl-ppcre
- graylex - graylex offers a means to do string operations on input streams without slurping all input at once by using Common Lisp Gray Streams, fixed-sized and flexible buffers
- token-stream - Lexer class for cl-stream
- Zebu - A Tool for Specifying Reversible LALR(1) Parsers
String processing: (main topic: string)
- B-Tries - A prototypical implementation of the data structure described in the paper "B-tries for disk-based string management" (PDF)
- bobbin - Bobbin is a simple word-wrapping library for strings in Common Lisp
- boxen - Boxy string syntax for Common Lisp
- charseq - This package provides charseq structure which represents an efficient character sequence
- chio - Chio is a String Processing Library for Common Lisp
- cl-change-case - cl-change-case is a library to convert strings between camelCase, param-case, snake_case and more
- cl-cidr-notation - cl-cidr-notation is a library for converting IP addresses and CIDR blocks from integer to string representations and vice versa
- cl-date-time-parser - Parse date-time-string, and return (as multiple values) universal-time and fraction
- cl-interpol - CL-INTERPOL modifies the reader so that you can have interpolation of strings similar to Perl or Unix Shell scripts
- cl-string-complete - A small library for string completion by Robert Smith
- cl-string-match - Provides implementations of substring (subsequence) search and text processing algorithms, including regular expressions, prefix/suffix tree data structures, etc
- cl-strings - cl-strings is a portable, dependency-free set of utilities to manipulate strings in Common Lisp (split, join, replace, insert, clean, change case,…)
- collapse-string - A function to remove whitespace from a string, with the option of collapsing each "run" to a single character, while optionally ignoring whitespace on the left, right, or both ends of the string
- diff-match-patch - This is a Common Lisp port of Neil Fraser's Diff, Match and Patch
- fuzzy-match - fuzzy-match is a library to fuzzy search an input string against a set of candidates
- kebab - String case conversion
- Levenshtein - The Levenshtein Distance algorithm finds the minimum number of operations needed to transform one string into the other, where an operation is an insertion, deletion, or substitution of a single character
- lispbuilder-clawk - Stub added for asdf-install; see the Lispbuilder site for more details
- lispbuilder-lexer - Stub added for asdf-install; see the Lispbuilder site for more details
- lispbuilder-regex - Stub added for asdf-install; see the Lispbuilder site for more details
- meta-sexp - meta-sexp is a META parser generator using LL(1) grammars with s-expressions
- mk-string-metrics - This library implements efficient algorithms that calculate various string metrics in Common Lisp
- parse-float - A function to parse floating-point values from a string in Common Lisp
- str - str is a modern and consistent string manipulation library (split, join, concat, replace, blank-p,…); install it with (ql:quickload :str)
- string-case - A macro that generates specialised decision trees to dispatch on string equality
Text: (main topic: text)
- ACUTE-TERMINAL-CONTROL - Permit fast control of a terminal device
- changed-stream - A Lisp library for non-destructively changing streams by inserting or deleting characters at a given position
- cl-ansi-term - cl-ansi-term allows printing various primitives on ANSI-compliant terminals
- cl-ascii-table - Common Lisp library to present tabular data in ASCII-art tables
- CL-CSV - CL-CSV is a library to parse and write csv (comma-separated-values) files
- CL-ECMA-48 - Implement the ECMA-48 standard
- cl-heredoc - cl-heredoc is an implementation of "here documents" that allow the user to
- cl-inflector - cl-inflector is a branch of vana-inflector to make it more standard and loadable / testable
- cl-interpol - CL-INTERPOL modifies the reader so that you can have interpolation of strings similar to Perl or Unix Shell scripts
- CL-Pango - CL-PANGO is a cffi binding to the Pango text formatting library
- CL-READLINE - Common
- CL-Yacc - CL-Yacc is a LALR(1) parser generator for Common Lisp, somewhat like Zebu or lalr.cl.
- CLAWK - CLAWK is an AWK text manipulation language implementation embedded into Common Lisp, by Michael Parker
- CSV (library) - The CSV library, authored by Jeffrey Massung, permits reading and writing comma-separated-values (CSV) files
- diff - DIFF is a simple text library which can compute unified-style or context-style diffs between two files
- dso-parse - This is a simple PEG (parsing-expression grammar) parser-generator, aimed mostly at parsing text but capable of parsing other structures as well
- Enchant - An interface for Enchant spell-checker library
- ESRAP - A packrat parser for Common Lisp
- extsort - extsort is a Common Lisp library for external sorting of Unicode text files
- fare-csv - fare-csv is a library for reading and writing CSV files
- format-setf - The Common Lisp equivalent of scanf()
- guess - Guesses Japanese text encoding (Gauche's algorithm)
- html-encode - html-encode is a small library for encoding text in various web-savvy formats
- Inravina - A portable and extensible pretty printer
- Invistra - A portable implementation of Common Lisp FORMAT
- linewise-template - A Common Lisp library for processing text files and streams as templates conforming to simple line-based hierarchical formats
- LQN - Lisp Query Notation is for working with text files
- mk-string-metrics - This library implements efficient algorithms that calculate various string metrics in Common Lisp
- monkeylib-prose-diff - monkeylib-prose-diff is a diff program optimized for comparing text files containing prose
- Montezuma - Montezuma is a text search engine for Common Lisp
- parser - Pages tagged with the parser topic
- parser-combinators - An implementation of parser combinators
- pastelyzer - Analyze text documents (e.g
- persistent-variables - Persistent-variables is a convenience library that makes it easy to serialize and deserialize variables
- PorterStemmer - The Porter Stemmer is a stemming text algorithm by Martin Porter
- read-csv - Read-csv is a library for reading csv (comma-separated-values) files
- sequence-search-replace - A library for sequence search and replace, which makes it useful on text
- Soundex - The Soundex algorithm indexes words by their sound when pronounced in English, for example to account for differences in spelling
- Tagger - The Tagger project is a revival of the Xerox Part-of-Speech (POS) Tagger program released sometime in 1993
- text-query - The text-query package is a generalized form of Common Lisp's builtin Y-OR-N-P and YES-OR-NO-P, adding more
- Texticl - Texticl is a library that transforms a text markup language similar to
- umlisp - umlisp is a CLOS object-oriented interface to the Unified Medical Language System
- vana-inflector - A Common Lisp library to easily pluralize and singularize English words
- vas-string-metrics - vas-string-metrics provides the Jaro, Jaro-Winkler, Soerensen-Dice, Levenshtein, and normalized Levenshtein string distance/similarity metric algorithms for text analysis
Streams: (main topic: stream) Streams can be useful for, but are not limited to, text processing.
- basic-binary-ipc - The Basic Binary IPC system provides an interface for performing inter-process communication using IPv4 or local streams
- Bivalent Streams - An overview of support for this concept at the implementation level
- CAPTURED-STREAM - captured-stream is a small Common Lisp library for viewing streams as sequences
- changed-stream - A Lisp library for non-destructively changing streams by inserting or deleting characters at a given position
- Chunga - Chunga is a web/networking library which implements portable chunked HTTP streams as described in RFC 2616
- circular-streams - Circular-Streams allows you to read streams circularly by wrapping real streams
- cl-binary-file - The binary file package contains utilities to read and write binary files
- CL-PLUS-SSL - This library is a fork of SSL-CMUCL
- Cyclosis - Cyclosis is a combined implementation of the functionality of the Common Lisp stream dictionary and that of the Gray streams proposal
- deflate - Deflate by Pierre Mai is a Common Lisp implementation of Deflate (RFC 1951) decompression, with optional support for ZLIB-style (RFC 1950) and gzip-style (RFC 1952) wrappers of deflate streams
- Eclector - A portable, extensible Common Lisp reader
- fast-io - Fast-io is about improving I/O performance for octet-vectors and octet streams (though primarily the former, while wrapping the latter)
- flexi-streams - FLEXI-STREAMS is a library which implements "virtual" bivalent streams that can be layered atop real binary/bivalent streams
- Gray streams - "Gray Streams" are a generic function wrapping of the COMMON-LISP streams in the standard library, allowing for further specialization by end users
- gzip-stream - gzip-stream is a simple wrapper around salza which gives CL users gzip compression and decompression in the form of streams (gzip-input-stream and gzip-output-stream)
- incless - A portable, extensible Common Lisp printer
- Inravina - A portable and extensible pretty printer
- Invistra - A portable implementation of Common Lisp FORMAT
- MaxPC - Max’s Parser Combinators is a simple and pragmatic library for writing parsers and lexers based on combinatory parsing
- MIME4CL - MIME4CL allows you to craft MIME compliant messages or to parse and handle them programmatically
- nontrivial-gray-streams - Like trivial-gray-streams but:
- odd-streams - ODD-STREAMS implements binary streams with "odd" byte sizes
- one - An input processing framework
- pretty-function - pretty-function provides an API for making individual functions pprint differently when written to an output stream
- replay-streams - Replay streams let the programmer rewind to points in a stream that have already been read
- rfc2388 - rfc2388 processes HTTP POST form data using enctype "multipart/form-data", as described in RFC 2388
- simple-stream - Simple-streams are Franz's proposal for a Gray-streams replacement
- sparse-streams - Gray Streams for subsets of underlying streams
- tar-file - This project is a fork of Nathan Froyd's archive library
- trivial-bit-streams - Trivial-bit-streams implements flexible buffered bit streams
- trivial-gray-streams - trivial-gray-streams provides an extremely thin compatibility layer for Gray streams
- Tungsten - Tungsten is a Common Lisp toolkit providing a wide range of features
Misc:
- Francis Leboutte's functions to compute frequencies of characters, digrams and trigrams from a text file (function names and comments in French - some comments in English).
- SPLIT-SEQUENCE. Part of Common Lisp Utilities.
- CL-BibTeX is a replacement for the BibTeX program.
- CL-DATA-FORMAT-VALIDATION - generic interface for parsing and formatting data
- literate-lisp - a literate programming tool for writing Common Lisp code in Org mode.
See also the pages for Regular Expression, XML libraries, HTML Parsers, Lisp Markup Languages, document formats, Unicode support, Unicode and Lisp