CLiki: COLLECTING

COLLECTING

Common Lispy ways of collecting values during evaluation are diverse; here we discuss some well established and some proposed.

LOOP has a COLLECT clause which is nice and handy. As a clause however, (COLLECT x INTO y) doesn't work inside forms or nested loops. Various COLLECTING macros have been proposed for this purpose. They can be considered as simple language extensions.

The following shows a typical situation where one wishes for a collecting macro:

(let ((all-elements '())) (dotimes (i 10) (dotimes (j 5) (push (aref array i j) all-elements))) (nreverse all-elements))

There are two basic forms of collecting:

collect into a single container (and yield the list), like above. It usually takes a form similar to: (collecting (dolist ... (collect form)) ...)
collect into multiple containers. Two mechanisms are needed: one for naming the desired container, one for selecting the result.

A simple implementation of the first idiom uses MACROLET to locally define the inner COLLECT operator:

(defmacro collecting (&body forms) ;; Collect some random stuff into a list by keeping a tail-pointer ;; to it, return the collected list. By Tim Bradshaw, ;; and may be used for any purpose whatsoever by anyone "Collect things into a list forwards. Within the body of this macro The form `(COLLECT THING)' will collect THING into the list returned by COLLECTING." (let (($resnam$ (gensym "COLLECTOR")) ($tail$ (gensym)) ($thing$ (gensym))) `(let (,$resnam$ ,$tail$) (macrolet ((collect (thing) ;; Collect returns the thing it's collecting `(let ((,',$thing$ ,thing)) (if ,',$resnam$ (setf (cdr ,',$tail$) (setf ,',$tail$ (list ,',$thing$))) (setf ,',$resnam$ (setf ,',$tail$ (list ,',$thing$)))) ,',$thing$))) ,@forms) ,$resnam$)))

It also uses tail-consing, which was measured by Sam Steingold to be more performant than the PUSH/NREVERSE idiom on several implementations (TODO Reference, c.l.l April 1999 or elsewhere). OTOH, there's an article by Richard C. Waters from 1993 comparing NREVERSE and RPLACD (aka. SETF CDR) where NREVERSE usually wins, presumably because it is heavily optimized by the implementation.

In November 2001, yet another discussion about the design of a collecting facility for the second situation began in comp.lang.lisp under "Collection utilities."

Do you prefer (collect FORM into var) ...) or (var FORM) inside the loops? And what result should be returned by the macro in the presence of multiple variables, would you introduce a FINALLY clause or use VALUES?

Several people have posted their own versions in the long history of c.l.l. IIRC among others Tim Bradshaw (both cases), Bruno Haible, Sam Steingold, and more recently, a naive implementation by Chris Capel. Helmut Eller points at the COLLECT macro available in CMUCL (extensions.lisp):

(collect ((c1) (c2 '(init) cons)) (dolist (e '(x y z)) (c1 e) (c2 e)) (values (c1) (c2))) ;; => (X Y Z), (Z Y X INIT)

A version of with-collect is distributed with CLISP and CLOCC in CLLIB/simple.lisp:

(with-collect (c1 c2) (dotimes (i 10) (if (oddp i) (c1 i) (c2 i)))) ;; => (1 3 5 7 9), (0 2 4 6 8)

One could also use an adjustable array and VECTOR-PUSH-EXTEND instead of lists, but that's another story (See c.l.l around November 2001 about "Loop? ptooey" for Dylan code from Bruce Hoult for a generic collecting macro that works on arrays, lists etc.).

UIOP has while-collecting. We define names for a collector function that, when called with an argument in the body, adds it to the corresponding collection. The macro returns multiple values, a list for each collection.

(while-collecting (foo bar) (dolist (x '((a 1) (b 2) (c 3))) (foo (first x)) (bar (second x))))

There is also collectors, a system available in Quicklisp which provides fairly rich and seemingly extensible set of features.

macro example accumulators