Values, separated by delimiters. For Guile Scheme.
5d7a4b25ecb8 — Linus Björnstam default tip 4 years ago
There is already a guile-dsv. A very pretty one indeed. Use that instead.

The delimiter-separated values (DSV) format is a superset of CSV. This library implements a DSV parser for Guile with a streaming interface, plus more convenient interfaces that exhaust a port or read from a string. Headers are not currently supported.

# If you want it properly done, there is a better library:

Look at guile-dsv. Much better. Modularized. Documented. Many more finite state automata.


(import (vsd))
(define file (open-input-file "csv.csv"))

;; These are all the available options for the procedures in this library.
;; The values shown are the defaults, so none of them has to be provided.
;; #:newline can be 'cr, 'lf, 'crlf, or 'lax. 'lax accepts all other newline
;; characters.
(define reader (make-dsv-reader file #:delimiter #\, #:newline 'lf #:escape #\"))

;; reader is now a thunk that returns one row per call, as a vector of dsv cells:
(reader) ;; => #("my" "delimited" "data")

;; When there is no more data to be read #<eof> is returned.
(reader) ;; => #<eof>
(close-port file)

;; There is also a higher level interface for exhausting data:

(dsv-file->list "csv.csv") ;; => (#("my" "delimited" "data"))

(call-with-input-file "csv.csv" dsv->list) ;; => (#("my" "delimited" "data"))

;; Both the above procedures (dsv-file->list and dsv->list) take an 
;; optional keyword spec as shown for make-dsv-reader
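
The streaming interface can also be drained by hand. Below is a minimal sketch of that loop, assuming each call to the reader returns one row as a vector and then the eof object once the input is exhausted. `drain-reader` and `make-fake-reader` are hypothetical names, not part of this library; the fake reader only stands in for a real `make-dsv-reader` thunk so the sketch runs on its own.

```scheme
;; Hypothetical helper, not part of the library: call a reader thunk
;; until it returns the eof object, collecting the rows in order.
(define (drain-reader reader)
  (let loop ((rows '()))
    (let ((row (reader)))
      (if (eof-object? row)
          (reverse rows)
          (loop (cons row rows))))))

;; Stand-in for a real reader (an assumption for illustration only):
;; yields two rows, then the eof object on every later call.
(define (make-fake-reader)
  (let ((empty-port (open-input-string ""))
        (pending (list (vector "my" "delimited" "data")
                       (vector "more" "delimited" "data"))))
    (lambda ()
      (if (null? pending)
          ;; Reading from an exhausted port yields the eof object.
          (read-char empty-port)
          (let ((row (car pending)))
            (set! pending (cdr pending))
            row)))))

(drain-reader (make-fake-reader))
;; => (#("my" "delimited" "data") #("more" "delimited" "data"))
```

With the real library, the same loop should work on the thunk returned by `make-dsv-reader`, which is roughly what the higher level procedures do for you.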


It is slightly faster than guile-csv for CSV files, with the bonus that it actually parses proper CSV files with CRLF line endings. In practice, a 35 MB CSV file is parsed in about 4 s using Guile 2.9.4. Python is twice as fast, because its csv reader is written in optimized and nicely buffered C.


LGPLv3. See the file header.


I was trying my best to use data-type specific comparisons, but apparently eqv? was faster (probably due to fewer type checks in the generated code). That yielded quite a speed increase. I will have to try to find other such nice little speedups.
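
To make the difference concrete, here is a small sketch of the two styles of comparison. The procedure names are hypothetical and this is not the library's code; it only shows that `char=?` requires a character while `eqv?` accepts any object, including the eof object a port returns at the end of input. It makes no claim about speed, which is something I measured separately.

```scheme
;; Type-specific comparison: only valid when c is a character.
(define (delimiter?/char c)
  (char=? c #\,))

;; Generic comparison: valid for any object, including the eof object.
(define (delimiter?/eqv c)
  (eqv? c #\,))

;; Reading from an empty port is a portable way to get the eof object.
(define eof (read-char (open-input-string "")))

(delimiter?/char #\,) ;; => #t
(delimiter?/eqv #\,)  ;; => #t
(delimiter?/eqv eof)  ;; => #f
;; (delimiter?/char eof) would signal a wrong-type error instead.
```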

Re-add trimming.

Enforce length of rows.

Change the interface to allow composing with call-with-input-xxxx and the like.

I tried using a bigger string buffer and using the same buffer for each instantiated reader, but that made it run slower than using a new buffer for each line.

Anyway, I would like to write some tests to make sure it produces correct output. Then I would like to make it fast. After that, I would like to make it pretty.