Subject: Re: Can not find older posting: Reading files (fast)
From: rpw3@rpw3.org (Rob Warnock)
Date: Sat, 27 Aug 2005 21:23:17 -0500
Newsgroups: comp.lang.lisp
Message-ID: <7ZKdnVpy7JiIvYzeRVn-pw@speakeasy.net>
Bernd Schmitt  <Bernd.Schmitt.News@gmx.net> wrote:
+---------------
| There was an interesting post (for a novice like me) about fast loading 
| a file, so that each line would be appended to a list. Actually this was 
| the wrong way; somebody pointed out that this would lead to quadratic
| time consumption if the file size doubled, so the solution had been to
| put new lines at the beginning and reverse the list after finishing.
+---------------

If you're willing to use a simple LOOP without necessarily
understanding it completely at this point in your learning,  ;-}
the following [from my "random utilities" junkbox] has linear
behavior in most CL implementations:

    (defun file-lines (path)
      "Sucks up an entire file from PATH into a list of freshly-allocated
      strings, returning two values: the list of strings and the number of
      lines read."
      (with-open-file (s path)
        (loop for line = (read-line s nil nil)
              while line
              collect line into lines
              counting t into line-count
              finally (return (values lines line-count)))))

With CMUCL-19a on FreeBSD 4.10 on a 1.855 GHz Mobile Athlon, the
above takes a hair over one second to read in an 11637220-byte file
of 266478 lines.
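
For comparison, the push-then-reverse idiom mentioned in the quoted
text might look something like the sketch below. The name
FILE-LINES-PUSH is made up for illustration; it is not from my junkbox
and I have not timed it. Each PUSH is constant-time and the single
NREVERSE at the end is linear, so the whole thing should stay linear in
the number of lines:

```lisp
;; Hypothetical variant of FILE-LINES, shown only to illustrate the
;; push-then-reverse idiom; the name is an assumption.
(defun file-lines-push (path)
  "Reads PATH into a list of strings by pushing each line onto the front
  of a list and reversing once at the end. Returns two values: the list
  of strings and the number of lines read."
  (with-open-file (s path)
    (let ((lines '())
          (line-count 0))
      (loop for line = (read-line s nil nil)
            while line
            do (push line lines)        ; O(1) per line
               (incf line-count))
      ;; One linear pass restores the original file order.
      (values (nreverse lines) line-count))))
```

The LOOP ... COLLECT form above avoids the explicit reversal, which is
why I tend to prefer it.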

And if, instead of a list of lines, you suck up the whole thing into
a single string [for later manipulation with CHAR or SUBSEQ]:

    (defun file-string (path)
      "Sucks up an entire file from PATH into a freshly-allocated string,
      returning two values: the string and the number of characters read
      (the same as the number of bytes for a single-byte encoding)."
      (with-open-file (s path)
        (let* ((len (file-length s))
               (data (make-string len)))
          (values data (read-sequence data s)))))

then on the same machine/file/etc. as above, it goes ~20 times as fast
(0.05 seconds of real time).
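
One caveat: FILE-LENGTH on a character stream commonly reports the size
in bytes, while READ-SEQUENCE returns the number of characters it
actually stored. With a multi-byte external format the string can
therefore end up longer than the data read into it. A minimal sketch of
one way to deal with that, assuming it is acceptable to copy the string
when the counts differ (the name FILE-STRING-TRIMMED is made up for
illustration):

```lisp
;; Hypothetical variant of FILE-STRING; the name and the trimming
;; strategy are assumptions, not part of the original utility.
(defun file-string-trimmed (path)
  "Reads PATH into a string, dropping any unfilled tail. Returns two
  values: the string and the number of characters read."
  (with-open-file (s path)
    (let* ((len (file-length s))          ; often a byte count
           (data (make-string len))
           (n (read-sequence data s)))    ; characters actually stored
      ;; Copy only when fewer characters than LEN were read.
      (values (if (< n len) (subseq data 0 n) data)
              n))))
```

For files in a single-byte encoding the two counts agree, no copy is
made, and this should behave the same as FILE-STRING.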


-Rob

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607