Subject: Re: why is this code inefficient??
From: Erik Naggum <email@example.com>
Date: 1998/07/29
Newsgroups: comp.lang.lisp
Message-ID: <firstname.lastname@example.org>

* kp gores
| why is this code so inefficient? where do i waste so much space??

  presuming it is only consing that causes the waste, let me try and
  synthesize some of the other responses and add a little of my own:

| (do
|  ((line (read-line input-file nil *eof*)(read-line input-file nil *eof*))

  READ-LINE frequently conses much more than one might think.  if you
  allocate a string buffer (henceforth called <buffer>) to write into with

    (make-array <some guess> :element-type 'character
                :adjustable t :fill-pointer 0)

  and fill it with some loop like

    (loop initially (setf (fill-pointer <buffer>) 0)
          for character = (read-char <stream> nil nil)
          when (null character)
            do <some non-local exit>
          until (char= character #\newline)
          do (vector-push-extend character <buffer>))

  you will keep the same buffer all the time, and it will grow to
  accommodate the longest string in a space-economical fashion.

|   (spelling "")
|   (pronounciation "")
|   (word-class-code "")
|   (verb-pattern ""))

  these appear to be mere substrings into the line you just read, but you
  don't use them for anything other than to cons up new strings.  I'd do
  away with these altogether and instead use indices into the buffer.

|   (setf spelling (mytrim (subseq line 0 23))) ;; doc: 1-25

    (position-if-not #'whitespace-char-p <buffer> :start 0 :end 23)
    (position-if-not #'whitespace-char-p <buffer> :start 0 :end 23 :from-end t)

  will produce the positions of the first and the last non-whitespace
  character in the field (add one to the second to get an exclusive end
  for :END arguments).  the substring they delimit is the same under EQUAL
  as your expensive solution.  WHITESPACE-CHAR-P is not a standard
  function, so define it yourself if your implementation lacks it.

|   (nsubstitute
|     #\Space
|     #\,
|     (subseq line 46 69)))) ;; doc: 51-70

  SUBSTITUTE and friends take :START and :END arguments -- use them.
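  the buffer-and-indices approach above can be sketched roughly as follows.
  this is only an illustration under stated assumptions: the names
  READ-LINE-INTO, FIELD-BOUNDS and BLANK-COMMAS are invented for the
  example, and WHITESPACE-CHAR-P is defined here as "space or tab", which
  may not match the real data.  column numbers are clamped to the length
  of the current line so that short lines do not signal errors.

```lisp
;; Illustrative sketch only.  READ-LINE-INTO, FIELD-BOUNDS, BLANK-COMMAS and
;; this definition of WHITESPACE-CHAR-P are assumptions made for the example;
;; none of them comes from the original post or from the standard.

(defun whitespace-char-p (character)
  "True for space and tab; extend as the data requires."
  (or (char= character #\Space) (char= character #\Tab)))

(defun read-line-into (buffer stream)
  "Overwrite BUFFER with the next line of STREAM, excluding the newline.
Return BUFFER, or NIL when the stream is already at end of file."
  (setf (fill-pointer buffer) 0)
  (loop for character = (read-char stream nil nil)
        do (cond ((null character)
                  ;; End of file: a final line without a newline still counts.
                  (return (if (plusp (fill-pointer buffer)) buffer nil)))
                 ((char= character #\Newline)
                  (return buffer))
                 (t
                  (vector-push-extend character buffer)))))

(defun field-bounds (buffer start end)
  "Return two values, an inclusive start and an exclusive end, delimiting
the non-blank part of BUFFER between START and END.  Return NIL if the
field is blank or lies beyond the end of the line."
  (let* ((end (min end (length buffer)))
         (start (min start end))
         (left (position-if-not #'whitespace-char-p buffer
                                :start start :end end))
         (right (position-if-not #'whitespace-char-p buffer
                                 :start start :end end :from-end t)))
    (when left
      (values left (1+ right)))))

(defun blank-commas (buffer start end)
  "Replace commas with spaces in BUFFER between START and END, in place."
  (let ((end (min end (length buffer))))
    (when (< start end)
      (nsubstitute #\Space #\, buffer :start start :end end))
    buffer))

(defun demo-field-bounds ()
  "Read one line into a reusable buffer and locate its first field."
  (let ((buffer (make-array 80 :element-type 'character
                               :adjustable t :fill-pointer 0)))
    (with-input-from-string (stream (format nil "  abandon   x~%next~%"))
      (read-line-into buffer stream)
      (multiple-value-bind (left right) (field-bounds buffer 0 10)
        ;; SUBSEQ is used here only to show the result; the point of the
        ;; technique is to pass LEFT and RIGHT along instead of copying.
        (list left right (subseq buffer left right))))))
```

  the same buffer is reused for every line, so the only allocation in the
  steady state should be the occasional growth by VECTOR-PUSH-EXTEND.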
|   (princ "(setf (gethash " output-file)
|   (prin1 spelling output-file)
|   (princ " " output-file)
|   (princ 'dict output-file)
|   (princ ") '(" output-file)
|   (prin1 word-class-code output-file)
|   (princ "))" output-file)

  if you are merely going to write out substrings, you can use WRITE-STRING
  and use its :START and :END arguments to delimit the substring written.
  unlike PRIN1, WRITE-STRING neither quotes nor escapes, so this will
  require that you write the enclosing quotation marks yourself and check
  for backslashes and quotation marks in the string written, and that
  might not be cost-effective.  if you can statically determine that they
  don't appear in the material, you can save a lot this way, however.

  you can write (write-string " dict) '(" ...) directly, since PRINC will
  print the symbol-name of a symbol.

  if you want to use strings, I would instead go for displaced arrays into
  the pre-allocated buffer, and then you can do something even nicer.
  preallocate the variables with this general form:

    (make-array 0 :adjustable t :element-type 'character
                :displaced-to <buffer> :displaced-index-offset 0)

  and construct the list to be printed like this once and for all, i.e.,
  outside of the loop:

    `(setf (gethash ,spelling dict) '(,word-class-code))

  now _move_ the displaced array instead of consing a new array header for
  `spelling' and `word-class-code'.

    (adjust-array <array> (- <end> <start>)
                  :displaced-to <buffer> :displaced-index-offset <start>)

  this should not cons a new array header and the value _should_ be EQ to
  the <array> argument.  (I don't find any guarantees to this effect right
  now, but it appears to have been the intention.  you might want to check.)

  now you can print this whole list with PRINT, and it will take care of
  itself.  I don't see any reason why printing this list should cons.

  glue this stuff together and let me know how it works...

#:Erik
-- 
http://www.naggum.no/spam.html is about my spam protection scheme and how to guarantee that you reach me.
in brief: if you reply to a news article of mine, be sure to include an In-Reply-To or References header with the message-ID of that message in it. otherwise, you need to read that page.
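
  a minimal sketch of the two output techniques described in the post:
  writing a field with WRITE-STRING and :START/:END, and moving displaced
  arrays with ADJUST-ARRAY.  the helper names WRITE-QUOTED-FIELD,
  MAKE-FIELD, MOVE-FIELD and DEMO-FIELDS are invented for the example, and
  the "word code" line layout is a simplification of the real
  fixed-column data.  on the EQ question: the ADJUST-ARRAY entry in the
  ANSI standard says that when the array is actually adjustable, the
  array returned is identical to the argument, which is what the sketch
  relies on.

```lisp
;; Illustrative sketch only.  WRITE-QUOTED-FIELD, MAKE-FIELD, MOVE-FIELD and
;; DEMO-FIELDS are hypothetical names, and the "word code" line format is an
;; assumption made to keep the example short.

(defun write-quoted-field (buffer start end stream)
  "Write the characters of BUFFER between START and END to STREAM inside
double quotes, without consing a substring.  Assumes the field contains
neither backslashes nor double quotes, since nothing is escaped."
  (write-char #\" stream)
  (write-string buffer stream :start start :end end)
  (write-char #\" stream))

(defun make-field (buffer)
  "Create an empty, adjustable string displaced into BUFFER."
  (make-array 0 :element-type 'character :adjustable t
                :displaced-to buffer :displaced-index-offset 0))

(defun move-field (field buffer start end)
  "Re-aim FIELD at the characters of BUFFER between START and END.
END must not exceed the total size of BUFFER.  Returns FIELD."
  (adjust-array field (- end start)
                :displaced-to buffer :displaced-index-offset start))

(defun demo-fields (lines)
  "For each string in LINES of the form \"word code\", return a list
(spelling code same-object-p).  The fields are copied only for inspection."
  (let* ((buffer (make-array 16 :element-type 'character
                                :adjustable t :fill-pointer 0))
         (spelling (make-field buffer))
         (word-class-code (make-field buffer))
         ;; Built once, outside the loop; it keeps pointing at the same
         ;; two displaced arrays for every line.
         (form `(setf (gethash ,spelling dict) '(,word-class-code))))
    (loop for line in lines
          collect (progn
                    ;; Refill the shared buffer (stands in for reading a line).
                    (setf (fill-pointer buffer) 0)
                    (loop for character across line
                          do (vector-push-extend character buffer))
                    (let* ((gap (position #\Space buffer))
                           (moved (move-field spelling buffer 0 gap)))
                      (move-field word-class-code buffer
                                  (1+ gap) (length buffer))
                      ;; (second (second form)) is the SPELLING array as
                      ;; seen through the prebuilt list.
                      (list (copy-seq (second (second form)))
                            (copy-seq word-class-code)
                            (eq moved spelling)))))))
```

  in a real loop you would call (print form output-file) where the sketch
  collects copies, and the COPY-SEQ calls would disappear.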