From ... Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!newsfeeds.belnet.be!news.belnet.be!news2.kpn.net!news.kpn.net!nslave.kpnqwest.net!nloc.kpnqwest.net!nmaster.kpnqwest.net!nreader3.kpnqwest.net.POSTED!not-for-mail Newsgroups: comp.lang.lisp Subject: Re: MD5 in LISP and abstraction inversions References: <87lmhrznup.fsf@Samaris.tunes.org> Mail-Copies-To: never From: Erik Naggum Message-ID: <3213610074551907@naggum.net> Organization: Naggum Software, Oslo, Norway Lines: 43 User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 01 Nov 2001 13:27:56 GMT X-Complaints-To: newsmaster@Norway.EU.net X-Trace: nreader3.kpnqwest.net 1004621276 193.71.66.49 (Thu, 01 Nov 2001 14:27:56 MET) NNTP-Posting-Date: Thu, 01 Nov 2001 14:27:56 MET Xref: archiver1.google.com comp.lang.lisp:18982 * Lieven Marchand | You've done the work on md5 and I haven't so this is guess work, but my | first thought where the problem would lie is that a lot of these | algorithms assume efficient 32 bit operations, which in CL will | cons. Given sufficient declarations the character->(unsigned-byte 8) | conversions with char-code should be negligable. MD5 seems to be designed with hardware implementations in mind. This is not a bad idea, but it tends to make software implementations weirder. I have written an MD5 function in Allegro CL, using several low-level features (which are still at a higher level than C) and decided to split the 32-bit numbers in two 16-bit parts. Emacs has done the same thing, but in a slightly different way. I wanted to add support for a bitwise rotate function that would end up using such instructions if available, but it is probably easier to write MD5 in assembly on each platform than to optimize it better. In the application I used this, MD5 hashes were a serious bottleneck, so it had to be better than using FFI to a C function. After having worked with the MD5 functions on and off for a month or so, I came to conclude that it _should_ be written in assembly, and started to do that, but it turned out to be extremely time-consuming work. The Intel processors are shy too many registers, so all the fun in trying to make it superfast was replaced by increasing frustration over the design of those processors. Still, initial estimates indicated that a hand- tuned assembly version would be about twice as fast as the code that gcc produced for the naive implementation found in the RFC, so it would be worth it at significant cost. I also think the new streams design from Franz Inc makes for a good way to deal with the 64-byte buffers. It is wrong to try to work with MD5 at the character level, and the typical use of MD5 is to ensure that some data stream is intact. If the application that consumes the stream can do both MD5 hashing and its real work at the same time, and can roll back the work if the MD5 hash turns out bad, much will be saved compared to making two passes, since we must assume that the MD5 hash is usually good. /// -- Norway is now run by a priest from the fundamentalist Christian People's Party, the fifth largest party representing one eighth of the electorate. -- Carrying a Swiss Army pocket knife in Oslo, Norway, is a criminal offense.