multisort


multisort takes any number of httpd logfiles in the Common Log Format and merges them together, ordered by the date string. This is extremely useful for a website that is hosted on multiple servers behind round-robin DNS; it lets you merge the per-server logfiles into one file for analysis.
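To make "ordered by the date string" concrete, here is a minimal Python sketch of pulling a comparable timestamp out of a Common Log Format line. It is not taken from the multisort source; the function name `clf_timestamp` is made up for illustration, and it assumes the first `[` ... `]` pair on the line is the date field and that month names are in English.

```python
from datetime import datetime


def clf_timestamp(line):
    """Return the timestamp of a Common Log Format line as an aware datetime.

    Assumption: the date is the first bracketed field on the line,
    e.g. [14/Jan/1999:13:54:38 -0500].
    """
    start = line.index("[") + 1
    end = line.index("]", start)
    # %z parses the numeric UTC offset, so lines logged in different
    # timezones still compare correctly.
    return datetime.strptime(line[start:end], "%d/%b/%Y:%H:%M:%S %z")
```

Comparing the parsed values rather than the raw strings matters because the raw date string starts with the day of the month and spells the month out, so a plain string comparison would not order lines chronologically across month boundaries.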

multisort is badly named, since it doesn't do any actual sorting. It just takes logfiles that are already sorted and merges them. A better name might be mergelog, but someone already has that, and it's not only named better than multisort, it allegedly performs better too.
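The merge-without-sorting idea can be sketched in a few lines of Python using the standard library's `heapq.merge`. This is only an illustration of the technique, not how the C program is written; `clf_timestamp` and `merge_logs` are hypothetical names, and the date field is assumed to be the first bracketed field on each line.

```python
import heapq
from datetime import datetime


def clf_timestamp(line):
    """Parse the bracketed date string of a Common Log Format line."""
    start = line.index("[") + 1
    end = line.index("]", start)
    return datetime.strptime(line[start:end], "%d/%b/%Y:%H:%M:%S %z")


def merge_logs(*logs):
    """Lazily merge iterables of log lines, each already in time order.

    Only the current head line of each input is compared, so nothing is
    sorted: if an input is out of order, the output will be too.
    """
    return heapq.merge(*logs, key=clf_timestamp)
```

Open file objects iterate line by line, so they can be passed to `merge_logs` directly and the result written out with `sys.stdout.writelines`, without reading whole logfiles into memory.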

multisort is written in C, and it's quite fast. I made a Perl version, but for small jobs (~150,000 lines) its startup time alone makes the C version of multisort much faster.

Source

Version 1.1 is available here. If you have any questions or comments about it, feel free to email me at xach@xach.com. I'm reasonably sure there aren't any bugs, but only reasonably sure. :-)

Example

Here's an example:

log1.log

127.198.54.33 - - [14/Jan/1999:13:54:37 -0500] "GET /mint/images/top_maindesigns_on.jpg HTTP/1.0" 200 7164
127.222.216.2 - - [14/Jan/1999:13:54:38 -0500] "GET /sad49/ HTTP/1.0" 200 11012
127.14.111.187 - - [14/Jan/1999:13:54:40 -0500] "GET /~caldev/1-12-1.JPG HTTP/1.0" 302 221
127.222.216.2 - - [14/Jan/1999:13:54:51 -0500] "GET /mint/images/top_mint_off.jpg HTTP/1.0" 200 6566
127.222.216.2 - - [14/Jan/1999:13:54:51 -0500] "GET /mint/images/top_mint_on.jpg HTTP/1.0" 200 7515
127.222.216.2 - - [14/Jan/1999:13:54:54 -0500] "GET /mint/images/top_gseg_on.jpg HTTP/1.0" 200 6952
127.222.216.2 - - [14/Jan/1999:13:54:54 -0500] "GET /~halifax1/festoon.jpg HTTP/1.0" 302 224
127.220.39.41 - - [14/Jan/1999:13:54:55 -0500] "GET /~icemanx/ HTTP/1.0" 302 212
127.67.244.87 - - [14/Jan/1999:13:54:55 -0500] "GET /cgi-bin/digits.pl?name=ALWAYSLIVEANDSTUFF|style=digital HTTP/1.0" 200 10257
127.147.65.114 - - [14/Jan/1999:13:54:56 -0500] "GET /agi/brochure.pdf HTTP/1.0" 200 150780

log2.log

127.198.54.33 - - [14/Jan/1999:13:54:29 -0500] "GET /mint/images/top_infocenter_off.jpg HTTP/1.0" 200 6904
127.198.54.33 - - [14/Jan/1999:13:54:29 -0500] "GET /mint/images/top_gseg_off.jpg HTTP/1.0" 200 6462
127.198.54.33 - - [14/Jan/1999:13:54:29 -0500] "GET /mint/images/top_pdirect_on.jpg HTTP/1.0" 200 7545
127.222.216.2 - - [14/Jan/1999:13:54:33 -0500] "Connection: close" 400 -
127.198.54.33 - - [14/Jan/1999:13:54:34 -0500] "GET /mint/images/top_quicksilver_on.jpg HTTP/1.0" 200 7397
127.198.54.33 - - [14/Jan/1999:13:54:37 -0500] "GET /mint/images/top_infocenter_on.jpg HTTP/1.0" 200 7103
127.26.123.43 - - [14/Jan/1999:13:54:44 -0500] "GET /~ravensfm/nec.jpg HTTP/1.1" 302 232
127.213.176.25 - - [14/Jan/1999:13:54:47 -0500] "GET /~ravensfm/purse.jpg HTTP/1.0" 302 222
127.220.46.147 - - [14/Jan/1999:13:54:54 -0500] "GET /mint/mint_info/password.html HTTP/1.0" 200 2687
127.205.44.63 - - [14/Jan/1999:13:54:59 -0500] "GET /~caldev/1-14-3.JPG HTTP/1.0" 302 221

And the merged output would be:

127.198.54.33 - - [14/Jan/1999:13:54:29 -0500] "GET /mint/images/top_infocenter_off.jpg HTTP/1.0" 200 6904
127.198.54.33 - - [14/Jan/1999:13:54:29 -0500] "GET /mint/images/top_gseg_off.jpg HTTP/1.0" 200 6462
127.198.54.33 - - [14/Jan/1999:13:54:29 -0500] "GET /mint/images/top_pdirect_on.jpg HTTP/1.0" 200 7545
127.222.216.2 - - [14/Jan/1999:13:54:33 -0500] "Connection: close" 400 -
127.198.54.33 - - [14/Jan/1999:13:54:34 -0500] "GET /mint/images/top_quicksilver_on.jpg HTTP/1.0" 200 7397
127.198.54.33 - - [14/Jan/1999:13:54:37 -0500] "GET /mint/images/top_maindesigns_on.jpg HTTP/1.0" 200 7164
127.198.54.33 - - [14/Jan/1999:13:54:37 -0500] "GET /mint/images/top_infocenter_on.jpg HTTP/1.0" 200 7103
127.222.216.2 - - [14/Jan/1999:13:54:38 -0500] "GET /sad49/ HTTP/1.0" 200 11012
127.14.111.187 - - [14/Jan/1999:13:54:40 -0500] "GET /~caldev/1-12-1.JPG HTTP/1.0" 302 221
127.26.123.43 - - [14/Jan/1999:13:54:44 -0500] "GET /~ravensfm/nec.jpg HTTP/1.1" 302 232
127.213.176.25 - - [14/Jan/1999:13:54:47 -0500] "GET /~ravensfm/purse.jpg HTTP/1.0" 302 222
127.222.216.2 - - [14/Jan/1999:13:54:51 -0500] "GET /mint/images/top_mint_off.jpg HTTP/1.0" 200 6566
127.222.216.2 - - [14/Jan/1999:13:54:51 -0500] "GET /mint/images/top_mint_on.jpg HTTP/1.0" 200 7515
127.222.216.2 - - [14/Jan/1999:13:54:54 -0500] "GET /mint/images/top_gseg_on.jpg HTTP/1.0" 200 6952
127.222.216.2 - - [14/Jan/1999:13:54:54 -0500] "GET /~halifax1/festoon.jpg HTTP/1.0" 302 224
127.220.46.147 - - [14/Jan/1999:13:54:54 -0500] "GET /mint/mint_info/password.html HTTP/1.0" 200 2687
127.220.39.41 - - [14/Jan/1999:13:54:55 -0500] "GET /~icemanx/ HTTP/1.0" 302 212
127.67.244.87 - - [14/Jan/1999:13:54:55 -0500] "GET /cgi-bin/digits.pl?name=ALWAYSLIVEANDSTUFF|style=digital HTTP/1.0" 200 10257
127.147.65.114 - - [14/Jan/1999:13:54:56 -0500] "GET /agi/brochure.pdf HTTP/1.0" 200 150780
127.205.44.63 - - [14/Jan/1999:13:54:59 -0500] "GET /~caldev/1-14-3.JPG HTTP/1.0" 302 221