Subject: Re: Loading a web page with Lispworks
From: rpw3@rpw3.org (Rob Warnock)
Date: Wed, 02 Apr 2008 03:39:54 -0500
Newsgroups: comp.lang.lisp
Message-ID: <4PGdnWuCjO3H227anZ2dnUVZ_jGdnZ2d@speakeasy.net>
<littlelisper@hotmail.com> wrote:
+---------------
| Hi, I am a Lisp newbie. I am trying to load a web page into a string
| with Lispworks, in order to parse the html code.
| I am trying to use open-tcp-stream, but I am confused about the value
| I have to use for the parameter "service".
| Basically I just want to do:
| 
| (with-open-stream (web-page (comm:open-tcp-stream
| "www.thewebpage.com" ?))
|   (loop for line = (read-line web-page nil nil)
|         while line
|         do (write-line line)))
| 
| First, I don't know what the value of "?" should be. Second, I have
| the hunch this will not work anyway. Can anyone tell me what am I
| missing?
+---------------

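You're missing quite a bit. The "service" argument is just the TCP
port to connect to (80 for plain HTTP), but a web server sends nothing
until the client has sent it a request. See RFC 1945 "Hypertext
Transfer Protocol -- HTTP/1.0" <http://www.ietf.org/rfc/rfc1945.txt>
and RFC 2616 "Hypertext Transfer Protocol -- HTTP/1.1"
<http://www.ietf.org/rfc/rfc2616.txt> for details.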

Note that for simple client apps you can usually get away with
implementing only the HTTP/1.0 protocol, *except*... even when
doing HTTP/1.0 you should *always* send the HTTP/1.1 "Host:" request
header anyway, since so many web sites these days use name-based
virtual hosting. [Fortunately, all of the popular web servers will,
as required by HTTP/1.1, correctly interpret the "Host:" request
header even if the request uses HTTP/1.0 protocol otherwise.]
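
As a rough illustration of the HTTP/1.0-plus-"Host:" request described
above, here is a minimal sketch. MAKE-HTTP-GET-REQUEST and FETCH-PAGE
are made-up names, not part of LispWorks. The LispWorks-only part
assumes that COMM:OPEN-TCP-STREAM accepts a port number as its service
argument and an :ERRORP keyword, and that the stream passes CR and LF
through untranslated; check the LispWorks manual before relying on
either. It ignores redirects, character encodings, and error statuses.

```lisp
;;; Sketch only: both function names below are hypothetical.

(defun make-http-get-request (host &optional (path "/"))
  "Return the text of an HTTP/1.0 GET request for PATH with a Host: header."
  (let ((crlf (coerce (list #\Return #\Linefeed) 'string)))
    (concatenate 'string
                 "GET " path " HTTP/1.0" crlf
                 "Host: " host crlf
                 crlf)))                ; the empty line ends the headers

#+lispworks
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require "comm"))

#+lispworks
(defun fetch-page (host &optional (path "/"))
  "Return the raw response (status line, headers, body) as one string."
  ;; 80 is the standard HTTP port; :errorp t is assumed to signal an
  ;; error instead of returning NIL when the connection fails.
  (with-open-stream (web-page (comm:open-tcp-stream host 80 :errorp t))
    (write-string (make-http-get-request host path) web-page)
    (force-output web-page)
    ;; An HTTP/1.0 server closes the connection after the response,
    ;; so reading until end-of-file collects the whole reply.
    (with-output-to-string (out)
      (loop for line = (read-line web-page nil nil)
            while line
            do (write-line (string-right-trim '(#\Return) line) out)))))
```

In LispWorks, (fetch-page "www.thewebpage.com") should then return
the status line and response headers followed by the HTML, so the
caller still has to skip everything up to the first empty line.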


-Rob

p.s. A simple standards-conforming HTTP/1.0 [+"Host:"] "GET"-only
client *can* be written in only ~100 lines of CL. [Handling "POST"
is only slightly more complicated.]
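
One small piece of such a client, sketched under the assumption that
the reply starts with an RFC 1945 Status-Line of the form
"HTTP-Version SP Status-Code SP Reason-Phrase". PARSE-STATUS-LINE is
a made-up name, and this is an illustration rather than the code
referred to above.

```lisp
;;; Sketch only: PARSE-STATUS-LINE is a hypothetical helper.

(defun parse-status-line (line)
  "Return status code, reason phrase, and HTTP version as three values,
or NIL if LINE does not look like an HTTP Status-Line."
  (let* ((line (string-right-trim '(#\Return #\Linefeed) line))
         (sp1 (position #\Space line))
         (sp2 (and sp1 (position #\Space line :start (1+ sp1)))))
    (when (and sp1
               (> (length line) 5)
               (string= "HTTP/" line :end2 5))
      (let ((code (parse-integer line :start (1+ sp1) :end sp2
                                      :junk-allowed t)))
        (when code
          (values code
                  (if sp2 (subseq line (1+ sp2)) "")
                  (subseq line 0 sp1)))))))
```

A NIL result is worth handling separately, since RFC 1945 also
allows an HTTP/0.9-style "Simple-Response" that has no status line
at all.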

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607