Christopher Browne <email@example.com> wrote:
| Will Hartung said:
| >Somewhere, perhaps PG's home page (not that I have the URL, mind you), is
| >his enumerations about "Web Success". In them he frowns on server generated
| >HTML pages, so this would fit in quite well philosophically.
| The thing that seems most relevant:
| Dynamically generated HTML is bad, because search engines ignore it.
Also see Philip Greenspun's comments on this issue, and his solution:
[...skip down 2/3 of the way...]
Hiding Your Content from Search Engines (By Mistake)
I built a question and answer forum...all the postings were
stored in a relational database. ... The URLs end up looking
AltaVista comes along and says, "Look at that question mark.
Look at the strange .tcl extension. This looks like a CGI script
to me. I'm going to be nice and not follow this link even though
there is no robots.txt file to discourage me."
Then WebCrawler says the same thing.
I achieved oblivion.
Briefly, his solution was:
Write another AOLserver Tcl program that presents all the messages
from URLs that look like static files, e.g., "/fetch-msg-000037.html"
and point the search engines to a huge page of links like that.
The text of the Q&A forum postings will get indexed out of these
pseudo-static files and yet I can retain the user pages with their
(see my discussion of why the AOLserver *.tcl URLs are so good in
the chapters on Web programming; see http://photo.net/wtr/thebook/
bboard-for-search-engines.txt for the source code).
[Greenspun uses Tcl where many of us would choose Lisp (or even Scheme).]
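
For anyone who wants to see the shape of the trick without reading the Tcl, here is a minimal sketch in Python. It is not Greenspun's code: the MESSAGES dictionary stands in for his relational database, and the function names are made up for illustration. Only the "/fetch-msg-000037.html" URL shape comes from the text quoted above.

```python
import html
import re

# Hypothetical in-memory stand-in for the relational database of postings.
MESSAGES = {
    "000037": "How do I keep search engines from skipping my forum?",
    "000038": "Serve each posting from a URL that looks like a static file.",
}

# Matches paths shaped like the example in the quoted text:
# /fetch-msg-000037.html
PSEUDO_STATIC = re.compile(r"^/fetch-msg-(\d+)\.html$")


def msg_id_from_path(path):
    """Return the message id embedded in a pseudo-static path, or None."""
    match = PSEUDO_STATIC.match(path)
    return match.group(1) if match else None


def render_message(path):
    """Build the HTML page for a pseudo-static path, or None if unknown."""
    msg_id = msg_id_from_path(path)
    if msg_id is None or msg_id not in MESSAGES:
        return None
    body = html.escape(MESSAGES[msg_id])
    return f"<html><body><p>{body}</p></body></html>"


def render_index():
    """Build the page of links that the search engines are pointed at."""
    links = "\n".join(
        f'<a href="/fetch-msg-{msg_id}.html">Message {msg_id}</a><br>'
        for msg_id in sorted(MESSAGES)
    )
    return f"<html><body>\n{links}\n</body></html>"
```

The pages are still generated on every request; the crawler simply sees no question mark and no script extension, so it follows the links from the index page and indexes the text.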
Rob Warnock, 31-2-510 firstname.lastname@example.org
Network Engineering http://reality.sgi.com/rpw3/
Silicon Graphics, Inc. Phone: 650-933-1673
1600 Amphitheatre Pkwy. PP-ASEL-IA
Mountain View, CA 94043