The World Wide Web by Email

by Odd de Presno


Sample text from the Online World Monitor newsletter
ISSN: 0805-6315. August 1994. (C) by Odd de Presno, Norway.

Links are not maintained! Check the handbook for current links.


Most people only have email access to the Internet, and are therefore deprived of interactive access to the World Wide Web. The good news is that most pages are available by email!
Request WWW pages by sending email to agora@www0.cern.ch. Put your retrieval commands in the BODY of the mail, like this:

send <URL>

Example:

www http://www.biotech.washington.edu/WebCrawler/WebCrawlerExamples.html
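If you would rather compose the request from a script, the following is a minimal Python sketch, not part of the service itself. The function name build_request and the sender address are hypothetical, and the command word is a parameter because the text shows both "send" and "www". Actually delivering the message (for example with smtplib) depends on your own mail setup and is left out.

```python
from email.message import EmailMessage

AGORA_ADDRESS = "agora@www0.cern.ch"  # server address given in the text


def build_request(url, sender, command="send"):
    """Return an EmailMessage whose body holds one retrieval command."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = AGORA_ADDRESS
    # The command goes in the BODY of the mail; no subject is set here
    # (leaving it out is an assumption, the text does not mention one).
    msg.set_content(f"{command} {url}\n")
    return msg


request = build_request(
    "http://www.biotech.washington.edu/WebCrawler/WebCrawlerExamples.html",
    sender="you@example.org",  # hypothetical sender address
)
print(request.get_content())
```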

That's all. Lean back and wait. You will get a page filled with hints on how to use the WebCrawler service. The mail will look like this:

Example

Date: Mon, 15 Aug 1994 18:10:44 +0200
From: daemon@www0.cern.ch (The CERN WWW Team Administration)
Subject: Hints for Searching the WebCrawler Index (was: )

This is a test version. Please mail any comments to www-request@info.cern.ch

The document you requested, which URL is http://www.biotech.washington.edu/WebCrawler/WebCrawlerExamples.html, follows

Hints for Searching the WebCrawler Index

The WebCrawler knows about a lot of documents, so it pays to make precise queries. Often, though, you can be too precise, so finding what you want may take a couple of queries. Here are some suggestions about what to do when you don't get what you want, some examples to help you out, and detailed explanation of what happens to your query before it's run.

WHAT TO DO WHEN...

Your search produces no results. Check your spelling! If that looks OK, then try to be less specific in your query. For instance, the query molecular biotechnology DNA sequencing genetics chromosome human genome project is too specific -- no one document contains all of those keywords. Something like molecular biotechnology DNA sequencing is more appropriate.
Your search produces too many results. Be more specific, and make sure you have the AND button checked. Try to think of words that uniquely identify what you're looking for. Some words are of little value, because they identify lots of documents in the WebCrawler's index. For instance, the words information and university together identify nearly half the documents in the index, so they're not very useful in trying to narrow down the search.
You get an error from the WebCrawler. The WebCrawler will return an unfriendly error message if it's too busy, or if it chokes on your query. If it repeatedly has trouble with your query, please let me know, as I'm trying to eliminate these problems. Thanks!

Examples

Most specific queries work quite well. For instance, if you're looking for information on the music group They Might Be Giants, search for They Might Be Giants, or just TMBG.
Some keywords are found in many places. For example, instead of searching for kermit, use something more descriptive like kermit columbia or kermit source code communication. Make sure the "AND" button is checked.
To find references to the New York Times, try the query New York Times. To be more specific, try something like New York Times online newspaper.

How a query works

The query is parsed into keywords on space and punctuation boundaries.
Each word is folded to lower case, and any endings are stripped (NeXT Computers becomes next computer).
Each word is checked against a stop list, to see if it's too common to worry about (to be or not to be is a null query!).
Each word is fed to the index, and the resulting lists of documents are combined.

bp@cs.washington.edu[1]

*** References from this document ***
[1] http://www.cs.washington.edu/homes/bp/bp.html
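The "How a query works" steps quoted in the sample reply can be illustrated with a small Python sketch. This is not the WebCrawler's actual code: the stop list, the tiny index, and the rule for stripping endings (drop one trailing "s") are stand-ins invented for the illustration.

```python
import re

# Hypothetical stop list and index; the real WebCrawler data is not given.
STOP_WORDS = {"to", "be", "or", "not", "the", "a", "of"}
INDEX = {
    "kermit": {"doc1", "doc2", "doc3"},
    "columbia": {"doc2", "doc4"},
}


def strip_ending(word):
    """Naive stand-in for ending removal: drop one trailing 's'."""
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word


def parse_query(query):
    """Split on spaces and punctuation, fold case, strip endings, drop stop words."""
    words = re.findall(r"[A-Za-z0-9]+", query)
    words = [strip_ending(w.lower()) for w in words]
    return [w for w in words if w not in STOP_WORDS]


def search(query, match_all=True):
    """Combine the per-keyword document sets (AND if match_all, else OR)."""
    keywords = parse_query(query)
    if not keywords:
        return set()  # a null query matches nothing
    doc_sets = [INDEX.get(w, set()) for w in keywords]
    if match_all:
        return set.intersection(*doc_sets)
    return set.union(*doc_sets)


print(parse_query("NeXT Computers"))
print(parse_query("to be or not to be"))
print(search("kermit columbia"))
```

With these stand-ins, "NeXT Computers" reduces to the keywords next and computer, and "to be or not to be" reduces to an empty keyword list, matching the two examples in the quoted text.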


The last line of the report is interesting. The "[1]" refers to the following entry in the page's text:

bp@cs.washington.edu[1]

Interactive WWW users can click on this reference to see the associated page. Those using email must send the URL at the bottom of the report back to the mail server to get it.
Actually, there is also a WWWmail command called "deep" that allows you to get all documents referenced in the URL you mentioned. If you replace the "www" command line above with

deep http://www.biotech.washington.edu/WebCrawler/WebCrawlerExamples.html

you will get both the "Hints" page and the one giving more information about bp@cs.washington.edu.
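Following references one at a time can also be scripted. The Python sketch below assumes the reply layout shown in the samples (a "References from this document" marker followed by "[n] URL" entries) and turns each entry into a follow-up command. The function names are hypothetical, and "send" is used as the default command word.

```python
import re

MARKER = "*** References from this document ***"
REFERENCE_LINE = re.compile(r"^\[(\d+)\]\s+(\S+)\s*$")


def extract_references(report):
    """Map reference numbers to URLs from the list at the end of a reply."""
    # Everything before the marker (including inline "[1]" pointers) is ignored.
    _, found, tail = report.partition(MARKER)
    if not found:
        return {}
    refs = {}
    for line in tail.splitlines():
        match = REFERENCE_LINE.match(line.strip())
        if match:
            refs[int(match.group(1))] = match.group(2)
    return refs


def follow_up_commands(report, command="send"):
    """Build one retrieval command per referenced URL, in reference order."""
    refs = extract_references(report)
    return [f"{command} {url}" for _, url in sorted(refs.items())]


sample = (
    "bp@cs.washington.edu[1]\n\n"
    "*** References from this document ***\n"
    "[1] http://www.cs.washington.edu/homes/bp/bp.html\n"
)
print(follow_up_commands(sample))
```

Each resulting line can be placed in the body of a new mail to the server, one request per referenced page.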

Note: If the requested document is too large, you'll only get the first 5,000 lines.

There may be many such reference pointers in the text, as illustrated by this page at the URL http://web2.xerox.com/digitrad

Example:

Date: Mon, 15 Aug 1994 14:03:10 +0200
From: daemon@www0.cern.ch (The CERN WWW Team Administration)
Subject: Digital Tradition Folk Song Full Text Search (was: )

This is a test version. Please mail any comments to www-request@info.cern.ch

The document you requested, which URL is http://web2.xerox.com/digitrad, follows

Digital Tradition Folk Song Full Text Search
DIGITAL TRADITION FOLK SONG DATABASE

This is a searchable index of the Digital Tradition Folk Song Database (April 1994 version). Please read About The Digital Tradition[1] and Searching Digital Tradition[2].

Full Text Search

You may enter a Search Pattern to select songs from the database.

Options: search titles[3] or search full text; show matching text or list titles only[4]; list first 50 or list more (100)[5]; default settings.

Contents

Keywords List[6]

Titles List[7]

Tunes List[8]

(DT of April 1994)

*** References from this document ***
[1] http://web2.xerox.com/docs/DigiTrad/AboutDigiTrad.html
[2] http://web2.xerox.com/docs/DigiTrad/DigiTradSearch.html
[3] http://web2.xerox.com/digitrad/titles
[4] http://web2.xerox.com/digitrad/short
[5] http://web2.xerox.com/digitrad/list=100
[6] http://web2.xerox.com/docs/DigiTrad/DigiTradKeywords.html
[7] http://web2.xerox.com/docs/DigiTrad/DigiTradTitles.html
[8] http://web2.xerox.com/docs/DigiTrad/DigiTradTunes.html


For more information about this WWW by mail service, send the word "help" to agora@www0.cern.ch.

Note: Another service delivers WWW pages by email at the address webmail@www.ucc.ie. Check http://www.ucc.ie/webmail/ for instructions.

The Online World Monitor newsletter

The newsletter and the book were companions. While the book describes the online world as it is, the newsletter tracked changes. It could more freely focus on selected offerings or phenomena than could be done within the strict framework of the book.

For more about the newsletter, see monitor.html
KIDLINK: http://www.kidlink.org


Feel free to redistribute as long as the text remains intact as it appears here (including this paragraph). Permission to quote/excerpt/reference in other media is hereby granted, so long as cited material is identified as coming from The Online World Monitor newsletter. For any other use, contact the author for permission.


The Online World resources handbook's text on paper, disk and in any other electronic form is © copyrighted 2000 by Odd de Presno.
Updated on November 15, 2000.