Re: <URL: yada yada

Rohit Khare (khare@w3.org)
Wed, 25 Jun 1997 22:17:48 -0400 (EDT)


> in the newsletter RFC-1738 is my story and I'm stickin' to it. See
> the notes to <http://www.tbtf.com/archive/03-01-97.html>.

Keith is referring to the obsolescent December 1994 URL RFC (RFC 1738),
which reads as follows.

=======

Characters can be unsafe for a number of reasons. The space
character is unsafe because significant spaces may disappear and
insignificant spaces may be introduced when URLs are transcribed or
typeset or subjected to the treatment of word-processing programs.
The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used to
delimit URLs in some systems. The character "#" is unsafe and should
always be encoded because it is used in World Wide Web and in other
systems to delimit a URL from a fragment/anchor identifier that might
follow it. The character "%" is unsafe because it is used for
encodings of other characters. Other characters are unsafe because
gateways and other transport agents are known to sometimes modify
such characters. These characters are "{", "}", "|", "\", "^", "~",
"[", "]", and "`".

All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding.

=======
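
To make the rule above concrete, here is a minimal Python sketch of
encoding the unsafe characters that passage lists. The name encode_unsafe
and the UNSAFE set are my own illustration, not anything the RFC defines,
and the sketch assumes the input is a single URL component rather than a
whole URL.

```python
# Characters the quoted passage names as unsafe. Control characters and
# non-ASCII octets, which the RFC treats separately, are left out here.
UNSAFE = set(' <>"#%{}|\\^~[]`')


def encode_unsafe(text):
    """Percent-encode each unsafe character in text as %XX (uppercase hex)."""
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in UNSAFE else ch
        for ch in text
    )


print(encode_unsafe("my file#1.html"))  # my%20file%231.html
```

Because "%" is itself in the set, running the function twice on the same
string double-encodes it, so it should only be applied to raw text.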

Roy Fielding did the community a service with his definitive rewrite of a
correct URL grammar in RFC 1808, which IS standards track, but introduced
the ugliness we are now fighting :-)

=======

For protocols that make use
of message headers like those described in RFC 822 [5], we recommend
that the format of this header be:

   base-header = "Base" ":" "<URL:" absoluteURL ">"

where "Base" is case-insensitive and any whitespace (including that
used for line folding) inside the angle brackets is ignored. For
example, the header field

   Base: <URL:http://www.ics.uci.edu/Test/a/b/c>

would indicate that the base URL for that message is the string
"http://www.ics.uci.edu/Test/a/b/c". The base URL for a message
serves as both the base for any relative URLs within the message
headers and the default base URL for documents enclosed within the
message, as described in the next section.

=======
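
For what it's worth, a rough Python sketch of reading such a header might
look like the following. The function name parse_base_header is
hypothetical. The sketch assumes the literal "URL:" prefix is matched in
upper case only (the quoted text says only that "Base" is case-insensitive)
and it makes no attempt to validate absoluteURL.

```python
import re


def parse_base_header(line):
    """Return the base URL from a 'Base: <URL:...>' header line, or None."""
    name, sep, value = line.partition(":")
    if not sep or name.strip().lower() != "base":
        return None  # not a Base header ("Base" is case-insensitive)
    match = re.fullmatch(r"\s*<(.*)>\s*", value, flags=re.DOTALL)
    if match is None:
        return None  # no angle-bracket wrapper
    # Whitespace inside the angle brackets, including folding, is ignored.
    inner = re.sub(r"\s+", "", match.group(1))
    if not inner.startswith("URL:"):
        return None  # assumption: the "URL:" prefix is upper case
    return inner[len("URL:"):]


print(parse_base_header("Base: <URL:http://www.ics.uci.edu/Test/a/b/c>"))
# http://www.ics.uci.edu/Test/a/b/c
```

A folded header such as "BASE: <URL:http://www.ics.uci.edu/\r\n Test/a/b/c>"
yields the same string, since the line break and leading space fall inside
the brackets.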

I've CC:d Connolly and Fielding in case they'd like to weigh in ... RK