text/quoted-unprintable MIME Content-type - Apr 1 thirtyfive days later...

I Find Karma (adam@cs.caltech.edu)
Sun, 5 May 96 13:01:37 PDT


Sorry it's so long after April 1, but I really enjoyed this... -- A.

---> included message:

Network Working Group                                       J. Williams
Request for Comments: XXXX                                 1 April 1996
Category: Experimental

The text/quoted-unprintable MIME Content-type

Status of this Memo

This memo describes an experimental method for the encoding of obscene
and indecent speech for transmission on the Internet. This encoding
allows for adult discussion of obscene speech without violation of the
Communications Decency Act, while remaining child-safe. This is an
experimental, not recommended standard. Distribution of this memo is
unlimited.

Overview and Rationale

The Communications Decency Act (CDA), part of the Telecommunications
Reform bill of 1996, has made the transmission of "indecent" speech
via the Internet a criminal offense. In protest against this Act, various
authors [ADAMS96] [BARRY96] have suggested the use of alternate
encodings for various words and phrases that have been previously
classified as "obscene", and therefore not permissible under the CDA.
Unfortunately, these authors don't all agree on the encoding that
should be used. This presents a serious impediment to the transmission
of indecent speech on the net. Note that the author does not advocate
the transmission of indecent speech; however, if the Internet community
is to debate this issue, it would be helpful to have an encoding for
indecent speech that would allow us to freely interchange messages that
contain such speech without violating the CDA.

Obviously, there is an urgent need for a standard in this area. This
document defines a MIME content-type by which obscene words can be
transmitted via the Internet without violating the CDA. The encoding
used is reversible such that software on the receiving end can, if
desired, translate the encoded strings back into their original form.
Note that this encoding may not be sufficient to guarantee that a
message does not violate the CDA.

Scope

The Communications Decency Act bans "indecent" or "patently obscene"
speech on the net. Unfortunately, no adequate definition of these
terms exists that would let us define a comprehensive standard. We
have therefore chosen to encode only a few words that are likely to
be judged "obscene" and therefore "indecent". There are seven
well-known words that have previously been banned for use on the public
airwaves [FCC78]. We have chosen to provide an encoding for only those
seven words. Future versions of this standard may define additional
encodings.

Content-type text/quoted-unprintable

It is difficult to define a standard for the encoding of obscene words
in such a way that the standard itself can be transmitted over the
Internet without being encoded. The author has chosen the
unsatisfactory solution of replacing all the vowels in the obscene
words (hereinafter called "target strings") with the octet '*'. It
is hoped that the reader will understand the difficulties here and bear
with us. Here then are the seven target strings which must be encoded.

Target Strings
--------------
1) c*ck-s*ck*r
2) c*nt
3) f*ck
4) m*th*r-f*ck*r
5) p*ss
6) sh*t
7) t*t

The observant reader will note that word 4 is a special case of word
3. Any encoding that renders word 3 acceptable for transmission
should therefore work for word 4 as well, leaving us with only six
words to encode.

Method of Encoding

The encoding method chosen was inspired by the quoted-printable
content-transfer-encoding defined by the MIME standard [RFC1521]. A
single octet, '=', was chosen as an escape character. Every instance
of '=' in the input text is replaced by the string "==". Every
instance of one of the target strings in the input text is replaced by the
string "=XXX=", where XXX represents the encoding for that target
string.

Upon receipt, the message can be decoded by replacing all strings of
the form "==" with the string "=", and all strings of the form
"=XXX=" with the appropriate target string. Any string found between
matching '=' octets that is not one of the defined encodings should
be left unchanged by the decoding process.

A stated goal of the CDA is the prevention of exposure of children to
obscene or indecent material. It is not the intent of this memo to
circumvent this goal. To prevent children from gaining access to the
original, unencoded strings, it is suggested that parents provide their
children with MIME handlers that treat text/quoted-unprintable exactly
as if it were text/plain. The ability of the end user to control
whether he or she sees the target strings encoded or unencoded makes
the use of a MIME-based encoding scheme ideal for this application.
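
As a rough illustration of the handler behaviour suggested above, the
following Python sketch labels a message as text/quoted-unprintable and
then renders it either as-is (the child-safe case) or through a decoder.
The names build_message and render, the sample body, and the stand-in
decoder are assumptions made for this example; the memo does not
prescribe any implementation.

```python
from email.message import EmailMessage

# Subtype proposed by the memo; everything else here is illustrative.
CONTENT_SUBTYPE = "quoted-unprintable"


def build_message(encoded_body: str) -> EmailMessage:
    """Wrap an already-encoded body in a text/quoted-unprintable message."""
    msg = EmailMessage()
    msg["Subject"] = "CDA discussion"
    msg.set_content(encoded_body, subtype=CONTENT_SUBTYPE)
    return msg


def render(msg: EmailMessage, decoder=None) -> str:
    """Return the text a MIME handler would display.

    With no decoder, the body is shown exactly as text/plain would be,
    which is the child-safe configuration. With a decoder, bodies of
    type text/quoted-unprintable are passed through it first.
    """
    body = msg.get_content()
    is_unprintable = msg.get_content_type() == "text/" + CONTENT_SUBTYPE
    if is_unprintable and decoder is not None:
        return decoder(body)
    return body


if __name__ == "__main__":
    msg = build_message("some =XXX= text")
    print(render(msg), end="")  # child-safe view, encoded strings intact
    # Stand-in decoder for the demo only; a real one would use the
    # table of encodings.
    print(render(msg, decoder=lambda s: s.replace("=XXX=", "<target>")), end="")
```

The choice between the two views is made entirely on the receiving
end, which is the property the paragraph above relies on.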

Encodings

The question arises as to what encoded strings to use for the target
strings. There are many, many possibilities. One possibility would
be to replace the offending term with the strings used in this memo.
Thus, "The CDA was passed by a bunch of c*ck-s*ck*rs" would encode as
"The CDA was passed by a bunch of =c*ck-s*ck*rs=". (Except, of course,
that the first instance of "c*ck-s*ck*rs" above contains the actual
vowels in the word "c*ck-s*ck*rs".)

Another possibility is to use common slang terms. One could use
"=fudge=" for "f*ck", and "=shoot=" for "sh*t". Or, the strings
could be encoded numerically: "sh*t" could be encoded as "=6=", since
it is word 6 on the above list. This system has the advantage of being
highly extensible, in case future legislation should outlaw more target
strings.

Ultimately, however, we took our cue from the above cited works and
chose the following strings, in honor of those most responsible for
the CDA.

   Target String     Encoded String   Contribution to the CDA
   -------------     --------------   -----------------------
1) c*ck-s*ck*r       =exon=           Author/co-sponsor
2) c*nt              =coates=         Author/co-sponsor
3) f*ck              =goodlatte=      Amended "harmful to minors"
                                      to "indecent"
4) m*th*r-f*ck*r     =clinton=        Signed Telecom Bill
5) p*ss              =hyde=           Authored abortion-gag amendment
6) sh*t              =conyers=        Supported Goodlatte amendment
7) t*t               =schroeder=      Supported Goodlatte amendment

The encoded strings are case-insensitive; thus "=goodlatte=",
"=GOODLATTE=", and "=Goodlatte=" are all encodings for "f*ck". For
completeness, an encoding is provided for word 4, even though, as noted
above, that is not completely necessary, as most examples of word 4
will be encoded as examples of word 3.
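
The method of encoding and the table above can be sketched in Python
roughly as follows. This is only one possible reading of the memo, under
a few assumptions: the target strings are kept in the '*'-masked
spelling used throughout this memo (a real table would hold the
unmasked words), matching of target strings ignores case, and the
function names encode and decode are invented for the example. Case
maps are not handled here.

```python
import re

# Encodings from the table above. Targets use the memo's '*'-masked
# spelling (an assumption of this sketch).
ENCODINGS = {
    "c*ck-s*ck*r": "exon",
    "c*nt": "coates",
    "f*ck": "goodlatte",
    "m*th*r-f*ck*r": "clinton",
    "p*ss": "hyde",
    "sh*t": "conyers",
    "t*t": "schroeder",
}
DECODINGS = {code: target for target, code in ENCODINGS.items()}

# Matches any target string anywhere in the text, ignoring case.
_TARGET_RE = re.compile(
    "|".join(re.escape(target) for target in ENCODINGS), re.IGNORECASE
)
# Matches either an escaped '=' or a "=token=" sequence.
_TOKEN_RE = re.compile(r"==|=([^=]+)=")


def encode(text: str) -> str:
    """Escape '=' as '==', then wrap each target's encoding in '='."""
    escaped = text.replace("=", "==")
    return _TARGET_RE.sub(
        lambda m: "=" + ENCODINGS[m.group(0).lower()] + "=", escaped
    )


def _decode_match(m: re.Match) -> str:
    token = m.group(1)
    if token is None:  # the "==" alternative matched
        return "="
    # Unknown tokens are left unchanged, as the memo requires.
    return DECODINGS.get(token.lower(), m.group(0))


def decode(text: str) -> str:
    """Reverse encode(); encoded strings are matched case-insensitively."""
    return _TOKEN_RE.sub(_decode_match, text)


if __name__ == "__main__":
    print(encode("1+1=2"))        # 1+1==2
    print(decode("=GOODLATTE="))  # f*ck
```

Because the leftmost match wins, this sketch encodes word 4 as a whole
rather than through word 3. Either choice decodes to the same text.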

Case Maps

As defined above, the encodings do not allow for the recovery of the
case of the original target string. "P*ssing on free speech is p*ssing
off the net" is encoded "=hyde=ing on free speech is =hyde=ing off the
net". To address this, the encoded string may include an optional case
map. The case map is placed after the encoded string, preceded by the
octet '-'. The case map consists of the octets 'u', 'U' (for upper
case), 'l' and 'L' (for lower case). Each case map octet specifies
the case of the corresponding octet in the target string, reading left
to right. The default case for decoded target string octets is lower
case.

Using case maps, the earlier example would encode as "=hyde-Ulll=ing
on free speech is =hyde-llll=ing off the net". Since lower case is
the default, this may be elided to "=hyde-U=ing on free speech is
=hyde=ing off the net". As a stylistic issue, the author recommends
using 'l' for lower case and 'U' for upper case.
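
The case map rules above might be implemented along the following
lines. This is a minimal sketch; the function names are hypothetical,
the target string is shown in the memo's '*'-masked spelling, and
treating an unrecognised case map octet as lower case is an assumption,
since the memo does not say what a decoder should do with one.

```python
def make_case_map(original: str) -> str:
    """Build the shortest case map for `original`.

    Uses 'U' and 'l' as the memo recommends. Lower case is the default,
    so trailing 'l' octets are elided; an all-lower-case string yields
    an empty map.
    """
    full = "".join("U" if ch.isupper() else "l" for ch in original)
    return full.rstrip("l")


def apply_case_map(target: str, case_map: str) -> str:
    """Apply a case map to a decoded target string, left to right."""
    out = []
    for i, ch in enumerate(target.lower()):
        # Octets past the end of the map default to lower case.
        flag = case_map[i] if i < len(case_map) else "l"
        out.append(ch.upper() if flag in ("u", "U") else ch)
    return "".join(out)


def encode_token(code: str, original: str) -> str:
    """Return "=code=" or "=code-MAP=" for one matched target string."""
    case_map = make_case_map(original)
    return f"={code}-{case_map}=" if case_map else f"={code}="


def decode_token(token: str, decodings: dict) -> str:
    """Decode one "=code=" or "=code-MAP=" token using `decodings`."""
    code, _, case_map = token[1:-1].partition("-")
    return apply_case_map(decodings[code.lower()], case_map)


if __name__ == "__main__":
    print(encode_token("hyde", "P*ss"))                # =hyde-U=
    print(decode_token("=hyde-U=", {"hyde": "p*ss"}))  # P*ss
```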

Context for Encoding

In order to be certain that every instance of the target strings is
encoded, it is not acceptable to search for these strings as whole
words. They must be encoded regardless of context. Obviously, this
will lead to gratuitous encodings. For example, "unconstitutional"
will become "uncons=schroeder=utional". This is an unfortunate
consequence of the need to be 100% certain that none of the target
strings is inadvertently transmitted unencoded.
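
The difference between the two matching strategies can be seen in a
short Python sketch. Only word 7 is used, and since this memo masks
vowels with '*', the sketch rebuilds the real string at run time so
that the example above can be reproduced. The function names are
hypothetical, and the whole-word variant is shown only for contrast.

```python
import re

# Word 7, rebuilt from the memo's masked spelling for this demo only.
TARGET = "t*t".replace("*", "i")
CODE = "schroeder"


def encode_anywhere(text: str) -> str:
    """Substring matching: every occurrence is encoded, whatever the context."""
    escaped = text.replace("=", "==")
    return re.sub(re.escape(TARGET), f"={CODE}=", escaped, flags=re.IGNORECASE)


def encode_whole_words(text: str) -> str:
    """Whole-word matching: not acceptable under this memo."""
    escaped = text.replace("=", "==")
    pattern = rf"\b{re.escape(TARGET)}\b"
    return re.sub(pattern, f"={CODE}=", escaped, flags=re.IGNORECASE)


if __name__ == "__main__":
    print(encode_anywhere("unconstitutional"))     # uncons=schroeder=utional
    print(encode_whole_words("unconstitutional"))  # unconstitutional
```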

References

[ADAMS96] Adams, Scott, "The Dilbert Newsletter 10.0", March, 1996,
scottadams@InterNex.NET

[BARRY96] Barry, Dave, syndicated newspaper column.

[RFC1521] Borenstein, N., and N. Freed, "MIME (Multipurpose Internet Mail
Extensions) Part One: Mechanisms for Specifying and Describing
the Format of Internet Message Bodies", RFC 1521, Bellcore,
Innosoft, September 1993

[FCC78] Supreme Court Case "FCC v. Pacifica Foundation" (1978)

Author's Address

James W. Williams
EMail: williams@rahul.net

Jim Williams                       Pixar Animation Studios
williams@pixar.com                 Richmond, CA