Interesting end-to-endedness failure [TCP Checksum]

Rohit Khare (rohit@bordeaux.ICS.uci.edu)
Wed, 14 Jan 1998 20:24:53 -0800


A true nightmare scenario: as computers and memories get larger, the
probability of *entirely* random corruption (cosmic rays, etc.) becomes a
factor to defend against within OSes, and even within applications, since
end-to-end reliability can only be established there. We already learned
this lesson about disks with RAID. Now it turns out that errors can creep
up from below, too: dammit, reliable duplex bytestreams are supposed to be
reliable!

And yet... if the Ethernet, IP, and TCP checksums all fail to catch an
error, there is very little recourse. We code to an assumed failure rate of
10^-X -- but what happens when we're manipulating 10^X bits of data and want
10^-Y reliability (Y >> X)?
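
A back-of-the-envelope sketch of the scaling problem (the rates below are
illustrative assumptions, not measurements): if each segment slips past the
checksum undetected with independent probability p, then across N segments
the chance of at least one undetected error is 1 - (1-p)^N, which heads
toward certainty as the volume of data grows.

    # Illustrative sketch only: assumed per-segment undetected-error rates.
    def p_any_undetected(p_per_segment, n_segments):
        """Probability that at least one of n segments carries an error
        the checksum fails to catch (independence assumed)."""
        return 1.0 - (1.0 - p_per_segment) ** n_segments

    print(p_any_undetected(1e-7, 10**4))   # ~0.001 for a modest transfer
    print(p_any_undetected(1e-7, 10**9))   # ~1.0, essentially certain at terabyte scale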

Will error correction have to bubble up the stack over the years? What
happens if microprocessor masks get so small, and so high-speed, that they
can talk directly to RF -- and what if RF interference can talk back? That
is, what if we have to be paranoid about the processor itself doing the
wrong thing? Probably not in a Nintendo machine, but what about a
life-partner (a munchkin)?

fft,
Rohit

PS. Even MD5 is already insufficient for some applications: a German team
may yet crack it entirely, and even a 2^-32 collision rate is too high for
some needs.
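
For a rough sense of scale (my numbers, not a claim about any particular
application): the birthday bound says the chance of any collision among n
random b-bit check values is about n^2 / 2^(b+1), so a short checksum runs
out of headroom long before a 128-bit digest does.

    from math import expm1

    # Birthday-bound estimate for an assumed-ideal hash: probability of at
    # least one collision among n uniformly random b-bit values.
    def p_collision(n, bits):
        # 1 - exp(-n(n-1)/2^(b+1)); expm1 keeps precision for tiny probabilities.
        return -expm1(-n * (n - 1) / 2.0 / 2.0 ** bits)

    print(p_collision(10**6, 32))    # ~1.0    : 32-bit values collide almost surely
    print(p_collision(10**6, 128))   # ~1.5e-27: 128-bit (MD5-sized) digests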

------- Forwarded Message

Date: Thu, 08 Jan 98 11:30:39 PST
From: Jeffrey Mogul <mogul@pa.dec.com>

Paul Leach writes:
> From: Roy T. Fielding[SMTP:fielding@kiwi.ics.uci.edu]
>
> [...] It turns out that Content-MD5 is not useful
> at all for HTTP/1.1, since the combination of the error-free transport
> layer and length-delimited content is sufficient.
>
"Error free" and "ones-complement checksum" are not 100%
commensurate. Plus, the existence of proxies menas that the TCP
"guarantee", such as it is, isn't in fact guaranteed anyway.

For some quantitative results on the problems with the TCP checksum,
see
Craig Partridge, Jim Hughes, Jonathan Stone,
"Performance of Checksums and CRCs over Real Data",
Proc. SIGCOMM '95

http://www1.acm.org:81/sigcomm/sigcomm95/papers/partridge.html

The abstract mentions the "spectacular failure rate" of the TCP
checksum "when trying to detect certain types of packet splices."
Packet splices are a potential problem when using ATM networks
that drop individual cells. In practice, of course, we haven't
been hit by a blizzard of TCP checksum failures ... yet.
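
A deliberately contrived toy case of how a splice can slip through (4-byte
"cells" instead of 48-byte ATM cells, and hand-picked contents): if the
substituted data happens to have the same 16-bit word sum as the data it
replaces, the TCP checksum of the spliced packet is unchanged. The helper
below is the same ones'-complement sum as in the earlier sketch, restated so
this snippet runs on its own.

    def csum16(data):
        total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
        while total >> 16:
            total = (total & 0xffff) + (total >> 16)   # fold the carries
        return ~total & 0xffff

    cell_from_a = b'ZZaa'   # words 0x5A5A + 0x6161 = 0xBBBB
    cell_from_b = b'Z[a`'   # words 0x5A5B + 0x6160 = 0xBBBB, different bytes
    original = b'GET /index.html ' + cell_from_a
    spliced  = b'GET /index.html ' + cell_from_b   # tail cell taken from another packet
    assert csum16(original) == csum16(spliced)     # the splice goes undetected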

> The MD5 checksum is end-to-end, and much stronger than the
> transport checksum.

Yes, but. Unfortunately, the MD5 checksum covers just the
message body, and so if one is reassembling a document from
several messages (e.g., using Range retrievals) one can still
have undetected errors. This is why I speculated that Content-MD5
is "not even particularly useful" ... it's end-to-end as far
as the HTTP messages go, but it's not end-to-end as far as the
actual documents (or whatever) are concerned.
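
A toy illustration of that gap (my own construction, with made-up content;
hex digests are shown for readability, where Content-MD5 is really base64):
each partial response can carry a Content-MD5 that verifies perfectly, and
yet the digests say nothing about whether the document reassembled from
ranges is any version the origin server ever held -- for example, when the
resource changes between range requests.

    import hashlib

    def md5_hex(body: bytes) -> str:
        return hashlib.md5(body).hexdigest()

    old_doc = b'version 1: hello world, this is the original document.'
    new_doc = b'version 2: HELLO WORLD, the document changed meanwhile.'

    # Bytes 0-26 served before the update, the rest served after it;
    # each 206 response carries a digest of exactly the bytes it sent.
    resp1 = (old_doc[:27], md5_hex(old_doc[:27]))
    resp2 = (new_doc[27:], md5_hex(new_doc[27:]))

    # The client checks each part against its Content-MD5 -- both verify...
    assert md5_hex(resp1[0]) == resp1[1]
    assert md5_hex(resp2[0]) == resp2[1]

    # ...yet the reassembled document matches neither version of the resource.
    assembled = resp1[0] + resp2[0]
    assert md5_hex(assembled) != md5_hex(old_doc)
    assert md5_hex(assembled) != md5_hex(new_doc)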

More grist for Digest-NG, perhaps.

-Jeff

------- End of Forwarded Message