[FoRK] Slow grep

Aidan Kehoe kehoea at parhasard.net
Fri Mar 19 01:20:15 PST 2004

 On the 18th day of month 3, Eugen Leitl wrote:

 > > An equally valid workaround for English speakers is simply to set
 > > locale to "C" and pretend there's no non-ASCII charsets out there ;)
 > IIRC, 7-bit ASCII goes back to 5-hole perforated paper tapes (and these
 > harken back to Morse). Bits having been expensive back then, the designers 
 > aren't really to blame (though they'd be able to avert most mayhem by 
 > having used 8-bit extensions).

No one is saying the designers were to blame. And UTF-8 is a useful,
ASCII-compatible answer to the question of how to internationalise the
basic text layer of the net.
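
To make "compatible" concrete, here is a minimal sketch in Python (the language and the sample strings are assumptions of this sketch, not anything from the thread). It checks that pure ASCII text encodes to the same bytes in UTF-8, and that every byte of a non-ASCII character has the high bit set, so it cannot be mistaken for an ASCII character by old byte-oriented tools.

```python
# Sample strings are illustrative only.
ascii_text = "grep"
han_text = "\u6f22\u5b57"  # the two characters 漢字

# ASCII text is byte-for-byte identical when encoded as UTF-8.
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Every byte of a non-ASCII character is >= 0x80 in UTF-8.
han_bytes = han_text.encode("utf-8")
all_high = all(b >= 0x80 for b in han_bytes)

print(same_bytes)                # True
print(len(han_bytes), all_high)  # 6 True
```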

 > The bad shit starts when today's jackasses try to "fix" these broken
 > standards. Instead of transcribing these funny characters, they chose to
 > extend the set, use alternate keyboard layouts, etc.

What a fucked-up attitude to take. How much more uselessly difficult would
your life be if, to use URLs and the internet in general, you had to learn
to transcribe from your native Cantonese to an [-A-Z0-9] representation of
the sounds of proper nouns, to access information relevant to you? Oh, and
of course, the transcription the people setting up the servers used was from
Mandarin, so it bears no relation whatsoever to the sound of the words as
you say them. 

Or, the BIND and sendmail people could suck it down, allow hostnames with
the eighth bit set--and comparatively little work is needed for it, too--and
you'd be able to type the Han for what you mean, and it would Just Work.
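
A rough sketch of the gap being described, again in Python. The helper `is_ldh_label` is hypothetical and only approximates the traditional letter-digit-hyphen rule for host labels; the sample label is an assumption too. It shows that a Han label fails that rule, while its UTF-8 form is just a short run of bytes with the eighth bit set.

```python
import re

# Hypothetical helper: approximates the traditional letter-digit-hyphen
# rule for a single host label (not a full validator).
LDH_LABEL = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?")

def is_ldh_label(label: str) -> bool:
    """True if the label uses only ASCII letters, digits, and inner hyphens."""
    return len(label) <= 63 and LDH_LABEL.fullmatch(label) is not None

han_label = "\u9999\u6e2f"  # 香港, as someone might want to type it
wire_bytes = han_label.encode("utf-8")

print(is_ldh_label("fork"))                 # True
print(is_ldh_label(han_label))              # False
print(len(wire_bytes))                      # 6
print(all(b & 0x80 for b in wire_bytes))    # True: every byte has bit 8 set
```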

 > Do we really need to be able to use host names with umlauts, or spell
 > them in Klingon, or Urdu? It would have a point, if it wasn't such a
 > giant can of worms.

We need them in Kanji, and Han Chinese, and if we solve the issue for that,
we get Urdu for free, architecturally. (We don't get Klingon--Klingon made
it into the vendor private use area for Linux, but it didn't make it into
Unicode as a whole.)
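
For the curious, a small Python sketch of what "private use area" means in practice. The range U+F8D0..U+F8FF is my recollection of where the Linux assignment puts Klingon, so treat it as an assumption; the point is only that those code points fall in Unicode's private use block and carry no standard character names.

```python
import unicodedata

# Basic Multilingual Plane private use area.
PUA_START, PUA_END = 0xE000, 0xF8FF

# Assumed range for the Klingon assignment (not taken from the thread).
KLINGON_FIRST, KLINGON_LAST = 0xF8D0, 0xF8FF

in_pua = PUA_START <= KLINGON_FIRST and KLINGON_LAST <= PUA_END

# Private use code points have general category "Co" and no assigned name.
categories = {
    unicodedata.category(chr(cp))
    for cp in range(KLINGON_FIRST, KLINGON_LAST + 1)
}
has_name = unicodedata.name(chr(KLINGON_FIRST), None) is not None

print(in_pua, categories, has_name)
```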

I don't care if it rains or freezes/'Long as I got my Plastic Jesus
Riding on the dashboard of my car.

