[FoRK] Slow grep

Gavin Thomas Nicol gtn at rbii.com
Fri Mar 19 06:05:08 PST 2004


On Friday 19 March 2004 05:40 am, Eugen Leitl wrote:
> It's supposed to be compatible. It breaks a lot of things. It's also
> inherently bloated, and will result in a security nightmare of
> unprecedented proportions.

Actually, that's pretty much hyperbole vis-a-vis UTF-8. For many languages the 
bloat isn't all that great (maybe 30% or so). Also, UTF-8, as an encoding, is 
interesting because it has very little state: the top bits of each byte tell 
you whether it starts a character or continues one.
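
To make both points concrete, here is a rough Python sketch. The helper names
(utf8_overhead, is_continuation, resync) are made up for illustration, and the
size ratio you get depends entirely on the text and on the legacy encoding you
compare against, so treat it as a way to measure rather than as a result.

```python
def utf8_overhead(text: str, legacy: str) -> float:
    """Return UTF-8 size relative to a legacy encoding (1.0 means same size)."""
    return len(text.encode("utf-8")) / len(text.encode(legacy))


def is_continuation(byte: int) -> bool:
    """UTF-8 continuation bytes always have the bit pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def resync(data: bytes, pos: int) -> int:
    """From an arbitrary byte offset, skip forward to the next character start.

    No decoder state is needed: each byte says on its own whether it begins
    a character or continues one.
    """
    while pos < len(data) and is_continuation(data[pos]):
        pos += 1
    return pos


if __name__ == "__main__":
    # Sample strings are assumptions chosen for illustration only.
    print(utf8_overhead("hello", "ascii"))
    print(utf8_overhead("héllo", "latin-1"))
    data = "aé b".encode("utf-8")
    print(resync(data, 2))  # offset 2 lands in the middle of the two-byte "é"
```

Pure ASCII text costs nothing extra, and a reader dropped into the middle of a
stream only has to skip a few continuation bytes to find the next character
boundary.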

> Make a scheme to map 7-bit ASCII to UTF-8 and back. Let application layer
> deal with the issue (a single library should do).

That's called UTF-7....
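
For anyone who hasn't met it, a small sketch of what a 7-bit-safe round trip
looks like, using the "utf-7" codec that ships with Python's standard library.
The wrapper names (to_utf7, from_utf7) and the sample string are assumptions
for illustration, not part of any existing library.

```python
def to_utf7(text: str) -> bytes:
    """Encode Unicode text so that every output byte fits in 7 bits."""
    return text.encode("utf-7")


def from_utf7(data: bytes) -> str:
    """Decode UTF-7 bytes back into Unicode text."""
    return data.decode("utf-7")


if __name__ == "__main__":
    sample = "naïve café"  # sample text is an assumption
    wire = to_utf7(sample)
    print(wire)                          # only ASCII bytes on the wire
    print(all(b < 0x80 for b in wire))
    print(from_utf7(wire) == sample)
```

Non-ASCII characters are carried as base64 runs bracketed by "+" and "-", which
is exactly the kind of shift state that UTF-8 avoids.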

> > it into the vendor private use area for Linux, but it didn't make it into
> > Unicode as a whole.)
>
> I'm not at all sure Unicode is a good idea.

Unicode *was* a good idea... but the core guiding principles have been eroded 
a bit over the years, so it's far less pure than it used to be (look at the 
Korean mess). Still, I think it's obvious that we do need to move forward...


