[FoRK] low probability bits
dmorton at bitfurnace.com
Tue Feb 24 19:00:20 PST 2009
Riffing off the compression angle: For true overkill, you use arithmetic
decoding, with a binary alphabet, and feed it your random stream to be
On Wed, Feb 25, 2009 at 12:58 PM, Stephen D. Williams <sdw at lig.net> wrote:
> How about the following, a refinement of my earlier algorithm:
> N desired 1 bits, M desired 0 bits
> Zero an array of size N+M bits.
> Out of stream of random bits, for each N bits, set bit Array[log2(N+M)) %
> (N+M))] to 1.
> If M < N, reverse sense.
> If N is close to M, start with random array and randomly flip the
> appropriate number of 1 or 0 bits. For 49/51, flip one 1 bit to 0 in an
> array of 100 bits. Randomly read Array[x] until a 1 bit is found, then set
> it to 0.
> A noise factor is needed, definitely for low values of N+M (1+1 for
> instance), but probably to some extent in most situations. Setting a floor
> for N+M by scaling partially meets this requirement. Flipping a random
> number of bits according to a distribution might be desired for some
> Assuming no complex need for noise, this results in needing approximately:
> N*log2(N+M) bits for every N+M output bits. For 1% 1 bits, this requires 7
> bits for every 100 output bits.
> I was just thinking that compressed data that might be at hand, with a
> little cheap decimation, would be a good source for roughly random data,
> boosted with a little real random data. Looks like adding compression to
> pseudorandom streams has been looked at:
> An interesting problem would be to take pseudorandom data with some wrong
> distribution and making it fit another distribution. Given a stream of
> bits, keep running totals and, using more imperfect bits, arrive at a stream
> that has a desired distribution along with desired noise. All with a
> minimum of storage, CPU, and spent bits. More tough, do so with local
> detection of "bad bits" that aren't even pseudorandom based on some
> unlikeliness measure.
> Rob Harley wrote:
>> You guys are all smart. You figure it out.
>> Instead of generating a uniform r in [0,1) and comparing with a threshold
>> start generating bits of r from most significant and stop as soon as
>> you can decide.
>> This needs at least 1 random bit per output bit (and close to 2 on
>> but it should be possible to get close to the entropy e.g., 0.92
>> random bits per output bit with p = 1/3.
>> This is a similar problem to Huffman encoding a stream with rare letters.
>> A standard solution is to group the letters into tuples giving a
>> bigger alphabet.
>> Then do the same trick of locating a random uniform but this time in a
>> partition of the unit interval.
>> This will effectively give an infinite Knuth-Yao tree.
>> If p is rational, some blank letters can be added with the right
>> probabilities to make the partition points dyadic,
>> and then the tree can be truncated to finite giving a Huffman code.
>> Example: (p=1/3) write a Huffman code for the 32 5-bit words with
>> probabilities from 32/243 down to 1/243.
>> Generate up to 8 random bits until you recognize the code word (unique
>> prefix oblige) or a blank (skip it).
>> 000 => 00000
>> 0010 => 00001
>> 0011 => 00010
>> 0100 => 00100
>> 0101 => 01000
>> 1111000 => 11110
>> 11110010 => 11111
>> 11110011 => blank
>> 111101 => blank
>> 11111 => blank
>> Total overkill unless you really get off on saving random bits.
>> - R
>> FoRK mailing list
> FoRK mailing list
More information about the FoRK