[FoRK] low probability bits
Stephen D. Williams
sdw at lig.net
Tue Feb 24 17:58:48 PST 2009
How about the following, a refinement of my earlier algorithm:
N desired 1 bits, M desired 0 bits
Zero an array of size N+M bits.
From the stream of random bits, for each of the N 1 bits to be placed,
read ceil(log2(N+M)) bits and set Array[value % (N+M)] to 1.
If M < N, reverse the sense and place the 0 bits instead.
If N is close to M, start with random array and randomly flip the
appropriate number of 1 or 0 bits. For 49/51, flip one 1 bit to 0 in an
array of 100 bits. Randomly read Array[x] until a 1 bit is found, then
set it to 0.
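A minimal Python sketch of both variants, under my reading of the above: each of the N one-bits consumes ceil(log2(N+M)) random bits, used modulo N+M as an index. Collision handling (retrying an index that is already 1) is my assumption, since the text doesn't specify it:

```python
import math
import random

def sparse_bits(n_ones, m_zeros):
    """Place exactly n_ones 1 bits in an (n_ones + m_zeros)-bit array,
    spending about ceil(log2(N+M)) random bits per 1 bit placed."""
    size = n_ones + m_zeros
    bits_per_index = max(1, math.ceil(math.log2(size)))
    array = [0] * size
    placed = 0
    while placed < n_ones:
        # The modulo slightly biases low indices; ignored here, as in
        # the original sketch.  Retry on collision so the 1-count is
        # exact (an assumption; the text does not say how to handle it).
        idx = random.getrandbits(bits_per_index) % size
        if array[idx] == 0:
            array[idx] = 1
            placed += 1
    return array

def near_balanced(n_ones, m_zeros):
    """For N close to M: start from a uniformly random array, then probe
    random positions and flip surplus bits until the count is right
    (the 49/51 case: flip 1 bits to 0 in a 100-bit array)."""
    size = n_ones + m_zeros
    array = [random.getrandbits(1) for _ in range(size)]
    while sum(array) > n_ones:
        i = random.randrange(size)
        if array[i] == 1:
            array[i] = 0
    while sum(array) < n_ones:
        i = random.randrange(size)
        if array[i] == 0:
            array[i] = 1
    return array
```

Note that the retry loop in sparse_bits costs extra random bits as N approaches N+M, which is exactly when the flip-based variant becomes the better choice.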
A noise factor is needed, definitely for low values of N+M (1+1, for
instance), but probably to some extent in most situations. Setting a
floor for N+M by scaling partially meets this requirement. Flipping a
random number of bits according to some distribution might also be desired.
Assuming no complex need for noise, this results in needing approximately
N*ceil(log2(N+M)) random bits for every N+M output bits. For 1% 1 bits
(N=1, M=99), that is 7 random bits per 100 output bits.
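The stated cost is easy to check (taking ceil(log2(N+M)) as the rounding, an assumption):

```python
import math

def cost(n_ones, m_zeros):
    # N * ceil(log2(N+M)) random bits consumed per N+M output bits
    return n_ones * math.ceil(math.log2(n_ones + m_zeros))

print(cost(1, 99))   # 7 random bits per 100 output bits
```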
I was just thinking that compressed data already at hand, with a little
cheap decimation, would be a good source of roughly random data, boosted
with a little real random data. It looks like adding compression to
pseudorandom streams has been looked at:
An interesting problem would be to take pseudorandom data with the wrong
distribution and make it fit another distribution. Given a stream of
bits, keep running totals and, using more imperfect bits, arrive at a
stream that has a desired distribution along with desired noise, all
with a minimum of storage, CPU, and spent bits. Tougher still: do so
with local detection of "bad bits" that aren't even pseudorandom, based
on some unlikeliness measure.
Rob Harley wrote:
>> You guys are all smart. You figure it out.
> Instead of generating a uniform r in [0,1) and comparing with a threshold p,
> start generating bits of r from most significant and stop as soon as
> you can decide.
> This needs at least 1 random bit per output bit (and close to 2 on average),
> but it should be possible to get close to the entropy e.g., 0.92
> random bits per output bit with p = 1/3.
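The early-stopping comparison Rob describes can be sketched in Python, assuming p is given as an exact fraction num/den (each binary digit of p is produced on the fly by doubling the fraction):

```python
import random

def bernoulli(num, den):
    """Return (bit, bits_used): bit is 1 with probability num/den.

    Generates a uniform r in [0,1) one bit at a time, most significant
    first, and stops as soon as r vs. p is decided: at the first
    position where r's bit differs from p's, r < p exactly when r's
    bit is 0 and p's is 1.
    """
    used = 0
    while True:
        num *= 2                        # next binary digit of p = num/den
        p_bit = 1 if num >= den else 0
        if p_bit:
            num -= den
        r_bit = random.getrandbits(1)
        used += 1
        if r_bit != p_bit:
            return int(r_bit < p_bit), used

random.seed(1)
results = [bernoulli(1, 3) for _ in range(100_000)]
mean = sum(b for b, _ in results) / len(results)
avg_bits = sum(u for _, u in results) / len(results)
# mean comes out near 1/3; avg_bits near 2, matching "close to 2 on average"
```

Each comparison step decides with probability 1/2, so the number of bits consumed is geometric with mean 2 for any non-dyadic p, which is the "close to 2 on average" figure above.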
> This is a similar problem to Huffman encoding a stream with rare letters.
> A standard solution is to group the letters into tuples giving a
> bigger alphabet.
> Then do the same trick of locating a random uniform but this time in a
> partition of the unit interval.
> This will effectively give an infinite Knuth-Yao tree.
> If p is rational, some blank letters can be added with the right
> probabilities to make the partition points dyadic,
> and then the tree can be truncated to finite giving a Huffman code.
> Example: (p=1/3) write a Huffman code for the 32 5-bit words with
> probabilities from 32/243 down to 1/243.
> Generate up to 8 random bits until you recognize the code word (unique
> prefix oblige) or a blank (skip it).
> 000 => 00000
> 0010 => 00001
> 0011 => 00010
> 0100 => 00100
> 0101 => 01000
> 1111000 => 11110
> 11110010 => 11111
> 11110011 => blank
> 111101 => blank
> 11111 => blank
> Total overkill unless you really get off on saving random bits.
> - R
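Rob's figures check out: with independent bits at P(1) = 1/3, a 5-bit word with k ones has probability 2^(5-k)/243, giving the range 32/243 down to 1/243, and the entropy per output bit is about 0.918 random bits, matching the quoted 0.92. A quick Python verification:

```python
from math import log2
from itertools import product

# Probability of each 5-bit word when bits are independent with P(1) = 1/3.
probs = {}
for word in product('01', repeat=5):
    k = word.count('1')
    probs[''.join(word)] = (1/3) ** k * (2/3) ** (5 - k)

assert abs(probs['00000'] - 32/243) < 1e-12   # most likely word
assert abs(probs['11111'] - 1/243) < 1e-12    # least likely word
assert abs(sum(probs.values()) - 1.0) < 1e-12

# Shannon entropy per output bit: the floor on random-bit consumption.
entropy = -sum(q * log2(q) for q in probs.values()) / 5
print(round(entropy, 3))   # 0.918
```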