[FoRK] low probability bits

Damien Morton dmorton at bitfurnace.com
Tue Feb 24 19:00:20 PST 2009

```Riffing off the compression angle: For true overkill, you use arithmetic
decoding, with a binary alphabet, and feed it your random stream to be
decoded.

On Wed, Feb 25, 2009 at 12:58 PM, Stephen D. Williams <sdw at lig.net> wrote:

> Interesting.
>
> How about the following, a refinement of my earlier algorithm:
>
> Parameters:
> N desired 1 bits, M desired 0 bits
>
> Zero an array of size N+M bits.
> Out of stream of random bits, for each N bits, set bit Array[log2(N+M)) %
> (N+M))] to 1.
>
> If M < N, reverse sense.
> If N is close to M, start with random array and randomly flip the
> appropriate number of 1 or 0 bits.  For 49/51, flip one 1 bit to 0 in an
> array of 100 bits.  Randomly read Array[x] until a 1 bit is found, then set
> it to 0.
>
> A noise factor is needed, definitely for low values of N+M (1+1 for
> instance), but probably to some extent in most situations.  Setting a floor
> for N+M by scaling partially meets this requirement.  Flipping a random
> number of bits according to a distribution might be desired for some
> applications.
>
> Assuming no complex need for noise, this results in needing approximately:
> N*log2(N+M) bits for every N+M output bits.  For 1% 1 bits, this requires 7
> bits for every 100 output bits.
>
> I was just thinking that compressed data that might be at hand, with a
> little cheap decimation, would be a good source for roughly random data,
> boosted with a little real random data.  Looks like adding compression to
> pseudorandom streams has been looked at:
>
>
> An interesting problem would be to take pseudorandom data with some wrong
> distribution and making it fit another distribution.  Given a stream of
> bits, keep running totals and, using more imperfect bits, arrive at a stream
> that has a desired distribution along with desired noise.  All with a
> minimum of storage, CPU, and spent bits.  More tough, do so with local
> detection of "bad bits" that aren't even pseudorandom based on some
> unlikeliness measure.
>
> sdw
>
>
> Rob Harley wrote:
>
>> You guys are all smart. You figure it out.
>>>
>>>
>>
>> Instead of generating a uniform r in [0,1) and comparing with a threshold
>> p,
>> start generating bits of r from most significant and stop as soon as
>> you can decide.
>> This needs at least 1 random bit per output bit (and close to 2 on
>> average),
>> but it should be possible to get close to the entropy e.g., 0.92
>> random bits per output bit with p = 1/3.
>> This is a similar problem to Huffman encoding a stream with rare letters.
>> A standard solution is to group the letters into tuples giving a
>> bigger alphabet.
>> Then do the same trick of locating a random uniform but this time in a
>> partition of the unit interval.
>> This will effectively give an infinite Knuth-Yao tree.
>> If p is rational, some blank letters can be added with the right
>> probabilities to make the partition points dyadic,
>> and then the tree can be truncated to finite giving a Huffman code.
>>
>> Example: (p=1/3) write a Huffman code for the 32 5-bit words with
>> probabilities from 32/243 down to 1/243.
>> Generate up to 8 random bits until you recognize the code word (unique
>> prefix oblige) or a blank (skip it).
>>
>> 000      => 00000
>> 0010     => 00001
>> 0011     => 00010
>> 0100     => 00100
>> 0101     => 01000
>> [...]
>> 1111000  => 11110
>> 11110010 => 11111
>> 11110011 => blank
>> 111101   => blank
>> 11111    => blank
>>
>> Total overkill unless you really get off on saving random bits.
>>
>> - R
>> _______________________________________________
>> FoRK mailing list
>> http://xent.com/mailman/listinfo/fork
>>
>>
> _______________________________________________
> FoRK mailing list
> http://xent.com/mailman/listinfo/fork
>
```