Continue to Site

# Ultra Fast Pseudorandom number generator for 8-bit

Status
Not open for further replies.

#### EternityForest

##### Member
Someones gotta have made this before, but anyways:

Most common random number generators in use use either multiplication(BBS,LCG,etc)
or array acesses(ARC4),or variables bigger than a byte(XORSHIFT).

All of these things are not so good on an 8 bit chip.
so, I designed this algorithm, which is easy to understand, produces decent numbers(passes most of the diehard tests,except the sparse occupancy and DNA tests)
and will only use 4 bytes of ram and very little code space.

Obligatory dont-use-this-for-crypto-disclaimer here, It will not do good at that.

If you have the ram, by all means, use RC4. Its a great fast algorithm that is easy to use. But if you need something small i think this might be useful.

Here is a C library with comments for ya!
Most of the code is just comments, the main function is 5 lines.

Code:
//X ABC Algorithm Random Number Generator for 8-Bit Devices:
//This is a small PRNG, experimentally verified to have at least a 50 million byte period
//by generating 50 million bytes and observing that there were no overapping sequences and repeats.
//This generator passes serial correlation, entropy , Monte Carlo Pi value, arithmetic mean,
//And many other statistical tests. This generator may have a period of up to 2^32, but this has
//not been verified.
//
// By XORing 3 bytes into the a,b, and c registers, you can add in entropy from
//an external source easily.
//
//This generator is free to use, but is not suitable for cryptography due to its short period(by //cryptographic standards) and simple construction. No attempt was made to make this generator
// suitable for cryptographic use.
//
//Due to the use of a constant counter, the generator should be resistant to latching up.
//A significant performance gain is had in that the x variable is nly ever incremented.
//
//Only 4 bytes of ram are needed for the internal state, and generating a byte requires 3 XORs , //2 ADDs, one bit shift right , and one increment. Difficult or slow operations like multiply, etc
//were avoided for maximum speed on ultra low power devices.

init_rng(s1,s2,s3) //Can also be used to seed the rng with more entropy during use.
{
//XOR new entropy into key state
a ^=s1;
b ^=s2;
c ^=s3;

x++;
a = (a^c^x);
b = (b+a);
c = (c+(b>>1)^a);
}

unsigned char randomize()
{
x++;               //x is incremented every round and is not affected by any other variable
a = (a^c^x);       //note the mix of addition and XOR
b = (b+a);         //And the use of very few instructions
c = (c+(b>>1)^a);  //the right shift is to ensure that high-order bits from b can affect
return(c)          //low order bits of other variables
}
What do you guys think?

If what you say is true about it's randomness it's more than adequate for general use and looks faster than some simpler rng's I've seen.

I made a true random number generator a few year ago. It was based on a PIC that had an ADC inside. I connected a resistive divider to an LM324 configured as a high gain amplifier, then ac coupled the output of that to another LM324. The LM324 noise characteristic is soooo poor that you can use it as a noise generator. You then feed the output into the PIC ADC and digitise that. The ADC code is then truely random. I did some other trickery to make sure I did not digitise max and min codes too to take out the effects of the op amps saturating.. Worked very well. I conducted 100 tests on it and the numbers appeared to be truly random.

Hope this helps

Simon

I made a true random number generator a few year ago. It was based on a PIC that had an ADC inside. I connected a resistive divider to an LM324 configured as a high gain amplifier, then ac coupled the output of that to another LM324. The LM324 noise characteristic is soooo poor that you can use it as a noise generator. You then feed the output into the PIC ADC and digitise that. The ADC code is then truely random. I did some other trickery to make sure I did not digitise max and min codes too to take out the effects of the op amps saturating.. Worked very well. I conducted 100 tests on it and the numbers appeared to be truly random.

Did you do any tests to see if the distribution was uniform, gaussian or something else?

EternityForest: Good job doing all the testing etc. I think the period should be calculated, not estimated by observing, but not a big deal with a simple PRNG like this. What software did you use for the diehard tests?

Last edited:
the data was uniform. By the nature of how it was collected, it could really be nothing else. Thanks for the comment

Just wondering.. Where else would you need a really fast RNG other than cryptography? Maybe in some kind of Compressive Sampling application, but those systems usually need a high quality RNG. In one research I used a hardware AES on 8-bit uC to generate very high quality 16 bit random numbers. I got almost 200 000 deviates per second and still had time to run the application I needed (50 Mhz system clock). I would like to get that system running faster, but not sacrifice quality much.

Last edited:
Just wondering.. Where else would you need a really fast RNG other than cryptography? Maybe in some kind of Compressive Sampling application, but those systems usually need a high quality RNG. In one research I used a hardware AES on 8-bit uC to generate very high quality 16 bit random numbers. I got almost 200 000 deviates per second and still had time to run the application I needed (50 Mhz system clock). I would like to get that system running faster, but not sacrifice quality much.

Wow nice going! AES on 8 bit with that much efficiency! Were you doing a compressed sensing application?

I never actually thought of using AES because ARC4 has always seemed to suit my needs when i have the ram space for it.

There are a few times you would want fast random numbers outside of crypto, like when you want to make white noise, games, little led dice,etc

I used the i diehard battery directly on the command line. I just used the standard DIEHARD battery. I also tried the NIST test suite, kinda hard to use, but it seems to pass some of them.

I agree that anything you want to use for anything important should be analysed theoretically and not just empirically by observing.

Mostly i just designed this for fun, and because i like to use the cheapest/lowest power chip possible for whatever I'm doing.

UPDATE: I just generated a giant file with a dump of the internal state on every line, then used grep to find a line matching the initial state.
The period of this generator(Starting from a seed of all zeros) is......drum roll....487,780,609.
around 8 times less than the theoretical maximum for a 4 byte state. And, that is just starting from all zeros.
Other Initial seeds could have greater or lesser period. But as far as i can tell, it can be used for what it is in simple applications.

Last edited:
AES on 8 bit with that much efficiency! Were you doing a compressed sensing application?

It is a hardware AES engine that generates 128 bits in 375 clock cycles, then DMA transfers the data to a double buffered 16 bit integer array. While I'm using the first array, the other array is filled with new set of random bits. I hope to do some Compressed Sensing stuff with it, but so far I just do some basic experiments. I'm trying ways to use the double buffered random bits as efficiently as possible so that I don't have to run the AES so often. Maybe I will try your algorithm to generate new random arrays from the AES generated arrays. The microcontroller is Atmel ATxmega128A1.

Mostly i just designed this for fun, and because i like to use the cheapest/lowest power chip possible for whatever I'm doing.
It is interesting little RNG. Mainly because you seem to know about the subject and you have done the "standard" tests etc. Nice work finding the period also.. and for zero initial states that is ok result for 8 bit generator (I believe). Not enough for Monte Carlo methods etc. But it does the job for games and such.

Last edited:
It's an interesting RNG, thanks for posting!

I'm not a big fan of byte generating RNGs and prefer to do the process on a bit level so every bit is processed in the same fashion and has equal qualities and flaws.

I like the way yours is "triangular" in approach, where A is made from C, B is made from A and C is made from B more or less. That's clever.

But I see a couple of potential issues with generating a byte;
x++;
this adds a lot of change to bit x,0 but 128 times less change to bit x,7. So as x is xored into the result the lower order bits of the byte result will have more entropy than the higher bits.

This is compounded by this function;
c = (c+(b>>1)^a);
where in C language the (b>>1) operation always puts a zero in bit b,7, a loss of a bit there and the entropy in B is shifted right (to the lower order bits again).
Both issues would probably cause the bit7 of your result to have less entropy than bit0 and in general the higher bits of the output byte to always be less random than the lower bits.

Have you looked at a 32bit xor-shift RNG, still using only 4 RAM bytes? Many have a "full sized" period of 2^32 or (2^32)-1, and they are very fast needing only a few instruction cycles to generate a bit. And every bit is equally biased. I'm not sure whether it would be slower or faster than yours, as an xor-shift needs 8 cycles to make one output byte, but it might be worth looking into.

If you have 1megabyte of generated data I can run a test on it to see if there is any obvious pattern, specifically to see if bit7 of each byte has less entropy than bit0.

Both issues would probably cause the bit7 of your result to have less entropy than bit0 and in general the higher bits of the output byte to always be less random than the lower bits.

In many (all) simple generators, the lower bits have a poor entropy. I think in this generator there is a deliberate effort made to compensate for that issue. Let's see what EternityForest says about that..

Both MisterT and mr. RB are correct.
The logical shift left very well could be making the whole thing have less entropy in higher bits, but it is there to compensate for the common loss of entropy in lower bits. Niether addition nor XOR allow high order bits to affect low order bits which intereferes with the avalanche principle that changing one bit should be able to affect all bits with 50% probability. Thanks for pointing both of those points out! I much rather would have used a circular shift instead. Maybe if we replaced the >>1 with a permutation like in AES's S-box(a rather fast operation on PIC's if you have a fixed permutation) you could get better performance.

changing the last line to:
c = ( c + inv_s );
or
c = ( c + inv_s ^ a );

particularly the former, where inv_s is the AES inverse(or forward) S box from wikipedia
Appears to work very well, a 10MB file showed no patterns that DIEHARD could find.

Another sink of entropy i am worried about is the fact that X+0 mod 256 and x+255 mod 256 are the same.
but thats not really losing entropy, its just not propagating through the system as well beacause of the strucure.
still, it could be part of the reason for the short period.

I have learned so much about math,randomness,entropy,and coding from this project! Thanks guys!

Last edited:
Another sink of entropy i am worried about is the fact that X+0 mod 256 and x+255 mod 256 are the same.

This is a bit strange. I know that 'x+0 mod 256' and 'x+256 mod 256' are the same. Are you sure you got the above right? Is that something the C compiler can't handle with 8-bit variables?

Actually, with 8-bit variables, it is useless to take 'mod 256' because it does not change the value of the 8-bit variable (255 mod 256 = 255). Did I miss something here? Why are you worried about the mod-operator?

Last edited:
This is a bit strange. I know that 'x+0 mod 256' and 'x+256 mod 256' are the same. Are you sure you got the above right? Is that something the C compiler can't handle with 8-bit variables?

Actually, with 8-bit variables, it is useless to take 'mod 256' because it does not change the value of the 8-bit variable (255 mod 256 = 255). Did I miss something here? Why are you worried about the mod-operator?

I was referring to the implicit modulo of the 8 bit values, But i guess because you cant have a var bigger than 255 the results will be unique.

...
Maybe if we replaced the >>1 with a permutation like in AES's S-box(a rather fast operation on PIC's if you have a fixed permutation) you could get better performance.
...

On the PIC you could do the (b>>1) as an 8bit circular operation very quickly using RRF with carry (either manual or auto carry depending on which PIC). So maybe you could just do the (b>>1) in inline assembler in the code?

As a suggestion to improve the X++ problem, instead of adding the minimum number (+1) you could add a larger prime number that rolls to a 256 modulus +1 or -1.
Like
X = (X+51);
adds a larger prime number 51, which gives more change to the higher bits of X per cycle, but +51 after 5 times is 255, which rolls nicely at 256-1.

51 might not be the optimum value to add but it could be worth a try.

Status
Not open for further replies.

Replies
2
Views
1K
Replies
3
Views
888
Replies
5
Views
1K
Replies
10
Views
2K
Replies
8
Views
2K