Simple nonlinear pseudorandom number generator

Let me suggest a minimalist generator of pseudorandom numbers. It is based on the structure with nonlinear feedback. I haven't seen this structure before.

  • Simple and efficient in hardware or software
  • Strongly nonlinear
  • Scalable to arbitrary number of states
  • No short loops
  • No special state initialization procedure
  • Passes Diehard battery of tests with state as small as 3 bytes
  • Generates one byte at time
  • The structure could be extended both in depth and width

///------------------------------- // (c) Vladimir L. Vassilevsky, July 21, 2013 // snipped-for-privacy@abvolt.com // All rights reserved //--------------------------------- //

// Any number of state bytes (more then 2 to pass Diehard test) const u32 number_of_states = 3;

u8 state[number_of_states]; // Initialize state to whatever anything u8 cntr = 0;

const u8 s_box[256]; // This is nonlinear S-box. // You can use S-box from AES or define your own. // Initialization with AES values is omitted for // clarity; one could find it in the // internet.

// Generate random byte

u8 vrnd() { u8 x = cntr++;

u8 tmp_state = state[0] + x; x ^= s_box[state[0]];

for(u32 ci = 1; ci < number_of_states; ci++) // Iteration { state[ci-1] = state[ci] + x; x ^= s_box[state[ci]]; }

state[number_of_states - 1] = tmp_state;

return x; }

This is it. Feedback is welcome.

Vladimir Vassilevsky DSP and Mixed Signal Designs

formatting link

Reply to
Vladimir Vassilevsky
Loading thread data ...

On a sunny day (Mon, 22 Jul 2013 10:28:48 -0500) it happened Vladimir Vassilevsky wrote in :

Post it to sci.crypt. They will shoot at it:-)

Reply to
Jan Panteltje

Reply to
Vladimir Vassilevsky

Reminds me a bit of RC4.

Are you sure? How?

Is it supposed to be cryptogrphically strong? What is its intended use? Monte Carlo simulations? Stream cypher?

All 0?

Since the non-linearity comes in through the s_box, surely "define your own" is not a great idea. What if I intialise them to all 0? Will that work? If not, what restrictions are there on the s_box values? What are weak ones and which strong? Why are the AES values good ones to use for your purpose (since you are using them very much differently than AES does-- there they were designed specifically for the intended use in AES).

So, why would this be a good idea? It is pretty predictable.

Why "state[0]"? This cycles through the "states" in a very predictabl manner.

Reply to
unruh

I can shoot at it for ya.

Very dependent upon your definition of "short". There are no loops with length less than 256, but I believe that there are likely to be several loops with length less than, say, 65537. Of course, widening it to operate on 32 or 64 bit integers will greatly improve that.

Diehard is badly outdated. TestU01 and PractRand are the best modern equivalents for fast PRNGs. (disclosure: I wrote PractRand)

Also, cntr is part of the state and should be counted, despite the name of the "state" array. I have a hard time believing that even Diehard would pass it at number_of_states=2, which would be a 3 byte state size, so I think you mean number_of_states=3, with a state size of 4 bytes.

These two properties functionally contradict each other to some extent, though admittedly simply widening it to 32 or 64 bits would fix that.

The algorithm itself seems simple and reasonably sound. At the post setting of number_of_states=3 (4 bytes of state) it fails PractRand rapidly, but increasing number_of_states slightly seems to fix that - at larger settings it does well for its statespace.

The problem is that a serious PRNG needs a large enough statespace to avoid accidentally having any two randomly generated states be identical, adjacent, or otherwise too closely related. And for this algorithm at 8-bit that means a large number_of_states, which slows it down enormously.

At number_of_states=3 and widened to 16 bits, it exhibits serious problems after 2**32 bytes, and I have no clue why. It acted as if its period was exhausted, though the average period at those settings should be more like 2**64 bytes. At either number_of_states=4 or width=32 this issue ceases to be detectable. It's possible I might have screwed up converting it from 8 bits to 16 bits, but I have no idea how (I left it with an 8 bit s-box, just applied it separately to the high and low bytes of each s-boxed word of data).

As for security... to my eye it looks decent if the array is big enough... can anyone else see a way to attack it?

Reply to
Chris DH

Thank you for consideration.

Making the number of states an odd number (coprime to 256) could further improve that.

Do you have a win32/64 executable that takes binary file for analysis?

Number_of_states = 2 fails Diehard DNA test Number_of_states = 3 is a pass. You are correct, the cntr should be accounted for as a part of the state. With number_of_states = 3 that makes 32 bits total. However, most PRNGs require state in excess of 64 bits to pass.

The structure could be easily paralleled and pipelined, as well as widened. Unlike RC4, no true RAM is required; it is all shift registers and logic.

That would be interesting to hear.

I also thought of using the vrnd() machine as hash function, feeding the data stream into it instead of cntr variable.

Vladimir Vassilevsky DSP and Mixed Signal Designs

formatting link

Reply to
Vladimir Vassilevsky

I'm not sure what you're suggesting. While cntr is used that way all cycle lengths must be multiples of 256. You could change how it's used, but so far as I know all alternatives that are better for minimum cycle length end up being worse for performance & complexity.

I think most releases include a win32 binary named /bin/RNG_test.exe. Technically you can use it to test files by piping the file in and telling it to test stdin, but in practice that is usually not a good idea because PractRand wants LOTS of bytes. Longer test runs will use more bytes than you have space on your drive. It tests something like 4 GB per minute per core for up to 4 cores if multithreading is enabled, provided that the PRNG is fast enough to supply data that quickly. Thus testing PRNGs with short cycle lengths on PractRand will inform you of the obvious fact that yes, they do in fact have a short cycle length.

Given the way I adapted the sbox usage for 16/32/64 bit variants, diffusion of information across byte boundaries is a major limiting factor. This might be the source of the issue that testing found in the

16 bit version. The diffusion out of elements near the end of the array is already very limited. It's hard to say if that hurts or not. Given knowledge of some elements and the output, it is possible to rule out certain values for other elements. This appears to be too limited to actually matter though.

Overall, given the simple regular structure, I'm feeling like although I can't find a way to attack it effectively there is too high a chance that one exists. Absent some reasonable-sounding argument that it's secure, atm I wouldn't want to trust it.

Reply to
Chris DH

Well, in my computer (64 bits amd) your byte generator takes 6.50 cycles/byte. That's a lot, some stream ciphers take less time. A modified version with 64 bits gives me 0.84 cycles/byte, that's more acceptable, but using 64 bits in the s-box. Using a byte and xor'ing x eight times probably will make cycle/byte ratio excessive. Byte operations in general are very poorly handled by nowadays computers, they are in fact slower than his 64 bit counterparts, so using them is not very advisable if you want extreme speed. So, the main justification of a so simplistic prng in my opinion is speed, so... code a 64 bit version and let's see.

Regards Daniel

Reply to
Daniel

After looking at it several times i still do not see how you are walking across the s_box array. It looks like you are just reusing the same four elements, plus i think i see some array bounds issues.

YMMV

?-)

Reply to
josephkk

Below is 32-bit generator using same idea. Simple nonlinear function is used instead of S-box LUT. The generator is quite fast in software or hardware implementation.

Vladimir Vassilevsky DSP and Mixed Signal Designs

formatting link

//---------------------------- // (c) Vladimir L. Vassilevsky, 2013 //--------------------------- // Nonlinear function // static inline u32 vrnd32_s(u32 x) { x = (x 8); return x; } //--------------------------------------------------------------------- // No less then 3 state dwords // 8 state dwords recommended, i.e. 256 bit of state information // const u32 number_of_states = 8;

u32 vrnd32(u32 *state) { u32 x = state[0] = (state[0]

Reply to
Vladimir Vassilevsky

Reply to
Phil Carmody

That does well on PractRand tests, even scaled down to number_of_states=3 and 16 bit integers. I'm not entirely sure why... all of your shifts are multiples of 8 bits, so diffusion not at 8 bit multiples ought to be slow, occurring only through carries on the addition. Which I would expect to be detectable, but it's not showing up in the test results so either my analysis is off or it's slipping through the cracks somehow.

Reply to
Chris DH

Specifically:

32 bit: number_of_states=3: no failures found (2^38 bytes tested)

16 bit: number_of_states=3: 2^36 bytes needed (expected cycle: 2^48 bytes) number_of_states=4: no failures found (2^38 bytes tested)

8 bit: number_of_states=5: 2^36 bytes needed (expected cycle: 2^39 bytes) number_of_states=6: no failures found (2^38 bytes tested)

Those tests were with PractRand expanded test battery. It seemed to work better than the regular test battery on these.

Reply to
Chris DH

Have you cranked against TestU01? It and PractRand (which version?) seem to be the litmus tests these daze. __outer

Reply to
Richard Outerbridge

I hadn't, but have now. TestU01 *Crush does better than PractRand expanded on this, though the overall picture doesn't change much:

(bits)x(number_of_states): SmallCrush / Crush / BigCrush failures

32x3: 0/0/0 (pass) 16x3: 0/2/? 16x4: 0/0/0 (pass) 08x5: 0/6/? 08x6: 0/0/1
Reply to
Chris DH

Any fail is a non starter, really. __outer

Reply to
Richard Outerbridge

Well, yes. General purpose statistical tests are a rather low bar to pass. But those are deliberately reduced quality. 32x3 failed nothing. 64x3 likely would fail nothing. 16x4 failed nothing. 8x6 barely failed anything, so 8x7 would likely fail nothing. 32x8 ought to not just pass, but pass easily.

Of course, PRNGs that I actually recommend fail nothing, and reduced size versions fail little or nothing (and get more testing), and do well on various avalanche test metrics, and are quite a bit faster than that (relative to quality and feature set). But it's doing better than most algorithms posted on usenet do.

Here's the tweaks that my sensibilities would suggest to vrnd32_s: //x = (x

Reply to
Chris DH

[....] Choosing the samples to pass the tests rather defeats the purpose, eh? Just saying. __outer
Reply to
Richard Outerbridge

Could you clarify that comment please? I can't tell if your snarking on my comparison to the sample set of algorithms posted on usenet, or talking about something else completely.

Reply to
Chris DH

And a certain percentage of samples MUST fail if your generator is to be indistinguishable from random.

Reply to
David Eather

ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here. All logos and trade names are the property of their respective owners.