News:

Masm32 SDK description, downloads and other helpful links
Message to All Guests

Main Menu

Release version of random pad generator.

Started by hutch--, April 14, 2019, 01:02:21 PM

Previous topic - Next topic

hutch--

I can't find anything else to do on this release version, I have added command line options that has a user defined output file name and the options of 2 to 10 XOR passes against the original pad. The first argument is a user defined integer that is multiplied by 1 million to set the pad size. The arguments are as follows.

SYNTAX : arg1 = integer in the following form
            1 = 1 million bytes, 2 = 2 million bytes, etc ....
         arg2 = The file name with no spaces
         arg3 = The number of XOR passes, 2 to 10

The public domain application ENT is included in the zip file and when rpg.exe is run, it calls ENT to immediately test the randomness of the output. Preferred range is when the CHI SQUARED is between 25% and 75% but the reference material that you find with ENT says 10% to 90% are OK.

The random output is usually OK but as with proper random you get some results outside the 10 - 90% range which you should discard. The real work with this tool is to make the pad non reproducable and its done with a massive level of complexity to make unwinding its construction near impossible. This addresses the potential of a single DWORD pad being broken by a brute force attack on very high powered computer hardware by simply running the full DWORD range.

The internal pad generator uses 5 identical pseudo random generators that construct each pad at randomly variable lengths and each pad is then XORRED against another random pad generated in the same manner. Multiple re-seeding occurs with each of the 5 random generators to break any patterns that may emerge with the random generators. The basic random generator is a custom modified shift-xor type that exhibits good randomness in a single pass.

It has been tested on win10 64 bit up to 1.5 billion byte pad sizes, above that you run into memory allocation failures. On older machines you will not be able to build pads of that size as the limit is determined by available memory.

Tedd

Please don't do this; there are standard functions for generating cyptographically strong random data - I'll create an example if necessary.

Also, XORing multiple times is no more secure than XORing once, assuming your data is sufficiently random (and if it's not, this is exactly why you should be using functions that do produce sufficiently random data.)
Potato2

hutch--

Hi Tedd,

Good to hear from you. I am a testing person myself so I have ENT attached to immediately check the output but the real reason for the tool is to make reproducing the pad close enough to impossible. The random does not improve after the first pass but the capacity to unwind the method of creating the pad increases in complexity with each pass given that each pass has pseudo random length variation as it is being constructed.

As you would be aware, with DWORD size keys, even if the quality of randomness is very high, you can brute force the data by exhausting the DWORD range and with high powered computers, this does not take all that long. The other factor with using a crude XOR of the pad to the data is the is no indicator of success which makes automated encryption cracking a far more difficult task.

I have long ago written test code for RDRAND and the seed creation in later hardware but did not like the tested randomness but I am also aware of potential backdoors in the standard Intel method where the mechanism I have used cannot have a backdoor put into it.

Tedd

XOR encryption with a one-time random pad is 100% secure (uncrackable) as long as two conditions are met:

  • the random pad is unpredictable, i.e. sufficiently random (ideally, truly random); and
  • the pad is at least the length of the data being encrypted (meaning there will be no repeats).
We could argue over whether your random data is random enough, but from a quick look it's pseudo-random based on the current time-stamp; so, given that you've provided a working program, the random data is as predictable as the time-stamp.

On the second point, you're still producing a repeating pattern (a longer one, but still repeating.)
As a simple example, if we use a key of length 5 for the first pass, and length 7 for the second pass, the end result is identical to XORing with a single repeating key of length 35 (an XOR of the 5-key and 7-key until they line up; note this doesn't always multiply - keys of lengths 5 and 10 only produce an effective key of length 10.)
As you add more passes, you can increase the effective length of the key used, but it's always going to repeat eventually; so there really is no difference to XORing once with a random key of the same effective length. The only appropriate length is: not less than the length of the data you're encrypting. Also, don't re-use the pad to encrypt anything else, as that's a repeat.

There are standard ways to extract the effective key length used, and after that it's purely a matter of using minimal knowledge of the encrypted data's format to extract parts of the key, which becomes increasingly easy with the number of repeats; so the only safe number of repeats is 0.
Potato2

hutch--

The basic seeding design is as follows.

    rdtsc                       ; get number
    bswap eax                   ; reverse byte order
    mov esi, eax                ; write 1st result into esi
    cpuid                       ; clear cache
    pause                       ; spinlock delay

    mov edi, 4                  ; load the loop counter
  lpst:
    rdtsc                       ; get number
    bswap eax                   ; reverse byte order
    xor esi, eax                ; xor eax to esi
    cpuid                       ; produce a delay between samples
    pause                       ; spinlock delay
    sub edi, 1                  ; decrement counter
    jnz lpst                    ; loop back if not zero

The reason for the bswap is to get the fast changing end of the integer returned by rdtsc, the following loop and the delays from cpuid and pause are designed to capitalise on the fact that no 2 machines have identical timings and/or operations. This is used for multiple reseeding at different parts of the pad generation. Each pass uses randomly different lengths of reseeding to break up any possible patterns from the 5 individually seeded pseudo random algos.

The purpose of the additional complexity is to make reconstructing the pad close to impossible, it breaks brute force attacks on the DWORD range as the pad is not created by a single DWORD, the further complexity is to break partial cracking by multiple passes at the same data and then correlating the partials.

We agree on your points 1 and 2
the random pad is unpredictable, i.e. sufficiently random (ideally, truly random); and
the pad is at least the length of the data being encrypted (meaning there will be no repeats).

I would add that the pad cannot be reused but if you create a pad big enough you can keep using different offsets for a very long time with no risk. The main pseudo random generators are a modified shift/xor design that seems to generate reasonable random sequences and by sequencing the output it passes the available tests I have available with ENT. Always the hardest to get right is the CHI SQUARED result.




Tedd

Potato2

hutch--

Seems to work OK and its fast. CHI SQUARED is borderline.

Entropy = 7.999999 bits per byte.

Optimum compression would reduce the size
of this 268435456 byte file by 0 percent.

Chi square distribution for 268435456 samples is 290.43, and randomly
would exceed this value 10.00 percent of the times.

Arithmetic mean value of data bytes is 127.5026 (127.5 = random).
Monte Carlo value for Pi is 3.141953456 (error 0.01 percent).
Serial correlation coefficient is 0.000014 (totally uncorrelated = 0.0).


Here is the comparison.

K:\basic_code_2\rpg>rpg 268 randpad.pad 2
Pass 1
Pass 2
Writing output file
268000000 bytes written to : randpad.pad
Entropy = 7.999999 bits per byte.

Optimum compression would reduce the size
of this 268000000 byte file by 0 percent.

Chi square distribution for 268000000 samples is 247.74, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4981 (127.5 = random).
Monte Carlo value for Pi is 3.141929241 (error 0.01 percent).
Serial correlation coefficient is -0.000056 (totally uncorrelated = 0.0).


Yours is a lot faster as it does a lot less work. I am aiming at making the pad non-reproducible.

hutch--

Here is a direct comparison with RDRAND. It was just a test piece and it is unseeded.

Entropy = 7.999999 bits per byte.

Optimum compression would reduce the size
of this 268000000 byte file by 0 percent.

Chi square distribution for 268000000 samples is 246.24, and randomly
would exceed this value 64.16 percent of the times.

Arithmetic mean value of data bytes is 127.4877 (127.5 = random).
Monte Carlo value for Pi is 3.141803957 (error 0.01 percent).
Serial correlation coefficient is -0.000026 (totally uncorrelated = 0.0).

That's all folks
Press any key to exit ....