Here is the test with a new implementation of the Axrand proc added.
The modified C source was built with MSVC10, linked against MSVCRT.DLL, with maximum optimization for speed. The original, unmodified C source also compiled flawlessly with MSVC10.
The added ASM source, axrand_asm.asm, contains the new Axrand proc, which seems to be faster than the old one and also gives better PRNG results according to the ENT tests. It also contains an empty proc, which is called in the reference loop to measure how much time the surrounding calculations take.
Typical result on my machine:
Generating 200 Million random numbers with C.
That'll take a little while ...
Area = 0.250024975000000
Absolute Error = 0.000024975000000
Elapsed Time = 28.41 Seconds
Generating 200 Million random numbers with ASM Axrand.
That'll take a little while ...
Area = 0.249965115000000
Absolute Error = 0.000034885000000
Elapsed Time = 21.76 Seconds
This is empty reference loop to take the calculation code time in account
That'll take a little while ...
Area = 0.000000000000000
Absolute Error = 0.250000000000000
Elapsed Time = 8.44 Seconds
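To compare the generators themselves, the reference loop time can be subtracted from each run. A minimal sketch in C, assuming the reference loop does the same calculations as the other two runs and only swaps the generator call for the empty proc; the helper names net_seconds and print_net_times are made up for this illustration and are not part of the attached sources:

```c
#include <assert.h>
#include <stdio.h>

/* Net generator time in seconds: the measured run time minus the time
   of the empty reference loop (hypothetical helper, not in the attachment). */
static double net_seconds(double measured, double reference)
{
    /* The reference loop does less work, so it should never be slower. */
    assert(measured >= reference);
    return measured - reference;
}

/* Prints the net times for the figures quoted above and their ratio. */
static void print_net_times(void)
{
    const double c_total = 28.41;    /* "Generating ... with C" */
    const double asm_total = 21.76;  /* "Generating ... with ASM Axrand" */
    const double reference = 8.44;   /* empty reference loop */

    double c_net = net_seconds(c_total, reference);
    double asm_net = net_seconds(asm_total, reference);

    printf("C net time          = %.2f Seconds\n", c_net);
    printf("ASM Axrand net time = %.2f Seconds\n", asm_net);
    printf("C / ASM ratio       = %.2f\n", c_net / asm_net);
}
```

With the numbers above this gives roughly 19.97 seconds for the C generator and 13.32 seconds for Axrand. The call to the empty proc is still part of the reference time, so the call overhead is subtracted as well.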
The ENT results for the Axrand output, just for reference:
#############################################################
Test for #0 byte
Entropy = 7.999284 bits per byte.
Optimum compression would reduce the size
of this 250000 byte file by 0 percent.
Chi square distribution for 250000 samples is 248.07, and randomly
would exceed this value 50.00 percent of the times.
Arithmetic mean value of data bytes is 127.6159 (127.5 = random).
Monte Carlo value for Pi is 3.151154418 (error 0.30 percent).
Serial correlation coefficient is 0.000242 (totally uncorrelated = 0.0).
#############################################################
Test for #1 byte
Entropy = 7.999269 bits per byte.
Optimum compression would reduce the size
of this 250000 byte file by 0 percent.
Chi square distribution for 250000 samples is 254.58, and randomly
would exceed this value 50.00 percent of the times.
Arithmetic mean value of data bytes is 127.2226 (127.5 = random).
Monte Carlo value for Pi is 3.152786445 (error 0.36 percent).
Serial correlation coefficient is 0.001665 (totally uncorrelated = 0.0).
#############################################################
Test for #2 byte
Entropy = 7.999284 bits per byte.
Optimum compression would reduce the size
of this 250000 byte file by 0 percent.
Chi square distribution for 250000 samples is 247.80, and randomly
would exceed this value 50.00 percent of the times.
Arithmetic mean value of data bytes is 127.3210 (127.5 = random).
Monte Carlo value for Pi is 3.140018240 (error 0.05 percent).
Serial correlation coefficient is -0.000587 (totally uncorrelated = 0.0).
#############################################################
Test for #3 byte
Entropy = 7.999284 bits per byte.
Optimum compression would reduce the size
of this 250000 byte file by 0 percent.
Chi square distribution for 250000 samples is 248.13, and randomly
would exceed this value 50.00 percent of the times.
Arithmetic mean value of data bytes is 127.4540 (127.5 = random).
Monte Carlo value for Pi is 3.136178179 (error 0.17 percent).
Serial correlation coefficient is 0.000549 (totally uncorrelated = 0.0).
#############################################################
Test for full DWORD
Entropy = 7.999807 bits per byte.
Optimum compression would reduce the size
of this 1000000 byte file by 0 percent.
Chi square distribution for 1000000 samples is 267.70, and randomly
would exceed this value 50.00 percent of the times.
Arithmetic mean value of data bytes is 127.4034 (127.5 = random).
Monte Carlo value for Pi is 3.130836523 (error 0.34 percent).
Serial correlation coefficient is 0.019429 (totally uncorrelated = 0.0).
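For anyone who wants to repeat the per-byte tests, the byte streams can be produced roughly as below. This is only a sketch: it assumes that byte #0 means the least significant byte of each DWORD, and the function name extract_byte_lane is made up here, not taken from the attached sources. Each resulting buffer would then be written to its own file and passed to ENT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copies byte number `lane` (0 = least significant, 3 = most significant;
   this numbering is an assumption) of each 32-bit value in src into dst.
   dst must have room for n bytes. */
static void extract_byte_lane(const uint32_t *src, size_t n,
                              unsigned lane, unsigned char *dst)
{
    assert(lane < 4);
    for (size_t i = 0; i < n; ++i) {
        /* Shift the wanted byte down to bits 0..7 and mask off the rest. */
        dst[i] = (unsigned char)((src[i] >> (8u * lane)) & 0xFFu);
    }
}
```

With 250000 generated DWORDs this yields four 250000-byte streams plus the 1000000-byte full stream, which matches the file sizes in the reports above.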