The MASM Forum

General => The Laboratory => Topic started by: Gunther on October 07, 2013, 09:32:58 PM

Title: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Gunther on October 07, 2013, 09:32:58 PM
Attached is the archive MC32.ZIP. It's the 32-bit version of the program from this thread (http://masm32.com/board/index.php?topic=2432.0). The assembly language source works with jWasm and probably with MASM (not tested). The C source works with GCC and should work with MS VC (not tested).

Here's a typical program output:

Generating 200 Million random numbers with RDRAND.
That'll take a little while ...

Area           = 0.250004200000000
Absolute Error = 0.000004200000000
Elapsed Time   = 23.10 Seconds

Generating 200 Million random numbers with C.
That'll take a little while ...

Area           = 0.250011705000000
Absolute Error = 0.000011705000000
Elapsed Time   = 4.11 Seconds

It's the same situation: RDRAND tends to be slow. A good software RNG does a better and faster job. Test results from other machines would be welcome.

Gunther
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: dedndave on October 08, 2013, 12:48:19 AM
 :t

p4 prescott w/htt 3.0 GHz
Generating 200 Million random numbers with C.
That'll take a little while ...

Area           = 0.250017755000000
Absolute Error = 0.000017755000000
Elapsed Time   = 19.33 Seconds
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Gunther on October 08, 2013, 12:50:25 AM
Thank you for testing, Dave.  :t

Gunther
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: GoneFishing on October 08, 2013, 01:35:57 AM
...
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Gunther on October 08, 2013, 02:06:50 AM
Hi vertograd,

Quote from: vertograd on October 08, 2013, 01:35:57 AM
I simplified your C routine  for CPU performance testing purposes :

the simplification is okay. But did you read MC.PDF?

Quote from: vertograd on October 08, 2013, 01:35:57 AM
I'll post the results for GPU-generated random numbers as soon  as I figure out how to write CUDA code .

Is CUDA really a good idea? It's inseparably tied to Nvidia hardware; the only alternative is AMD/ATI. In any case, the application would be strongly hardware dependent.

Gunther
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: GoneFishing on October 08, 2013, 02:49:09 AM
...
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Gunther on October 08, 2013, 05:07:35 AM
Hi vertograd,

Quote from: vertograd on October 08, 2013, 02:49:09 AM
It seems good for me at least - it's interesting and I  want to use it for testing hardware capabilities.

But it has drawbacks (http://www.herikstad.net/2009/05/cuda-and-double-precision-floating.html), too.

Quote from: vertograd on October 08, 2013, 02:49:09 AM
You wrote about your codec for compressing/decompressing fractal images ( for internal use I suppose). That's the case when using CUDA can drastically increase performance   

CUDA is good for 9 or 12 bit data and some kinds of fixed-point arithmetic, but it's less suitable for the usual widths (32, 64, 128 bit). We solved our performance problem by using SIMD instructions.

Gunther
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: GoneFishing on October 08, 2013, 05:34:58 AM
...
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Antariy on October 11, 2013, 01:10:46 PM
Here is the test with a new implementation of the Axrand proc added.
The modified C source was built with MSVC10, linked against MSVCRT.DLL, with maximum optimization for speed. The original, unmodified C source also compiled flawlessly with MSVC10.
The added ASM source axrand_asm.asm includes the new Axrand proc, which seems to be faster than the old one and also gives better PRNG results according to the ENT tests, plus an empty proc used in a reference loop to measure how much time the surrounding calculations take.

Typical result on my machine:

Generating 200 Million random numbers with C.
That'll take a little while ...

Area           = 0.250024975000000
Absolute Error = 0.000024975000000
Elapsed Time   = 28.41 Seconds

Generating 200 Million random numbers with ASM Axrand.
That'll take a little while ...

Area           = 0.249965115000000
Absolute Error = 0.000034885000000
Elapsed Time   = 21.76 Seconds

This is empty reference loop to take the calculation code time in account
That'll take a little while ...

Area           = 0.000000000000000
Absolute Error = 0.250000000000000
Elapsed Time   = 8.44 Seconds


The ENT results for the Axrand output, just for reference:

#############################################################
Test for #0 byte
Entropy = 7.999284 bits per byte.

Optimum compression would reduce the size
of this 250000 byte file by 0 percent.

Chi square distribution for 250000 samples is 248.07, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.6159 (127.5 = random).
Monte Carlo value for Pi is 3.151154418 (error 0.30 percent).
Serial correlation coefficient is 0.000242 (totally uncorrelated = 0.0).



#############################################################
Test for #1 byte
Entropy = 7.999269 bits per byte.

Optimum compression would reduce the size
of this 250000 byte file by 0 percent.

Chi square distribution for 250000 samples is 254.58, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.2226 (127.5 = random).
Monte Carlo value for Pi is 3.152786445 (error 0.36 percent).
Serial correlation coefficient is 0.001665 (totally uncorrelated = 0.0).



#############################################################
Test for #2 byte
Entropy = 7.999284 bits per byte.

Optimum compression would reduce the size
of this 250000 byte file by 0 percent.

Chi square distribution for 250000 samples is 247.80, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.3210 (127.5 = random).
Monte Carlo value for Pi is 3.140018240 (error 0.05 percent).
Serial correlation coefficient is -0.000587 (totally uncorrelated = 0.0).



#############################################################
Test for #3 byte
Entropy = 7.999284 bits per byte.

Optimum compression would reduce the size
of this 250000 byte file by 0 percent.

Chi square distribution for 250000 samples is 248.13, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4540 (127.5 = random).
Monte Carlo value for Pi is 3.136178179 (error 0.17 percent).
Serial correlation coefficient is 0.000549 (totally uncorrelated = 0.0).



#############################################################
Test for full DWORD
Entropy = 7.999807 bits per byte.

Optimum compression would reduce the size
of this 1000000 byte file by 0 percent.

Chi square distribution for 1000000 samples is 267.70, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4034 (127.5 = random).
Monte Carlo value for Pi is 3.130836523 (error 0.34 percent).
Serial correlation coefficient is 0.019429 (totally uncorrelated = 0.0).
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Antariy on October 11, 2013, 01:15:24 PM
Thank you, Gunther, for permission to add the code into your test :biggrin:
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: dedndave on October 11, 2013, 01:25:45 PM
nice   :t

P4 Prescott w/htt @ 3.0 Ghz
Generating 200 Million random numbers with C.
That'll take a little while ...

Area           = 0.250029345000000
Absolute Error = 0.000029345000000
Elapsed Time   = 17.61 Seconds

Generating 200 Million random numbers with ASM Axrand.
That'll take a little while ...

Area           = 0.249965115000000
Absolute Error = 0.000034885000000
Elapsed Time   = 14.23 Seconds

This is empty reference loop to take the calculation code time in account
That'll take a little while ...

Area           = 0.000000000000000
Absolute Error = 0.250000000000000
Elapsed Time   = 5.42 Seconds
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Antariy on October 11, 2013, 01:46:03 PM
Thank you, Dave :biggrin: :t
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: TWell on October 11, 2013, 07:27:54 PM
AMD Athlon(tm) II X2 220 Processor 2.80 GHz
Generating 200 Million random numbers with C.
That'll take a little while ...

Area           = 0.250021205000000
Absolute Error = 0.000021205000000
Elapsed Time   = 10.84 Seconds

Generating 200 Million random numbers with ASM Axrand.
That'll take a little while ...

Area           = 0.249965115000000
Absolute Error = 0.000034885000000
Elapsed Time   = 5.98 Seconds

This is empty reference loop to take the calculation code time in account
That'll take a little while ...

Area           = 0.000000000000000
Absolute Error = 0.250000000000000
Elapsed Time   = 2.58 Seconds
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: sinsi on October 11, 2013, 08:43:59 PM
Alex's latest

Generating 200 Million random numbers with RDRAND.
That'll take a little while ...

Area           = 0.250021975000000
Absolute Error = 0.000021975000000
Elapsed Time   = 16.46 Seconds

Generating 200 Million random numbers with C.
That'll take a little while ...

Area           = 0.249995930000000
Absolute Error = 0.000004070000000
Elapsed Time   = 4.25 Seconds

Generating 200 Million random numbers with ASM Axrand.
That'll take a little while ...

Area           = 0.249965115000000
Absolute Error = 0.000034885000000
Elapsed Time   = 2.97 Seconds

This is empty reference loop to take the calculation code time in account
That'll take a little while ...

Area           = 0.000000000000000
Absolute Error = 0.250000000000000
Elapsed Time   = 1.55 Seconds

What are the numbers for area and error e.g. is 25.002 better than 24.999?
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Antariy on October 11, 2013, 08:45:37 PM
Thank you, TWell! :biggrin:
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Antariy on October 11, 2013, 08:49:18 PM
Oh, was posting simultaneously :biggrin:
Thank you, John! :t
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: jj2007 on October 11, 2013, 08:53:25 PM
One more :biggrin:
Generating 200 Million random numbers with C.
That'll take a little while ...

Area           = 0.250098180000000
Absolute Error = 0.000098180000000
Elapsed Time   = 14.77 Seconds

Generating 200 Million random numbers with ASM Axrand.
That'll take a little while ...

Area           = 0.249965115000000
Absolute Error = 0.000034885000000
Elapsed Time   = 7.27 Seconds
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Antariy on October 11, 2013, 08:54:42 PM
Quote from: sinsi on October 11, 2013, 08:43:59 PM
What are the numbers for area and error e.g. is 25.002 better than 24.999?

If I'm not mistaken, the result for an "ideal PRNG" should be equal to 0.25, so the closer the result is to that value, the better (the more evenly distributed) the PRNG output is. One difference between the generators in this test, though, is that RDRAND and Axrand return numbers in the full DWORD range, while the C rand returns numbers in the range 0...32767.
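
To make the 0.25 target concrete, here is a minimal hit-or-miss sketch in C. It is not the code from MC32.ZIP: the region under y = x^3 in the unit square (exact area 1/4) is only an assumed stand-in for whatever MC.PDF integrates, and xorshift32, gen32, gen15 and estimate_area are hypothetical names. gen15 mimics a generator limited to 0...32767, gen32 one that fills the whole DWORD.

```c
#include <stdint.h>

/* Stand-in PRNG (Marsaglia xorshift32). NOT RDRAND, Axrand or the CRT rand(). */
static uint32_t xs_state = 2463534242u;

uint32_t xorshift32(void)
{
    xs_state ^= xs_state << 13;
    xs_state ^= xs_state >> 17;
    xs_state ^= xs_state << 5;
    return xs_state;
}

/* Full DWORD range, like RDRAND or Axrand in this thread. */
uint32_t gen32(void) { return xorshift32(); }

/* Only 15 bits (0...32767), like a rand() with RAND_MAX = 32767. */
uint32_t gen15(void) { return xorshift32() >> 17; }

/*
 * Hit-or-miss estimate: draw points (x, y) in the unit square and count
 * those with y < x^3.
 * ASSUMPTION: this region (area 0.25) stands in for the one used by the
 * original program. 'range' is the number of distinct generator values
 * (2^15 or 2^32), so x and y fall in [0, 1).
 */
double estimate_area(uint32_t (*next)(void), double range, unsigned long samples)
{
    unsigned long hits = 0;
    for (unsigned long i = 0; i < samples; ++i) {
        double x = next() / range;
        double y = next() / range;
        if (y < x * x * x)
            ++hits;
    }
    return (double)hits / (double)samples;
}
```

Calling estimate_area(gen32, 4294967296.0, n) or estimate_area(gen15, 32768.0, n) gives an estimate whose distance from 0.25 corresponds to the "Absolute Error" line in the outputs above.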
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Antariy on October 11, 2013, 08:56:19 PM
Quote from: jj2007 on October 11, 2013, 08:53:25 PM
One more :biggrin:

Thank you, Jochen! :biggrin:
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Gunther on October 11, 2013, 10:21:38 PM
Hi Alex,

Quote from: Antariy on October 11, 2013, 01:15:24 PM
Thank you, Gunther, for permission to add the code into your test :biggrin:

You're welcome, Alex. Thank you for testing the C source with MS VC.

Here is the output:

Generating 200 Million random numbers with RDRAND.
That'll take a little while ...

Area           = 0.250033590000000
Absolute Error = 0.000033590000000
Elapsed Time   = 16.08 Seconds

Generating 200 Million random numbers with C.
That'll take a little while ...

Area           = 0.250033955000000
Absolute Error = 0.000033955000000
Elapsed Time   = 5.02 Seconds

Generating 200 Million random numbers with ASM Axrand.
That'll take a little while ...

Area           = 0.249965115000000
Absolute Error = 0.000034885000000
Elapsed Time   = 2.89 Seconds

This is empty reference loop to take the calculation code time in account
That'll take a little while ...

Area           = 0.000000000000000
Absolute Error = 0.250000000000000
Elapsed Time   = 1.50 Seconds


Gunther
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Antariy on October 11, 2013, 11:22:16 PM
Quote from: Gunther on October 11, 2013, 10:21:38 PM
You're welcome, Alex. Thank you for testing the C source with MS VC.

Thank you, Gunther! You're welcome, too! :biggrin:
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: jj2007 on January 26, 2014, 06:46:25 PM
include \masm32\MasmBasic\MasmBasic.inc        ; download (http://masm32.com/board/index.php?topic=94.0)
  Init
  Dll "NtDll"                   ; don't believe MSDN (http://msdn.microsoft.com/en-us/library/windows/hardware/ff553181%28v=vs.85%29.aspx), it's NtDll, really...
  Declare RtlRandomEx, 1        ; _Inout_  PULONG Seed
  push "Ciao"
  push "Alex"
  m2m ebx, 20
  .Repeat
        PrintLine Str$(edx::RtlRandomEx(esp))
        dec ebx
  .Until Sign?
  pop eax
  pop edx
  Inkey " ok"
  Exit
end start


Output:
4791245291978797284
1200059578463366776
...
7431188666248921181
6381367696984855216

;)
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Gunther on January 26, 2014, 10:17:04 PM
Jochen,

interesting code. But why pushing "Ciao" and "Alex"?

Gunther
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: jj2007 on January 26, 2014, 10:41:49 PM
Quote from: Gunther on January 26, 2014, 10:17:04 PM
interesting code. But why pushing "Ciao" and "Alex"?

If you prefer another qword seed, just
push "Gunt"
push "her"
;-)
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Gunther on January 26, 2014, 10:55:44 PM
Jochen,

Quote from: jj2007 on January 26, 2014, 10:41:49 PM
If you prefer another qword seed, just
push "Gunt"
push "her"
;-)

that makes more sense.  :lol: :lol: :lol:

Gunther
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: jj2007 on January 27, 2014, 03:30:31 AM
After reading on MSDN (http://msdn.microsoft.com/en-us/library/windows/hardware/ff553181%28v=vs.85%29.aspx)...
Quote from MSDN:
The RtlRandomEx function is an improved version of the RtlRandom function. Compared with the RtlRandom function, RtlRandomEx is twice as fast and produces better random numbers
... I could not resist the temptation to make some tests. Here are the results :dazzled:

1521 ms with writing the file, RtlRandomEx
121 ms without writing

1336 ms with writing the file, MasmBasic Rand()
2 ms without writing


############ ENT results RtlRandomEx:
Entropy = 7.953506 bits per byte.

Optimum compression would reduce the size
of this 1000000 byte file by 0 percent.

Chi square distribution for 1000000 samples is 63775.98, and randomly
would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 111.5300 (127.5 = random).
Monte Carlo value for Pi is 3.491725967 (error 11.15 percent).
Serial correlation coefficient is -0.047085 (totally uncorrelated = 0.0).

############ ENT results MasmBasic Rand() (http://www.webalice.it/jj2006/MasmBasicQuickReference.htm#Mb1030):
Entropy = 7.999827 bits per byte.

Optimum compression would reduce the size
of this 1000000 byte file by 0 percent.

Chi square distribution for 1000000 samples is 240.28, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4222 (127.5 = random).
Monte Carlo value for Pi is 3.141804567 (error 0.01 percent).
Serial correlation coefficient is -0.002690 (totally uncorrelated = 0.0).
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Gunther on January 27, 2014, 03:43:13 AM
Jochen,

here are the test results:

351 ms with writing the file, RtlRandomEx
45 ms without writing

283 ms with writing the file, MasmBasic Rand()
1 ms without writing

############ ENT results RtlRandomEx:
Entropy = 7.953470 bits per byte.

Optimum compression would reduce the size
of this 1000000 byte file by 0 percent.

Chi square distribution for 1000000 samples is 63817.08, and randomly
would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 111.4225 (127.5 = random).
Monte Carlo value for Pi is 3.489901960 (error 11.09 percent).
Serial correlation coefficient is -0.048078 (totally uncorrelated = 0.0).

############ ENT results MbRand:
Entropy = 7.999827 bits per byte.

Optimum compression would reduce the size
of this 1000000 byte file by 0 percent.

Chi square distribution for 1000000 samples is 240.28, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4222 (127.5 = random).
Monte Carlo value for Pi is 3.141804567 (error 0.01 percent).
Serial correlation coefficient is -0.002690 (totally uncorrelated = 0.0).


Gunther
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: MichaelW on January 29, 2014, 04:53:36 AM
I spent several hours testing RtlRandomEx under Windows XP, and while most of the ENT results were reasonable, the Chi-square distribution was always far from reasonable. Looking at the RtlRandomEx code in a debugger it's not anything trivial, and it uses the bottom 7 bits of the seed to load a DWORD from an apparently 128-element table, so I suspect that under Windows XP the function is not operating as intended.
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: jj2007 on January 29, 2014, 05:24:51 AM
Michael,

Results are the same on Windows 7. The mean is also too far off to be credible.
What I found interesting is that it modifies the seed's location, and you can use that value to get a 64-bit result with practically identical scores.
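
A possible explanation for the 111.x mean, offered as an assumption rather than a confirmed diagnosis: RtlRandomEx is documented as returning values below MAXLONG (7FFFFFFFh), so the top bit of every DWORD written to the ENT file would be zero and the highest byte could only take the values 0...127. The C sketch below (expected_byte_mean is a hypothetical helper) computes the byte mean ENT would see if the values were spread evenly over the low value_bits bits of each DWORD.

```c
/*
 * Mean byte value when 32-bit words carry uniformly random data only in
 * their low 'value_bits' bits (the remaining high bits are always zero)
 * and all four bytes of each word are written to the file.
 */
double expected_byte_mean(unsigned value_bits)
{
    double sum = 0.0;
    for (unsigned byte = 0; byte < 4; ++byte) {
        unsigned low = byte * 8;   /* position of this byte's first bit */
        unsigned bits = 0;         /* number of random bits in this byte */
        if (value_bits > low)
            bits = (value_bits - low > 8) ? 8 : value_bits - low;
        /* a uniform b-bit field has the mean (2^b - 1) / 2 */
        sum += ((double)(1u << bits) - 1.0) / 2.0;
    }
    return sum / 4.0;
}
```

With value_bits = 32 this gives 127.5. With value_bits = 31 it gives (3 * 127.5 + 63.5) / 4 = 111.5, which is close to the means ENT reports for RtlRandomEx in this thread.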
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Gunther on January 29, 2014, 09:40:16 PM
I've tested it under Windows 7 (64 bit). I can also test it under the 32-bit version of Windows 7 and under Windows XP.

Gunther
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: MichaelW on January 30, 2014, 10:02:03 AM
In my ENT results the mean was reasonable, and the mean calculated by my test code below is also reasonable. And whatever problem the ENT Chi-square test is detecting, my simple distribution test does not show it.

;==============================================================================
    include \masm32\include\masm32rt.inc
;==============================================================================
    .data
        total         dq  0
        counts        dd  10 dup(0)
        seed          dd  1
        hModule       dd  0
        pRtlRandomEx  dd  0
    .code
;==============================================================================
RtlRandomEx proc pSeed:PULONG
    .IF hModule == 0
        invoke LoadLibrary, chr$("ntoskrnl.exe")
        mov hModule, eax
        invoke GetProcAddress, hModule, chr$("RtlRandomEx")
        mov pRtlRandomEx, eax
    .ENDIF
    push pSeed
    call pRtlRandomEx
    ret
RtlRandomEx endp
;==============================================================================
start:
;==============================================================================
    MAXLONG equ 7fffffffh       ; from winnt.h
    COUNT = 1000000000
    USEBYTE equ 0
    xor ebx, ebx
    .WHILE ebx < COUNT
        ;---------------------------------------------------------
        ; This to determine that "same seed" in the documentation
        ; means same seed variable and not same seed value.
        ;---------------------------------------------------------
        ;mov seed, 1
        invoke RtlRandomEx, ADDR seed
        IF USEBYTE
            movzx eax, al
        ENDIF
        add DWORD PTR total, eax
        adc DWORD PTR total+4, 0
        X=1
        REPEAT 10
            IF USEBYTE
                cmp al, X*(256/10)
            ELSE
                cmp eax, X*(MAXLONG/10)
            ENDIF
            ja  @F
            inc counts[(X-1)*4]
            jmp done
          @@:
            X=X+1
        ENDM
      done:
        inc ebx
    .ENDW
    mov eax, DWORD PTR total
    mov edx, DWORD PTR total+4
    mov ecx, COUNT
    idiv ecx
    printf("mean  : %Xh\n\n",eax)
    X=0
    REPEAT 10
        printf("%u\n",counts[X])
        X=X+4
    ENDM
    printf("\n")
    inkey
    exit
;==============================================================================
end start
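
For comparison, ENT's chi-square figure is computed over byte values rather than over ranges of whole DWORDs. The C sketch below shows that kind of test using the textbook Pearson definition (256 bins with equal expected counts). chi_square_bytes is a hypothetical helper and not ENT's actual source. A ten-bucket test scaled by MAXLONG, like the one above, would probably not notice if one byte of each DWORD used only half of its range, whereas a per-byte histogram would.

```c
#include <stddef.h>

/*
 * Pearson chi-square statistic of a byte stream against a uniform
 * distribution over the 256 possible byte values.
 * Returns 0.0 for an empty buffer.
 */
double chi_square_bytes(const unsigned char *data, size_t length)
{
    size_t counts[256] = {0};
    double expected, chi = 0.0;

    if (length == 0)
        return 0.0;

    /* histogram of byte values */
    for (size_t i = 0; i < length; ++i)
        ++counts[data[i]];

    /* sum of (observed - expected)^2 / expected over all 256 bins */
    expected = (double)length / 256.0;
    for (int v = 0; v < 256; ++v) {
        double diff = (double)counts[v] - expected;
        chi += diff * diff / expected;
    }
    return chi;
}
```

With 255 degrees of freedom, random data should give a statistic somewhere around 255, which is the order of magnitude ENT shows for the better generators in this thread.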


My search for RtlRandomEx did turn up an interesting article:

http://blog.ptsecurity.com/2012/12/windows-8-aslr-internals.html

That makes me wonder how the Windows 8 RtlRandomEx would do in the ENT tests.




Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: jj2007 on January 30, 2014, 12:35:09 PM
Michael,
Very interesting...!

Your code crashes on my Win XP SP3 machine. It happens here:
RtlRandomEx                   8BFF                       mov edi, edi  ; ntdll.7C920228 
0047E789                  /.  55                         push ebp
0047E78A                  |.  8BEC                       mov ebp, esp
0047E78C                  |.  A1 30495C00                mov eax, [5C4930]
0047E791                  |.  83E0 7F                    and eax, 7F
0047E794                  |.  53                         push ebx
0047E795                  |.  8D0C85 38495C00            lea ecx, [eax*4+5C4938]
0047E79C                  |.  8B01                       mov eax, [ecx]
0047E79E                  |.  56                         push esi
0047E79F                  |.  8B75 08                    mov esi, [arg.1]
0047E7A2                  |.  A3 30495C00                mov [5C4930], eax
0047E7A7                  |.  57                         push edi


Obviously, I wondered why mine didn't crash:
RtlRandomEx               /$  8BFF                       mov edi, edi
7C947B82                  |.  55                         push ebp
7C947B83                  |.  8BEC                       mov ebp, esp
7C947B85                  |.  A1 8009997C                mov eax, [7C990980]
7C947B8A                  |.  83E0 7F                    and eax, 7F
7C947B8D                  |.  53                         push ebx
7C947B8E                  |.  8D0C85 8809997C            lea ecx, [eax*4+7C990988]
7C947B95                  |.  8B01                       mov eax, [ecx]
7C947B97                  |.  56                         push esi
7C947B98                  |.  8B75 08                    mov esi, [arg.1]
7C947B9B                  |.  A3 8009997C                mov [7C990980], eax
7C947BA0                  |.  57                         push edi


We are using two different DLLs, ntoskrnl.exe vs NtDll.dll

Running your code with "my" DLL yields this:
mean  : 4035C436h

98675333
100855976
99656502
97367024
101346380
97367072
99983825
103363306
103799432
97585150


How did you get yours (i.e. the native API) running without that exception?
When I use your code with ntdll for generating the ENT data file, the same bad results pop up, in particular the mean of 111.x as compared to 127.5 (=255/2).

There is actually a problem with the documentation. According to MSDN,
RtlRandomEx returns a random number in the range [0..MAXLONG-1]. (http://msdn.microsoft.com/en-us/library/windows/hardware/ff553181%28v=vs.85%29.aspx) -> WinNT.h: #define MAXLONG     0x7fffffff, i.e. a 32-bit number.

However, using this C++ snippet:
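#include <iostream>
#include <Windows.h>

// The typedef declares a 64-bit return value, so 32-bit MSVC reads the result from edx:eax
typedef ULONGLONG(__stdcall *_RtlRandomEx)(PULONG seed);

int main()
{
    _RtlRandomEx RtlRandomEx = (_RtlRandomEx)GetProcAddress(GetModuleHandleA("ntdll.dll"), "RtlRandomEx");
    ULONG seed = 123;
    ULONGLONG result;
    for (int i = 0; i < 10; ++i)
    {
        __asm int 3;                    // breakpoint before the call (32-bit MSVC inline asm)
        result = RtlRandomEx(&seed);
        __asm int 3;                    // breakpoint after the call
        std::cout << result << std::endl;
    }
    return 0;
}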

... you find this:
00411412    |.  FF55 F8           call near [local.2]  <<<<<<< RtlRandomEx
00411415    |.  3BF4              cmp esi, esp
00411417    |.  E8 24FDFFFF       call 00411140                   ; [_RTC_CheckEsp
0041141C    |.  8945 DC           mov [local.9], eax <<<<<<< QWORD, not
0041141F    |.  8955 E0           mov [local.8], edx <<<<<<< DWORD!!!

RtlRandomEx /$  8BFF              mov edi, edi                    ; ntdll.RtlRandomEx(guessed Arg1)
7C947B82    |.  55                push ebp
7C947B83    |.  8BEC              mov ebp, esp
7C947B85    |.  A1 8009997C       mov eax, [7C990980]
7C947B8A    |.  83E0 7F           and eax, 0000007F
7C947B8D    |.  53                push ebx
7C947B8E    |.  8D0C85 8809997C   lea ecx, [eax*4+7C990988]
7C947B95    |.  8B01              mov eax, [ecx]
7C947B97    |.  56                push esi
7C947B98    |.  8B75 08           mov esi, [ebp+8]
7C947B9B    |.  A3 8009997C       mov [7C990980], eax
7C947BA0    |.  57                push edi
7C947BA1    |.  8BF8              mov edi, eax
7C947BA3    |.  8B06              mov eax, [esi]
7C947BA5    |.  69C0 EDFFFF7F     imul eax, eax, 7FFFFFED
7C947BAB    |.  05 C3FFFF7F       add eax, 7FFFFFC3
7C947BB0    |.  BB FFFFFF7F       mov ebx, 7FFFFFFF
7C947BB5    |.  33D2              xor edx, edx
7C947BB7    |.  F7F3              div ebx
7C947BB9    |.  8BC7              mov eax, edi
7C947BBB    |.  5F                pop edi
7C947BBC    |.  8916              mov [esi], edx
7C947BBE    |.  5E                pop esi
7C947BBF    |.  8911              mov [ecx], edx
7C947BC1    |.  5B                pop ebx
7C947BC2    |.  5D                pop ebp
7C947BC3    \.  C2 0400           retn 4


... it is evident that RtlRandomEx returns a QWORD, not a DWORD. Which, by the way, has exactly the same ENT characteristics, i.e. the 111 mean and the 7.95 bits.
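
Reading the Windows XP listing above instruction by instruction suggests a linear congruential generator whose outputs are shuffled through a 128-entry table. The C model below is a sketch of that reading, not Microsoft's source: all names are hypothetical, and the way the table is first filled in model_init is purely an assumption, since the listing does not show the initial table contents.

```c
#include <stdint.h>

#define MODEL_TABLE_SIZE 128u

typedef struct {
    uint32_t index_state;               /* models the DWORD at [7C990980] */
    uint32_t table[MODEL_TABLE_SIZE];   /* models the DWORDs at [7C990988] */
} rtl_model;

/* imul eax, eax, 7FFFFFED / add eax, 7FFFFFC3 / div by 7FFFFFFF (remainder) */
uint32_t model_lcg_step(uint32_t seed)
{
    uint32_t t = (uint32_t)(seed * 0x7FFFFFEDu + 0x7FFFFFC3u); /* wraps at 32 bits */
    return t % 0x7FFFFFFFu;
}

/* ASSUMPTION: fill the table from the LCG itself; the real contents are unknown. */
void model_init(rtl_model *m, uint32_t fill_seed)
{
    for (uint32_t i = 0; i < MODEL_TABLE_SIZE; ++i) {
        fill_seed = model_lcg_step(fill_seed);
        m->table[i] = fill_seed;
    }
    m->index_state = fill_seed;
}

/* One call: return the old table entry, replace it with the new seed. */
uint32_t model_next(rtl_model *m, uint32_t *seed)
{
    uint32_t idx = m->index_state & 0x7Fu;  /* and eax, 7F              */
    uint32_t result = m->table[idx];        /* mov eax, [ecx]           */
    m->index_state = result;                /* mov [7C990980], eax      */
    *seed = model_lcg_step(*seed);          /* mov [esi], edx           */
    m->table[idx] = *seed;                  /* mov [ecx], edx           */
    return result;                          /* mov eax, edi             */
}
```

In this reading, every value that enters the table is a remainder modulo 7FFFFFFFh, so the top bit of each returned DWORD is never set, and the value left in EDX at the retn is the new seed that was also written back through the seed pointer.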
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: TWell on January 30, 2014, 07:08:46 PM
return value in EDX is constant value 77520124h in Windows 7 64-bit ?
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: GoneFishing on January 30, 2014, 07:19:26 PM
Quote from: MichaelW on January 30, 2014, 10:02:03 AM
...
My search for RtlRandomEx did turn up an interesting article:

http://blog.ptsecurity.com/2012/12/windows-8-aslr-internals.html

That makes me wonder how the Windows 8 RtlRandomEx would do in the ENT tests.

Results for Windows 8.1 64:

1188 ms with writing the file, RtlRandomEx
65 ms without writing

1028 ms with writing the file, MasmBasic Rand()
1 ms without writing


############ ENT results RtlRandomEx:
Entropy = 7.954139 bits per byte.

Optimum compression would reduce the size
of this 1000000 byte file by 0 percent.

Chi square distribution for 1000000 samples is 62902.70, and randomly
would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 111.5191 (127.5 = random).
Monte Carlo value for Pi is 3.487717951 (error 11.02 percent).
Serial correlation coefficient is -0.049256 (totally uncorrelated = 0.0).

############ ENT results MbRand:
Entropy = 7.999827 bits per byte.

Optimum compression would reduce the size
of this 1000000 byte file by 0 percent.

Chi square distribution for 1000000 samples is 240.28, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4222 (127.5 = random).
Monte Carlo value for Pi is 3.141804567 (error 0.01 percent).
Serial correlation coefficient is -0.002690 (totally uncorrelated = 0.0).
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: jj2007 on January 30, 2014, 07:50:41 PM
Quote from: TWell on January 30, 2014, 07:08:46 PM
return value in EDX is constant value 77520124h in Windows 7 64-bit ?

Very good question. On XP, it's a random number, but here on my Win7-32 it's 777A74E8h... and under the hood, you discover that the code has considerably changed:
RtlRandom> $  8BFF          mov edi, edi
77732835   .  55            push ebp
77732836   .  8BEC          mov ebp, esp
77732838   .  56            push esi
77732839   .  57            push edi
7773283A   .  6A 00         push 0
7773283C   .  6A 00         push 0
7773283E   .  68 B5867377   push 777386B5
77732843   .  68 04717A77   push 777A7104
77732848   .  E8 6A47FFFF   call RtlRunOnceExecuteOnce
7773284D   .  8B7D 08       mov edi, dword ptr [ebp+8]
77732850   .  8B07          mov eax, dword ptr [edi]
77732852   .  8B35 E8747A77 mov esi, dword ptr [777A74E8]
77732858   .  B9 EDFFFF7F   mov ecx, 7FFFFFED
7773285D   .  F7E1          mul ecx                                  ;  MSVCP100.70B8ED48
7773285F   .  6A 00         push 0
77732861   .  83E6 7F       and esi, 7F
77732864   .  05 C3FFFF7F   add eax, 7FFFFFC3
77732869   .  68 FFFFFF7F   push 7FFFFFFF
7773286E   .  83D2 00       adc edx, 0
77732871   .  52            push edx                                 ;  MSVCP100.std::cout
77732872   .  50            push eax
77732873   .  E8 F81EFDFF   call _aullrem
77732878   .  8907          mov dword ptr [edi], eax
7773287A   .  8D0CB5 00A97A>lea ecx, dword ptr [esi*4+777AA900]
77732881   .  8701          xchg dword ptr [ecx], eax
77732883   .  8BC8          mov ecx, eax
77732885   .  BA E8747A77   mov edx, 777A74E8
7773288A   .  F0:0FC10A     lock xadd dword ptr [edx], ecx           ;  LOCK prefix
7773288E   .  5F            pop edi                                  ;  RtlRando.013B35F3
7773288F   .  5E            pop esi                                  ;  RtlRando.013B35F3
77732890   .  5D            pop ebp                                  ;  RtlRando.013B35F3
77732891   .  C2 0400       retn 4
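
One visible change in this listing compared with the XP one is the seed update: the XP code uses a 32-bit imul, so the product is truncated to 32 bits before the division, while this version uses mul, adc and a 64-bit remainder (_aullrem). Below is a small C sketch of both variants. The function names are hypothetical; the constants are taken from the two listings.

```c
#include <stdint.h>

#define RTL_MUL 0x7FFFFFEDu
#define RTL_ADD 0x7FFFFFC3u
#define RTL_MOD 0x7FFFFFFFu

/* XP listing: imul/add wrap at 32 bits, then div by 7FFFFFFFh. */
uint32_t step_truncated(uint32_t seed)
{
    uint32_t t = (uint32_t)(seed * RTL_MUL + RTL_ADD);
    return t % RTL_MOD;
}

/* Win7 listing: mul/add/adc keep the full 64-bit value, then _aullrem. */
uint32_t step_wide(uint32_t seed)
{
    uint64_t t = (uint64_t)seed * RTL_MUL + RTL_ADD;
    return (uint32_t)(t % RTL_MOD);
}
```

The two functions agree as long as seed * 7FFFFFEDh + 7FFFFFC3h fits in 32 bits (for example seed = 1) and differ once it overflows (for example seed = 2). The listing also loads EDX with the address 777A74E8 right before the lock xadd and the return, which would fit the constant EDX value reported above.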
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: MichaelW on January 30, 2014, 09:20:21 PM
I had no problem with my code working, at least as a console app. But when I tried to do a scatter plot of the return values, I could not because when called from the message loop of a modeless dialog, even though my seed was in the data section, the function would trigger an access violation when it tried to access the seed. I still have no idea why.

I used the function exported by ntoskrnl.exe because the first page in my search results:

http://msdn.microsoft.com/en-us/library/windows/hardware/ff553181(v=vs.85).aspx

Specified ntoskrnl.lib, and since my system has no ntoskrnl.dll I assumed that ntoskrnl.lib was an import library for ntoskrnl.exe. When I linked with the MASM32 import library my app would not start, and the same for the import library that I created, so I used run-time dynamic linking.


Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: jj2007 on January 30, 2014, 11:15:43 PM
Quote from: MichaelW on January 30, 2014, 09:20:21 PM
my system has no ntoskrnl.dll

There is no ntoskrnl.dll AFAIK (and if there is one, it's not from Microsoft (http://www.derkeiler.com/Newsgroups/microsoft.public.windowsxp.security_admin/2006-12/msg00122.html) ;)), just ntoskrnl.exe, but ntdll.dll exports the same stuff for user mode. Which OS are you using? Native APIs are meant for kernel mode...

By the way, I found a way to turn RtlRandomEx into an excellent generator, according to ENT:
  void RtlRandomEx(esi)        ; returns eax
  if SwapTest
        push eax
        void RtlRandomEx(esi)
        pop edx
        bswap eax
        mov ax, dx
  endif

High quality numbers*), although now a factor 3-4 slower than MasmBasic Rand() (http://www.webalice.it/jj2006/MasmBasicQuickReference.htm#Mb1030).
So it seems the problem sits in the higher bits of that DWORD...

*) good on Win7 but not so overwhelming on WinXP, see tests below...
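
Read literally, the SwapTest snippet keeps the low word of the first call and puts the byte-swapped low word of the second call into the upper half, so the high bytes of both raw results are discarded. A C sketch of that bit manipulation follows. The function names are hypothetical, and bswap32 is written out by hand to stay portable.

```c
#include <stdint.h>

/* Equivalent of the x86 bswap instruction: reverse the four bytes. */
uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/*
 * first  = result of the first RtlRandomEx call (pushed, later popped to edx)
 * second = result of the second call (in eax)
 */
uint32_t combine_calls(uint32_t first, uint32_t second)
{
    uint32_t swapped = bswap32(second);                      /* bswap eax  */
    return (swapped & 0xFFFF0000u) | (first & 0x0000FFFFu);  /* mov ax, dx */
}
```

For example, combine_calls(0x11223344, 0xAABBCCDD) yields 0xDDCC3344: only the bytes 33 44 of the first value and CC DD of the second survive.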
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Gunther on January 31, 2014, 05:54:20 AM
Jochen,

Quote from: jj2007 on January 30, 2014, 11:15:43 PM
By the way, I found a way to turn RtlRandomEx into an excellent generator, according to ENT:
  void RtlRandomEx(esi)        ; returns eax
  if SwapTest
        push eax
        void RtlRandomEx(esi)
        pop edx
        bswap eax
        mov ax, dx
  endif


and that'll bring more quality?

Gunther
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: jj2007 on January 31, 2014, 07:09:46 AM
Quote from: Gunther on January 31, 2014, 05:54:20 AM
and that'll bring more quality?

See for yourself:
############ ENT results RtlRandomEx, no swap:
Entropy = 7.951657 bits per byte.

Optimum compression would reduce the size
of this 11468800 byte file by 0 percent.

Chi square distribution for 11468800 samples is 760582.95, and randomly
would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 111.3491 (127.5 = random).
Monte Carlo value for Pi is 3.493440114 (error 11.20 percent).
Serial correlation coefficient is -0.046557 (totally uncorrelated = 0.0).



############ ENT results RtlRandomEx32, with swap:
Entropy = 7.994638 bits per byte.

Optimum compression would reduce the size
of this 11468800 byte file by 0 percent.

Chi square distribution for 11468800 samples is 84827.16, and randomly
would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 127.3251 (127.5 = random).
Monte Carlo value for Pi is 3.142689433 (error 0.03 percent).
Serial correlation coefficient is 0.000224 (totally uncorrelated = 0.0).


############ ENT results Rand():
Entropy = 7.999985 bits per byte.

Optimum compression would reduce the size
of this 11468800 byte file by 0 percent.

Chi square distribution for 11468800 samples is 242.21, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4934 (127.5 = random).
Monte Carlo value for Pi is 3.141929807 (error 0.01 percent).
Serial correlation coefficient is -0.002164 (totally uncorrelated = 0.0).


It is even more evident when only the highest byte is being used:
############ ENT results RtlRandomEx, hibyte only:
Entropy = 6.994909 bits per byte.

Optimum compression would reduce the size
of this 2818048 byte file by 12 percent.

Chi square distribution for 2818048 samples is 2857358.49, and randomly
would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 63.7087 (127.5 = random).
Monte Carlo value for Pi is 4.000000000 (error 27.32 percent).
Serial correlation coefficient is -0.000405 (totally uncorrelated = 0.0).
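
A possible reading of the Pi value of exactly 4.0 above, as a hedged sketch: as far as I understand the ENT documentation, its Monte Carlo test takes each group of six bytes as 24-bit X and Y coordinates and counts a hit when the point lies inside the inscribed circle. If every byte in the file is below 128 (which the mean of 63.7 suggests), every point is a hit. The function name mc_pi_from_bytes is invented for this illustration, and the code is not ENT's actual source.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of a Monte Carlo Pi test on a byte stream: each group of six
   bytes gives a 24-bit X and a 24-bit Y coordinate; a point counts as
   a hit if it lies inside the circle of radius 2^24 - 1. */
double mc_pi_from_bytes(const unsigned char *buf, size_t len)
{
    const uint64_t r  = 0xFFFFFFu;   /* 2^24 - 1 */
    const uint64_t r2 = r * r;
    size_t points = 0, hits = 0;

    for (size_t i = 0; i + 6 <= len; i += 6) {
        uint64_t x = ((uint64_t)buf[i]     << 16) |
                     ((uint64_t)buf[i + 1] <<  8) | buf[i + 2];
        uint64_t y = ((uint64_t)buf[i + 3] << 16) |
                     ((uint64_t)buf[i + 4] <<  8) | buf[i + 5];
        if (x * x + y * y <= r2)
            hits++;
        points++;
    }
    /* Pi is estimated as 4 * hits / points; 0.0 if there were no points. */
    return points ? 4.0 * (double)hits / (double)points : 0.0;
}
```

With all bytes below 128, both coordinates stay below 2^23, so x*x + y*y stays below 2^47 and every point is inside the circle: the estimate becomes exactly 4.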


And voilà, the mystery is solved - I should read the MSDN documentation (http://msdn.microsoft.com/en-us/library/windows/hardware/ff553181%28v=vs.85%29.aspx) more thoroughly:
Quote
RtlRandomEx returns a random number in the range [0..MAXLONG-1]

Bloody MAXLONG is 2^31-1, not 2^32-1... :redface:
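
A small C sketch of why that range hurts the highest byte, assuming MAXLONG is 0x7FFFFFFF as in the Windows headers. The SKETCH_ macros and both function names are invented for this illustration.

```c
#include <stdint.h>

/* Assumption: MAXLONG is 0x7FFFFFFF, so the largest documented
   return value is MAXLONG - 1. */
#define SKETCH_MAXLONG     0x7FFFFFFFu
#define SKETCH_MAX_RETURN  (SKETCH_MAXLONG - 1u)

/* Most significant byte of a 32-bit value. */
unsigned high_byte(uint32_t v)
{
    return (unsigned)(v >> 24);
}

/* Number of distinct values the high byte can take over [0, max]. */
unsigned high_byte_values(uint32_t max)
{
    return high_byte(max) + 1u;
}
```

Over the documented range the high byte can only take 128 of the 256 possible values, i.e. at most 7 bits of entropy with an expected mean near 63.5. That fits the "hibyte only" run above (entropy 6.99, mean 63.7).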

Source is attached, but you need the 31 January version of MasmBasic (http://masm32.com/board/index.php?topic=94.0). Version 2a includes one "hibyte only" run.

Gunther has posted results below that seem to indicate quality has improved on Win7, as compared to XP.
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Gunther on January 31, 2014, 08:27:30 AM
Jochen,

the results:


103 ms  incl. writing the file, M$ RtlRandomEx()
85 ms   without writing

169 ms  incl. writing the file, M$ RtlRandomEx()
165 ms  without writing

11 ms   incl. writing the file, MasmBasic Rand()
8210 µs without writing


############ ENT results RtlRandomEx, no swap:
Entropy = 7.954454 bits per byte.

Optimum compression would reduce the size
of this 11468800 byte file by 0 percent.

Chi square distribution for 11468800 samples is 716481.11, and randomly
would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 111.5233 (127.5 = random).
Monte Carlo value for Pi is 3.485753866 (error 10.95 percent).
Serial correlation coefficient is -0.048975 (totally uncorrelated = 0.0).



############ ENT results RtlRandomEx32, with swap:
Entropy = 7.999985 bits per byte.

Optimum compression would reduce the size
of this 11468800 byte file by 0 percent.

Chi square distribution for 11468800 samples is 233.26, and randomly
would exceed this value 75.00 percent of the times.

Arithmetic mean value of data bytes is 127.5205 (127.5 = random).
Monte Carlo value for Pi is 3.140280811 (error 0.04 percent).
Serial correlation coefficient is -0.000547 (totally uncorrelated = 0.0).


############ ENT results Rand():
Entropy = 7.999985 bits per byte.

Optimum compression would reduce the size
of this 11468800 byte file by 0 percent.

Chi square distribution for 11468800 samples is 242.21, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4934 (127.5 = random).
Monte Carlo value for Pi is 3.141929807 (error 0.01 percent).
Serial correlation coefficient is -0.002164 (totally uncorrelated = 0.0).


Quote from: jj2007 on January 31, 2014, 07:09:46 AM
Source is attached but it might not assemble with the current online version of MasmBasic.

so you've secretly changed MasmBasic.  ::)

Gunther
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: jj2007 on January 31, 2014, 09:09:49 AM
Quote from: Gunther on January 31, 2014, 08:27:30 AM
so you've secretly changed MasmBasic.  ::)
I am constantly improving it. See line 56 of the source - the Alias is new, I needed it to avoid "already defined" errors:
  Dll "ntoskrnl.exe"
  Declare RtlRandomExNtos, 1 Alias "RtlRandomEx"

(don't use RtlRandomExNtos, it is a native API and crashes in user mode)

Besides, instead of calling GetProcAddress for every use of the "declared" function, it's now called only once, which is much faster, of course.
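
The "resolve once, reuse afterwards" idea can be sketched in portable C. This is not MasmBasic's implementation: slow_lookup is a hypothetical stand-in for GetProcAddress, dummy_rand is a placeholder for the imported function, and lookup_calls exists only to show how often the lookup runs.

```c
#include <stddef.h>

typedef unsigned long (*rand_fn)(unsigned long *seed);

/* Counts how often the (hypothetical) lookup is performed. */
int lookup_calls = 0;

/* Placeholder for the imported function: one simple LCG step. */
unsigned long dummy_rand(unsigned long *seed)
{
    *seed = *seed * 1103515245UL + 12345UL;
    return *seed;
}

/* Hypothetical stand-in for GetProcAddress: pretend this is expensive. */
rand_fn slow_lookup(const char *name)
{
    (void)name;
    lookup_calls++;
    return dummy_rand;
}

/* Resolve the pointer on first use, then reuse the cached copy. */
unsigned long call_cached(unsigned long *seed)
{
    static rand_fn cached = NULL;
    if (cached == NULL)
        cached = slow_lookup("RtlRandomEx");
    return cached(seed);
}
```

However often call_cached is used, the lookup runs only on the first call; every later call is a plain indirect call through the stored pointer.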
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: Gunther on January 31, 2014, 10:39:30 AM
Jochen,

thank you for the clarification.

Gunther
Title: Re: Monte Carlo Simulation with RDRAND (32 bit)
Post by: FORTRANS on February 01, 2014, 01:28:41 AM
Hi,

   I ran Michael's code posted in Reply #30 on my XP laptop.
Here are the results (more or less; it did not like being piped to
a file, so it was reformatted).

mean  : 4035C48Ch

98675356
100855971
99656494
97367006
101346368
97367067
99983817
103363294
103799423
97585151
Press any key to continue ...


Regards,

Steve N.