Author Topic: Floating point PRNG  (Read 3320 times)

AW

  • Member
  • *****
  • Posts: 2583
  • Let's Make ASM Great Again!
Floating point PRNG
« on: September 16, 2018, 01:01:45 AM »
It is probably buried somewhere, but I could not find any ASM floating point pseudo-random number generator (PRNG). I did find many ASM integer PRNGs, though  :t
Actually, it is very easy to build a floating point PRNG from an integer one, using what we know (or might know) about IEEE 754.
That is the main purpose of this essay, but I will also use the opportunity to present the results as REAL8, REAL4 and HALF with 15, 6 and 3 significant digits respectively (YES, printf does have a little-known capability for that).

And this time I will have no mercy for people with computers that are too old (older than Ivy Bridge), sorry. I will use rdrand for the random numbers, and vcvtps2ph and vcvtph2ps for the HALFs, in order to make the code shorter.

Code:
REAL8 Value (15 significant digits)=-9650.29779824708
REAL4 Value (6 significant digits)=-9650.3
HALF Value (3 significant digits)=-9.65e+003


REAL8 Value (15 significant digits)=-40373.2896508689
REAL4 Value (6 significant digits)=-40373.3
HALF Value (3 significant digits)=-4.04e+004


REAL8 Value (15 significant digits)=-22474.0836345981
REAL4 Value (6 significant digits)=-22474.1
HALF Value (3 significant digits)=-2.25e+004


REAL8 Value (15 significant digits)=43495.2071411472
REAL4 Value (6 significant digits)=43495.2
HALF Value (3 significant digits)=4.35e+004


REAL8 Value (15 significant digits)=42173.4249726214
REAL4 Value (6 significant digits)=42173.4
HALF Value (3 significant digits)=4.22e+004


REAL8 Value (15 significant digits)=10021.6594674816
REAL4 Value (6 significant digits)=10021.7
HALF Value (3 significant digits)=1e+004


REAL8 Value (15 significant digits)=14000.1614663608
REAL4 Value (6 significant digits)=14000.2
HALF Value (3 significant digits)=1.4e+004


REAL8 Value (15 significant digits)=7151.36409892466
REAL4 Value (6 significant digits)=7151.36
HALF Value (3 significant digits)=7.15e+003


REAL8 Value (15 significant digits)=16185.3460413327
REAL4 Value (6 significant digits)=16185.3
HALF Value (3 significant digits)=1.62e+004


REAL8 Value (15 significant digits)=-58318.6575650965
REAL4 Value (6 significant digits)=-58318.7
HALF Value (3 significant digits)=-5.83e+004


REAL8 Value (15 significant digits)=21377.3863558798
REAL4 Value (6 significant digits)=21377.4
HALF Value (3 significant digits)=2.14e+004


REAL8 Value (15 significant digits)=62768.6210416745
REAL4 Value (6 significant digits)=62768.6
HALF Value (3 significant digits)=6.28e+004


REAL8 Value (15 significant digits)=-7787.06495612929
REAL4 Value (6 significant digits)=-7787.06
HALF Value (3 significant digits)=-7.79e+003


REAL8 Value (15 significant digits)=42013.0045911184
REAL4 Value (6 significant digits)=42013
HALF Value (3 significant digits)=4.2e+004


REAL8 Value (15 significant digits)=-61322.6968192349
REAL4 Value (6 significant digits)=-61322.7
HALF Value (3 significant digits)=-6.13e+004


REAL8 Value (15 significant digits)=27707.4044548888
REAL4 Value (6 significant digits)=27707.4
HALF Value (3 significant digits)=2.77e+004


REAL8 Value (15 significant digits)=-4169.53884418745
REAL4 Value (6 significant digits)=-4169.54
HALF Value (3 significant digits)=-4.17e+003


REAL8 Value (15 significant digits)=21556.1687919651
REAL4 Value (6 significant digits)=21556.2
HALF Value (3 significant digits)=2.16e+004


REAL8 Value (15 significant digits)=54650.2552792932
REAL4 Value (6 significant digits)=54650.3
HALF Value (3 significant digits)=5.47e+004


REAL8 Value (15 significant digits)=29722.4088189263
REAL4 Value (6 significant digits)=29722.4
HALF Value (3 significant digits)=2.97e+004
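
The source for the output above is not shown in the thread, so the following is only a minimal C sketch of the idea, under stated assumptions. splitmix64 stands in for rdrand as the integer source. The ±65504 limit (the largest finite HALF) is a guess based on the sample values. No real HALF conversion is done; `%.3g` only mimics the digit count. All function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in integer generator (splitmix64); the original post uses rdrand. */
uint64_t splitmix64(uint64_t *state) {
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Put the top 52 random bits into the REAL8 mantissa under exponent 3FFh:
   the pattern is a double in [1.0, 2.0); subtracting 1.0 gives [0.0, 1.0). */
double u64_to_unit_double(uint64_t r) {
    uint64_t bits = (r >> 12) | 0x3FF0000000000000ULL;
    double d;
    memcpy(&d, &bits, sizeof d);   /* reinterpret the bits without aliasing problems */
    return d - 1.0;
}

/* Map [0.0, 1.0) to [-limit, +limit). */
double scale_symmetric(double unit, double limit) {
    return (unit * 2.0 - 1.0) * limit;
}

/* Print one value with 15, 6 and 3 significant digits via printf's %.Ng.
   Assumption: the HALF line is only imitated, no half conversion happens. */
void print_three_precisions(double v) {
    printf("REAL8 Value (15 significant digits)=%.15g\n", v);
    printf("REAL4 Value (6 significant digits)=%.6g\n", (double)(float)v);
    printf("HALF Value (3 significant digits)=%.3g\n", v);
}
```

A call such as `print_three_precisions(scale_symmetric(u64_to_unit_double(splitmix64(&state)), 65504.0))` prints one triple in the same layout; the exponent width (e+003 versus e+03) depends on the C runtime.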

HSE

  • Member
  • *****
  • Posts: 1379
  • <AMD>< 7-32>
Re: Floating point PRNG
« Reply #1 on: September 16, 2018, 01:12:03 AM »
ObjAsm32 RNG  :t

AW

Re: Floating point PRNG
« Reply #2 on: September 16, 2018, 02:16:40 AM »

HSE

Re: Floating point PRNG
« Reply #3 on: September 16, 2018, 02:30:47 AM »
Exactly!!

Later:
Just in case, try the 32-bit version.
I think the 64-bit version only removes some macros that are implemented as HLL in JWASM derivatives, but I'm not sure.

As I wrote in the previous post: 32 bit

Siekmanski

  • Member
  • *****
  • Posts: 2330
Re: Floating point PRNG
« Reply #4 on: September 16, 2018, 08:22:14 AM »
I use these 2 fast and small hacks to calculate random real4 values.
A:  between 0.0 and 1.0
B:  between -1.0 and 1.0

Code:
.const
Fl32_1      real4 1.0
Fl32_3      real4 3.0

.data
align 4
Seed        dd 476954562   ; initialize once with a seed value, can be anything except 0
MagicRnd    dd 16807
Scale       real4 255.0

.code
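;A
    mov     eax,Seed
    mul     MagicRnd            ; edx:eax = Seed * 16807 (edx is clobbered)
    mov     Seed,eax            ; keep the low 32 bits as the new seed
    shr     eax,9               ; top 23 bits become the mantissa
    or      eax,03f800000h      ; exponent for [1.0, 2.0)
    mov     dword ptr [esp-4],eax   ; scratch dword just below the stack pointer
    movss   xmm0,dword ptr [esp-4]
    subss   xmm0,Fl32_1         ; result = a random real4 in [0.0, 1.0)
;   mulss   xmm0,Scale          ; optional: scale to a range of your choice

;B
    mov     eax,Seed
    mul     MagicRnd            ; edx:eax = Seed * 16807 (edx is clobbered)
    mov     Seed,eax            ; keep the low 32 bits as the new seed
    shr     eax,9               ; top 23 bits become the mantissa
    or      eax,040000000h      ; exponent for [2.0, 4.0)
    mov     dword ptr [esp-4],eax   ; scratch dword just below the stack pointer
    movss   xmm0,dword ptr [esp-4]
    subss   xmm0,Fl32_3         ; result = a random real4 in [-1.0, 1.0)
;   mulss   xmm0,Scale          ; optional: scale to a range of your choice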
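
For readers who prefer to see the bit trick outside of assembly, here is a rough C model of hacks A and B. It is a sketch only, not a replacement for the code above. The function names are hypothetical, and `memcpy` stands in for the store/load through memory.

```c
#include <stdint.h>
#include <string.h>

/* One step of the generator from the post: seed = seed * 16807 (mod 2^32). */
uint32_t lcg_next(uint32_t *seed) {
    *seed *= 16807u;
    return *seed;
}

/* Reinterpret a 32-bit pattern as a REAL4. */
float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hack A: 23 random mantissa bits under exponent 127 give [1.0, 2.0);
   subtracting 1.0 moves that to [0.0, 1.0). */
float rand_unit(uint32_t *seed) {
    return bits_to_float((lcg_next(seed) >> 9) | 0x3F800000u) - 1.0f;
}

/* Hack B: the same bits under exponent 128 give [2.0, 4.0);
   subtracting 3.0 moves that to [-1.0, 1.0). */
float rand_signed(uint32_t *seed) {
    return bits_to_float((lcg_next(seed) >> 9) | 0x40000000u) - 3.0f;
}
```

Because only 23 bits are used, each variant can return at most 2^23 distinct values, and the upper bound (1.0) is never reached.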
 
Creative coders use backward thinking techniques as a strategy.

Siekmanski

Re: Floating point PRNG
« Reply #5 on: September 16, 2018, 05:30:48 PM »
New fully commented code, producing up to 4 random real4 values at once.
Still have to check if the distribution is satisfactory....

Code:
.const

Shuffle MACRO V0,V1,V2,V3
    EXITM %((V0 shl 6) or (V1 shl 4) or (V2 shl 2) or (V3))
ENDM

align 16
Range12     dd    03f800000h,03f800000h,03f800000h,03f800000h
Range24     dd    040000000h,040000000h,040000000h,040000000h
MagicRnd    dd    16807,16807,16807,16807
Fl32_1      real4 1.0,1.0,1.0,1.0
Fl32_3      real4 3.0,3.0,3.0,3.0

.data
align 16
Seed        dd    476954562,473954562,471954562,479954562   ; initialize once with seed values
Scale       real4 255.0,255.0,255.0,255.0

.code

A:
    movdqa  xmm0,oword ptr Seed         ; Get the 4 seeds
    movdqa  xmm2,oword ptr MagicRnd     ; Get the 4 MagicRnds
    pshufd  xmm1,xmm0,Shuffle(0,1,2,3)  ; shuffle the second multiplication in place
    pmuludq xmm0,xmm2                   ; Save the first pair qword multiply results
    pmuludq xmm1,xmm2                   ; Save the second pair qword multiply results
    shufps  xmm0,xmm1,Shuffle(2,0,2,0)  ; Save the low 32 bit parts from the 4 qwords
    movdqa  oword ptr Seed,xmm0         ; Save the new 4 Seeds for the next run
    psrld   xmm0,9                      ; Shift them to the fractional parts
    orps    xmm0,oword ptr Range12      ; Generate 4 real4 random numbers between 1.0 and 2.0
    subps   xmm0,oword ptr Fl32_1       ; Set the ranges of the 4 values between 0.0 and 1.0
;    mulps   xmm0,oword ptr Scale       ; if you want to scale the values up to a range of your choice
    ret

B:
    movdqa  xmm0,oword ptr Seed         ; Get the 4 seeds
    movdqa  xmm2,oword ptr MagicRnd     ; Get the 4 MagicRnds
    pshufd  xmm1,xmm0,Shuffle(0,1,2,3)  ; shuffle the second multiplication in place
    pmuludq xmm0,xmm2                   ; Save the first pair qword multiply results
    pmuludq xmm1,xmm2                   ; Save the second pair qword multiply results
    shufps  xmm0,xmm1,Shuffle(2,0,2,0)  ; Save the low 32 bit parts from the 4 qwords
    movdqa  oword ptr Seed,xmm0         ; Save the new 4 Seeds for the next run
    psrld   xmm0,9                      ; Shift them to the fractional parts
    orps    xmm0,oword ptr Range24      ; Generate 4 real4 random numbers between 2.0 and 4.0
    subps   xmm0,oword ptr Fl32_3       ; Set the ranges of the 4 values between -1.0 and 1.0
;    mulps   xmm0,oword ptr Scale       ; if you want to scale the values up to a range of your choice
    ret

EDIT: One pmuludq is not enough to get 4 random values.
         We need 2 pmuludq and must keep the low 32 bits of each of the 4 qwords.

      REPLACED THE OLD CODE WITH THIS NEW CODE !!!!!  :icon_redface:
« Last Edit: September 16, 2018, 11:26:48 PM by Siekmanski »

AW

Re: Floating point PRNG
« Reply #6 on: September 16, 2018, 10:14:58 PM »
Good job  :t

Still have to check if the distribution is satisfactory....

Maybe not, it is a Lehmer random number generator  :(.  At least it is not as good as rdrand, which is certified by NSA  :badgrin:

Siekmanski

Re: Floating point PRNG
« Reply #7 on: September 16, 2018, 11:11:29 PM »
It was not reliable enough because of the behaviour of "pmuludq".
I had to split it up into 2 parts to get the full 32-bit range for the 4 seeds (I didn't notice it before....)
See Reply #5 for the new code.

This will be enough for audio and graphics programming, and it's fast.  :biggrin:

jj2007

  • Member
  • *****
  • Posts: 10548
  • Assembler is fun ;-)
    • MasmBasic
Re: Floating point PRNG
« Reply #8 on: September 16, 2018, 11:34:55 PM »
It was not reliable enough because of the behaviour of "pmuludq"

Yes, it's pretty confusing ;-)

Multiplies the first operand (destination operand) by the second operand (source operand) and stores the result in the destination operand. The source operand can be a unsigned doubleword integer stored in the low doubleword of an MMX™ technology register or a 64-bit memory location, or it can be two packed unsigned doubleword integers stored in the first (low) and third doublewords of an XMM register or an 128-bit memory location. The destination operand can be a unsigned doubleword integer stored in the low doubleword an MMX register or two packed doubleword integers stored in the first and third doublewords of an XMM register. When packed doubleword operands are used, a SIMD multiply is performed on two sets of values, producing two results. When a quadword result is too large to be represented in 64 bits (overflow), the result is wrapped around and the low 64 bits are written to the destination element (that is, the carry is ignored).
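
A scalar C model may make the description easier to follow. It is a sketch under stated assumptions. `vec4u32` is a hypothetical stand-in for an XMM register. `mul4_low32` mirrors the pshufd / pmuludq / pmuludq / shufps sequence from Reply #5, and it assumes that all four multipliers are equal, as they are there.

```c
#include <stdint.h>

typedef struct { uint32_t d[4]; } vec4u32;   /* models one XMM register as 4 dwords */

/* Scalar model of PMULUDQ with XMM operands: only dwords 0 and 2 are multiplied,
   and each 64-bit product fills one dword pair (0-1 and 2-3) of the result. */
vec4u32 pmuludq_model(vec4u32 a, vec4u32 b) {
    uint64_t p0 = (uint64_t)a.d[0] * b.d[0];
    uint64_t p1 = (uint64_t)a.d[2] * b.d[2];
    vec4u32 r = {{ (uint32_t)p0, (uint32_t)(p0 >> 32),
                   (uint32_t)p1, (uint32_t)(p1 >> 32) }};
    return r;
}

/* Four 32-bit products from two PMULUDQ steps: the second step works on a
   reversed copy, so dwords 3 and 1 land in the multiplied positions 0 and 2. */
vec4u32 mul4_low32(vec4u32 seed, vec4u32 mult) {
    vec4u32 rev  = {{ seed.d[3], seed.d[2], seed.d[1], seed.d[0] }}; /* pshufd, 1Bh */
    vec4u32 even = pmuludq_model(seed, mult);   /* products of dwords 0 and 2 */
    vec4u32 odd  = pmuludq_model(rev,  mult);   /* products of dwords 3 and 1 */
    /* shufps, 88h: keep only the low dword of each qword product */
    vec4u32 r = {{ even.d[0], even.d[2], odd.d[0], odd.d[2] }};
    return r;
}
```

Note that the results come back in the lane order 0, 2, 3, 1. For a random generator this only means the four streams rotate between lanes on every call.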

AW

Re: Floating point PRNG
« Reply #9 on: September 17, 2018, 02:13:41 AM »
Yes, it's pretty confusing ;-)
You need a drawing. Try searching with Google and click the Images tab.  :idea:

jj2007

Re: Floating point PRNG
« Reply #10 on: September 17, 2018, 02:21:59 AM »
You need a drawing

My red highlighting above is enough. It's pretty simple once you've understood it.

Caché GB

  • Member
  • **
  • Posts: 97
  • MASM IS HOT
Re: Floating point PRNG
« Reply #11 on: September 25, 2018, 12:37:04 PM »
Hi Siekmanski.

Thank you very much for your random generator for 4 x real4 values.
This is awesome for game programming, just like your 15 at once timers from here:

http://masm32.com/board/index.php?topic=7060.48  (MultimediaTimers.zip)

Using jj2007's formula from here

http://masm32.com/board/index.php?topic=7419. (Reply #2)

to fill the seed

Code:
AutoSetRandom proc uses ebx esi edi   ; cpuid clobbers ebx; esi and edi are callee-saved

          mov  edi, offset Seed
          xor  esi, esi
    .Repeat

       invoke  Sleep, 1        ; leave the time slice

          xor  eax, eax        ; select a defined cpuid leaf
        cpuid                  ; serialise
        rdtsc
          mov  byte ptr[edi+esi], al   ; low TSC byte -> next seed byte
          inc  esi

     .Until( esi > 15 )        ; 16 bytes = 4 dword seeds

          ret

AutoSetRandom  endp


Thank you jj2007 also for this code.
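
One thing to watch when seeding from timer bytes: the generator in this thread multiplies by an odd constant modulo 2^32, so a zero seed stays zero forever, and an even seed never loses its trailing zero bits, which shortens the cycle. A small C sketch of one possible guard follows. Forcing the low bit is my own suggestion, not something from the posts above, and the function names are hypothetical.

```c
#include <stdint.h>

/* Force a raw seed to be odd, and therefore nonzero (suggested guard, an assumption). */
uint32_t sanitize_seed(uint32_t raw) {
    return raw | 1u;
}

/* One step of the generator used in the thread: seed * 16807 (mod 2^32). */
uint32_t lehmer32_step(uint32_t seed) {
    return seed * 16807u;
}
```

In the assembly version the equivalent would be an `or` with 1 on each of the four seed dwords after the fill loop.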

Caché GB's 1 and 0-nly language:MASM

johnsa

  • Member
  • ****
  • Posts: 807
    • Uasm
Re: Floating point PRNG
« Reply #12 on: September 28, 2018, 06:31:10 PM »
For Crypto RNG, rdrand.. yep.. but it's slow as heck.

For PRNG, the best option at the moment is XoroShiro128+
Its distribution and statistical characteristics are excellent, performance is fantastic, sub-nanosecond per result.
I've been using this approach with much success over standard PRNG algos both in terms of quality and performance.

Another alternative, depending on your application, is to look at using low-discrepancy sequences like Halton, which in some cases can provide much better results than a PRNG (such as convergence rates etc).
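
To make both suggestions concrete, here is a short C sketch, not johnsa's code. The xoroshiro128+ step uses the rotation/shift constants 24, 16, 37 as I recall them from the 2018 public reference implementation, so check them against the authors' source before relying on this. The Halton function is the usual radical inverse. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t s[2]; } xoroshiro128p;   /* state must not be all zero */

static inline uint64_t rotl64(uint64_t x, int k) {
    return (x << k) | (x >> (64 - k));
}

/* One xoroshiro128+ step: the output is the sum of the two state words. */
uint64_t xoroshiro128p_next(xoroshiro128p *g) {
    uint64_t s0 = g->s[0];
    uint64_t s1 = g->s[1];
    uint64_t result = s0 + s1;
    s1 ^= s0;
    g->s[0] = rotl64(s0, 24) ^ s1 ^ (s1 << 16);
    g->s[1] = rotl64(s1, 37);
    return result;
}

/* Top 53 bits -> REAL8 in [0.0, 1.0). */
double xoroshiro128p_double(xoroshiro128p *g) {
    return (double)(xoroshiro128p_next(g) >> 11) * (1.0 / 9007199254740992.0);
}

/* Halton point: radical inverse of index (1, 2, 3, ...) in the given base. */
double halton(uint32_t index, uint32_t base) {
    assert(base >= 2);
    double f = 1.0, r = 0.0;
    while (index > 0) {
        f /= base;
        r += f * (index % base);
        index /= base;
    }
    return r;
}
```

For 2D sampling the usual choice is `halton(i, 2)` for x and `halton(i, 3)` for y, with i starting at 1.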

Siekmanski

Re: Floating point PRNG
« Reply #13 on: September 28, 2018, 08:05:20 PM »
Another fast one is the PCG Family: http://www.pcg-random.org/

jj2007

Re: Floating point PRNG
« Reply #14 on: September 28, 2018, 08:57:25 PM »
For PRNG, the best option at the moment is XoroShiro128+

As Marinus mentioned already, there is PCG32. They claim it's better than XoroShiro128, as proven with PractRand. It's a long and controversial issue, though.
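
For comparison, a C sketch of PCG32 that follows the minimal version published on pcg-random.org as far as I remember it. The float conversion at the end is my own addition in the spirit of this thread, and the names are hypothetical.

```c
#include <stdint.h>

typedef struct { uint64_t state; uint64_t inc; } pcg32;   /* inc selects the stream, must be odd */

/* One PCG32 (XSH-RR) step: advance a 64-bit LCG, then xorshift and rotate the old state. */
uint32_t pcg32_next(pcg32 *g) {
    uint64_t old = g->state;
    g->state = old * 6364136223846793005ULL + (g->inc | 1u);
    uint32_t xorshifted = (uint32_t)(((old >> 18) ^ old) >> 27);
    uint32_t rot = (uint32_t)(old >> 59);
    return (xorshifted >> rot) | (xorshifted << ((32u - rot) & 31u));
}

/* Seed: put the stream id into inc, then mix in the seed value. */
void pcg32_seed(pcg32 *g, uint64_t seed, uint64_t stream) {
    g->state = 0;
    g->inc = (stream << 1) | 1u;
    (void)pcg32_next(g);
    g->state += seed;
    (void)pcg32_next(g);
}

/* Top 24 bits -> REAL4 in [0.0, 1.0); assumption: my own conversion, not part of PCG. */
float pcg32_float(pcg32 *g) {
    return (float)(pcg32_next(g) >> 8) * (1.0f / 16777216.0f);
}
```

Multiplying 24 bits by 2^-24 gives one more bit of resolution than the 23-bit mantissa trick earlier in the thread, at the cost of an integer-to-float conversion.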