News:

Masm32 SDK description, downloads and other helpful links
Message to All Guests

Main Menu

Create a file with RANDOM dwords

Started by jj2007, September 29, 2014, 07:48:21 AM

Previous topic - Next topic

jj2007

For use with all kinds of sorting algos - executable and source attached.

include \masm32\MasmBasic\MasmBasic.inc      ; download
  Init
  mov ecx, Val(Input$("Rand(): #elements to create: ", "1000000"))
  .if signed ecx>0
      shl ecx, 2      ; get DWORDs
      NanoTimer()
      Let edi=New$(ecx)
      push edi      ; we need it later
      push ecx
      .Repeat
            void Rand(-1)
            stosd
            sub ecx, DWORD
      .Until Sign?
      pop ecx
      pop edi
      Let esi=Str$("Creating %i random DWORDs took ", ecx/DWORD)+Str$("%i ms\n", NanoTimer(ms))
      Open "O", #1, "RandomDwords.dat"
      PrintBuffer #1, edi, ecx
      Close #1
      Inkey esi, CrLf$, 'To read back the data into a buffer,', CrLf$, 'MasmBasic fans may use', CrLf$, 'Let esi=FileRead$("RandomDwords.dat")', CrLf$, 'otherwise check Masm32 InputFile'
  .endif
  Exit
end start

guga

Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

jj2007

Looks OK:
0       3080
1       9158
2       13464
3       15479
4       17700
5       22867
6       26355
7       29191
8       32229
9       42286
999991  4294921291
999992  4294921704
999993  4294929912
999994  4294934235
999995  4294945952
999996  4294946053
999997  4294951189
999998  4294958802
999999  4294960543



include \masm32\MasmBasic\MasmBasic.inc      ; download
  Init
  Dim MyDw() As DWORD
  ArrayRead MyDw(), "SortedDwords.dat"
  push eax      ; #elements
  xor ecx, ecx
  .Repeat
      mov eax, stack
      sub eax, 10
      .if ecx<10 || ecx>eax
            PrintLine Str$(ecx), Str$("\t%u", MyDw(ecx))
      .endif
      inc ecx
  .Until ecx>=stack
  pop eax
  Inkey "ok?"
  Exit
end start


And the timings?

guga

Tks, i´m uploading now the one from nrQsortA to compare. As i explained the resultatn data does not seems to be accurated. In fact, it is unsorted when feeding it with 1 million of random values. This explains why it took only a couple of secs to sort.

It seems like nrQsortA needs a review, because the sort it is being breaked when feeded with 1 million random values. Also, if i try to feed it with the already sorted data, nrQsortA ruins the result and produce unsorted values. There is a problem in the middle of the nrQsortA function causing it to stop and producing a wrong result.

Masm Result when feeded with RandomDwords.dat (Also, the same result is retrieved when you feede it with SortedDwords.dat):
Quote0   2147484709
1   2147490471
2   2147491291
3   2147491787
4   2147501199
5   2147513906
6   2147518668
7   2147519738
8   2147520028
9   2147521313
999991   2147439241
999992   2147439798
999993   2147440421
999994   2147450671
999995   2147450920
999996   2147454024
999997   2147454425
999998   2147476822
999999   2147481168
ok?

Guga result when feeded with RandomDwords.dat (This is the same correct result when feeded with SortedDwords.dat)
Quote0   3080
1   9158
2   13464
3   15479
4   17700
5   22867
6   26355
7   29191
8   32229
9   42286
999991   4294921291
999992   4294921704
999993   4294929912
999994   4294934235
999995   4294945952
999996   4294946053
999997   4294951189
999998   4294958802
999999   4294960543
ok?

I also gave a test on nrQsortA feeding it with an already sorted data (SortedDwords.dat), and even if it took only 16 milisecs to the function finishes, it completelly ruined the result, and changed. I didn´t analyzed nrQsortA to see why it is being breaked (perhaps, i´m porting it incorrecty to RosAsm)
On the other hand, on my algo, even if i pass a already sorted file it takes almost nothing to sort, n fact, if i feed it with SortedDwords.dat it took also only 16 milisecs to the algo finishes and the result is the same as in the loaded file.


Btw....did you tried to feed your algo (ArraySort) on this file (RandomDwords.dat) to test how much time it takes to parse ? Do you have the timmings of it ?


P.S: About the timmings, i´ve got this (i7 CPU 870, 2.93 Ghz)

QuoteGuga Algo 1 = 723718 miliseconds = 12.061966667 minutes
Guga Algo 1 = 719438 miliseconds = 11.990633333 minutes

Guga Algo 2 = 707640 miliseconds = 11.794 minutes


Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

jj2007

Hi Gustavo,

I don't understand what you mean with "wrong results". They do look different, but it's because your algo sorts unsigned values, while mine sorts signed DWORDs:

#       signed  unsigned hex (sorted by Guga)
     0  3080    3080    00000C08
     1  9158    9158    000023C6
     2  13464   13464   00003498
     3  15479   15479   00003C77
     4  17700   17700   00004524
     5  22867   22867   00005953
     6  26355   26355   000066F3
..
999994  -33061  4294934235      FFFF7EDB
999995  -21344  4294945952      FFFFACA0
999996  -21243  4294946053      FFFFAD05
999997  -16107  4294951189      FFFFC115
999998  -8494   4294958802      FFFFDED2
999999  -6753   4294960543      FFFFE59F

#       signed          unsigned        hex (sorted by ArraySort)
     0  -2147482587     2147484709      80000425
     1  -2147476825     2147490471      80001AA7
     2  -2147476005     2147491291      80001DDB
     3  -2147475509     2147491787      80001FCB
     4  -2147466097     2147501199      8000448F
     5  -2147453390     2147513906      80007632
     6  -2147448628     2147518668      800088CC
..
999994  2147450671      2147450671      7FFF7F2F
999995  2147450920      2147450920      7FFF8028
999996  2147454024      2147454024      7FFF8C48
999997  2147454425      2147454425      7FFF8DD9
999998  2147476822      2147476822      7FFFE556
999999  2147481168      2147481168      7FFFF650


Re timings: ArraySort() takes 110 ms on my trusty old Celeron. If you provide your algo in Masm format, I'll try to put it into a testbed.

guga

Hi JJ

I didn´t tested your algo yet, i used as a test the one from Masm lib (nrQsortA from qsorta.asm)

But, i´m getting confused, because on small chuncks  nrQsortA takes more time to parse, according to my benchmarks tests.

Can you test my algo ? I´m not understanding what´s happenning, because on a real file (SortedDwords.dat), my algo takes absurd 12 minutes, while nrQsortA takes 94 ms ????


My algo in masm, should be like this, i believe:


Guga_Insort proc Array:DWORD, ArrayLen:DWORD


push edi
push ebx
mov ebx, 1
mov edi, Array
mov eax, ebx
sub eax, ArrayLen
jge OutFunction

MainLoop:
mov ecx, [edi+ebx*4]

InternalLoop:
mov eax, [edi+ebx*4-4]
cmp eax, ecx
jbe OutInternalLoop
mov [edi+ebx*4], eax
dec ebx
jnz InternalLoop

OutInternalLoop:
mov [edi+ebx*4], ecx
inc ebx
cmp ebx, ArrayLen
jb MainLoop

OutFunction:
pop ebx
pop edi
ret ; <--- Is Endp in masm already using a retn ??? If it does, then, this ret must be removed
Guga_Insort endp


I´m not sure about the ret instruction in masm. In RosAsm "endp" is a macro that already contains a retn. So, if ret is ok in my attemtp to port, keep it, otherwise, just remove the ret.

I disassembled my own algo, if it makes easy to port to Masm.

Guga_insort proc near
Array = dword ptr  8
ArrayLen = dword ptr  0Ch

push ebp
mov ebp, esp
push edi
push ebx
mov ebx, 1
mov edi, [ebp+Array]
mov eax, ebx
sub eax, [ebp+ArrayLen]
jge short loc_4069FA

loc_4069DC:
mov ecx, [edi+ebx*4]

loc_4069DF:
mov eax, [edi+ebx*4-4]
cmp eax, ecx
jbe short loc_4069ED
mov [edi+ebx*4], eax
dec ebx
jnz short loc_4069DF

loc_4069ED:
mov [edi+ebx*4], ecx
inc ebx
cmp ebx, [ebp+ArrayLen]
jb loc_4069DC

loc_4069FA:
pop ebx
pop edi
mov esp, ebp
pop ebp
retn 8
insort endp

Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

guga

Btw, here are some benchmarks i´m testing.

It seems that my algo (Insort optimized) is only faster for small chuncks of elements. The difference starts when there are more then 20 elements of the array. Up to that amount, nrQsortA  is faster

For a accurated result. I inserted before each one of the algos, a function called "RandomGen", that copies to a buffer the array elements on the original order before the algos starts. This is to avoid that after, using the algo, the array is sorted, and the next algo will use this already sorted array and not a unsorted one.



Note: The file BenchMark_100Elements_NoGen.zip shows what happens when RandomGen is not being used. My algo parses 10 times faster when he array is already sorted, while nrQsortA tries to sort all over again, even if they are already sorted
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

jj2007

Hi Gustavo,
I've added your algo to the testbed, but it seems there is quadratic behaviour present. Even when taking only 2% of the elements used by the other algos, it is slower, sorry :(

Guga_Insort proc uses edi ebx Array:DWORD, ArrayLen:DWORD
mov ebx, 1
mov edi, Array
mov eax, ebx
sub eax, ArrayLen
jge OutFunction

MainLoop:
mov ecx, [edi+ebx*4]

InternalLoop:
mov eax, [edi+ebx*4-4]
cmp eax, ecx
jbe OutInternalLoop
mov [edi+ebx*4], eax
dec ebx
jnz InternalLoop

OutInternalLoop:
mov [edi+ebx*4], ecx
inc ebx
cmp ebx, ArrayLen
jb MainLoop

OutFunction:
ret
Guga_Insort endp

guga

Damn...it was too good to be true  :icon_mrgreen:

I´ll give a test on dualPivotQuicksort to see the results.

Btw.does your algo (or another from masmlib) sorts unsigned integers ?
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

guga

(...)Edited. JJ, If you read this post before i edited, forget it. I suceeded to make masmbasic work properly :)
I forgot to also rename Jwasm to mlv615.exe and not only renaming it with ml.exe  :t
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

jj2007

Hi Gustavo,

Just checked, it seems the minimum content of \Masm32\bin\ is this:

ml.exe       ; can be 6.15 ... 10.0, or a renamed JWasm.exe (better)
link.exe     ; the old 5.12 linker that comes with Masm32
mspdb50.dll  ; the old linker needs this one


hutch--

Guga,

At an algo level the only difference between a signed and unsigned compare is the jump you use after the compare.

guga

#12
Tks you for the feedback guys

I fixed the algo and mixed with other one i´building

The results are:

Intel(R) Core(TM) i7 CPU         870  @ 2.93GHz (MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2)
Sorting 2000000 items:
DwRange=-1
224 ms for ArraySort MasmBasic
202 ms for nrQsortA Masm32 lib
388 ms for crt_qsort with callback
176 ms for Gugasedgesort
DwRange=-1
232 ms for ArraySort MasmBasic
208 ms for nrQsortA Masm32 lib
390 ms for crt_qsort with callback
174 ms for Gugasedgesort
DwRange=-1
221 ms for ArraySort MasmBasic
202 ms for nrQsortA Masm32 lib
390 ms for crt_qsort with callback
171 ms for Gugasedgesort
ok


JJ, can you see if the masm syntax of the algorithm is correct ?

If my porting to masm are correct, it is 15% faster then  nrQsortA. Probably i can optimize it a bit more if i succeed to remove the recursion and merge the functions i used. I´ll also finish the dualpivot (tri partition quicksort) to compare the results

I´m analyzing mine algos and see if i can reach something around 80-100 ms (Probably this timing can be reached if i suceed to make dualpivot algorithm work as expected and mix it with the other algos i built: insort and partial_quickersort)
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

guga

Btw, I forgot

Steve, which/Where change the JCC instruction to make nrQsortA to sort unsigned ?

I would like to compare the timmings for the algos works on the same way (I mean, unsorted and sorted) to check for the differences
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

guga

I´m almost there at 80 ms  :icon_mrgreen:

Updated gugasedgesort
Can someone test the results for accuracy ?

Btw...JJ, if the results are accurated, then the 120 ms can be even more faster, once i remove the recursion. I estimate a gain of 30% (at least) of speed without recursion
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com