Print Page - Create a file with RANDOM dwords

Title: Create a file with RANDOM dwords
Post by: jj2007 on September 29, 2014, 07:48:21 AM

For use with all kinds of sorting algos - executable and source attached.

include \masm32\MasmBasic\MasmBasic.inc ; download (http://masm32.com/board/index.php?topic=94.0)
Init
mov ecx, Val(Input$("Rand(): #elements to create: ", "1000000"))
.if signed ecx>0
shl ecx, 2 ; get DWORDs
NanoTimer()
Let edi=New$(ecx)
push edi ; we need it later
push ecx
.Repeat
void Rand(-1)
stosd
sub ecx, DWORD
.Until Sign?
pop ecx
pop edi
Let esi=Str$("Creating %i random DWORDs took ", ecx/DWORD)+Str$("%i ms\n", NanoTimer(ms))
Open "O", #1, "RandomDwords.dat"
PrintBuffer #1, edi, ecx
Close #1
Inkey esi, CrLf$, 'To read back the data into a buffer,', CrLf$, 'MasmBasic fans may use', CrLf$, 'Let esi=FileRead$("RandomDwords.dat")', CrLf$, 'otherwise check Masm32 InputFile'
.endif
Exit
end start

Title: Re: Create a file with RANDOM dwords
Post by: guga on September 29, 2014, 08:32:57 AM

Thank you. Here is the result

Title: Re: Create a file with RANDOM dwords
Post by: jj2007 on September 29, 2014, 09:04:15 AM

Looks OK:
0 3080
1 9158
2 13464
3 15479
4 17700
5 22867
6 26355
7 29191
8 32229
9 42286
999991 4294921291
999992 4294921704
999993 4294929912
999994 4294934235
999995 4294945952
999996 4294946053
999997 4294951189
999998 4294958802
999999 4294960543

include \masm32\MasmBasic\MasmBasic.inc ; download (http://masm32.com/board/index.php?topic=94.0)
Init
Dim MyDw() As DWORD
ArrayRead (http://www.webalice.it/jj2006/MasmBasicQuickReference.htm#Mb1179) MyDw(), "SortedDwords.dat"
push eax ; #elements
xor ecx, ecx
.Repeat
mov eax, stack
sub eax, 10
.if ecx<10 || ecx>eax
PrintLine Str$(ecx), Str$("\t%u", MyDw(ecx))
.endif
inc ecx
.Until ecx>=stack
pop eax
Inkey "ok?"
Exit
end start

And the timings?

Title: Re: Create a file with RANDOM dwords
Post by: guga on September 29, 2014, 10:00:57 AM

Tks, i´m uploading now the one from nrQsortA to compare. As i explained the resultatn data does not seems to be accurated. In fact, it is unsorted when feeding it with 1 million of random values. This explains why it took only a couple of secs to sort.

It seems like nrQsortA needs a review, because the sort it is being breaked when feeded with 1 million random values. Also, if i try to feed it with the already sorted data, nrQsortA ruins the result and produce unsorted values. There is a problem in the middle of the nrQsortA function causing it to stop and producing a wrong result.

Masm Result when feeded with RandomDwords.dat (Also, the same result is retrieved when you feede it with SortedDwords.dat):

Quote0   2147484709
1   2147490471
2   2147491291
3   2147491787
4   2147501199
5   2147513906
6   2147518668
7   2147519738
8   2147520028
9   2147521313
999991   2147439241
999992   2147439798
999993   2147440421
999994   2147450671
999995   2147450920
999996   2147454024
999997   2147454425
999998   2147476822
999999   2147481168
ok?

Guga result when feeded with RandomDwords.dat (This is the same correct result when feeded with SortedDwords.dat)

Quote0   3080
1   9158
2   13464
3   15479
4   17700
5   22867
6   26355
7   29191
8   32229
9   42286
999991   4294921291
999992   4294921704
999993   4294929912
999994   4294934235
999995   4294945952
999996   4294946053
999997   4294951189
999998   4294958802
999999   4294960543
ok?

I also gave a test on nrQsortA feeding it with an already sorted data (SortedDwords.dat), and even if it took only 16 milisecs to the function finishes, it completelly ruined the result, and changed. I didn´t analyzed nrQsortA to see why it is being breaked (perhaps, i´m porting it incorrecty to RosAsm)
On the other hand, on my algo, even if i pass a already sorted file it takes almost nothing to sort, n fact, if i feed it with SortedDwords.dat it took also only 16 milisecs to the algo finishes and the result is the same as in the loaded file.

Btw....did you tried to feed your algo (ArraySort) on this file (RandomDwords.dat) to test how much time it takes to parse ? Do you have the timmings of it ?

P.S: About the timmings, i´ve got this (i7 CPU 870, 2.93 Ghz)

QuoteGuga Algo 1 = 723718 miliseconds = 12.061966667 minutes
Guga Algo 1 = 719438 miliseconds = 11.990633333 minutes

Guga Algo 2 = 707640 miliseconds = 11.794 minutes

Title: Re: Create a file with RANDOM dwords
Post by: jj2007 on September 29, 2014, 06:23:33 PM

Hi Gustavo,

I don't understand what you mean with "wrong results". They do look different, but it's because your algo sorts unsigned values, while mine sorts signed DWORDs:

# signed unsigned hex (sorted by Guga)
0 3080 3080 00000C08
1 9158 9158 000023C6
2 13464 13464 00003498
3 15479 15479 00003C77
4 17700 17700 00004524
5 22867 22867 00005953
6 26355 26355 000066F3
..
999994 -33061 4294934235 FFFF7EDB
999995 -21344 4294945952 FFFFACA0
999996 -21243 4294946053 FFFFAD05
999997 -16107 4294951189 FFFFC115
999998 -8494 4294958802 FFFFDED2
999999 -6753 4294960543 FFFFE59F

# signed unsigned hex (sorted by ArraySort)
0 -2147482587 2147484709 80000425
1 -2147476825 2147490471 80001AA7
2 -2147476005 2147491291 80001DDB
3 -2147475509 2147491787 80001FCB
4 -2147466097 2147501199 8000448F
5 -2147453390 2147513906 80007632
6 -2147448628 2147518668 800088CC
..
999994 2147450671 2147450671 7FFF7F2F
999995 2147450920 2147450920 7FFF8028
999996 2147454024 2147454024 7FFF8C48
999997 2147454425 2147454425 7FFF8DD9
999998 2147476822 2147476822 7FFFE556
999999 2147481168 2147481168 7FFFF650

Re timings: ArraySort() takes 110 ms on my trusty old Celeron. If you provide your algo in Masm format, I'll try to put it into a testbed.

Title: Re: Create a file with RANDOM dwords
Post by: guga on September 30, 2014, 01:41:20 AM

Hi JJ

I didn´t tested your algo yet, i used as a test the one from Masm lib (nrQsortA from qsorta.asm)

But, i´m getting confused, because on small chuncks nrQsortA takes more time to parse, according to my benchmarks tests.

Can you test my algo ? I´m not understanding what´s happenning, because on a real file (SortedDwords.dat), my algo takes absurd 12 minutes, while nrQsortA takes 94 ms ????

My algo in masm, should be like this, i believe:

Code Select


Guga_Insort	proc Array:DWORD, ArrayLen:DWORD


		push	edi
		push	ebx
		mov	ebx, 1
		mov	edi, Array
		mov	eax, ebx
		sub	eax, ArrayLen
		jge	OutFunction

MainLoop:
		mov	ecx, [edi+ebx*4]

InternalLoop:
		mov	eax, [edi+ebx*4-4]
		cmp	eax, ecx
		jbe	OutInternalLoop
		mov	[edi+ebx*4], eax
		dec	ebx
		jnz	InternalLoop

OutInternalLoop:
		mov	[edi+ebx*4], ecx
		inc	ebx
		cmp	ebx, ArrayLen
		jb	MainLoop

OutFunction:
		pop	ebx
		pop	edi
		ret ; <--- Is Endp in masm already using a retn ??? If it does, then, this ret must be removed
Guga_Insort endp

I´m not sure about the ret instruction in masm. In RosAsm "endp" is a macro that already contains a retn. So, if ret is ok in my attemtp to port, keep it, otherwise, just remove the ret.

I disassembled my own algo, if it makes easy to port to Masm.

Code Select


Guga_insort		proc near	
Array		= dword	ptr  8
ArrayLen	= dword	ptr  0Ch

		push	ebp
		mov	ebp, esp
		push	edi
		push	ebx
		mov	ebx, 1
		mov	edi, [ebp+Array]
		mov	eax, ebx
		sub	eax, [ebp+ArrayLen]
		jge	short loc_4069FA

loc_4069DC:
		mov	ecx, [edi+ebx*4]

loc_4069DF:
		mov	eax, [edi+ebx*4-4]
		cmp	eax, ecx
		jbe	short loc_4069ED
		mov	[edi+ebx*4], eax
		dec	ebx
		jnz	short loc_4069DF

loc_4069ED:
		mov	[edi+ebx*4], ecx
		inc	ebx
		cmp	ebx, [ebp+ArrayLen]
		jb	loc_4069DC

loc_4069FA:
		pop	ebx
		pop	edi
		mov	esp, ebp
		pop	ebp
		retn	8
insort		endp

Title: Re: Create a file with RANDOM dwords
Post by: guga on September 30, 2014, 02:23:17 AM

Btw, here are some benchmarks i´m testing.

It seems that my algo (Insort optimized) is only faster for small chuncks of elements. The difference starts when there are more then 20 elements of the array. Up to that amount, nrQsortA is faster

For a accurated result. I inserted before each one of the algos, a function called "RandomGen", that copies to a buffer the array elements on the original order before the algos starts. This is to avoid that after, using the algo, the array is sorted, and the next algo will use this already sorted array and not a unsorted one.

Note: The file BenchMark_100Elements_NoGen.zip shows what happens when RandomGen is not being used. My algo parses 10 times faster when he array is already sorted, while nrQsortA tries to sort all over again, even if they are already sorted

Title: Re: Create a file with RANDOM dwords
Post by: jj2007 on September 30, 2014, 02:54:50 AM

Hi Gustavo,
I've added your algo to the testbed, but it seems there is quadratic behaviour present. Even when taking only 2% of the elements used by the other algos, it is slower, sorry :(

Code Select

Guga_Insort proc uses edi ebx Array:DWORD, ArrayLen:DWORD
	mov ebx, 1
	mov edi, Array
	mov eax, ebx
	sub eax, ArrayLen
	jge OutFunction

MainLoop:
	mov ecx, [edi+ebx*4]

InternalLoop:
	mov eax, [edi+ebx*4-4]
	cmp eax, ecx
	jbe OutInternalLoop
	mov [edi+ebx*4], eax
	dec ebx
	jnz InternalLoop

OutInternalLoop:
	mov [edi+ebx*4], ecx
	inc ebx
	cmp ebx, ArrayLen
	jb MainLoop

OutFunction:
	ret
Guga_Insort endp

Title: Re: Create a file with RANDOM dwords
Post by: guga on September 30, 2014, 05:03:38 AM

Damn...it was too good to be true :icon_mrgreen:

I´ll give a test on dualPivotQuicksort to see the results.

Btw.does your algo (or another from masmlib) sorts unsigned integers ?

Title: Re: Create a file with RANDOM dwords
Post by: guga on September 30, 2014, 02:27:42 PM

(...)Edited. JJ, If you read this post before i edited, forget it. I suceeded to make masmbasic work properly :)
I forgot to also rename Jwasm to mlv615.exe and not only renaming it with ml.exe :t

Title: Re: Create a file with RANDOM dwords
Post by: jj2007 on September 30, 2014, 05:45:43 PM

Hi Gustavo,

Just checked, it seems the minimum content of \Masm32\bin\ is this:

ml.exe ; can be 6.15 ... 10.0, or a renamed JWasm.exe (better)
link.exe ; the old 5.12 linker that comes with Masm32
mspdb50.dll ; the old linker needs this one

Title: Re: Create a file with RANDOM dwords
Post by: hutch-- on September 30, 2014, 07:16:31 PM

Guga,

At an algo level the only difference between a signed and unsigned compare is the jump you use after the compare.

Title: Re: Create a file with RANDOM dwords
Post by: guga on September 30, 2014, 11:32:49 PM

Tks you for the feedback guys

I fixed the algo and mixed with other one i´building

The results are:

Code Select


Intel(R) Core(TM) i7 CPU         870  @ 2.93GHz (MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2)
Sorting 2000000 items:
DwRange=-1
224 ms for ArraySort	MasmBasic
202 ms for nrQsortA	Masm32 lib
388 ms for crt_qsort	with callback
176 ms for Gugasedgesort
DwRange=-1
232 ms for ArraySort	MasmBasic
208 ms for nrQsortA	Masm32 lib
390 ms for crt_qsort	with callback
174 ms for Gugasedgesort
DwRange=-1
221 ms for ArraySort	MasmBasic
202 ms for nrQsortA	Masm32 lib
390 ms for crt_qsort	with callback
171 ms for Gugasedgesort
ok

JJ, can you see if the masm syntax of the algorithm is correct ?

If my porting to masm are correct, it is 15% faster then nrQsortA. Probably i can optimize it a bit more if i succeed to remove the recursion and merge the functions i used. I´ll also finish the dualpivot (tri partition quicksort) to compare the results

I´m analyzing mine algos and see if i can reach something around 80-100 ms (Probably this timing can be reached if i suceed to make dualpivot algorithm work as expected and mix it with the other algos i built: insort and partial_quickersort)

Title: Re: Create a file with RANDOM dwords
Post by: guga on October 01, 2014, 01:15:38 AM

Btw, I forgot

Steve, which/Where change the JCC instruction to make nrQsortA to sort unsigned ?

I would like to compare the timmings for the algos works on the same way (I mean, unsorted and sorted) to check for the differences

Title: Re: Create a file with RANDOM dwords
Post by: guga on October 01, 2014, 02:37:14 AM

I´m almost there at 80 ms :icon_mrgreen:

Updated gugasedgesort
Can someone test the results for accuracy ?

Btw...JJ, if the results are accurated, then the 120 ms can be even more faster, once i remove the recursion. I estimate a gain of 30% (at least) of speed without recursion

Title: Re: Create a file with RANDOM dwords
Post by: jj2007 on October 01, 2014, 03:56:50 AM

Definitely fast...

Intel(R) Celeron(R) M CPU 420 @ 1.60GHz (MMX, SSE, SSE2, SSE3)
Sorting 2000000 items:
DwRange=-1
495 ms for ArraySort MasmBasic
457 ms for nrQsortA Masm32 lib
878 ms for crt_qsort with callback
248 ms for Gugasedgesort
DwRange=-1
499 ms for ArraySort MasmBasic
387 ms for nrQsortA Masm32 lib
820 ms for crt_qsort with callback
249 ms for Gugasedgesort
DwRange=-1
488 ms for ArraySort MasmBasic
388 ms for nrQsortA Masm32 lib
837 ms for crt_qsort with callback
264 ms for Gugasedgesort

Title: Re: Create a file with RANDOM dwords
Post by: guga on October 01, 2014, 11:00:12 AM

Wahoo :icon_mrgreen: :icon_mrgreen:

Now, i´ll try to make it even faster.. I´m converting the dualpivot as i explained, and will later, remove the recursion on the gugasedgesort algo. I´m positive that gugasedgesort can be improved when removing the recursion and perhaps have a gain of 30% (or more) of it´s speed.

Also, once i made the tests on dualpivot, i´ll compare the results, and see if i can mix between gugasedgesort in order to force sedgesort to work with a tri partition technique. I hope i can make it work, because i´[m trying to achieve a speed of 40-80 ms on my I7 (Which means something around 120- 160 on your Celeron)

Btw, i´ll need to name the algo. "gugasedgesort" or GugaSomethign" is not exactly an algo´s name :icon_mrgreen:. I´m thinking in naming it as UltraSort or LightSpeed :icon_mrgreen:

Title: Re: Create a file with RANDOM dwords
Post by: MichaelW on October 01, 2014, 11:15:49 AM

For what it's worth, if I take the Microsoft qsort source that was distributed with my Server 2003 PSDK and hack it to sort 32-bit integers, the result is essentially twice as fast as the original (1021/511 on my P3, but only 125/78 on my Core-i3). The function is non-recursive, but does byte operations for everything (the explanation for this being to avoid alignment problems), and as you know calls an external comparison function from the inner loop (so it can be rigged to handle any data type). My modifications replaced the byte operations with integer operations, replaced the comparison function with a macro that is expanded inline, and did the same for the internal swap procedure.

I doubt that Microsoft would want me distributing their source file, hacked or otherwise, so I included only the object module in the attachment, along with my C test source and the EXE, compiled with the Visual C++ Toolkit 2003 compiler and /O2 /G6 optimizations.

Title: Re: Create a file with RANDOM dwords
Post by: guga on October 01, 2014, 11:52:41 AM

MichaelW, M$ qsort code is distributed for free for research. Take a look at the Windows Research Kernel source code (If i remember well, it is in one of the M$ pages. You need to register in order to dl). Essentially what is inside msvcrt (client and server) can be way more improved as you did, but, still, are slower then the ones i´m testing/implementing.

Also, if i remember well, the source code for qsort uses cdecl calling convention, while a regular stdcall would be enough. If your modifications removed the recursion, there seems be no need for using cdecl. If you try to change to stdcall will it be a bit faster ? And you could, as well, remove the 5 alignments inside the code.

If i can be able to make this algos even faster, i can then, try to add a 3rd argument to be used as the "compare" argument of qsort, but, i´ll need to make all of this faster before give a try to add another argument. Using recursion, proved to be a bad way on those algos, since it slow down the speed, but, in some cases, we have no other way to go (well...at least, i didn´t touched the algo to remove the recursion yet)

Btw...can u test the benchmark i posted ? I would like to check the timmings on your P3

Title: Re: Create a file with RANDOM dwords
Post by: Gunther on October 01, 2014, 06:57:21 PM

Gustavo,

the results from your last test program:

Code Select


Intel(R) Xeon(R) CPU           X3330  @ 2.66GHz (MMX, SSE, SSE2, SSE3, SSSE3, SS
E4.1)
Sorting 2000000 items:
DwRange=-1
392 ms for ArraySort    MasmBasic
296 ms for nrQsortA     Masm32 lib
517 ms for crt_qsort    with callback
172 ms for Gugasedgesort
DwRange=-1
387 ms for ArraySort    MasmBasic
312 ms for nrQsortA     Masm32 lib
498 ms for crt_qsort    with callback
154 ms for Gugasedgesort
DwRange=-1
386 ms for ArraySort    MasmBasic
299 ms for nrQsortA     Masm32 lib
553 ms for crt_qsort    with callback
194 ms for Gugasedgesort
ok

Gunther

Title: Re: Create a file with RANDOM dwords
Post by: MichaelW on October 01, 2014, 10:15:36 PM

Quote from: guga on October 01, 2014, 11:52:41 AM
Btw...can u test the benchmark i posted ? I would like to check the timmings on your P3

These benchmarks?

10
Results 8 pass average
timing nrQsortA (Masm Lib) 26150 ms
timing sedgesort Guga6995 ms
timing insort Guga6316 ms
20
Results 8 pass average
timing nrQsortA (Masm Lib) 41865 ms
timing sedgesort Guga24775 ms
timing insort Guga24013 ms
30
Results 8 pass average
timing nrQsortA (Masm Lib) 84504 ms
timing sedgesort Guga55358 ms
timing insort Guga54557 ms
52
Results 8 pass average
timing nrQsortA (Masm Lib) 134079 ms
timing sedgesort Guga149397 ms
timing insort Guga146562 ms
100
Results 8 pass average
timing nrQsortA (Masm Lib) 289919 ms
timing sedgesort Guga472609 ms
timing insort Guga471819 ms
100 NoGen
Results 8 pass average
timing nrQsortA (Masm Lib) 143974 ms
timing sedgesort Guga9573 ms
timing insort Guga8752 ms

The above took hours on a 500MHz P3.

Title: Re: Create a file with RANDOM dwords
Post by: guga on October 01, 2014, 11:02:53 PM

Tks guys :)

Btw...MichaelW, Did you tested the last one (ArraySortTimingsDword6a.zip ) ? For your results, i guess you tested the version that had problems with quadratic behaviour. Do´t use the ones from "NoGen". Those are the ones that had errors.

I fixed this on the last version (ArraySortTimingsDword6a.zip)

Can u test on a P3 this version below that i modified for your P3 ? I included only my algo to make the benchmark run faster. I would like to compare the results on a old P3.

Title: Re: Create a file with RANDOM dwords
Post by: MichaelW on October 01, 2014, 11:17:33 PM

ArraySortTimingsDword6a.zip and ArraySortTimingsDword6c.exe, both apparently using MasmBasic, require SSE2 support so they won't run on a P3.

Title: Re: Create a file with RANDOM dwords
Post by: nidud on October 03, 2014, 01:04:14 AM

deleted

Title: Re: Create a file with RANDOM dwords
Post by: Gunther on October 03, 2014, 02:04:44 AM

Hi nidud,

the results:

Code Select


41      18467   6334    26500   19169   15724   11478   29358   26962   24464
5705    28145   23281   16827   9961    491     2995    11942   4827    5436


qsort:
41      491     2995    4827    5436    5705    6334    9961    11478   11942
15724   16827   18467   19169   23281   24464   26500   26962   28145   29358

qsortmod:
41      491     2995    4827    5436    5705    6334    9961    11478   11942
15724   16827   18467   19169   23281   24464   26500   26962   28145   29358

sortdd:
41      491     2995    4827    5436    5705    6334    9961    11478   11942
15724   16827   18467   19169   23281   24464   26500   26962   28145   29358

qsort:          3100
qsortmod:       540
sortdd:         480

Gunther

Title: Re: Create a file with RANDOM dwords
Post by: hutch-- on October 03, 2014, 11:01:12 AM

There is an alternative to exchange based sorts if you know the range from lowest to highest value and it can be fitted into memory. They were long ago called "letter box" sorts and I have heard them called "pidgeon hole" sorts but the technique is much the same. If just for example you were working on a numeric range of 1 million, you allocate an array of 4 million bytes zero filled then scan through the data incrementing each array member that matches the number scanned as it is found.

Once the first scan is completed you then scan the target array and for non zero members and write them back to the original source sorted. If the range you are working on is limited, this technique will kill an exchange sort dead, they are so much faster.

Title: Re: Create a file with RANDOM dwords
Post by: jj2007 on October 03, 2014, 02:30:03 PM

Quote from: hutch-- on October 03, 2014, 11:01:12 AM
There is an alternative to exchange based sorts if you know the range from lowest to highest value and it can be fitted into memory. They were long ago called "letter box" sorts and I have heard them called "pidgeon hole" sorts ... If the range you are working on is limited, this technique will kill an exchange sort dead, they are so much faster.

Kind of bucket sorts, yes. My ArraySort() (http://www.webalice.it/jj2006/MasmBasicQuickReference.htm#Mb1176) uses a variant called radix sort, which has no memory problems and beats nrQsortA up to a range of 400,000 or so (and crt qsort has no chance even for the full dword range).

Re the timings above: They are so sensational that you should get a patent for "sorting small arrays repeatedly". But check the timings for "sorting a fat array once" ;)

Title: Re: Create a file with RANDOM dwords
Post by: Gunther on October 03, 2014, 07:06:07 PM

Jochen,

Quote from: jj2007 on October 03, 2014, 02:30:03 PM
They are so sensational that you should get a patent for "sorting small arrays repeatedly". But check the timings for "sorting a fat array once" ;)

these are very different things.

Gunther

Title: Re: Create a file with RANDOM dwords
Post by: jj2007 on October 03, 2014, 10:02:35 PM

Yes indeed. How is the weather in Berlin?

Title: Re: Create a file with RANDOM dwords
Post by: Gunther on October 04, 2014, 12:16:43 AM

Jochen,

Quote from: jj2007 on October 03, 2014, 10:02:35 PM
Yes indeed. How is the weather in Berlin?

the weather is fine here: sunny and a warm wind. Good weather for a public holiday.

But back to the thread. Qsort needs in any case a good center for the data. That's necessary for such Divide et impera (divide and conquer) algorithm. Sedgwick has an own chapter for that point.

Gunther

The MASM Forum

General => The Laboratory => Topic started by: jj2007 on September 29, 2014, 07:48:21 AM