Hi,
I'm testing a mergesort and would like to get some timings. Does anybody have a bright idea how to test if a DWORD sort is stable??
AMD Athlon(tm) Dual Core Processor 4450B (MMX, SSE, SSE2, SSE3)
174 ms for MbMs stable mergesort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
163 ms for ArraySort radixsort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
159 ms for nrQsortA unstable quicksort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
176 ms for MbMs
162 ms for ArraySort
163 ms for nrQsortA
177 ms for MbMs
161 ms for ArraySort
159 ms for nrQsortA
206 bytes for MbMs
Intel(R) Celeron(R) M CPU 420 @ 1.60GHz (MMX, SSE, SSE2, SSE3)
Array created
252 ms for MbMs stable mergesort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
231 ms for ArraySort radixsort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
179 ms for nrQsortA unstable quicksort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
251 ms for MbMs
230 ms for ArraySort
179 ms for nrQsortA
252 ms for MbMs
231 ms for ArraySort
180 ms for nrQsortA
prescott w/htt
Intel(R) Pentium(R) 4 CPU 3.00GHz (MMX, SSE, SSE2, SSE3)
229 ms for MbMs stable mergesort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
240 ms for ArraySort radixsort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
203 ms for nrQsortA unstable quicksort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
234 ms for MbMs
236 ms for ArraySort
204 ms for nrQsortA
232 ms for MbMs
237 ms for ArraySort
203 ms for nrQsortA
JJ,
QuoteDoes anybody have a bright idea how to test if a DWORD sort is stable??
For testing a small set, consider constructing the input data as two WORDS. In the most significant (in the DWORD, i,e, the second WORD of the WORD pair), place the random values, and use the same most significant WORD value for two adjacent entries. Fill the array with 2^(31-2) values (32768 DWORD pairs). Then in the least significant WORD (in the DWORD, i,e, the first WORD of the WORD pair), fill the least significant WORDS with incrementing values.
After the DWORD sort, check that the the least significant WORDs of each pair of DWORDS have ascending values.
If you want a larger set to thrash the sort, then use QWORDS with DWORD pairs in each QWORD.
Dave.
yah - i can only tell you that i start with small sets and work up
in fact, i start with 0 elements, then 1, then 2......
i want to make sure those special cases work correctly
but - as you add more and more elements, you start to gain confidence
you can watch the end pointers and number of passes and see that everything is working as it should
Jochen,
Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SS
E4.2, AVX)
105 ms for MbMs stable mergesort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
98 ms for ArraySort radixsort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
74 ms for nrQsortA unstable quicksort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
95 ms for MbMs
98 ms for ArraySort
76 ms for nrQsortA
95 ms for MbMs
98 ms for ArraySort
75 ms for nrQsortA
206 bytes for MbMs
Gunther
You can cheat with a simple integer sort if you have enough memory OR the number range is not too large. Allocate an array that covers the number range then write each number to its array index. Limited to memory but really fast. Once its written to the array, you just scan the array in whatever direction you want and only count the indexes that are not zero. If you want to preserve duplicates just increment each array member.
I have tested these long ago and they really kick arse.
I should have added, with conventional sorts a quick sort with an insertion sort to handle the smaller recursion ranges has the best overall performance. This fixes a defect in an unprotected quick sort that has ordered sequences as data input.
i think bubble sort gets a bad rap because a lot of test code might completely reverse the data
that might be a good test to prove the code is working correctly, of course
but it would tend to make a bubble sort look very slow because it has to walk each elemennt up and down the array
on randomly ordered data of small or medium length, a bubble sort might fair a little better :P
Give this a try and let me know how it compares :)
I did extensive testing some time ago, it should be absolutely stable.
Edit: oops, picked up test version, attachment replaced.
apologies to the person that downloaded the bad one.
Edit #2. I removed the file as it had a bug with the string processor. I will repost in projects when fixed.
A bubble sort is fast on nearly ordered data sets, but terribly slow on random data, at its worst with a reverse ordered data set. They are OK on very small ranges which make them useful as trimmers on recursive sorts that get into trouble with ordered data sets but an insertion sort is faster still in the same context. A variation of a bubble sort is a double ender (sometimes called a cocktail shaker sort), it sweeps back and forth instead of repeatedly sweeping from one end. They are still much slower than many of the better known sorts.
In practical terms, creat a billion random numbers and attack it with a bubble sort and the computer may wear out before it finishes. :biggrin:
Out of curiosity, I ran your routine in my integer test program.
It looks like the problems that pop up during sorted and random tests are just related to handling negative numbers.
When the error pops up, just hit cancel.
Quote from: jimg on May 15, 2014, 02:28:59 AM
Out of curiosity, I ran your routine in my integer test program.
It looks like the problems that pop up during sorted and random tests are just related to handling negative numbers.
Yes indeed - line 119 should be jg @CopyFrom2nd
Thanks, Jim, your test program helped me a lot :t
Your merge sort is very fast indeed (but unfortunately not suitable for my purpose of string sorting :().
Thanks for testing. Just out of curiosity, what was it you needed for string sorts? I admit I haven't looked at this stuff in five years, so the string sorter may be broken.
Quote from: jj2007 on May 14, 2014, 12:56:57 AM
Hi,
I'm testing a mergesort and would like to get some timings. Does anybody have a bright idea how to test if a DWORD sort is stable??
AMD Athlon(tm) Dual Core Processor 4450B (MMX, SSE, SSE2, SSE3)
174 ms for MbMs stable mergesort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
163 ms for ArraySort radixsort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
159 ms for nrQsortA unstable quicksort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
176 ms for MbMs
162 ms for ArraySort
163 ms for nrQsortA
177 ms for MbMs
161 ms for ArraySort
159 ms for nrQsortA
206 bytes for MbMs
Intel(R) Celeron(R) M CPU 420 @ 1.60GHz (MMX, SSE, SSE2, SSE3)
Array created
252 ms for MbMs stable mergesort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
231 ms for ArraySort radixsort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
179 ms for nrQsortA unstable quicksort
1540 4579 6732 7739 8850 11433 13177 14595 16114 21143 ...
2147472975 2147473026 2147475594 2147479400 2147480271
251 ms for MbMs
230 ms for ArraySort
179 ms for nrQsortA
252 ms for MbMs
231 ms for ArraySort
180 ms for nrQsortA
JJ,
Please explain what you mean by a DWORD sort that you are trying to verify if it is stable. Is it a string or array that contains a DWORD key, or is it just an array of DWORDS?
Dave.
Dave & Jim,
It's an array of pointers and string lengths, i.e. qwords.
It seems I have solved the problem in the meantime, but it'll take some time to fully implement it. I want to use it for an Excel-type tabfile reader, with multiple sort criteria (that's why it has to be stable).
It will also do a simple string sort, of course, and it's already competitive (timings for Windows.inc):
16547 µs for QSort
21102 µs for assort
jj,
Well, my second solution should serve to check for stability. Assuming that the data records contain two DWORDS (a QWORD) then set two DWORD keys to the same random value, and then set the other DWORD in each QWORD to an incrementing value, then select a new random value for the keys and save the next two keys the the same value ans continue incrementing the value to save in the second DWORD of each record. After the sort has ordered the records, check that pairs of matching DWORD keys have an ascending second DWORD. The second field does not have to be another DWORD, but needs to be at least large enough to contain the incremented second value. Note that this is a requirement to insure that if there were more than 2 keys with the same value, that they were correctly positioned in the output with ascending second DWORD values. If there would be only 2 DWORDS with matching keys (that is, never more than 2 consecutive DWORDS with the same key value), then a single byte could be used to contain a 0 and a 1 value and repeat the 0 and 1 for all other second DWORD pairs, but this would not completely test the stability. Essentially, if you only had a DWORD array and wanted to sort it by DWORDS, then you would have to be able to color the DWORDS Red and Green in order to check stability, but this is ridiculous because the entire DWORD is the entire record and if two DWORDS match, then it doesn't matter whether the sort is stable. Stability is only important when other data is present along with the key data.
Dave.
MasmBasic stable string sort is online - see http://masm32.com/board/index.php?topic=3231.0
Cheers, JJ
Jochen,
Quote
Your merge sort is very fast indeed (but unfortunately not suitable for my purpose of string sorting
What about to write a procedure (or ?)
to generate random strings of any length between LenMin to LenMax ? Do you know something about it ?
Quote from: RuiLoureiro on May 29, 2014, 05:58:58 AM
What about to write a procedure (or ?) to generate random strings of any length between LenMin to LenMax ? Do you know something about it ?
I can offer Scramble:
include \masm32\MasmBasic\MasmBasic.inc ; download (http://masm32.com/board/index.php?topic=94.0)
Init
Recall "\Masm32\include\Windows.inc", L$()
push eax
NanoTimer()
xor ecx, ecx
.Repeat
Scramble L$() ; un-sort the array
QSort L$()
Scramble L$() ; un-sort the array
QSortDesc L$()
inc ecx
.Until ecx>=50
pop ecx
Inkey Str$("Sorting 100 x %i lines", ecx), Str$(" took %i ms", NanoTimer(ms))
Exit
end startOutput:
Sorting 100 x 26902 lines took 2518 ms