Author Topic: Timings for fast median algo  (Read 1809 times)

TimoVJL

  • Member
  • ****
  • Posts: 586
Re: Timings for fast median algo
« Reply #15 on: July 24, 2020, 06:21:14 PM »
Code: [Select]
159     bytes for MbMedianP
AMD Ryzen 5 3400G with Radeon Vega Graphics     (MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX)
Precision loss is calculated versus the 'real' median, which is obtained by sorting

10 numbers generated
0 ms to find the median of 10 items, which is 30169.25
0 ms to find the median of 10 items, which is 30169.25
0 ms to find the median of 10 items, which is 30169.25
0 ms for sorting, the real median is 30168.43
Speed gain is a factor 0.0191, precision loss is 0.00275%

101 numbers generated
0 ms to find the median of 101 items, which is 27143.74
0 ms to find the median of 101 items, which is 27143.74
0 ms to find the median of 101 items, which is 27143.74
0 ms for sorting, the real median is 27147.66
Speed gain is a factor 2.67, precision loss is -0.0145%

1000000 numbers generated
7 ms to find the median of 1 Mio items, which is 26572.96
11 ms to find the median of 1 Mio items, which is 26572.96
8 ms to find the median of 1 Mio items, which is 26572.96
100 ms for sorting, the real median is 26567.51
Speed gain is a factor 10.8, precision loss is 0.0205%

10000001 numbers generated
91 ms to find the median of 10 Mio items, which is 26505.42
86 ms to find the median of 10 Mio items, which is 26505.42
86 ms to find the median of 10 Mio items, which is 26505.42
1051 ms for sorting, the real median is 26510.57
Speed gain is a factor 12.0, precision loss is -0.0194%

100000010 numbers generated
933 ms to find the median of 100 Mio items, which is 26505.42
929 ms to find the median of 100 Mio items, which is 26505.42
931 ms to find the median of 100 Mio items, which is 26505.42
9461 ms for sorting, the real median is 26510.47
Speed gain is a factor 10.2, precision loss is -0.0190%
hit any key
May the source be with you

jj2007

  • Member
  • *****
  • Posts: 10861
  • Assembler is fun ;-)
    • MasmBasic
Re: Timings for fast median algo
« Reply #16 on: July 24, 2020, 06:52:29 PM »
It´s working for integers, real4 and negative too ?

Yes, but I still have to adapt it for other formats. Right now it handles only REAL4, negative numbers included.

It's MbMedianP in the source.

Siekmanski

  • Member
  • *****
  • Posts: 2453
Re: Timings for fast median algo
« Reply #17 on: July 24, 2020, 07:48:30 PM »
Code: [Select]
159     bytes for MbMedianP
Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz (MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, S
SE4.2, AVX)
Precision loss is calculated versus the 'real' median, which is obtained by sort
ing

10 numbers generated
0 ms to find the median of 10 items, which is 30169.25
0 ms to find the median of 10 items, which is 30169.25
0 ms to find the median of 10 items, which is 30169.25
0 ms for sorting, the real median is 30168.43
Speed gain is a factor 0.126, precision loss is 0.00275%

101 numbers generated
0 ms to find the median of 101 items, which is 27143.74
0 ms to find the median of 101 items, which is 27143.74
0 ms to find the median of 101 items, which is 27143.74
0 ms for sorting, the real median is 27147.66
Speed gain is a factor 2.10, precision loss is -0.0145%

1000000 numbers generated
4 ms to find the median of 1 Mio items, which is 26572.96
4 ms to find the median of 1 Mio items, which is 26572.96
4 ms to find the median of 1 Mio items, which is 26572.96
105 ms for sorting, the real median is 26567.51
Speed gain is a factor 23.5, precision loss is 0.0205%

10000001 numbers generated
41 ms to find the median of 10 Mio items, which is 26505.42
41 ms to find the median of 10 Mio items, which is 26505.42
41 ms to find the median of 10 Mio items, which is 26505.42
1160 ms for sorting, the real median is 26510.57
Speed gain is a factor 28.1, precision loss is -0.0194%

100000010 numbers generated
421 ms to find the median of 100 Mio items, which is 26505.42
420 ms to find the median of 100 Mio items, which is 26505.42
416 ms to find the median of 100 Mio items, which is 26505.42
10341 ms for sorting, the real median is 26510.47
Speed gain is a factor 24.6, precision loss is -0.0190%
hit any key
Creative coders use backward thinking techniques as a strategy.

guga

  • Member
  • *****
  • Posts: 1378
  • Assembly is a state of art.
    • RosAsm
Re: Timings for fast median algo
« Reply #18 on: July 24, 2020, 07:54:49 PM »
For what i saw we need to previously know the max and minimum values, right ?

Btw...one small fix on Marinus version (perhaps on yours too it have the same issue).

The algo is calculating the middle of the table and getting the value of the last one in the table from the left, right ? I notice a small mistake when getting positive integers numbers.

At "FindSecondMiddleNumberLP" it is needed to change this:

Code: [Select]
         mov edx, eax ; Save the first middle number to calculate the average of 2 middle numbers.
FindSecondMiddleNumberLP: ; pretty rare case, mostly with small data sets
inc eax
cmp dword ptr[edi+4*eax], 1 ; ecx must be 1
jb FindSecondMiddleNumberLP
add eax, edx ; Second middle number found, add it to the first middle number.
shr eax, 1 ; And get the average.

To something like this

Code: [Select]
         mov edx, eax ; Save the first middle number to calculate the average of 2 middle numbers.
inc eax

FindSecondMiddleNumberLP: ; pretty rare case, mostly with small data sets
cmp dword ptr[edi+4*eax], 0 ; Search if the next address (so, the 1st one in the right table) is not empty
jne FoundFirstMiddleNumberRight ; If the address is already filled, exit the loop
                inc eax                                     ; search next address and jump over when the address at this position in the table is filled
        jmp FindSecondMiddleNumberLP ; <---- Sorry, forgot to port this when i was editing this post
FoundFirstMiddleNumberRight:

add eax, edx ; Second middle number found, add it to the first middle number.
shr eax, 1 ; And get the average.

I believe it is the proper masm version. In RosAsm, it is:

Code: [Select]
        mov edx eax     ; Save the first middle number to calculate the average of 2 middle numbers
        ; here we are always at the end of the left middle. Ex if size = 100, we are at pos 49th
        ; FindSecondMiddleNumberLP.
        inc eax
        while D$edi+eax*4 = 0
            inc eax
        End_while
        add eax edx
        shr eax 1           ; And get the average.

You can test it on these values (Positive integers):

[DataSelect9: D$ 136, 151, 216, 40, 13, 215, 24, 104,
                 230, 154, 227, 219, 52, 217, 38, 131,
                 228, 95, 178, 51, 70, 14, 9, 34,
                 80, 195, 59, 27, 100, 118, 78, 137,
                 65, 104, 7, 32, 23, 58, 202, 154,
                 238, 12, 11, 165, 223, 197, 98, 26,
                 231, 92, 29, 131, 25, 99, 97, 77,
                 155, 120, 154, 182, 177, 211, 103, 238,
                 209, 86, 205, 213, 180, 126, 55, 95,
                 186, 232, 196, 121, 237, 187, 26, 120,
                 13, 56, 186, 167, 14, 136, 90, 196,
                 120, 23, 190, 79, 164, 41, 239, 39,
                 18, 109, 35, 169] ; Size = 100, Median = 119

The reason is that when you see the data in the proper order, you will see something like this:

Left Middle Table

0 ___________________________49 (Position starting from zero)
7   9   11   12   13   13   (...)   118 <--- marinus code found this value when reached FindSecondMiddleNumberLP


Right Middle Table
50 ___________________________100 (Position starting from 50)
120   120   120   121   (...)   238   238   239
________
      |
and we need to find this

Since we have a gap of 2 (120-118), it means that the next address on the right that contains a value is the one 2 dword ahead. Since we already incremented this value at eax, we must 1st see if that address is not empty. If it is empty, we perform the loop untill the next address that is full and increment on each loop.


Can you please, test to see if this is correct. On the values i tested above, it found the proper median. (On positive integers it seems to work. Not sure how it will behave in negative or floats)
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

Siekmanski

  • Member
  • *****
  • Posts: 2453
Re: Timings for fast median algo
« Reply #19 on: July 24, 2020, 08:52:25 PM »
-> Btw...one small fix on Marinus version (perhaps on yours too it have the same issue).

My routine is not the fault, must be something else.  :biggrin:

compare this part with with your conversion:

Code: [Select]
FindSecondMiddleNumberLP:
    inc     eax   
    mov     ecx,[edi+eax*4]
    test    ecx,ecx                 ; If zero, its not a number.
    jz      FindSecondMiddleNumberLP
Creative coders use backward thinking techniques as a strategy.

daydreamer

  • Member
  • *****
  • Posts: 1422
  • building nextdoor
Re: Timings for fast median algo
« Reply #20 on: July 24, 2020, 10:01:04 PM »
Code: [Select]
159     bytes for MbMedianP
Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz (MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX)
Precision loss is calculated versus the 'real' median, which is obtained by sorting

10 numbers generated
0 ms to find the median of 10 items, which is 30169.25
0 ms to find the median of 10 items, which is 30169.25
0 ms to find the median of 10 items, which is 30169.25
0 ms for sorting, the real median is 30168.43
Speed gain is a factor 0.0437, precision loss is 0.00275%

101 numbers generated
0 ms to find the median of 101 items, which is 27143.74
0 ms to find the median of 101 items, which is 27143.74
0 ms to find the median of 101 items, which is 27143.74
0 ms for sorting, the real median is 27147.66
Speed gain is a factor 2.80, precision loss is -0.0145%

1000000 numbers generated
5 ms to find the median of 1 Mio items, which is 26572.96
5 ms to find the median of 1 Mio items, which is 26572.96
5 ms to find the median of 1 Mio items, which is 26572.96
127 ms for sorting, the real median is 26567.51
Speed gain is a factor 22.9, precision loss is 0.0205%

10000001 numbers generated
56 ms to find the median of 10 Mio items, which is 26505.42
56 ms to find the median of 10 Mio items, which is 26505.42
56 ms to find the median of 10 Mio items, which is 26505.42
1357 ms for sorting, the real median is 26510.57
Speed gain is a factor 24.0, precision loss is -0.0194%

100000010 numbers generated
560 ms to find the median of 100 Mio items, which is 26505.42
565 ms to find the median of 100 Mio items, which is 26505.42
552 ms to find the median of 100 Mio items, which is 26505.42
12239 ms for sorting, the real median is 26510.47
Speed gain is a factor 21.9, precision loss is -0.0190%
hit any key
Showcase :
With Masm sdk and 2-3 hours = a windows program :D
Beat that C zealots p:

FORTRANS

  • Member
  • *****
  • Posts: 1079
Re: Timings for fast median algo
« Reply #21 on: July 24, 2020, 10:49:47 PM »
Hi,

   Two results, the second may have run into trouble?  (Memory?)

Code: [Select]
159     bytes for MbMedianP
Intel(R) Core(TM) i3-4005U CPU @ 1.70GHz (MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, S
SE4.2, AVX)
Precision loss is calculated versus the 'real' median, which is obtained by sort
ing

10 numbers generated
0 ms to find the median of 10 items, which is 30169.25
0 ms to find the median of 10 items, which is 30169.25
0 ms to find the median of 10 items, which is 30169.25
0 ms for sorting, the real median is 30168.43
Speed gain is a factor 4.37, precision loss is 0.00275%

101 numbers generated
0 ms to find the median of 101 items, which is 27143.74
0 ms to find the median of 101 items, which is 27143.74
0 ms to find the median of 101 items, which is 27143.74
0 ms for sorting, the real median is 27147.66
Speed gain is a factor 1.89, precision loss is -0.0145%

1000000 numbers generated
8 ms to find the median of 1 Mio items, which is 26572.96
8 ms to find the median of 1 Mio items, which is 26572.96
7 ms to find the median of 1 Mio items, which is 26572.96
214 ms for sorting, the real median is 26567.51
Speed gain is a factor 26.7, precision loss is 0.0205%

10000001 numbers generated
83 ms to find the median of 10 Mio items, which is 26505.42
82 ms to find the median of 10 Mio items, which is 26505.42
82 ms to find the median of 10 Mio items, which is 26505.42
2536 ms for sorting, the real median is 26510.57
Speed gain is a factor 30.6, precision loss is -0.0194%

100000010 numbers generated
839 ms to find the median of 100 Mio items, which is 26505.42
828 ms to find the median of 100 Mio items, which is 26505.42
842 ms to find the median of 100 Mio items, which is 26505.42
21065 ms for sorting, the real median is 26510.47
Speed gain is a factor 25.2, precision loss is -0.0190%
hit any key

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\Documents and Settings\Stephan R. Novosad>ncd f:test
NCD-Norton Change Directory, Advanced Edition, (C) Copr 1987, Peter Norton

Changing to F:\TEMP\TEST

F:\TEMP\TEST>arraymed
159     bytes for MbMedianP
Intel(R) Pentium(R) M processor 1.70GHz (MMX, SSE, SSE2)
Precision loss is calculated versus the 'real' median, which is obtained by sort
ing

10 numbers generated
0 ms to find the median of 10 items, which is 30169.25
0 ms to find the median of 10 items, which is 30169.25
0 ms to find the median of 10 items, which is 30169.25
0 ms for sorting, the real median is 30168.43
Speed gain is a factor 0.145, precision loss is 0.00275%

101 numbers generated
0 ms to find the median of 101 items, which is 27143.74
0 ms to find the median of 101 items, which is 27143.74
0 ms to find the median of 101 items, which is 27143.74
0 ms for sorting, the real median is 27147.66
Speed gain is a factor 1.14, precision loss is -0.0145%

1000000 numbers generated
22 ms to find the median of 1 Mio items, which is 26572.96
22 ms to find the median of 1 Mio items, which is 26572.96
22 ms to find the median of 1 Mio items, which is 26572.96
236 ms for sorting, the real median is 26567.51
Speed gain is a factor 10.6, precision loss is 0.0205%

10000001 numbers generated
225 ms to find the median of 10 Mio items, which is 26505.42
223 ms to find the median of 10 Mio items, which is 26505.42
222 ms to find the median of 10 Mio items, which is 26505.42
2545 ms for sorting, the real median is 26510.57
Speed gain is a factor 11.4, precision loss is -0.0194%

100000010 numbers generated
21138 ms to find the median of 100 Mio items, which is 26505.42
2266 ms to find the median of 100 Mio items, which is 26505.42
2240 ms to find the median of 100 Mio items, which is 26505.42
23114 ms for sorting, the real median is 26510.47
Speed gain is a factor 2.70, precision loss is -0.0190%
hit any key

Regards,

Steve

jj2007

  • Member
  • *****
  • Posts: 10861
  • Assembler is fun ;-)
    • MasmBasic
Re: Timings for fast median algo
« Reply #22 on: July 24, 2020, 11:36:33 PM »
Steve,

You mean this one? Very slow but otherwise it looks ok...

100000010 numbers generated
21138 ms to find the median of 100 Mio items, which is 26505.42
2266 ms to find the median of 100 Mio items, which is 26505.42

FORTRANS

  • Member
  • *****
  • Posts: 1079
Re: Timings for fast median algo
« Reply #23 on: July 25, 2020, 01:00:04 AM »
Hi Jochen,

   Yes, and the system became very sluggish and
somewhat unresponsive.  It is a fairly old laptop
though.

Regards,

Steve

jj2007

  • Member
  • *****
  • Posts: 10861
  • Assembler is fun ;-)
    • MasmBasic
Re: Timings for fast median algo
« Reply #24 on: July 25, 2020, 01:05:14 AM »
That's odd, because after the string "numbers generated" there is no HeapAlloc or similar that could slow down the system. The algo itself does not allocate anything. Is this behaviour reproducible? If it's a single core CPU, it could have been busy with other tasks...

FORTRANS

  • Member
  • *****
  • Posts: 1079
Re: Timings for fast median algo
« Reply #25 on: July 25, 2020, 03:53:17 AM »
Hi,

That's odd, because after the string "numbers generated" there is no HeapAlloc or similar that could slow down the system. The algo itself does not allocate anything. Is this behaviour reproducible?

   Sort of reproducible.  Two of around six runs below.
Longest was ~30xxx ms, the shortest ~67xx ms.

Code: [Select]
100000010 numbers generated
16467 ms to find the median of 100 Mio items, which is 26505.42
2245 ms to find the median of 100 Mio items, which is 26505.42
2233 ms to find the median of 100 Mio items, which is 26505.42

100000010 numbers generated
20118 ms to find the median of 100 Mio items, which is 26505.42
2235 ms to find the median of 100 Mio items, which is 26505.42
2237 ms to find the median of 100 Mio items, which is 26505.42

Quote
If it's a single core CPU, it could have been busy with other tasks...

   Yes a single core CPU.  Sluggishness seems to
persist until rebooted.  Somewhere between weird and
strange.

Regards,

Steve

jj2007

  • Member
  • *****
  • Posts: 10861
  • Assembler is fun ;-)
    • MasmBasic
Re: Timings for fast median algo
« Reply #26 on: July 25, 2020, 04:08:11 AM »
hmmm... the 100 Mio items mean 400MB on the heap. But that's not the algo, it's just the creation of the array. Weird indeed.

guga

  • Member
  • *****
  • Posts: 1378
  • Assembly is a state of art.
    • RosAsm
Re: Timings for fast median algo
« Reply #27 on: July 25, 2020, 07:15:23 AM »
-> Btw...one small fix on Marinus version (perhaps on yours too it have the same issue).

My routine is not the fault, must be something else.  :biggrin:

compare this part with with your conversion:

Code: [Select]
FindSecondMiddleNumberLP:
    inc     eax   
    mov     ecx,[edi+eax*4]
    test    ecx,ecx                 ; If zero, its not a number.
    jz      FindSecondMiddleNumberLP

Indeed. I mistakes the masm routines. Now it works like a gem :greensml: :greensml: :greensml:
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

jj2007

  • Member
  • *****
  • Posts: 10861
  • Assembler is fun ;-)
    • MasmBasic
Re: Timings for fast median algo
« Reply #28 on: July 25, 2020, 06:34:26 PM »
I stumbled over a set of data for which Marinus' (modified, using DWORDs) algo fails with an error of 3%: the algo returns 37386, but it should be 38532 :cool:
Code: [Select]
TestData dd 1332, 1575, 3891, 4226, 5629, 6868, 7295, 10283, 10375, 11866
dd 12064, 12231, 12895, 13365, 14384, 14490, 14501, 15995, 17093, 17275
dd 17833, 18070, 19096, 19988, 20168, 20765, 20861, 21012, 21344, 22404
dd 23377, 23947, 24033, 24401, 24709, 25594, 26025, 26058, 26928, 27084
dd 27571, 28237, 28792, 29066, 30189, 30724, 32569, 32765, 33198, 34558
dd 36270, 36987, 37167, 37288, 37573, 39491, 78260, 193983, 312869, 406851
dd 474584, 878091, 933154, 1535982, 1696834, 1826755, 1894374, 1935375, 1979114, 1980304
dd 1995476, 2163480, 2186807, 2364281, 2775589, 3086356, 3163282, 3296671, 3324133, 3368259
dd 3445517, 3506669, 3672249, 3797020, 4185277, 4691675, 5316867, 5403458, 6181758, 6361385
dd 6439150, 6544605, 6547329, 6819265, 6928806, 6963353, 7069587, 7384388, 7574183, 7697938
dd 7856932, 8106433, 8645157, 8771719, 8886262, 9014975, 9177706, 9481844, 9485408, 9488645

Siekmanski

  • Member
  • *****
  • Posts: 2453
Re: Timings for fast median algo
« Reply #29 on: July 25, 2020, 08:04:02 PM »
My routine is not to blame, must be something else.  :cool:
Creative coders use backward thinking techniques as a strategy.