Min And Max function?

Started by Farabi, June 13, 2012, 01:25:00 PM

Farabi

Is there a floating-point function that determines the maximum and minimum values of an array?
http://farabidatacenter.url.ph/MySoftware/
My 3D Game Engine Demo.

Contact me at Whatsapp: 6283818314165

jj2007

include \masm32\MasmBasic\MasmBasic.inc   ; download
.data
MyArray   REAL8 12.34, -99.0, 123.4e4, 3.0e-33, 123456.789, -123.4e4, 99.99
   Init
   mov esi, offset MyArray
   ArrayMinMax REAL8 PTR esi:lengthof MyArray
   Print Str$("The Min=\t%f\n", f:xmm0)
   Inkey Str$("The Max=\t%f\n", f:xmm1)
   Exit
end start

The Min=        -1234000.0
The Max=        1234000.0


Documentation is here.

Farabi

Whoa thanks, very cool  :t That is amazing. I tried to build something like this but I think the result is wrong.

qWord

An FPU solution for non-MB users:
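lea edx,MyArray
mov ecx,1
fld REAL8 ptr [edx]            ; first element seeds the maximum
fld st                         ; st0 = min, st1 = max
.while ecx < LENGTHOF MyArray  ; visit elements 1 .. LENGTHOF-1, including the last one
fld REAL8 ptr [edx+ecx*8]      ; st0 = value, st1 = min, st2 = max
fxch st(1)                     ; st0 = min, st1 = value
fcomi st,st(1)
fcmovnbe st,st(1)              ; if min > value, min = value
fxch st(2)                     ; st0 = max, st1 = value, st2 = min
fcomi st,st(1)
fcmovb st,st(1)                ; if max < value, max = value
fstp st(1)                     ; drop value: st0 = max, st1 = min
fxch                           ; st0 = min, st1 = max
lea ecx,[ecx+1]                ; increment without touching the flags
.endw
fstp r8Min
fstp r8Max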
MREAL macros - when you need floating point arithmetic while assembling!

jj2007

Quote from: qWord on June 13, 2012, 09:08:12 PM
a FPU solution for non-MB users:

Elegant solution, and competitive, too :t

AMD Athlon(tm) Dual Core Processor 4450B (SSE3)
Getting min & max for 10000000 REAL8 values:
57177 µs for ArrayMinMax
62270 µs for FPU

56843 µs for ArrayMinMax
62275 µs for FPU

57216 µs for ArrayMinMax
62252 µs for FPU

Results:
ArrayMinMax=    -888.887867/999.998657
r8MinMax=       -888.887828/999.998801

qWord

Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz (SSE4)
Getting min & max for 10000000 REAL8 values:
31341 µs for ArrayMinMax
26515 µs for FPU

32582 µs for ArrayMinMax
26116 µs for FPU

30457 µs for ArrayMinMax
25545 µs for FPU

Results:
ArrayMinMax=    0.0/0.0
r8MinMax=       -888.887828/999.998801

:icon_confused:

Maybe the complicated loop construction is the problem?
CPU Disasm
Address   Hex dump          Command                                  Comments
00402C49  |.  BD FFFFFF7F   MOV EBP,7FFFFFFF
00402C4E  |>  3B06          /CMP EAX,DWORD PTR DS:[ESI]
00402C50  |.  73 02         |JNB SHORT 00402C54
00402C52  |.  8B06          |MOV EAX,DWORD PTR DS:[ESI]
00402C54  |>  3B2E          |CMP EBP,DWORD PTR DS:[ESI]
00402C56  |.  7E 02         |JLE SHORT 00402C5A
00402C58  |.  8B2E          |MOV EBP,DWORD PTR DS:[ESI]
00402C5A  |>  3B16          |CMP EDX,DWORD PTR DS:[ESI]
00402C5C  |.  7D 02         |JGE SHORT 00402C60
00402C5E  |.  8B16          |MOV EDX,DWORD PTR DS:[ESI]
00402C60  |>  83FF 04       |CMP EDI,4
00402C63  |.  76 10         |JBE SHORT 00402C75
00402C65  |.  F20F5F0E      |MAXSD XMM1,QWORD PTR DS:[ESI]
00402C69  |.  F20F5D06      |MINSD XMM0,QWORD PTR DS:[ESI]
00402C6D  |.  3B5E 04       |CMP EBX,DWORD PTR DS:[ESI+4]
00402C70  |.  73 03         |JNB SHORT 00402C75
00402C72  |.  8B5E 04       |MOV EBX,DWORD PTR DS:[ESI+4]
00402C75  |>  03F7          |ADD ESI,EDI
00402C77  |.  49            |DEC ECX
00402C78  |.^ 75 D4         \JNE SHORT 00402C4E
00402C7A  |.  8BCD          MOV ECX,EBP

jj2007

You want it faster? Here it is:
AMD Athlon(tm) Dual Core Processor 4450B (SSE3)
Getting min & max for 10000000 REAL8 values:
56953 µs for ArrayMinMax
62258 µs for FPU
33072 µs for SSE2

56735 µs for ArrayMinMax
62351 µs for FPU
32944 µs for SSE2

56787 µs for ArrayMinMax
62246 µs for FPU
32830 µs for SSE2

Results:
ArrayMinMax=    -888.887875/999.998846
r8MinMax=       -888.887542/999.998829
SSE2Min=        -888.887520/999.998850


But I am worried about your results for ArrayMinMax - why 0/0?? :(

qWord

Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz (SSE4)
Getting min & max for 10000000 REAL8 values:
32887 µs for ArrayMinMax
25024 µs for FPU
17790 µs for SSE2

37237 µs for ArrayMinMax
25062 µs for FPU
15093 µs for SSE2

28999 µs for ArrayMinMax
24408 µs for FPU
14722 µs for SSE2

Results:
ArrayMinMax=    0.0/0.0
r8MinMax=       -888.887542/999.998829
SSE2Min=        0.0/0.0

Indeed, the result is strange...

jj2007

Does the order play a role? Attachment has a + b versions with different order.

qWord

A problem is the start value: it must be either a value from the list or the maximum and minimum values of the type (double): +-1.7976931348623157E+308.

jj2007

Sure? I thought I had that right....

This is how I set the xmm initial values:

   push 07FEFFFFFh   ; set MinMax to xmm regs
   push -1
   movlps xmm0, REAL8 ptr [esp]   ; running min starts at +1.7e308 (7FEFFFFF FFFFFFFFh)
   mov byte ptr [esp+7], -1   ; dirty hack: sets the sign bit - loading from mem is 8 bytes longer ;-)
   movlps xmm1, REAL8 ptr [esp]   ; running max starts at -1.7e308


What are your values for xmm0/xmm1 at the first and second int 3?

  int 3
  mov esi, Chr$("xmm0=-888, xmm1=+999??")
  ArrayMinMax MyR8()
  int 3 

Another possibility is that xmm regs get trashed somewhere in the Print Str$() process. Which Windows version are you running?

Third option: Rand() is not working on your system ::)

qWord

At the second int 3, the correct result is in the registers - it gets overwritten by SetConsoleCP() (Win7, x64)!
However, my statement above is correct, even if it is not the problem here :biggrin:
Think about a list that has only positive values: 1,2,3,4....
When you start with 0, you will get zero as the minimum, instead of 1.
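
A minimal sketch of the start-value point, not code from this thread. It assumes the MASM32 SDK's masm32rt.inc and an assembler with SSE2 support (ML 6.15 or later, or JWasm); PosList, minZero and minFirst are hypothetical names chosen for illustration.

```asm
include \masm32\include\masm32rt.inc
.686
.xmm

.data
PosList  REAL8 1.0, 2.0, 3.0, 4.0   ; hypothetical all-positive list
minZero  REAL8 ?                    ; result when the scan starts from 0.0
minFirst REAL8 ?                    ; result when the scan starts from PosList[0]

.code
start:
    ; Scan 1: start value 0.0, which is not in the list
    xorpd xmm0, xmm0                ; xmm0 = 0.0
    mov esi, offset PosList
    mov ecx, lengthof PosList
@@:
    minsd xmm0, REAL8 ptr [esi]     ; 0.0 is below every element, so xmm0 never changes
    add esi, 8
    dec ecx
    jnz @B
    movsd minZero, xmm0

    ; Scan 2: start value taken from the list itself
    mov esi, offset PosList
    mov ecx, lengthof PosList
    movsd xmm0, REAL8 ptr [esi]     ; seed with the first element
@@:
    minsd xmm0, REAL8 ptr [esi]
    add esi, 8
    dec ecx
    jnz @B
    movsd minFirst, xmm0

    printf("start 0.0:   min = %f\n", minZero)
    printf("start first: min = %f\n", minFirst)
    inkey
    exit
end start
```

The first scan reports 0 even though 0 is not in the list; the second reports the true minimum, 1. Seeding with +1.7976931348623157E+308 would behave like the second scan.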

jj2007

Quote from: qWord on June 13, 2012, 10:39:21 PM
At the second int 3, the correct result is in the registers - it gets overwritten by SetConsoleCP() (Win7, x64)!
However, my statement above is correct, even if it is not the problem here :biggrin:
Think about a list that has only positive values: 1,2,3,4....
When you start with 0, you will get zero as the minimum, instead of 1.

That's correct, but it is not the reason: as you say in the quote above, SetConsoleCP() overwrites the registers.
So I have to save the xmm regs before calling SetConsoleCP, grrrr!
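
One way to do that could look like the sketch below. It is not the MasmBasic implementation. It assumes masm32rt.inc and an SSE2-capable assembler (ML 6.15 or later, or JWasm); r8Min, r8Max and r8Out are placeholders, and the code page 65001 is an arbitrary example value.

```asm
include \masm32\include\masm32rt.inc
.686
.xmm

.data
r8Min REAL8 -888.0                  ; placeholder standing in for a computed minimum
r8Max REAL8 999.0                   ; placeholder standing in for a computed maximum
r8Out REAL8 ?

.code
start:
    movsd xmm0, r8Min               ; pretend these came out of the min/max loop
    movsd xmm1, r8Max

    sub esp, 16                     ; reserve 2 x 8 bytes on the stack
    movsd REAL8 ptr [esp], xmm0     ; save both results
    movsd REAL8 ptr [esp+8], xmm1
    invoke SetConsoleCP, 65001      ; the call may overwrite xmm0/xmm1
    movsd xmm0, REAL8 ptr [esp]     ; restore both results
    movsd xmm1, REAL8 ptr [esp+8]
    add esp, 16                     ; release the stack space

    movsd r8Out, xmm0
    printf("Min=%f\n", r8Out)
    movsd r8Out, xmm1
    printf("Max=%f\n", r8Out)
    inkey
    exit
end start
```

The stack is used rather than fixed variables so the save area needs no extra data; invoke leaves esp unchanged after a stdcall function returns, so the [esp] offsets stay valid across the call.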

Farabi

 :t Whoa, you guys became experts in such a short time. I was never able to make it myself. Now my graphics system is complete. I can calculate everything automatically, from adjusting the center to pointing at where the model is in the view, and most importantly, the shadow system is now complete.

MichaelW

Another FPU solution, I think probably slow, with min and max as separate procedures, and since this was for graphics I guessed REAL4 instead of REAL8.

;==============================================================================
include \masm32\include\masm32rt.inc
.686
;==============================================================================

;-------------------------------------
; These from VC Toolkit 2003 float.h:
;-------------------------------------

FLT_MAX equ 3.402823466e+38
DBL_MAX equ 1.7976931348623158e+308

;==============================================================================
.data
      array real4 -8.8, -3.9, 111.5, 0.5, 3.6, 1.2, 4.9, 9.9, -98.2, 0.0
      r8    real8 ?
.code
;==============================================================================

fltmax proc p:dword, n:dword
    local _max:real4
    mov ecx, n
    sub ecx, 1                  ; start at the last valid index, n-1
    mov edx, p
    fld4 -FLT_MAX
    fstp _max
  L0:
    fld real4 ptr [edx+ecx*4]
    fld _max
    fcomip st, st(1)
    ja  L1
    fst _max
  L1:
    fstp st
    sub ecx, 1
    jns L0
    fld _max
    ret
fltmax endp

;==============================================================================

fltmin proc p:dword, n:dword
    local _min:real4
    mov ecx, n
    sub ecx, 1                  ; start at the last valid index, n-1
    mov edx, p
    fld4 FLT_MAX
    fstp _min
  L0:
    fld real4 ptr [edx+ecx*4]
    fld _min
    fcomip st, st(1)
    jb  L1
    fst _min
  L1:
    fstp st
    sub ecx, 1
    jns L0
    fld _min
    ret
fltmin endp

;==============================================================================
start:
;==============================================================================

    invoke fltmin, addr array, lengthof array
    fstp r8
    printf("%.1f\n",r8)
    invoke fltmax, addr array, lengthof array
    fstp r8
    printf("%.1f\n",r8)

    inkey
    exit
;==============================================================================
end start


Well Microsoft, here's another nice mess you've gotten us into.