Min And Max function?

Started by Farabi, June 13, 2012, 01:25:00 PM


KeepingRealBusy

Rui,

Here is my laptop:


Intel(R) Pentium(R) 4 CPU 3.20GHz (SSE2)
Getting min & max for 10000000 REAL4 and REAL8 values:
72 ms for FPU
24 ms for ArrayMinMax, REAL4
45 ms for ArrayMinMax, REAL8
35 ms for SSE2JJ, REAL8
70 ms for fltminmax
25 ms for Myminmax, REAL4

75 ms for FPU
26 ms for ArrayMinMax, REAL4
42 ms for ArrayMinMax, REAL8
40 ms for SSE2JJ, REAL8
73 ms for fltminmax
28 ms for Myminmax, REAL4

Results:
ArrayMinMax=    -888.887818/999.998966
r4MinMax=       -888.887817/999.998962
r4MinMax=       -888.887817/999.998962
r8MinMax=       -888.887818/999.998966
SSE2Min=        -888.887818/999.998966

51       bytes for fltminmax
42       bytes for Myminmax
29       bytes for SSE2JJ
bye


I assume these functions return the max or min in st(0), or the min/max in st(0) and st(1). The code for all three (Mymin, Mymax, Myminmax) looks good except for one pesky problem: they will fail for exactly one entry in the array (with a one-element array they will walk off the end of the array). You need to insert "jnz L0 ret 8" in front of L0.

I find FPU code hard to follow, I usually code it in C and copy the generated code from the .cod file (the third tenet of the programming creed "cheat, lie and steal").

Dave.

RuiLoureiro

Hi Dave,
        I need to answer this way:
1.
Quote
The code for all three (Mymin, Mymax, Myminmax) looks good
Yes, you are right.
2.
Quote
except for one pesky problem
Well, let's see where the problem is!
3.
Quote
they will fail for exactly one entry in the array
Well, if THEY fail, Mymin fails (for example).
        So we can talk about Mymin to keep it simple.

        One note: you are saying it, but you don't prove anything.

        a) the array has 10 real4 numbers
        b) Mymin starts with st(0)=MIN = the element at ECX=0
        c) It starts the loop with ECX=9
        d) It starts by comparing MIN with the element in [edx+ecx*4],
           which means we use the element in [edx+9*4] (= the LAST)
        e) It stops when ECX=0, meaning "when ecx=0 it doesn't loop"

        So it is evident, obvious, that it uses ALL the numbers in the array.
        But if you have some doubts, run it and print each ECX,
        or try
              array  real4 2, 3, 4, -1      you should get MIN=-1
        now try
              array  real4 2, 3, -1, 4      you should get MIN=-1
        now try
              array  real4 2, -1, 3, 4      you should get MIN=-1
        now try
              array  real4 -1, 2, 3, 4      you should get MIN=-1
4.
Quote
You need to insert "jnz L0 ret 8" in front of L0.
No.
5.
Quote
I find FPU code hard to follow, I usually code it in C

        It seems you have some problems reading assembly.  ;)

EDIT: see the debug file
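
Editor's sketch: steps a) to e) above can be modeled in C to check the four test arrays. This is only a model of the loop as Rui describes it, not the posted Mymin itself; the name mymin_model and the assert are assumptions added for illustration. The assert spells out the n > 1 restriction that the rest of the thread argues about.

```c
#include <assert.h>
#include <stddef.h>

/* C model of the downward scan described in steps a) to e).
   Hypothetical helper, valid only for n >= 2. */
float mymin_model(const float *p, size_t n)
{
    assert(n >= 2);          /* assumed restriction: the scan needs n > 1 */
    float min = p[0];        /* b) MIN starts as element 0 */
    size_t i = n - 1;        /* c) index of the last element */
    do {
        if (p[i] < min)      /* d) compare MIN with the element at index i */
            min = p[i];
        i -= 1;
    } while (i != 0);        /* e) stop when the index reaches 0 */
    return min;
}
```

Element 0 is never loaded inside the loop, but it is the starting value of MIN, so every element takes part for n >= 2. All four test arrays listed above return -1 in this model.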

RuiLoureiro

Jochen,
        Could you replace Myminmax in your ArrayMinMax_vs_FPU2
        by this:


OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
Myminmax    proc    p:dword, n:dword

        mov     ecx, [esp+8]    ;n
        mov     edx, [esp+4]    ;p
        fld     real4 ptr [edx]             ; set st(1) to MAX value       
        fld     st(0)                       ; set st(0) to MIN value
        sub     ecx, 1                      ; points to the last value
  L0:
        fld     real4 ptr [edx+ecx*4]
        fcomi   st, st(1)                   ; compare st(1)=MIN with st(0)
        jae     L1
        fxch    st(1)

        fstp    st
        sub     ecx, 1
        jnz     L0                          ; if ecx>0 loop to L0       
        ret     8
               
  L1:   fcomi   st, st(2)                   ; compare st(2)=MAX with st(0)
        jbe     L2
        fxch    st(2)

  L2:   fstp    st
        sub     ecx, 1
        jnz     L0                          ; if ecx>0 loop to L0       
        ret     8
Myminmax    endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef


KeepingRealBusy

Rui,

You set the max/min to the first entry and decrement ecx (ecx is now 0). You then uselessly compare the first entry with min and find it equal, and uselessly compare it with max and find it equal as well. So you pop st(0) and decrement ecx again (ecx is now -1, not zero), loop back to L0 (fld     real4 ptr [edx+ecx*4]), and get a memory access fault somewhere along the way. It will not work correctly for an array size of 1.
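
Editor's sketch: the index sequence Dave describes can be reproduced with a small C model of the loop counter. The function name trace_loads and the cap parameter are made up for this illustration; the cap only exists so the demonstration stops instead of running through the whole 32-bit range.

```c
#include <stddef.h>
#include <stdint.h>

/* Records the element indices the posted loop would load for an array of
   n entries, stopping after `cap` loads. ecx is modeled as a 32-bit
   unsigned register, so decrementing 0 wraps to 0xFFFFFFFF. */
size_t trace_loads(uint32_t n, uint32_t *out, size_t cap)
{
    uint32_t ecx = n - 1u;       /* sub ecx, 1 */
    size_t count = 0;
    do {
        if (count == cap)        /* demonstration limit, not in the asm */
            break;
        out[count++] = ecx;      /* fld real4 ptr [edx+ecx*4] */
        ecx -= 1u;               /* sub ecx, 1 */
    } while (ecx != 0u);         /* jnz L0 */
    return count;
}
```

For n = 4 the model records indices 3, 2, 1 and stops. For n = 1 it records 0, then 0xFFFFFFFF, then 0xFFFFFFFE, and would keep going without the cap, which is the walk off the array described above.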

Actually, the following would work quite well:


OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
Myminmax    proc    p:dword, n:dword

        mov     ecx, [esp+8]    ;n
        mov     edx, [esp+4]    ;p
        fld     real4 ptr [edx]             ; set st(1) to MAX value       
        fld     st(0)                       ; set st(0) to MIN value
        sub     ecx, 1                      ; points to the last value
        jnz      L0                         ; not a single entry.
        ret      8                           ; st(0) and st(1) are set with min/max.
  L0:
        fld     real4 ptr [edx+ecx*4]
        fcomi   st, st(1)                   ; compare st(1)=MIN with st(0)
        jae     L1
        fxch    st(1)
        jmp    L2

;        fstp    st                          ; this code is exactly duplicated in L2
;        sub     ecx, 1
;        jnz     L0                          ; if ecx>0 loop to L0       
;        ret     8
               
  L1:   fcomi   st, st(2)                   ; compare st(2)=MAX with st(0)
        jbe     L2
        fxch    st(2)

  L2:   fstp    st
        sub     ecx, 1
        jnz     L0                          ; if ecx>0 loop to L0       
        ret     8
Myminmax    endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef


Dave
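
Editor's sketch: the control flow of the version with the early "jnz L0 / ret 8" can be written in C roughly as follows. The struct minmax type and the name minmax_model are assumptions for illustration, and the model returns a struct instead of leaving the results in st(0) and st(1).

```c
#include <stddef.h>

struct minmax { float min, max; };   /* hypothetical result type */

/* C model of the fixed scan. n must be >= 1; n == 0 is not handled,
   just as in the posted assembly. */
struct minmax minmax_model(const float *p, size_t n)
{
    struct minmax r = { p[0], p[0] };    /* both start as element 0 */
    size_t i = n - 1;                    /* sub ecx, 1 */
    if (i == 0)                          /* single entry: return at once */
        return r;
    do {
        float x = p[i];
        if (x < r.min)                   /* jae L1 not taken */
            r.min = x;
        else if (x > r.max)              /* jbe L2 not taken */
            r.max = x;
        i -= 1;
    } while (i != 0);                    /* jnz L0 */
    return r;
}
```

With a one-element array the model returns that element as both min and max without entering the loop, which is the only behavioral difference from the earlier version.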

RuiLoureiro

Hi Dave,
       
Oooops, you are trying to rewrite it to work with an array of 1
element, and then you follow with one argument after another
ABOUT that singular case.

It is very, very interesting! Yes, in 99.999 % of cases
we define an array of 1 element, and in those cases we are
VERY VERY interested in the computer telling us the MIN and MAX
of ONE VALUE!
Without doubt, Dave!

Now I can tell you that the procedure you are showing us
doesn't work when you call
invoke  Myminmax, addr array, len in the case len=0 or len=-1 ...

Yes, actually, the following would work quite well:

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
Myminmax    proc    p:dword, n:dword

        mov     ecx, [esp+8]    ;n
        mov     edx, [esp+4]    ;p
        fld     real4 ptr [edx]             ; set st(1) to MAX value       
        fld     st(0)                       ; set st(0) to MIN value
        ;
        sub     ecx, 1                      ; points to the last value
  L0:
        fld     real4 ptr [edx+ecx*4]
        fcomi   st, st(1)                   ; compare st(1)=MIN with st(0)
        jae     L1
        fxch    st(1)

        fstp    st                          ; YES this code is exactly duplicated in L2 YES
        sub     ecx, 1
        jnz     L0                          ; if ecx>0 loop to L0       
        ret     8
               
  L1:   fcomi   st, st(2)                   ; compare st(2)=MAX with st(0)
        jbe     L2
        fxch    st(2)

  L2:   fstp    st
        sub     ecx, 1
        jnz     L0                          ; if ecx>0 loop to L0       
        ret     8
Myminmax    endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef


            See the file I posted before.
            Assume the procedures we are writing
            work for n>1.

DEBUG values

The array is this:

      array  real4 -8.8, -3.9, 111.5, 0.5, 3.6, 1.2, 4.9, 9.9, -988.8, 0.0

2 Numbers
----------------------------------------
FPU Levels : 2
Conditional: ST > Source
Exception  : e s p u o z d i
St(0)      : -8.8                   ««« ECX=9
St(1)      : -8.8
2 Numbers
----------------------------------------
FPU Levels : 2
Conditional: ST < Source
Exception  : e s P u o z d i
St(0)      : -8.8                   ««« ECX=8
St(1)      : 0
2 Numbers
----------------------------------------
FPU Levels : 2
Conditional: ST < Source
Exception  : e s P u o z d i
St(0)      : -988.8                 ««« ECX=7
St(1)      : 0
2 Numbers
----------------------------------------
FPU Levels : 2
Conditional: ST < Source
Exception  : e s P u o z d i
St(0)      : -988.8                 ««« ECX=6
St(1)      : 9.9
2 Numbers
----------------------------------------
FPU Levels : 2
Conditional: ST < Source
Exception  : e s P u o z d i
St(0)      : -988.8                 ««« ECX=5
St(1)      : 9.9
2 Numbers
----------------------------------------
FPU Levels : 2
Conditional: ST < Source
Exception  : e s P u o z d i
St(0)      : -988.8                 ««« ECX=4
St(1)      : 9.9
2 Numbers
----------------------------------------
FPU Levels : 2
Conditional: ST < Source
Exception  : e s P u o z d i
St(0)      : -988.8                 ««« ECX=3
St(1)      : 9.9
2 Numbers
----------------------------------------
FPU Levels : 2
Conditional: ST < Source
Exception  : e s P u o z d i
St(0)      : -988.8                 ««« ECX=2
St(1)      : 9.9
2 Numbers
----------------------------------------
FPU Levels : 2
Conditional: ST < Source
Exception  : e s P u o z d i
St(0)      : -988.8           =MIN  ««« ECX=1
St(1)      : 111.5            =MAX

KeepingRealBusy

Rui,

I was not testing for an array size of 0 because there was no defined error return (an error could have been returned in eax), and there was also no documented way to indicate that st(0) and st(1) were invalid upon return. Loading st(0) and st(1) with fltmax and fltmin would also be deceptive, since they would not be the correct max or min: there is no max or min if there is no array. I also did not check for an invalid or null pointer.

But it should be perfectly valid to call this function with a valid pointer to an array of any size, even a single entry (max and min would both equal the first entry in that case).

If you want to restrict this function to sizes > 1, then you should document the restriction.

My supplied fix works for all sizes > 0 and is shorter than your version. If you want it to work for all sizes including 0, then you should check for a size of 0 and return an error in eax; otherwise set eax to the good return value and then scan the array (it is up to you to define these good/bad values in the documentation).
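
Editor's sketch: the contract described here (a status code for an empty array, results only when the status is good) might look like this in C. The status names, the function name and the output pointers are all hypothetical; in the assembly version the status would travel in eax and the results in st(0) and st(1).

```c
#include <stddef.h>

enum { MINMAX_OK = 0, MINMAX_EMPTY = 1 };   /* hypothetical status codes */

/* Returns MINMAX_EMPTY for n == 0 and leaves the outputs untouched;
   otherwise stores min and max and returns MINMAX_OK.
   The pointer p is not validated, as in the thread. */
int minmax_checked(const float *p, size_t n, float *min_out, float *max_out)
{
    if (n == 0)
        return MINMAX_EMPTY;
    float lo = p[0], hi = p[0];
    for (size_t i = 1; i < n; i++) {
        if (p[i] < lo)
            lo = p[i];
        else if (p[i] > hi)
            hi = p[i];
    }
    *min_out = lo;
    *max_out = hi;
    return MINMAX_OK;
}
```

The caller checks the return value before reading the outputs, so no sentinel such as FLT_MAX has to stand in for "no result".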

I am not trying to steal your code or take credit for it, I am just pointing out that your code, as posted and not documented otherwise, will fail for an array size of 1.

Dave.

RuiLoureiro

Dave,
        We are testing for speed, and I wrote it
        only to compare with the fltmax and fltmin
        procedures from the VC toolkit.

        No need to check for a null pointer;
        generally it crashes.

Quote
If you want to restrict this function to sizes > 1,
then you should document the restriction.
Or you could ask me if it works for n=1.
        Jochen and others developed theirs and
        we don't know if they work for n=1 or not.
        We are developing to see what it does,
        so it is not necessary.

Farabi

Quote from: MichaelW on June 14, 2012, 01:23:11 AM
Another FPU solution, I think probably slow, with min and max as separate procedures, and since this was for graphics I guessed REAL4 instead of REAL8.

;==============================================================================
include \masm32\include\masm32rt.inc
.686
;==============================================================================

;-------------------------------------
; These from VC Toolkit 2003 float.h:
;-------------------------------------

FLT_MAX equ 3.402823466e+38
DBL_MAX equ 1.7976931348623158e+308

;==============================================================================
.data
      array real4 -8.8, -3.9, 111.5, 0.5, 3.6, 1.2, 4.9, 9.9, -98.2, 0.0
      r8    real8 ?
.code
;==============================================================================

fltmax proc p:dword, n:dword
    local _max:real4
    mov ecx, n
    mov edx, p
    fld4 -FLT_MAX
    fstp _max
  L0:
    fld real4 ptr [edx+ecx*4]
    fld _max
    fcomip st, st(1)
    ja  L1
    fst _max
  L1:
    fstp st
    sub ecx, 1
    jns L0
    fld _max
    ret
fltmax endp

;==============================================================================

fltmin proc p:dword, n:dword
    local _min:real4
    mov ecx, n
    mov edx, p
    fld4 FLT_MAX
    fstp _min
  L0:
    fld real4 ptr [edx+ecx*4]
    fld _min
    fcomip st, st(1)
    jb  L1
    fst _min
  L1:
    fstp st
    sub ecx, 1
    jns L0
    fld _min
    ret
fltmin endp

;==============================================================================
start:
;==============================================================================

    invoke fltmin, addr array, lengthof array
    fstp r8
    printf("%.1f\n",r8)
    invoke fltmax, addr array, lengthof array
    fstp r8
    printf("%.1f\n",r8)

    inkey
    exit
;==============================================================================
end start


MichaelW, thanks a lot. This cuts the time spent converting the array from real4 to real8.
http://farabidatacenter.url.ph/MySoftware/
My 3D Game Engine Demo.

Contact me at Whatsapp: 6283818314165

Farabi

Hi Michael, the only mistake I can find in your function is this: if I have an array of 5 elements, I have to pass 4 as the array count for it to work. I think that is the only bug.
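
Editor's sketch: this observation follows from the loop bounds in the quoted fltmax. The loop starts at ecx = n and repeats while the result of "sub ecx, 1" is not negative (jns), so it loads indices n down to 0, one more element than the array holds. A rough C model, with fltmax_model and its parameter name made up for illustration:

```c
#include <float.h>
#include <stdint.h>

/* C model of the quoted fltmax loop. `start` is whatever the caller
   passes as n; the scan visits indices start, start-1, ..., 0. */
float fltmax_model(const float *p, int32_t start)
{
    float max = -FLT_MAX;                  /* fld4 -FLT_MAX / fstp _max */
    for (int32_t i = start; i >= 0; i--)   /* sub ecx, 1 / jns L0 */
        if (!(max > p[i]))                 /* ja L1 skips the store */
            max = p[i];                    /* fst _max */
    return max;
}
```

Passing the element count makes the model read p[count], which lies past the end of a count-element array; passing count - 1 covers exactly the array. Starting the assembly loop with a "sub ecx, 1" before L0 would presumably let callers pass the count directly, but that change is not part of the quoted code.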




Have a look at the red line: it shows where the lowest vertex and the highest vertex of an object are.
I need this function to work so that I can extract each edge and build a shadow volume from it.

Here is how I used it:

fShadowProcessArrayX proc uses esi edi lpArray:dword,nVertexCount:dword
LOCAL x_arr,y_arr,z_arr:dword
LOCAL data_offset:dword
LOCAL memNeeded:dword
LOCAL MAX:VERTEX
LOCAL MIN:VERTEX
LOCAL DLT:VERTEX
LOCAL DLT2:VERTEX
LOCAL PVTPNT:VERTEX
LOCAL arrLen,arrOffs,lpResult:dword
LOCAL lgDP:Dword
LOCAL fLen:real4


mov ecx,nVertexCount
shl ecx,2
mov memNeeded,ecx
invoke mAlloc,memNeeded
mov x_arr,eax
invoke mAlloc,memNeeded
mov y_arr,eax
invoke mAlloc,memNeeded
mov z_arr,eax

mov ecx,nVertexCount
shl ecx,4
mov memNeeded,ecx
invoke mAlloc,memNeeded
mov lpResult,eax

mov esi,lpArray

xor ecx,ecx
mov data_offset,ecx
loop_extract:
push ecx
mov ecx,x_arr
add ecx,data_offset
mov eax,[esi].VERTEX.x
mov [ecx],eax

mov ecx,y_arr
add ecx,data_offset
mov eax,[esi].VERTEX.y
mov [ecx],eax

mov ecx,z_arr
add ecx,data_offset
mov eax,[esi].VERTEX.z
mov [ecx],eax

add esi,12
add data_offset,4
pop ecx
inc ecx
cmp ecx,nVertexCount
jl loop_extract

dec nVertexCount                        ; fltmax/fltmin take the last index (n-1), not the count
invoke fltmax,x_arr,nVertexCount
fstp MAX.x
invoke fltmax,y_arr,nVertexCount
fstp MAX.y
invoke fltmax,z_arr,nVertexCount
fstp MAX.z
                                        ; no second dec here: it would make fltmin skip the last vertex
invoke fltmin,x_arr,nVertexCount
fstp MIN.x
invoke fltmin,y_arr,nVertexCount
fstp MIN.y
invoke fltmin,z_arr,nVertexCount
fstp MIN.z
inc nVertexCount                        ; restore the real vertex count

invoke glColorMask,GL_TRUE,GL_TRUE,GL_TRUE,GL_TRUE
invoke glEnable,GL_COLOR_MATERIAL
invoke glDisable,GL_TEXTURE_2D
invoke glDisable,GL_STENCIL_TEST
invoke glColor4f,FP4(1.0f),FP4(0.0f),FP4(0.0f), FP4(1.)
invoke glBegin,GL_LINES
invoke glVertex3fv,addr MIN
invoke glVertex3fv,addr MAX
invoke glEnd
invoke glEnable,GL_STENCIL_TEST
invoke glColorMask,GL_FALSE,GL_FALSE,GL_FALSE,GL_FALSE

invoke GlobalFree,x_arr
invoke GlobalFree,y_arr
invoke GlobalFree,z_arr

invoke Vec_Sub,addr DLT,addr MAX,addr MIN
invoke Vec_Normalize,addr PVTPNT,addr DLT
invoke Vec_DotProduct,addr DLT,addr DLT
FDIV FP4(2.)
fstp fLen
invoke Vec_Scale,addr DLT,fLen

mov ecx,nVertexCount
shl ecx,2
invoke mAlloc,ecx
mov arrLen,eax
mov esi,lpArray

xor ecx,ecx
mov arrOffs,ecx
mov data_offset,ecx
loop_get_len:
push ecx
invoke Vec_Sub,addr DLT2,esi,addr DLT
invoke Vec_DotProduct,addr DLT2,addr DLT2
mov ecx,arrLen
add ecx,data_offset
fsqrt
fstp dword ptr[ecx]

add esi,12
add data_offset,4
pop ecx
inc ecx
cmp ecx,nVertexCount
jl loop_get_len

dec nVertexCount
invoke fltmax,arrLen,nVertexCount
fstp lgDP

mov edi,arrLen
loop_get_longest:
push ecx
FCMP dword ptr[edi],lgDP
pop ecx                                 ; pop leaves the flags intact and keeps the stack balanced on both paths
jz done_loop
add edi,4
inc ecx
cmp ecx,nVertexCount
jl loop_get_longest
done_loop:



invoke GlobalFree,arrLen

ret
fShadowProcessArrayX endp
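
Editor's sketch: the lowest and highest corner can also be found in one pass over the interleaved vertex array, without copying x, y and z into three temporary arrays first. This is an alternative shown in C, not the routine above; the Vertex layout (three 4-byte floats, 12 bytes per vertex, as the "add esi,12" stride suggests) and the name bounding_box are assumptions.

```c
#include <stddef.h>

typedef struct { float x, y, z; } Vertex;   /* assumed 12-byte layout */

/* Computes the per-component minimum and maximum of n vertices.
   n must be >= 1. */
void bounding_box(const Vertex *v, size_t n, Vertex *min_out, Vertex *max_out)
{
    Vertex lo = v[0], hi = v[0];            /* start from the first vertex */
    for (size_t i = 1; i < n; i++) {
        if (v[i].x < lo.x) lo.x = v[i].x;
        if (v[i].x > hi.x) hi.x = v[i].x;
        if (v[i].y < lo.y) lo.y = v[i].y;
        if (v[i].y > hi.y) hi.y = v[i].y;
        if (v[i].z < lo.z) lo.z = v[i].z;
        if (v[i].z > hi.z) hi.z = v[i].z;
    }
    *min_out = lo;
    *max_out = hi;
}
```

Note that the two results are the corners of the bounding box; each component may come from a different vertex, so neither corner has to be an actual vertex of the object.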



Thanks for your time,
Onan Farabi