Min And Max function?

Started by Farabi, June 13, 2012, 01:25:00 PM


KeepingRealBusy

Quote from: qWord on June 13, 2012, 10:20:11 PM
The problem is the start value: it must be either a value from the list or the maximum and minimum values of the type (double): +/-1.7976931348623157E+308.

Always initialize the min and max values from the first entry in the list, that way you will never get a min or max that is not in the list.
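
As a rough illustration of that advice, here is a minimal C sketch (C rather than MASM so it can be checked with any compiler). The name `min_max` and its signature are assumptions made for this example, not code from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: finds min and max, seeding both from the first entry. */
void min_max(const float *a, size_t n, float *out_min, float *out_max)
{
    assert(n > 0);                 /* an empty list has no min or max */
    float lo = a[0];               /* seed from the first entry ...   */
    float hi = a[0];               /* ... so both are always in the list */
    for (size_t i = 1; i < n; i++) {
        if (a[i] < lo) lo = a[i];
        if (a[i] > hi) hi = a[i];
    }
    *out_min = lo;
    *out_max = hi;
}
```

Because both results start out as an element of the list, no type-specific limit such as FLT_MAX or DBL_MAX is needed.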

Dave.

RuiLoureiro

Quote from: KeepingRealBusy on June 14, 2012, 01:38:07 AM
Quote from: qWord on June 13, 2012, 10:20:11 PM
The problem is the start value: it must be either a value from the list or the maximum and minimum values of the type (double): +/-1.7976931348623157E+308.

Always initialize the min and max values from the first entry in the list, that way you will never get a min or max that is not in the list.

Dave.
That can work, but I usually set min = max = first entry and then compare
            the second entry, and so on.

RuiLoureiro

MichaelW,
          I would like to understand this fltmin procedure.
          Could you help me?

    1.    If we have 10 numbers in the array,
          lengthof array is 10, no?
    2.    How do we access the last value?

           

fltmin proc p:dword, n:dword
    local _min:real4
    mov ecx, n
    mov edx, p
   
    fld4 FLT_MAX
    fstp _min
  L0:
    fld real4 ptr [edx+ecx*4]
    fld _min
    fcomip st, st(1)
    jb  L1
    fst _min
  L1:
    fstp st
    sub ecx, 1
    jns L0
    fld _min
    ret
fltmin endp

KeepingRealBusy

It looks to me as if you have to decrement ecx before you start the loop whenever you use a register as both an index and a count; otherwise you access an entry outside the array.
At least that is what I have been doing for 47 years of programming. In addition, the array will be accessed in reverse order (last to first), which is valid for this particular algo.

Dave.

MichaelW

Yes, I intended ECX to be decremented before the start of the loop. As it is, the code accesses index 9 (the last element) down through index 0 (the first element), but unfortunately it also accesses index 10, the least-significant dword of r8.
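
To make the off-by-one concrete, here is a small C model of the loop's index handling only. `loads_for_start` is a hypothetical helper written for this illustration; it mirrors the `sub ecx, 1` / `jns L0` pair and counts how many loads the loop performs for a given starting value of ECX.

```c
#include <assert.h>

/* Hypothetical model of the loop counter: returns the number of loads
   performed and reports the highest index that gets touched. */
int loads_for_start(int start, int *highest)
{
    assert(start >= 0);
    int loads = 0;
    int i = start;          /* plays the role of ECX */
    *highest = start;       /* the first load uses the starting index */
    do {
        loads++;            /* stands in for fld real4 ptr [edx+ecx*4] */
        i -= 1;             /* sub ecx, 1 */
    } while (i >= 0);       /* jns L0: loop while the result is not negative */
    return loads;
}
```

Starting at 10 (lengthof array) gives 11 loads with index 10 touched; starting at 9 gives the intended 10 loads.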
Well Microsoft, here's another nice mess you've gotten us into.

jj2007

Quote from: qWord on June 13, 2012, 10:20:11 PM
The problem is the start value: it must be either a value from the list or the maximum and minimum values of the type (double): +/-1.7976931348623157E+308.

The start value was ok, but taking a value from the list is actually more efficient, thanks for the idea :icon14:
The problem is really that Win7-64 trashes lots of xmm regs. For now, I have identified SetConsoleCP (used in Print) and QueryPerformanceFrequency (used in NanoTimer) as culprits - the latter trashes xmm0...xmm5. Win7-32 does not do such nasty things. Will keep you posted :biggrin:

RuiLoureiro

MichaelW,
         Yes, the problem was calling fltmin with lengthof array  :t
          I don't like that fld4 FLT_MAX
        and
        jb  L1      ; why set it again if it is equal to _min?
        fst _min  ; is one _min better than the other?
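
The effect of `jb` versus `jbe` here can be modelled in C. This is only a sketch under the assumption that the skip condition is the sole difference; `min_count_stores` is a made-up name. It returns the minimum and counts how often the running minimum is overwritten.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: strict_skip != 0 behaves like "jb L1"  (skip only when min <  x),
                       strict_skip == 0 behaves like "jbe L1" (skip when      min <= x). */
float min_count_stores(const float *a, size_t n, int strict_skip, int *stores)
{
    assert(n > 0);
    float min = a[0];
    *stores = 0;
    for (size_t i = 1; i < n; i++) {
        int skip = strict_skip ? (min < a[i]) : (min <= a[i]);
        if (!skip) {            /* corresponds to falling through to fst _min */
            min = a[i];
            (*stores)++;
        }
    }
    return min;
}
```

Both variants return the same minimum; the `jb` form merely performs an extra store whenever an element equals the current minimum.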


Dave,
     Yes, the problem was calling fltmin with lengthof array.
     Yes, we know that the array is accessed in reverse order,
     and we can do _min=last. No problem. Reverse order is what
     I usually do; I do not keep a separate count!
Quote
47 years of programming
I guess you are nearly 70, no?  ;)

KeepingRealBusy

RuiLoureiro,

73 this year. You are dealing with an old dog that doesn't learn new tricks too easily.

Dave.

RuiLoureiro

Dave,
          I hope you live another 73!  ;)
          I hope to reach that number myself one day.
          Best regards
          Rui Loureiro

RuiLoureiro

Well, the problem is to help Farabi.

MichaelW,
          Take a look at this.

Based on the example we can write this.
It should work for n = 1 to N.
I have not tested it;
"printf" doesn't work for me.

For REAL8, replace real4 with real8
and *4 with *8, I think.
We can also replace jmp start with sub ecx, 1 but...
Quote
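;n=lengthof array
MyMin   proc p:dword, n:dword
        local _min:real4
        mov  ecx, n
        mov  edx, p
        sub  ecx, 1                  ; ecx = index of the last element
        ;
        fld  real4 ptr [edx+ecx*4]   ; _min = last element
        fstp _min
        jmp  start                   ; do not compare the last element with itself
  L0:
        fld real4 ptr [edx+ecx*4]    ; st(0) = array[ecx]
        fld _min                     ; st(0) = _min, st(1) = array[ecx]
        fcomip st, st(1)             ; compare _min with array[ecx], pop _min
        jbe  L1                      ; _min <= array[ecx]: keep _min
        fst _min                     ; array[ecx] is smaller: store it
  L1:
        fstp st                      ; pop array[ecx]
start:
        sub ecx, 1                   ; next lower index
        jns L0                       ; loop while ecx >= 0
        fld _min                     ; return the result in st(0)
        ret
MyMin   endp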
Quote
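;n=lengthof array
MyMax   proc p:dword, n:dword
        local _max:real4
        mov  ecx, n
        mov  edx, p
        sub  ecx, 1                  ; ecx = index of the last element
        ;
        fld  real4 ptr [edx+ecx*4]   ; _max = last element
        fstp _max
        jmp  start                   ; do not compare the last element with itself
  L0:
        fld real4 ptr [edx+ecx*4]    ; st(0) = array[ecx]
        fld _max                     ; st(0) = _max, st(1) = array[ecx]
        fcomip st, st(1)             ; compare _max with array[ecx], pop _max
        jae  L1                      ; _max >= array[ecx]: keep _max
        fst _max                     ; array[ecx] is larger: store it
  L1:
        fstp st                      ; pop array[ecx]
start:
        sub ecx, 1                   ; next lower index
        jns L0                       ; loop while ecx >= 0
        fld _max                     ; return the result in st(0)
        ret
MyMax   endp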

MichaelW

In my tests there was no significant speed advantage of JAE/JBE over JA/JB (even though intuitively it seems to me that there should be), and no speed advantage of setting min/max to the first element over setting it to +/- FLT_MAX.
;==============================================================================
include \masm32\include\masm32rt.inc
.686
;==============================================================================

;-------------------------------------
; These from VC Toolkit 2003 float.h:
;-------------------------------------

FLT_MAX equ 3.402823466e+38
DBL_MAX equ 1.7976931348623158e+308

;==============================================================================
.data
      array real4 -8.8, -3.9, 111.5, 0.5, 3.6, 1.2, 4.9, 9.9, -98.2, 0.0
      r4    real4 ?
      r8    real8 ?
.code
;==============================================================================

;------------------------------------------------------------------------
; This is Abel's version of a Park-Miller-Carta generator, details here:
;   http://www.masm32.com/board/index.php?topic=6558.0
; Modified to return a floating-point value in the interval [0,1) at
; the top of the FPU stack in ST(0), as per the normal convention.
;
; The period of the core generator is 2147483646 (tested), and it runs
; in 23 cycles on a P3, including the call overhead and a fstp to store
; the result to memory. Note that setting frnd_divider to the period
; instead of to a power of 2 caused a 2x slowdown.
;------------------------------------------------------------------------

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

align 4
frnd proc

    .data
        align 8
        abel_rand_seed dd 1
        frnd_divider   dq 2147483648
    .code

    mov eax, abel_rand_seed
    mov ecx, 16807              ; a = 7^5
    mul ecx                     ; edx:eax == a*seed == D:A
    mov ecx, 7fffffffh          ; ecx = m

    add edx, edx                ; edx = 2*D
    cmp eax, ecx                ; eax = A
    jna @F
    sub eax, ecx                ; if A>m, A = A - m
  @@:
    add eax, edx                ; eax = A + 2*D
    jns @F
    sub eax, ecx                ; If (A + 2*D)>m
  @@:
    mov abel_rand_seed, eax     ; save new seed
    fild abel_rand_seed
    fild frnd_divider
    fdiv
    ret

frnd  endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

;==============================================================================

fltmax proc p:dword, n:dword
    local _max:real4
    ;xor esi, esi
    mov ecx, n
    dec ecx
    mov edx, p
    ;fld4 -FLT_MAX
    fld real4 ptr [edx+ecx*4]
    fstp _max
  L0:
    ;inc esi
    fld real4 ptr [edx+ecx*4]
    fld _max
    fcomip st, st(1)
    ja  L1
    ;inc esi
    fst _max
  L1:
    fstp st
    sub ecx, 1
    jns L0
    fld _max
    ;printf("max:%d\n",esi)
    ret
fltmax endp

;==============================================================================

fltmin proc p:dword, n:dword
    local _min:real4
    ;xor esi, esi
    mov ecx, n
    dec ecx
    mov edx, p
    ;fld4 FLT_MAX
    fld real4 ptr [edx+ecx*4]
    fstp _min
  L0:
    ;inc esi
    fld real4 ptr [edx+ecx*4]
    fld _min
    fcomip st, st(1)
    jb  L1
    ;inc esi
    fst _min
  L1:
    fstp st
    sub ecx, 1
    jns L0
    fld _min
    ;printf("min:%d\n",esi)
    ret
fltmin endp

;==============================================================================
start:
;==============================================================================
    invoke Sleep, 3000

    mov esi, alloc(10000000*4)
    xor ebx, ebx
    invoke GetTickCount
    movzx edi, ax
    .WHILE ebx < edi
        invoke frnd
        fstp r4
        add ebx, 1
    .ENDW
    xor ebx, ebx
    .WHILE ebx < 10000000
        invoke frnd
        fld4 999.0
        fmul
        fld4 498.0
        fsub
        fstp r4
        mov eax, dword ptr r4
        mov [esi+ebx*4], eax
        add ebx, 1
    .ENDW

    invoke GetTickCount
    push eax

    invoke fltmin, esi, 10000000    ; n is the element count; the proc does dec ecx itself
    fstp r8
    ;printf("%.1f\n",r8)
    invoke fltmax, esi, 10000000
    fstp r8
    ;printf("%.1f\n",r8)

    invoke GetTickCount
    pop edx
    sub eax, edx
    printf("%d\n",eax)

    free esi

    inkey
    exit
;==============================================================================
end start
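
For readers who want to check the generator outside the assembler, the core of frnd can be modelled in C as below. This is a sketch based only on the comments and instructions in the listing above; `pm_next` and `pm_to_unit` are hypothetical names. It relies on the fact that 2^32 mod (2^31 - 1) is 2, which is why the high dword is doubled and added instead of dividing.

```c
#include <assert.h>
#include <stdint.h>

#define PM_A 16807u          /* multiplier a = 7^5 */
#define PM_M 0x7fffffffu     /* modulus m = 2^31 - 1 */

/* Hypothetical C model of one step of the frnd core: a*seed mod m, no division. */
uint32_t pm_next(uint32_t seed)
{
    assert(seed >= 1 && seed < PM_M);
    uint64_t prod = (uint64_t)PM_A * seed;   /* edx:eax after mul ecx */
    uint32_t lo = (uint32_t)prod;            /* A, the low dword  */
    uint32_t hi = (uint32_t)(prod >> 32);    /* D, the high dword */
    if (lo > PM_M) lo -= PM_M;               /* jna @F / sub eax, ecx */
    lo += 2u * hi;                           /* add eax, edx (edx = 2*D) */
    if (lo & 0x80000000u) lo -= PM_M;        /* jns @F / sub eax, ecx */
    return lo;
}

/* Maps a seed to [0,1) the way frnd does: seed / 2^31 (frnd_divider). */
double pm_to_unit(uint32_t seed)
{
    return (double)seed / 2147483648.0;
}
```

Starting from the seed 1, as abel_rand_seed does, the first value returned by `pm_next` is 16807.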


The only advantage I can see of setting min/max to the first element (or any other particular element) instead of setting it to +/- FLT_MAX would be if there is a significant error in the value of FLT_MAX, and the array contains values that are close to +/- FLT_MAX.
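
A further case where the two seeds can differ, assuming the array may hold infinities (the test data here never does): with the strict `ja` comparison, a sentinel of -FLT_MAX is never replaced by -infinity. The C sketch below models both seeding choices; the function names are made up for this illustration.

```c
#include <assert.h>
#include <float.h>
#include <math.h>      /* INFINITY, for trying the case described above */
#include <stddef.h>

/* Hypothetical model of fltmax seeded with -FLT_MAX (the commented-out fld4). */
float max_from_sentinel(const float *a, size_t n)
{
    float max = -FLT_MAX;
    for (size_t i = 0; i < n; i++)
        if (!(max > a[i])) max = a[i];   /* "ja L1" skips the store when max > a[i] */
    return max;
}

/* Hypothetical model of fltmax seeded with the last element of the array. */
float max_from_element(const float *a, size_t n)
{
    assert(n > 0);
    float max = a[n - 1];
    for (size_t i = n - 1; i-- > 0; )    /* visits n-2 down to 0 */
        if (!(max > a[i])) max = a[i];
    return max;
}
```

For ordinary finite data both functions agree. For an array holding only -INFINITY, the sentinel version returns -FLT_MAX, a value that is not in the list, while the element-seeded version returns -INFINITY.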

RuiLoureiro

MichaelW,
          Thank you for testing these cases.
          Meanwhile, my question is: why look up the maximum value of the type
          (whether it is 3.402823466e+38 or 3.402843566e+38, or
          1.7976931348623158e+308 or 1.7976941348623158e+308)
          when we have a set of N values and can start with
          one of them? I have one answer: to invent a different way.
          To me it simply makes no sense. I never use it.

          My second question is this: in the following fltmax procedure
          you set _max from real4 ptr [edx+ecx*4]

            fld real4 ptr [edx+ecx*4]
            fstp _max

         and next you compare the same values

            fld real4 ptr [edx+ecx*4]
            fld _max
            fcomip st, st(1)

        and next you store real4 ptr [edx+ecx*4] to _max again

            fst _max

        because it is the same first value.

Quote
fltmax proc p:dword, n:dword
    local _max:real4
    ;xor esi, esi
    mov ecx, n
    dec ecx
    mov edx, p
    ;fld4 -FLT_MAX
    fld real4 ptr [edx+ecx*4]
    fstp _max                   ; put real4 ptr [edx+ecx*4] to _max
  L0:
    ;inc esi
    fld real4 ptr [edx+ecx*4]
    fld _max
    fcomip st, st(1)            ; compare real4 ptr [edx+ecx*4] with _max
    ja  L1
    ;inc esi
    fst _max                    ; put it again
  L1:
    fstp st
    sub ecx, 1
    jns L0
    fld _max
    ;printf("max:%d\n",esi)
    ret
fltmax endp

            It should be

Quote
fltmax proc p:dword, n:dword
    local _max:real4
    ;xor esi, esi
    mov ecx, n
    dec ecx
    mov edx, p
    ;fld4 -FLT_MAX
    fld real4 ptr [edx+ecx*4]
    fstp _max                       ; _max = last element
    dec  ecx                        ; start comparing at the next element
    js   L2                         ; n = 1: nothing left to compare
L0:
    ;inc esi
    fld real4 ptr [edx+ecx*4]
    fld _max
    fcomip st, st(1)                ; compare the first with another
    ja  L1
    ;inc esi
    fst _max
  L1:
    fstp st
    sub ecx, 1
    jns L0
  L2:
    fld _max
    ;printf("max:%d\n",esi)
    ret
fltmax endp

sinsi

Quote from: jj2007 on June 14, 2012, 03:56:15 AM
The problem is really that Win7-64 trashes lots of xmm regs. For now, I have identified SetConsoleCP (used in Print) and QueryPerformanceFrequency (used in NanoTimer) as culprits - the latter trashes xmm0...xmm5. Win7-32 does not do such nasty things. Will keep you posted :biggrin:
Hey jj, remember the x64 calling convention? XMM0..XMM5 are all volatile. See "Register Usage".

jj2007

Quote from: sinsi on June 14, 2012, 08:59:54 PM
Hey jj, remember the x64 calling convention? XMM0..XMM5 are all volatile. See "Register Usage".

Hey sinsi,
The link is good, but they are talking about the x64 architecture, and the code runs in 32-bit mode. Under Win7-32, none of the XMM regs is ever trashed; now five of them are zeroed. I bet that must break quite a bit of existing code ::)

sinsi

Ouch, a problem I guess. Although what does the Microsoft x86 calling convention say about xmm registers?