I Need More Help!!

Started by hawkeye62, June 05, 2013, 06:46:45 AM

hawkeye62

I have been trying to understand more about the macros in masm32\macros\timers.asm. In particular, I have been looking at the macros that calculate milliseconds, and more particularly still, the timer_end macro. At the end of the timer_end macro, there are nine lines of code: two mov instructions and seven floating point instructions. The fild instructions push __timer__pc__count__ and __timer__pc__frequency__ onto the stack, then fdiv is performed. I believe this operation is dividing the pc frequency by the total cycle count. Unless I am crazy, I think it should divide total cycle count by frequency.

Please tell me I am not crazy.

Thanks, Jim

dedndave

you are crazy !    :badgrin:

it is dividing the total number of clock cycles by the number of loop passes

on each pass through the test code, it totalizes clock cycles
when the loop count has expired, the number of clock cycles for a single pass is calculated

hawkeye62

Thanks for your reply! Dividing the total number of clock cycles by the number of loop passes is what it SHOULD be doing. BUT the devil is in the details.  __timer__pc__frequency__  is obtained from the QueryPerformanceFrequency function which returns PC clock cycles per second. And __timer__pc__count__  is as you say, the total clock cycles for execution of the timed code. And I believe if you look at the push instructions followed by the fdiv instruction, you will see that pc clock cycles is being divided by total cycles. Please show me why I am crazy.

Regards, Jim


qWord

Quote from: hawkeye62 on June 05, 2013, 10:43:51 AM
QueryPerformanceFrequency function which returns PC clock cycles per second
no. -EDIT- I've interpreted "PC clock cycles" as CPU frequency  :icon_eek:
----

Windows uses programmable hardware timers for the performance counter - the frequency of these is independent from the CPU's one.
You can read out the CPU frequency from the registry:
LOCAL hKey:HKEY,freq:REAL8,qMHz:QWORD,_size:DWORD
...
    mov _size,4
    .if rvx(esi = RegOpenKeyEx,HKEY_LOCAL_MACHINE,"HARDWARE\DESCRIPTION\System\CentralProcessor\0",0,KEY_READ,&hKey) != ERROR_SUCCESS || \
        rvx(RegQueryValueEx,hKey,"~MHz",0,0,&qMHz, &_size) != ERROR_SUCCESS
        ; error: can't read CPU frequency from registry
    .endif
    invoke RegCloseKey,hKey

    mov DWORD ptr qMHz[4],0
    fild qMHz
    fmul FP8(1.0E6)
    fstp freq  ; [1/s]
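
For comparison, here is a minimal sketch of timing with the performance counter itself rather than the CPU frequency. It assumes the same masm32rt.inc setup used elsewhere in this thread; the names pcFreq, pcStart, pcStop and elapsed, and the 250 ms Sleep standing in for the timed code, are made up for illustration and are not taken from timers.asm.

```asm
    include \masm32\include\masm32rt.inc
    .data?
        pcFreq   dq ?           ; counts per second (hypothetical name)
        pcStart  dq ?           ; counter value before the timed code
        pcStop   dq ?           ; counter value after the timed code
        elapsed  dd ?           ; result in milliseconds
    .code
start:
    invoke QueryPerformanceFrequency, ADDR pcFreq
    invoke QueryPerformanceCounter, ADDR pcStart
    invoke Sleep, 250           ; stand-in for the code being timed
    invoke QueryPerformanceCounter, ADDR pcStop

    finit
    fild pcStop
    fild pcStart
    fsub                        ; no operands: ST(1) - ST(0) = stop - start, then pop
    fild pcFreq
    fdiv                        ; no operands: ST(1) / ST(0) = counts / frequency = seconds
    mov elapsed, 1000
    fild elapsed
    fmul                        ; seconds * 1000 = milliseconds
    fistp elapsed
    printf("%d ms\n", elapsed)
    inkey
    exit
END start
```

The printed value should be close to the Sleep interval, whatever the CPU clock speed is, because both the counts and the frequency come from the same hardware timer.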
MREAL macros - when you need floating point arithmetic while assembling!

dedndave

lol
you're probably right
let me have a look at it - it's been a while

yah
that one uses the high-resolution counter (QueryPerformanceCounter)
we rarely use that one

we generally use the first set of macros, counter_begin and counter_end
RDTSC is about 10 times faster than QueryPerformanceCounter
counter_end divides the cycle count by the pass count   :P

MichaelW

In the timer_end macro:

            finit
            fild  __timer__pc__count__
            fild  __timer__pc__frequency__
            fdiv
            mov   __timer__dw_count__, 1000
            fild  __timer__dw_count__
            fmul
            fistp __timer__dw_count__


Because the FDIV instruction has no operands it is encoded as FDIVP ST(1), ST(0), so it divides ST(1) (__timer__pc__count__) by ST(0) (__timer__pc__frequency__) to calculate the elapsed seconds for the entire loop, and then the FMUL instruction multiplies the result by 1000 to convert it to elapsed milliseconds.
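
To make the operand order concrete, here is a minimal sketch of the same sequence with made-up values. The names pcCount, pcFreq and msResult and the numbers are assumptions for illustration, not part of timers.asm. With 5,000,000 counts at 2,500,000 counts per second, count/frequency is 2 seconds (2000 ms), whereas frequency/count would be 0.5 seconds (500 ms), so the printed value shows which division took place.

```asm
    include \masm32\include\masm32rt.inc
    .data
        pcCount   dq 5000000    ; hypothetical total counter ticks
        pcFreq    dq 2500000    ; hypothetical ticks per second
        msResult  dd ?
    .code
start:
    finit
    fild  pcCount               ; ST(0) = count
    fild  pcFreq                ; ST(0) = frequency, ST(1) = count
    fdiv                        ; ST(1) / ST(0) = count / frequency, then pop
    mov   msResult, 1000
    fild  msResult
    fmul                        ; seconds * 1000
    fistp msResult
    printf("%d\n", msResult)    ; 2000 for count/frequency, 500 for the reverse
    inkey
    exit
END start
```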
Well Microsoft, here's another nice mess you've gotten us into.

hawkeye62

All of the references I have seen say that fdiv with no operands divides st(0) by st(1).

Regards, Jim

MichaelW

;==============================================================================
    include \masm32\include\masm32rt.inc
;==============================================================================
    .data
        ten   REAL8 10.0
        five  REAL8 5.0
        r8    REAL8 ?
    .code
;==============================================================================
start:
;==============================================================================
    fld ten
    fld five
    fdiv
    fstp r8
    printf("%f\n", r8)
    fld ten
    fld five
    fdivr
    fstp r8
    printf("%f\n", r8)
    inkey
    exit
;==============================================================================
END start


00401000                    start:
00401000 DD0500304000           fld     qword ptr [off_00403000]
00401006 DD0508304000           fld     qword ptr [off_00403008]
0040100C DEF9                   fdivp   st(1),st
0040100E DD1D10304000           fstp    qword ptr [off_00403010]
00401014 FF3514304000           push    dword ptr [off_00403014]
0040101A FF3510304000           push    dword ptr [off_00403010]
00401020 6818304000             push    offset off_00403018 ; '%f',00Dh,00Ah,000h
00401025 FF1520204000           call    dword ptr [printf]
0040102B 83C40C                 add     esp,0Ch
0040102E DD0500304000           fld     qword ptr [off_00403000]
00401034 DD0508304000           fld     qword ptr [off_00403008]
0040103A DEF1                   fdivrp  st(1),st


2.000000
0.500000


And from a recent Intel manual for FDIV/FDIVP/FIDIV-Divide:
Quote
The no-operand version of this instruction divides the contents of the ST(1) register by the contents of the ST(0) register.
...
The FDIVP instructions perform the additional operation of popping the FPU register stack after storing the result.
...
In some assemblers, the mnemonic for this instruction is FDIV rather than FDIVP.

hawkeye62

Here is a direct quote from "Art of Assembly" Chapter 14 at cs.smith.edu.

"With zero operands, the fdiv and fdivp instructions pop st(0) and st(1), compute st(0)/st(1), and push the result back onto the stack. The fdivr and fdivrp instructions also pop st(0) and st(1) but compute st(1)/st(0) before pushing the quotient onto the stack."

So much for "Art of Assembly".

Thanks for the help, crazy jim :icon_redface:

dedndave

you can download the intel manuals for info

http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

also, Raymond has a nice FPU tutorial that explains each instruction...

http://www.ray.masmcode.com/

qWord

I would suggest using AMD's documentation, because the manuals are separated by instruction sets: http://support.amd.com/us/Processor_TechDocs/26569_APM_v5.pdf

hawkeye62

OK guys. Thanks very much for the patience and the help.

Regards, crazy Jim