I Need More Help!!

hawkeye62 · June 05, 2013, 06:46:45 AM

I have been trying to understand more about the macros in masm32\macros\timers.asm. In particular, I have been looking at the macros that calculate milliseconds, and more particularly the timer_end macro. At the end of the timer_end macro there are nine lines of code: two mov instructions and seven floating-point instructions. The fild instructions push __timer__pc__count__ and __timer__pc__frequency__ onto the stack, then fdiv is performed. I believe this operation divides the PC frequency by the total cycle count. Unless I am crazy, I think it should divide the total cycle count by the frequency.

Please tell me I am not crazy.

Thanks, Jim

dedndave · June 05, 2013, 07:06:01 AM

you are crazy !

it is dividing the total number of clock cycles by the number of loop passes

on each pass through the test code, it totalizes clock cycles
when the loop count has expired, the number of clock cycles for a single pass is calculated

hawkeye62 · June 05, 2013, 10:43:51 AM

Thanks for your reply! Dividing the total number of clock cycles by the number of loop passes is what it SHOULD be doing. BUT the devil is in the details. __timer__pc__frequency__ is obtained from the QueryPerformanceFrequency function which returns PC clock cycles per second. And __timer__pc__count__ is as you say, the total clock cycles for execution of the timed code. And I believe if you look at the push instructions followed by the fdiv instruction, you will see that pc clock cycles is being divided by total cycles. Please show me why I am crazy.

Regards, Jim

qWord · June 05, 2013, 10:59:01 AM

Quote from: hawkeye62 on June 05, 2013, 10:43:51 AM
QueryPerformanceFrequency function which returns PC clock cycles per second

~~no.~~ -EDIT- I've interpreted "PC clock cycles" as CPU frequency :icon_eek:
----
Windows uses programmable hardware timers for the performance counter - their frequency is independent of the CPU's.
You can read out the CPU frequency from the registry:

Code:

LOCAL hKey:HKEY,freq:REAL8,qMHz:QWORD,_size:DWORD
...
    mov _size,4
    .if rvx(esi = RegOpenKeyEx,HKEY_LOCAL_MACHINE,"HARDWARE\DESCRIPTION\System\CentralProcessor\0",0,KEY_READ,&hKey) != ERROR_SUCCESS || \
        rvx(RegQueryValueEx,hKey,"~MHz",0,0,&qMHz, &_size) != ERROR_SUCCESS
        ; error: can't read CPU frequency from registry
    .endif
    invoke RegCloseKey,hKey

    mov DWORD ptr qMHz[4],0
    fild qMHz
    fmul FP8(1.0E6)
    fstp freq  ; [1/s]

dedndave · June 05, 2013, 11:00:44 AM

lol
you're probably right
let me have a look at it - it's been a while

yah
that one uses the high-resolution counter (QueryPerformanceCounter)
we rarely use that one

we generally use the first set of macros, counter_begin and counter_end
RDTSC is about 10 times faster than QueryPerformanceCounter
counter_end divides the cycle count by the pass count :P

MichaelW · June 05, 2013, 11:08:47 AM

In the timer_end macro:

Code:


            finit
            fild  __timer__pc__count__
            fild  __timer__pc__frequency__
            fdiv
            mov   __timer__dw_count__, 1000
            fild  __timer__dw_count__
            fmul
            fistp __timer__dw_count__

Because the FDIV instruction has no operands, it is encoded as FDIVP ST(1), ST(0), so it divides ST(1) (__timer__pc__count__) by ST(0) (__timer__pc__frequency__) to calculate the elapsed seconds for the entire loop, and then the FMUL instruction multiplies the result by 1000 to convert it to elapsed milliseconds.

hawkeye62 · June 05, 2013, 12:18:41 PM

All of the references I have seen say that fdiv with no operands divides st(0) by st(1).

Regards, Jim

MichaelW · June 05, 2013, 01:09:27 PM

Code:


;==============================================================================
    include \masm32\include\masm32rt.inc
;==============================================================================
    .data
        ten   REAL8 10.0
        five  REAL8 5.0
        r8    REAL8 ?
    .code
;==============================================================================
start:
;==============================================================================
    fld ten
    fld five
    fdiv
    fstp r8
    printf("%f\n", r8)
    fld ten
    fld five
    fdivr
    fstp r8
    printf("%f\n", r8)
    inkey
    exit
;==============================================================================
END start

Code:


00401000                    start:
00401000 DD0500304000           fld     qword ptr [off_00403000]
00401006 DD0508304000           fld     qword ptr [off_00403008]
0040100C DEF9                   fdivp   st(1),st
0040100E DD1D10304000           fstp    qword ptr [off_00403010]
00401014 FF3514304000           push    dword ptr [off_00403014]
0040101A FF3510304000           push    dword ptr [off_00403010]
00401020 6818304000             push    offset off_00403018	; '%f',00Dh,00Ah,000h
00401025 FF1520204000           call    dword ptr [printf]
0040102B 83C40C                 add     esp,0Ch
0040102E DD0500304000           fld     qword ptr [off_00403000]
00401034 DD0508304000           fld     qword ptr [off_00403008]
0040103A DEF1                   fdivrp  st(1),st

Code:


2.000000
0.500000

And from a recent Intel manual for FDIV/FDIVP/FIDIV-Divide:

Quote
The no-operand version of this instruction divides the contents of the ST(1) register by the contents of the ST(0) register.
...
The FDIVP instructions perform the additional operation of popping the FPU register stack after storing the result.
...
In some assemblers, the mnemonic for this instruction is FDIV rather than FDIVP.

hawkeye62 · June 06, 2013, 02:09:59 AM

Here is a direct quote from "Art of Assembly" Chapter 14 at cs.smith.edu.

"With zero operands, the fdiv and fdivp instructions pop st(0) and st(1), compute st(0)/st(1), and push the result back onto the stack. The fdivr and fdivrp instructions also pop st(0) and st(1) but compute st(1)/st(0) before pushing the quotient onto the stack."

So much for "Art of Assembly".

Thanks for the help, crazy jim :icon_redface:

dedndave · June 06, 2013, 02:15:58 AM

you can download the intel manuals for info

http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

also, Raymond has a nice FPU tutorial that explains each instruction...

http://www.ray.masmcode.com/

qWord · June 06, 2013, 03:34:47 AM

I would suggest using AMD's documentation, because the manuals are separated by instruction set: http://support.amd.com/us/Processor_TechDocs/26569_APM_v5.pdf

hawkeye62 · June 06, 2013, 07:57:52 AM

OK guys. Thanks very much for the patience and the help.

Regards, crazy Jim

The MASM Forum
