Help with QueryPerformanceCounter and 64 bit numbers

Started by Lonewolff, April 12, 2018, 03:15:46 PM

Lonewolff

Hi Guys,

I am trying to convert my C/C++ frame rate counter to work in MASM. But I have come across a bit of a snag. I'm not sure how you go about handling 64 bit integers.


; Framerate counter
invoke QueryPerformanceFrequency, addr TimeFrequency
invoke QueryPerformanceCounter, addr TimeEnd
;TimeElapsed.QuadPart = TimeEnd.QuadPart - TimeStart.QuadPart;
;TimeElapsed.QuadPart *= 1000000000;
;TimeElapsed.QuadPart /= TimeFrequency.QuadPart; // in nanoseconds
inc nCounter

;if (TimeElapsed.QuadPart > 1000000000)
.if 1 ; placeholder
invoke itoa, nCounter, addr szBuffer, 10
invoke SetWindowText, hWnd, addr szBuffer
mov nCounter, 0
invoke QueryPerformanceFrequency, addr TimeFrequency
invoke QueryPerformanceCounter, addr TimeStart
.endif


Here is my partially converted code. The commented lines are C/C++

If anyone could assist, that would be truly appreciated  8)

hutch--

If the number range is within DWORD then you probably only need to access the low DWORD of the 64 bit number. I gather this is 32 bit code ?

Lonewolff

Yep 32 bit code.

Could you please give an example of how to access the low DWORD of the 64 bit number?

Still getting my feet on the ground with the simple stuff. Tried a few different things but they don't compile.

Thanks again  :)

Lonewolff

I think I am a step closer, but I am on the edge of my knowledge of ASM here - LOL  :bgrin:


invoke QueryPerformanceFrequency, addr TimeFrequency
invoke QueryPerformanceCounter, addr TimeEnd

;TimeElapsed.QuadPart = TimeEnd.QuadPart - TimeStart.QuadPart;
mov eax,DWORD PTR TimeEnd[0]
sub eax,DWORD PTR TimeStart[0]
mov ecx,DWORD PTR TimeEnd[+4]
sbb ecx,DWORD PTR TimeStart[+4]
mov DWORD PTR TimeElapsed[0], eax
mov DWORD PTR TimeElapsed[+4], ecx

;TimeElapsed.QuadPart *= 1000000000;                             // Not sure what to do here
;TimeElapsed.QuadPart /= TimeFrequency.QuadPart; // Not sure what to do here
inc nCounter

;if (TimeElapsed.QuadPart > 1000000000)                           // Not sure what to do here
.if 1 ; placeholder
invoke itoa, nCounter, addr szBuffer, 10
invoke SetWindowText, hWnd, addr szBuffer
mov nCounter, 0
invoke QueryPerformanceFrequency, addr TimeFrequency
invoke QueryPerformanceCounter, addr TimeStart
.endif


If anyone can assist in helping fill in the blanks, it would be truly awesome.
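
For reference, here is a minimal C sketch of what the sub/sbb pair above computes: a 64-bit subtraction done on two 32-bit halves, with the borrow from the low half carried into the high half. The `u64_parts` type and the helper names are hypothetical and only serve the illustration.

```c
#include <stdint.h>

/* Hypothetical helper type: a 64-bit value stored as two 32-bit halves,
   laid out like the low and high parts of a LARGE_INTEGER. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_parts;

/* Subtract b from a the way a sub/sbb pair does it:
   low halves first, then high halves minus the borrow. */
static u64_parts sub64(u64_parts a, u64_parts b)
{
    u64_parts r;
    uint32_t borrow = (a.lo < b.lo) ? 1u : 0u;  /* carry flag after `sub` */
    r.lo = a.lo - b.lo;                          /* wraps modulo 2^32 */
    r.hi = a.hi - b.hi - borrow;                 /* what `sbb` computes */
    return r;
}

/* Reassemble the halves so the result can be checked against native 64-bit math. */
static uint64_t join64(u64_parts p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}
```

The same borrow idea carries over to add/adc for 64-bit addition.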

jj2007

The simple solution:
  NanoTimer()
  invoke Sleep, 1000
  Inkey NanoTimer$()


But if you want to roll your own, it's good to know that the FPU understands perfectly what a QWORD integer is:
include \masm32\include\masm32rt.inc

.data?
timeStart dq ?
timeEnd dq ?
timeFrequency dq ?
timeElapsed dq ?

.code
start:
  invoke QueryPerformanceFrequency, addr timeFrequency
  invoke QueryPerformanceCounter, addr timeStart
  invoke Sleep, 3000
  invoke QueryPerformanceCounter, addr timeEnd
  fild timeEnd
  fild timeStart
  fsub
  fild timeFrequency
  fdiv
  fistp timeElapsed
  inkey str$(dword ptr timeElapsed), " seconds elapsed"
  exit
end start

LordAdef

I know the feeling...


I copied/pasted some related stuff from my own code, without much checking. But I hope some of it may help.
It may even have some bugs, although it's actually working ok.


.data
SchedulerMS       dd 1        ; granularity for Sleep
PerfCountFreq     dd 0, 0     ; 8 bytes: the API writes a full LARGE_INTEGER
LastCounter       dd 0, 0     ; low dword is read; high dword is padding
EndCounter        dd 0, 0
ElapsedCounter    dd 0
tFPS              dd 0
MSPerFrameR       real8 0.0
SleepMS           sdword 0
TargetSecPerFrame real8 16.0



In the code:

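QueryPerformance... takes a LARGE_INTEGER, which is a union, but you can deal with its low half directly as a dword. Note that the API still writes all 8 bytes, so each variable passed to it needs 8 bytes of storage: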


....
inv QueryPerformanceFrequency, ADDR PerfCountFreq



in the game loop:
...

inv QueryPerformanceCounter, ADDR LastCounter



....


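inv QueryPerformanceCounter, ADDR EndCounter
mov ecx, LastCounter
mov eax, EndCounter
sub eax, ecx                 ; elapsed ticks (low dwords only)
mov edx, 1000
mov ElapsedCounter, eax
mul edx                      ; edx:eax = ticks * 1000; only eax is used below

push eax
fild dword ptr [esp]
pop eax                      ; rebalance the stack after the temporary
fidiv PerfCountFreq
fstp MSPerFrameR             ; milliseconds per frame

mov eax, PerfCountFreq
xor edx, edx                 ; unsigned div needs edx = 0 (cdq would sign-extend)
div dword ptr ElapsedCounter ; assumes ElapsedCounter is non-zero
mov tFPS, eax

m2m LastCounter, EndCounter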

Siekmanski

This is a piece of timer code I use to calculate the FrameTime delta.
You only need the low 32bit part to calculate the time between screen refreshes.

FramesPerSecond = (1.0 / FrameTimeDelta )

.const
QPinteger struct
    Low32bit    dd ?
    High32bit   dd ?
QPinteger ends

float1          real4 1.0

.data?
align8
FrameTimeOld    QPinteger <?>
FrameTimeNew    QPinteger <?>

TicksPerSecondReciprocal real4 ?
FrameTimeDelta           real4 ?


.code

InitTimer proc
    invoke   QueryPerformanceCounter,addr FrameTimeOld
    invoke   QueryPerformanceFrequency,addr FrameTimeNew
    movss    xmm0,float1
    cvtsi2ss xmm1,FrameTimeNew.Low32bit
    divss    xmm0,xmm1
    movss    TicksPerSecondReciprocal,xmm0
    ret
InitTimer endp


Update_frame proc
    invoke   QueryPerformanceCounter,addr FrameTimeNew
    mov      eax,FrameTimeNew.Low32bit
    mov      ecx,eax
    sub      eax, FrameTimeOld.Low32bit
    mov      FrameTimeOld.Low32bit,ecx
    cvtsi2ss xmm0,eax
    mulss    xmm0,TicksPerSecondReciprocal
    movss    FrameTimeDelta,xmm0 ; FramesPerSecond == 1 / FrameTimeDelta
    ret
Update_frame endp
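
As a side note on why the low 32 bits are enough here, the following C sketch (function names are hypothetical) shows that unsigned subtraction of the low halves gives the correct tick delta even when the low dword wraps between two readings, as long as fewer than 2^32 ticks passed. Since cvtsi2ss converts a signed integer, the assembly version effectively needs the delta to stay below 2^31 ticks, which is easily the case between screen refreshes.

```c
#include <stdint.h>

/* Delta between two counter readings using only their low 32 bits.
   Unsigned subtraction wraps modulo 2^32, so the result matches the true
   delta whenever fewer than 2^32 ticks passed between the two readings. */
static uint32_t delta_low32(uint64_t old_ticks, uint64_t new_ticks)
{
    uint32_t old_lo = (uint32_t)old_ticks;
    uint32_t new_lo = (uint32_t)new_ticks;
    return new_lo - old_lo;
}

/* Convert a tick delta to seconds by multiplying with the reciprocal of a
   (hypothetical) counter frequency, as the timer code above does. */
static float delta_seconds(uint32_t delta, uint32_t ticks_per_second)
{
    return (float)delta * (1.0f / (float)ticks_per_second);
}
```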


EDIT: code adjustment!
Creative coders use backward thinking techniques as a strategy.

Lonewolff

Awesome, thanks for the advice.  8)

How would I compare TimeElapsed against 1000000000 to see if it is greater?

I can't use something like the following as it doesn't fit in EAX


mov eax, TimeElapsed
mov ebx, 1000000000
cmp ebx, eax
jg greater


Could you get away with just the low byte or something?
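
One way to think about this comparison, sketched in C under the assumption that the elapsed value is unsigned: 1000000000 itself fits in 32 bits (the maximum is 4294967295), so the 64-bit value is greater if its high dword is non-zero, and otherwise the low dwords decide. The function name is hypothetical. In assembly, the matching unsigned jumps are ja/jb rather than jg/jl.

```c
#include <stdint.h>

/* Returns 1 when the 64-bit value hi:lo is greater than `limit`,
   using only 32-bit comparisons. `limit` fits in 32 bits, so its high
   half is zero. All values are treated as unsigned. */
static int greater_than_u32(uint32_t hi, uint32_t lo, uint32_t limit)
{
    if (hi != 0u) {
        return 1;       /* anything at or above 2^32 exceeds a 32-bit limit */
    }
    return lo > limit;  /* high halves equal (both zero): low halves decide */
}
```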


Lonewolff

This is what I presently have but the code after the compare never gets executed.

The aim is to display the frame rate at one second intervals in the window title area.

Am I on the right track?


; Framerate counter (Work In Progress)
invoke QueryPerformanceFrequency, addr TimeFrequency
invoke QueryPerformanceCounter, addr TimeEnd
fild TimeEnd
fild TimeStart
fsub
fild TimeFrequency
fdiv
fistp TimeElapsed
 
inc nCounter

mov eax, DWORD PTR TimeElapsed[0]
mov ebx, 1000000000
cmp ebx, eax
jg skip

; ** Never gets called **
invoke itoa, nCounter, addr szBuffer, 10
invoke SetWindowText, hWnd, addr szBuffer
mov nCounter, 0
invoke QueryPerformanceFrequency, addr TimeFrequency
invoke QueryPerformanceCounter, addr TimeStart

skip:

Siekmanski

I adjusted the code in my previous post.

The Update_frame proc would be something like this:


FrameCounter                dd 0
TimeCounter                 real4 0.0
FrameTimeCounter            real4 0.0
FramesPerSecond             real4 0.0


    invoke      QueryPerformanceCounter,addr FrameTimeNew
    mov         eax,FrameTimeNew.Low32bit
    mov         ecx,eax
    sub         eax,FrameTimeOld.Low32bit
    mov         FrameTimeOld.Low32bit,ecx
    cvtsi2ss    xmm0,eax
    mulss       xmm0,TicksPerSecondReciprocal
    movss       FrameTimeDelta,xmm0 ; FPS = 1 / FrameTimeDelta

    movss       xmm1,TimeCounter
    addss       xmm1,xmm0
    movss       TimeCounter,xmm1

    inc         FrameCounter
    movss       xmm1,FrameTimeCounter
    addss       xmm1,xmm0
    comiss      xmm1,FLT4(1.0)
    jb          PerSecond
    cvtsi2ss    xmm0,FrameCounter
    divss       xmm0,xmm1
    movss       FramesPerSecond,xmm0  ; update per second
    mov         FrameCounter,0       
    xorps       xmm1,xmm1
PerSecond:   
    movss       FrameTimeCounter,xmm1

LordAdef

Marinus, is not using the FPU a matter of personal taste, or is there a performance gain? As far as I have read, the FPU still holds up nicely, right?


edit to add: the reason I'm curious is because sometimes you also use FPU, got me thinking

Lonewolff

Seem to have it working now  :icon_cool:

Needed to throw a multiplication of 1000000000 in there.


; Framerate counter (Work In Progress)
invoke QueryPerformanceFrequency, addr TimeFrequency
invoke QueryPerformanceCounter, addr TimeEnd
fild TimeEnd
fild TimeStart
fsub
fild TimeNanoSecond ; 1000000000
fmul
fild TimeFrequency
fdiv
fistp TimeElapsed

inc nCounter

mov eax, DWORD PTR TimeElapsed[0]
mov ebx, 1000000000
cmp eax, ebx
jl skip
invoke itoa, nCounter, addr szBuffer, 10
invoke SetWindowText, hWnd, addr szBuffer
mov nCounter, 0
invoke QueryPerformanceFrequency, addr TimeFrequency
invoke QueryPerformanceCounter, addr TimeStart
skip:
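
For comparison, here is a hedged C sketch of the same ticks-to-nanoseconds conversion in integer math only. The FPU version can multiply before dividing because it works in extended precision; with plain 64-bit integers, `ticks * 1000000000` overflows once ticks exceeds about 1.8e10, so this sketch splits the value into whole seconds and a remainder first. The function name is hypothetical, and `freq` is assumed to be non-zero.

```c
#include <stdint.h>

/* Convert a tick count to nanoseconds using integer math only.
   Splitting into whole seconds and a remainder keeps the intermediate
   product (rem * 1000000000) below freq * 10^9, which fits in 64 bits
   for any frequency under roughly 18 GHz. */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq)
{
    const uint64_t NS_PER_SEC = 1000000000ull;
    uint64_t whole = ticks / freq;   /* complete seconds */
    uint64_t rem   = ticks % freq;   /* leftover ticks, always < freq */
    return whole * NS_PER_SEC + (rem * NS_PER_SEC) / freq;
}
```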


I must be missing some optimisation techniques somewhere as my C++ loop (using the same render code) is 1000 FPS faster than the ASM loop.

C++ render loop is ~7000 FPS
ASM render loop is ~6000 FPS

Not a bad comparison though.

LordAdef

One thing I noticed is that (as far as I know) you only need to invoke QueryPerformanceFrequency once, outside and prior to the loop.


You will be receiving the same value all the time.

Lonewolff

True. I could take out one of the calls.

But if I take out both (and place a single call prior to the loop) systems that throttle clock speed (the ones that are too smart for their own good) will get incorrect results.

jj2007

Quote from: Lonewolff on April 12, 2018, 06:50:55 PM
I must be missing some optimisation techniques somewhere as my C++ loop (using the same render code) is 1000 FPS faster than the ASM loop.

Check where the bottleneck is... as far as the timing functions are concerned, they have low overhead, but you could, for example,
- call frequency only once before the loop (it won't change)
- if you use it inside the loop, use QueryPerformanceCounter only once (old end = new start time)
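
The two points above can be sketched in C roughly as follows. This is only an illustration of the loop structure, not a drop-in replacement: the `fps_tracker` type and its functions are hypothetical, and on Windows the `freq` and `now` arguments would come from QueryPerformanceFrequency (once) and QueryPerformanceCounter (once per frame).

```c
#include <stdint.h>

/* Hypothetical frame-rate tracker: the frequency is stored once at setup,
   and each frame supplies a single counter reading. */
typedef struct {
    uint64_t freq;         /* ticks per second, read once before the loop */
    uint64_t window_start; /* counter value when the current window began */
    uint32_t frames;       /* frames counted in the current window */
    uint32_t last_fps;     /* most recent per-second result */
} fps_tracker;

static void fps_init(fps_tracker *t, uint64_t freq, uint64_t now)
{
    t->freq = freq;
    t->window_start = now;
    t->frames = 0u;
    t->last_fps = 0u;
}

/* Call once per frame with the current counter value.
   Returns 1 when a full second has elapsed and last_fps was updated. */
static int fps_frame(fps_tracker *t, uint64_t now)
{
    t->frames++;
    if (now - t->window_start < t->freq) {
        return 0;
    }
    t->last_fps = t->frames;
    t->frames = 0u;
    t->window_start = now;  /* old end time becomes the new start time */
    return 1;
}
```

When `fps_frame` returns 1, the caller would format `last_fps` and update the window title, as the itoa/SetWindowText pair does in the thread's code.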