Help with QueryPerformanceCounter and 64 bit numbers

Started by Lonewolff, April 12, 2018, 03:15:46 PM


LordAdef

Quote from: Lonewolff on April 12, 2018, 07:03:30 PM
True. I could take out one of the calls.

But if I take out both (and place a single call prior to the loop) systems that throttle clock speed (the ones that are too smart for their own good) will get incorrect results.


That's true too. Have you benchmarked with and without it? I'm away from the computer but got curious.

Lonewolff

About the same.

I think it is more the 'DX11 render cycle' that is the bottleneck.

Just had a thought though. The ASM version is built with the ML and Link supplied with MASM32, but the C++ version is built with the versions supplied with VS2017. I wonder if that is the source of the difference.

Gonna grab something to eat and I'll report back when I build the ASM version with the 2017 tools.

Siekmanski

Quote from: LordAdef on April 12, 2018, 06:41:29 PM
Marinus, is not using the FPU a matter of personal taste, or is there a performance gain? As far as I've read, the FPU still holds up nicely, right?


edit to add: the reason I'm curious is because sometimes you also use FPU, got me thinking

Hi Alex,

When coding graphics and audio, I mainly use SIMD rather than the FPU because it can move more data around at greater speed.
When possible I don't mix SIMD and FPU code; that's why I used scalar SIMD for the timer code.
Creative coders use backward thinking techniques as a strategy.

Siekmanski

Quote from: Lonewolff on April 12, 2018, 06:50:55 PM
I must be missing some optimisation techniques somewhere as my C++ loop (using the same render code) is 1000 FPS faster than the ASM loop.

C++ render loop is ~7000 FPS
ASM render loop is ~6000 FPS

Not a bad comparison though.


Are the message pump loops the same for ASM and C++?

Siekmanski

Quote from: Lonewolff on April 12, 2018, 07:03:30 PM
True. I could take out one of the calls.

But if I take out both (and place a single call prior to the loop) systems that throttle clock speed (the ones that are too smart for their own good) will get incorrect results.

Never noticed that (my system does throttle the clock speed).
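An editor's sketch of the timing pattern being discussed, assuming the calls in question are QueryPerformanceFrequency. Microsoft's documentation says the counter frequency is fixed at system boot, so it can be queried once before the loop and cached; clock throttling should not change it. The variable names (qpcFreq, qpcPrev, qpcNow, qpcDelta, frameSecs) and the 1000-frame loop are hypothetical, and the FPU is used here only to keep the 64-bit divide short. It is not the scalar SIMD code mentioned earlier in the thread.

```asm
.686
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.data?
qpcFreq   dq ?          ; counter ticks per second (queried once)
qpcPrev   dq ?          ; counter value at the previous frame
qpcNow    dq ?          ; counter value at the current frame
qpcDelta  dq ?          ; qpcNow - qpcPrev, in ticks
frameSecs real8 ?       ; last frame time in seconds

.code
start:
    invoke QueryPerformanceFrequency, addr qpcFreq  ; single call, outside the loop
    invoke QueryPerformanceCounter, addr qpcPrev

    mov ebx, 1000                   ; hypothetical frame count (ebx survives API calls)
frameLoop:
    ; ... render one frame here ...

    invoke QueryPerformanceCounter, addr qpcNow

    ; 64-bit subtract in 32-bit code: low dwords with sub, high dwords with sbb
    mov eax, dword ptr [qpcNow]
    mov edx, dword ptr [qpcNow+4]
    sub eax, dword ptr [qpcPrev]
    sbb edx, dword ptr [qpcPrev+4]
    mov dword ptr [qpcDelta], eax
    mov dword ptr [qpcDelta+4], edx

    ; seconds = delta / frequency
    fild qpcDelta                   ; st(0) = delta
    fild qpcFreq                    ; st(0) = freq, st(1) = delta
    fdivp st(1), st(0)              ; st(0) = delta / freq
    fstp frameSecs

    ; the current reading becomes the previous one for the next frame
    mov eax, dword ptr [qpcNow]
    mov edx, dword ptr [qpcNow+4]
    mov dword ptr [qpcPrev], eax
    mov dword ptr [qpcPrev+4], edx

    dec ebx
    jnz frameLoop

    invoke ExitProcess, 0
end start
```

With this layout there is one QueryPerformanceCounter call per frame and no QueryPerformanceFrequency call inside the loop.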

Lonewolff

Quote from: Siekmanski on April 12, 2018, 07:18:29 PM
Are the message pump loops the same for ASM and C++?

Yep, making sure I keep the code the same so we are comparing apples with apples.

Lonewolff

Just changed all of the libs to the 2017 SDK versions and the frame rate is now on par with the C++ version.

Couldn't compile with the 2017 ML.exe as it is complaining about invalid operands on a couple of my calls. Will look a bit closer to see if I am doing something wrong on that front.

Siekmanski


Lonewolff

Was already using d3d11.lib from the SDK.

Copied the others across - gdi32.Lib, kernel32.Lib, and user32.Lib.


Not sure why the new version of ML.exe doesn't like the project though. Something to do with the coinvoke macro?

Quote
error A2070:invalid instruction operands coinvoke(16): Macro Called From project.asm(228): Main Line Code
error A2070:invalid instruction operands coinvoke(16): Macro Called From project.asm(233): Main Line Code
error A2070:invalid instruction operands coinvoke(16): Macro Called From project.asm(238): Main Line Code


The corresponding lines of code;


(line 228) coinvoke d3dDevice, ID3D11Device, CreateVertexShader, addr vertexShaderData, SIZEOFvertexShaderData, NULL, addr d3dVertexShader

(line 233) coinvoke d3dDevice, ID3D11Device, CreatePixelShader, addr pixelShaderData, SIZEOFpixelShaderData, NULL, addr d3dPixelShader

(line 238) coinvoke d3dDevice, ID3D11Device, CreateInputLayout, addr inputDescP, 1, addr vertexShaderData, SIZEOFvertexShaderData, addr d3dInputLayout



[edit]
Worked it out. The new assembler doesn't like the way I am doing 'sizeof'. I'll work that out another day.

Siekmanski


Lonewolff

Quote from: Siekmanski on April 12, 2018, 08:10:21 PM
ole32.lib perhaps?

(SIZEOF vertexShaderData)

Yeah 'SIZEOF vertexShaderData' doesn't work because the declaration is multi-line (Shader is hard coded at present)


vertexShaderData db 68,88,66,67,166,109,78,113,107,98,65,70,91,88,250,161,103,22,241,76,1,0,0,0,16,2,0,0,6,0,0,0,56,0,0,0,156,0,0,0,224,0,0,0,92,1,0,0
db 168,1,0,0,220,1,0,0,65,111,110,57,92,0,0,0,92,0,0,0,0,2,254,255,52,0,0,0,40,0,0,0,0,0,36,0,0,0,36,0,0,0,36,0,0,0,36,0,1
db 0,36,0,0,0,0,0,1,2,254,255,31,0,0,2,5,0,0,128,0,0,15,144,4, 0,0,4,0,0,3,192,0,0,255,144,0,0,228,160,0,0,228,144,1,0,0,2,0,0
db 12,192,0,0,228,144,255,255,0,0,83,72,68,82,60,0,0,0,64,0,1,0,15,0,0,0,95,0,0,3,242,16,16,0,0,0,0,0,103,0,0,4,242,32,16,0,0,0,0
db 0,1,0,0,0,54,0,0,5,242,32,16,0,0,0,0,0,70,30,16,0,0,0,0,0,62,0,0,1,83,84,65,84,116,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0
db 2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
db 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
db 0,0,0,0,0,0,82,68,69,70,68,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,28,0,0,0,0,4,254,255,0,1,0,0,28,0,0,0,77,105,99,114,111,115,111
db 102,116,32,40,82,41,32,72,76,83,76,32,83,104,97,100,101,114,32,67,111,109,112,105,108,101,114,32,49,48,46,49,0,73,83,71,78,44,0,0,0,1,0,0,0,8,0,0,0
db 32,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,15,15,0,0,80,79,83,73,84,73,79,78,0,171,171,171,79,83,71,78,44,0,0,0,1,0,0,0,8
db 0,0,0,32,0,0,0,0,0,0,0,1,0,0,0,3,0,0,0,0,0,0,0,15,0,0,0,83,86,95,80,79,83,73,84,73,79,78,0
SIZEOFvertexShaderData EQU $-vertexShaderData


So this is how I am calculating 'sizeof' until I code a better solution.

New compiler doesn't like that very much, where the old one seems ok with it.
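An editor's sketch of why SIZEOF comes up short on a multi-line declaration, using a small hypothetical blob (blobSplit, blobJoined) in place of the real shader bytes. As far as I know, MASM's SIZEOF only counts the initializers on the label's own line, unless that line ends in a comma and so continues onto the next one. The `$ - label` equate counts everything assembled since the label.

```asm
.686
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.data
; Hypothetical stand-in for the shader blob, split over two db lines.
blobSplit     db 1,2,3,4
              db 5,6,7,8
blobSplitSize equ $ - blobSplit     ; every byte from the label to here

; Same bytes, but the first line ends in a comma, so both lines
; form one initializer list belonging to blobJoined.
blobJoined    db 1,2,3,4,
                 5,6,7,8

.code
start:
    mov eax, sizeof blobSplit       ; first line only: 4
    mov ecx, blobSplitSize          ; whole blob: 8
    mov edx, sizeof blobJoined      ; continued list: 8
    invoke ExitProcess, 0
end start
```

The trailing-comma form may not suit a blob as long as the shader above, since the assembler limits how long a single logical line can be; I have not checked that limit for every ML version, so the `$ - label` equate may remain the more practical choice.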

Project doesn't link to ole32.lib.

jj2007

An old problem with recent Micros**t assemblers. Try this:
mov ecx, SIZEOFvertexShaderData
coinvoke d3dDevice, ID3D11Device, CreateVertexShader, addr vertexShaderData, ecx, NULL, addr d3dVertexShader


If that doesn't work:
vertexShaderData        db 68 ....
vertexShaderDataEnd     db 0
...
mov ecx, offset vertexShaderDataEnd   ; address just past the shader bytes
sub ecx, offset vertexShaderData      ; ecx = size of the shader data in bytes
coinvoke d3dDevice, ID3D11Device, CreateVertexShader, addr vertexShaderData, ecx, NULL, addr d3dVertexShader

Siekmanski

I thought your data was in a structure member; then you could use (sizeof vertexShaderData).
Your solution should work too.

Lonewolff

Thanks JJ2007, a couple of things to try.

Nah Siekmanski, my data is all nasty and flapping about at the moment - LOL

Siekmanski

-> Project doesn't link to ole32.lib.

CoInitialize and CoUninitialize do need ole32.lib; they are required for COM.
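An editor's sketch of what linking ole32.lib for those two calls could look like in a MASM32 project, assuming the default \masm32 install paths. The label name (done) and the placement of the COM work are hypothetical; this is not taken from Lonewolff's project.

```asm
.686
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
include \masm32\include\ole32.inc       ; prototypes for CoInitialize / CoUninitialize
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\ole32.lib        ; import library that resolves them at link time

.code
start:
    invoke CoInitialize, NULL           ; returns an HRESULT in eax
    test eax, eax
    js done                             ; negative HRESULT means failure

    ; ... COM work goes here ...

    invoke CoUninitialize               ; pairs with the successful CoInitialize
done:
    invoke ExitProcess, 0
end start
```

Without the includelib line, the linker would report CoInitialize and CoUninitialize as unresolved externals.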