Help with QueryPerformanceCounter and 64 bit numbers

LordAdef · April 12, 2018, 07:05:56 PM

Quote from: Lonewolff on April 12, 2018, 07:03:30 PM
True. I could take out one of the calls.

But if I take out both (and place a single call prior to the loop) systems that throttle clock speed (the ones that are too smart for their own good) will get incorrect results.

That's true too. Have you benchmarked with and without it? I'm away from the computer but got curious.

Lonewolff · April 12, 2018, 07:10:00 PM

About the same.

I think it is more the 'DX11 render cycle' that is the bottleneck.

Just had a thought though. The ASM version is built with the ML and Link that are supplied with MASM32, but the C++ version is built with the versions supplied with VS2017. I wonder if that is the source of the difference.

Gonna grab something to eat and I'll report back when I build the ASM versions with the 2017 compiler. :icon_cool:

Siekmanski · April 12, 2018, 07:12:40 PM

Quote from: LordAdef on April 12, 2018, 06:41:29 PM
Marinus, not using FPU is a personal taste or there is any performance gain? As far as I read FPU still stands nicely, right?

edit to add: the reason I'm curious is because sometimes you also use FPU, got me thinking

Hi Alex,

When coding graphics and audio, I mainly use SIMD and not FPU because it can move more data around at greater speed.
When possible I don't mix SIMD and FPU that's why I used scalar SIMD for the timer code.

Siekmanski · April 12, 2018, 07:18:29 PM

Quote from: Lonewolff on April 12, 2018, 06:50:55 PM
I must be missing some optimisation techniques somewhere as my C++ loop (using the same render code) is 1000 FPS faster than the ASM loop.

C++ render loop is ~7000 FPS
ASM render loop is ~6000 FPS

Not a bad comparison though.

Are the message pump loops the same for ASM and C++?

Siekmanski · April 12, 2018, 07:28:09 PM

Quote from: Lonewolff on April 12, 2018, 07:03:30 PM
True. I could take out one of the calls.

But if I take out both (and place a single call prior to the loop) systems that throttle clock speed (the ones that are too smart for their own good) will get incorrect results.

Never noticed that (my system does throttle the clock speed).
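
For reference, Microsoft's documentation describes the performance-counter frequency as fixed at system boot, so caching a single QueryPerformanceFrequency result before the loop is generally expected to be safe even on machines that throttle the CPU clock. The arithmetic itself is just a 64-bit subtraction and a division. Below is a minimal C++ sketch of that step only; the function names are hypothetical, and the tick values are assumed to come from QueryPerformanceCounter and QueryPerformanceFrequency (the QuadPart field of LARGE_INTEGER) on Windows.

```cpp
#include <cstdint>

// Hypothetical helper (not from the thread): convert two raw 64-bit
// counter readings into elapsed seconds. ticksPerSecond is assumed to be
// the value reported once by QueryPerformanceFrequency.
double elapsedSeconds(std::int64_t startTicks,
                      std::int64_t endTicks,
                      std::int64_t ticksPerSecond)
{
    // Subtract in 64-bit integers first, then convert to floating point,
    // so large absolute counter values do not cost any precision.
    return static_cast<double>(endTicks - startTicks) /
           static_cast<double>(ticksPerSecond);
}

// Hypothetical helper: average frames per second over a measured interval.
// Returns 0.0 when no time has elapsed, to avoid dividing by zero.
double framesPerSecond(std::int64_t frameCount,
                       std::int64_t startTicks,
                       std::int64_t endTicks,
                       std::int64_t ticksPerSecond)
{
    const double seconds = elapsedSeconds(startTicks, endTicks, ticksPerSecond);
    return seconds > 0.0 ? static_cast<double>(frameCount) / seconds : 0.0;
}
```

With this shape, only the counter needs to be read inside the render loop; the frequency is read once and passed in.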

Lonewolff · April 12, 2018, 07:31:44 PM

Quote from: Siekmanski on April 12, 2018, 07:18:29 PM
Are the message pump loops the same for ASM and C++?

Yep, making sure I keep the code the same so we are comparing apples with apples.

Lonewolff · April 12, 2018, 07:40:24 PM

Just changed all of the libs to the 2017 SDK versions and the frame rate is now on par with the C++ version.

Couldn't compile with the 2017 ML.exe as it is complaining about invalid operands on a couple of my calls. Will look a bit closer to see if I am doing something wrong on that front.

Siekmanski · April 12, 2018, 07:47:24 PM

Just curious, was it d3d11.lib?

Lonewolff · April 12, 2018, 07:57:42 PM

Was already using d3d11.lib from the SDK.

Copied the others across - gdi32.Lib, kernel32.Lib, and user32.Lib.

Not sure why the new version of ML.exe doesn't like the project though. Something to do with the coinvoke macro?

Quote
error A2070:invalid instruction operands coinvoke(16): Macro Called From project.asm(228): Main Line Code
error A2070:invalid instruction operands coinvoke(16): Macro Called From project.asm(233): Main Line Code
error A2070:invalid instruction operands coinvoke(16): Macro Called From project.asm(238): Main Line Code

The corresponding lines of code;

Code:

(line 228) coinvoke d3dDevice, ID3D11Device, CreateVertexShader, addr vertexShaderData, SIZEOFvertexShaderData, NULL, addr d3dVertexShader

(line 233) coinvoke d3dDevice, ID3D11Device, CreatePixelShader, addr pixelShaderData, SIZEOFpixelShaderData, NULL, addr d3dPixelShader

(line 238) coinvoke d3dDevice, ID3D11Device, CreateInputLayout, addr inputDescP, 1, addr vertexShaderData, SIZEOFvertexShaderData, addr d3dInputLayout

[edit]
Worked it out. The new assembler doesn't like the way I am doing 'sizeof'. I'll sort that out another day.

Siekmanski · April 12, 2018, 08:10:21 PM

ole32.lib perhaps?

(SIZEOF vertexShaderData)

Lonewolff · April 12, 2018, 08:21:56 PM

Quote from: Siekmanski on April 12, 2018, 08:10:21 PM
ole32.lib perhaps?

(SIZEOF vertexShaderData)

Yeah, 'SIZEOF vertexShaderData' doesn't work because the declaration is multi-line, so SIZEOF only counts the bytes on the first db line. (The shader is hard-coded at present.)

Code:

vertexShaderData		db						68,88,66,67,166,109,78,113,107,98,65,70,91,88,250,161,103,22,241,76,1,0,0,0,16,2,0,0,6,0,0,0,56,0,0,0,156,0,0,0,224,0,0,0,92,1,0,0
						db						168,1,0,0,220,1,0,0,65,111,110,57,92,0,0,0,92,0,0,0,0,2,254,255,52,0,0,0,40,0,0,0,0,0,36,0,0,0,36,0,0,0,36,0,0,0,36,0,1
						db						0,36,0,0,0,0,0,1,2,254,255,31,0,0,2,5,0,0,128,0,0,15,144,4, 0,0,4,0,0,3,192,0,0,255,144,0,0,228,160,0,0,228,144,1,0,0,2,0,0
						db						12,192,0,0,228,144,255,255,0,0,83,72,68,82,60,0,0,0,64,0,1,0,15,0,0,0,95,0,0,3,242,16,16,0,0,0,0,0,103,0,0,4,242,32,16,0,0,0,0
						db						0,1,0,0,0,54,0,0,5,242,32,16,0,0,0,0,0,70,30,16,0,0,0,0,0,62,0,0,1,83,84,65,84,116,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0
						db						2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
						db						0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
						db						0,0,0,0,0,0,82,68,69,70,68,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,28,0,0,0,0,4,254,255,0,1,0,0,28,0,0,0,77,105,99,114,111,115,111
						db						102,116,32,40,82,41,32,72,76,83,76,32,83,104,97,100,101,114,32,67,111,109,112,105,108,101,114,32,49,48,46,49,0,73,83,71,78,44,0,0,0,1,0,0,0,8,0,0,0
						db						32,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,15,15,0,0,80,79,83,73,84,73,79,78,0,171,171,171,79,83,71,78,44,0,0,0,1,0,0,0,8
						db						0,0,0,32,0,0,0,0,0,0,0,1,0,0,0,3,0,0,0,0,0,0,0,15,0,0,0,83,86,95,80,79,83,73,84,73,79,78,0
SIZEOFvertexShaderData	EQU						$-vertexShaderData

So this is how I am calculating 'sizeof' until I code a better solution.

New compiler doesn't like that very much, where the old one seems ok with it.

Project doesn't link to ole32.lib.

jj2007 · April 12, 2018, 08:30:27 PM

An old problem with recent Micros**t assemblers. Try this:

Code:

mov ecx, SIZEOFvertexShaderData
coinvoke d3dDevice, ID3D11Device, CreateVertexShader, addr vertexShaderData, ecx, NULL, addr d3dVertexShader

If that doesn't work:

Code:

vertexShaderData        db 68 ....
vertexShaderDataEnd     db 0
...
mov ecx, offset vertexShaderDataEnd
sub ecx, offset vertexShaderData
coinvoke d3dDevice, ID3D11Device, CreateVertexShader, addr vertexShaderData, ecx, NULL, addr d3dVertexShader

Siekmanski · April 12, 2018, 08:36:57 PM

I thought your data was in a structure member; then you could use (sizeof vertexShaderData).
Your solution should work too.

Lonewolff · April 12, 2018, 08:41:59 PM

Thanks JJ2007, a couple of things to try.

Nah Siekmanski, my data is all nasty and flapping about at the moment - LOL

Siekmanski · April 12, 2018, 08:52:41 PM

-> Project doesn't link to ole32.lib.

CoInitialize and CoUninitialize do need ole32.lib, needed for COM.
