Interesting speed observations between ASM and C/C++

Started by Jason, November 06, 2024, 05:27:50 PM


Jason

Quote from: NoCforMe on November 06, 2024, 07:50:50 PM
Just off the top of my head here: since you're just setting 16 DWORD elements here, if they're in a contiguous block, you could do something like this:

; First set 4 elements to MATRIX_REAL_ONE:
      MOV  EAX, MATRIX_REAL_ONE
      MOV  ECX, 4
      MOV  EDI, mat
      REP  STOSD

; Now set the remainder to MATRIX_REAL_ZERO:
      MOV  EAX, MATRIX_REAL_ZERO
      MOV  ECX, 12
      REP  STOSD

Of course, if they're not in 2 contiguous blocks like I'm assuming here, you'll have to do something different. You could zero the whole thing out first, then just set those 4 elements to MATRIX_REAL_ONE.

Interestingly enough, this method turns out to be slower than my initial implementation: it runs at about 60% of the speed of filling the memory locations one by one.
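
For anyone who wants to try the other suggestion from the quote (zero the whole thing, then set the four ones), here is a minimal C sketch. `Mat4` and both function names are hypothetical stand-ins, not the code benchmarked in this thread, and it assumes the 16 elements are 32-bit IEEE-754 floats in one contiguous 4x4 block with the ones on the diagonal.

```c
#include <string.h>

/* Hypothetical stand-in for the matrix discussed above:
   16 floats stored as one contiguous 4x4 block. */
typedef struct { float m[4][4]; } Mat4;

/* Variant A: zero the whole block, then patch the four diagonal elements. */
static void mat4_identity_zero_then_patch(Mat4 *out)
{
    memset(out, 0, sizeof *out);  /* all-bits-zero is 0.0f for IEEE-754 floats */
    out->m[0][0] = 1.0f;
    out->m[1][1] = 1.0f;
    out->m[2][2] = 1.0f;
    out->m[3][3] = 1.0f;
}

/* Variant B: store every element individually, one by one. */
static void mat4_identity_one_by_one(Mat4 *out)
{
    out->m[0][0] = 1.0f; out->m[0][1] = 0.0f; out->m[0][2] = 0.0f; out->m[0][3] = 0.0f;
    out->m[1][0] = 0.0f; out->m[1][1] = 1.0f; out->m[1][2] = 0.0f; out->m[1][3] = 0.0f;
    out->m[2][0] = 0.0f; out->m[2][1] = 0.0f; out->m[2][2] = 1.0f; out->m[2][3] = 0.0f;
    out->m[3][0] = 0.0f; out->m[3][1] = 0.0f; out->m[3][2] = 0.0f; out->m[3][3] = 1.0f;
}
```

Both variants leave the same bytes in memory, so any speed difference comes purely from how the stores are issued.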

NoCforMe

Before we go further, let me ask you: do you really have a need for speed here?
I ask because there's kind of an obsession with speed for its own sake here among some programmers. Sometimes this takes the form of assembly-language pissing contests: "my code's faster than your code!" (Well, not really: everyone's quite polite and gracious about it. But still.)

My argument, take it for what it's worth, is that unless you're writing an app that's so computation-bound that a speedup is critical, the obsession with speed is kinda misplaced. I mean, if you're writing a routine that gets called while waiting for user input, what's the rush?

Of course, if you have a legitimate need for speed, then simply disregard what I wrote here.
Assembly language programming should be fun. That's why I do it.

Jason

Hey there! Very valid argument and I agree.

In my case I am looking at every avenue I can take to make my DirectX 11 renderer as efficient as possible, and I am making great progress in doing so, being able to render a ridiculous number of sprites per frame.

It's more a challenge against myself than anything. I'm not out here to say "my code is faster than yours"; I'm more interested in the quirks of how things can be made quicker (or more compact, depending on my mood on the day  :biggrin:; I know compactness and speed often don't go hand in hand).

Just a hobby for me.  :thumbsup:

NoCforMe

Quote from: Jason on November 07, 2024, 11:54:00 AM
In my case I am looking at every avenue I can take to make my DirectX 11 renderer as efficient as possible, and I am making great progress in doing so, being able to render a ridiculous number of sprites per frame.

That's a perfectly valid reason to aim for speed.

Quote
Just a hobby for me.  :thumbsup:

Same here.

Jason

Good fun and brain frying at the same time.  :biggrin:


zedd151

Quote from: NoCforMe on November 07, 2024, 12:06:58 PM
Quote from: Jason on November 07, 2024, 11:54:00 AM
In my case I am looking at every avenue I can take to make my DirectX 11 renderer as efficient as possible, and I am making great progress in doing so, being able to render a ridiculous number of sprites per frame.

That's a perfectly valid reason to aim for speed.


Quote from: Jason on November 07, 2024, 12:14:55 PM
Good fun and brain frying at the same time.  :biggrin:
:biggrin:

Sounds like you are indeed having some fun. (or phun?)
DirectX has always been beyond my skill set.
Windows Ten is the best.  :azn:

Jason

Quote from: zedd151 on November 07, 2024, 03:22:01 PM
Sounds like you are indeed having some fun. (or phun?)
DirectX has always been beyond my skill set.

 :biggrin:

Maybe this is my calling? Aim for the smallest DirectX 11 application that I can achieve.

I'm certainly up against it beating the compiler for speed, with all of the shenanigans it pulls.

You got me thinking now.  :thumbsup:

NoCforMe


Jason

Direct 3D. I must admit I've never even touched Direct 2D as Direct 3D can do anything 2D anyway.  :biggrin:

TimoVJL

Which DX function was used for the tests?

XMMatrixIdentity() or D3DXMatrixIdentity(D3DXMATRIX *pout)

Perhaps static data is simpler to use.
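
A minimal sketch of the static-data idea in C, assuming a hypothetical `Mat4` type in place of the real DirectX matrix types: keep one constant identity matrix around and copy it wherever one is needed. The names `MAT4_IDENTITY` and `mat4_set_identity` are made up for illustration.

```c
/* Hypothetical 4x4 float matrix; a stand-in, not the real D3DXMATRIX. */
typedef struct { float m[4][4]; } Mat4;

/* One read-only identity matrix, initialised at compile time. */
static const Mat4 MAT4_IDENTITY = {{
    { 1.0f, 0.0f, 0.0f, 0.0f },
    { 0.0f, 1.0f, 0.0f, 0.0f },
    { 0.0f, 0.0f, 1.0f, 0.0f },
    { 0.0f, 0.0f, 0.0f, 1.0f }
}};

/* Copy the static identity into *out with a plain struct assignment;
   the compiler decides how to move the 64 bytes. */
static void mat4_set_identity(Mat4 *out)
{
    *out = MAT4_IDENTITY;
}
```

The copy is a single 64-byte block move, so there is no per-element logic left to tune by hand.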


This is a deprecated function:
static inline D3DXMATRIX* D3DXMatrixIdentity(D3DXMATRIX *pout)
{
    if ( !pout ) return NULL;
    D3DX_U(*pout).m[0][1] = 0.0f;
    D3DX_U(*pout).m[0][2] = 0.0f;
    D3DX_U(*pout).m[0][3] = 0.0f;
    D3DX_U(*pout).m[1][0] = 0.0f;
    D3DX_U(*pout).m[1][2] = 0.0f;
    D3DX_U(*pout).m[1][3] = 0.0f;
    D3DX_U(*pout).m[2][0] = 0.0f;
    D3DX_U(*pout).m[2][1] = 0.0f;
    D3DX_U(*pout).m[2][3] = 0.0f;
    D3DX_U(*pout).m[3][0] = 0.0f;
    D3DX_U(*pout).m[3][1] = 0.0f;
    D3DX_U(*pout).m[3][2] = 0.0f;
    D3DX_U(*pout).m[0][0] = 1.0f;
    D3DX_U(*pout).m[1][1] = 1.0f;
    D3DX_U(*pout).m[2][2] = 1.0f;
    D3DX_U(*pout).m[3][3] = 1.0f;
    return pout;
}
May the source be with you

Jason

I used a few of them against various XMMatrix* functions.

In debug mode my Asm code demolished the DirectX functions, but in release mode, the roles were reversed.
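
A rough sketch of the kind of timing loop that can be built once in debug and once in release to see this flip. Everything here is an assumption for illustration: `Mat4`, `mat4_identity` and `time_identity` are hypothetical names, the standard `clock()` stands in for whatever timer was actually used, and this is not the code behind the numbers above.

```c
#include <time.h>

/* Hypothetical 4x4 float matrix used only for this sketch. */
typedef struct { float m[4][4]; } Mat4;

/* Routine under test: writes an identity matrix element by element. */
static void mat4_identity(Mat4 *out)
{
    int r, c;
    for (r = 0; r < 4; ++r)
        for (c = 0; c < 4; ++c)
            out->m[r][c] = (r == c) ? 1.0f : 0.0f;
}

/* Calls mat4_identity `iterations` times and returns the elapsed
   CPU time in seconds as reported by clock(). */
static double time_identity(long iterations)
{
    Mat4 m;
    volatile float sink = 0.0f;  /* volatile store keeps the loop itself alive */
    clock_t start, end;
    long i;

    start = clock();
    for (i = 0; i < iterations; ++i) {
        mat4_identity(&m);
        /* An optimizer may still simplify the matrix stores here,
           which is part of what a release build ends up measuring. */
        sink = m.m[i & 3][i & 3];
    }
    end = clock();

    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `time_identity` with a large iteration count in each build configuration gives two numbers to compare; `clock()` is coarse, so very short runs are not meaningful.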