Fast SIMD transpose routines

Started by Siekmanski, June 25, 2018, 11:35:37 AM


nidud

#30
deleted

Siekmanski

No need for rearranging the regs if you include the memory reads and writes in the speed test.
Creative coders use backward thinking techniques as a strategy.

nidud

#32
deleted

nidud

#33
deleted

Siekmanski

Quote from: nidud on July 14, 2018, 09:35:26 AM
Well, I'm writing vector call tests at the moment where values are kept in registers over multiple calls, so the thinking and implementation is a bit different.

OK, that makes sense.

Quote
As for AVX, in this case there doesn't seem to be (as you pointed out) any speed improvement apart from saving regs. There is also VMOVHLPS, which can be used in the same way.

You must have confused me with someone else; I have never done an AVX version.
But I will try it out some day.

I remember that we can replace VSHUFPS or VPERM2F128 with VBLENDPS instructions.
AVX shuffles are executed only on port 5, while blends are also executed on port 0.
VPERM2F128 instructions are not that fast.

Maybe we can get some gain out of it.

I will look this up.

EDIT: Found it, it's in chapter 12 section 11.1
http://members.home.nl/siekmanski/Intel_Optimization_Reference_Manual_248966-037.pdf

Siekmanski

Quote from: nidud on July 14, 2018, 10:37:03 AM
Simpler and faster..

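    ; assumed input: rows a, b, c, d in xmm0..xmm3, lowest element listed first
    vunpckhps xmm4,xmm2,xmm3    ; xmm4 = c2 d2 c3 d3
    vunpcklps xmm2,xmm2,xmm3    ; xmm2 = c0 d0 c1 d1
    vunpckhps xmm3,xmm0,xmm1    ; xmm3 = a2 b2 a3 b3
    vunpcklps xmm1,xmm0,xmm1    ; xmm1 = a0 b0 a1 b1
    vmovlhps  xmm0,xmm1,xmm2    ; xmm0 = a0 b0 c0 d0 (column 0)
    vmovhlps  xmm1,xmm2,xmm1    ; xmm1 = a1 b1 c1 d1 (column 1)
    vmovlhps  xmm2,xmm3,xmm4    ; xmm2 = a2 b2 c2 d2 (column 2)
    vmovhlps  xmm3,xmm4,xmm3    ; xmm3 = a3 b3 c3 d3 (column 3)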


I timed this AVX piece on my computer; it's a little slower than the SSE version.
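
For anyone who wants to check the data movement outside an assembler, the eight-instruction sequence quoted above can be mirrored with SSE intrinsics in C. This is only a sketch under assumptions: the function names `transpose4x4_sse` and `transpose4x4_matches_scalar` are made up for illustration, the matrix is assumed to be row-major, and unaligned loads and stores are used so any buffer works. It shows the shuffle pattern, not the timings discussed in this thread.

```c
#include <xmmintrin.h>  /* SSE intrinsics: unpack, movelh, movehl */

/* Hypothetical helper: transpose a row-major 4x4 float matrix
   using the same unpack + movlhps/movhlps pattern as the quoted code. */
void transpose4x4_sse(const float *src, float *dst)
{
    __m128 r0 = _mm_loadu_ps(src + 0);   /* a0 a1 a2 a3 */
    __m128 r1 = _mm_loadu_ps(src + 4);   /* b0 b1 b2 b3 */
    __m128 r2 = _mm_loadu_ps(src + 8);   /* c0 c1 c2 c3 */
    __m128 r3 = _mm_loadu_ps(src + 12);  /* d0 d1 d2 d3 */

    __m128 t0 = _mm_unpacklo_ps(r0, r1); /* a0 b0 a1 b1 */
    __m128 t1 = _mm_unpackhi_ps(r0, r1); /* a2 b2 a3 b3 */
    __m128 t2 = _mm_unpacklo_ps(r2, r3); /* c0 d0 c1 d1 */
    __m128 t3 = _mm_unpackhi_ps(r2, r3); /* c2 d2 c3 d3 */

    /* movelh(x, y) = x0 x1 y0 y1; movehl(x, y) = y2 y3 x2 x3 */
    _mm_storeu_ps(dst + 0,  _mm_movelh_ps(t0, t2)); /* a0 b0 c0 d0 */
    _mm_storeu_ps(dst + 4,  _mm_movehl_ps(t2, t0)); /* a1 b1 c1 d1 */
    _mm_storeu_ps(dst + 8,  _mm_movelh_ps(t1, t3)); /* a2 b2 c2 d2 */
    _mm_storeu_ps(dst + 12, _mm_movehl_ps(t3, t1)); /* a3 b3 c3 d3 */
}

/* Hypothetical self-check: compare against a plain scalar transpose.
   Returns 1 when every element matches, 0 otherwise. */
int transpose4x4_matches_scalar(void)
{
    float in[16], out[16];
    for (int i = 0; i < 16; ++i)
        in[i] = (float)i;

    transpose4x4_sse(in, out);

    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            if (out[c * 4 + r] != in[r * 4 + c])
                return 0;
    return 1;
}
```

Calling `transpose4x4_matches_scalar()` from a test program is a quick way to confirm the lane order before swapping in blends or other shuffles; whether the compiler emits the VEX-encoded forms depends on the build flags.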

nidud

#36
deleted

daydreamer

Great work.
So does this work well for feeding D3D9 with loads of different matrices?
My non-asm creations
https://masm32.com/board/index.php?topic=6937.msg74303#msg74303
I am an Invoker
"An Invoker is a mage who specializes in the manipulation of raw and elemental energies."
Like SIMD coding