Author Topic: SSE Math Library  (Read 4193 times)

Farabi

  • Member
  • ****
  • Posts: 970
  • Neuroscience Fans
SSE Math Library
« on: June 16, 2012, 06:22:27 PM »
Code:


; 4-component single-precision vertex (16 bytes)
fsmVERTEX32 struct
    X real4 0.
    Y real4 0.
    Z real4 0.
    W real4 0.
fsmVERTEX32 ends


.code

; Dest = A - B, component-wise
fsmVecSub proc uses esi edi lpVDest:dword,lpVA:dword,lpVB:dword

    mov esi,lpVA
    mov edi,lpVB
    mov eax,lpVDest

    movups xmm0,[esi]       ; unaligned load of A
    movups xmm1,[edi]       ; unaligned load of B
    subps xmm0,xmm1
    movups [eax],xmm0       ; unaligned store of the result

    ret
fsmVecSub endp

; Dest = A + B, component-wise
fsmVecAdd proc uses esi edi lpVDest:dword,lpVA:dword,lpVB:dword

    mov esi,lpVA
    mov edi,lpVB
    mov eax,lpVDest

    movups xmm0,[esi]
    movups xmm1,[edi]
    addps xmm0,xmm1
    movups [eax],xmm0

    ret
fsmVecAdd endp

; Dest = A * B, component-wise (not a dot or cross product)
fsmVecMul proc uses esi edi lpVDest:dword,lpVA:dword,lpVB:dword

    mov esi,lpVA
    mov edi,lpVB
    mov eax,lpVDest

    movups xmm0,[esi]
    movups xmm1,[edi]
    mulps xmm0,xmm1
    movups [eax],xmm0

    ret
fsmVecMul endp


Has anyone done this already? If no one has, I'll do it.
« Last Edit: June 16, 2012, 07:47:42 PM by Farabi »
http://farabidatacenter.url.ph/MySoftware/
My 3D Game Engine Demo.

Contact me at Whatsapp: 6283818314165

Farabi

Re: SSE Math Library
« Reply #1 on: June 16, 2012, 07:02:44 PM »
After reading the SSE1-4 documentation, I think the FPU will stay on the chip for a long time, unless SSE gets geometry calculation instructions.

qWord

  • Member
  • *****
  • Posts: 1454
  • The base type of a type is the type itself
    • SmplMath macros
Re: SSE Math Library
« Reply #2 on: June 16, 2012, 10:55:03 PM »
Wrapping basic instructions in a function or even a macro does not make much sense. Also, you are using the unaligned moves (movups), which are slower than their aligned counterparts (movaps) on many processors.
A library should add functions that are not available through SSEx, e.g. sin, arcsin, exp, ln, ...
Some time back, Clive pointed out a good book about this topic: "Math Toolkit for Real-Time Development", Jack W. Crenshaw
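To make the aligned/unaligned point concrete, here is a minimal sketch of an aligned variant of the add routine. The name fsmVecAddA is made up for this example and is not part of the posted library, and the sketch assumes the caller guarantees that all three pointers are 16-byte aligned.

```asm
; Sketch only: fsmVecAddA is a hypothetical name, not part of the posted code.
; Assumes SSE instructions are enabled (e.g. .686 and .xmm) and that lpVDest,
; lpVA and lpVB all point to 16-byte aligned data. movaps, and addps with a
; memory operand, fault on a misaligned address.
fsmVecAddA proc lpVDest:dword,lpVA:dword,lpVB:dword

    mov ecx,lpVA
    mov edx,lpVB
    mov eax,lpVDest

    movaps xmm0,[ecx]       ; aligned load of A
    addps xmm0,[edx]        ; add B straight from (aligned) memory
    movaps [eax],xmm0       ; aligned store of A + B

    ret
fsmVecAddA endp
```

Whether this is measurably faster than the movups version depends on the CPU and on whether the data really is aligned, so it should be timed rather than assumed.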
MREAL macros - when you need floating point arithmetic while assembling!

RuiLoureiro

  • Member
  • ****
  • Posts: 671
Re: SSE Math Library
« Reply #3 on: June 16, 2012, 11:34:54 PM »
Farabi,
Is it meant to add, sub, mul etc. only 2 elements, or an array of elements?

Farabi

Re: SSE Math Library
« Reply #4 on: June 17, 2012, 01:38:31 AM »
Quote from: RuiLoureiro
Farabi,
Is it meant to add, sub, mul etc. only 2 elements, or an array of elements?

Yeah, only for 2 VERTEX structures, where the members X, Y, Z, W are real4 numbers.

Farabi

Re: SSE Math Library
« Reply #5 on: June 17, 2012, 01:40:30 AM »
Quote from: qWord
Wrapping basic instructions in a function or even a macro does not make much sense. Also, you are using the unaligned moves (movups), which are slower than their aligned counterparts (movaps) on many processors.
A library should add functions that are not available through SSEx, e.g. sin, arcsin, exp, ln, ...
Some time back, Clive pointed out a good book about this topic: "Math Toolkit for Real-Time Development", Jack W. Crenshaw

Well, I think the library should save some time for some people. If anyone has done it before, I would just be wasting time reinventing it.

Anyway, the code above is only part of it. I'm not done yet.

dedndave

  • Member
  • *****
  • Posts: 8734
  • Still using Abacus 2.0
    • DednDave
Re: SSE Math Library
« Reply #6 on: June 17, 2012, 03:21:39 AM »
your code might be useful to show us newbies how to do it   :P

but, i think what qWord and Rui are alluding to is that the advantage of SSE is "pipelining" operations
so - if you can do a bunch of something (a la Henry Ford), you get an advantage
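
As a rough illustration of doing "a bunch of something" in one call, the sketch below adds two whole arrays of vertices in a loop. The name fsmVecAddArray and the nCount parameter are hypothetical and not from the posted library; it keeps the unaligned moves of the original so the arrays need no special alignment.

```asm
; Sketch only: fsmVecAddArray and nCount are hypothetical, not from the posted code.
; Computes Dest[i] = A[i] + B[i] for nCount elements of 16 bytes each
; (one fsmVERTEX32 per element). Assumes SSE is enabled (e.g. .686 and .xmm).
fsmVecAddArray proc uses esi edi lpVDest:dword,lpVA:dword,lpVB:dword,nCount:dword

    mov esi,lpVA
    mov edi,lpVB
    mov eax,lpVDest
    mov ecx,nCount
    test ecx,ecx
    jz vaa_done             ; nothing to do for an empty array

vaa_loop:
    movups xmm0,[esi]       ; load A[i]
    movups xmm1,[edi]       ; load B[i]
    addps xmm0,xmm1
    movups [eax],xmm0       ; store Dest[i]
    add esi,16              ; advance all three pointers by one vertex
    add edi,16
    add eax,16
    dec ecx
    jnz vaa_loop

vaa_done:
    ret
fsmVecAddArray endp
```

With this shape the call and prologue overhead is paid once per array instead of once per vertex, which is where a per-vertex wrapper tends to lose its advantage.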
« Last Edit: June 17, 2012, 06:56:29 PM by dedndave »

Farabi

Re: SSE Math Library
« Reply #7 on: June 17, 2012, 11:53:12 AM »
Ouch, my code is slower than the original "Hitchckr" vertex code.