The MASM Forum

General => The Laboratory => Topic started by: Farabi on June 16, 2012, 06:22:27 PM

Title: SSE Math Library
Post by: Farabi on June 16, 2012, 06:22:27 PM
Code:


fsmVERTEX32 struct
X real4 0.
Y real4 0.
Z real4 0.
W real4 0.
fsmVERTEX32 ends


.code

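; [lpVDest] = [lpVA] - [lpVB], component-wise on X, Y, Z, W
fsmVecSub proc uses esi edi lpVDest:dword,lpVA:dword,lpVB:dword

    mov esi,lpVA
    mov edi,lpVB
    mov eax,lpVDest

    movups xmm0,[esi]       ; load A (unaligned load)
    movups xmm1,[edi]       ; load B
    subps xmm0,xmm1         ; xmm0 = A - B, four real4 lanes at once
    movups [eax],xmm0       ; store the result to lpVDest

    ret
fsmVecSub endp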

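; [lpVDest] = [lpVA] + [lpVB], component-wise on X, Y, Z, W
fsmVecAdd proc uses esi edi lpVDest:dword,lpVA:dword,lpVB:dword

    mov esi,lpVA
    mov edi,lpVB
    mov eax,lpVDest

    movups xmm0,[esi]       ; load A (unaligned load)
    movups xmm1,[edi]       ; load B
    addps xmm0,xmm1         ; xmm0 = A + B, four real4 lanes at once
    movups [eax],xmm0       ; store the result to lpVDest

    ret
fsmVecAdd endp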

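; [lpVDest] = [lpVA] * [lpVB], component-wise on X, Y, Z, W
fsmVecMul proc uses esi edi lpVDest:dword,lpVA:dword,lpVB:dword

    mov esi,lpVA
    mov edi,lpVB
    mov eax,lpVDest

    movups xmm0,[esi]       ; load A (unaligned load)
    movups xmm1,[edi]       ; load B
    mulps xmm0,xmm1         ; xmm0 = A * B, four real4 lanes at once
    movups [eax],xmm0       ; store the result to lpVDest

    ret
fsmVecMul endp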


Has anyone done this already? If no one has, I'll do it.
Title: Re: SSE Math Library
Post by: Farabi on June 16, 2012, 07:02:44 PM
After reading the SSE1-4 documentation, I think the FPU will stay on the chip for a long time, unless SSE gains geometry calculation instructions.
Title: Re: SSE Math Library
Post by: qWord on June 16, 2012, 10:55:03 PM
Wrapping basic instructions in a function or even a macro does not make much sense. Also, you are using the unaligned moves (movups), which are slower than their aligned counterpart (movaps).
A library should add functions that are not available through SSEx, e.g. sin, arcsin, exp, ln, ...
Some time back, Clive pointed out a good book about this topic: "Math Toolkit for Real-Time Development", Jack W. Crenshaw
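
The aligned-versus-unaligned point can be sketched as follows. This is only a minimal illustration, not part of the posted library: the name fsmVecSubA is hypothetical, and the routine assumes that all three pointers are 16-byte aligned. movaps faults on a misaligned address, so this variant is only safe when the caller guarantees that alignment.

```asm
; Hypothetical aligned variant of fsmVecSub (name is an assumption).
; ASSUMPTION: lpVDest, lpVA and lpVB all point to 16-byte aligned data.
fsmVecSubA proc lpVDest:dword,lpVA:dword,lpVB:dword

    mov ecx,lpVA            ; eax, ecx, edx need not be preserved,
    mov edx,lpVB            ; so no "uses" clause is required
    mov eax,lpVDest

    movaps xmm0,[ecx]       ; aligned load of A
    movaps xmm1,[edx]       ; aligned load of B
    subps xmm0,xmm1         ; xmm0 = A - B, four real4 lanes at once
    movaps [eax],xmm0       ; aligned store of the result

    ret
fsmVecSubA endp
```

The call overhead is still there, so this mainly shows the instruction swap; whether it is measurably faster depends on the CPU.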
Title: Re: SSE Math Library
Post by: RuiLoureiro on June 16, 2012, 11:34:54 PM
Farabi,
            Is it to add, sub, mul etc. only 2 elements
            or an array of elements ?
Title: Re: SSE Math Library
Post by: Farabi on June 17, 2012, 01:38:31 AM
Quote from: RuiLoureiro on June 16, 2012, 11:34:54 PM
Farabi,
            Is it to add, sub, mul etc. only 2 elements
            or an array of elements ?

Yeah, only for 2 vertices, where the members are the X, Y, Z, W real4 numbers.
Title: Re: SSE Math Library
Post by: Farabi on June 17, 2012, 01:40:30 AM
Quote from: qWord on June 16, 2012, 10:55:03 PM
Wrapping basic instructions in a function or even a macro does not make much sense. Also, you are using the unaligned moves (movups), which are slower than their aligned counterpart (movaps).
A library should add functions that are not available through SSEx, e.g. sin, arcsin, exp, ln, ...
Some time back, Clive pointed out a good book about this topic: "Math Toolkit for Real-Time Development", Jack W. Crenshaw

Well, I think the library should save some people some time. If anyone has done it before, I'd just be wasting time reinventing it.

Anyway, the code above is only part of it. I'm not done yet.
Title: Re: SSE Math Library
Post by: dedndave on June 17, 2012, 03:21:39 AM
your code might be useful to show us newbies how to do it   :P

but, i think what qWord and Rui are alluding to is that the advantage of SSE is "pipelining" operations
so - if you can do a bunch of something (a la Henry Ford), you get an advantage
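
The "do a bunch of something" idea could look roughly like the sketch below: one call walks an array of fsmVERTEX32 structures, so the call and setup cost is paid once instead of once per vertex. The names fsmVecAddArray and nCount are hypothetical and not part of the posted code; the unaligned moves are kept so that no alignment is assumed.

```asm
; Hypothetical array version of fsmVecAdd (names are assumptions).
; Adds nCount vertices: lpVDest[i] = lpVA[i] + lpVB[i]
fsmVecAddArray proc uses esi edi lpVDest:dword,lpVA:dword,lpVB:dword,nCount:dword

    mov esi,lpVA
    mov edi,lpVB
    mov eax,lpVDest
    mov ecx,nCount
    test ecx,ecx
    jz va_done              ; nothing to do for an empty array

va_next:
    movups xmm0,[esi]       ; load A[i]
    movups xmm1,[edi]       ; load B[i]
    addps xmm0,xmm1         ; four real4 additions at once
    movups [eax],xmm0       ; store Dest[i]
    add esi,16              ; advance by sizeof fsmVERTEX32 (4 x real4)
    add edi,16
    add eax,16
    dec ecx
    jnz va_next

va_done:
    ret
fsmVecAddArray endp
```

With a loop like this, timing against the single-vertex calls would show how much of the cost is the call itself rather than the SSE instructions.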
Title: Re: SSE Math Library
Post by: Farabi on June 17, 2012, 11:53:12 AM
Ouch, my code is slower than the original "Hitchckr" vertex code.