Title: SSE Math Library
Post by: Farabi on June 16, 2012, 06:22:27 PM




fsmVERTEX32 struct
	X real4 0.
	Y real4 0.
	Z real4 0.
	W real4 0.
fsmVERTEX32 ends


.code

fsmVecSub proc uses esi edi lpVDest:dword,lpVA:dword,lpVB:dword
	
	mov esi,lpVA
	mov edi,lpVB
	mov eax,lpVDest
	
	movups xmm0,[esi]
	movups xmm1,[edi]
	subps xmm0,xmm1
	movups [eax],xmm0

	
	ret
fsmVecSub endp

fsmVecAdd proc uses esi edi lpVDest:dword,lpVA:dword,lpVB:dword
	
	mov esi,lpVA
	mov edi,lpVB
	mov eax,lpVDest
	
	movups xmm0,[esi]
	movups xmm1,[edi]
	addps xmm0,xmm1
	movups [eax],xmm0

	
	ret
fsmVecAdd endp

fsmVecMul proc uses esi edi lpVDest:dword,lpVA:dword,lpVB:dword
	
	mov esi,lpVA
	mov edi,lpVB
	mov eax,lpVDest
	
	movups xmm0,[esi]
	movups xmm1,[edi]
	mulps xmm0,xmm1
	movups [eax],xmm0

	
	ret
fsmVecMul endp

Has anyone done this already? If no one has, I'll do it.

Title: Re: SSE Math Library
Post by: Farabi on June 16, 2012, 07:02:44 PM

After reading the SSE1-4 documentation, I think the FPU will stay on the chip for a long time, unless SSE gets geometry calculation instructions.

Title: Re: SSE Math Library
Post by: qWord on June 16, 2012, 10:55:03 PM

Wrapping basic instructions in a function, or even a macro, does not make much sense. Also, you are using the unaligned moves (movups), which are slower than their aligned counterparts (movaps).
A library should add functions that are not available through SSEx, e.g. sin, arcsin, exp, ln, ...
Some time back, Clive pointed out a good book on this topic: "Math Toolkit for Real-Time Programming" by Jack W. Crenshaw.
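
A minimal sketch of the aligned variant referred to above. The name fsmVecAddA is hypothetical and not part of Farabi's code, and the routine assumes that all three pointers are 16-byte aligned, which the caller has to guarantee:

```asm
; Hypothetical aligned variant of fsmVecAdd (not from the original post).
; Assumption: lpVDest, lpVA and lpVB all point to 16-byte aligned memory.
; movaps raises a general-protection fault on a misaligned address.
fsmVecAddA proc lpVDest:dword,lpVA:dword,lpVB:dword

	mov eax,lpVA
	mov ecx,lpVB
	mov edx,lpVDest		; eax, ecx, edx need no saving under stdcall

	movaps xmm0,[eax]	; load A.X..A.W with one aligned access
	movaps xmm1,[ecx]	; load B.X..B.W
	addps xmm0,xmm1		; four single-precision adds at once
	movaps [edx],xmm0	; store the four sums

	ret
fsmVecAddA endp
```

Because only eax, ecx and edx are touched, the "uses esi edi" clause and its push/pop pairs can be dropped as well.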

Title: Re: SSE Math Library
Post by: RuiLoureiro on June 16, 2012, 11:34:54 PM

Farabi,
Is it meant to add, sub, mul, etc. only 2 elements,
or an array of elements?

Title: Re: SSE Math Library
Post by: Farabi on June 17, 2012, 01:38:31 AM

Quote from: RuiLoureiro on June 16, 2012, 11:34:54 PM
Farabi,
Is it meant to add, sub, mul, etc. only 2 elements,
or an array of elements?

Yeah, only for 2 vertices, where each member (X, Y, Z, W) is a real4 number.

Title: Re: SSE Math Library
Post by: Farabi on June 17, 2012, 01:40:30 AM

Quote from: qWord on June 16, 2012, 10:55:03 PM
Wrapping basic instructions in a function, or even a macro, does not make much sense. Also, you are using the unaligned moves (movups), which are slower than their aligned counterparts (movaps).
A library should add functions that are not available through SSEx, e.g. sin, arcsin, exp, ln, ...
Some time back, Clive pointed out a good book on this topic: "Math Toolkit for Real-Time Programming" by Jack W. Crenshaw.

Well, I think the library should save some people some time. If anyone has done it before, I would just be wasting time reinventing it.

Anyway, the code above is only part of it. I'm not done yet.

Title: Re: SSE Math Library
Post by: dedndave on June 17, 2012, 03:21:39 AM

your code might be useful to show us newbies how to do it :P

but, i think what qWord and Rui are alluding to is that the advantage of SSE is "pipelining" operations
so - if you can do a bunch of something (a la Henry Ford), you get an advantage
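
a rough sketch of the "do a bunch of something" idea, assuming two arrays of fsmVERTEX32 of equal length. the name fsmVecAddArray and the nCount parameter are made up for illustration and are not part of Farabi's code:

```asm
; Hypothetical batch version (not from the original thread): adds nCount
; fsmVERTEX32 elements of array B to array A and stores the sums in Dest.
; Unaligned moves are kept, so no alignment assumption is needed.
fsmVecAddArray proc uses esi edi lpVDest:dword,lpVA:dword,lpVB:dword,nCount:dword

	mov esi,lpVA
	mov edi,lpVB
	mov eax,lpVDest
	mov ecx,nCount
	test ecx,ecx
	jz vec_done		; nothing to do for an empty array

vec_loop:
	movups xmm0,[esi]
	movups xmm1,[edi]
	addps xmm0,xmm1
	movups [eax],xmm0
	add esi,sizeof fsmVERTEX32	; advance all three pointers by 16 bytes
	add edi,sizeof fsmVERTEX32
	add eax,sizeof fsmVERTEX32
	dec ecx
	jnz vec_loop

vec_done:
	ret
fsmVecAddArray endp
```

this way the call overhead and the argument loads are paid once per array instead of once per vertex.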

Title: Re: SSE Math Library
Post by: Farabi on June 17, 2012, 11:53:12 AM

Ouch, my code is slower than the original "Hitchckr" vertex code.

The MASM Forum

General => The Laboratory => Topic started by: Farabi on June 16, 2012, 06:22:27 PM