Hi
I already had an idea back when writing 32-bit SSE code: prepare for 64-bit by declaring xmm8-xmm15 as variables in the data section.
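A rough sketch of that idea, written in C with SSE intrinsics rather than assembly so it can be compiled anywhere. The names vxmm8, vxmm9 and scale_bias4 are made up for illustration only. In real 32-bit assembly the variables would simply be 16-byte-aligned entries in the data section used as memory operands, and in a 64-bit build they would be replaced by the real registers. Note that a C compiler does its own register allocation, so this only shows the pattern, not the actual register usage.

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_load_ps, _mm_store_ps, _mm_mul_ps, _mm_add_ps */

/* Hypothetical stand-ins for xmm8 and xmm9: 16-byte-aligned variables in the data section. */
_Alignas(16) float vxmm8[4] = { 2.0f, 2.0f, 2.0f, 2.0f };  /* scale factors */
_Alignas(16) float vxmm9[4] = { 1.0f, 1.0f, 1.0f, 1.0f };  /* bias values   */

/* out[i] = in[i] * scale + bias for four floats.
   Both in and out must be 16-byte aligned because aligned loads/stores are used. */
void scale_bias4(const float *in, float *out)
{
    __m128 v     = _mm_load_ps(in);
    __m128 scale = _mm_load_ps(vxmm8);  /* memory operand in 32-bit code, would be xmm8 in 64-bit */
    __m128 bias  = _mm_load_ps(vxmm9);  /* memory operand in 32-bit code, would be xmm9 in 64-bit */

    v = _mm_add_ps(_mm_mul_ps(v, scale), bias);
    _mm_store_ps(out, v);
}
```

Calling scale_bias4 on an aligned array of four floats multiplies each element by the value parked in vxmm8 and adds the value in vxmm9. When porting, only the two loads from the stand-in variables need to change.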
Historically, 32-bit code has made use of all 8 general-purpose registers, so 8 more xmm registers might speed up some SSE code.
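Also, the 64-bit API is different: it uses four xmm registers for float arguments (xmm0-xmm3 in the Win64 calling convention). Switching the code to use the right xmm registers from the start might make it more efficient, instead of needing a few movaps before and inside the proc to move values into the right registers.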
This thread is about taking the best SSE/SSE2 code and porting it to 64-bit, with suggestions for which of the best code snippets to include in the 64-bit SDK as examples.