Quick test with the GSL, picking this line arbitrarily...

` int 3`

Print Str$("\ns_variance= \t %5f", gsl_stats_variance(ecx, 1, ebx))

... and hitting F7 a few times:

`Address Hex dump Command Comments`

630E7AB0 ┌> └55 push ebp

630E7AB1 │. 8BEC mov ebp, esp

...

630E7ACC │. F20F1045 F8 movsd xmm0, [ebp-8] ; <<<< SIMD #######

630E7AD1 │. F20F110424 movsd [esp], xmm0 ; ┌Arg4_5

630E7AD6 │. 53 push ebx ; │Arg3 => [ARG.3]

630E7AD7 │. FF75 0C push dword ptr [ebp+0C] ; │Arg2 => [ARG.2]

630E7ADA │. FF75 08 push dword ptr [ebp+8] ; │Arg1 => [ARG.1]

630E7ADD │. E8 AECAFFFF call 630E4590 ; └gsl.630E4590

630E7AE2 │. 660F6ECB movd xmm1, ebx ; <<<< SIMD #######

630E7AE6 │. 8BC3 mov eax, ebx

630E7AE8 │. F30FE6C9 cvtdq2pd xmm1, xmm1 ; <<<< SIMD #######

630E7AEC │. C1E8 1F shr eax, 1F

630E7AEF │. 83C4 14 add esp, 14

630E7AF2 │. F20F580CC5 00B912 addsd xmm1, [eax*8+6312B900

630E7AFB │. 8D43 FF lea eax, [ebx-1]

630E7AFE │. 660F6EC0 movd xmm0, eax

630E7B02 │. F30FE6C0 cvtdq2pd xmm0, xmm0 ; <<<< SIMD #######

630E7B06 │. C1E8 1F shr eax, 1F

630E7B09 │. 5B pop ebx

630E7B0A │. F20F5804C5 00B912 addsd xmm0, [eax*8+6312B900

630E7B13 │. F20F5EC8 divsd xmm1, xmm0 ; <<<< SIMD #######

630E7B17 │. F20F114D F8 movsd [ebp-8], xmm1 ; <<<< SIMD #######

630E7B1C │. DC4D F8 fmul qword ptr [ebp-8]

630E7B1F │. 8BE5 mov esp, ebp

630E7B21 │. 5D pop ebp

630E7B22 └. C3 retn

That's the 32-bit version, of course. But it seems obvious that the guys have had the same idea before. I know that you have a problem with the GSL, but this stuff is so complicated: why reinvent the wheel if a bunch of real mathematicians have already done the job and given you over 1,000 functions ready to be linked in?
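For anybody who doesn't want to trace the listing by hand, here is roughly what I read out of it, written as plain C. Take it as a sketch, not as the GSL source: `sample_variance` and `variance_about_mean` are names I made up, the helper only stands in for whatever sits behind `call 630E4590`, and I am assuming the mean gets computed in the part I cut out with `...`. What the listing itself does show is the tail: `ebx` (the count n) and `ebx-1` are converted to doubles with `movd`/`cvtdq2pd` (the `addsd` from a small table looks like the usual fix-up for unsigned values), `divsd` forms n/(n-1), and the `fmul` scales the value that the call left on the FPU stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper standing in for the routine behind "call 630E4590":
   variance around a given mean, with divisor n. */
static double variance_about_mean(const double *data, size_t stride,
                                  size_t n, double mean)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double delta = data[i * stride] - mean;
        sum += delta * delta;
    }
    return sum / (double)n;
}

/* Hypothetical stand-in taking the same three arguments as
   gsl_stats_variance(data, stride, n). */
double sample_variance(const double *data, size_t stride, size_t n)
{
    assert(n > 1);  /* n - 1 is the divisor below */

    /* Assumed: the mean is computed in the elided part of the listing. */
    double mean = 0.0;
    for (size_t i = 0; i < n; i++)
        mean += data[i * stride];
    mean /= (double)n;

    double var_n = variance_about_mean(data, stride, n, mean);

    /* This is the movd / cvtdq2pd / divsd / fmul tail of the listing. */
    return var_n * ((double)n / (double)(n - 1));
}
```

With a stride of 2 the function simply skips every second element, which matches the `stride` argument in the Print line above.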

There is a discussion comparing Eigen, GSL and others at ResearchGate:

But if you want speed, GSL is the best choice: http://www.gnu.org/software/gsl/

A comparison between Boost and GSL says the latter is much faster (and Boost is a behemoth, too).

There is a good overview at University of Utah, CHPC - Research Computing Support for the University: Math Libraries

We had a long thread about Yeppp!, and while I just confirmed that this snippet still builds and runs fine (provided you use ML 6.15 ... 10.0 - one of the rare cases where UAsm is not compatible...), it also turns out that the latest DLL doesn't use SIMD instructions. Yeppp! looks pretty dead.

And then there is the Intel MKL, and guess what? We tested it already.