I am tinkering with Pelles C and stumbled over a hilariously complicated implementation of log(x). The timings are accordingly slow :biggrin:
Intel(R) Celeron(R) M CPU 420 @ 1.60GHz (SSE3)
loop overhead is approx. 189/100 cycles
6357 cycles for 100 * log Asm
26583 cycles for 100 * log CRT
21191 cycles for 100 * logC Math.h
6343 cycles for 100 * log Asm
26624 cycles for 100 * log CRT
21163 cycles for 100 * logC Math.h
6351 cycles for 100 * log Asm
26614 cycles for 100 * log CRT
21157 cycles for 100 * logC Math.h
13 bytes for log Asm
24 bytes for log CRT
23 bytes for logC Math.h
R8=-5.54677872584653642 for t log Asm
R8=-5.54677872584653642 for t log CRT
R8=-5.54677872584653642 for t logC Math.h
P.S.: Build requires linking LogC.obj (included) and \Masm32\PellesC\Lib\crt.lib (not included)
Hi Jochen,
here are my results.
Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (SSE4)
++++++++++++++++++2 of 20 tests valid, loop overhead is approx. 191/100 cycles
5012 cycles for 100 * log Asm
16923 cycles for 100 * log Asm with checks
21842 cycles for 100 * log CRT
22318 cycles for 100 * logC Math.h
11258 cycles for 100 * log Asm
18792 cycles for 100 * log Asm with checks
15586 cycles for 100 * log CRT
19173 cycles for 100 * logC Math.h
10486 cycles for 100 * log Asm
16812 cycles for 100 * log Asm with checks
21828 cycles for 100 * log CRT
22309 cycles for 100 * logC Math.h
13 bytes for log Asm
19 bytes for log Asm with checks
24 bytes for log CRT
23 bytes for logC Math.h
R8=-5.54677872584653642 for t log Asm
R8=-5.54677872584653642 for t log Asm with checks
R8=-5.54677872584653642 for t log CRT
R8=-5.54677872584653642 for t logC Math.h
--- ok ---
I can't rebuild the EXE because I don't have crt.lib. But transcendental functions are often awkwardly implemented by compiler builders, and time-consuming, too.
Gunther
18 of 20 ? - sounds borg to me :P
prescott w/htt
Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)
++18 of 20 tests valid, loop overhead is approx. 239/100 cycles
10014 cycles for 100 * log Asm
81822 cycles for 100 * log Asm with checks
84241 cycles for 100 * log CRT
47662 cycles for 100 * logC Math.h
11296 cycles for 100 * log Asm
82449 cycles for 100 * log Asm with checks
82746 cycles for 100 * log CRT
54064 cycles for 100 * logC Math.h
9925 cycles for 100 * log Asm
77921 cycles for 100 * log Asm with checks
90875 cycles for 100 * log CRT
47641 cycles for 100 * logC Math.h
13 bytes for log Asm
19 bytes for log Asm with checks
24 bytes for log CRT
23 bytes for logC Math.h
R8=-5.54677872584653642 for t log Asm
R8=-5.54677872584653642 for t log Asm with checks
R8=-5.54677872584653642 for t log CRT
R8=-5.54677872584653642 for t logC Math.h
It seems that a series for ln(x) = 2*artanh((x-1)/(x+1)) is used - really not that efficient, even if the FPU is already used...
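For anyone who wants to see what that identity looks like in code, here is a minimal C sketch. It is not the Pelles C CRT source (I haven't seen it); the function name log_artanh_series, the range reduction via frexp and the fixed term count are all my own assumptions, chosen only to illustrate ln(x) = 2*artanh(z) with z = (x-1)/(x+1) and artanh(z) = z + z^3/3 + z^5/5 + ...

```c
#include <math.h>

/* Hypothetical helper, for illustration only: ln(x) for finite x > 0
   using ln(x) = 2*artanh((x-1)/(x+1)). No checks for x <= 0, NaN or Inf. */
double log_artanh_series(double x)
{
    const double LN2   = 0.69314718055994531;  /* ln(2) */
    const double SQRTH = 0.70710678118654752;  /* sqrt(0.5) */
    int e;
    double m, z, z2, term, sum;
    int k;

    /* Range reduction: x = m * 2^e with m in [sqrt(0.5), sqrt(2)),
       so that |z| stays below about 0.172 and the series converges fast. */
    m = frexp(x, &e);            /* m in [0.5, 1) */
    if (m < SQRTH) {
        m *= 2.0;
        e -= 1;
    }

    z    = (m - 1.0) / (m + 1.0);
    z2   = z * z;
    term = z;                    /* holds z^k for odd k */
    sum  = 0.0;

    /* artanh(z) = z + z^3/3 + z^5/5 + ...; 15 terms are plenty here
       because z^2 is below 0.03 after the reduction. */
    for (k = 1; k <= 29; k += 2) {
        sum  += term / k;
        term *= z2;
    }

    return 2.0 * sum + e * LN2;  /* ln(x) = ln(m) + e*ln(2) */
}
```

Even with the argument reduced like this, the loop costs a division per term plus the frexp call, which gives a feel for why a series-based routine can end up several times slower than a few FPU instructions.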