Topic: 1/x timings for FPU and SIMD code

jj2007

« on: June 23, 2018, 05:22:40 AM »
Normally, FPU code is not slower than the equivalent SIMD code, but for 1/x, rcpss clearly beats fdiv:
Code:
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)

3168    cycles for 1000 * rcpss
13049   cycles for 1000 * fdiv

3167    cycles for 1000 * rcpss
13135   cycles for 1000 * fdiv

3181    cycles for 1000 * rcpss
13259   cycles for 1000 * fdiv

3189    cycles for 1000 * rcpss
13092   cycles for 1000 * fdiv

3156    cycles for 1000 * rcpss
13070   cycles for 1000 * fdiv

24      bytes for rcpss
23      bytes for fdiv

ST0     123453440.0000000000
ST0     123456792.0000000000

Of course, the precision of rcpss is lower; the expected value is 123456789.0.
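To put the two printed values in context, here is a minimal Python sketch (not part of the original test bed). It assumes the first ST0 line belongs to rcpss and the second to fdiv, in the same order as the timings, and that both are single-precision values. The helper name `to_float32` is made up for illustration. Note that the posted rcpss value is the result of 1000 successive applications, so the comparison with the documented single-instruction error bound is only indicative.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest IEEE-754 single (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

expected = 123456789.0
rcpss_result = 123453440.0   # first ST0 line in the output above (assumed: rcpss)
fdiv_result = 123456792.0    # second ST0 line in the output above (assumed: fdiv)

# A REAL4 has a 24-bit significand, so near 1.2e8 neighbouring values are 8 apart.
nearest_single = to_float32(expected)

# Relative deviation of the rcpss result from the expected value.
rcpss_rel_err = abs(rcpss_result - expected) / expected

# Intel documents a maximum relative error of 1.5 * 2^-12 for a single rcpss.
rcpss_bound = 1.5 * 2.0 ** -12

print(nearest_single)   # 123456792.0
print(rcpss_rel_err)
print(rcpss_bound)
```

Under these assumptions the fdiv value is simply 123456789 rounded to the nearest single, while the rcpss value is off by roughly 2.7e-5 in relative terms.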

The source:
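Code:
NameA equ rcpss ; assign a descriptive name here
TestA proc
  mov ebx, AlgoLoops-1 ; loop 1000x
  push 123456789       ; integer test value
  fild stack           ; load it as an integer ("stack" is not plain MASM; presumably a MasmBasic equate for the dword at [esp])
  fstp stack           ; store it back as a REAL4
  pop eax
  movd xmm0, eax       ; xmm0 = 123456789.0 (single precision)
  align 4
  .Repeat
    rcpss xmm0, xmm0   ; approximate 1/x; each pass works on the previous result
    dec ebx
  .Until Sign?
  movd eax, xmm0
  ret
TestA endp

align_64
NameB equ fdiv ; assign a descriptive name here
TestB proc
  mov ebx, AlgoLoops-1 ; loop 1000x
  push 123456789
  fild stack
  fstp stack           ; [esp] = 123456789.0 as a REAL4
  fld1                 ; ST(0) = 1.0
  align 4
  .Repeat
    fld stack          ; ST(0) = x, ST(1) = 1.0
    fdiv ST, ST(1)     ; ST(0) = ST(0)/ST(1), i.e. x/1.0; fdivr ST, ST(1) would give 1.0/x
    fstp stack
    dec ebx
  .Until Sign?
  fstp st
  pop eax
  ret
TestB endp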

Siekmanski

« Reply #1 on: June 23, 2018, 05:58:07 AM »
Code:
Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz (SSE4)

3868    cycles for 1000 * rcpss
15813   cycles for 1000 * fdiv

3868    cycles for 1000 * rcpss
15826   cycles for 1000 * fdiv

3874    cycles for 1000 * rcpss
15808   cycles for 1000 * fdiv

3875    cycles for 1000 * rcpss
15808   cycles for 1000 * fdiv

3868    cycles for 1000 * rcpss
15821   cycles for 1000 * fdiv

24      bytes for rcpss
23      bytes for fdiv

ST0     123453440.0000000000
ST0     123456792.0000000000

zedd151

« Reply #2 on: June 23, 2018, 06:09:51 AM »
Code:
AMD A6-9220e RADEON R4, 5 COMPUTE CORES 2C+3G   (SSE4)

2564    cycles for 1000 * rcpss
17517   cycles for 1000 * fdiv

2552    cycles for 1000 * rcpss
16796   cycles for 1000 * fdiv

2822    cycles for 1000 * rcpss
16735   cycles for 1000 * fdiv

2566    cycles for 1000 * rcpss
16124   cycles for 1000 * fdiv

2636    cycles for 1000 * rcpss
16808   cycles for 1000 * fdiv

24      bytes for rcpss
23      bytes for fdiv

ST0     123453440.0000000000
ST0     123456792.0000000000

--- ok ---

1.60 GHz, as usual.

Yuri

« Reply #3 on: June 23, 2018, 02:29:40 PM »
Code:
Intel(R) Core(TM) i3 CPU         540  @ 3.07GHz (SSE4)

928     cycles for 1000 * rcpss
12260   cycles for 1000 * fdiv

903     cycles for 1000 * rcpss
12142   cycles for 1000 * fdiv

889     cycles for 1000 * rcpss
12072   cycles for 1000 * fdiv

893     cycles for 1000 * rcpss
12114   cycles for 1000 * fdiv

894     cycles for 1000 * rcpss
12059   cycles for 1000 * fdiv

24      bytes for rcpss
23      bytes for fdiv

ST0     123453440.0000000000
ST0     123456792.0000000000

--- ok ---

jimg

« Reply #4 on: June 23, 2018, 03:21:00 PM »
It shouldn't be a surprise, since you're executing three instructions per loop iteration for the FPU (fld, fdiv, fstp) vs. one instruction for SIMD (rcpss).
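A rough per-iteration view of the numbers posted in this thread may help here. The sketch below uses only the first run from each post, with the CPU labels abbreviated; the names `timings`, `per_iteration` and `ratios` are made up for illustration. Keep in mind that the fdiv figure covers the whole loop body (fld, fdiv, fstp plus loop overhead), not the division alone.

```python
# First run of each posted result: (cycles for 1000 * rcpss, cycles for 1000 * fdiv)
timings = {
    "i5-2450M": (3168, 13049),
    "i7-4930K": (3868, 15813),
    "A6-9220e": (2564, 17517),
    "i3 540":   (928, 12260),
}
LOOPS = 1000  # "loop 1000x" in the posted source

def per_iteration(cycles, loops=LOOPS):
    """Average cycles spent in one pass through the timed loop."""
    return cycles / loops

# How many times slower the fdiv loop is than the rcpss loop on each CPU.
ratios = {cpu: fdiv / rcpss for cpu, (rcpss, fdiv) in timings.items()}

for cpu, (rcpss, fdiv) in timings.items():
    print(f"{cpu:10} rcpss {per_iteration(rcpss):6.2f}  "
          f"fdiv {per_iteration(fdiv):6.2f}  ratio {ratios[cpu]:5.1f}x")
```

On these figures the fdiv loop costs about 12 to 18 cycles per iteration against roughly 1 to 4 for rcpss, so the gap is larger than the instruction count alone would suggest on some of the CPUs.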

Mikl__

« Reply #5 on: Today at 10:59:23 AM »
Hi, jj2007!
Code:
fild stack
At first I was even delighted with this non-standard way of accessing the top of the FPU, but
Code:
tut_02.asm(8) : error A2006:undefined symbol : stack