Passing args on the stack: what is fastest?

Started by jj2007, December 09, 2015, 05:43:42 AM

FORTRANS

pre-P4 (SSE1)

608 cycles for 100 * pop retadd, pop arg, push retadd
517 cycles for 100 * pop retadd, pop arg, jmp retadd
711 cycles for 100 * mov eax, arg/ret
1519 cycles for 100 * push esi edi ebx ecx
2150 cycles for 100 * pushad
809 cycles for 100 * popretadd, 2 args
504 cycles for 100 * 2 args via reg

610 cycles for 100 * pop retadd, pop arg, push retadd
518 cycles for 100 * pop retadd, pop arg, jmp retadd
711 cycles for 100 * mov eax, arg/ret
1519 cycles for 100 * push esi edi ebx ecx
2145 cycles for 100 * pushad
810 cycles for 100 * popretadd, 2 args
504 cycles for 100 * 2 args via reg

613 cycles for 100 * pop retadd, pop arg, push retadd
517 cycles for 100 * pop retadd, pop arg, jmp retadd
711 cycles for 100 * mov eax, arg/ret
1521 cycles for 100 * push esi edi ebx ecx
2146 cycles for 100 * pushad
811 cycles for 100 * popretadd, 2 args
504 cycles for 100 * 2 args via reg

608 cycles for 100 * pop retadd, pop arg, push retadd
517 cycles for 100 * pop retadd, pop arg, jmp retadd
711 cycles for 100 * mov eax, arg/ret
1538 cycles for 100 * push esi edi ebx ecx
2148 cycles for 100 * pushad
814 cycles for 100 * popretadd, 2 args
504 cycles for 100 * 2 args via reg

608 cycles for 100 * pop retadd, pop arg, push retadd
530 cycles for 100 * pop retadd, pop arg, jmp retadd
711 cycles for 100 * mov eax, arg/ret
1518 cycles for 100 * push esi edi ebx ecx
2148 cycles for 100 * pushad
810 cycles for 100 * popretadd, 2 args
516 cycles for 100 * 2 args via reg

11 bytes for pop retadd, pop arg, push retadd
11 bytes for pop retadd, pop arg, jmp retadd
15 bytes for mov eax, arg/ret
31 bytes for push esi edi ebx ecx
27 bytes for pushad
13 bytes for popretadd, 2 args
15 bytes for 2 args via reg


--- ok ---

sinsi

As usual, AMD says "up yours"

AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G (SSE4)

539     cycles for 100 * pop retadd, pop arg, push retadd
752     cycles for 100 * pop retadd, pop arg, jmp retadd
610     cycles for 100 * mov eax, arg/ret
1070    cycles for 100 * push esi edi ebx ecx
1998    cycles for 100 * pushad
718     cycles for 100 * popretadd, 2 args
330     cycles for 100 * 2 args via reg

548     cycles for 100 * pop retadd, pop arg, push retadd
610     cycles for 100 * pop retadd, pop arg, jmp retadd
615     cycles for 100 * mov eax, arg/ret
1091    cycles for 100 * push esi edi ebx ecx
1996    cycles for 100 * pushad
735     cycles for 100 * popretadd, 2 args
338     cycles for 100 * 2 args via reg

557     cycles for 100 * pop retadd, pop arg, push retadd
557     cycles for 100 * pop retadd, pop arg, jmp retadd
631     cycles for 100 * mov eax, arg/ret
1078    cycles for 100 * push esi edi ebx ecx
2010    cycles for 100 * pushad
739     cycles for 100 * popretadd, 2 args
339     cycles for 100 * 2 args via reg

546     cycles for 100 * pop retadd, pop arg, push retadd
722     cycles for 100 * pop retadd, pop arg, jmp retadd
649     cycles for 100 * mov eax, arg/ret
1085    cycles for 100 * push esi edi ebx ecx
1982    cycles for 100 * pushad
760     cycles for 100 * popretadd, 2 args
339     cycles for 100 * 2 args via reg

568     cycles for 100 * pop retadd, pop arg, push retadd
746     cycles for 100 * pop retadd, pop arg, jmp retadd
614     cycles for 100 * mov eax, arg/ret
1078    cycles for 100 * push esi edi ebx ecx
1998    cycles for 100 * pushad
735     cycles for 100 * popretadd, 2 args
356     cycles for 100 * 2 args via reg

11      bytes for pop retadd, pop arg, push retadd
11      bytes for pop retadd, pop arg, jmp retadd
15      bytes for mov eax, arg/ret
31      bytes for push esi edi ebx ecx
27      bytes for pushad
13      bytes for popretadd, 2 args
15      bytes for 2 args via reg


jj2007


TWell

AMD Athlon(tm) II X2 220 Processor (SSE3)

636     cycles for 100 * pop retadd, pop arg, push retadd
783     cycles for 100 * pop retadd, pop arg, jmp retadd
430     cycles for 100 * mov eax, arg/ret
968     cycles for 100 * push esi edi ebx ecx
1478    cycles for 100 * pushad
856     cycles for 100 * popretadd, 2 args
428     cycles for 100 * 2 args via reg

632     cycles for 100 * pop retadd, pop arg, push retadd
426     cycles for 100 * pop retadd, pop arg, jmp retadd
429     cycles for 100 * mov eax, arg/ret
969     cycles for 100 * push esi edi ebx ecx
1477    cycles for 100 * pushad
856     cycles for 100 * popretadd, 2 args
429     cycles for 100 * 2 args via reg

632     cycles for 100 * pop retadd, pop arg, push retadd
981     cycles for 100 * pop retadd, pop arg, jmp retadd
431     cycles for 100 * mov eax, arg/ret
968     cycles for 100 * push esi edi ebx ecx
1480    cycles for 100 * pushad
856     cycles for 100 * popretadd, 2 args
431     cycles for 100 * 2 args via reg

632     cycles for 100 * pop retadd, pop arg, push retadd
427     cycles for 100 * pop retadd, pop arg, jmp retadd
428     cycles for 100 * mov eax, arg/ret
969     cycles for 100 * push esi edi ebx ecx
1477    cycles for 100 * pushad
857     cycles for 100 * popretadd, 2 args
472     cycles for 100 * 2 args via reg

632     cycles for 100 * pop retadd, pop arg, push retadd
426     cycles for 100 * pop retadd, pop arg, jmp retadd
428     cycles for 100 * mov eax, arg/ret
968     cycles for 100 * push esi edi ebx ecx
1477    cycles for 100 * pushad
857     cycles for 100 * popretadd, 2 args
428     cycles for 100 * 2 args via reg

11      bytes for pop retadd, pop arg, push retadd
11      bytes for pop retadd, pop arg, jmp retadd
15      bytes for mov eax, arg/ret
31      bytes for push esi edi ebx ecx
27      bytes for pushad
13      bytes for popretadd, 2 args
15      bytes for 2 args via reg

Grincheux

AMD Athlon(tm) II X2 250 Processor (SSE3)

1463    cycles for 100 * pop retadd, pop arg, push retadd
428     cycles for 100 * pop retadd, pop arg, jmp retadd
438     cycles for 100 * mov eax, arg/ret
979     cycles for 100 * push esi edi ebx ecx
1489    cycles for 100 * pushad
857     cycles for 100 * popretadd, 2 args
496     cycles for 100 * 2 args via reg

1438    cycles for 100 * pop retadd, pop arg, push retadd
426     cycles for 100 * pop retadd, pop arg, jmp retadd
428     cycles for 100 * mov eax, arg/ret
970     cycles for 100 * push esi edi ebx ecx
1490    cycles for 100 * pushad
869     cycles for 100 * popretadd, 2 args
428     cycles for 100 * 2 args via reg

1331    cycles for 100 * pop retadd, pop arg, push retadd
1218    cycles for 100 * pop retadd, pop arg, jmp retadd
428     cycles for 100 * mov eax, arg/ret
979     cycles for 100 * push esi edi ebx ecx
1488    cycles for 100 * pushad
866     cycles for 100 * popretadd, 2 args
439     cycles for 100 * 2 args via reg

642     cycles for 100 * pop retadd, pop arg, push retadd
427     cycles for 100 * pop retadd, pop arg, jmp retadd
440     cycles for 100 * mov eax, arg/ret
984     cycles for 100 * push esi edi ebx ecx
1557    cycles for 100 * pushad
867     cycles for 100 * popretadd, 2 args
439     cycles for 100 * 2 args via reg

769     cycles for 100 * pop retadd, pop arg, push retadd
427     cycles for 100 * pop retadd, pop arg, jmp retadd
429     cycles for 100 * mov eax, arg/ret
979     cycles for 100 * push esi edi ebx ecx
1491    cycles for 100 * pushad
866     cycles for 100 * popretadd, 2 args
430     cycles for 100 * 2 args via reg

11      bytes for pop retadd, pop arg, push retadd
11      bytes for pop retadd, pop arg, jmp retadd
15      bytes for mov eax, arg/ret
31      bytes for push esi edi ebx ecx
27      bytes for pushad
13      bytes for popretadd, 2 args
15      bytes for 2 args via reg


--- ok ---
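
Since each test is repeated five times and some runs contain obvious outliers (for example 1463 vs. 642 cycles for the same variant on the Athlon II X2 250), it may help to compare the minimum and median per variant rather than single runs. The following is only a rough sketch, not part of the original testbed: the function name `summarize` and the regular expression are assumptions made for illustration, and the sample text is copied from the two `pop retadd, pop arg` variants in the Athlon II X2 250 output above.

```python
import re
from collections import defaultdict
from statistics import median

# Matches lines such as "1463    cycles for 100 * pop retadd, pop arg, push retadd"
# (hypothetical pattern, written to fit the output format shown in this thread)
LINE = re.compile(r"^\s*(\d+)\s+cycles for 100 \* (.+?)\s*$")

def summarize(text):
    """Return {variant label: (min cycles, median cycles)} over all runs in text."""
    runs = defaultdict(list)
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            runs[m.group(2)].append(int(m.group(1)))
    return {label: (min(v), median(v)) for label, v in runs.items()}

# Five runs of two variants, taken from the Athlon II X2 250 results above
sample = """\
1463    cycles for 100 * pop retadd, pop arg, push retadd
428     cycles for 100 * pop retadd, pop arg, jmp retadd
1438    cycles for 100 * pop retadd, pop arg, push retadd
426     cycles for 100 * pop retadd, pop arg, jmp retadd
1331    cycles for 100 * pop retadd, pop arg, push retadd
1218    cycles for 100 * pop retadd, pop arg, jmp retadd
642     cycles for 100 * pop retadd, pop arg, push retadd
427     cycles for 100 * pop retadd, pop arg, jmp retadd
769     cycles for 100 * pop retadd, pop arg, push retadd
427     cycles for 100 * pop retadd, pop arg, jmp retadd
"""

for label, (lo, med) in summarize(sample).items():
    print(f"{label}: min {lo}, median {med} cycles per 100 calls")
```

On that sample the `push retadd` variant shows a large gap between minimum and median, while the `jmp retadd` variant stays near 427 cycles apart from one outlier, so the minimum and the median can tell different stories on a noisy machine.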