Author Topic: Need for Speed - C++ versus Assembly Language  (Read 1015 times)

aw27

  • Member
  • **
  • Posts: 243
Re: Need for Speed - C++ versus Assembly Language
« Reply #45 on: April 21, 2017, 09:39:09 PM »
Like x64 fastcall calling convention, many algorithms use fewer than 3 args on 32 bit and can be run as FASTCALL.
Except for very small functions, FASTCALL will end up making the code slower.
Before the call you have to load the registers with data, and inside the function you have to save the registers' contents somewhere because you need the registers for other things. It is a waste of cycles, like putting the car keys in your pocket to cross the room and place them on another table.
The same applies to x64, although it has more registers to play with. There it is not called FASTCALL anymore, since there is no other convention. Ah yes, there is Vectorcall, but it has the same problem.
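
A minimal C++ sketch of the spilling argument, for illustration only. The names `scale`, `leaf_sum` and `nonleaf_sum` are made up, and whether a given compiler really spills depends on the target, the optimization level and register pressure, so the comments describe what typically happens rather than a guarantee.

```cpp
#include <cstdint>

// Keep the helper out of line so the call really takes place.
// These attributes are toolchain-specific (an assumption about your compiler).
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define NOINLINE __declspec(noinline)
#else
#define NOINLINE
#endif

// Hypothetical helper standing in for the "other things" a function has to do.
NOINLINE std::int64_t scale(std::int64_t x) { return x * 3; }

// Leaf function: the arguments can be consumed directly from the registers
// they arrive in, so register passing costs nothing extra here.
std::int64_t leaf_sum(std::int64_t a, std::int64_t b, std::int64_t c) {
    return a + b + c;
}

// Non-leaf function: b and c are still needed after the call to scale(),
// so they typically have to be kept in callee-saved registers (which must
// themselves be saved) or written to the stack across the call.
std::int64_t nonleaf_sum(std::int64_t a, std::int64_t b, std::int64_t c) {
    std::int64_t s = scale(a);
    return s + b + c;
}
```

Building this with optimizations on and reading the disassembly of both functions shows whether your compiler saves `b` and `c` around the call.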

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 4435
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: Need for Speed - C++ versus Assembly Language
« Reply #46 on: April 22, 2017, 12:55:49 AM »
This is indeed an unusual comment. If you don't use register passing à la the Microsoft Application Binary Interface ("rcx rdx r8 r9"), you are left with passing data by globals or old slower style STDCALL stack passing with pushes and pops. Now of course nothing is stopping you from pre-loading a number of AVX registers and calling a procedure that will use them, but you must get the arguments for a procedure somehow, and it does not happen by magic. In 32 bit you used the Intel Application Binary Interface, which was a standard PUSH/CALL technique, and you can emulate FASTCALL with up to 3 registers to keep the stack overhead down if it is a short leaf procedure.
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :biggrin:

aw27

  • Member
  • **
  • Posts: 243
Re: Need for Speed - C++ versus Assembly Language
« Reply #47 on: April 22, 2017, 02:02:34 AM »
Quote from: hutch--
"This is indeed an unusual comment,"
Not unusual.
I made a quick search on google and there was someone with the same idea:
"You don't gain anything by passing in registers if the called function immediately needs to spill everything out into memory for its own calculations."
Another one:
"How fast is this calling convention, comparing to __cdecl and __stdcall? Find out for yourselves. Set the compiler option /Gr, and compare the execution time. I didn't find __fastcall to be any faster than other calling conventions, but you may come to different conclusions."

Quote from: hutch--
"old slower style STDCALL stack passing with pushes and pops."
STDCALL is not slow anymore; it is as fast as CDECL and, in my opinion, in real life rather than in classroom examples, both are faster than FASTCALL. It sounds weird, but this is the reason FASTCALL is not widespread.

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 4435
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: Need for Speed - C++ versus Assembly Language
« Reply #48 on: April 22, 2017, 03:38:05 AM »
This tends to be why assembler programmers benchmark techniques rather than search the internet for quotations.

FASTCALL in 64 bit is the specification.

  mov rcx, handle
  mov rdx, wmsg
  mov r8,  wparam
  mov r9,  lparam
  call SendMessage
  mov retval, rax


In 32 bit STDCALL is the specification.

  push lparam
  push wparam
  push wmsg
  push handle
  call SendMessage
  mov retval, eax


It's a simple fact that registers are a lot faster than memory, and much of the design of 64 bit FASTCALL was to reduce the call overhead for the vast majority of function calls that use 4 or fewer arguments. When you don't need to twiddle the stack you reduce overhead and pick up speed. The other factor of course is "does it matter" when you are calling high level code in either libraries or DLL system functions.

Being able to save a few picoseconds making a MessageBox() call seems to be the achievement of much modern high level code design, whereas the benchmarking approach puts the effort where it matters. In high level code you pursue clarity and maintainability, whereas in low level code you design and benchmark to get the speed up. You need to do more than just twiddle compiler options; a disassembler does not tell lies.
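
One hedged way to put a number on call overhead is a small timing loop. The sketch below is plain C++ using `std::chrono`; `add4` and `time_calls` are hypothetical names, not code from the article. An optimizing compiler is free to inline `add4` or simplify the loop, so the timing only means something once the disassembly confirms the call is still there.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical routine under test; swap in whatever you actually want to measure.
std::int64_t add4(std::int64_t a, std::int64_t b, std::int64_t c, std::int64_t d) {
    return a + b + c + d;
}

// Runs `iterations` calls and returns the elapsed time in nanoseconds.
// The running total is stored through a volatile pointer so the result is
// observable and the loop cannot be removed entirely.
std::int64_t time_calls(std::int64_t iterations, volatile std::int64_t* sink) {
    const auto start = std::chrono::steady_clock::now();
    std::int64_t total = 0;
    for (std::int64_t i = 0; i < iterations; ++i) {
        total += add4(i, 1, 2, 3);
    }
    *sink = total;
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Running it several times and keeping the smallest reading reduces noise from the scheduler and from cache warm-up.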

nidud

  • Member
  • *****
  • Posts: 1228
    • https://github.com/nidud/asmc
Re: Need for Speed - C++ versus Assembly Language
« Reply #49 on: April 22, 2017, 04:06:41 AM »
 :biggrin:

If you plan on using these arguments, which is often the case, then you either use the stack or use nonvolatile registers, which have to be saved by pushing them onto the stack, so you will end up using the stack anyway.

It's also misleading to call the 64-bit calling convention fastcall, given that in addition to passing arguments in registers you also have to allocate stack space for the arguments in case you plan on using them, which is often the case ...
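
To put a figure on that stack space: in the Microsoft x64 convention the caller always reserves home ("shadow") space for the four register arguments, 8 bytes each, even when the callee takes fewer. A small C++ sketch of the arithmetic follows; `outgoing_arg_bytes` is a hypothetical helper written for this post, and it ignores the extra padding needed to keep the stack 16-byte aligned.

```cpp
#include <cstddef>

// Size in bytes of one argument slot on the stack.
constexpr std::size_t kSlotBytes = 8;
// Slots the caller always reserves as home space for rcx, rdx, r8 and r9.
constexpr std::size_t kHomeSlots = 4;

// Hypothetical helper: bytes of outgoing argument area the caller reserves
// for a callee taking `nargs` integer or pointer arguments.
constexpr std::size_t outgoing_arg_bytes(std::size_t nargs) {
    return (nargs > kHomeSlots ? nargs : kHomeSlots) * kSlotBytes;
}
```

So a call with two arguments reserves the same 32 bytes as a call with four, and a call with six reserves 48.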

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 4435
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: Need for Speed - C++ versus Assembly Language
« Reply #50 on: April 22, 2017, 04:36:58 AM »
This tends to be why you have a variety of techniques: stack frames for high level code that uses many arguments and local variables, and no stack frame for low argument counts, with direct register passing to cut overhead and improve speed.

nidud

  • Member
  • *****
  • Posts: 1228
    • https://github.com/nidud/asmc
Re: Need for Speed - C++ versus Assembly Language
« Reply #51 on: April 22, 2017, 06:17:28 AM »
Here's a simple test case.

Code:
.x64
.model flat, pascal
.code

entry proc a1:ptr, a2:ptr, a3:ptr, a4:ptr
local l1:ptr, l2:ptr, l3:ptr, l4:ptr

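; copy the four arguments into the locals
mov rax,a1
mov l1,rax
mov rax,a2
mov l2,rax
mov rax,a3
mov l3,rax
mov rax,a4
mov l4,rax

; reload the locals into the argument registers
mov rcx,l1
mov rdx,l2
mov r8,l3
mov r9,l4

; sum the four values into rax as the return value
mov rax,rcx
add rax,rdx
add rax,r8
add rax,r9

ret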
entry endp

END

Seems to be more or less the same.
Code:
total [1 .. 3], 1++
 12342242 cycles 1.asm: stdcall
 12342272 cycles 3.asm: pascal
 12342401 cycles 0.asm: fastcall
 12342405 cycles 2.asm: c
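
For scale, the gap between the fastest and slowest of those four readings is 163 cycles, roughly 0.0013%. A tiny C++ sketch of that arithmetic, with `spread_percent` as a made-up helper name:

```cpp
#include <cstdint>

// Relative spread between the fastest and slowest reading, in percent.
double spread_percent(std::int64_t fastest, std::int64_t slowest) {
    return 100.0 * static_cast<double>(slowest - fastest) / static_cast<double>(fastest);
}
```

Calling `spread_percent(12342242, 12342405)` with the stdcall and c figures above gives that value.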

felipe

  • Member
  • **
  • Posts: 115
  • Write the code, read the code, run the machine.
Re: Need for Speed - C++ versus Assembly Language
« Reply #52 on: April 25, 2017, 12:05:08 PM »

A helicopter was flying around above Seattle when an electrical malfunction disabled all of the aircraft's electronic navigation and communications equipment. Due to the clouds and haze, the pilot could not determine the helicopter's position and course to fly to the airport. The pilot saw a tall building, flew toward it, circled, drew a handwritten sign, and held it in the helicopter's window. The pilot's sign said "WHERE AM I?" in large letters. People in the tall building quickly responded to the aircraft, drew a large sign, and held it in a building window. Their sign read "YOU ARE IN A HELICOPTER." The pilot smiled, waved, looked at his map, determined the course to steer to SEATAC airport, and landed safely. After they were on the ground, the co-pilot asked the pilot how the "YOU ARE IN A HELICOPTER" sign helped determine their position. The pilot responded "I knew that had to be the Microsoft building because they gave me a technically correct, but completely useless answer."

HAHAHA! This is really funny  :lol:  :biggrin:
; A researcher...

aw27

  • Member
  • **
  • Posts: 243
Re: Need for Speed - C++ versus Assembly Language
« Reply #53 on: April 28, 2017, 10:18:41 PM »
I updated the article "Need for Speed - C++ versus Assembly Language"; it now includes C# and Free Pascal in the run as well. ASM wins hands down in both cases, particularly for C#.  :greenclp:

aw27

  • Member
  • **
  • Posts: 243
Re: Need for Speed - C++ versus Assembly Language
« Reply #54 on: May 20, 2017, 03:01:21 AM »
 :greenclp: :greenclp: :greenclp: :greenclp:

You have won CodeProject

Best C++ Article of April 2017 First Prize.

Type:     Article
Location: Need for Speed - C++ versus Assembly Language
          https://www.codeproject.com/Articles/1182676/Need-for-Speed-Cplusplus-versus-Assembly-Language

CodeProject Mug (http://www.cafepress.co.uk/codeproject.1302986) from CodeProject. Value: $14

jj2007

  • Member
  • *****
  • Posts: 6906
  • Assembler is fun ;-)
    • MasmBasic
Re: Need for Speed - C++ versus Assembly Language
« Reply #55 on: May 20, 2017, 03:42:40 AM »
Congrats, José :t

That is a big success, and worth much more than the mug ;)

Raistlin

  • Member
  • **
  • Posts: 210
Re: Need for Speed - C++ versus Assembly Language
« Reply #56 on: May 20, 2017, 05:37:10 AM »
PPS. Looking (after you conveniently ignored my initial post; sheesh, science must hurt, or else you're preaching to the converted) at real (therefore reproducible) science = metrics: it has been established that compilers cannot exceed the standard asm coder. Please rectify your populist CodeProject award-winning post.

fearless

  • Member
  • ***
  • Posts: 270
    • LetTheLightIn
Re: Need for Speed - C++ versus Assembly Language
« Reply #57 on: May 20, 2017, 05:46:52 AM »
Gonna need a pic of you standing majestically with the codeproject mug and staring off into the distance.
fearless

CM690II Case, HX1000 PSU, Asus Z97, Intel i7-4790K, Seidon 120v Cooler, 16GB DDR3, MSI GTX 980TI, Samsung 256GB + 1TB SSD, WD Black 2TB x2 + 4TB HDD, Asus 27" LCD

www.LetTheLight.in  My Github

aw27

  • Member
  • **
  • Posts: 243
Re: Need for Speed - C++ versus Assembly Language
« Reply #58 on: May 20, 2017, 04:29:54 PM »
Quote from: jj2007
"That is a big success, and worth much more than the mug ;)"
I had no idea it was worth anything.  :icon14:
« Last Edit: May 20, 2017, 10:50:40 PM by aw27 »

aw27

  • Member
  • **
  • Posts: 243
Re: Need for Speed - C++ versus Assembly Language
« Reply #59 on: May 20, 2017, 04:31:04 PM »
Quote from: fearless
"Gonna need a pic of you standing majestically with the codeproject mug and staring off into the distance."
And with a crown on my head, of course.