The MASM Forum

General => The Soap Box => Topic started by: aw27 on April 19, 2017, 03:42:37 PM

Title: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 19, 2017, 03:42:37 PM
Yesterday, I published an article at Code Project. "Need for Speed - C++ versus Assembly Language" (https://www.codeproject.com/Articles/1182676/Need-for-Speed-Cplusplus-versus-Assembly-Language)
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: guga on April 19, 2017, 04:48:15 PM
Hi Aw.

Off topic: I'm updating your matrix transposing code to optimize it. The current version is a bit slow, due to the heavy use of push/pop and add instructions. I'm trying to rewrite it to run at the proper speed.

So far I have managed to double the speed by removing the loops and the addition operations on edi. I'm currently trying to do the same for esi, and later I will review the remaining loops.

On my PC the original version took around 438 ns. Now it takes only around 220, and I believe it can reach around 130 if I manage to do the same for esi.

Once I finish, I'll post it for you.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: LordAdef on April 19, 2017, 06:04:32 PM
My guys will avenge this heresy!! C++ cannot beat a really optimized asm.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 19, 2017, 06:34:37 PM
Quote
Off topic: I'm updating your matrix transposing code to optimize it.

It is an excellent exercise.  :t
And there is also room to optimize the Fast Matrix Flip.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: Siekmanski on April 19, 2017, 07:57:56 PM
LordAdef is right, no compiler can beat 100% well optimized assembly code.
Is this a challenge ?

It's the same story over and over again on the c++ forums.

- you must be mad programming in asm.
- asm is ancient and dead.
- nobody uses it anymore.
- why use it if a compiler does a better job.
- etc.

Some of those guys are really sneaky and provoke you to write a faster routine for their own use.  :biggrin:
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 19, 2017, 08:35:26 PM
Quote
LordAdef is right, no compiler can beat 100% well optimized assembly code.
I used to hear the same thing about chess. No computer will ever be able to beat a chess grandmaster because computers have no ideas, no creativity, no positional sense, etc.
Now, every grandmaster is beaten by his smartphone running a free chess app with an Elo rating of 3000.

Quote
Is this a challenge ?

Sure, I know you are good at optimization, give it a go!


Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on April 19, 2017, 09:24:42 PM
Quote
I used to hear the same thing about chess. No computer will ever be able to beat a chess grandmaster because computers have no ideas, no creativity, no positional sense, etc.

I understand the logic, but the cases are a bit different: Chess computers win with brute force. Now I wouldn't exclude that one day compilers test various options for an innermost loop to find the fastest encoding, but I wouldn't bet on it 8)
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: Siekmanski on April 19, 2017, 09:29:41 PM
That's comparing apples with oranges.

A chess player has to respond to the actions of his opponent. ( human or computer )
A programmer already knows how the CPU will respond.

Sorry, no time to take the challenge.

The first things I would do: align the code loops, align the data in memory, and move the memory allocations outside the loops.
Or better, allocate the needed amount of memory once.
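
A minimal C++ sketch of the "allocate once" advice, not taken from the article's code. The workload (a 3-point smoothing pass) and every name in it are made up for illustration; the only point is where the scratch buffer gets allocated.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One 3-point averaging pass from `in` into `out` (hypothetical workload).
void smooth_pass(const std::vector<float>& in, std::vector<float>& out) {
    assert(in.size() == out.size());
    const std::size_t n = in.size();
    for (std::size_t i = 0; i < n; ++i) {
        const float left  = in[i == 0 ? 0 : i - 1];      // clamp at the left edge
        const float right = in[i + 1 == n ? i : i + 1];  // clamp at the right edge
        out[i] = (left + in[i] + right) / 3.0f;
    }
}

// Slow shape: the scratch buffer is allocated and freed on every pass.
std::vector<float> smooth_alloc_each_pass(std::vector<float> data, int passes) {
    for (int p = 0; p < passes; ++p) {
        std::vector<float> scratch(data.size());  // heap allocation inside the loop
        smooth_pass(data, scratch);
        data.swap(scratch);
    }
    return data;
}

// Same result, but the scratch buffer is allocated once and reused.
std::vector<float> smooth_alloc_once(std::vector<float> data, int passes) {
    std::vector<float> scratch(data.size());      // single heap allocation
    for (int p = 0; p < passes; ++p) {
        smooth_pass(data, scratch);
        data.swap(scratch);
    }
    return data;
}
```

Both functions return the same values; how much the second one gains depends on the allocator and on how small the per-pass work is, so it still has to be measured.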
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: mineiro on April 19, 2017, 09:52:40 PM
hello sir aw27, I read your article.

Processors can beat humans by brute force and sequential logic, but when we bring in parallel logic we can beat machines.
Remove the chess program's database (chess openings) and you will see how easy it is to beat any chess program with a high Elo rating. Chess programs use databases created by humans, from games between human IMs or GMs. There are more possible chess moves than stars in our universe, so how much time would a chess program spend on just the first five moves without a database, using only the rules of chess as support?

I agree with the opinions of the people here: C++ code can't beat an experienced assembly programmer. If we consider that a compiler is built by many people, working from the mathematics all the way down to the opcodes, let's do the same: put experienced assembly programmers together to work as a team, and then I know what the final answer will be. I'm saying this because we have much more freedom than a C or C++ programmer.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 19, 2017, 10:29:55 PM
Quote
Chess computers win with brute force.
The problem is that they don't win anymore by brute force alone. It would be impossible because there are trillions of possibilities.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 19, 2017, 10:36:37 PM
Quote
Remove the chess program's database (chess openings) and you will see how easy it is to beat any chess program with a high Elo rating.
Grandmasters also have encyclopedic knowledge of chess openings. The middle game is where computers take the lead, not to mention the endgame.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on April 19, 2017, 10:53:05 PM
Quote
Yesterday, I published an article at Code Project. "Need for Speed - C++ versus Assembly Language" (https://www.codeproject.com/Articles/1182676/Need-for-Speed-Cplusplus-versus-Assembly-Language)

Nice work, you put a lot of effort into it :t

Typo in point 7): In order words,... you meant in other words, I suppose.

Quote
I undertook a few optimization steps with the ASM source code and was able to improve the assembled performance by more than 30%.
...
Quote
I have seen the ASM listing produced by the C++ compiler and some parts are just mind blowing - nobody would believe a human would code that way (if he does the code would be almost impossible to maintain). The compiler uses every trick under the table in an automated way - difficult to beat. It knows about everything about how the pipelines and predictive branching work, it reorders of instructions, does loop tiling and uses cache-oblivious algorithms.

Of course, you have a point there. On the other hand, if you take the compiler's assembler listing and play around with it, you may tickle out a few % more... and prove again that assembler is faster ;-)

Just joking. Your article clearly shows that a C++ compiler can beat assembler. However, it is also true that in the Lab we have beaten the CRT many times, typically by a factor of 2-3; and one would assume that the CRT developers use the best compilers, no?
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 19, 2017, 11:20:57 PM
Quote
Typo in point 7): In order words,... you meant in other words, I suppose.
Thanks. I have a few more to fix.

Quote
if you take the compiler's assembler listing and play around with it, you may tickle out a few % more... and prove again that assembler is faster ;-)
In a real case I would feel tempted to compile the compiler's assembler listing  :biggrin:

Quote
CRT many times, typically by a factor 2-3; and one would assume that the CRT developers use the best compilers, no?

The initialization code looks at lots of things we typically don't.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: LordAdef on April 20, 2017, 04:05:26 AM
The article is really well written, congratulations.

But the asm code can be further optimized. Marinus mentioned the memory allocation, but there are those PUSH-POPs, there is the shl which is slow. If I'm not mistaken lea is slower too and could be exchanged to mov - offset (can that be done in 64?).

What I'm saying is that a guy like Marinus, Johen and others could make this asm run faster than the C++ one. In this case, the point of the article loses ground.

Another thing to consider is, if one refers to the article's title, when you really do the C++ thing (oop, those C++ crazy abstractions, etc...), C++ gets slower and slower. It's slower than pure C.

My only point here is that you are comparing a fully optimized C++ code versus a not fully optimized asm.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: hutch-- on April 20, 2017, 05:01:25 AM
This much I have seen with comparisons of this type: once the optimisations on both sources are done, if the code is competent in both and the same instructions are used in both, the difference is negligible. Where the real action will be is in designing better algorithms and parallel multi-thread applications, as SSE and AVX instructions are not particularly responsive to minor twiddling like the older traditional integer instructions. You do the obvious things, like not repeatedly running code that could safely live outside intensive loops and aligning all of the data to its required optimums, and you can have a fiddle with code alignment if you think it can make it a bit faster, but the action here will always be better design, not close range twiddling.
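
The two "obvious things" mentioned here, aligned data and hoisting loop-invariant work, can be sketched in C++ roughly as follows. The struct, the function names and the 32-byte figure (the width of one 256-bit AVX register) are assumptions chosen for illustration, not code from the article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Eight floats aligned to 32 bytes, so one block can be loaded as a single
// aligned 256-bit vector. `Vec8` is a made-up name.
struct alignas(32) Vec8 {
    float lane[8];
};

// True when `p` sits on an `alignment`-byte boundary (power of two only).
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// The division does not depend on the loop index, so it is computed once
// before the loop instead of once per element.
inline void scale_all(Vec8* v, std::size_t count, float scale) {
    const float k = scale / 255.0f;            // loop-invariant, hoisted
    for (std::size_t i = 0; i < count; ++i)
        for (float& x : v[i].lane)
            x *= k;
}
```

A modern compiler will often hoist the division on its own; writing it out mainly makes the intent visible and removes the dependence on the optimizer.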
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: nidud on April 20, 2017, 06:44:48 AM
deleted
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on April 20, 2017, 08:00:20 AM
You should add a test for ML.exe ;)
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: nidud on April 20, 2017, 08:16:39 AM
deleted
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on April 20, 2017, 08:57:35 AM
Yep, this is what I meant :biggrin:
RichMasm assembles with asmc in less than 600 ms on my i5, as compared to 680 for JWasm and 1230 for ML :t

(Wow, that thread was long  ::))
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: Raistlin on April 20, 2017, 03:52:32 PM
I have a vested interest in this topic that continually seems to pop up from time to time (I wonder why?)

From the scholarly literature I've been able to accumulate, the following conclusions have been drawn:
1) Handwritten assembly - properly done (the REAL POINT) - will ALWAYS outperform the compiler (ANY) for the same algorithm
2) Loop structures (unrolling) are a critical factor that need to target hardware L1 cache hierarchies - compilers are really bad at this
3) Data locality cannot be easily predicted by compiler heuristic transforms - and thus compilers will always produce sub-optimal code
4) Compilers are really BAD at generating SIMD/AVX code that takes full advantage of the instruction set
    - in one large study of compilers, published in 2013, it was found that on average compilers miss 60% of the opportunities to vectorize
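
Points 2) and 3) are about cache locality. As one illustration of what structuring a loop for the cache can look like, here is a hedged C++ sketch of a tiled (blocked) transpose for a matrix of any shape. It is not the transposing code discussed in this thread; the function name and the default tile size of 32 are assumptions, and the best tile size depends on the target CPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Transpose a row-major rows x cols matrix into a row-major cols x rows one,
// visiting it in tile x tile blocks so that reads and writes both stay inside
// a small region of memory at a time.
std::vector<float> transpose_tiled(const std::vector<float>& src,
                                   std::size_t rows, std::size_t cols,
                                   std::size_t tile = 32) {
    assert(src.size() == rows * cols);
    assert(tile > 0);
    std::vector<float> dst(src.size());
    for (std::size_t r0 = 0; r0 < rows; r0 += tile) {
        const std::size_t r1 = std::min(r0 + tile, rows);      // clip last row block
        for (std::size_t c0 = 0; c0 < cols; c0 += tile) {
            const std::size_t c1 = std::min(c0 + tile, cols);  // clip last column block
            for (std::size_t r = r0; r < r1; ++r)
                for (std::size_t c = c0; c < c1; ++c)
                    dst[c * rows + r] = src[r * cols + c];
        }
    }
    return dst;
}
```

For example, the 2 x 3 matrix {1, 2, 3, 4, 5, 6} comes back as the 3 x 2 matrix {1, 4, 2, 5, 3, 6}.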
     


Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 20, 2017, 05:50:38 PM
Quote
but there are those PUSH-POPs, there is the shl which is slow.
You appear to know a lot about these things.

Quote
If I'm not mistaken lea is slower too and could be exchanged to mov - offset (can that be done in 64?).
Have you never heard that lea is a handy, fast arithmetic calculator? I am using it like that, not to load an effective memory address.
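
To make the "arithmetic calculator" use concrete: lea evaluates base + index*scale + displacement (scale 1, 2, 4 or 8) without touching memory. The C++ sketch below spells out two such expressions; the function names are made up, and the instruction shown in each comment is only what a compiler could emit, not a guarantee.

```cpp
#include <cassert>
#include <cstdint>

// x * 5 as base + index*4, which fits one lea, e.g. lea eax, [rcx + rcx*4].
constexpr std::uint32_t times5(std::uint32_t x) { return x + (x << 2); }

// x * 9 + 7 as base + index*8 + displacement, e.g. lea eax, [rcx + rcx*8 + 7].
constexpr std::uint32_t times9_plus7(std::uint32_t x) { return x + (x << 3) + 7u; }

// Cross-check both forms against the plain multiply, including wrap-around.
inline void check_lea_forms() {
    const std::uint32_t samples[] = {0u, 1u, 13u, 0x7FFFFFFFu, 0xFFFFFFFFu};
    for (std::uint32_t x : samples) {
        assert(times5(x) == x * 5u);
        assert(times9_plus7(x) == x * 9u + 7u);
    }
}
```

Whether lea, shl or imul is fastest for a given multiply differs between processor families, so this shows the equivalence only, not a speed claim.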

Quote
What I'm saying is that a guy like Marinus, Johen and others could make this asm run faster than the C++ one. In this case, the point of the article loses ground.
I am also expecting that either of them, or anyone else, will restore my faith in a fair world. Compiler-produced spaghetti code should not perform better than well-written handmade ASM.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on April 20, 2017, 06:44:52 PM
Quote
... will restore my faith in a fair world. Compiler-produced spaghetti code should not perform better than well-written handmade ASM.

Quote
Your article clearly shows that a C++ compiler can beat assembler.

Your article is good, really. And it shows that a compiler can beat us. It does not prove, though, that it will beat us all the time. How easy will it be to write an article that, on the basis of one particular case, "proves" that the compiler can be beaten? Let's not be too superficial...

P.S.:
Quote
Continuing the saga on Matrix Transposing...
This is a solution for transposing matrixes of any size, squared or not. It supports as well small matrixes or matrixes with less than 4 rows or 4 columns.

Which compiler produced that ultra-fast assembler code you are showing there...?
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: LordAdef on April 20, 2017, 06:56:13 PM
Hello!

Quote
You appear to know a lot about these things.

Not really. I'm fairly new to asm myself, but I've been doing really intensive training and learning from whatever source I can. As I'm currently writing a prog in asm myself, I happened to benchmark shl and can confirm it's a rather slow option, if you are aiming for speed.

Quote
You never heard that lea is an handy fast arithmetic calculator? I am using it like that, not to load an effective memory address.

I have! But I confess I only skimmed over the lea instructions. Sorry. Nevertheless, it's a place to check the clock counts and maybe see if it's the fastest option.

I'm very curious to see how you and these guys can optimize this algo and how asm will react afterwards.

There is also a suggestion from Hutch, which I read some time ago, that one should take into consideration: building the asm side in a dedicated IDE, not in VS.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: LordAdef on April 20, 2017, 07:01:20 PM
Quote
Which compiler produced that ultra-fast assembler code you are showing there...?
You won't like the answer.. :badgrin:
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 20, 2017, 07:12:11 PM
Quote
How easy will it be to write an article that, on the basis of one particular case, "proves" that the compiler can be beaten? Let's not be too superficial...
If you have a good case it will be easy. This is not a superficial answer.

Quote
Which compiler produced that ultra-fast assembler code you are showing there...?
Did I mention it was ultrafast? But you can always improve and show the outcome.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 20, 2017, 07:15:26 PM
Quote
I happened to benchmark shl and can confirm it's a rather slow option, if you are aiming for speed.
Use "mul" instead, and show the benchmarks.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 20, 2017, 07:19:07 PM
Quote
Which compiler produced that ultra-fast assembler code you are showing there...?
You won't like the answer.. :badgrin:
Probably, I will start disregarding your provocations.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on April 20, 2017, 07:20:21 PM
Quote
Did I mention it was ultrafast? But you can always improve and show the outcome.

The point is that, as far as I know, you hand-coded it in assembler ;)

Quote
I happened to benchmark shl and can confirm it's a rather slow option, if you are aiming for speed.
Use "mul" instead, and show the benchmarks.

http://masm32.com/board/index.php?topic=6092.msg64629#msg64629
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 20, 2017, 07:26:45 PM
Quote
Did I mention it was ultrafast? But you can always improve and show the outcome.

The point is that, as far as I know, you hand-coded it in assembler. Right?
Ah, "assembler is fun" as you say under your logo, and I never thought about doing it in C/C++. I don't know whether it would be faster or not in this case (probably not in this case).
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: LordAdef on April 20, 2017, 07:30:08 PM
Quote
Which compiler produced that ultra-fast assembler code you are showing there...?
You won't like the answer.. :badgrin:
Probably, I will start disregarding your provocations.

It's not a provocation aw27, it's me actually teasing JJ!!!  He is Microsoft's enemy number 1. I never did anything to you... Why is that?
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: hutch-- on April 20, 2017, 07:48:03 PM
There is an obvious elephant in the room here that everybody is desperately trying to avoid and that is processor family differences. Just as an example, the instruction LEA was genuinely fast on a PIII but turned into a lemon on a PIV. The next generation Core2 series hardware was a little kinder to LEA and the range of i7 hardware does not have a problem with it. SSE got a lot faster  with the Core2 and later series at the expense of simpler integer instructions getting slower as silicon was being pointed at later instructions.

Then for each family of processor you have a range of price driven variants that vary with their power design, cache size and frequency rate and collectively all of these many variants make the notion of a single piece of code being faster than another nonsense. The best you will get on similar age processors is a decent average across similar hardware and even that is dodgy.

The vast majority of apps are messy in data and caching, and much of the speed related advantage of one algo over another is wasted when the rest of the app is necessarily messy in how it works. That is not to say that it's not worth the effort to manually optimise code, but it is nowhere near as simple as it is being made out here.

With this 6 core Haswell I work with at the moment, it is nominally a 3.3 gig processor, but check the task manager and most of the time it sleeps in noddy mode at about 1.2 gig, and only when it is loaded does it come up to speed. I have just recently tweaked my old i7 860, which was an overclocker's toy some years ago, by adding memory to it (8 up to 16 gig) and then upping the clock from 2.8 to 3.5 gig, and it benchmarks faster than the Haswell.
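
Because the clock only ramps up under load, a single timed run says little. A common way to get steadier numbers is to do a few untimed warm-up calls and keep the best of several timed runs. The C++ helper below is a sketch of that idea; its name and default counts are assumptions, and it does nothing about differences between processor families.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Call `fn` `warmup` times untimed, then `runs` times timed, and return the
// fastest timed run in nanoseconds. The minimum is the run least disturbed by
// frequency ramp-up, cold caches and other processes.
template <typename Fn>
std::int64_t best_of_ns(Fn&& fn, int runs = 20, int warmup = 3) {
    assert(runs > 0 && warmup >= 0);
    for (int i = 0; i < warmup; ++i)
        fn();                                   // warm-up, not measured
    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    for (int i = 0; i < runs; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        fn();
        const auto t1 = std::chrono::steady_clock::now();
        const auto ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
        best = std::min<std::int64_t>(best, ns);
    }
    return best;
}
```

Typical use would be `best_of_ns([&] { transpose(matrix); })`, with `transpose` standing in for whichever routine is being compared. The result is only meaningful for the machine it was measured on.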
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on April 20, 2017, 08:27:36 PM
Quote
I never thought about doing it in C/C++. I don't know whether it would be faster or not in this case

Well... can you really resist trying that one with MSVC?
 ;)
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 20, 2017, 11:15:44 PM
Quote
I never thought about doing it in C/C++. I don't know whether it would be faster or not in this case

Well... can you really resist trying that one with MSVC?
 ;)

MSVC is faster for small matrixes. Tends to be slower and slower as the size grows.
But the ASM is not yet optimized.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on April 21, 2017, 01:20:06 AM
I have tried to build your code... first, Transpose.vcxproj launched VS Community, which took OVER THREE MINUTES to open. CRAP. Then it asked me to login to my M$ account (which I refuse to have) because the trial period is over. Redmond, this CRAPWARE was supposed to be FREE, right?

So I tried my commandline setup for MSVC: "c:\program files (x86)\microsoft visual studio 10.0\vc\include\codeanalysis\sourceannotations.h(194): error C2059: syntax error: '['"

[RANT]
Great. And it seems that all my previously working sources show the same error. Thank you, VS Community, for introducing new "features".

Sorry, I give up on C/C++. Almost every time I put hand on C code, it ends up with endless searches on the web for somebody who solved the mystery of missing header files, or (in this case) header files that have "syntax errors" although I definitely never touched them. Not to mention the numerous attempts to load M$ "projects" which fail miserably because the current MSVC is not able to read the old obsolete crap that was saved in the previous version two years ago. Visual Crap just stinks. Kudos to Hutch - Masm32 works. Same to Pelle Orinius, btw - his C compiler works, too.

This afternoon I wasted over one hour trying to connect a phone to my PC with Bluetooth. Incredibly complicated, Windows help completely useless, it just sends you in circles, I gave up in the end. How could this "OS" survive so many years???

Quote
A helicopter was flying around above Seattle when an electrical malfunction disabled all of the aircraft's electronic navigation and communications equipment. Due to the clouds and haze, the pilot could not determine the helicopter's position and course to fly to the airport. The pilot saw a tall building, flew toward it, circled, drew a handwritten sign, and held it in the helicopter's window. The pilot's sign said "WHERE AM I?" in large letters. People in the tall building quickly responded to the aircraft, drew a large sign, and held it in a building window. Their sign read "YOU ARE IN A HELICOPTER." The pilot smiled, waved, looked at his map, determined the course to steer to SEATAC airport, and landed safely. After they were on the ground, the co-pilot asked the pilot how the "YOU ARE IN A HELICOPTER" sign helped determine their position. The pilot responded "I knew that had to be the Microsoft building because they gave me a technically correct, but completely useless answer."
[/RANT]
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: TWell on April 21, 2017, 01:31:01 AM
MS C compiler is good, standard headers are just c...
For that reason I use the WDDK headers, or just do without them.
C is so flexible :t

EDIT: reminder
Code: [Select]
cl.exe -GS- -Zl -fp:fast -arch:SSE2 -d2noftol3 -O2 N4S.c DeterminantC.c -link -nocoffgrpinfo
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on April 21, 2017, 01:33:25 AM
Quote
MS C compiler is good, standard headers are just c...

I didn't ask MS C to include sourceannotations.h :(
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 21, 2017, 01:50:37 AM
Quote
I have tried to build your code... first, Transpose.vcxproj launched VS Community, which took OVER THREE MINUTES to open. CRAP. Then it asked me to login to my M$ account (which I refuse to have) because the trial period is over. Redmond, this CRAPWARE was supposed to be FREE, right?

So I tried my commandline setup for MSVC: "c:\program files (x86)\microsoft visual studio 10.0\vc\include\codeanalysis\sourceannotations.h(194): error C2059: syntax error: '['"

Don't forget that you can always delete it and have some peace of mind in the future.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: hutch-- on April 21, 2017, 04:55:27 AM
I confess I am no fan of the musical chairs that Microsoft play with their C/C++ versions. I have the source code from the SAPI 5 SDK for the app that runs the speech engine, and I also have a perfect copy of the VC2003 environment that built everything from the old SDK onwards, but when I tried to build the TTS app, it needed some AFX crap, so I thought PHUKIT and tweaked the original executable in ResourceHacker, put a manifest into it so it looks like a modern app, redrew the dialog interface so that it was more or less useful, and it is now worth using.

C was supposed to be portable, something that Microsoft have deliberately broken to keep the suckers dependent.

At least with Japheth's JWASM if you paddled through his makefile you could get the options and build it with a batch file but it WAS written to be portable. Built it in Pelle's C compiler as well.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 21, 2017, 06:38:11 AM
Quote
At least with Japheth's JWASM if you paddled through his makefile you could get the options and build it with a batch file but it WAS written to be portable. Built it in Pelle's C compiler as well.

For JWASM, I did not even look at the makefile, in the editor I selected all .C and .H files and it compiled fine with VS 2015 both for 32 and 64-bit.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: hutch-- on April 21, 2017, 06:49:34 AM
Japheth used to recommend the VC2003 toolkit as it had better libraries than the later versions. I got it to build in VC10, VC2003 and Pelle's C but never with an IDE from Microsoft, always built with batch files. The VC2003 versions were smaller and faster.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 21, 2017, 07:45:33 AM
Quote
Japheth used to recommend the VC2003 toolkit as it had better libraries than the later versions. I got it to build in VC10, VC2003 and Pelle's C but never with an IDE from Microsoft, always built with batch files. The VC2003 versions were smaller and faster.

Anything older than VS 2005 is of no interest to me now; support for 64-bit started there. In case of need I have DDKs as old as Windows NT 4, and SDKs that are no longer distributed.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: hutch-- on April 21, 2017, 03:59:13 PM
It worries me much less with tools and utilities than it would with specific user apps; in most cases non-UI tool applications that are primarily single thread don't go any faster in 64 bit than 32 bit. It's where large memory is an advantage that 64 bit shines, when you can routinely allocate multi-gigabyte blocks and multi-thread the processing of large amounts of memory.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: TWell on April 21, 2017, 04:18:41 PM
test programs compiled with version 19.10 for x86 and x64
and x86 test programs compiled with versions 19.00  and 19.10
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 21, 2017, 06:03:46 PM
Quote
don't go any faster in 64 bit than 32 bit.

When the larger number of 64-bit CPU registers helps reduce data traffic to and from memory, it will, a lot.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: hutch-- on April 21, 2017, 08:17:02 PM
Like the x64 fastcall calling convention, many algorithms use fewer than 3 args on 32 bit and can be run as FASTCALL. You may be assuming that everything is done in 256 bit operations, but many tasks are unaligned messes with mixed byte, word and dword data that defy the later, larger and faster registers. Then you have the code size difference: smaller 32 bit code uses less cache than equivalent 64 bit code, and smaller data sizes load faster than big ones. The simple case is that some code is faster in 32 bit and some other code is faster in 64 bit; anyone who is familiar with writing 32 bit assembler code already knows how to be efficient with the usable registers, the differences between inner and outer loop code, and instruction choice.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 21, 2017, 09:39:09 PM
Quote
Like the x64 fastcall calling convention, many algorithms use fewer than 3 args on 32 bit and can be run as FASTCALL.
Except for very small functions, FASTCALL will end up making the code slower.
The reason is that you will have to save the registers' contents somewhere inside the function.
Before the call you have to load the registers with data, and inside the function you will have to save the registers' contents somewhere because you need the registers for other things. A waste of cycles; it's like putting the car keys in your pocket to cross the room and place them on another table.
The same applies to x64, although it has more registers to play with, it is not called FASTCALL anymore - there is no other. Ah yes, Vectorcall, but the same problem.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: hutch-- on April 22, 2017, 12:55:49 AM
This is indeed an unusual comment, if you don't use register passing a la the Microsoft Application Binary Interface "rcx rdx r8 r9" you are left with passing data by globals or old slower style STDCALL stack passing with pushes and pops. Now of course nothing is stopping you from pre-loading a number of AVX registers and calling a procedure that will use them, but you must get the arguments for a procedure somehow and it does not happen by magic. In 32 bit you used the Intel Application Binary Interface, which was a standard PUSH/CALL technique, and you can emulate FASTCALL with up to 3 registers to keep the stack overhead down if it is a short leaf procedure.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 22, 2017, 02:02:34 AM
Quote
This is indeed an unusual comment,
Not unusual.
I made a quick search on google and there was someone with the same idea:
"You don't gain anything by passing in registers if the called function immediately needs to spill everything out into memory for its own calculations."
Another one:
"How fast is this calling convention, comparing to __cdecl and __stdcall? Find out for yourselves. Set the compiler option /Gr, and compare the execution time. I didn't find __fastcall to be any faster than other calling conventons, but you may come to different conclusions."

Quote
old slower style STDCALL stack passing with pushes and pops.
STDCALL is not slow anymore; it is as fast as CDECL and, in my opinion, in real life (not school class examples), both are faster than FASTCALL. It sounds weird, but this is the reason FASTCALL is not widespread.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: hutch-- on April 22, 2017, 03:38:05 AM
This tends to be why assembler programmers benchmark techniques rather than search the internet for quotations.

FASTCALL in 64 bit is specification.

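  mov rcx, handle    ; 1st argument
  mov rdx, wmsg      ; 2nd argument
  mov r8,  wparam    ; 3rd argument
  mov r9,  lparam    ; 4th argument
  call SendMessage   ; assumes 32 bytes of shadow space are already reserved on the stack
  mov retval, rax    ; result is returned in rax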


In 32 bit STDCALL is specification.

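  push lparam        ; arguments are pushed right to left
  push wparam
  push wmsg
  push handle
  call SendMessage   ; the callee removes the 4 arguments from the stack
  mov retval, eax    ; result is returned in eax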


It's a simple fact that registers are a lot faster than memory, and much of the design of 64 bit FASTCALL was to reduce the call overhead for the vast majority of function calls that use 4 or fewer arguments. When you don't need to twiddle the stack you reduce overhead and pick up speed. The other factor of course is "does it matter" when you are calling high level code in either libraries or DLL system functions.

Being able to save a few pico-seconds making a MessageBox() call seems to be the achievement of much modern high level code design where the benchmarking approach puts the effort where it matters, in high level code you pursue clarity and maintainability where in low level code you design and benchmark to get the speed up. You need to do more than just twiddle compiler options, a dis-assembler does not tell lies.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: nidud on April 22, 2017, 04:06:41 AM
deleted
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: hutch-- on April 22, 2017, 04:36:58 AM
This tends to be why you have a variety of techniques, stack frames for high level code that uses many arguments and local variables and no stack frame for low argument counts and direct register passing to cut overhead and improve speed.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: nidud on April 22, 2017, 06:17:28 AM
deleted
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: felipe on April 25, 2017, 12:05:08 PM
Quote

A helicopter was flying around above Seattle when an electrical malfunction disabled all of the aircraft's electronic navigation and communications equipment. Due to the clouds and haze, the pilot could not determine the helicopter's position and course to fly to the airport. The pilot saw a tall building, flew toward it, circled, drew a handwritten sign, and held it in the helicopter's window. The pilot's sign said "WHERE AM I?" in large letters. People in the tall building quickly responded to the aircraft, drew a large sign, and held it in a building window. Their sign read "YOU ARE IN A HELICOPTER." The pilot smiled, waved, looked at his map, determined the course to steer to SEATAC airport, and landed safely. After they were on the ground, the co-pilot asked the pilot how the "YOU ARE IN A HELICOPTER" sign helped determine their position. The pilot responded "I knew that had to be the Microsoft building because they gave me a technically correct, but completely useless answer."

HAHAHA! This is really funny  :lol:  :biggrin:
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on April 28, 2017, 10:18:41 PM
I updated the article "Need for Speed - C++ versus Assembly Language" (https://www.codeproject.com/Articles/1182676/Need-for-Speed-Cplusplus-versus-Assembly-Language); it now includes C# and Free Pascal in the comparison as well. ASM wins hands down in both cases, particularly against C#.  :greenclp:
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on May 20, 2017, 03:01:21 AM
 :greenclp: :greenclp: :greenclp: :greenclp:

You have won CodeProject

Best C++ Article of April 2017 First Prize.

Type:     Article
Location: Need for Speed - C++ versus Assembly Language
          https://www.codeproject.com/Articles/1182676/Need-for-Speed-Cplusplus-versus-Assembly-Language

CodeProject Mug (http://www.cafepress.co.uk/codeproject.1302986) from CodeProject. Value: $14
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on May 20, 2017, 03:42:40 AM
Congrats, José :t

That is a big success, and worth much more than the mug ;)
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: Raistlin on May 20, 2017, 05:37:10 AM
PPS. Looking (after you conveniently ignored my initial post - sheesh, science must hurt, or else you're preaching to the converted) at real (therefore reproducible) science = metrics:
It has been established that compilers cannot exceed the standard asm coder. Please rectify your popularist code project award winning post.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: fearless on May 20, 2017, 05:46:52 AM
Gonna need a pic of you standing majestically with the codeproject mug and staring off into the distance.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on May 20, 2017, 04:29:54 PM
That is a big success, and worth much more than the mug ;)
I had no idea it was worth anything.  :icon14:
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on May 20, 2017, 04:31:04 PM
Gonna need a pic of you standing majestically with the codeproject mug and staring off into the distance.
And with a crown on my head, of course.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on May 20, 2017, 04:32:45 PM
Please rectify your popularist code project award winning post.
Yes, Sir!   :icon_redface:
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: K_F on May 20, 2017, 10:46:40 PM
I'm sure if you get rid of the high level constructs (.If .. then.. else.. etc), you'll see a vast improvement in speed  ;)
As the C++ compiler uses every trick in the book.. try the same for Asm  ;)
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on May 21, 2017, 12:10:01 AM
I'm sure if you get rid of the high level constructs (.If .. then.. else.. etc), you'll see a vast improvement in speed  ;)

I've never seen such an improvement. Can you post an example?
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: K_F on May 21, 2017, 01:26:30 AM
I haven't got an example offhand, but Dedndave (or someone else) posted something on this topic a few years back.

Their example improved the 'goto' by one instruction per IF IIRC - so this was my thought as AW27's asm example had a few levels of IFs.
A couple million extra instructions could make a difference in those totals.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: hutch-- on May 21, 2017, 02:05:03 AM
If you are serious about such things, you use a table of labels and reach every option with a couple of instructions.
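
A rough C++ sketch of the table idea, for anyone who wants to try it against a compare-and-branch chain. The handler names and the three options are made up for illustration; in assembler the table would hold label addresses and the call would be an indexed jmp or call.

```cpp
#include <array>
#include <cstddef>

// Hypothetical handlers, one per option (names invented for this sketch).
int op_add(int a, int b) { return a + b; }
int op_sub(int a, int b) { return a - b; }
int op_mul(int a, int b) { return a * b; }

using Handler = int (*)(int, int);

// The "table of labels": entry i holds the address of the code for option i.
constexpr std::array<Handler, 3> kTable = { op_add, op_sub, op_mul };

// Reaches any option with one bounds check and one indexed indirect call,
// no matter how many options the table holds.
int dispatch(std::size_t option, int a, int b) {
    if (option >= kTable.size()) return 0;   // out-of-range option: do nothing
    return kTable[option](a, b);
}

// The same logic as an if/else chain, for comparison: the last option is
// only reached after every earlier test has failed.
int dispatch_chain(std::size_t option, int a, int b) {
    if (option == 0) return op_add(a, b);
    else if (option == 1) return op_sub(a, b);
    else if (option == 2) return op_mul(a, b);
    return 0;
}
```

Calling dispatch(2, 6, 3) and dispatch_chain(2, 6, 3) gives the same result; the difference is only in how many tests are executed on the way there, which grows with the number of options in the chain but not in the table. An optimising compiler may well turn a dense switch into such a table by itself.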
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on May 21, 2017, 02:18:20 AM
I am always curious to see highly optimised code, so some time ago I created the CodeSize macro. The attached testbed is purest Masm32 code, no MasmBasic, promised. All you have to do is write code that is more efficient than the built-in HLL stuff, and add a label before and after. Example:

Code: [Select]
Man_Repeat_s:
@@:
dec ecx
jne @B
Man_Repeat_endp:
CodeSize Man_Repeat

Output:
Code: [Select]
3        bytes for Man_Repeat
Really, extremely easy to use. And it's fun to outperform the HLL stuff :t
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: aw27 on May 21, 2017, 02:50:13 AM
I'm sure if you get rid of the high level constructs (.If .. then.. else.. etc), you'll see a vast improvement in speed  ;)
As the C++ compiler uses every trick in the book.. try the same for Asm  ;)
The variation I assembled with ML64 does not contain high level constructs and shows no noticeable performance difference. But I agree that the high level constructs contain more instructions.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: HSE on May 21, 2017, 02:55:13 AM
JJ: Why not
Code: [Select]
mov ecx, 9
Loop_s:
loop Loop_s
Loop_endp:

Epa! (forget the question)
Code: [Select]
146665    cycles  for Loop
98486     cycles for jne
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on May 21, 2017, 03:13:35 AM
But I agree that the high level constructs contain more instructions.

For example?


JJ: Why not
Code: [Select]
mov ecx, 9
Loop_s:
loop Loop_s
Loop_endp:

Epa! (forget the question)
Code: [Select]
146665    cycles  for Loop
98486     cycles for jne

Valid example, thanks, it's indeed one byte shorter ;-)

I've added your suggestion to testbed version 2. Including the HLL equivalent :bgrin:
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: hutch-- on May 25, 2017, 09:13:35 PM
I always laugh at some of the notions of speed: would it really matter if your MessageBoxA was a few picoseconds faster than someone else's? You put the effort where it does matter, where you have processing bottlenecks, where time-critical code is holding up the works; the rest is write it clearly, make it maintainable and reliable. Then there is the choice of algorithm: a sloppy quick sort will outperform a brilliantly optimised bubble sort, so don't waste your effort on the wrong idea.
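
The algorithm point can be checked with a small C++ sketch that counts comparisons instead of timing anything. The assumptions are mine: std::sort stands in for "a quick sort" (library implementations are typically an introsort), the bubble sort is the plain textbook version, and the helper names are invented for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Plain textbook bubble sort; returns the number of comparisons performed.
// For n elements it always performs n*(n-1)/2 comparisons.
std::size_t bubble_sort_count(std::vector<int>& v) {
    std::size_t comparisons = 0;
    for (std::size_t pass = 0; pass + 1 < v.size(); ++pass) {
        for (std::size_t i = 0; i + 1 < v.size() - pass; ++i) {
            ++comparisons;
            if (v[i + 1] < v[i]) std::swap(v[i], v[i + 1]);
        }
    }
    return comparisons;
}

// std::sort with a counting comparator; returns the number of comparisons.
std::size_t std_sort_count(std::vector<int>& v) {
    std::size_t comparisons = 0;
    std::sort(v.begin(), v.end(), [&comparisons](int a, int b) {
        ++comparisons;
        return a < b;
    });
    return comparisons;
}

// Builds the sequence n, n-1, ..., 1 as test input.
std::vector<int> descending(int n) {
    std::vector<int> v;
    for (int i = n; i >= 1; --i) v.push_back(i);
    return v;
}
```

On descending(1000) the bubble sort needs 1000*999/2 = 499500 comparisons, while an O(n log n) sort needs a small fraction of that. Comparison counts are only a proxy for run time, but no amount of instruction-level tuning of the inner loop closes a gap of that order.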
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on May 25, 2017, 10:51:37 PM
I absolutely agree, Hutch. 100% 8)

The last question was, however, whether HLL produces longer or slower code. And I am still waiting to see an example that confirms this statement. One example would be enough.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: Raistlin on May 29, 2017, 03:48:34 PM
Code: [Select]
I always laugh at some of the notions of speed, would it really matter if your MessageBoxA was a few picoseconds faster than someone else's?
Although I agree - I have another reason for doing full ASM apps. Wait for it.... TA DAAAAA = Size !

Yes, size does matter - especially when you want to support several hardware/software SKUs and platforms, installs, network comms etc.

It doesn't hurt either that smaller generally (yes, I know that's relative) means faster.
RE: a smaller in-memory footprint, a better cache hit probability, a smaller binary on the HDD that reduces load times etc....

[EDIT: Note to self - do not use "etc" so much - either list all or list none]
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: hutch-- on May 29, 2017, 08:45:30 PM
The reason that sticks in my head is POWER. Size is useful in that it allows you to do things that would be clunky and complicated in a HLL not designed for the purpose, which means that you can trade size for speed when your code is small enough to get away with it. But below all of the practical considerations, it is the architectural freedom to design what you like in whatever way you like that is the real reason why I write in x86 assembler.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: felipe on May 30, 2017, 03:27:18 AM
Another reason I like it is the close relation with the circuits (the microprocessor). Electronics is fascinating, I think, so writing in assembly is the best way to control all that circuitry in software.  :bgrin:
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jimg on May 30, 2017, 04:22:32 AM
While I agree with you Felipe, unfortunately, everyone is using C to program their microprocessors these days.  Very disheartening.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: Siekmanski on May 30, 2017, 05:08:03 AM
While I agree with you Felipe, unfortunately, everyone is using C to program their microprocessors these days.  Very disheartening.

I noticed the same, especially since the arrival of Internet of Things modules such as the Arduino. Most people use the C++ compiler that comes with them. The young people don't even know what an assembler is. Here in Europe there are still many people programming microcontrollers in assembly. There are great forums in Germany.

One thing is for sure, assembly has a very big advantage in speed and size over C/C++ given the very small RAM of the tiny microcontrollers.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jimg on May 30, 2017, 06:30:47 AM
For the esp8266 I've been working with, everything is in C; I haven't seen an assembler for it.  So I'm using ZBasic, which converts BASIC to C for the esp.
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: jj2007 on May 30, 2017, 06:31:13 AM
Don't be so negative: assembler is on the rise again: https://www.tiobe.com/tiobe-index/
Title: Re: Need for Speed - C++ versus Assembly Language
Post by: Siekmanski on May 30, 2017, 09:33:35 AM
With an ATmega microcontroller programmed in assembly you can send AT commands to the esp8266 module and receive its responses.
This is not hard at all, because the ATmega has a UART on board.

Here is an example in assembly of how to use the UART of an Atmel ATtiny2313 microcontroller:

Code: [Select]
.include <tn2313def.inc>    ; ATtiny2313 definition file

.equ F_CPU = 11059200       ; Hz, CPU clock speed (definition for the Wait macro)

; RS232 communication settings
.equ BaudRate        = 115200
.equ RxBufferLengte  = 64       ; Size of the ring buffer in bytes (must be a power of two)


.org 0x0000    ; Code load address 0x0000
    rjmp Init  ; Relative jump to Init
.org URXCaddr
    rjmp Receive_Byte ; Receive-complete interrupt

.DSEG

RS232_Buffer:                 .byte RxBufferLengte
RS232_BufferSchrijfPositie:   .byte 1   ; write position
RS232_BufferLeesPositie:      .byte 1   ; read position
RS232_BufferLeesPositieNieuw: .byte 1   ; read position for the next line
RS232_Karakter:               .byte 1   ; last character (used below, declaration was missing from the listing)
RS232_NieuweRegel:            .byte 1   ; new-line flag (used below, declaration was missing from the listing)

.CSEG


Init:
    ldi     r16,low(RAMEND) ; Initialise the stack pointer
    out     SPL,r16

    cli

    rcall   RS232_Init      ; Initialise the RS232 interface

    sei

Start:
;   rcall   PrintRS232data
;   ldi     r16,0x4F            ; send "O"
;   sts     RS232_Karakter,r16
;   rcall   Send_Byte
;   ldi     r16,0x4B            ; send "K"
;   sts     RS232_Karakter,r16
;   rcall   Send_Byte
    rjmp    Start

RS232_Init:
    eor     r16,r16
    sts     RS232_BufferSchrijfPositie,r16
    sts     RS232_BufferLeesPositie,r16
    sts     RS232_BufferLeesPositieNieuw,r16
    sts     RS232_NieuweRegel,r16

    ldi     r16,(0<<U2X) ; no double speed
    out     UCSRA,r16
    ; Set the baud rate:
    ldi     r16,High(F_CPU/(16*BaudRate)-1)
    out     UBRRH,r16
    ldi     r16,Low(F_CPU/(16*BaudRate)-1)
    out     UBRRL,r16
    ; Enable receiver (RXEN), transmitter (TXEN) and receive interrupt (RXCIE).
    ldi     r16,(1<<RXCIE)|(1<<RXEN)|(1<<TXEN)
    out     UCSRB,r16
    ; Set the frame format: 8 data bits (UCSZ1 = 1, UCSZ0 = 1), 1 stop bit (USBS = 0)
    ldi     r16,(1<<UCSZ1)|(1<<UCSZ0)|(0<<USBS) ; 8N1
    out     UCSRC,r16
    ret


Receive_Byte:
    push    r16
    in      r16,SREG
    push    r16
    push    r17                     ; r17, YL and YH are changed below, so save them too
    push    YL
    push    YH

    ldi     YH,high(RS232_Buffer)
    ldi     YL,low(RS232_Buffer)
    lds     r16,RS232_BufferSchrijfPositie
    add     YL,r16                  ; set position in RS232_Buffer (no carry: SRAM ends at 0xDF)
    mov     r17,r16
    inc     r17
    andi    r17,RxBufferLengte-1    ; stay inside the ring buffer
    sts     RS232_BufferSchrijfPositie,r17  ;

    in      r16,UDR                 ; received character

    sts     RS232_Karakter,r16      ; store for echo
    st      Y,r16                   ; store character in RS232_Buffer
    cpi     r16,0                   ; test for end of command
    brne    geen_nieuwe_regel
    sts     RS232_BufferLeesPositieNieuw,r17    ; new position for the next line
    ldi     r16,1
    sts     RS232_NieuweRegel,r16   ; flag a new line (we are inside an interrupt)

geen_nieuwe_regel:                  ; no new line
    pop     YH
    pop     YL
    pop     r17
    pop     r16
    out     SREG,r16
    pop     r16
    reti


Send_Byte:
    sbis    UCSRA,UDRE              ; wait until the transmit buffer is empty
    rjmp    Send_Byte

    lds     r16,RS232_Karakter
    out     UDR,r16
    ret
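
For those who would rather check the numbers on a PC first, here is a small C++ sketch of the two calculations the listing relies on: the baud rate divisor and the ring buffer wrap-around. The names ubrr, kRxBufferSize and RxRing are my own and are not part of the listing; the formula is the one used in RS232_Init with U2X = 0.

```cpp
#include <array>
#include <cstdint>

// Baud rate register value for normal speed (U2X = 0),
// same formula as in RS232_Init: F_CPU / (16 * baud) - 1.
constexpr std::uint32_t ubrr(std::uint32_t f_cpu, std::uint32_t baud) {
    return f_cpu / (16u * baud) - 1u;
}

// 11.0592 MHz divides exactly by 16 * 115200, so there is no baud rate error.
static_assert(11059200u % (16u * 115200u) == 0u, "crystal is not an exact multiple");
static_assert(ubrr(11059200u, 115200u) == 5u, "unexpected divisor");

// Ring buffer size, mirroring RxBufferLengte; the AND trick below only
// works when the size is a power of two.
constexpr std::uint8_t kRxBufferSize = 64;
static_assert((kRxBufferSize & (kRxBufferSize - 1)) == 0, "size must be a power of two");

struct RxRing {
    std::array<std::uint8_t, kRxBufferSize> data{};
    std::uint8_t write_pos = 0;

    // Stores one byte and advances the write position, wrapping with an AND
    // instead of a compare, like "andi r17,RxBufferLengte-1" in the ISR.
    void put(std::uint8_t byte) {
        data[write_pos] = byte;
        write_pos = static_cast<std::uint8_t>((write_pos + 1) & (kRxBufferSize - 1));
    }
};
```

After 64 calls to put() the write position is back at 0 and the next byte overwrites the oldest one, which is exactly what the interrupt routine does. The reader side (the LeesPositie variables) is left out here because the listing does not show it either.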

Title: Re: Need for Speed - C++ versus Assembly Language
Post by: K_F on May 30, 2017, 10:15:32 AM
Don't be so negative: assembler is on the rise again: https://www.tiobe.com/tiobe-index/
It's just about awareness....
Once the younger generation discover assembler ... :eusa_boohoo: