Branch avoiding - does it really make sense?

Started by flipflop, February 11, 2013, 04:48:11 AM

flipflop

Hi folks,

I've heard a lot about avoiding branches, and I've seen many unusual instruction combinations used to avoid branching. I've written the same algorithm in two ways: with and without branching.
I timed both with RDTSC, and the version with branching was faster. I'm attaching the files to this post. Could someone explain what I'm doing wrong, or confirm that there is nothing magical
about avoiding branches and that it doesn't always make sense?

Thanks in advance.

dedndave

the newer the CPU, the less important it probably is to avoid them
they get better at predicting branches   :P

jj2007

Hi flipflop,

First things first: Welcome to the forum

Re branches: We have had that question here before. It really is CPU-dependent, as Dave already wrote.

Vortex

Hi flipflop,

You can check Mark Larson's article:

Assembly Optimization Tips

http://mark.masmcode.com/

hutch--

Hi flipflop,

It is not an either/or with branch reduction; it's about understanding what you are writing and writing efficient code. Branch reduction is generally applied to badly written code or to unoptimised code that a compiler produces before its optimisation phase, and within reasonable bounds it does make a difference on substandard code.

Having half a dozen more branches than you need does slow code down, but that is the case with any extra instruction. Branching is more of a problem because it interrupts the scheduling of instructions through multiple pipelines, and while branch prediction has improved on later hardware, a gaggle of jumps goes well beyond what a branch prediction algorithm can predict. Some of the conditional move instructions can at times be a bit faster, but they can equally be a bit slower, so you are always better off timing the different versions of an algorithm to see which is faster. Also note that you can get different timings on different processors.
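As a rough illustration of the two styles being compared, here is a minimal sketch of a signed maximum written once with a conditional jump and once with a conditional move. This is not the code from flipflop's attachment; the proc names MaxBranch and MaxCmov and the parameter names are made up for the example, and which one is faster can only be settled by timing on the target CPU.

```asm
; Sketch only: MaxBranch, MaxCmov, val1 and val2 are hypothetical names.
; Both procs return the larger of two signed DWORDs in eax.
        .686                            ; CMOVcc needs a P6-class CPU or later
        .model  flat, stdcall
        option  casemap:none

        .code

MaxBranch proc val1:DWORD, val2:DWORD
        mov     eax,val1
        cmp     eax,val2
        jge     @F                      ; val1 >= val2: keep val1
        mov     eax,val2                ; otherwise take val2
@@:
        ret
MaxBranch endp

MaxCmov proc val1:DWORD, val2:DWORD
        mov     eax,val1
        mov     ecx,val2
        cmp     eax,ecx
        cmovl   eax,ecx                 ; val1 < val2: take val2, no jump involved
        ret
MaxCmov endp

        end
```

With well-predicted input the jump version may well win, because CMOVcc always has to wait for both the flags and both operands; with unpredictable input the picture can reverse.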

The general drift is: don't use more branches than you need, but don't waste your time looking for workarounds. A processor cannot work without branching; just don't overdo it.  :biggrin:
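For the timing itself, a common pattern is to bracket the code under test with serialized RDTSC reads. The following is only a sketch under stated assumptions: the proc name TimeIt is hypothetical, the code under test is left as a placeholder, and the result still includes the overhead of the CPUID/RDTSC pair.

```asm
; Sketch only: TimeIt is a hypothetical name; the code under test is a placeholder.
        .686
        .model  flat, stdcall
        option  casemap:none

        .code

; Returns in eax the low 32 bits of the cycle count between the two RDTSC reads.
TimeIt  proc uses ebx esi
        xor     eax,eax
        cpuid                           ; serialize; clobbers eax, ebx, ecx, edx
        rdtsc                           ; edx:eax = time-stamp counter
        mov     esi,eax                 ; keep the low dword of the start count

        ; code under test goes here (it must preserve esi)

        xor     eax,eax
        cpuid                           ; serialize again before the second read
        rdtsc
        sub     eax,esi                 ; elapsed cycles, low 32 bits
        ret
TimeIt  endp

        end
```

A single run is noisy, so it is usual to repeat the measurement many times and to compare the two variants on the same machine, since the counts differ between processors.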

Gunther

You have to know the facts before you can distort them.

dedndave

something i have noticed is that it can take more time to let the flag settle than it does to branch

consider the following code
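        mov     edx,26          ; MOV does not modify the flags
        shr     eax,1           ; bit 0 of eax is shifted out into CF
        jc      SomeLabel       ; branches on the CF produced by the SHR just above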

before the branch can occur, the carry flag has to settle

by re-organizing the instructions, the time for the MOV is sort of a freebie
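        shr     eax,1           ; bit 0 of eax is shifted out into CF
        mov     edx,26          ; MOV leaves CF untouched, so it can sit between SHR and JC
        jc      SomeLabel       ; still branches on the CF produced by the SHR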


of course, this is also processor-dependent   :P