Hi folks,
I've heard a lot about avoiding branches, and I've seen many unusual instruction combinations used to avoid branching. I've written the same algorithm in two ways: with and without branching.
I timed both with RDTSC and the version with branching was faster. I'm attaching those files to this post. Could someone explain what I'm doing wrong, or tell me that there is nothing magical
about avoiding branches? This doesn't make sense to me.
Thanks in advance.
the newer the CPU, the less important it probably is to avoid them
they get better at predicting branches :P
Hi flipflop,
First things first: Welcome to the forum :icon14:
Re branches: We already had that question here (http://www.masmforum.com/board/index.php?topic=18765.msg158922#msg158922). It is really CPU-dependent, as Dave already wrote.
Hi flipflop,
You can check Mark Larson's article :
Assembly Optimization Tips
http://mark.masmcode.com/
Hi flipflop,
It is not an either/or with branch reduction; it's about understanding what you are writing and writing efficient code. Branch reduction is generally applied to badly written code OR unoptimised code created by a compiler before the optimisation phase, and within reasonable boundaries it does make a difference on substandard code.
Having half a dozen more branches than you need does slow code down, but that is the case with any instruction. Branching is more of a problem because it interrupts the scheduling of instructions through multiple pipelines, and while branch prediction has improved on later hardware, a gaggle of jumps goes well beyond what a branch prediction algorithm can predict. Some of the conditional move instructions can at times be a bit faster, but they can equally be a bit slower, so you are always better off timing algorithm differences to see which is faster. Also note that you can get different timings on different processors.
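To make the comparison concrete, here is a minimal sketch of the same small operation written both ways: leaving the larger of two signed values in eax. It is only an illustration, not the code from the attached files; the register choices are assumptions, and CMOVcc needs a Pentium Pro or later CPU (.686 or higher in MASM).

```asm
; assumption: eax and ecx hold signed 32-bit values, result goes in eax

; version 1: with a branch
    cmp     eax, ecx
    jge     @F              ; eax is already the larger one, skip the move
    mov     eax, ecx
@@:

; version 2: without a branch (CMOVcc, needs .686 or later)
    cmp     eax, ecx
    cmovl   eax, ecx        ; copy ecx only if eax < ecx (signed)
```

When the comparison nearly always goes the same way, the branch version tends to be predicted well and can be as fast or faster. The conditional move usually only pays off when the outcome is close to random, so both versions are worth timing on the CPU you care about.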
General drift is: don't use more than you need, but don't waste your time looking for workarounds. A processor cannot work without branching, just don't overdo it. :biggrin:
Hi flipflop,
welcome to the forum.
Gunther
something i have noticed is that it can take more time to let the flag settle than it does to branch
consider the following code
mov edx,26
shr eax,1
jc SomeLabel
before the branch can be resolved, the carry flag produced by the SHR has to settle
by re-organizing the instructions so the MOV sits between the SHR and the JC, the time for the MOV is sort of a freebie
shr eax,1
mov edx,26
jc SomeLabel
of course, this is also processor-dependent :P