As I see it, these algos are simple and efficient in their
design, and the speed tests we ran some time ago confirm it.
After two years of no coding at all, some new ideas about
optimization have come to mind. They mostly depend on a
paradigm shift and a change in the algos' design.
I can now clearly see some weaknesses in the overall design
that I didn't see back then.
The most important weakness I can see is the single-byte
processing. To obtain a single formatted byte the algo has to:
divide or multiply / move / add / push / pop / move.
I think we can get a much faster algo if we use a parallel
approach [at least 4 bytes in the same cycle], bypassing the
unnecessary steps altogether.
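
Just to give a rough picture of what I mean by handling 4 bytes
at once, here is a minimal C sketch. It is not the real algo
(that comes in the next post), only an illustration under my own
assumptions: the function names are made up for this example,
and I assume the input is four bytes each holding a decimal digit
value from 0 to 9 that must be turned into ASCII characters.

```c
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time version: one add and one store per character.
   Hypothetical helper, for comparison only. */
static void digits_to_ascii_bytewise(const uint8_t digits[4], char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)(digits[i] + '0');
}

/* Four bytes at once: one load, one 32-bit add, one store.
   Assumes every input byte is in the range 0..9, so the add
   never carries from one byte into the next. */
static void digits_to_ascii_packed(const uint8_t digits[4], char out[4])
{
    uint32_t w;
    memcpy(&w, digits, 4);        /* load the 4 bytes as one word   */
    w += UINT32_C(0x30303030);    /* add '0' to each byte in one go */
    memcpy(out, &w, 4);           /* store the 4 characters         */
}
```

Both functions write the same 4 characters. The packed one does
it with a single add, and because the constant is the same in
every byte the result does not depend on the byte order of the
machine.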
So the first step is to combine some of the processing, turning
the single-byte processing into multibyte processing.
In the next post I'll show you the code that does this kind of
transformation.
Stay tuned, and post your code as well if you like.
Frank
