Fast DwordtoHex ?

Started by guga, November 27, 2015, 11:16:24 PM


dedndave

prescott w/htt
Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)

23333   cycles for 100 * dw2hex
1261    cycles for 100 * utoh (Hutch)
173108  cycles for 100 * CRT sprintf
1684    cycles for 100 * Bin2Hex
1480    cycles for 100 * Bin2Hex2 cx
1549    cycles for 100 * Bin2Hex3 ecx
5593    cycles for 100 * Bin2Hex6
2880    cycles for 100 * FastHex

23375   cycles for 100 * dw2hex
1266    cycles for 100 * utoh (Hutch)
173666  cycles for 100 * CRT sprintf
1460    cycles for 100 * Bin2Hex
1485    cycles for 100 * Bin2Hex2 cx
1474    cycles for 100 * Bin2Hex3 ecx
5608    cycles for 100 * Bin2Hex6
2874    cycles for 100 * FastHex

23111   cycles for 100 * dw2hex
1275    cycles for 100 * utoh (Hutch)
172950  cycles for 100 * CRT sprintf
1465    cycles for 100 * Bin2Hex
1481    cycles for 100 * Bin2Hex2 cx
1481    cycles for 100 * Bin2Hex3 ecx
5592    cycles for 100 * Bin2Hex6
2805    cycles for 100 * FastHex

FORTRANS

Hi,

Quote from: jj2007 on December 09, 2015, 01:37:36 PM
Note that the currently attached version does not require MasmBasic.

   In that case, some oldies.  Somewhat odd utoh results for
the P-MMX?


{P-MMX}
pre-P4
15186 cycles for 100 * dw2hex
7585 cycles for 100 * utoh (Hutch)
253408 cycles for 100 * CRT sprintf
7057 cycles for 100 * Bin2Hex
7626 cycles for 100 * Bin2Hex2 cx
7585 cycles for 100 * Bin2Hex3 ecx
8991 cycles for 100 * Bin2Hex6
6013 cycles for 100 * FastHex

14894 cycles for 100 * dw2hex
8126 cycles for 100 * utoh (Hutch)
245623 cycles for 100 * CRT sprintf
7622 cycles for 100 * Bin2Hex
7604 cycles for 100 * Bin2Hex2 cx
7590 cycles for 100 * Bin2Hex3 ecx
8918 cycles for 100 * Bin2Hex6
5857 cycles for 100 * FastHex

13313 cycles for 100 * dw2hex
7410 cycles for 100 * utoh (Hutch)
245176 cycles for 100 * CRT sprintf
8035 cycles for 100 * Bin2Hex
7633 cycles for 100 * Bin2Hex2 cx
6907 cycles for 100 * Bin2Hex3 ecx
9412 cycles for 100 * Bin2Hex6
6373 cycles for 100 * FastHex

20 bytes for dw2hex
600 bytes for utoh (Hutch)
29 bytes for CRT sprintf
138 bytes for Bin2Hex
150 bytes for Bin2Hex2 cx
214 bytes for Bin2Hex3 ecx
616 bytes for Bin2Hex6
66 bytes for FastHex

00345678 = eax dw2hex
12345678 = eax utoh (Hutch)
345678 = eax CRT sprintf
12345678 = eax Bin2Hex
00345678 = eax Bin2Hex2 cx
00345678 = eax Bin2Hex3 ecx
12345678 = eax Bin2Hex6
012345678 = eax FastHex

--- ok ---
{P-III}
pre-P4 (SSE1)

7668 cycles for 100 * dw2hex
1311 cycles for 100 * utoh (Hutch)
166093 cycles for 100 * CRT sprintf
1614 cycles for 100 * Bin2Hex
2016 cycles for 100 * Bin2Hex2 cx
1617 cycles for 100 * Bin2Hex3 ecx
3698 cycles for 100 * Bin2Hex6
5975 cycles for 100 * FastHex

7687 cycles for 100 * dw2hex
1312 cycles for 100 * utoh (Hutch)
166001 cycles for 100 * CRT sprintf
1728 cycles for 100 * Bin2Hex
2025 cycles for 100 * Bin2Hex2 cx
1652 cycles for 100 * Bin2Hex3 ecx
3717 cycles for 100 * Bin2Hex6
5958 cycles for 100 * FastHex

7719 cycles for 100 * dw2hex
1322 cycles for 100 * utoh (Hutch)
166667 cycles for 100 * CRT sprintf
1625 cycles for 100 * Bin2Hex
2023 cycles for 100 * Bin2Hex2 cx
1613 cycles for 100 * Bin2Hex3 ecx
3718 cycles for 100 * Bin2Hex6
5966 cycles for 100 * FastHex

20 bytes for dw2hex
600 bytes for utoh (Hutch)
29 bytes for CRT sprintf
138 bytes for Bin2Hex
150 bytes for Bin2Hex2 cx
214 bytes for Bin2Hex3 ecx
616 bytes for Bin2Hex6
66 bytes for FastHex

00345678 = eax dw2hex
12345678 = eax utoh (Hutch)
345678 = eax CRT sprintf
12345678 = eax Bin2Hex
00345678 = eax Bin2Hex2 cx
00345678 = eax Bin2Hex3 ecx
12345678 = eax Bin2Hex6
012345678 = eax FastHex

--- ok ---


Regards,

Steve N.

hutch--

Steve,

The variations with the older hardware are probably due to the different handling of the Intel complex addressing mode. It's been a long time since I worked on a PIII or earlier, but from memory you preferentially loaded a table address into a register first and then processed it. This was to simplify the complex address to registers rather than combined OFFSETs from a table mixed with registers. The techniques changed with the PIV and then again with the Core2 series. So far the i7 I am using is similar to the Core2 quad I used to use, and this is what I tested the last set of algos on.
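For readers following along without the attached source, here is a minimal sketch in C of the table-driven idea, with the table base held in a local pointer before the loop. It is not any of the routines timed above; the names `HEX_DIGITS` and `dword_to_hex` are made up for illustration, and whether a compiler keeps the base in a register is up to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 16-entry digit table (hypothetical name, not from the thread). */
static const char HEX_DIGITS[] = "0123456789ABCDEF";

/* Writes exactly 8 uppercase hex digits plus a terminating NUL.
   The caller must supply a buffer of at least 9 bytes. */
static void dword_to_hex(uint32_t value, char *out)
{
    assert(out != NULL);

    /* Take the table address once, so each lookup is base + index
       rather than a fixed offset combined with a register. */
    const char *table = HEX_DIGITS;

    /* Fill from the least significant nibble backwards. */
    for (int i = 7; i >= 0; --i) {
        out[i] = table[value & 0xFu];
        value >>= 4;
    }
    out[8] = '\0';
}
```

Calling `dword_to_hex(0x12345678u, buf)` with a 9-byte `buf` leaves the string `12345678` in it, with leading zeros kept for smaller values.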

FORTRANS

Hi Steve,

   Thanks for the reply.  So complex addressing is the difference.
I should have remembered from when I first got a Pentium Pro computer.
It performed quite a bit better than the Pentiums that were then
current, if you had a good 32-bit compiler.  16-bit stuff was only
faster due to the faster clock and better cache.

Regards,

Steve N.

dedndave

i like the Pentium MMX - it makes my code look the best   :lol:

guga

Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

hutch--

Dave,

If you can optimise code for a Prescott you will find the later processors easy in comparison. The Prescott processors had the longest and fussiest pipeline of any of the Intel processors I remember.

dedndave

i would say it's easiest to optimize for whatever processor you are currently using - lol

guga

Dave

Which processors have the RDTSCP opcode? Only the i7 and above?

And RDTSC? Does it exist on a PII or PIII too? If so, how do I detect it?

I mean, RDTSCP is detectable through bit 27 returned by CPUID with function 1 (in eax). But which bit is used to recognize RDTSC?

I'm not sure if RDTSC exists in all of the Pentium series or only from the PII or PIII and above.

dedndave

i believe RDTSC is supported in all pentiums and newer (that's from memory, so check it)

the CPUID bit is named TSC
it is CPUID function 1, EDX bit 4
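A rough sketch of that check in C, assuming GCC or Clang on x86 and their `<cpuid.h>` helper `__get_cpuid`. The function names `test_bit`, `cpuid_edx_bit`, `cpu_has_tsc` and `cpu_has_rdtscp` are invented for this example. One detail worth noting: RDTSCP is reported in the extended function 80000001h, EDX bit 27, not in function 1.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header (assumed toolchain) */
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

/* Returns 1 if the given bit of reg is set, else 0. */
static int test_bit(uint32_t reg, unsigned bit)
{
    assert(bit < 32);
    return (int)((reg >> bit) & 1u);
}

/* Hypothetical helper: run CPUID for one function number and test a
   bit of EDX. Returns 0 if the function is not available, or if this
   is not an x86 build. */
static int cpuid_edx_bit(uint32_t leaf, unsigned bit)
{
#if HAVE_X86_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(leaf, &eax, &ebx, &ecx, &edx))
        return 0;
    return test_bit(edx, bit);
#else
    (void)leaf;
    (void)bit;
    return 0;
#endif
}

/* TSC flag: CPUID function 1, EDX bit 4. */
static int cpu_has_tsc(void)
{
    return cpuid_edx_bit(1u, 4);
}

/* RDTSCP flag: CPUID function 80000001h, EDX bit 27. */
static int cpu_has_rdtscp(void)
{
    return cpuid_edx_bit(0x80000001u, 27);
}
```

`__get_cpuid` returns 0 when the requested function is above the maximum the processor reports, so on a CPU without that function both helpers simply answer 0 instead of faulting.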
