C/C++ vs Assembler

Started by Manos, May 04, 2013, 04:11:50 AM

Gunther

Quote from: dedndave on May 07, 2013, 01:16:18 AM
:biggrin:  as if the subject has never come up before



oh yes, it's a very new topic.  :lol: :lol: :lol:

Gunther
You have to know the facts before you can distort them.

jj2007

Quote from: qWord on May 06, 2013, 09:33:07 PM
Behold and see...

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
41      cycles for Small 1
38      cycles for Small 2
40      cycles for Small 3
42      cycles for Small 3.1
41      cycles for Small 4
50      cycles for C version
53      cycles for C mod JJ
58      hsz2dw (Microsoft 32-Bit C/C++ optimization compiler v16.00.40219.01)
24      hsz2dw2 (unrolled 4 times)

41      cycles for Small 1
39      cycles for Small 2
41      cycles for Small 3
57      cycles for Small 3.1
23      cycles for Small 4
51      cycles for C version
53      cycles for C mod JJ
31      hsz2dw (Microsoft 32-Bit C/C++ optimization compiler v16.00.40219.01)
24      hsz2dw2 (unrolled 4 times)

43      cycles for Small 1
38      cycles for Small 2
41      cycles for Small 3
42      cycles for Small 3.1
41      cycles for Small 4
50      cycles for C version
51      cycles for C mod JJ
31      hsz2dw (Microsoft 32-Bit C/C++ optimization compiler v16.00.40219.01)
24      hsz2dw2 (unrolled 4 times)


Well, a LUT is difficult to beat ;-)
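For readers who have not seen the lookup-table approach, here is a minimal C sketch of the idea. It is not the code of any routine timed above; the names `hex_lut`, `hex_lut_init` and `hex_to_u32_lut` are made up for illustration, and the sketch assumes an ASCII, zero-terminated input string. Each character indexes a 256-byte table that holds its digit value, so the inner loop needs no range checks or case handling.

```c
#include <stdint.h>

/* 256-entry table: value of each hex digit, 0xFF marks "not a hex digit".
   Hypothetical names, not taken from the routines benchmarked in the thread. */
static uint8_t hex_lut[256];
static int hex_lut_ready = 0;

static void hex_lut_init(void)
{
    for (int i = 0; i < 256; i++)
        hex_lut[i] = 0xFF;                      /* default: invalid */
    for (int i = 0; i < 10; i++)
        hex_lut['0' + i] = (uint8_t)i;          /* '0'..'9' -> 0..9 */
    for (int i = 0; i < 6; i++) {
        hex_lut['A' + i] = (uint8_t)(10 + i);   /* 'A'..'F' -> 10..15 */
        hex_lut['a' + i] = (uint8_t)(10 + i);   /* 'a'..'f' -> 10..15 */
    }
    hex_lut_ready = 1;
}

/* Convert a zero-terminated hex string to a 32-bit value.
   Conversion stops at the first character that is not a hex digit,
   which includes the terminating zero byte. */
uint32_t hex_to_u32_lut(const char *s)
{
    if (!hex_lut_ready)
        hex_lut_init();

    uint32_t value = 0;
    for (;;) {
        uint8_t digit = hex_lut[(unsigned char)*s++];
        if (digit == 0xFF)
            break;
        value = (value << 4) | digit;           /* append one nibble */
    }
    return value;
}
```

The table trades 256 bytes of data for a branch-free digit decode, which is probably why table-based versions tend to do well in timings like these. A variant that indexes the table with two characters at once would need a much larger table.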

habran

 :t

Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz (SSE4)



7       cycles for Fast version
8       cycles for Fast version under AMD
16      cycles for Small 1
18      cycles for Small 2
19      cycles for Small 3
19      cycles for Small 3.1
19      cycles for Small 4
4       cycles for MMX 1
5       cycles for MMX 2
4       cycles for SSE1

Other's Versions:
23      cycles for Axhex2dw improved by Hutch (1)
22      cycles for Axhex2dw improved by Hutch (2)

3       cycles for Lingo's SSE version
7       cycles for Lingo's BIG integer version
21      cycles for Jochen's WORD-Indexed version
12      cycles for Dave's version (with minor changes)


11      cycles for Fast version
14      cycles for Fast version under AMD
17      cycles for Small 1
19      cycles for Small 2
18      cycles for Small 3
19      cycles for Small 3.1
18      cycles for Small 4
5       cycles for MMX 1
4       cycles for MMX 2
4       cycles for SSE1

Other's Versions:
22      cycles for Axhex2dw improved by Hutch (1)
21      cycles for Axhex2dw improved by Hutch (2)

2       cycles for Lingo's SSE version
7       cycles for Lingo's BIG integer version
5       cycles for Jochen's WORD-Indexed version
12      cycles for Dave's version (with minor changes)


10      cycles for Fast version
13      cycles for Fast version under AMD
17      cycles for Small 1
17      cycles for Small 2
18      cycles for Small 3
19      cycles for Small 3.1
19      cycles for Small 4
5       cycles for MMX 1
6       cycles for MMX 2
5       cycles for SSE1

Other's Versions:
23      cycles for Axhex2dw improved by Hutch (1)
22      cycles for Axhex2dw improved by Hutch (2)

4       cycles for Lingo's SSE version
7       cycles for Lingo's BIG integer version
6       cycles for Jochen's WORD-Indexed version
11      cycles for Dave's version (with minor changes)

==========
Codesizes:
Axhex2dw_Unrolled:      396
Axhex2dw_Unrolled_AMD:  396
Axhex2dw1 - 1:  69
Axhex2dw2 - 2:  48
Axhex2dw3 - 3:  57
Axhex2dw3_1 - 3.1:      56
Axhex2dw3 - 4:  61
Axhex2dw_MMX:   128
Axhex2dw_MMX2:  160
Axhex2dw_SSE:   160
Alex_Short_Hutch:       59
Axhex2dw_Hutch2:        54
Hex2dwLingoSSE: 160
lingo_htodw:    1950
ax_jj_htodw:    174
krbhtodw:       547
--- ok ---
Cod-Father

Gunther

Jochen,

Quote from: jj2007 on May 07, 2013, 03:31:40 AM
Well, a LUT is difficult to beat ;-)

that's true, but it's old wisdom.

Gunther
You have to know the facts before you can distort them.

Antariy

Quote from: qWord on May 07, 2013, 12:53:42 AM
OK - I obviously missed the "spirit" of this thread...

No, it's OK, your code is good - fast and very readable, really :t Thanks for posting it :biggrin: It was an informative test, too. Now in this strange thread (@Dave - :biggrin:) we have rolled and unrolled C versions (you provided both) and ASM versions of the code, mixed in crazy testbeds :biggrin:

Antariy

Quote from: habran on May 07, 2013, 05:43:20 AM
:t

Thank you, habran :t

If your OS is 32-bit, then running 32-bit programs under a 32-bit OS really does seem like the best idea :biggrin:

habran

It is 64-bit Win 7 :biggrin:
IMO there is no penalty for running 32-bit code on a 64-bit OS, but 64-bit is certainly faster because of 64-bit programming :t
Cod-Father

Antariy

That was my assumption, just because your timing results seem to be lower than others' with the same CPU :biggrin:

habran

It is a Toshiba Qosmio laptop with 16 GB of RAM, a 2.3 GHz i7 and 64-bit Windows 7 Home, with AVX :t
I think that qWord has the same one
It is a great toy :bgrin:
Cod-Father


Gunther

Hi habran,

Quote from: habran on May 07, 2013, 02:18:51 PM
It is a Toshiba Qosmio laptop with 16 GB of RAM, a 2.3 GHz i7 and 64-bit Windows 7 Home, with AVX :t
I think that qWord has the same one
It is a great toy :bgrin:

it's probably the Ivy Bridge, isn't it?

Gunther
You have to know the facts before you can distort them.

habran

Yes Gunther, the Ivy Bridge it is :biggrin:
Quote
I already posted this before
here are specifications:                 
                                         
Intel® Core™ i7-3610QM Processor         
(6M Cache, up to 3.30 GHz)               
Specifications                           
Essentials                               
Status   Launched                         
Launch Date   Q2'12                       
Processor Number   i7-3610QM             
# of Cores   4                           
# of Threads   8                         
Clock Speed   2.3 GHz                     
Max Turbo Frequency   3.3 GHz             
Intel® Smart Cache   6 MB                 
Bus/Core Ratio   23                       
DMI   5 GT/s                             
Instruction Set   64-bit                 
Instruction Set Extensions   AVX         
Embedded Options Available   No           
Lithography   22 nm                       
Max TDP   45 W                           
Recommended Customer Price   TRAY: $378.00
Cod-Father

Gunther

Hi habran,

Quote from: habran on May 07, 2013, 11:20:17 PM
Quote
I already posted this before
here are specifications:                 
                                         
Intel® Core™ i7-3610QM Processor         
(6M Cache, up to 3.30 GHz)               
Specifications                           
Essentials                               
Status   Launched                         
Launch Date   Q2'12                       
Processor Number   i7-3610QM             
# of Cores   4                           
# of Threads   8                         
Clock Speed   2.3 GHz                     
Max Turbo Frequency   3.3 GHz             
Intel® Smart Cache   6 MB                 
Bus/Core Ratio   23                       
DMI   5 GT/s                             
Instruction Set   64-bit                 
Instruction Set Extensions   AVX         
Embedded Options Available   No           
Lithography   22 nm                       
Max TDP   45 W                           
Recommended Customer Price   TRAY: $378.00

an excellent machine. Does it run 64-bit Windows as the only OS? How did you manage the new "BIOS"?

Gunther
You have to know the facts before you can distort them.

habran

Thanks Gunther
I did not have to touch anything
I just installed my tools and copied my projects  :biggrin:
Cod-Father

Gunther

Hi habran,

Quote from: habran on May 08, 2013, 08:44:43 AM
I did not have to touch anything
I just installed my tools and copied my projects  :biggrin:

so, do you have an EFI drive, too? Or is your hard disk no larger than 2.2 TB?

Gunther
You have to know the facts before you can distort them.
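
The disk-size limit behind this question is usually quoted as about 2.2 TB. It comes from classic MBR partitioning, which stores sector addresses in 32 bits, so with 512-byte sectors a partition table can describe at most 2^32 x 512 bytes. Larger boot disks generally need GPT, which on Windows means booting through UEFI. A small C sketch of the arithmetic follows; the function name `mbr_max_bytes` is made up for illustration.

```c
#include <stdint.h>

/* Largest capacity addressable with 32-bit sector numbers,
   as used by MBR partition tables. Hypothetical helper name. */
uint64_t mbr_max_bytes(uint32_t sector_size)
{
    /* 2^32 addressable sectors times the sector size in bytes */
    return (UINT64_C(1) << 32) * sector_size;
}
```

For 512-byte sectors this gives 2^41 bytes, which is 2 TiB or roughly 2.2 TB in decimal units.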