Author Topic: Test results for AVX and AVX-512 needed

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: Test results for AVX and AVX-512 needed
« Reply #30 on: December 23, 2017, 04:50:27 AM »
Hi aw27,

Quote from: aw27
Learning assembly language is important, even on the day compilers are able to do better than humans in every case. That day will arrive.
Right. Every programmer should know how computers work internally, how hardware is accessed, and so on. We'll see whether that day comes.
Quote
I still remember the days when most people believed it was impossible for a machine to beat a Grand Master at chess, because machines had no global vision, could not recognize patterns, had no sense of position, and could not think in strategic terms; they could only use brute force. They were wrong: most chess programs nowadays easily beat every chess Grand Master.

That's more complicated than it seems at first glance. Here is one of the best rankings of chess engines. It's updated at least weekly and is very precise. I think your statement is true for the top scorers: Stockfish (by the way, Asmfish is a Stockfish derivative), Komodo, Houdini, Shredder, etc.

I'm not a top correspondence chess player; my correspondence chess (cc) Elo is about 2300. By comparison, the world number one has an Elo of 2688, because we don't have the usual Elo inflation. Our Elo numbers are calculated a bit differently, but that method has proven itself. I use chess engines daily, and with a little luck I'll qualify for the semifinals of the European Championship. However, I had to learn all of this the hard way. It's wrong to think: I'm using a chess engine now and will beat everyone else. You have to feed the chess engine your own strategic ideas and then check that you have not overlooked any tactical finesse. That's the art. The only thing chess engines do is brute force. It has often happened to me that the engine suggests moves that ruin the pawn structure. That's poison for the entire game. So you have to look for an alternative, and you often need a second opinion. But that's a big field and if anyone wants to keep discussing these questions, we should do that in a separate thread in the Soap Box. I still have a lot to talk about chess engines.

By the way: what does your machine say now with the updated software in floatasm.zip?

Gunther
Get your facts first, and then you can distort them.

AW

  • Member
  • *****
  • Posts: 1309
  • Let's Make ASM Great Again!
Re: Test results for AVX and AVX-512 needed
« Reply #31 on: December 23, 2017, 04:59:28 AM »
@Felipe
We are in the era of self-driving cars! Any cheap robot can kick that stupid cat out of the window, reducing its many lives by one!

felipe

  • Member
  • ****
  • Posts: 833
  • Eagles are just great!
Re: Test results for AVX and AVX-512 needed
« Reply #32 on: December 23, 2017, 05:12:33 AM »
:biggrin:

Btw:
But that's a big field and if anyone wants to keep discussing these questions, we should do that in a separate thread in the Soap Box. I still have a lot to talk about chess engines.

 :t
Felipe.

six_L

  • Member
  • **
  • Posts: 132
Re: Test results for AVX and AVX-512 needed
« Reply #33 on: December 23, 2017, 05:18:19 AM »
Quote
Calculating the sum of a float array in 5 different variants.
That'll take a little while. Please be patient ...

Simple C implementation:
------------------------
sum1              = 8390656.00
Elapsed Time      = 62.32 Seconds

C implementation with 4 accumulators:
-------------------------------------
sum2              = 8390656.00
Elapsed Time      = 44.33 Seconds
Performance Boost = 141%

Assembly language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 5.80 Seconds
Performance Boost = 1075%

Assembly Language with 4 YMM accumulators:
------------------------------------------
sum4              = 8390656.00
Elapsed Time      = 2.91 Seconds
Performance Boost = 2145%

Your current CPU doesn't support the AVX-512 instruction set.
You'll need at least the Knights Landing or Skylake architecture.

The application terminates now.
Quote
Calculating the sum of a float array in 5 different variants.
That'll take a little while. Please be patient ...

Simple C with assembly code generated by VS:
--------------------------------------------
sum1              = 8390656.00
Elapsed Time      = 62.45 Seconds

C and 4 accumulators with assembly code generated by VS:
--------------------------------------------------------
sum2              = 8390656.00
Elapsed Time      = 20.63 Seconds
Performance Boost = 303%

Assembly language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 5.19 Seconds
Performance Boost = 1204%

Assembly Language with 4 YMM accumulators:
------------------------------------------
sum4              = 8390656.00
Elapsed Time      = 2.62 Seconds
Performance Boost = 2381%

Your current CPU doesn't support the AVX-512 instruction set.
You'll need at least the Knights Landing or Skylake architecture.

The application terminates now.
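
For readers wondering what separates the first two variants: the sketch below is not the code from floatasm.zip (which is not shown in this thread), only a minimal C illustration of the idea. The function names and the test array 1..4096 are assumptions; that array happens to sum to 8390656, the value reported above. With one accumulator, every addition has to wait for the previous one. Four independent accumulators let the CPU overlap the additions, and a compiler will not reorder float additions this way on its own unless it is told to relax the floating-point rules.

```c
#include <stddef.h>

/* Variant 1: one accumulator, every add depends on the previous add. */
float sum_simple(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Variant 2: four independent accumulators, combined at the end. */
float sum_4acc(const float *a, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)      /* leftover elements when n is not a multiple of 4 */
        s0 += a[i];

    return (s0 + s1) + (s2 + s3);
}

/* Hypothetical helper: fill a[0..n-1] with 1, 2, ..., n. */
void fill_ramp(float *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = (float)(i + 1);
}
```

Calling fill_ramp on a 4096-element array and passing it to either function should give the same sum, since every partial sum is an integer small enough to be exact in a float.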

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: Test results for AVX and AVX-512 needed
« Reply #34 on: December 23, 2017, 05:26:07 AM »
Thank you six_L for running the software and providing the results. What's your environment? I assume at least Windows 7-64. The processor would be interesting: Intel or AMD?

Gunther

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: Test results for AVX and AVX-512 needed
« Reply #35 on: December 23, 2017, 05:35:46 AM »
What I mean is this:
Quote from: felipe
Haha, and here we go again, right?  ;)

 :lol:

I would say that if that's totally true, then machines and computers will do everything better some day, but I think that's not correct. It's just a simple generalization. Humans will always be smarter than machines, even if we don't realize it.  :biggrin:

Btw, I always question the real importance of chess. Maybe it's a stupid game. Humans have given machines the role of doing stupid and brutal things, for the most part. So they can win a chess game, but a cat can piss on a computer.  :lol:

or that:
Quote from: aw27
@Felipe
We are in the era of self-driving cars! Any cheap robot can kick that stupid cat out of the window, reducing its many lives by one!

I am not the senior teacher here, just a simple forum member in the last row. Would it not be better to discuss such deep philosophical questions in separate threads in the Soap Box or in the Colosseum?

Gunther

felipe

  • Member
  • ****
  • Posts: 833
  • Eagles are just great!
Re: Test results for AVX and AVX-512 needed
« Reply #36 on: December 23, 2017, 05:38:49 AM »
Code:

Calculating the sum of a float array in 5 different variants.
That'll take a little while. Please be patient ...

Simple C with assembly code generated by VS:
--------------------------------------------
sum1              = 8390656.00
Elapsed Time      = 75.10 Seconds

C and 4 accumulators with assembly code generated by VS:
--------------------------------------------------------
sum2              = 8390656.00
Elapsed Time      = 25.73 Seconds
Performance Boost = 292%

Assembly language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 6.63 Seconds
Performance Boost = 1132%

Your current CPU doesn't support the AVX instruction set.
You'll need at least the Sandy Bridge or Ivy Bridge architecture.

The application terminates now.

Your current CPU doesn't support the AVX-512 instruction set.
You'll need at least the Knights Landing or Skylake architecture.

The application terminates now.


Windows 8.1... and an old Bay Trail  :redface:
Felipe.

six_L

  • Member
  • **
  • Posts: 132
Re: Test results for AVX and AVX-512 needed
« Reply #37 on: December 23, 2017, 05:41:10 AM »

felipe

  • Member
  • ****
  • Posts: 833
  • Eagles are just great!
Re: Test results for AVX and AVX-512 needed
« Reply #38 on: December 23, 2017, 05:41:53 AM »
Yeah, that's what I was saying with this:

Btw:
But that's a big field and if anyone wants to keep discussing these questions, we should do that in a separate thread in the Soap Box. I still have a lot to talk about chess engines.
:t

Felipe.

nidud

  • Member
  • *****
  • Posts: 1511
    • https://github.com/nidud/asmc
Re: Test results for AVX and AVX-512 needed
« Reply #39 on: December 23, 2017, 06:00:00 AM »
Hi Gunther

I have hardware with support up to AVX-2 but AVX-512 is now implemented in Asmc. Good to see hardware is available for testing.

Here's the result from floatassembly.exe (Win7-64):
Code:
Calculating the sum of a float array in 5 different variants.
That'll take a little while. Please be patient ...

Simple C with assembly code generated by VS:
--------------------------------------------
sum1              = 8390656.00
Elapsed Time      = 79.44 Seconds

C and 4 accumulators with assembly code generated by VS:
--------------------------------------------------------
sum2              = 8390656.00
Elapsed Time      = 19.88 Seconds
Performance Boost = 400%

Assembly language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 4.93 Seconds
Performance Boost = 1611%

Assembly Language with 4 YMM accumulators:
------------------------------------------
sum4              = 8390656.00
Elapsed Time      = 2.48 Seconds
Performance Boost = 3203%

Your current CPU doesn't support the AVX-512 instruction set.
You'll need at least the Knights Landing or Skylake architecture.

The application terminates now.
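
The Performance Boost figures look like the baseline time divided by the variant time, expressed as a percentage. That is an inference from the printed numbers (79.44 / 19.88 is about 4.00), not something taken from the benchmark source, and the function name below is made up for illustration.

```c
/* Ratio of the baseline time to the variant time, as a rounded percentage. */
long boost_percent(double baseline_seconds, double variant_seconds)
{
    double ratio = 100.0 * baseline_seconds / variant_seconds;
    return (long)(ratio + 0.5);   /* round to nearest for positive values */
}
```

Because the displayed times are rounded to two decimals, recomputing from them can differ from the printed boost by a few points. In six_L's first run, for example, 62.32 / 2.91 gives about 2142%, while the program printed 2145%.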

FORTRANS

  • Member
  • *****
  • Posts: 1002
Re: Test results for AVX and AVX-512 needed
« Reply #40 on: December 23, 2017, 08:28:46 AM »
Hi Gunther,

   i3, Win 8.1, notebook.

Code:
Calculating the sum of a float array in 5 different variants.
That'll take a little while. Please be patient ...

Simple C implementation:
------------------------
sum1              = 8390656.00
Elapsed Time      = 108.83 Seconds

C implementation with 4 accumulators:
-------------------------------------
sum2              = 8390656.00
Elapsed Time      = 73.05 Seconds
Performance Boost = 149%

Assembly language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 9.08 Seconds
Performance Boost = 1199%

Assembly Language with 4 YMM accumulators:
------------------------------------------
sum4              = 8390656.00
Elapsed Time      = 4.59 Seconds
Performance Boost = 2369%

Your current CPU doesn't support the AVX-512 instruction set.
You'll need at least the Knights Landing or Skylake architecture.

The application terminates now.

Calculating the sum of a float array in 5 different variants.
That'll take a little while. Please be patient ...

Simple C with assembly code generated by VS:
--------------------------------------------
sum1              = 8390656.00
Elapsed Time      = 107.92 Seconds

C and 4 accumulators with assembly code generated by VS:
--------------------------------------------------------
sum2              = 8390656.00
Elapsed Time      = 36.27 Seconds
Performance Boost = 298%

Assembly language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 9.08 Seconds
Performance Boost = 1189%

Assembly Language with 4 YMM accumulators:
------------------------------------------
sum4              = 8390656.00
Elapsed Time      = 4.58 Seconds
Performance Boost = 2357%

Your current CPU doesn't support the AVX-512 instruction set.
You'll need at least the Knights Landing or Skylake architecture.

The application terminates now.
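
The XMM variant in the benchmark is hand-written assembly, which is not shown in the thread. As a rough C equivalent, the sketch below uses SSE intrinsics with four 128-bit accumulators. The function name is an assumption, and a scalar fallback is used when SSE is not available at compile time. The YMM variant would follow the same pattern with 256-bit registers, which fits the roughly halved times reported.

```c
#include <stddef.h>
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

/* Sum n floats using four 128-bit accumulators (16 floats per iteration). */
float sum_xmm4(const float *a, size_t n)
{
    size_t i = 0;
    float total = 0.0f;

#ifdef HAVE_SSE
    __m128 acc0 = _mm_setzero_ps(), acc1 = _mm_setzero_ps();
    __m128 acc2 = _mm_setzero_ps(), acc3 = _mm_setzero_ps();
    float lanes[4];

    for (; i + 16 <= n; i += 16) {
        acc0 = _mm_add_ps(acc0, _mm_loadu_ps(a + i));
        acc1 = _mm_add_ps(acc1, _mm_loadu_ps(a + i + 4));
        acc2 = _mm_add_ps(acc2, _mm_loadu_ps(a + i + 8));
        acc3 = _mm_add_ps(acc3, _mm_loadu_ps(a + i + 12));
    }
    /* Fold the four registers into one, then add its four lanes. */
    acc0 = _mm_add_ps(_mm_add_ps(acc0, acc1), _mm_add_ps(acc2, acc3));
    _mm_storeu_ps(lanes, acc0);
    total = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
#endif

    for (; i < n; i++)          /* scalar tail, or everything without SSE */
        total += a[i];
    return total;
}
```

The unaligned loads keep the sketch safe for any float array; hand-tuned code would more likely align the buffer and use aligned loads.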

HTH,

Steve N.

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: Test results for AVX and AVX-512 needed
« Reply #41 on: December 23, 2017, 12:04:43 PM »
Felipe,

thank you for testing floatasm. It's simply the code of VS which aw27 provided. I think that I should re-arrange the if statement. My fault, excuse me, please.
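
Felipe's output prints the termination message twice, once for AVX and once for AVX-512, which may be what the if statement remark refers to. One possible arrangement is sketched below; it is not the floatasm source. The names `variants_to_run`, `detect_variants` and `report_limit` are hypothetical, and the detection uses the GCC/Clang builtin `__builtin_cpu_supports` rather than whatever the VS build uses.

```c
#include <stdio.h>

/* How many of the five variants can run, given the detected features.
   AVX-512 is only considered when AVX is present. */
int variants_to_run(int has_avx, int has_avx512)
{
    if (!has_avx)
        return 3;               /* C, C with 4 accumulators, XMM */
    if (!has_avx512)
        return 4;               /* ... plus YMM */
    return 5;                   /* ... plus ZMM */
}

/* Detect features with the GCC/Clang builtins (x86 only). Assumption:
   other compilers would query CPUID through their own intrinsics. */
int detect_variants(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return variants_to_run(__builtin_cpu_supports("avx"),
                           __builtin_cpu_supports("avx512f"));
#else
    return 3;   /* no detection available: claim nothing that needs AVX */
#endif
}

/* Print exactly one message, for the first missing instruction set. */
void report_limit(int variants)
{
    if (variants == 3)
        puts("Your current CPU doesn't support the AVX instruction set.");
    else if (variants == 4)
        puts("Your current CPU doesn't support the AVX-512 instruction set.");
}
```

A complete check also has to confirm that the operating system saves the YMM and ZMM register state, not only that the CPU reports the instructions.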

six_L,

special thanks for your detailed environment information. Where can I find Raistlin's software?

nidud,

Quote from: nidud
I have hardware with support up to AVX-2 but AVX-512 is now implemented in Asmc. Good to see hardware is available for testing.

Wow, impressive link. It seems that you've included the complete instruction set, including the new mask registers.  :t

Steve (aka FORTRANS),

I am looking forward to hearing from you again. We had a long break. I very much hope that we will work together in the same comradely way as we used to. In that spirit: thank you for testing. Not bad for a small i3.

To sum up, so far all testers are driving on the Intel rail. Is AMD out of fashion?

Gunther

HSE

  • Member
  • ****
  • Posts: 683
  • <AMD>< 7-32>
Re: Test results for AVX and AVX-512 needed
« Reply #42 on: December 23, 2017, 12:13:06 PM »
Perfect now :t

Float:
Code:
Calculating the sum of a float array in 5 different variants.
That'll take a little while. Please be patient ...

Simple C implementation:
------------------------
sum1              = 8390656.00
Elapsed Time      = 85.06 Seconds

C implementation with 4 accumulators:
-------------------------------------
sum2              = 8390656.00
Elapsed Time      = 56.41 Seconds
Performance Boost = 151%

Assembly language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 7.21 Seconds
Performance Boost = 1180%

Your current CPU doesn't support the AVX instruction set.
You'll need at least the Sandy Bridge or Ivy Bridge architecture.

The application terminates now.

Your current CPU doesn't support the AVX-512 instruction set.
You'll need at least the Knights Landing or Skylake architecture.

The application terminates now.

Floatassembly:
Code:
Calculating the sum of a float array in 5 different variants.
That'll take a little while. Please be patient ...

Simple C with assembly code generated by VS:
--------------------------------------------
sum1              = 8390656.00
Elapsed Time      = 84.75 Seconds

C and 4 accumulators with assembly code generated by VS:
--------------------------------------------------------
sum2              = 8390656.00
Elapsed Time      = 28.91 Seconds
Performance Boost = 293%

Assembly language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 7.19 Seconds
Performance Boost = 1178%

Your current CPU doesn't support the AVX instruction set.
You'll need at least the Sandy Bridge or Ivy Bridge architecture.

The application terminates now.

Your current CPU doesn't support the AVX-512 instruction set.
You'll need at least the Knights Landing or Skylake architecture.

The application terminates now.

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: Test results for AVX and AVX-512 needed
« Reply #43 on: December 23, 2017, 12:36:04 PM »
Hi HSE,

good to see that. Please excuse the inconvenience. But where the hell did instruction set number 13 come from? I have no answer, to be honest.

Gunther

felipe

  • Member
  • ****
  • Posts: 833
  • Eagles are just great!
Re: Test results for AVX and AVX-512 needed
« Reply #44 on: December 23, 2017, 02:26:20 PM »
Gunther, here you will find that great work from Raistlin:

http://masm32.com/board/index.php?topic=5964.0

 :t
Felipe.