Topic: AVX for 32-bit Windows applications

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 7553
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: AVX for 32-bit Windows applications
« Reply #45 on: July 15, 2014, 11:19:52 AM »
I only got this far with my old Core2 quad.

Code:
Calculating the sum of a float array in different ways.
That'll take a little while. Please be patient ...

Simple C implementation:
------------------------
sum1              = 8390656.00
Elapsed Time      = 16.44 Seconds

C implementation with 4 accumulators:
-------------------------------------
sum2              = 8390656.00
Elapsed Time      = 9.61 Seconds
Performance Boost = 171%

Assembly Language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 1.42 Seconds
Performance Boost = 1156%

Your current CPU doesn't support the AVX instruction set.
The application terminates now.
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :skrewy:

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: AVX for 32-bit Windows applications
« Reply #46 on: July 15, 2014, 11:29:39 AM »
Thank you Hutch. It's clear that the Core2 doesn't support AVX. But the other timings are interesting.
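
For anyone reading along without the source: the gap between the first two C variants most likely comes from breaking the dependency chain on a single accumulator. Here is a minimal sketch of that idea. It is not the code from fsum.exe; the names sum_simple and sum_4acc are made up for illustration, and the scalar tail loop for leftover elements is an assumption of mine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; this is NOT the code from fsum.exe. */

/* One accumulator: each add has to wait for the previous one. */
static float sum_simple(const float *a, size_t n)
{
    assert(a != NULL || n == 0);
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four independent accumulators: the four adds in the loop body do not
   depend on each other, so the CPU can overlap them. */
static float sum_4acc(const float *a, size_t n)
{
    assert(a != NULL || n == 0);
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    float s = (s0 + s1) + (s2 + s3);
    for (; i < n; i++)          /* scalar tail for the leftover elements */
        s += a[i];
    return s;
}
```

Note that the two functions add the elements in a different order, so with general float data the results can differ in the last digits. The XMM and YMM versions extend the same idea by making each accumulator a 4-wide or 8-wide vector register.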

Sinsi, you're using VMware and that allows AVX. Interesting point. Thank you, too.
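
On the detection side: AVX is usable only if the CPU reports it and the operating system (here the guest OS) has enabled saving the YMM state. Below is a rough sketch of that decision. I don't claim this is how fsum.exe does it; avx_usable and the macro names are hypothetical. To keep it portable, the function takes the raw register values as parameters, so the caller is assumed to fetch ECX from CPUID leaf 1 and XCR0 via XGETBV with whatever the compiler offers.

```c
#include <stdint.h>

/* Hypothetical helper, NOT taken from fsum.exe. */
#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE/XGETBV   */
#define CPUID1_ECX_AVX     (1u << 28)  /* CPU implements AVX            */
#define XCR0_SSE_STATE     (1u << 1)   /* OS saves XMM registers        */
#define XCR0_AVX_STATE     (1u << 2)   /* OS saves upper halves of YMM  */

/* cpuid1_ecx: ECX after CPUID with EAX = 1.
   xcr0:       result of XGETBV with ECX = 0. XGETBV may only be executed
               when the OSXSAVE bit is set, so pass 0 otherwise.
   Returns 1 if AVX instructions can be used safely, else 0. */
static int avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    const uint32_t need_ecx  = CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX;
    const uint64_t need_xcr0 = XCR0_SSE_STATE | XCR0_AVX_STATE;

    if ((cpuid1_ecx & need_ecx) != need_ecx)
        return 0;                           /* CPU or OS lacks support */
    return (xcr0 & need_xcr0) == need_xcr0; /* OS must save XMM and YMM */
}
```

The second check is the reason a virtual machine matters here: the hypervisor has to pass the AVX bit through to the guest, and the guest OS has to enable the YMM state itself.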

Gunther
Get your facts first, and then you can distort them.

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: AVX for 32-bit Windows applications
« Reply #47 on: August 23, 2014, 08:31:35 PM »
Yesterday I installed VMware Player with Windows 7-64 as host and Windows 7-32 as guest. Here are the results for fsum.exe, which can be found in post #43:

Quote

Calculating the sum of a float array in different ways.
That'll take a little while. Please be patient ...

Simple C implementation:
------------------------
sum1              = 8390656.00
Elapsed Time      = 12.96 Seconds

C implementation with 4 accumulators:
-------------------------------------
sum2              = 8390656.00
Elapsed Time      = 6.46 Seconds
Performance Boost = 201%

Assembly Language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 1.09 Seconds
Performance Boost = 1187%

Assembly Language with 4 YMM accumulators:
------------------------------------------
sum4              = 8390656.00
Elapsed Time      = 0.75 Seconds
Performance Boost = 1733%


Gunther
Get your facts first, and then you can distort them.

habran

  • Member
  • *****
  • Posts: 1225
    • uasm
Re: AVX for 32-bit Windows applications
« Reply #48 on: November 02, 2014, 02:41:58 PM »
I have found an error in the JWasm entries for vsqrtps and vsqrtpd: the VX_NND flag is missing.

File instravx.h, line 30, was:
Code:
avxins (SQRTPD,   vsqrtpd,       P_AVX, VX_L ) /* L, s */
avxins (SQRTPS,   vsqrtps,       P_AVX, VX_L ) /* L, s */
change to:
Code:
avxins (SQRTPD,   vsqrtpd,       P_AVX, VX_L|VX_NND ) /* L, ns */
avxins (SQRTPS,   vsqrtps,       P_AVX, VX_L|VX_NND ) /* L, ns */
Cod-Father

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: AVX for 32-bit Windows applications
« Reply #49 on: November 03, 2014, 02:04:09 AM »
Good catch, habran.  :t

Gunther

Get your facts first, and then you can distort them.

habran

  • Member
  • *****
  • Posts: 1225
    • uasm
Re: AVX for 32-bit Windows applications
« Reply #50 on: November 03, 2014, 05:40:05 AM »
Thanks mate  :biggrin:
Cod-Father