Attached is the file DOSAVX.zip. It contains the sources and binaries needed to build the program, including:
AVX.COM: COM file that prepares the AVX instruction set for use under plain DOS
B_DFSUM.BAT: Batch file to build the EXE
DFSUM.C: Main function
DFSUMF.C: File with the C functions
DFSUMF.ASM: File with the assembly language functions
For more details, please read that thread very carefully.
Here is the output of the program under VirtualBox:
Calculating the sum of a float array in 4 different ways.
That'll take a little while. Please be patient ...
Simple C implementation:
------------------------
sum1 = 8390656.00
Elapsed Time = 45.38 Seconds
C implementation with 4 accumulators:
-------------------------------------
sum2 = 8390656.00
Elapsed Time = 17.86 Seconds
Performance Boost = 254%
Assembly Language with 4 XMM accumulators:
------------------------------------------
sum3 = 8390656.00
Elapsed Time = 1.21 Seconds
Performance Boost = 3755%
Assembly Language with 4 YMM accumulators:
------------------------------------------
sum4 = 8390656.00
Elapsed Time = 0.82 Seconds
Performance Boost = 5507%
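For readers who do not have the archive at hand, the difference between the first two variants can be sketched roughly as follows. This is only a minimal sketch and not the code from DFSUM.C or DFSUMF.C: the function names sum_simple, sum_4acc and fill_ramp are hypothetical, and the test data (the values 1 to 4096) is an assumption that merely happens to reproduce the printed sum of 8390656.00.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fill a[0..n-1] with 1.0, 2.0, ..., n.
   The real test data in the archive may be different. */
void fill_ramp(float *a, size_t n)
{
    size_t i;
    assert(a != NULL);
    for (i = 0; i < n; i++)
        a[i] = (float)(i + 1);
}

/* Simple variant: one accumulator, so every addition has to
   wait for the result of the previous one. */
float sum_simple(const float *a, size_t n)
{
    float s = 0.0f;
    size_t i;
    for (i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four accumulators: the four additions inside the loop body
   do not depend on each other, so they can overlap. */
float sum_4acc(const float *a, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i;
    for (i = 0; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Remaining 0 to 3 elements when n is not a multiple of 4. */
    for (; i < n; i++)
        s0 += a[i];
    return (s0 + s1) + (s2 + s3);
}
```

Calling fill_ramp(a, 4096) on a float array and then either function returns 8390656.0 in this sketch, because all partial sums are integers below 2^24 and therefore exact in single precision. With other data the two variants may differ in the last digits, since they add the elements in a different order.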
Some test results under other configurations would be welcome.
Gunther