The archive iset.zip contains an assembly language procedure for detecting the available instruction sets. There are two sub-folders, one for Windows and one for Linux. A suitable test program is included, too. The procedure properly detects the instruction sets up to AVX2.
The source is included and well commented. Comments and proposals for improvement are welcome.
Gunther
Waiting for 32 bit version. :t
Hello Gunther;
Good work Sir, worked fine.
A minor suggestion on the Linux side: I noticed that build_features.sh has CRLF line endings; I changed them to LF and was able to compile.
To compile it using g++, I changed the line in features.c:
extern long long int Iset(void);
to
extern "C" long long int Iset(void);
and changed all gcc to g++ inside build_features.sh
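A possible way to avoid editing features.c for each compiler is a guarded declaration. The following is only a sketch: the name and signature of Iset are taken from the lines above, while the stub definition at the end is a hypothetical stand-in that lets the fragment link without the assembly object file.

```c
/* Sketch: one declaration of the assembly routine that both gcc and g++ accept.
   Only the name and signature of Iset come from the thread. */
#ifdef __cplusplus
extern "C" {            /* g++: keep the plain C symbol name, no name mangling */
#endif

long long int Iset(void);

#ifdef __cplusplus
}
#endif

/* Hypothetical stand-in so the sketch links on its own. In the real build the
   symbol comes from the assembled object file and this stub must be removed. */
long long int Iset(void)
{
    return 0;
}
```

With a guard like this the same source should build with gcc and with g++, so build_features.sh would only need the compiler name changed.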
Again, good work, long life and prosper, congrats.
Hi mineiro and Frank,
thank you for the flowers. The 32 bit version is coming soon.
Gunther
The new instruction set detection procedure is now attached to the first post of this thread. The archive is ISET1.ZIP. It now checks for RDRAND support, too, and reports it if the check succeeds.
RDRAND (http://software.intel.com/sites/default/files/m/d/4/1/d/8/441_Intel_R__DRNG_Software_Implementation_Guide_final_Aug7.pdf) generates random numbers with hardware support. It was introduced by Intel in 2012 with the Ivy Bridge architecture; older processors won't support this instruction.
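For readers who want to see what such a check can look like in C, here is a minimal sketch. It is not the code from the archive. It assumes gcc or clang, whose cpuid.h header provides __get_cpuid; the function names rdrand_bit_set and cpu_has_rdrand are made up for this example. The CPU reports RDRAND in bit 30 of ECX after CPUID leaf 1.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>      /* gcc/clang helper header, an assumption of this sketch */
#endif

/* CPUID leaf 1 reports RDRAND in ECX bit 30. */
int rdrand_bit_set(uint32_t leaf1_ecx)
{
    return (int)((leaf1_ecx >> 30) & 1u);
}

/* Returns 1 if the CPU reports RDRAND, otherwise 0 (always 0 on non-x86 builds). */
int cpu_has_rdrand(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;       /* leaf 1 not available */
    return rdrand_bit_set(ecx);
#else
    return 0;
#endif
}
```

Note that this only answers whether the instruction exists; RDRAND itself can still fail at run time and signals that through the carry flag.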
Here's the output of the test program on my Windows 7 machine:
Supported by Processor and installed Operating System:
------------------------------------------------------
MMX, CMOV and FCOMI, SSE, SSE2, SSE3, SSSE3, SSE4.1,
POPCNT, SSE4.2, AVX, RDRAND, PCLMUL and AES
featurenumber = 16
The archive contains the sources for Windows, Linux, BSD and MacOS.
There's nothing wrong with using RDRAND for Monte Carlo simulations and the like. But if one plans to use it for encryption purposes, increased caution is needed. You should check this (http://masm32.com/board/index.php?topic=2406.msg25139#msg25139) and this (http://masm32.com/board/index.php?topic=2406.msg25152#msg25152) post first.
Comments and proposals for improvement are welcome. Good luck.
Gunther
The archive IsetU1.zip is now attached to the first post of this thread. It's a major update of the instruction detection procedure. I've changed the logic of the application a bit. It now uses several routines for detecting basic instruction sets and special instructions. As a major consequence, this archive contains only the 64-bit source for Windows; the Linux program needs another thread. That has to do with the very different application binary interfaces of the two operating systems.
I've used gcc for the C source. Since there's nothing special in it, it should also compile with VS, but that's not tested. The assembly language source is written for jWasm, but ml should also do the job. For a more detailed description of the source files, please have a look at the readme.txt file.
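As an illustration of one such routine, the vendor string can be assembled from CPUID leaf 0, which returns its twelve characters in EBX, EDX and ECX, in that order. This is a sketch in C, not the jWasm code from the archive; the names put4, vendor_from_regs and read_vendor are hypothetical, and read_vendor assumes the gcc/clang header cpuid.h.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>      /* gcc/clang helper header, an assumption of this sketch */
#endif

/* Unpack one register into four characters, lowest byte first. */
static void put4(char *dst, uint32_t reg)
{
    for (int i = 0; i < 4; i++)
        dst[i] = (char)((reg >> (8 * i)) & 0xFFu);
}

/* CPUID leaf 0 delivers the vendor string in the order EBX, EDX, ECX. */
void vendor_from_regs(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    put4(out + 0, ebx);
    put4(out + 4, edx);
    put4(out + 8, ecx);
    out[12] = '\0';
}

/* Returns 1 and fills out on success, 0 if CPUID is unavailable or not x86. */
int read_vendor(char out[13])
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;
    vendor_from_regs(ebx, edx, ecx, out);
    return 1;
#else
    out[0] = '\0';
    return 0;
#endif
}
```

The brand string works in a similar way, but it is spread over the extended leaves 80000002h to 80000004h.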
Here is the output from my desktop PC, running Windows 7:
Supported Features by Processor and Operating System
====================================================
Vendor String: GenuineIntel
Brand String: Intel(R)Core(TM)i7-3770CPU@3.40GHz
Instruction Sets
----------------
MMX SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 AVX
Supported Special Instructions
------------------------------
Conditional Moves
FXSAVE and FXSTOR
XSAVE and XSTOR for processor extended state management.
POPCNT
RDRAND
AES (Advanced Encryption Standard) Instruction Set
16-bit floating-point Conversion Instructions
Please, press enter to end the application ...
That's the output from the small laptop, running Windows 8:
Supported Features by Processor and Operating System
====================================================
Vendor String: GenuineIntel
Brand String: Intel(R)Core(TM)i5-3210MCPU@2.50GHz
Instruction Sets
----------------
MMX SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 AVX
Supported Special Instructions
------------------------------
Conditional Moves
FXSAVE and FXSTOR
XSAVE and XSTOR for processor extended state management.
POPCNT
RDRAND
AES (Advanced Encryption Standard) Instruction Set
16-bit floating-point Conversion Instructions
Please, press enter to end the application ...
I don't have an AMD box running 64-bit Windows, but I hope it runs on more than just newer Intel processors. All of that must be tested with different processors and other configurations. Your help and test results are very welcome.
Gunther
My test box - Windows 7 Ultimate x64
Supported Features by Processor and Operating System
====================================================
Vendor String: AuthenticAMD
Brand String: AMDA10-7850KAPUwithRadeon(TM)R7Graphics
Instruction Sets
----------------
MMX SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 AVX
Supported Special Instructions
------------------------------
Conditional Moves
FXSAVE and FXSTOR
XSAVE and XSTOR for processor extended state management.
POPCNT
AES (Advanced Encryption Standard) Instruction Set
FMA (Fused Multiply Add) extensions using YMM state.
16-bit floating-point Conversion Instructions
Hi sinsi,
thank you for testing the software. Seems that your CPU is a Bulldozer follower. It comes with FMA.
Gunther
Kaveri (http://en.wikipedia.org/wiki/List_of_AMD_Accelerated_Processing_Unit_microprocessors#.22Kaveri.22_.282014.2C_28_nm.29), based on the Steamroller microarchitecture.
Hi Sinsi,
Quote from: sinsi on April 27, 2014, 08:04:42 PM
Kaveri (http://en.wikipedia.org/wiki/List_of_AMD_Accelerated_Processing_Unit_microprocessors#.22Kaveri.22_.282014.2C_28_nm.29), based on the Steamroller microarchitecture.
thank you for the information. Interesting product line by AMD.
Gunther
so, Gunther,
did you figure out whether or not a 64-bit OS is required to use AVX instruction sets ?
Dave,
Quote from: dedndave on June 24, 2014, 10:33:51 PM
so, Gunther,
did you figure out whether or not a 64-bit OS is required to use AVX instruction sets ?
To use AVX instructions properly you'll need a 64-bit OS. So AVX, AVX2, AVX512 etc. are not available on 32-bit operating systems. Hence that thread (http://masm32.com/board/index.php?topic=3191.0), but I got no answers.
Gunther
thanks :t
one more question.....
what about a 32-bit program under 64-bit OS - can you use AVX then ?
Dave,
Quote from: dedndave on June 25, 2014, 01:36:08 AM
what about a 32-bit program under 64-bit OS - can you use AVX then ?
Yes, a 32-bit client can use at least AVX under a 64-bit environment without re-compiling. That's tested.
Gunther
thanks again, Gunther :t
working on a 32-bit hardware ID routine and wanted to know if AVX was pertinent
Dave,
you can test it with that software (http://masm32.com/board/index.php?topic=3227.msg33791#msg33791).
Gunther
i don't have a processor with AVX support - that's why i'm asking questions - lol
tell me if i have these right (assuming the processor supports AVX)...
1) windows 7 32-bit with SP1, or later, will support AVX
2) windows 7 64-bit with SP1, or later, will support AVX/AVX2
3) some buggish behaviour with windows 7, with regard to debugging on an AVX machine
more specifically: debugging a 32-bit app under win7-64 on an AVX CPU, exceptions not handled correctly
Quote
New x86 processors, such as the Intel Sandy Bridge (formerly Gesher) processor, support the AVX instructions and register set (YMM0-YMM15). In Windows 7 with Service Pack 1 (SP1) and Windows Server 2008 R2, both 32-bit and 64-bit versions of the operating system preserve the AVX registers across thread (and process) switches.
http://msdn.microsoft.com/en-us/library/ff545910.aspx
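Whether the running OS really preserves the YMM registers can be asked at run time: CPUID leaf 1 reports OSXSAVE (ECX bit 27) and AVX (ECX bit 28), and XGETBV with ECX = 0 returns XCR0, where bits 1 and 2 say that XMM and YMM state are saved. The following is a sketch under those assumptions; the function and macro names are invented for this example, and the inline assembly assumes gcc or clang.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>      /* gcc/clang helper header, an assumption of this sketch */
#endif

#define ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE and XGETBV */
#define ECX_AVX     (1u << 28)  /* CPU implements AVX */
#define XCR0_SSE    (1u << 1)   /* OS saves XMM state */
#define XCR0_AVX    (1u << 2)   /* OS saves YMM state */

/* Pure decision: AVX is usable only if CPU and OS both agree. */
int avx_usable(uint32_t leaf1_ecx, uint64_t xcr0)
{
    const uint32_t need_cpu = ECX_OSXSAVE | ECX_AVX;
    const uint64_t need_os  = XCR0_SSE | XCR0_AVX;
    if ((leaf1_ecx & need_cpu) != need_cpu)
        return 0;
    return (xcr0 & need_os) == need_os;
}

/* Same decision for the machine the program runs on (0 on non-x86 builds). */
int avx_usable_here(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    uint32_t lo, hi;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    if (!(ecx & ECX_OSXSAVE))
        return 0;       /* XGETBV would fault without OSXSAVE */
    __asm__ volatile ("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0u));
    return avx_usable(ecx, ((uint64_t)hi << 32) | lo);
#else
    return 0;
#endif
}
```

Running a 32-bit and a 64-bit build of such a check on the same AVX machine would show directly what each OS version reports.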
Dave,
Quote from: dedndave on June 27, 2014, 04:49:31 AM
tell me if i have these right (assuming the processor supports AVX)...
1) windows 7 32-bit with SP1, or later, will support AVX
No, that's not right. Here's the output of the test program (http://masm32.com/board/index.php?topic=3227.msg33791#msg33791) under Windows 7-32:
AVX support not available:
==========================
X[0] = 1.00 Y[0] = 1.00
X[1] = 2.00 Y[1] = 2.00
X[2] = 3.00 Y[2] = 3.00
X[3] = 4.00 Y[3] = 4.00
Please, press enter to end the application ...
Quote from: dedndave on June 27, 2014, 04:49:31 AM
2) windows 7 64-bit with SP1, or later, will support AVX/AVX2
That's correct. The same application produces this output under Windows 7-64:
AVX support available:
======================
X[0] = 1.00 Y[0] = 1.00 Z[0] = 1.00
X[1] = 2.00 Y[1] = 2.00 Z[1] = 2.00
X[2] = 3.00 Y[2] = 3.00 Z[2] = 3.00
X[3] = 4.00 Y[3] = 4.00 Z[3] = 4.00
Please, press enter to end the application ...
That has to do with the VEX prefix for AVX, AVX2, AVX512 ... etc. The instruction:
lds si, [bp+10]
is encoded as:
C5760A
The instructions LDS and LES are not valid under 64-bit operating systems. So they are used as VEX prefixes.
vmovaps ymm0, [esi]
is encoded as:
C5FC2806
The result is that a 32-bit application can't use AVX instructions in a native 32-bit environment. The only chance for such a program is to run under a 64-bit operating system in compatibility mode. But that's not very logical. Why not use a native 64-bit program for the same purpose?
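The two byte sequences shown above (C5760A and C5FC2806) can be compared with a few lines of C. The sketch below only looks at the byte after C5h: as far as the Intel manuals describe it, outside 64-bit mode a value with the top two bits set (11b) marks a 2-byte VEX prefix, while any other value is the ModRM byte of LDS with a memory operand. The function names are hypothetical.

```c
/* Outside 64-bit mode: C5h plus a byte with bits 7..6 = 11b is a 2-byte VEX
   prefix; otherwise the second byte is the ModRM byte of LDS. */
int is_vex2_outside_64bit_mode(const unsigned char *code)
{
    return code[0] == 0xC5 && (code[1] & 0xC0) == 0xC0;
}

/* VEX.L is bit 2 of the second prefix byte: 1 selects 256-bit (ymm) operands. */
int vex2_is_256bit(const unsigned char *code)
{
    return (code[1] >> 2) & 1;
}

/* VEX.pp is bits 1..0: 0 = no implied prefix, 1 = 66h, 2 = F3h, 3 = F2h. */
int vex2_pp(const unsigned char *code)
{
    return code[1] & 3;
}
```

For C5 76 0A the second byte is 01110110b, so it reads as LDS; for C5 FC 28 06 it is 11111100b, which gives a VEX prefix with L = 1 and pp = 0, followed by opcode 28h (MOVAPS) and the ModRM byte 06h.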
Gunther
The archive IsetsAVX-512.zip under the first post of this thread contains in \IsetsAVX-512\Isets\Source the following files:
build_isets.bat: The batch file for building the running EXE
is.asm: Assembly language procedures
ISets.c: C file with main()
ISets.h: C header with some variable declarations etc.
ISetsFunc.c: C file with functions
The binary folder contains the appropriate binary files. The C files are compiled or pre-compiled with gcc version 7.2.0, but there's nothing special about them; they should compile with any other C compiler (clang, VS, PellesC etc.). But this must be checked. Please have a look at the batch file for the appropriate switches. The assembly language source is written for NASM/YASM, because I have no other assembler running on my new machine, at the moment at least. I hope that in the next few days I'll get the MASM64 package running. With some minor changes ml64 should also do the job, I hope.
In the last few days I had a very hard fight with the AVX-512 instruction set; it's a strange beast. I've checked the latest Intel manuals, and the search engine was running at full speed. The manuals are a bit tortuous and in some cases not accurate. On the other hand, one can find a lot of AVX-512 detection code written with those crippled intrinsics. The point is: in most cases the Intel compiler is used, which has a built-in detection wrapper; gcc has another set of intrinsics, and VS has a different intrinsic set, too. It's a shame. Furthermore, the intrinsic code isn't very readable. The whole weekend was spent on this work. All things considered, I'm doing just fine.
What's the result? There are several sets and subsets of AVX-512 instructions. You have to be good at set theory to draw the corresponding Venn diagram. Here's what I figured out:
- AVX-512 F is the fundamental instruction set, it expands most of AVX functions to support 512-bit registers and adds masking, embedded broadcasting, embedded rounding and exception control.
- AVX-512 CD is the Conflict Detection instruction set, which allows vectorization of loops with vector dependency due to writing conflicts.
- AVX-512 BW is the Byte and Word support instruction set: 8-bit and 16-bit integer operations, processing up to 64 8-bit elements or 32 16-bit integer elements per vector.
- AVX-512 DQ is the Doubleword and Quadword instruction set: new instructions for doubleword (32-bit) and quadword (64-bit) integer and floating-point elements.
- AVX-512 VL is the Vector Length extension: support for vector lengths smaller than 512 bits.
- AVX-512 PF is the Prefetch instruction set: data prefetching for gather and scatter instructions.
- AVX-512 ER is the Exponential and Reciprocal instruction set for high-accuracy base-2 exponential functions, reciprocals, and reciprocal square roots.
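The subsets in this list are reported as individual bits in EBX after CPUID leaf 7, sub-leaf 0. The following C sketch decodes them; it is not the NASM code from the archive. The enum and function names are invented here, and read_leaf7_ebx assumes the gcc/clang header cpuid.h. OS support is a separate question: besides the XMM and YMM bits, XCR0 must also have bits 5 to 7 set (opmask registers and both ZMM parts).

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>      /* gcc/clang helper header, an assumption of this sketch */
#endif

/* Bit positions in EBX of CPUID leaf 7, sub-leaf 0. */
enum avx512_bit {
    AVX512_F  = 16, AVX512_DQ = 17, AVX512_PF = 26, AVX512_ER = 27,
    AVX512_CD = 28, AVX512_BW = 30, AVX512_VL = 31
};

/* 1 if the CPU reports the given subset, otherwise 0. */
int avx512_reported(uint32_t leaf7_ebx, enum avx512_bit bit)
{
    return (int)((leaf7_ebx >> bit) & 1u);
}

/* XCR0 bits 1, 2 (XMM, YMM) and 5, 6, 7 (opmask, upper ZMM halves, ZMM16-31). */
int zmm_state_enabled(uint64_t xcr0)
{
    return (xcr0 & 0xE6u) == 0xE6u;
}

/* EBX of leaf 7 on this machine, or 0 if the leaf is missing or not x86. */
uint32_t read_leaf7_ebx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid_max(0, 0) < 7)
        return 0;
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    return ebx;
#else
    return 0;
#endif
}
```

A call such as avx512_reported(read_leaf7_ebx(), AVX512_BW) then tests a single subset, so each line of the list maps to one bit test.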
All together: a lot of work for assembler and compiler builders. The Knights Landing architecture uses AVX-512 F, CD, PF, ER while the Skylake architecture uses AVX-512 F, CD, BW, DQ, ER. And it seems to me that Intel plans further instruction set extensions for the future. Here is the output of my instruction set detection:
Supported Features by Processor and Operating System
====================================================
Vendor String: GenuineIntel
Brand String: Intel(R)Core(TM)i7-7820XCPU@3.60GHz
Instruction Sets
----------------
MMX SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 AVX AVX2
AVX-512 F - Fundamental Instructions
AVX-512 DQ - Double and Quad Word Instructions
AVX-512 CD - Conflict Detection Instructions
AVX-512 BW - Byte and Word Support Instructions
AVX-512 ER - Exponential and Reciprocal Instructions
Please, press enter to end the application ...
So I have no other choice: in the future I'll have to implement the AVX-512 detection in a separate procedure, because the situation is very confusing. But for now it works fine and is tested under Windows 10 with the Skylake architecture. Other processors (AMD etc.) will give different output, of course. Test results and comments for improvement are very welcome. Have fun.
Gunther
Thanks for sharing your good work. I have a celeron processor that supports instructions up to sse4.2. No AVX at all. The programs behaved well. (I have windows 8.1). I wonder what's the name of this architecture. :icon_redface:
:P
Thank you for the flowers, Felipe.
Quote
I have a celeron processor that supports instructions up to sse4.2. No AVX at all. The programs behaved well.
That's not unusual. The detection up to AVX2 isn't a big deal; the tricky part is the various AVX-512 detections.
Quote
I wonder what's the name of this architecture.
The Intel and AMD architecture and code names are a science in their own right. Here (http://arith22.gforge.inria.fr/slides/s1-cornea.pdf) is one of the countless documents which I've read last weekend. Please check for example page 2 of the slide show. But that's not all. For a complete list of Intel processors and chipsets, please have a look here. (https://www.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/platform-codenames.html) I hope that helps.
Gunther
Thanks. As far as i can see, it is a former Bay Trail. :bgrin:
Felipe,
Quote
As far as i can see, it is a former Bay Trail. :bgrin:
You'll have to discuss that point with your manufacturer or re-seller. Show him the documents. :t
Gunther
Hi Gunther, i received a used (second hand) computer. You can see the specifications of the processor here:
https://ark.intel.com/products/43546/Intel-Core-i5-650-Processor-4M-Cache-3_20-GHz
It's a shame that, being an i5, it does not have support for avx instructions at all. But i'm very happy with this new toy anyway :greenclp:.
I want to share this here because i used your program to detect the instruction set and i couldn't believe the results :shock:. Then i checked with the qeditor(about) and finally with raistlin tool extreme id. Well, it was not until i found the specs on the intel page that i was 100% sure :redface:. So, this is a little extra feedback: :t
Supported Features by Processor and Operating System
====================================================
Vendor String: GenuineIntel
Brand String: Intel(R)Core(TM)i5CPU650@3.20GHz
Instruction Sets
----------------
MMX SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2
Supported Special Instructions
------------------------------
Conditional Moves
FXSAVE and FXSTOR
POPCNT
AES (Advanced Encryption Standard) Instruction Set
Please, press enter to end the application ...
Hi Felipe,
Quote from: felipe on December 29, 2017, 08:49:06 AM
Hi Gunther, i received a used (second hand) computer. You can see the specifications of the processor here:
https://ark.intel.com/products/43546/Intel-Core-i5-650-Processor-4M-Cache-3_20-GHz
Found.
Quote from: felipe on December 29, 2017, 08:49:06 AM
It's a shame that, being an i5, it does not have support for avx instructions at all. But i'm very happy with this new toy anyway :greenclp:.
Intel has several different product lines. You have to check that out beforehand, otherwise you will experience quite nasty surprises. But all in all, that's not a bad machine.
Gunther
felipe,
Sounds like a good score; an i5 is a good, sound 4 core machine. I have my last i7 alongside me and it has almost identical specs to an i5, so you will get a lot of decent code written on it. In terms of usefulness, late SSE and AVX are OK, but very little software uses them unless it has processor detection and optional methods for machines that only have late SSE (SSE4.2).
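The "processor detection and optional methods" idea can be sketched in C with a function pointer that is chosen once at start-up. Everything below is illustrative: the names are hypothetical, sum_wide is plain C standing in for a real AVX routine, and cpu_reports_avx relies on the gcc/clang builtins __builtin_cpu_init and __builtin_cpu_supports.

```c
#include <stddef.h>

typedef float (*sum_fn)(const float *x, size_t n);

/* Fallback path for machines without AVX. */
float sum_baseline(const float *x, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Stand-in for an AVX path: eight partial sums, like one ymm register of floats. */
float sum_wide(const float *x, size_t n)
{
    float acc[8] = {0};
    float s = 0.0f;
    size_t i = 0;
    for (; i + 8 <= n; i += 8)
        for (size_t k = 0; k < 8; k++)
            acc[k] += x[i + k];
    for (size_t k = 0; k < 8; k++)
        s += acc[k];
    for (; i < n; i++)      /* leftover elements */
        s += x[i];
    return s;
}

/* 1 if the compiler's run-time check reports AVX, otherwise 0. */
int cpu_reports_avx(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") ? 1 : 0;
#else
    return 0;
#endif
}

/* Pick the routine once; callers then use the pointer without further checks. */
sum_fn pick_sum(int have_avx)
{
    return have_avx ? sum_wide : sum_baseline;
}
```

A program would call pick_sum(cpu_reports_avx()) once and keep the returned pointer, so the same EXE runs on an SSE4.2-only machine and on an AVX machine.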