Instruction Set detection for 64 bit Operating Systems

Started by Gunther, February 01, 2013, 05:50:10 AM

Gunther

Dave,

you can test it with that software.

Gunther
You have to know the facts before you can distort them.

dedndave

i don't have a processor with AVX support - that's why i'm asking questions - lol

tell me if i have these right (assuming the processor supports AVX)...
1) windows 7 32-bit with SP1, or later, will support AVX
2) windows 7 64-bit with SP1, or later, will support AVX/AVX2
3) some buggish behaviour with windows 7, with regard to debugging on an AVX machine
more specifically: debugging a 32-bit app under win7-64 on an AVX CPU, exceptions not handled correctly

Quote
New x86 processors, such as the Intel Sandy Bridge (formerly Gesher) processor, support the AVX instructions and register set (YMM0-YMM15). In Windows 7 with Service Pack 1 (SP1) and Windows Server 2008 R2, both 32-bit and 64-bit versions of the operating system preserve the AVX registers across thread (and process) switches.

http://msdn.microsoft.com/en-us/library/ff545910.aspx
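
A minimal sketch in C of what "the operating system preserves the AVX registers" means for a detection routine, assuming the usual rule from the Intel manuals: CPUID leaf 1 must report OSXSAVE (ECX bit 27) and AVX (ECX bit 28), and XCR0 (read with XGETBV) must have bits 1 and 2 set. The struct and function names are made up for illustration and are not taken from the test program; the register values are passed in, so the logic can be tried on any machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for values read elsewhere with CPUID and XGETBV. */
typedef struct {
    uint32_t cpuid1_ecx; /* ECX after CPUID with EAX = 1 */
    uint64_t xcr0;       /* EDX:EAX after XGETBV with ECX = 0;
                            pass 0 if OSXSAVE is clear (XGETBV would fault) */
} CpuOsState;

#define CPUID1_ECX_OSXSAVE (1u << 27) /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (1u << 28) /* processor implements AVX */
#define XCR0_SSE_STATE     (1u << 1)  /* OS saves XMM registers */
#define XCR0_AVX_STATE     (1u << 2)  /* OS saves upper halves of YMM */

/* Returns true only if both the processor and the OS support AVX. */
bool avx_usable(const CpuOsState *s)
{
    const uint64_t need = XCR0_SSE_STATE | XCR0_AVX_STATE;

    assert(s != NULL);
    if (!(s->cpuid1_ecx & CPUID1_ECX_OSXSAVE))
        return false;
    if (!(s->cpuid1_ecx & CPUID1_ECX_AVX))
        return false;
    return (s->xcr0 & need) == need;
}
```

The order of the checks matters in a real routine: XGETBV may only be executed after OSXSAVE has been confirmed.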

Gunther

Dave,

Quote from: dedndave on June 27, 2014, 04:49:31 AM
tell me if i have these right (assuming the processor supports AVX)...
1) windows 7 32-bit with SP1, or later, will support AVX
No, that's not right. Here's the output of the test program under Windows 7-32:


AVX support not available:
==========================

X[0] = 1.00     Y[0] = 1.00
X[1] = 2.00     Y[1] = 2.00
X[2] = 3.00     Y[2] = 3.00
X[3] = 4.00     Y[3] = 4.00

Please, press enter to end the application ...


Quote from: dedndave on June 27, 2014, 04:49:31 AM
2) windows 7 64-bit with SP1, or later, will support AVX/AVX2
That's correct. The same application produces this output under Windows 7-64:

AVX support available:
======================

X[0] = 1.00     Y[0] = 1.00     Z[0] = 1.00
X[1] = 2.00     Y[1] = 2.00     Z[1] = 2.00
X[2] = 3.00     Y[2] = 3.00     Z[2] = 3.00
X[3] = 4.00     Y[3] = 4.00     Z[3] = 4.00

Please, press enter to end the application ...

That has to do with the VEX prefix for AVX, AVX2, AVX-512 etc. The instruction:

        lds        si, [bp+10]       

is encoded as: C5760A
The instructions LDS and LES are not valid in 64-bit mode, so their opcodes (C5 and C4) are reused as VEX prefixes.

        vmovaps    ymm0, [esi]

is encoded as: C5FC2806
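
To make the two encodings easier to compare, here is a small C sketch that splits the byte after C5 into its fields, once read as a ModRM byte (the LDS case) and once as the payload of a two-byte VEX prefix (the vmovaps case). The field layout follows the Intel manuals; the type and function names are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { unsigned mod, reg, rm; } ModRM;   /* fields of a ModRM byte */
typedef struct { unsigned r, vvvv, l, pp; } Vex2;  /* fields of a 2-byte VEX */

/* Split a ModRM byte: mod = bits 7..6, reg = bits 5..3, rm = bits 2..0. */
ModRM decode_modrm(uint8_t b)
{
    ModRM m;
    m.mod = (b >> 6) & 3u;
    m.reg = (b >> 3) & 7u;
    m.rm  = b & 7u;
    return m;
}

/* code points at a C5 byte followed by the VEX payload byte.
   R and vvvv are stored inverted in the encoding; they are returned
   here in their logical (non-inverted) form. */
Vex2 decode_vex2(const uint8_t *code)
{
    Vex2 v;
    assert(code != NULL && code[0] == 0xC5);
    v.r    = ((code[1] >> 7) & 1u) ^ 1u;      /* register extension bit */
    v.vvvv = ((code[1] >> 3) & 0xFu) ^ 0xFu;  /* extra source register */
    v.l    = (code[1] >> 2) & 1u;             /* 0 = 128-bit, 1 = 256-bit */
    v.pp   = code[1] & 3u;                    /* implied 66/F3/F2 prefix */
    return v;
}
```

With the bytes from above, decode_modrm(0x76) gives mod = 1, reg = 6 (SI) and rm = 6 ([BP] plus an 8-bit displacement in 16-bit addressing), while C5 FC decodes to L = 1 (256-bit, so YMM) and pp = 0 (no implied prefix); 28 is then the MOVAPS opcode and 06 the ModRM byte for ymm0, [esi].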

The result is that a 32-bit application can't use AVX instructions in a native 32-bit environment. The only chance for such a program is to run under a 64-bit operating system in compatibility mode. But that's not very logical: why not use a native 64-bit program for the same purpose?

Gunther

Gunther

The archive IsetsAVX-512.zip under the first post of this thread contains in \IsetsAVX-512\Isets\Source the following files:

build_isets.bat: The batch file for building the running EXE
is.asm:          Assembly language procedures
ISets.c:         C file with main()
ISets.h:         C header with some variable declarations etc.
ISetsFunc.c:     C file with functions

The binary folder contains the corresponding binary files. The C files are compiled or pre-compiled with gcc version 7.2.0, but there's nothing special about them; they should compile with any other C compiler (clang, VS, PellesC etc.), although this still has to be checked. Please have a look at the batch file for the appropriate switches. The assembly language source is written for NASM/YASM, because I have no other assembler running on my new machine, at least at the moment. I hope to get the MASM64 package running in the next few days. With some minor changes ml64 should also do the job, I hope.

In the last few days I had a very hard fight with the AVX-512 instruction set; it's a strange beast. I've checked the latest Intel manuals and the search engine was running at full speed. The manuals are a bit tortuous and in some cases not accurate. On the other hand, one can find a lot of AVX-512 detection code written with those crippled intrinsics. The point is: in most cases the Intel compiler is used, which has a built-in detection wrapper; gcc has another set of intrinsics, and VS has a different intrinsic set, too. It's a shame. Furthermore, the intrinsic code isn't very readable. The whole weekend was spent on this work. All things considered, I'm doing just fine.

What's the result? There are several sets and subsets of AVX-512 instructions. You have to be good at set theory to draw the corresponding Venn diagram. Here's what I figured out:

  • AVX-512 F is the fundamental instruction set; it extends most AVX functions to 512-bit registers and adds masking, embedded broadcasting, embedded rounding and exception control.
  • AVX-512 CD is the Conflict Detection instruction set, which allows vectorization of loops with vector dependencies due to write conflicts.
  • AVX-512 BW is the Byte and Word support instruction set: 8-bit and 16-bit integer operations, processing up to 64 8-bit elements or 32 16-bit elements per vector.
  • AVX-512 DQ is the Double and Quad word instruction set, which adds new instructions for double-word (32-bit) and quadword (64-bit) integer and floating-point elements.
  • AVX-512 VL is the Vector Length extension, which supports vector lengths smaller than 512 bits.
  • AVX-512 PF is the Prefetch instruction set: data prefetching for gather and scatter instructions.
  • AVX-512 ER is the Exponential and Reciprocal instruction set for high-accuracy base-2 exponential functions, reciprocals, and reciprocal square roots.
Altogether: a lot of work for assembler and compiler builders. The Knights Landing architecture uses AVX-512 F, CD, PF, ER while the Skylake architecture uses AVX-512 F, CD, BW, DQ, ER. And it seems to me that Intel plans further instruction set extensions for the future. Here is the output of my instruction set detection:

Supported Features by Processor and Operating System
====================================================

Vendor String: GenuineIntel
Brand  String: Intel(R)Core(TM)i7-7820XCPU@3.60GHz

Instruction Sets
----------------

MMX  SSE  SSE2  SSE3  SSSE3  SSE4.1  SSE4.2  AVX  AVX2
AVX-512 F  - Fundamental Instructions
AVX-512 DQ - Double and Quad Word Instructions
AVX-512 CD - Conflict Detection Instructions
AVX-512 BW - Byte and Word Support Instructions
AVX-512 ER - Exponential and Reciprocal Instructions


Please, press enter to end the application ...

So I have no other choice: in the future I'll have to implement the AVX-512 detection in a separate procedure, because the situation is very confusing. But for now it works fine and has been tested under Windows 10 on the Skylake architecture. Other processors (AMD etc.) will give different output, of course. Test results and suggestions for improvement are very welcome. Have fun.
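
As a sketch of what such a separate procedure could look like in C: the AVX-512 subsets listed above are reported as individual bits in EBX after CPUID with EAX = 7 and ECX = 0 (bit positions as documented in the Intel manuals). The function below only decodes an EBX value that is passed in; the name avx512_subsets and the table are assumptions for illustration, not the code from the archive. A real routine would also have to check that the operating system saves the opmask and ZMM state (XCR0 bits 5 to 7) before using any of these instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit positions in CPUID.(EAX=7,ECX=0):EBX for the subsets listed above. */
typedef struct { unsigned bit; const char *name; } Avx512Flag;

static const Avx512Flag avx512_flags[] = {
    { 16, "F"  }, { 17, "DQ" }, { 26, "PF" }, { 27, "ER" },
    { 28, "CD" }, { 30, "BW" }, { 31, "VL" },
};

/* Writes the names of all reported subsets, separated by spaces, into out
   (capacity cap) and returns how many were written. Hypothetical helper. */
int avx512_subsets(uint32_t leaf7_ebx, char *out, size_t cap)
{
    size_t used = 0;
    int count = 0;

    assert(out != NULL && cap > 0);
    out[0] = '\0';
    for (size_t i = 0; i < sizeof avx512_flags / sizeof avx512_flags[0]; i++) {
        int n;
        if (!(leaf7_ebx & (1u << avx512_flags[i].bit)))
            continue;
        n = snprintf(out + used, cap - used, "%s%s",
                     count ? " " : "", avx512_flags[i].name);
        if (n < 0 || (size_t)n >= cap - used) {
            out[used] = '\0';   /* drop the partial name: buffer is full */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

Keeping the bit positions in one table means a new subset only needs one more table entry.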

Gunther

felipe

Thanks for sharing your good work. I have a Celeron processor that supports instructions up to SSE4.2. No AVX at all. The programs behaved well (I have Windows 8.1). I wonder what the name of this architecture is.   :icon_redface:

:P

Gunther

Thank you for the flowers, Felipe.

Quote
I have a Celeron processor that supports instructions up to SSE4.2. No AVX at all. The programs behaved well.

That's not unusual. The detection up to AVX2 isn't a big deal; the tricky part is detecting the several AVX-512 subsets.

Quote
I wonder what the name of this architecture is.

The Intel and AMD architecture and code names are a science in their own right. Here is one of the countless documents which I read last weekend. Please check, for example, page 2 of the slide show. But that's not all. For a complete list of Intel processors and chipsets, please have a look here. I hope that helps.

Gunther

felipe


Gunther

Felipe,

Quote
As far as I can see, it is a former Bay Trail.  :bgrin:

You'll have to discuss that point with your manufacturer or re-seller. Show him the documents.  :t

Gunther

felipe

Hi Gunther, I received a used (second-hand) computer. You can see the specifications of the processor here:
https://ark.intel.com/products/43546/Intel-Core-i5-650-Processor-4M-Cache-3_20-GHz.
It's a shame that, being an i5, it has no support for AVX instructions at all. But I'm very happy with this new toy anyway  :greenclp:.
I want to share this here because I used your program to detect the instruction set and I couldn't believe the results  :shock:. Then I checked with the qeditor (about) and finally with raistlin's tool extreme id. Well, it was not until I found the specs on the Intel page that I was 100% sure  :redface:. So, this is a little extra feedback:  :t



        Supported Features by Processor and Operating System
        ====================================================

Vendor String: GenuineIntel
Brand  String: Intel(R)Core(TM)i5CPU650@3.20GHz

        Instruction Sets
        ----------------

MMX  SSE  SSE2  SSE3  SSSE3  SSE4.1  SSE4.2

        Supported Special Instructions
        ------------------------------

Conditional Moves
FXSAVE and FXSTOR
POPCNT
AES (Advanced Encryption Standard) Instruction Set

Please, press enter to end the application ...

Gunther

Hi Felipe,

Quote from: felipe on December 29, 2017, 08:49:06 AM
Hi Gunther, I received a used (second-hand) computer. You can see the specifications of the processor here:
https://ark.intel.com/products/43546/Intel-Core-i5-650-Processor-4M-Cache-3_20-GHz.
Found.

Quote from: felipe on December 29, 2017, 08:49:06 AM
It's a shame that, being an i5, it has no support for AVX instructions at all. But I'm very happy with this new toy anyway  :greenclp:.
Intel has several different product lines. You have to check that out beforehand, otherwise you will experience quite nasty surprises. But all in all, that's not a bad machine.

Gunther

hutch--

felipe,

Sounds like a good score; an i5 is a good, sound 4-core machine. I have my last i7 alongside me and it has almost identical specs to an i5, so you will get a lot of decent code written on it. In terms of usefulness, late SSE and AVX are OK, but very little software uses them unless it has processor detection and optional methods for machines that only have late SSE (SSE4.2).
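
A minimal sketch in C of the kind of processor detection with optional methods described here, assuming the detection step has already produced a bit mask of usable instruction sets. All names (FEAT_*, Kernel, select_kernel) are hypothetical, and every table entry points to the same portable routine as a stand-in; in a real program each entry would point to its own SSE4.2, AVX or AVX2 implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature mask filled in by a detection routine. */
#define FEAT_SSE42 (1u << 0)
#define FEAT_AVX   (1u << 1)
#define FEAT_AVX2  (1u << 2)

typedef void (*AddFn)(const float *x, const float *y, float *z, size_t n);

typedef struct {
    const char *name;     /* label for diagnostics */
    uint32_t    required; /* all of these bits must be detected */
    AddFn       add;      /* implementation to call */
} Kernel;

/* Portable fallback; also used as a stand-in for the SIMD versions. */
void add_portable(const float *x, const float *y, float *z, size_t n)
{
    assert(n == 0 || (x != NULL && y != NULL && z != NULL));
    for (size_t i = 0; i < n; i++)
        z[i] = x[i] + y[i];
}

/* Ordered from most to least demanding; the last entry always matches. */
static const Kernel kernels[] = {
    { "avx2",    FEAT_SSE42 | FEAT_AVX | FEAT_AVX2, add_portable },
    { "avx",     FEAT_SSE42 | FEAT_AVX,             add_portable },
    { "sse4.2",  FEAT_SSE42,                        add_portable },
    { "generic", 0,                                 add_portable },
};

/* Returns the first kernel whose requirements are all present. */
const Kernel *select_kernel(uint32_t detected)
{
    const size_t count = sizeof kernels / sizeof kernels[0];

    for (size_t i = 0; i + 1 < count; i++)
        if ((detected & kernels[i].required) == kernels[i].required)
            return &kernels[i];
    return &kernels[count - 1];
}
```

On a machine like the i5-650 above, a mask containing only FEAT_SSE42 would select the "sse4.2" entry, so the same executable still runs there and simply takes the older code path.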