AVX for 32-bit Windows applications

dedndave · June 29, 2014, 05:44:20 AM

i was reading something about ymm0-7 etc the other day
i think it mentioned something about 64-bit OS, being able to use the others
i may have mis-read because it was not what i was after and i skimmed that part quickly

as for VM's....

according to some documentation i have seen,
some VM's return their own vendor ID for CPUID

;'KVMKVMKVMKVM' KVM
;'Microsoft Hv' Microsoft Hyper-V or Windows Virtual PC
;'VMwareVMware' VMware
;'XenVMMXenVMM' Xen HVM

that might make it difficult for software to determine the level of support for extensions ::)
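
To make the vendor ID idea concrete, here is a minimal sketch of how such a 12-character signature is put together from three 32-bit register values. It assumes the signature comes from the hypervisor CPUID leaf 40000000h, returned in EBX, ECX, EDX in that order. The function names are made up for illustration, and the table is copied from the list above without being verified here.

```python
import struct

# Signatures copied from the list in the post above (not verified here).
HYPERVISOR_SIGNATURES = {
    "KVMKVMKVMKVM": "KVM",
    "Microsoft Hv": "Microsoft Hyper-V or Windows Virtual PC",
    "VMwareVMware": "VMware",
    "XenVMMXenVMM": "Xen HVM",
}


def signature_from_registers(ebx, ecx, edx):
    """Join three 32-bit register values (little-endian) into a 12-character string.

    Assumption: the register order is EBX, ECX, EDX, as for the
    hypervisor leaf 40000000h.
    """
    raw = struct.pack("<III", ebx, ecx, edx)
    return raw.decode("ascii", errors="replace")


def identify_hypervisor(ebx, ecx, edx):
    """Look the signature up in the table; anything else is reported as unknown."""
    signature = signature_from_registers(ebx, ecx, edx)
    return HYPERVISOR_SIGNATURES.get(signature, "unknown")
```

The sketch only identifies the hypervisor. Which extensions are available still has to be read from the feature bits, not inferred from the signature.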

Gunther · June 29, 2014, 11:56:08 PM

Quote from: anta40 on June 29, 2014, 02:15:02 AM
Hi Gunther,

I'm running 32-bit Win 7. CPU-Z says that my CPU supports AVX.
Your program seems to run correctly as expected.

That's a big surprise for me. But Windows 7-32, Professional, SP 1 doesn't support it here, at least when running as a VM.

qWord,

Quote from: qWord on June 29, 2014, 03:39:05 AM
However, only ymm0-7 can be used in these modes.

That's clear. If so, it would be similar to the usage of XMM registers. Furthermore, we had that interesting discussion. The groundwork was that link to an article by Chris Lomont in the Intel Developer Zone:

Quote
The new instructions are encoded using what Intel calls a VEX prefix, which is a two- or three-byte prefix designed to clean up the complexity of current and future x86/x64 instruction encoding. The two new VEX prefixes are formed from two obsolete 32-bit instructions-Load Pointer Using DS (LDS-0xC4, 3-byte form) and Load Pointer Using ES (LES-0xC5, two-byte form)-which load the DS and ES segment registers in 32-bit mode. In 64-bit mode, opcodes LDS and LES generate an invalid-opcode exception, but under Intel® AVX, these opcodes are repurposed for encoding new instruction prefixes. As a result, the VEX instructions can only be used when running in 64-bit mode. The prefixes allow encoding more registers than previous x86 instructions and are required for accessing the new 256-bit SIMD registers or using the three- and four-operand syntax. As a user, you do not need to worry about this (unless you're writing assemblers or disassemblers).

Gunther

qWord · June 30, 2014, 12:06:20 AM

Quote from: Gunther on June 29, 2014, 11:56:08 PM
qWord,
Quote from: qWord on June 29, 2014, 03:39:05 AM
However, only ymm0-7 can be used in these modes.

That's clear. If so, it would be similar to the usage of XMM registers. Furthermore, we had that interesting discussion. The groundwork was that link to an article by Chris Lomont in the Intel Developer Zone:
Quote
The new instructions are encoded using what Intel calls a VEX prefix, which is a two- or three-byte prefix designed to clean up the complexity of current and future x86/x64 instruction encoding. The two new VEX prefixes are formed from two obsolete 32-bit instructions-Load Pointer Using DS (LDS-0xC4, 3-byte form) and Load Pointer Using ES (LES-0xC5, two-byte form)-which load the DS and ES segment registers in 32-bit mode. In 64-bit mode, opcodes LDS and LES generate an invalid-opcode exception, but under Intel® AVX, these opcodes are repurposed for encoding new instruction prefixes. As a result, the VEX instructions can only be used when running in 64-bit mode. The prefixes allow encoding more registers than previous x86 instructions and are required for accessing the new 256-bit SIMD registers or using the three- and four-operand syntax. As a user, you do not need to worry about this (unless you're writing assemblers or disassemblers).

What should this say to other readers?

Gunther · June 30, 2014, 12:14:35 AM

qWord,

Quote from: qWord on June 30, 2014, 12:06:20 AM
What should this say to other readers?

No offense. It wasn't my quote. It's a statement by Chris Lomont in the Intel Developer Zone. The other side of the coin is the result reported by anta40. So, what about Windows 8?

Gunther

dedndave · June 30, 2014, 02:32:40 AM

perhaps not all AVX instructions are VEX encoded ?
it's a lot to absorb :(

Gunther · June 30, 2014, 03:08:12 AM

Dave,

Quote from: dedndave on June 30, 2014, 02:32:40 AM
perhaps not all AVX instructions are VEX encoded ?

But

        vaddps

is VEX encoded.

Gunther

dedndave · June 30, 2014, 03:35:52 AM

your test program didn't use any VEX-encoded instructions, though - right ?

Gunther · June 30, 2014, 03:38:17 AM

Quote from: dedndave on June 30, 2014, 03:35:52 AM
your test program didn't use any VEX-encoded instructions, though - right ?

It uses VEX encoding. That makes the entire thing a bit strange.

Gunther

dedndave · June 30, 2014, 03:48:11 AM

ok - it's nice to know that i'm not the only person that's confused

dedndave · June 30, 2014, 04:48:47 AM

i was looking for info and found this document
it's all about AVX512

however, they also give a complete treatment to CPUID at the beginning of section 2

https://software.intel.com/en-us/file/319433-019pdf

qWord · June 30, 2014, 04:56:25 AM

What is so confusing? That a VM does not support AVX?

AVX512 does not currently exist in hardware, and the linked manual shows that these instructions will get their own new (4-byte) prefix: EVEX.

dedndave · June 30, 2014, 05:23:47 AM

i can't speak for Gunther, but here's what's confusing me.....

the AVX instructions are VEX-encoded
i.e., they use the opcode space of the old LDS and LES instructions

so, let's say you have a 64-bit processor that is AVX-capable (i7, for example)
and you have Windows 7 32-bit installed

it seems to support AVX, at least to some degree
but, some documentation states that VEX may only be decoded in long mode :redface:

dedndave · June 30, 2014, 05:41:25 AM

to clarify....

here is a link to Intel info, stating that VEX instructions may only be decoded in long mode
at the end of the page, instructions are listed, nearly all seem to be VEX encoded

https://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions

here is a link to Intel info, showing how to test for AVX support - 32-bit and 64-bit code provided

https://software.intel.com/en-us/blogs/2011/04/14/is-avx-enabled

according to Gunther, his test program uses at least one VEX encoded instruction
and anta40 ran the test program successfully under Windows 7-32, SP1
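
As a rough illustration of the kind of check the second link describes, here is a minimal sketch of the decision logic, written against raw register values rather than real CPUID/XGETBV calls. The bit positions are the documented ones (CPUID.1:ECX bits 27 and 28, XCR0 bits 1 and 2); the function name and the `read_xcr0` callback are hypothetical, introduced only for this example.

```python
OSXSAVE_BIT = 1 << 27  # CPUID.1:ECX bit 27: OS has enabled XSAVE/XGETBV
AVX_BIT = 1 << 28      # CPUID.1:ECX bit 28: CPU implements AVX
XCR0_SSE = 1 << 1      # XCR0 bit 1: OS saves/restores XMM state
XCR0_AVX = 1 << 2      # XCR0 bit 2: OS saves/restores the upper YMM halves


def avx_usable(cpuid1_ecx, read_xcr0):
    """Return True if AVX is implemented by the CPU and enabled by the OS.

    cpuid1_ecx: value of ECX after CPUID with EAX=1.
    read_xcr0:  hypothetical callable returning XCR0 (XGETBV with ECX=0).
                It is only called when OSXSAVE is set, because XGETBV
                would fault otherwise.
    """
    needed = OSXSAVE_BIT | AVX_BIT
    if (cpuid1_ecx & needed) != needed:
        return False
    ymm_state = XCR0_SSE | XCR0_AVX
    return (read_xcr0() & ymm_state) == ymm_state
```

Note that nothing in this check asks whether the code runs in 32-bit or 64-bit mode. What matters is whether the OS has enabled the YMM state, which would fit anta40's result.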

qWord · June 30, 2014, 05:54:07 AM

The VEX prefix is encoded in such a way that its first two bytes form an invalid form of LES or LDS, respectively. These instructions take one register operand as destination and one memory operand as source. The VEX prefixes (in 16- or 32-bit modes) encode the illegal form with two register operands (ModR/M: mod=11y). The limitation to ymm0-7 has to do with the 2-byte VEX prefix (=>LDS), where bit 6 of the second prefix byte (= low bit of the mod field of LDS) is used to encode a register number (ymmX). The register number is stored in 1's complement, so this bit is 1 for ymm0-7. In 64-bit mode this bit can also be 0, because LDS does not exist there, but in 32- and 16-bit modes it must be 1 to get an illegal form of LDS (mod=11y).

You can read up on this in the latest manuals (I used the "AllInOne" PDF).

BTW: does the Author of the PDF work for Intel? I didn't think so.

EDIT: bit 7 of the second prefix byte is also used for register encoding and must be 1 in 32 bit modes.
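
The rule described in this post can be sketched in a few lines. This is only an illustration of the decoding decision in 16/32-bit modes, not a disassembler; the function names are made up for the example. The field layout of the second byte of the 2-byte prefix (inverted R in bit 7, inverted vvvv in bits 6-3, L in bit 2, pp in bits 1-0) is taken from the Intel manual.

```python
def classify_32bit(b0, b1):
    """Classify the first two bytes of an instruction in 16/32-bit mode."""
    if b0 not in (0xC4, 0xC5):
        return "other"
    if (b1 >> 6) == 0b11:
        # mod=11 would mean a register source, which is illegal for
        # LES/LDS, so the bytes are taken as a VEX prefix instead.
        return "VEX3" if b0 == 0xC4 else "VEX2"
    return "LES" if b0 == 0xC4 else "LDS"


def vex2_fields(b1):
    """Split the second byte of a 2-byte (C5) VEX prefix into its fields."""
    return {
        "R": ((b1 >> 7) & 1) ^ 1,         # stored inverted
        "vvvv": ((b1 >> 3) & 0xF) ^ 0xF,  # stored inverted (1's complement)
        "L": (b1 >> 2) & 1,               # 1 = 256-bit operation
        "pp": b1 & 0b11,                  # implied legacy prefix
    }


# vaddps ymm0, ymm1, ymm2 assembles to C5 F4 58 C2
print(classify_32bit(0xC5, 0xF4))  # VEX2
print(vex2_fields(0xF4))           # {'R': 0, 'vvvv': 1, 'L': 1, 'pp': 0}
```

Because bits 7 and 6 must both be 1 for the bytes to count as VEX in 32-bit mode, the decoded R is always 0 and the decoded vvvv never exceeds 7, which is the ymm0-7 limit.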

Gunther · June 30, 2014, 06:03:59 AM

Hi qWord,

thank you for the explanation. I'm sitting here writing some test programs that give strange results. So, I'll need some time.

Quote from: qWord on June 30, 2014, 05:54:07 AM
BTW: does the Author of the PDF work for Intel? I didn't think so.

On the other hand, Chris Lomont isn't Mr. Nobody.

Gunther
