AVX for 32-bit Windows applications

Started by Gunther, May 27, 2014, 04:08:26 AM

dedndave

i was reading something about ymm0-7 etc the other day
i think it mentioned something about 64-bit OS, being able to use the others
i may have mis-read because it was not what i was after and i skimmed that part quickly

as for VM's....

according to some documentation i have seen,
some VM's return their own vendor ID for CPUID
;'KVMKVMKVMKVM' KVM
;'Microsoft Hv' Microsoft Hyper-V or Windows Virtual PC
;'VMwareVMware' VMware
;'XenVMMXenVMM' Xen HVM


that might make it difficult for software to determine the level of support for extensions   ::)
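
For reference, a minimal Python sketch of how such a signature is put together from register values. It assumes the hypervisor signature is returned by CPUID leaf 40000000h in EBX, ECX, EDX (in that order) and that bit 31 of CPUID.1:ECX flags a hypervisor. The function names are made up for illustration, and the register values have to come from somewhere else, since Python cannot execute CPUID itself.

```python
import struct

def hypervisor_present(cpuid1_ecx: int) -> bool:
    # Assumption: CPUID.1:ECX bit 31 is the "running under a hypervisor" flag.
    return bool((cpuid1_ecx >> 31) & 1)

def hypervisor_vendor(ebx: int, ecx: int, edx: int) -> str:
    # Assumption: leaf 40000000h returns 12 signature bytes in EBX, ECX, EDX.
    # Each register holds 4 ASCII characters, lowest byte first.
    raw = struct.pack("<III", ebx, ecx, edx)
    # Some signatures are padded with zero bytes, so strip those.
    return raw.rstrip(b"\0").decode("ascii", errors="replace")

# Build the three register values for one signature from the list above.
ebx, ecx, edx = struct.unpack("<III", b"VMwareVMware")
print(hypervisor_vendor(ebx, ecx, edx))
```

Software that only looks at the signature learns which VM it runs in, not which extensions the VM passes through, so the feature bits still have to be tested separately.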

Gunther

Quote from: anta40 on June 29, 2014, 02:15:02 AM
Hi Gunther,

I'm running 32-bit Win 7. CPU-Z says that my CPU supports AVX.
Your program seems to run correctly as expected.

That's a big surprise for me. But Windows 7-32, Professional, SP 1 doesn't support it, at least not when running as a VM.

qWord,
Quote from: qWord on June 29, 2014, 03:39:05 AM
However, only ymm0-7 can be used in these modes.

That's clear. If so, it would be similar to the usage of XMM registers. Furthermore, we had that interesting discussion. The groundwork was that link to an article by Chris Lomont in the Intel Developer Zone:
Quote
The new instructions are encoded using what Intel calls a VEX prefix, which is a two- or three-byte prefix designed to clean up the complexity of current and future x86/x64 instruction encoding. The two new VEX prefixes are formed from two obsolete 32-bit instructions-Load Pointer Using DS (LDS-0xC4, 3-byte form) and Load Pointer Using ES (LES-0xC5, two-byte form)-which load the DS and ES segment registers in 32-bit mode. In 64-bit mode, opcodes LDS and LES generate an invalid-opcode exception, but under Intel® AVX, these opcodes are repurposed for encoding new instruction prefixes. As a result, the VEX instructions can only be used when running in 64-bit mode. The prefixes allow encoding more registers than previous x86 instructions and are required for accessing the new 256-bit SIMD registers or using the three- and four-operand syntax. As a user, you do not need to worry about this (unless you're writing assemblers or disassemblers).

Gunther
You have to know the facts before you can distort them.

qWord

Quote from: Gunther on June 29, 2014, 11:56:08 PM
qWord,
Quote from: qWord on June 29, 2014, 03:39:05 AM
However, only ymm0-7 can be used in these modes.

That's clear. If so, it would be similar to the usage of XMM registers. Furthermore, we had that interesting discussion. The groundwork was that link to an article by Chris Lomont in the Intel Developer Zone:
Quote
The new instructions are encoded using what Intel calls a VEX prefix, which is a two- or three-byte prefix designed to clean up the complexity of current and future x86/x64 instruction encoding. The two new VEX prefixes are formed from two obsolete 32-bit instructions-Load Pointer Using DS (LDS-0xC4, 3-byte form) and Load Pointer Using ES (LES-0xC5, two-byte form)-which load the DS and ES segment registers in 32-bit mode. In 64-bit mode, opcodes LDS and LES generate an invalid-opcode exception, but under Intel® AVX, these opcodes are repurposed for encoding new instruction prefixes. As a result, the VEX instructions can only be used when running in 64-bit mode. The prefixes allow encoding more registers than previous x86 instructions and are required for accessing the new 256-bit SIMD registers or using the three- and four-operand syntax. As a user, you do not need to worry about this (unless you're writing assemblers or disassemblers).
What should this say to other readers?
MREAL macros - when you need floating point arithmetic while assembling!

Gunther

qWord,

Quote from: qWord on June 30, 2014, 12:06:20 AM
What should this say to other readers?

No offense. It wasn't my quote. It's a statement by Chris Lomont in the Intel Developer Zone. The other side of the coin is the result by anta40. So, what about Windows 8?

Gunther

dedndave

perhaps not all AVX instructions are VEX encoded ?
it's a lot to absorb   :(

Gunther

Dave,

Quote from: dedndave on June 30, 2014, 02:32:40 AM
perhaps not all AVX instructions are VEX encoded ?

But

        vaddps

is VEX encoded.

Gunther

dedndave

your test program didn't use any VEX-encoded instructions, though - right ?

Gunther

Quote from: dedndave on June 30, 2014, 03:35:52 AM
your test program didn't use any VEX-encoded instructions, though - right ?

It uses VEX encoding. That makes the entire thing a bit strange.

Gunther

dedndave

ok - it's nice to know that i'm not the only person that's confused   :biggrin:

dedndave

i was looking for info and found this document
it's all about AVX512

however, they also give a complete treatment to CPUID at the beginning of section 2   :biggrin:

https://software.intel.com/en-us/file/319433-019pdf

qWord

What is so confusing? That a VM does not support AVX?

AVX512 does not currently exist in hardware, and the linked manual shows that these instructions will get their own new (4-byte) prefix: EVEX.

dedndave

i can't speak for Gunther, but here's what's confusing me.....

the AVX instructions are VEX-encoded
i.e., they use the opcode space of the old LDS and LES instructions

so, let's say you have a 64-bit processor that is AVX-capable (i7, for example)
and you have Windows 7 32-bit installed

it seems to support AVX, at least to some degree
but, some documentation states that VEX may only be decoded in long mode   :redface:

dedndave

to clarify....

here is a link to Intel info, stating that VEX instructions may only be decoded in long mode
at the end of the page, instructions are listed, nearly all seem to be VEX encoded

https://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions

here is a link to Intel info, showing how to test for AVX support - 32-bit and 64-bit code provided

https://software.intel.com/en-us/blogs/2011/04/14/is-avx-enabled

according to Gunther, his test program uses at least one VEX encoded instruction
and anta40 ran the test program successfully under Windows 7-32, SP1
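
As far as I understand it, the usual test for usable AVX comes down to the logic below, sketched in Python on plain integers. Assumptions: CPUID.1:ECX bit 27 is OSXSAVE, bit 28 is AVX, and bits 1 and 2 of XCR0 (read with XGETBV, ECX=0) tell whether the OS saves XMM and YMM state. The function name and parameters are hypothetical, and the raw register values would have to be collected by native code.

```python
OSXSAVE_BIT = 1 << 27   # CPUID.1:ECX, OS has enabled XSAVE/XGETBV
AVX_BIT     = 1 << 28   # CPUID.1:ECX, CPU implements AVX
XCR0_SSE    = 1 << 1    # XCR0, OS saves XMM state
XCR0_AVX    = 1 << 2    # XCR0, OS saves the upper halves of YMM

def avx_usable(cpuid1_ecx: int, xcr0: int) -> bool:
    # Both the hardware bit and the OSXSAVE bit must be set first.
    # Real code must not execute XGETBV unless OSXSAVE is set.
    needed = OSXSAVE_BIT | AVX_BIT
    if (cpuid1_ecx & needed) != needed:
        return False
    # The OS (or hypervisor) must have enabled XMM and YMM state saving.
    state = XCR0_SSE | XCR0_AVX
    return (xcr0 & state) == state

# Hardware has AVX, but the OS enabled only x87 and SSE state: not usable.
print(avx_usable(OSXSAVE_BIT | AVX_BIT, 0x3))
```

Nothing in this check depends on whether the OS is 32-bit or 64-bit. It only depends on whether the OS enables the YMM state, which would fit both anta40's result and a VM that reports no AVX.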

qWord

The VEX prefix is encoded in such a way that its first two bytes form an invalid encoding of LES or LDS, respectively: these instructions take one register operand as destination and one memory operand as source. The VEX prefixes (in 16- or 32-bit modes) encode the illegal form with two register operands (ModR/M: mod=11y). The limitation to ymm0-7 has to do with the 2-byte VEX prefix (=> LDS), where bit 6 of the second prefix byte (= low bit of the mod field of LDS) is used to encode a register number (ymmX). The register number is stored in one's complement, so this bit is 1 for ymm0-7. In 64-bit mode this bit could also be 0, because LDS does not exist there, but in 32- and 16-bit modes it must be 1 to get an illegal form of LDS (mod=11y).

You can read up on this in the latest manuals (I used the "AllInOne" PDF).

BTW: does the Author of the PDF work for Intel? I didn't think so.

EDIT: bit 7 of the second prefix byte is also used for register encoding and must be 1 in 32 bit modes.
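
The bit rule above can be written down as a small Python sketch. It assumes the 2-byte VEX layout from the manuals (bit 7 = inverted R, bits 6-3 = inverted vvvv register number, bit 2 = L, bits 1-0 = pp); the function names are made up for illustration. The sample bytes C5 F4 58 C2 are what I would expect for vaddps ymm0, ymm1, ymm2.

```python
def is_vex_prefix(first: int, second: int, long_mode: bool) -> bool:
    # C4h/C5h are LES/LDS in 16/32-bit modes and VEX prefixes otherwise.
    if first not in (0xC4, 0xC5):
        return False
    if long_mode:
        return True              # LES/LDS do not exist in 64-bit mode
    # Outside long mode the top two bits of the next byte must be 11y,
    # i.e. the register-register form that is illegal for LES/LDS.
    return (second & 0xC0) == 0xC0

def decode_vex2(second: int) -> dict:
    # Second byte of the 2-byte (C5h) form; R and vvvv are stored inverted.
    inv = second ^ 0xFF
    return {
        "R":    (inv >> 7) & 1,    # extension bit for ModR/M.reg
        "vvvv": (inv >> 3) & 0xF,  # extra source register number
        "L":    (second >> 2) & 1, # 1 = 256-bit (ymm), 0 = 128-bit (xmm)
        "pp":   second & 3,        # implied legacy prefix
    }

code = bytes([0xC5, 0xF4, 0x58, 0xC2])
print(is_vex_prefix(code[0], code[1], long_mode=False))
print(decode_vex2(code[1]))
```

With ymm8 as the extra source the second byte would start with the bits 10, which in 32-bit mode is an ordinary LDS with a memory operand, so the register cannot be encoded there.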

Gunther

Hi qWord,

thank you for the explanation. I'm sitting here writing some test programs that produce strange results. So, I'll need some time.

Quote from: qWord on June 30, 2014, 05:54:07 AM
BTW: does the Author of the PDF work for Intel? I didn't think so.

On the other hand, Chris Lomont isn't Mr. Nobody.

Gunther