With the new Sandy Bridge and Ivy Bridge architectures it's necessary to determine the available instruction set. The best approach, without any doubt, is the CPUID instruction. Does anyone have a solution for that which is more general and not Windows-specific?
Gunther
Gunther,
I think the only way is to have a good look through the latest Intel manual to get the later instructions. I don't have a manual later than the earlier i7 hardware which from memory did not have the instruction set you are after.
Steve,
Quote from: hutch-- on October 12, 2012, 12:28:04 AM
I think the only way is to have a good look through the latest Intel manual to get the later instructions. I don't have a manual later than the earlier i7 hardware which from memory did not have the instruction set you are after.
that was my suspicion. So I have to write it with CPUID as the base.
Gunther
hello Sir Gunther;
you can try "cat /proc/cpuinfo" on the Linux side; some useful but unrelated commands are "dmesg" and "lspci".
mineiro@assembly:/proc$ cat cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Pentium(R) Dual CPU E2160 @ 1.80GHz
stepping : 11
microcode : 0xb6
cpu MHz : 1200.000
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm lahf_lm dtherm
bogomips : 3599.87
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
....
Hi mineiro,
Quote from: mineiro on October 12, 2012, 01:25:55 AM
hello Sir Gunther;
you can try "cat /proc/cpuinfo" on the Linux side; some useful but unrelated commands are "dmesg" and "lspci".
thank you for the hint, but that doesn't help much. The point is that a programmer can't know in which environment his application is running; it could be a native OS or one of the available VMs on different processors. Let's say the application has the following code paths: one for floating point operations with the classic FPU, one for floating point operations with xmm registers, and one for floating point operations with the new ymm registers. Therefore, the application has to check the available instruction set at run time. That's the trick.
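The three code paths described above could be wired up roughly as follows. This is only a minimal sketch in C, not code from this thread: the names fp_path, choose_path, select_sum and the sum_* kernels are hypothetical, and the kernels are plain C placeholders standing in for real x87, SSE2 and AVX routines. The two flags passed to choose_path are assumed to come from an earlier CPUID and OS check.

```c
#include <stddef.h>

/* Hypothetical code-path identifiers, ordered from least to most capable. */
typedef enum { PATH_FPU = 0, PATH_XMM = 1, PATH_YMM = 2 } fp_path;

typedef double (*sum_fn)(const double *v, size_t n);

/* Placeholder kernels. A real program would put x87, SSE2 and AVX code
   here; plain C keeps the sketch portable. */
static double sum_fpu(const double *v, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}
static double sum_xmm(const double *v, size_t n) { return sum_fpu(v, n); }
static double sum_ymm(const double *v, size_t n) { return sum_fpu(v, n); }

/* Pick the most capable path that the detected features allow.
   Both arguments are assumed to come from a prior CPUID/OS check. */
fp_path choose_path(int has_sse2, int has_usable_avx)
{
    if (has_usable_avx) return PATH_YMM;
    if (has_sse2)       return PATH_XMM;
    return PATH_FPU;
}

/* Map a path to its kernel; call once at startup and keep the pointer. */
sum_fn select_sum(fp_path p)
{
    static const sum_fn table[] = { sum_fpu, sum_xmm, sum_ymm };
    return table[p];
}
```

Selecting the function pointer once at startup keeps the feature test out of the hot loop; every later call goes straight through the stored pointer.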
Gunther
A simple example using CPUID:
BITP macro index
    EXITM %(1 SHL index)
endm
FI_ECX_AVX EQU BITP(28)
FI_ECX_AES EQU BITP(25)
FI_ECX_POPCNT EQU BITP(23)
FI_ECX_SSE42 EQU BITP(20)
FI_ECX_SSE41 EQU BITP(19)
FI_ECX_FMA EQU BITP(12)
FI_ECX_SSSE3 EQU BITP(9)
FI_ECX_PCLMUL EQU BITP(1)
FI_ECX_SSE3 EQU BITP(0)
FI_EDX_SSE2 EQU BITP(26)
FI_EDX_SSE EQU BITP(25)
FI_EDX_MMX EQU BITP(23)
FI_EDX_CMOV EQU BITP(15)
FI_EDX_FPU EQU BITP(0)
    mov eax,1
    cpuid                       ; Feature Identifiers in ECX and EDX
    mov esi,ecx                 ; keep copies: calls made by print may clobber ECX and EDX
    mov edi,edx
    .if edi & FI_EDX_FPU
        print "FPU",13,10
    .endif
    .if edi & FI_EDX_SSE
        print "SSE",13,10
    .endif
    .if edi & FI_EDX_SSE2
        print "SSE2",13,10
    .endif
    .if edi & FI_EDX_MMX
        print "MMX",13,10
    .endif
    .if esi & FI_ECX_SSE3
        print "SSE3",13,10
    .endif
    .if esi & FI_ECX_SSSE3
        print "SSSE3",13,10
    .endif
    .if esi & FI_ECX_FMA
        print "FMA",13,10
    .endif
    .if esi & FI_ECX_SSE41
        print "SSE41",13,10
    .endif
    .if esi & FI_ECX_SSE42
        print "SSE42",13,10
    .endif
    .if esi & FI_ECX_AVX        ; CPU flag only; OS support is not checked here
        print "AVX",13,10
    .endif
BTW: YMM registers are extended XMM registers.
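For readers who want the same test outside MASM, the listing above might be expressed in C along these lines. This is a hedged sketch: it uses the bit positions from the MASM listing, relies on the `__get_cpuid` helper from the GCC/Clang header `<cpuid.h>` (an assumption about the compiler; MSVC offers a different intrinsic), and the names features, print_features, read_leaf1 and HAVE_CPUID_H are made up for illustration. On non-x86 targets read_leaf1 simply reports failure.

```c
#include <stdint.h>
#include <stdio.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>              /* GCC/Clang helper header, x86 only */
#define HAVE_CPUID_H 1          /* hypothetical macro name */
#endif

/* One entry per feature: bit index, register (1 = ECX, 0 = EDX), name. */
struct feature { uint32_t bit; int in_ecx; const char *name; };

static const struct feature features[] = {
    {  0, 0, "FPU"   }, { 23, 0, "MMX"   }, { 25, 0, "SSE" }, { 26, 0, "SSE2" },
    {  0, 1, "SSE3"  }, {  9, 1, "SSSE3" }, { 12, 1, "FMA" },
    { 19, 1, "SSE41" }, { 20, 1, "SSE42" }, { 28, 1, "AVX" },
};

/* Print the name of every feature whose bit is set; return how many. */
int print_features(uint32_t ecx, uint32_t edx)
{
    int count = 0;
    for (size_t i = 0; i < sizeof features / sizeof features[0]; i++) {
        uint32_t reg = features[i].in_ecx ? ecx : edx;
        if (reg & (UINT32_C(1) << features[i].bit)) {
            puts(features[i].name);
            count++;
        }
    }
    return count;
}

/* Read CPUID leaf 1. Returns 1 on success, 0 if CPUID is not available
   through <cpuid.h> on this target (both outputs are then zero). */
int read_leaf1(uint32_t *ecx, uint32_t *edx)
{
#ifdef HAVE_CPUID_H
    unsigned int a, b, c, d;
    if (__get_cpuid(1, &a, &b, &c, &d)) {
        *ecx = c;
        *edx = d;
        return 1;
    }
#endif
    *ecx = 0;
    *edx = 0;
    return 0;
}
```

A caller would invoke read_leaf1 first and pass the two registers to print_features only when it returns 1.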
Hi qword,
Quote from: qWord on October 12, 2012, 02:32:15 AM
A simple example using CPUID:
thank you for the code snippet, but I know well how to use CPUID. The critical point is that CPUID alone isn't sufficient. It's not enough to know whether this or that instruction set is available. The operating system must also be aware of it and save, for example, the complete xmm register set during task switches. That has to be checked as well. Furthermore, the entire thing should work in both worlds: Win64 and Unix64. But I hope that I can post my first approach in a few days.
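For the AVX case, the additional operating system check could look roughly like the sketch below. It is not the approach announced above, only an illustration in C: CPUID leaf 1 ECX bit 27 (OSXSAVE) shows that the OS has enabled XGETBV, and XCR0 bits 1 and 2 show that the OS saves the xmm and ymm state. The function names avx_usable, read_xcr0 and avx_usable_here are hypothetical, and the inline assembly assumes GCC or Clang on x86.

```c
#include <stdint.h>

#define CPUID1_ECX_OSXSAVE (UINT32_C(1) << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (UINT32_C(1) << 28)  /* CPU implements AVX          */
#define XCR0_SSE           (UINT64_C(1) << 1)   /* OS saves xmm state          */
#define XCR0_AVX           (UINT64_C(1) << 2)   /* OS saves upper ymm halves   */

/* Pure decision logic: AVX may be used only if the CPU has it AND the
   OS preserves both xmm and ymm state across task switches. */
int avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE)) return 0;
    if (!(cpuid1_ecx & CPUID1_ECX_AVX))     return 0;
    return (xcr0 & (XCR0_SSE | XCR0_AVX)) == (XCR0_SSE | XCR0_AVX);
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Read XCR0. Must only be called after OSXSAVE has been confirmed,
   otherwise XGETBV raises an invalid-opcode exception. */
uint64_t read_xcr0(void)
{
    uint32_t lo, hi;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return ((uint64_t)hi << 32) | lo;
}

/* Query the machine the program is running on. */
int avx_usable_here(void)
{
    unsigned int a, b, c, d;
    if (!__get_cpuid(1, &a, &b, &c, &d)) return 0;
    if (!(c & CPUID1_ECX_OSXSAVE))       return 0;  /* do not execute XGETBV */
    return avx_usable(c, read_xcr0());
}
#else
/* Non-x86 or unknown compiler: report AVX as unavailable. */
int avx_usable_here(void) { return 0; }
#endif
```

Because the check is done entirely in user mode through CPUID and XGETBV, the same source should serve Win64 and Unix64 builds as long as the compiler accepts the inline assembly.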
Quote from: qWord on October 12, 2012, 02:32:15 AM
BTW: YMM registers are extended XMM registers.
YMM registers are 256 bits wide and the original chip manufacturer (Intel) uses the name YMM. Please have a look at the Intel manuals: http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html
Gunther
Gunther,
Just be careful in trying to use the same code across Windows and Unix based 64 bit systems as I think from memory they use slightly different register preservation conventions.
Steve,
Quote from: hutch-- on October 15, 2012, 01:34:00 AM
Just be careful in trying to use the same code across Windows and Unix based 64 bit systems as I think from memory they use slightly different register preservation conventions.
that's for sure, because different ABIs are used. In some cases, one can use the same code for both operating systems; an example is here: http://masm32.com/board/index.php?topic=795.0. The procedure InstructionSet will run under Windows, Linux, and MacOS X, with different output formats (win64, elf64, macho64).
But that's not the end of the story. There are also some differences in HLLs such as C and C++. Most *nixes (Solaris, Linux, BSD, and MacOS X) use the LP64 data model, while MS uses the LLP64 data model: https://en.wikipedia.org/wiki/64-bit That'll make the developers very happy. :lol:
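The practical difference between the two data models is the size of long: 8 bytes under LP64 and 4 bytes under LLP64, while pointers and long long are 8 bytes in both. A small C sketch can make that visible; the names data_model, this_data_model and file_offset are invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Classify a data model from the sizes (in bytes) of long, long long
   and a pointer. */
const char *data_model(size_t long_size, size_t long_long_size, size_t ptr_size)
{
    if (ptr_size == 8 && long_size == 8)
        return "LP64";    /* Solaris, Linux, BSD, MacOS X */
    if (ptr_size == 8 && long_size == 4 && long_long_size == 8)
        return "LLP64";   /* 64-bit Windows */
    if (ptr_size == 4 && long_size == 4)
        return "ILP32";   /* typical 32-bit targets */
    return "other";
}

/* Data model of the compiler and target this file is built for. */
const char *this_data_model(void)
{
    return data_model(sizeof(long), sizeof(long long), sizeof(void *));
}

/* A fixed-width type from <stdint.h> has the same size under both
   models, which avoids the long pitfall in shared code. */
typedef int64_t file_offset;   /* hypothetical name */
```

Code shared between Win64 and Unix64 is usually safer with the <stdint.h> types than with plain long.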
Gunther