The MASM Forum

General => The Laboratory => Topic started by: InfiniteLoop on February 18, 2021, 06:46:05 AM

Title: AVX2 __m256i (ymmword) variable byte shift.
Post by: InfiniteLoop on February 18, 2021, 06:46:05 AM
While looking through the Intel Instrinsics Guide I noticed there is no such function:
__m256i Shift(__m256i a, __m128i count)
meaning to shift a right/left by count number of bytes while shifting in zeros.

What is fastest?
Also I discovered a serious bug in MASM. Why won't fixed addresses / labels work? Its having none of it. It MUST be put into a register first or it crashes.
i.e. ymmword ptr [label + rax*4] instead of mov reg, OFFSET label and ymmword ptr[reg + rax*4]


;================================
;1.A
;================================
MyCode SEGMENT READ WRITE EXECUTE ALIGN(4096)
ASM_Mod proc
vmovdqu ymm0, ymmword ptr [rcx]
vextracti128 xmm1, ymm0, 1
mov byte ptr [CodeChange + 5], dl
CodeChange: vpalignr ymm0, ymm1, ymm0, 0
ret
ASM_Mod endp
MyCode ENDS

;================================
;1.B
;================================
CodeStart SEGMENT READ EXECUTE ALIGN(4096)
ASM_Mod proc
vmovdqu ymm0, ymmword ptr [rcx]
vextracti128 xmm1, ymm0, 1
mov byte ptr [CodeChange + 5], dl
mov rax, OFFSET CodeChange
jmp rax
ASM_Mod endp
CodeStart ENDS

MyCode SEGMENT READ WRITE EXECUTE ALIGN(4096)
CodeChange: vpalignr ymm0, ymm1, ymm0, 0
ret
MyCode ENDS

;================================
;2.
;================================
ASM_Switch proc
vmovdqu ymm0, ymmword ptr [rcx]
vextracti128 xmm1, ymm0, 1
mov rcx, OFFSET _J0
lea rax, [rcx + 8*rdx]
jmp rax
_J0:
nop
nop
nop
nop
nop
nop
nop
ret
_J1:
nop
vpalignr ymm0, ymm1, ymm0, 1
ret
_J2:
nop
vpalignr ymm0, ymm1, ymm0, 2
ret
_J3:
nop
vpalignr ymm0, ymm1, ymm0, 3
ret
_J4:
nop
vpalignr ymm0, ymm1, ymm0, 4
ret
_J5:
nop
vpalignr ymm0, ymm1, ymm0, 5
ret
_J6:
nop
vpalignr ymm0, ymm1, ymm0, 6
ret
_J7:
nop
vpalignr ymm0, ymm1, ymm0, 7
ret
_J8:
nop
vpalignr ymm0, ymm1, ymm0, 8
ret
_J9:
nop
vpalignr ymm0, ymm1, ymm0, 9
ret
_J10:
nop
vpalignr ymm0, ymm1, ymm0, 10
ret
_J11:
nop
vpalignr ymm0, ymm1, ymm0, 11
ret
_J12:
nop
vpalignr ymm0, ymm1, ymm0, 12
ret
_J13:
nop
vpalignr ymm0, ymm1, ymm0, 13
ret
_J14:
nop
vpalignr ymm0, ymm1, ymm0, 14
ret
_J15:
nop
vpalignr ymm0, ymm1, ymm0, 15
ret
ASM_Switch endp
;================================
;3.
;================================
TestFunc proc
vmovdqu ymm0, ymmword ptr [rcx]
vmovdqu ymm5, ymmword ptr [permd_index]
mov ecx, edx
and ecx, 3
mov eax, 32
lea r8d, [8*ecx] ;8*xmod4
sub eax, r8d ;8*(4-xmod4)
mov r10d, edx ;x
sar edx, 2 ;x/4
mov r9d, edx
sar ecx, 2
sub r9d, ecx
inc r9d ;(4+x-xmod4) / 4
movd xmm1, edx
movd xmm2, r8d
vpbroadcastb ymm1, xmm1
vpaddd ymm1, ymm1, ymm5
vpermd ymm3, ymm1, ymm0
vpsrld ymm3, ymm3, xmm2
movd xmm1, r9d
movd xmm2, eax
vpbroadcastb ymm1, xmm1
vpaddd ymm1, ymm1, ymm5
vpermd ymm4, ymm1, ymm0
mov rax, OFFSET mask_index
vmovdqu ymm1, ymmword ptr [rax + r10]
vpslld ymm4, ymm4, xmm2
vxorps xmm5,xmm5,xmm5
vpcmpgtb ymm2, ymm4, ymm5
vpblendvb ymm0, ymm3, ymm4, ymm2
vpblendvb ymm0, ymm0, ymm5, ymm1
ret
permd_index DWORD 0,1,2,3,4,5,6,7
mask_index BYTE 32 dup (0), 32 dup (255)
TestFunc endp
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: jj2007 on February 18, 2021, 07:24:34 AM
You mean this type of addressing?
include \Masm32\MasmBasic\Res\JBasic.inc
_label YMMWORD 1111111111111111h, 2222222222222222h, 33333333333333333h
Init ; OPT_64 1 ; put 0 for 32 bit, 1 for 64 bit assembly
  PrintLine Chr$("This program was assembled with ", @AsmUsed$(1), " in ", jbit$, "-bit format.")
  mov rax, 4  ; 4*8=get the second element
  int 3
  vmovdqu ymm0, ymmword ptr _label[rax*8]
  Inkey "all is fine"
EndOfCode
OPT_DebugL /LARGEADDRESSAWARE:NO


Works fine with UAsm and Polink. Masm gives me "invalid data initializer" in line 2, apparently it doesn't like the YMMWORD.
Note the linker option - it seems necessary.

I've inserted the int 3 so that you can see in the debugger that ymm0 gets correctly loaded with 2222222222222222h.

See also How to load 256 bits to AVX register at once? (https://board.flatassembler.net/topic.php?p=214813), post by user MASM.

Note that...
  mov rax, 2 ; element 2
  imul rax, rax, 32
  vmovdqu ymm0, ymmword ptr _label[rax]

... gets wrongly encoded by UAsm as vmovdqu ymm0, ymmword ptr [rax], i.e. no _label.
AsmC handles it correctly, but you should use the /Znk assembly option, otherwise it may complain about the "type" keyword :cool:
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: TouEnMasm on February 18, 2021, 07:31:12 PM
look here : https://docs.microsoft.com/en-us/cpp/intrinsics/x64-amd64-intrinsics-list?view=msvc-160 (https://docs.microsoft.com/en-us/cpp/intrinsics/x64-amd64-intrinsics-list?view=msvc-160)
there is __ll_lshift and __ll_rshift valid for Intel and AMD processor
https://stackoverflow.com/questions/8924729/using-avx-intrinsics-instead-of-sse-does-not-improve-speed-why (https://stackoverflow.com/questions/8924729/using-avx-intrinsics-instead-of-sse-does-not-improve-speed-why)
avx
https://docs.microsoft.com/fr-fr/dotnet/api/system.runtime.intrinsics.x86.avx2.shiftleftlogicalvariable?view=net-5.0 (https://docs.microsoft.com/fr-fr/dotnet/api/system.runtime.intrinsics.x86.avx2.shiftleftlogicalvariable?view=net-5.0)
all functions are in immintrin.h (.sdk)
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: hutch-- on February 18, 2021, 07:51:05 PM
I don't have time to chase it up but putting the immediate into a register first is Win64, not the assembler. You could do it in Win32 but Win64 may not have the available opcode to use an OFFSET like that.
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: TouEnMasm on February 18, 2021, 08:58:05 PM

uasm don't seem to have problem compiling that:
Quote
  00000001400027B7: C5 FE 6F 01        vmovdqu     ymm0,ymmword ptr [rcx]
  00000001400027BB: C4 E3 7D 39 C1 01  vextracti128 xmm1,ymm0,1
  00000001400027C1: 88 15 11 00 00 00  mov         byte ptr [00000001400027D8h],dl
  00000001400027C7: 48 B8 D3 27 00 40  mov         rax,1400027D3h
                    01 00 00 00
  00000001400027D1: FF E0              jmp         rax
  00000001400027D3: C4 E3 75 0F C0 00  vpalignr    ymm0,ymm1,ymm0,0
  00000001400027D9: C3                 ret
  00000001400027DA: 48 83 EC 08        sub         rsp,8
  00000001400027DE: C5 FE 6F 01        vmovdqu     ymm0,ymmword ptr [rcx]


ASM_Mod proc
vmovdqu ymm0, ymmword ptr [rcx]
vextracti128 xmm1, ymm0, 1
mov byte ptr [CodeChange + 5], dl
mov rax, OFFSET CodeChange
jmp rax
ASM_Mod endp
CodeChange: vpalignr ymm0, ymm1, ymm0, 0
ret



Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: InfiniteLoop on February 20, 2021, 09:53:35 AM
Quote from: TouEnMasm on February 18, 2021, 07:31:12 PM
there is __ll_lshift and __ll_rshift valid for Intel and AMD processor
Those are epi32 and epi64. There is no variable si256 shift by x bytes.

Quote from: jj2007 on February 18, 2021, 07:24:34 AM
You mean this type of addressing?
No I mean using fixed addresses.
e.g.
rax = some input value
;mov rdx, OFFSET Start ;should not need
;lea rax, [rdx + 8*rax] ;should not need
;jmp rax
jmp [Start+rax*8] ;absolute jmp should work no?
Start:
BYTE 7 DUP(0)
ret
BYTE 7 DUP(0)
ret

Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: hutch-- on February 20, 2021, 10:16:55 AM
Hi InfiniteLoop,

Looking at the code you have posted, it looks like you are converting 32 bit code to 64 bit code and the direct OFFSET included in parts of your code that worked fine in Win32 do not work in Win64 as you don't have the opcodes in the hardware to do so. In Win64 you have many more registers and the solution is to use them.

For example you could use r10 and r11 to hold the OFFSET values which you would only need to load once then in you complex addressing mode notation you use the registers instead of the direct OFFSETS. You have already done this and it is the right way to do it.
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: jj2007 on February 20, 2021, 11:31:03 AM
Offset works for me:
include \Masm32\MasmBasic\Res\JBasic.inc ; ## console demo, builds in 32- or 64-bit mode with UAsm, ML, AsmC ##
_number dq 1234567890123456789
Init ; OPT_64 1 ; put 0 for 32 bit, 1 for 64 bit assembly
  PrintLine Chr$("This program was assembled with ", @AsmUsed$(1), " in ", jbit$, "-bit format.")

  mov rax, offset _number   ; <<<<<<<<<<<<<

  Inkey Str$("The number is %lli", qword ptr [rax])
EndOfCode


Disassembly: movabs rax, 0x1400012a8

Output:
This program was assembled with ml64 in 64-bit format.
The number is 1234567890123456789


See MOVABS opcode in the assembly code (https://stackoverflow.com/questions/46594389/movabs-opcode-in-the-assembly-code) at SOF. There is a lot of misleading info on the web; for example, some believe that movabs is GAS-specific (https://reverseengineering.stackexchange.com/questions/6540/when-was-the-movabs-instruction-introduced).

Note also:
  mov rax, offset _number ; identical result,
  lea rax, _number ; but lea is shorter


Exe with an int 3 before mov rax, offset _number attached (i.e. run with an x64 debugger).
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: hutch-- on February 20, 2021, 12:37:32 PM
 :biggrin:

See if you can build the app with /LARGEADDRESSAWARE using that old mnemonic. If not, you are stuck with Win32 memory limits.
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: jj2007 on February 20, 2021, 01:06:37 PM
include \Masm32\MasmBasic\Res\JBasic.inc ; ## console demo, builds in 32- or 64-bit mode with UAsm, ML, AsmC ##
_number dq 1234567890123456789
Init ; OPT_64 1 ; put 0 for 32 bit, 1 for 64 bit assembly
  PrintLine Chr$("This program was assembled with ", @AsmUsed$(1), " in ", jbit$, "-bit format.")

  mov rax, offset _number   ; <<<<<<<<<<<<<

  xchg rax, rdi
  Print Str$("The address is at %xh\n", rdi)
  Inkey Str$("The number is %lli", qword ptr [rdi])
EndOfCode


Output with /LARGEADDRESSAWARE (my default setting):
This program was assembled with ml64 in 64-bit format.
The address is at 40003018h
The number is 1234567890123456789


With /LARGEADDRESSAWARE:NO:
This program was assembled with ml64 in 64-bit format.
The address is at 403018h
The number is 1234567890123456789
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: hutch-- on February 20, 2021, 02:26:56 PM
What did you miss about "using that old mnemonic" ?
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: six_L on February 20, 2021, 04:04:56 PM
about "LARGEADDRESSAWARE"
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: jj2007 on February 20, 2021, 10:21:29 PM
Quote from: hutch-- on February 20, 2021, 02:26:56 PM
What did you miss about "using that old mnemonic" ?

No idea what you mean, sorry.
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: InfiniteLoop on February 22, 2021, 09:40:59 AM
This is the fastest I've found. 30% faster than the permd version.
Using intrinsics. Clang compiles it verbatim.
__m256i shift_8_right_2(__m256i a, unsigned int count)
{
__m256i mask = _mm256_set_epi8(15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0);
__m256i b = _mm256_set1_epi8(count);
__m256i c = _mm256_set1_epi8(15);
__m256i mask_B = _mm256_add_epi8(mask, b);
__m256i mask_A = _mm256_cmpgt_epi8(mask_B, c);
mask_B = _mm256_and_si256(mask_B, c);
__m256i v = _mm256_shuffle_epi8(a, mask_B);
__m256i temp = _mm256_castsi128_si256(_mm256_extracti128_si256(v, 1));
v = _mm256_blendv_epi8(v, temp, mask_A);
return(v);
}


LargeAddressAware does nothing anymore. Its a 32-bit thing.

Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: jj2007 on February 22, 2021, 10:53:06 AM
Quote from: InfiniteLoop on February 22, 2021, 09:40:59 AMLargeAddressAware does nothing anymore. Its a 32-bit thing.

No. See reply #9.
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: InfiniteLoop on March 02, 2021, 07:54:37 PM
Quote from: jj2007 on February 22, 2021, 10:53:06 AM
No. See reply #9.
I tested it. Setting LAA to disabled causes any allocation >= 2Gb to fail.

In the process I discovered 2 bugs in the way c++ works.
(1) For a pointer symbol P, which is identical to &P[0], the c++ doe something stupid and crashes unless the latter is used.
(2) 64-bit integers used for addresses and bytes. Absolute nightmare. 1024*1024*1024*2 equals 2 billion but I must be wrong apparently because the compiler says it equals 18 quadrillion and tries to allocate those bytes.

This is the fastest method to shift right by edx bytes.

;=====================================================================
; Shift Variable Right x bytes (RCX == input ymmword. RDX == count bytes)
;=====================================================================
ASM_VarShift_Right proc
vmovdqu ymm0, ymmword ptr [rcx] ;extract parameter
vmovd xmm2, edx
mov eax,15
vpbroadcastb ymm2, xmm2
vpaddb ymm1, ymm2, ymmword ptr Shuffle_Order
vmovd xmm3, eax
vpbroadcastb ymm3, xmm3
vpcmpgtb ymm4, ymm1, ymm3
vpand ymm2, ymm1, ymm3
vpshufb ymm0, ymm0, ymm2
vextracti128 xmm3, ymm0, 1
vpblendvb ymm0, ymm0, ymm3, ymm4
ret
Shuffle_Order BYTE 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
ASM_VarShift_Right endp
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: jj2007 on March 02, 2021, 09:54:27 PM
Quote from: InfiniteLoop on February 22, 2021, 09:40:59 AMLargeAddressAware does nothing anymore. Its a 32-bit thing.

No, it's a 64-bit thing, too:

LAA ON: The address is at 40003018h
LAA OFF: The address is at 403018h

include \Masm32\MasmBasic\Res\JBasic.inc ; OPT_64 1 ; for 32 bit, 1 for 64 bit assembly
_number dq 1234567890123456789
Init
  ifidn @Environ(oDebugL), </Largeaddressaware:NO>
Print "LAA OFF: "
  else
Print "LAA ON: "
  endif
  Inkey Str$("The address is at %xh\n", offset _number)
EndOfCode

OPT_DebugL /Largeaddressaware:NO  ; disable to get LAA ON
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: Ralphy on February 15, 2022, 10:07:44 AM
Not sure if this quote from Mr. Hyde's book helps but..

"Large Address Unaware Applications

One advantage of 64-bit addresses is that they can access a frightfully large
amount of memory (something like 8TB under Windows). By default, the
Microsoft linker (when it links together the C++ and assembly language
code) sets a flag named LARGEADDRESSAWARE to true (yes). This makes it possible
for your programs to access a huge amount of memory. However, there is a
price to be paid for operating in LARGEADDRESSAWARE mode: the const component
of the [reg64 + const] addressing mode is limited to 32 bits and cannot
span the entire address space.

Because of instruction-encoding limitations, the const value is limited
to a signed value in the range ±2GB. This is probably far more than enough
when the register contains a 64-bit base address and you want to access
a memory location at a fixed offset (less than ±2GB) around that base
address. A typical way you would use this addressing mode is as follows:

lea rcx, someStructure
mov al, [rcx+fieldOffset]

Prior to the introduction of 64-bit addresses, the const offset appearing
in the (32-bit) indirect-plus-offset addressing mode could span the entire
(32-bit) address space. So if you had an array declaration such as

.data
buf byte 256 dup (?)

you could access elements of this array by using the following addressing
mode form:

mov al, buf[ebx] ; EBX was used on 32-bit processors

If you were to attempt to assemble the instruction mov al, buf[rbx] in
a 64-bit program (or any other addressing mode involving buf other than
PC-relative), MASM would assemble the code properly, but the linker would
report an error:
error LNK2017: 'ADDR32' relocation to 'buf' invalid without /LARGEADDRESSAWARE:NO

The linker is complaining that in an address space exceeding 32 bits,
it is impossible to encode the offset to the buf buffer because the machine
instruction opcodes provide only a 32-bit offset to hold the address of buf.

However, if we were to artificially limit the amount of memory that our
application uses to 2GB, then MASM can encode the 32-bit offset to buf into
the machine instruction. As long as we kept our promise and never used any
more memory than 2GB, several new variations on the indirect-plus-offset
and scaled-indexed addressing modes become possible.

To turn off the large address–aware flag, you need to add an extra command
line option to the ml64 command. This is easily done in the build.bat
file; let's create a new build.bat file and call it sbuild.bat. This file will have
the following lines:

echo off
ml64 /nologo /c /Zi /Cp %1.asm
cl /nologo /O2 /Zi /utf-8 /EHa /Fe%1.exe c.cpp %1.obj /link /largeaddressaware:no

This set of commands tells MASM to pass a
command to the linker that turns off the large address–aware file. MASM,
MSVC, and the Microsoft linker will construct an executable file that
requires only 32-bit addresses (ignoring the 32 HO bits in the 64-bit registers
appearing in addressing modes).

Once you've disabled LARGEADDRESSAWARE, several new variants of the
indirect-plus-offset and scaled-indexed addressing modes become available
to your programs:

variable[reg64]
variable[reg64 + const]
variable[reg64 - const]
variable[reg64 * scale]
variable[reg64 * scale + const]
variable[reg64 * scale - const]
variable[reg64 + reg_not_RSP64 * scale]
variable[reg64 + reg_not_RSP64 * scale + const]
variable[reg64 + reg_not_RSP64 * scale - const]

where variable is the name of an object you've declared in your source file
by using directives like byte, word, dword, and so on; const is a (maximum
32-bit) constant expression; and scale is 1, 2, 4, or 8. These addressing mode
forms use the address of variable as the base address and add in the current
value of the 64-bit registers."
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: hutch-- on March 14, 2022, 05:39:00 PM
You just need to learn how 64 bit works, There is a special MOV mnemonic that will load large addresses into a 64 bit register. There is no opcode to load a 64 bit sized immediate into a memory operand, you load it into a register first then copy the register into the memory operand.

Turning off the /LARGEADDRESSAWARE is a mistake, you get all of the quirks of Win64 while being restricted to 32 bit memory addressing range.
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: mikeburr on March 21, 2022, 09:41:00 PM
hutch
can you give an example of how to do this .. [theres plenty of stuff saying set largeaddress to no but nothing on largeaddressware that solves the issue]
eg
.data
message0  " 31char message... ",0
message1 2..31char message" ,0
....
...

....code

mov r15, 3
shl r15,5 ; get 5th message in array
lea rcx, [message0 + r15]  ..> fails lnk2017

how can this be done ??
regards mike b


Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: mikeburr on March 21, 2022, 10:14:06 PM
looks to be fairly simple
replace line
lea rcx, [message0 + r15]
by
mov rcx,  offset message0
add rcx, r15 ; displacement
regards mikeb .. ps..havent tested yet but doesnt throw a lnk error so i guess it will work ok
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: hutch-- on March 21, 2022, 10:22:21 PM
Mike,

Using the /LARGEADDRESSAWARE means transferring immediates to a 64 bit register first, then transferring the register to a 64 bit operand.

    mov rax, 1234567890123456   ; big number
    mov var, rax

RE : AVX, I have done some work there and the majpr factor is data alignment. It will say nasty things to you and crash if you get that wrong. Effectively you align data for AVX to at least the size of the register. Check the actual instructions in the Intel manual.
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: Gunther on March 22, 2022, 03:20:56 AM
Quote from: hutch-- on March 21, 2022, 10:22:21 PM
RE : AVX, I have done some work there and the majpr factor is data alignment. It will say nasty things to you and crash if you get that wrong. Effectively you align data for AVX to at least the size of the register. Check the actual instructions in the Intel manual.

This is a very important hint. Even better is the alignment to 16 or even 32. It' just a few bytes that are given away by this.
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: mikeburr on March 23, 2022, 04:00:43 AM
thanks.. yes been aligning to 16 where possible ... turned out using largeaddressware  required a few changes that proved a bit longwinded
[ 3 lines of code replacing 1 in the lnk error cases ] but working fine
regards mikeb
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: johnsa on April 03, 2022, 02:06:43 AM
There really isn't any reason to be using /LARGEADDRESSAWARE:NO for 64bit coding... that really defeats the object.

You want to get as much PIC (RIP relative) as possible.

branches/calls are all rip relative.
lea is inherently rip relative too and is your friend.

Use LEA to load the base address and then use it with another register as your index.
LEA rax,message0
mov rcx,(5*32) ; some offset amount
mov al,[rax+rcx]

variable[idx] type addressing that was common in 32bit asm code is a terrible idea now, on the plus side it means fewer relocations avoiding this.
Title: Re: AVX2 __m256i (ymmword) variable byte shift.
Post by: TimoVJL on April 03, 2022, 02:49:49 AM
PIC coding is important in linux too, so why not accept that in Windows x64 too.