The MASM Forum

64 bit assembler => UASM Assembler Development => Topic started by: LiaoMi on August 01, 2019, 10:57:02 PM

Title: VPERMILPS Bug?!
Post by: LiaoMi on August 01, 2019, 10:57:02 PM
Hi,

I think I found a bug. While experimenting with Lanczos interpolation, I forgot to remove the trailing part of another instruction. As a result, the code was assembled with the stray operand (920h+var_920) instead of being rejected.

Quote
ownRowLanczos32pl proc near
var_920         = dword ptr -920h
...
VPERMILPS ymm0, ymm0, ymm1, 920h+var_920
                ;VPERMILPS ymm1, ymm2, ymm3/m256   
                ;Description - RVM  V/V   AVX   Permute single-precision floating-point values in ymm2 using controls from ymm3/mem and store result in ymm1.
                ;VPERMILPS ymm1, ymm0, ymm1, 920h+var_920

will be assembled into

Quote
vpermilps ymm0, ymm1, 0

which seems wrong to me  :rolleyes: There is no warning that the instruction is invalid. I first thought something was wrong with the third parameter as well, but that part is fine: var_920 is -920h, so 920h+var_920 evaluates to 0.
Title: Re: VPERMILPS Bug?!
Post by: AW on August 02, 2019, 12:40:09 AM
I have seen a few wrong AVX instructions that assemble without error to something completely different. A simple one: vmovaps ymm0, 2222
Title: Re: VPERMILPS Bug?!
Post by: johnsa on August 02, 2019, 01:14:35 AM
There are two code generators in UASM at present. The original one, inherited from wasm/jwasm and modified over the years, is pretty awful and is the source of all these issues. I started replacing it in January with a new CodeGenV2. In both of these cases the new CodeGenV2 correctly identifies the instruction as invalid. However, to maintain compatibility and not break the product entirely mid-stream, whenever the V2 generator can't find a valid instruction it falls back to the legacy one, which creates the nonsense. Once ALL instructions are migrated to the new generator the old one will be removed completely and these issues will be a thing of the past (+ the new generator is a lot cleaner and faster than the old one).

Basically, I'm not fixing anything in the old code-gen unless it's totally unavoidable.. :)
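
As a rough illustration of the fallback described above, here is a minimal sketch. It is not UASM's real code: `encode_v2`, `encode_legacy`, `assemble_line` and the `strict` switch are hypothetical names, and the one-entry table only stands in for the migrated instruction forms.

```c
#include <string.h>

/* Hypothetical result codes; UASM's real interfaces are not shown in this thread. */
enum { ENC_OK = 0, ENC_NOT_FOUND = 1 };

/* Stand-in for CodeGenV2: it only accepts forms present in its table. */
static int encode_v2(const char *line)
{
    static const char *const known[] = { "vpermilps ymm0, ymm0, ymm1" };
    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++)
        if (strcmp(line, known[i]) == 0)
            return ENC_OK;
    return ENC_NOT_FOUND;
}

/* Stand-in for the legacy generator: it emits something for any input. */
static int encode_legacy(const char *line)
{
    (void)line;
    return ENC_OK;
}

/* Returns 2 if V2 handled the line, 1 if the legacy fallback was used,
   0 if the line was rejected. strict != 0 disables the fallback. */
int assemble_line(const char *line, int strict)
{
    if (encode_v2(line) == ENC_OK)
        return 2;
    if (strict)
        return 0;
    return encode_legacy(line) == ENC_OK ? 1 : 0;
}
```

With `strict` set to 0 the invalid four-operand line still "assembles" through the legacy path, which is the behaviour reported in this topic; with `strict` set to 1 it is rejected, which corresponds to the state after the old generator is removed.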
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 02, 2019, 02:58:31 AM
Hi AW,

I also decided to check the ones I had on hand, but I didn't find anything on the first try. It would be great to automate such checks  :icon_idea:

Compiler Fuzzing With Prog-Fuzz Is Turning Up Bugs In GCC, Clang
Quote
Vegard Nossum of Oracle has been working on fuzzing different open-source compilers to turn up bugs in compilers like GCC and Clang.

Vegard ended up writing a new compiler fuzzer from scratch making use of AFL instrumentation. This new fuzzer is dubbed simply Prog-Fuzz and is available on GitHub: https://github.com/vegard/prog-fuzz

Over the past few months, he has uncovered more than 100 different GCC compiler bugs while about three dozen of them are fixed so far. Most of these bugs cause the compiler to crash with compiler errors, assertion failures, or segmentation faults. At least 9 new bugs were also uncovered in the LLVM/Clang compiler.
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 02, 2019, 03:03:33 AM
Ah, so that's how it works, now it's clear  :biggrin: A nice, tricky way to refactor the code  :thumbsup:
Title: Re: VPERMILPS Bug?!
Post by: AW on August 02, 2019, 05:59:29 AM
After assembling, the next step is to disassemble to confirm all is well. I can live with that, I am not using much AVX.  :biggrin:
Title: Re: VPERMILPS Bug?!
Post by: johnsa on August 03, 2019, 05:29:29 AM
Hopefully in a few months it will be a non-issue.. but re-creating the code gen is a painful process.. I've created a regression test per instruction, so we move one at a time to make sure it's right.. it's laborious!
Oh for the good ol' days of a small ISA with 50 or so instructions.. not the 700 or whatever we have now!
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 06, 2019, 10:18:40 PM
Hi johnsa,

is it time consuming to update each instruction? Do you need help?
Title: Re: VPERMILPS Bug?!
Post by: johnsa on August 06, 2019, 11:08:41 PM
It is tedious, we could definitely use a hand creating per-instruction regression tests.

Creating the actual instruction entries isn't too bad as the format has been designed to match very closely with the instruction manuals.
Title: Re: VPERMILPS Bug?!
Post by: habran on August 06, 2019, 11:32:52 PM
At this moment we need the test for crc32 with all possible combinations:
    CRC32 r32, r/m8      F2 0F 38 F0 /r
    CRC32 r32, r/m8 *    F2 REX 0F 38 F0 /r
    CRC32 r32, r/m16     F2 0F 38 F1 /r
    CRC32 r32, r/m32     F2 0F 38 F1 /r
    CRC32 r64, r/m8      F2 REX.W 0F 38 F0 /r
    CRC32 r64, r/m64     F2 REX.W 0F 38 F1 /r

Code: [Select]
"crc32",   2, { R32,      R8      },
"crc32",   2, { R32,      R8H     },
"crc32",   2, { R32,      R8E     },
"crc32",   2, { R32,      R8U     },
"crc32",   2, { R32E,     R8      },
"crc32",   2, { R32E,     R8E     },
"crc32",   2, { R32E,     R8U     },
"crc32",   2, { R64,      R8      },
"crc32",   2, { R64,      R8E     },
"crc32",   2, { R64,      R8U     },
"crc32",   2, { R64E,     R8      },
"crc32",   2, { R64E,     R8U     },
"crc32",   2, { R64E,     R8E     },
"crc32",   2, { R64,      R8U     },
"crc32",   2, { R32,      R16     },
"crc32",   2, { R32E,     R16     },
"crc32",   2, { R32,      R32     },
"crc32",   2, { R32E,     R32E    },
"crc32",   2, { R32E,     R32     },
"crc32",   2, { R32,      R32E    },
"crc32",   2, { R64,      R64     },
"crc32",   2, { R64E,     R64E    },
"crc32",   2, { R64,      R64E    },
"crc32",   2, { R64E,     R64     },
"crc32",   2, { R32,      M8      },
"crc32",   2, { R32,      M16     },
"crc32",   2, { R32,      M32     },
"crc32",   2, { R64,      M8      },
"crc32",   2, { R64,      M64     },
Title: Re: VPERMILPS Bug?!
Post by: johnsa on August 07, 2019, 12:04:21 AM
I can walk somebody through how we create/run the regression tests if someone wants to have a go at one :)
Title: Re: VPERMILPS Bug?!
Post by: AW on August 07, 2019, 03:19:06 AM
I believe LiaoMi may be able to adapt his Haskell project for AVX2 instructions generation for testing. How to integrate it into with a 100% C codebase is a challenge.
Title: Re: VPERMILPS Bug?!
Post by: habran on August 07, 2019, 03:46:36 PM
I have volunteered to create a test piece for crc32 :biggrin:
Code: [Select]
crc32 r8,r9                  ;F2 4D 0F 38 F1 C1                 crc32       r8,r9 
crc32 ecx, cl                ;F2 0F 38 F0 C9                    crc32       ecx,cl 
crc32 ecx, ch                ;F2 0F 38 F0 CD                    crc32       ecx,ch 
crc32 ecx, r10b              ;F2 41 0F 38 F0 CA                 crc32       ecx,r10b 
crc32 ecx, sil               ;F2 40 0F 38 F0 CE                 crc32       ecx,sil 
crc32 r10d, al               ;F2 44 0F 38 F0 D0                 crc32       r10d,al 
crc32 r10d, r10b             ;F2 45 0F 38 F0 D2                 crc32       r10d,r10b 
crc32 r10d, sil              ;F2 44 0F 38 F0 D6                 crc32       r10d,sil 
crc32 rcx, al                ;F2 48 0F 38 F0 C8                 crc32       rcx,al 
crc32 rcx, r10b              ;F2 49 0F 38 F0 CA                 crc32       rcx,r10b 
crc32 rcx, sil               ;F2 48 0F 38 F0 CE                 crc32       rcx,sil 
crc32 r10, al                ;F2 4C 0F 38 F0 D0                 crc32       r10,al 
crc32 r10, sil               ;F2 4C 0F 38 F0 D6                 crc32       r10,sil 
crc32 r10, r10b              ;F2 4D 0F 38 F0 D2                 crc32       r10,r10b 
crc32 ecx, ax                ;66 F2 0F 38 F1 C8                 crc32       ecx,ax 
crc32 r10d, bx               ;66 F2 44 0F 38 F1 D3              crc32       r10d,bx 
crc32 ecx, r10w              ;66 F2 41 0F 38 F1 CA              crc32       ecx,r10w 
crc32 r10d,r10w              ;66 F2 45 0F 38 F1 D2              crc32       r10d,r10w 
crc32 ecx, ecx               ;F2 0F 38 F1 C9                    crc32       ecx,ecx 
crc32 r10d, r11d             ;F2 45 0F 38 F1 D3                 crc32       r10d,r11d 
crc32 r10d, ecx              ;F2 44 0F 38 F1 D1                 crc32       r10d,ecx 
crc32 ecx, r11d              ;F2 41 0F 38 F1 CB                 crc32       ecx,r11d 
crc32 rcx, rcx               ;F2 48 0F 38 F1 C9                 crc32       rcx,rcx 
crc32 r10, r11               ;F2 4D 0F 38 F1 D3                 crc32       r10,r11 
crc32 rcx, r10               ;F2 49 0F 38 F1 CA                 crc32       rcx,r10 
crc32 r10, rcx               ;F2 4C 0F 38 F1 D1                 crc32       r10,rcx 
crc32 ecx, dbVar             ;F2 0F 38 F0 0D 4B 2F 00 00        crc32       ecx,byte ptr [dbVar (0404000h)] 
crc32 ecx, dwVar             ;66 F2 0F 38 F1 0D 42 2F 00 00     crc32       ecx,word ptr [dwVar (0404001h)]
crc32 ecx, ddVar             ;F2 0F 38 F1 0D 3B 2F 00 00        crc32       ecx,dword ptr [ddVar (0404003h)] 
crc32 rcx, qvar              ;F2 48 0F 38 F1 0D 35 2F 00 00     crc32       rcx,qword ptr [qvar (0404007h)]
crc32 rcx, dbVar             ;F2 48 0F 38 F0 0D 24 2F 00 00     crc32       rcx,byte ptr [dbVar (0404000h)]
crc32 r10d, dbVar            ;F2 44 0F 38 F0 15 1A 2F 00 00     crc32       r10d,byte ptr [dbVar (0404000h)] 
crc32 r10d, dwVar            ;66 F2 44 0F 38 F1 15 10 2F 00 00  crc32       r10d,word ptr [dwVar (0404001h)]
crc32 r10d, ddVar            ;F2 44 0F 38 F1 15 08 2F 00 00     crc32       r10d,dword ptr [ddVar (0404003h)] 
crc32 r10,  qvar             ;F2 4C 0F 38 F1 15 02 2F 00 00     crc32       r10,qword ptr [qvar (0404007h)] 
crc32 r10,  dbVar            ;F2 4C 0F 38 F0 15 F1 2E 00 00     crc32       r10,byte ptr [dbVar (0404000h)] 
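
The register-to-register rows of the listing above can be cross-checked with a small encoder. This is only a sketch written from the opcode table (F2 prefix, optional REX, 0F 38 F0/F1, ModRM); the function name `encode_crc32_rr` and its parameters are made up for illustration and memory operands are not handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode "crc32 dst, src" for register operands only.
   dst_bits: 32 or 64; src_bits: 8, 16, 32 or 64.
   dst and src are hardware register numbers 0..15 (ecx = 1, ch = 5, r10 = 10).
   src_needs_rex marks sil/dil/spl/bpl, which need an empty REX prefix (40h)
   so that they are not decoded as dh/bh/ah/ch.
   Returns the number of bytes written to out (at most 7). */
size_t encode_crc32_rr(int dst_bits, unsigned dst, int src_bits, unsigned src,
                       int src_needs_rex, uint8_t *out)
{
    size_t n = 0;
    uint8_t rex = 0x40;
    if (dst_bits == 64) rex |= 0x08;      /* REX.W selects the 64-bit destination */
    if (dst >= 8)       rex |= 0x04;      /* REX.R extends ModRM.reg */
    if (src >= 8)       rex |= 0x01;      /* REX.B extends ModRM.rm */

    if (src_bits == 16) out[n++] = 0x66;  /* operand-size override */
    out[n++] = 0xF2;                      /* mandatory prefix */
    if (rex != 0x40 || src_needs_rex) out[n++] = rex;
    out[n++] = 0x0F;
    out[n++] = 0x38;
    out[n++] = (src_bits == 8) ? 0xF0 : 0xF1;
    out[n++] = (uint8_t)(0xC0 | ((dst & 7) << 3) | (src & 7)); /* ModRM, mod = 11 */
    return n;
}
```

For example, `encode_crc32_rr(32, 10, 16, 3, 0, buf)` corresponds to `crc32 r10d, bx` and can be compared byte by byte with the line in the listing.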
Title: Re: VPERMILPS Bug?!
Post by: johnsa on August 07, 2019, 06:21:16 PM
The approach I take is as follows:

1) Create a plain BIN source file for 32-bit and 64-bit per instruction. These can be found in the regress/src folder.
2) The instruction must be tested in every possible combination (and this is the hard part): sil/dil vs. high-byte registers, and registers whose number is <8, <16, <32, in combinations of those,
    combined with an array of addressing modes for memory operands, once again with various registers from the different number banks.
3) Additionally, an error src file is created to specifically test variations of the instruction which should fail.
4) If the instruction supports various forms of prefixes, these must be included too.

Once that is all done, the regress test can automatically run it and compare the output BIN to a known-good/expected result file. Here is where the second part of the slow process is:

5) I use UASM to assemble the bin file, take the resulting HEX file, and then use Defuse to manually go through each opcode and validate it. In addition, I take the same instructions and assemble them via the Defuse interface to ensure we have selected the correct opcode and encoding (i.e. the optimal shorter sequences). This result is then used to verify the expected result file, which is stored in regress/exp.
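
The comparison against the expected result file can be pictured as a plain byte-wise check. How the regress runner really does it is not shown in this thread, so the sketch below is an assumption; `first_mismatch` is a hypothetical helper name.

```c
#include <stddef.h>

/* Returns -1 if the two buffers are identical, otherwise the offset of the
   first difference. A length mismatch with an identical common part is
   reported at the length of the shorter buffer. */
long first_mismatch(const unsigned char *got, size_t got_len,
                    const unsigned char *exp, size_t exp_len)
{
    size_t n = got_len < exp_len ? got_len : exp_len;
    for (size_t i = 0; i < n; i++)
        if (got[i] != exp[i])
            return (long)i;
    return got_len == exp_len ? -1 : (long)n;
}
```

Reporting the offset rather than just pass/fail makes it easier to find the instruction whose encoding changed when a test starts failing.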

Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 07, 2019, 10:28:03 PM
Hi AW,

it was exactly that Haskell project that I took as a basis =)

Hi habran,

we knew that asking for help is the best motivation to solve a problem yourself))))) thanks for volunteering!

Hi johnsa,

thanks for the description! Now it's clear what the basic structure looks like.
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 07, 2019, 10:34:02 PM
(https://i.imgur.com/A0aS3H9.png)

The following code from the generated list does not assemble:
;CRC32 r15 , word ptr [4 * r15w + r15w]  ;Error A2031: invalid addressing mode with current CPU setting
Code: [Select]
;CRC32 eax , word ptr [4 * r15w + r15w] 
;CRC32 eax , word ptr [r15w + r15w] 
CRC32 eax , word ptr [4 * r15w + 123456h] 
;CRC32 eax , word ptr [r15w + 123456h] 
;CRC32 eax , word ptr [4 * r14w + r14w] 
;CRC32 eax , word ptr [r14w + r14w] 
CRC32 eax , word ptr [4 * r14w + 123456h] 
;CRC32 eax , word ptr [r14w + 123456h] 
;CRC32 eax , word ptr [4 * r13w + r13w] 
;CRC32 eax , word ptr [r13w + r13w] 
;CRC32 eax , word ptr [4 * r13w + 123456h] 
;CRC32 eax , word ptr [r13w + 123456h] 
;CRC32 eax , word ptr [4 * r12w + r12w] 
;CRC32 eax , word ptr [r12w + r12w] 
;CRC32 eax , word ptr [4 * r12w + 123456h] 
;CRC32 eax , word ptr [r12w + 123456h] 
;CRC32 eax , word ptr [4 * r11w + r11w] 
;CRC32 eax , word ptr [r11w + r11w] 
;CRC32 eax , word ptr [4 * r11w + 123456h] 
;CRC32 eax , word ptr [r11w + 123456h] 
;CRC32 eax , word ptr [4 * r10w + r10w] 
;CRC32 eax , word ptr [r10w + r10w] 
;CRC32 eax , word ptr [4 * r10w + 123456h] 
;CRC32 eax , word ptr [r10w + 123456h] 
;CRC32 eax , word ptr [4 * r9w + r9w] 
;CRC32 eax , word ptr [r9w + r9w] 
;CRC32 eax , word ptr [4 * r9w + 123456h] 
;CRC32 eax , word ptr [r9w + 123456h] 
;CRC32 eax , word ptr [4 * r8w + r8w] 
;CRC32 eax , word ptr [r8w + r8w] 
;CRC32 eax , word ptr [4 * r8w + 123456h] 
;CRC32 eax , word ptr [r8w + 123456h] 
;CRC32 eax , word ptr [4 * bp + bp] 
;CRC32 eax , word ptr [bp + bp] 
;CRC32 eax , word ptr [4 * bp + 123456h] 
;CRC32 eax , word ptr [bp + 123456h] 
;CRC32 eax , word ptr [4 * sp + bp] 
;CRC32 eax , word ptr [sp + bp] 
;CRC32 eax , word ptr [4 * sp + 123456h] 
;CRC32 eax , word ptr [sp + 123456h] 
;CRC32 eax , word ptr [4 * di + di] 
;CRC32 eax , word ptr [di + di] 
;CRC32 eax , word ptr [4 * di + 123456h] 
;CRC32 eax , word ptr [di + 123456h] 
;CRC32 eax , word ptr [4 * si + si] 
;CRC32 eax , word ptr [si + si] 
;CRC32 eax , word ptr [4 * si + 123456h] 
;CRC32 eax , word ptr [si + 123456h] 
;CRC32 eax , word ptr [4 * dx + dx] 
;CRC32 eax , word ptr [dx + dx] 
;CRC32 eax , word ptr [4 * dx + 123456h] 
;CRC32 eax , word ptr [dx + 123456h] 
;CRC32 eax , word ptr [4 * cx + cx] 
;CRC32 eax , word ptr [cx + cx] 
;CRC32 eax , word ptr [4 * cx + 123456h] 
;CRC32 eax , word ptr [cx + 123456h] 
;CRC32 eax , word ptr [4 * bx + bx] 
;CRC32 eax , word ptr [bx + bx] 
;CRC32 eax , word ptr [4 * bx + 123456h] 
;CRC32 eax , word ptr [bx + 123456h] 
;CRC32 eax , word ptr [4 * ax + ax] 
;CRC32 eax , word ptr [ax + ax] 
;CRC32 eax , word ptr [4 * ax + 123456h] 
;CRC32 eax , word ptr [ax + 123456h] 
;CRC32 eax , word ptr [ DataLAbelValue ] 

You can look at the generated example (a version covering all possible instructions; the source code is in the archive). The instruction generator is still a beta version. I wrote the program in one evening, so there are still many bugs. The debugger shows strange code for one example, but this is probably a bug in the debugger itself.

(https://i.imgur.com/I4AycI7.png)
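
The A2031 errors are most likely caused by the 16-bit registers in the address expressions: in 64-bit mode an effective address is formed from 64-bit registers (or 32-bit ones with an address-size prefix), so an operand such as [4 * r15w + r15w] has no encoding. A minimal sketch of emitting only encodable scaled-index operands could look like this; `make_mem_test` and the output format are made up for illustration and are not taken from the generator in the archive.

```c
#include <stdio.h>

static const char *const REG64[] = {
    "rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
    "r8",  "r9",  "r10", "r11", "r12", "r13", "r14", "r15"
};

/* Writes one test line into buf, e.g. "crc32 eax, word ptr [4 * rcx + rbx]".
   Returns 1 on success, 0 if the combination cannot be encoded
   (bad scale, rsp used as index) or if buf is too small. */
int make_mem_test(char *buf, size_t size, unsigned base, unsigned index, unsigned scale)
{
    if (base > 15 || index > 15) return 0;
    if (scale != 1 && scale != 2 && scale != 4 && scale != 8) return 0;
    if (index == 4) return 0;   /* rsp cannot be used as an index register */
    int n = snprintf(buf, size, "crc32 eax, word ptr [%u * %s + %s]",
                     scale, REG64[index], REG64[base]);
    return n > 0 && (size_t)n < size;
}
```

Looping `base` and `index` over 0..15 and `scale` over 1, 2, 4, 8 gives the full set of base plus scaled-index forms; displacement-only and base-only forms would need separate format strings.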
Title: Re: VPERMILPS Bug?!
Post by: habran on August 07, 2019, 11:29:25 PM
I cannot assemble this because of missing include files:
        include C:\masm64\VS2017\include_x86_x64\translate64.inc
        include C:\masm64\sdkrc100\um\windows.inc

The lines with  byte ptr [ DataLAbelValue ]
cause the error: Illegal use of segment register
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 07, 2019, 11:56:50 PM
All extra code can be deleted, use your own batch file, copy only the code itself and the variable in the data section.
But if you really want to try, these header files can be downloaded here - https://mega.co.nz/#!g5x3hSLa!AAAAAAAAAAAtj2upDPBmFQAAAAAAAAAALY9rqQzwZhU (updated archive)
Title: Re: VPERMILPS Bug?!
Post by: AW on August 08, 2019, 12:12:11 AM
One approach to error checking would be to use one (or more than one) regular expression for each Intel instruction.

For example, for the first part of a crc32 instruction (until the comma)
(?i)(^crc32\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)|(((r|e)s|(r|e)d)i)|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\s*,\s*


(https://www.dropbox.com/s/17ywy13wjcc9h59/crc32.jpg?dl=1)

There are Regex libraries and Regex-to-C converters (I have never tested them).
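
For comparison, the same first-operand check can be written in plain C with the POSIX regex functions. This is only a sketch and assumes a POSIX system with `<regex.h>`, which is not one of the Windows toolchains used in this thread. POSIX extended syntax has no `\s` and no `(?i)`, so `[[:space:]]` and the `REG_ICASE` flag are used instead. Like the pattern above, it covers the same register set and nothing after the comma; `crc32_first_operand_ok` is a made-up name.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if line looks like "crc32 <32/64-bit register> ," (any case),
   0 if it does not, -1 if the pattern fails to compile. */
int crc32_first_operand_ok(const char *line)
{
    static const char pattern[] =
        "^crc32[[:space:]]+"
        "((r|e)(a|b|c|d)x|(r|e)(s|d)i|r(8|9|1[0-5])d?)"
        "[[:space:]]*,[[:space:]]*$";
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Compiling the pattern on every call keeps the sketch short; a real checker would compile each instruction's pattern once and reuse it.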
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 08, 2019, 04:37:30 AM
I would like to be able to generate erroneous instructions too, almost like fuzzing  :badgrin: But in that case the generation of instructions will have to be limited in the settings, otherwise the cap on the number of displayed errors will not let you see all of them. Up to this point I have tested instructions with two operands:

r16/32/64;r/m16/32/64;;;
r/m16/32/64;r16/32/64;;;
r8;r/m8;;;

For the rest I need to add handlers, which will process the instructions ..  :arrow_down:
imm8/16/32
xmm/m128
DIV;rDX;rAX;r/m16/32/64 // three operands
FDIVR;ST;m32real;;;
FDIVR;ST;STi;;;
FILD;ST;m32int;;;
FILD;ST;m16int;;;
FILD;ST;m64int;;;
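
To show how such operand form strings could be expanded into concrete sizes, here is a minimal sketch. It assumes that fields are separated by ';' and that every run of digits inside a field names an operand size; the real generator's parser is not shown here, and `field_sizes` is a hypothetical name.

```c
#include <ctype.h>
#include <stddef.h>

/* Collects the operand sizes named in one field of a form string,
   e.g. "r/m16/32/64" -> 16, 32, 64 and "xmm/m128" -> 128.
   Scanning stops at ';' or at the end of the string.
   Returns the number of sizes stored in sizes[] (at most max). */
size_t field_sizes(const char *field, int *sizes, size_t max)
{
    size_t count = 0;
    while (*field != '\0' && *field != ';' && count < max) {
        if (isdigit((unsigned char)*field)) {
            int value = 0;
            while (isdigit((unsigned char)*field))
                value = value * 10 + (*field++ - '0');
            sizes[count++] = value;
        } else {
            field++;
        }
    }
    return count;
}
```

Each returned size would then select a register bank or a `ptr` keyword when the test line is written out; fields without digits, such as STi, yield no sizes and need their own handler.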

Title: Re: VPERMILPS Bug?!
Post by: AW on August 08, 2019, 05:22:45 AM
We can produce operand variations in the same way you do it in Haskell, and we can group similar instructions and handle them in the same fashion, reducing the number of RegExes needed several times over.
BTW, I did not complete my crc32 instruction RegEx because two other complementary RegExes are needed: one for the REX case, the other to handle and validate values from memory.

The major problem is finding a nice RegEx C library. I tested SLRE (https://github.com/cesanta/slre); it is cute and small and builds fine with VS 2019, but it did not find matches when using my test RegEx  :sad:. It works fine with its own unit test, though. There is probably a bug in there, but the code is small at around 400 lines, so it will not take long to find it, if it needs to be done.

Anyway, these are only brainstorming ideas.

Yes, I understand your Haskell project is mostly suited to find errors. It is a nice tool indeed.  :thumbsup:

Title: Re: VPERMILPS Bug?!
Post by: fearless on August 08, 2019, 07:38:35 AM
Could try the PCRE lib for regex, Biterider has a compiled version of it in the ObjAsm Beta 2 for both x86 and x64 (version 841S)
Title: Re: VPERMILPS Bug?!
Post by: AW on August 08, 2019, 11:26:45 PM
A final note about RegEx for ASM instructions syntax check.

The C implementations I have seen appear buggy, despite being cute and small.
The C++ Boost library provides a robust RegEx implementation. It is in C++ but is callable from C.
I tested the same RegEx I have used previously and it worked as expected.

1- C source
Code: [Select]
#include "common.h"

const char regexp[] = "(?i)(^crc32\\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)|(((r|e)s|(r|e)d)i)|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\\s*,\\s*";
const char instructs[NUMBER_OF_STRING][MAX_STRING_SIZE] = { "crc32  ecx,", "CRC32 esi,  ", "crc32  r10d  , ", "cRC32 r15 , ","Crr32 rdi,", "CrC32 rdi,", "crc32 bx,", "crc32  ebx , " };

int main()
{
    dotest(instructs, regexp);
}


2- Header
Code: [Select]
#pragma once
#define NUMBER_OF_STRING 8
#define MAX_STRING_SIZE 40

#ifdef __cplusplus
extern "C" {
#endif
int dotest(const char strArray[][MAX_STRING_SIZE], const char* pattern);
#ifdef __cplusplus
}
#endif

3- C++ file
Code: [Select]
#include <boost/regex.hpp>
#include <string>
#include <iostream>
#include "common.h"

using namespace std;

int dotest(const char strArray[][MAX_STRING_SIZE], const char* pattern)
{
    boost::regex pat(pattern);
    boost::smatch matches;
    for (int i = 0; i < NUMBER_OF_STRING; i++)
    {
        string str(strArray[i]);
        if (boost::regex_match(str, matches, pat))
            cout << matches[0] << "\t\t matches" << endl;
        else
            cout << str << "\t\t does not match." << endl;
    }
    return 0;
}

Output:
Code: [Select]
crc32  ecx,              matches
CRC32 esi,               matches
crc32  r10d  ,           matches
cRC32 r15 ,              matches
Crr32 rdi,               does not match.
CrC32 rdi,               matches
crc32 bx,                does not match.
crc32  ebx ,             matches

It is also possible to use the standard library's std::regex instead of the Boost regex, but it does not support the inline (?i) case-insensitivity modifier in the pattern. We need to set a flag for that when constructing the regular expression. Otherwise it works fine too.

Title: Re: VPERMILPS Bug?!
Post by: TimoVJL on August 09, 2019, 03:46:28 AM
PCRE works with that pattern, so works with asm and C.
PCRE 6.4 at Pelles C (https://forum.pellesc.de/index.php?topic=1166.msg5310#msg5310)
Code: [Select]
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//#include <pcre.h>
#pragma comment(lib, "pcre3s.lib")
typedef void* pcre; // opaque
typedef void* pcre_extra; // fake, not used
pcre __cdecl *pcre_compile(const char *, int, const char **, int *, const unsigned char *);
int __cdecl pcre_exec(const pcre *, const pcre_extra *, const char *, int, int, int, int *, int);

const char regexp[] = "(?i)(^crc32\\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)|(((r|e)s|(r|e)d)i)|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\\s*,\\s*";
const char *instructs[] = { "crc32  ecx,", "CRC32 esi,  ", "crc32  r10d  , ", "cRC32 r15 , ","Crr32 rdi,", "CrC32 rdi,", "crc32 bx,", "crc32  ebx , " };
#define OVECCOUNT 30    /* should be a multiple of 3 */
int main(int argC, char *argV[])
{
    pcre *re;
    const char *error;
    int erroffset;
    int ovector[OVECCOUNT];
    int rc;

    re = pcre_compile(regexp, 0, &error, &erroffset, NULL);
    if (re) {
        printf("pcre_compile\n");
        for (int i = 0; i < 8; i++) {
            int len = strlen(instructs[i]);
            rc = pcre_exec(re, NULL, instructs[i], len, 0, 0, ovector, OVECCOUNT);
            if (rc >= 0) printf("%s\t\t matches\n", instructs[i]);
            else printf("%s\t\t does not match.\n", instructs[i]);
        }
        free(re);
    }
    return 0;
}

EDIT: With TRex, remove (?i), as it doesn't support it.
T-Rex (https://sourceforge.net/projects/tiny-rex/), link was in Benchmark of Regex Libraries (http://lh3lh3.users.sourceforge.net/reb.shtml)
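
For engines without a case-insensitive option, one workaround is to lower-case the input line before matching it against an all-lowercase pattern. A minimal sketch follows; `lower_copy` is a hypothetical helper, not part of T-Rex or any other library mentioned here.

```c
#include <ctype.h>
#include <stddef.h>

/* Copies src into dst in lower case and NUL-terminates dst when size > 0.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int lower_copy(char *dst, size_t size, const char *src)
{
    size_t i = 0;
    if (size == 0)
        return 0;
    for (; src[i] != '\0' && i + 1 < size; i++)
        dst[i] = (char)tolower((unsigned char)src[i]);
    dst[i] = '\0';
    return src[i] == '\0';
}
```

This works for instruction mnemonics and register names, but it would also alter string literals in the source line, so it only suits checks on the instruction part.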
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 09, 2019, 05:31:57 AM
At the moment the program can only generate instructions with two operands; instructions with one and three operands are not yet supported, but that is easy to finish. The check boxes also don't work yet. To generate commands, select an instruction with two operands and click generate; a file with the source code will appear in the program folder.

The following parameters and their modifications are supported...
r16/32/64;r/m16/32/64;;;
r/m16/32/64;r16/32/64;;;
r8;r/m8;;;
imm8/16/32/64/128
xmm/m8/16/32/64/128
mm - xmm/m64

unsupported instructions ...
in eax, dx - with two operands - modifications with the specified register
DIV;rDX;rAX;r/m16/32/64 // three operands, rDX;rAX - modifications with the specified register
FDIVR;ST;m32real;;; - m32real, m64real, m80real
FDIVR;ST;STi;;;
FILD;ST;m32int;;;
FILD;ST;m16int;;;
FILD;ST;m64int;;;

Known bug: movq mm/m64 - mm - an inheritance error in the variable, which will be fixed in the next version. There are a couple of extra characters in the file with command forms; I have not cleaned this file yet.
Title: Re: VPERMILPS Bug?!
Post by: AW on August 09, 2019, 03:52:03 PM
A more up-to-date PCRE for Windows (https://github.com/kiyolee/pcre-win-build)?
However, the std regex builds to only 55KB (the Boost regex to 145KB).
This is the std regex variation:

Code: [Select]
/*
Pattern passed:
const char regexp[] ="(^crc32\\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)|(((r|e)s|(r|e)d)i)|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\\s*,\\s*";
*/
int dotestStd(const char strArray[][MAX_STRING_SIZE], const char* pattern)
{
    std::regex pat(pattern, regex_constants::icase);
    smatch matches;

    cout << "\nStd Regex" << endl;
    for (int i = 0; i < 8; i++)
    {
        string str(strArray[i]);
        if (regex_match(str, matches, pat))
            cout << matches[0] << "\t\t matches" << endl;
        else
            cout << str << "\t\t does not match." << endl;
    }
    return 0;
}
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 09, 2019, 10:34:19 PM
The database structure has been updated and AVX, AVX2, FMA, BMI, etc. instructions have been added. Apart from this, nothing has changed. The archive can be downloaded from the message above.
Title: Re: VPERMILPS Bug?!
Post by: AW on August 09, 2019, 10:38:57 PM
Almost invisible, but there is also PCRE2 10.33 for Windows (https://github.com/kiyolee/pcre2-win-build).
Lots of things are different, but it looks like the way to go if you need to become a RegEx GrandMaster using the latest developments.

This is Timo's source, revised to build with PCRE2 10.33.

Code: [Select]
#define PCRE2_STATIC
#define PCRE2_CODE_UNIT_WIDTH 8
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "config.h"
#include "pcre2.h"
#pragma comment(lib, "libpcre2-8-static.lib")

const char regexp[] = "(?i)(^crc32\\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)|(((r|e)s|(r|e)d)i)|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\\s*,\\s*";
const char* instructs[] = { "crc32  ecx,", "CRC32 esi,  ", "crc32  r10d  , ", "cRC32 r15 , ","Crr32 rdi,", "CrC32 rdi,", "crc32 bx,", "crc32  ebx , " };
int main(int argC, char* argV[])
{

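pcre2_code *re;
pcre2_match_data* match_data;
int errorcode;        /* PCRE2 returns a numeric error code, not a message string */
PCRE2_SIZE erroffset; /* offset in the pattern where compilation failed */
int rc;
size_t len;
int i;

re = pcre2_compile((PCRE2_SPTR)regexp, PCRE2_ZERO_TERMINATED, 0, &errorcode, &erroffset, NULL);
if (re) {
/* create the match data only after a successful compile */
match_data = pcre2_match_data_create_from_pattern(re, NULL);
printf("pcre2_compile\n");
for ( i = 0; i < 8; i++) {
len = strlen(instructs[i]);

rc = pcre2_match(re, (PCRE2_SPTR)instructs[i], len, 0, 0, match_data, NULL);

if (rc >= 0) printf("%s\t\t matches\n", instructs[i]);
else printf("%s\t\t does not match.\n", instructs[i]);
}
/* PCRE2 objects are released with the library's own functions, not free() */
pcre2_match_data_free(match_data);
pcre2_code_free(re);
}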
return 0;
}

The x86 .exe size is 230 KB; libpcre2-8-static.lib is 4 MB.
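
As a quick cross-check that depends on neither library, the same pattern can be run through Python's re module. This is only a sketch: re.fullmatch plays the role of regex_match (the whole string has to match), and the (?i) prefix is replaced by the re.IGNORECASE flag. Reading the alternation by hand, the pattern should accept six of the eight test strings and reject "Crr32 rdi," (misspelled mnemonic) and "crc32 bx," (16-bit register).

```python
import re

# Same pattern as in the C/C++ listings, without the (?i) prefix;
# case-insensitivity is requested through re.IGNORECASE instead.
PATTERN = re.compile(
    r"(^crc32\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)"
    r"|(((r|e)s|(r|e)d)i)"
    r"|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\s*,\s*",
    re.IGNORECASE,
)

# The eight test strings from the PCRE2 listing.
INSTRUCTS = [
    "crc32  ecx,", "CRC32 esi,  ", "crc32  r10d  , ", "cRC32 r15 , ",
    "Crr32 rdi,", "CrC32 rdi,", "crc32 bx,", "crc32  ebx , ",
]


def check(strings):
    """Return (string, matched) pairs; fullmatch mirrors std::regex_match."""
    return [(s, PATTERN.fullmatch(s) is not None) for s in strings]


if __name__ == "__main__":
    for text, ok in check(INSTRUCTS):
        print(f"{text!r:20} {'matches' if ok else 'does not match'}")
```

Note that the pattern as written has no alternative for ebp/esp/rbp/rsp, so those registers would be reported as non-matching as well.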
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 10, 2019, 08:14:34 AM
 :biggrin:

Partial support for AVX/AVX2 instructions has been added, along with partial support for single-operand commands. Some one-operand and two-operand commands require type definitions (for example imm32u), so these commands may not be generated. Partial processing of instructions with precisely defined registers has been added; it affects the xmm instructions, base registers do not work yet. The search for instructions will work in the next update. Instructions with three and four operands will be added at the final stage.

You can report bugs: just specify the line number from the list of commands; if there is a listing with incorrect instructions, you can copy the erroneous text. Once all the instructions are checked, it will be possible to activate multi-generation, for 20 or more instructions at a time. The update is in the message above.  :azn:

p.s. If you try all the instructions in a row, you can find out which ones work and which don't)

Code: [Select]
VMOVUPS ymm1 , ymmword ptr [4 * r15 + r15] 
VMOVUPS ymm1 , ymmword ptr [r15 + r15] 
VMOVUPS ymm1 , ymmword ptr [4 * r15 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r15 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r14 + r14] 
VMOVUPS ymm1 , ymmword ptr [r14 + r14] 
VMOVUPS ymm1 , ymmword ptr [4 * r14 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r14 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r13 + r13] 
VMOVUPS ymm1 , ymmword ptr [r13 + r13] 
VMOVUPS ymm1 , ymmword ptr [4 * r13 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r13 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r12 + r12] 
VMOVUPS ymm1 , ymmword ptr [r12 + r12] 
VMOVUPS ymm1 , ymmword ptr [4 * r12 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r12 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r11 + r11] 
VMOVUPS ymm1 , ymmword ptr [r11 + r11] 
VMOVUPS ymm1 , ymmword ptr [4 * r11 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r11 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r10 + r10] 
VMOVUPS ymm1 , ymmword ptr [r10 + r10] 
VMOVUPS ymm1 , ymmword ptr [4 * r10 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r10 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r9 + r9] 
VMOVUPS ymm1 , ymmword ptr [r9 + r9] 
VMOVUPS ymm1 , ymmword ptr [4 * r9 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r9 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r8 + r8] 
VMOVUPS ymm1 , ymmword ptr [r8 + r8] 
VMOVUPS ymm1 , ymmword ptr [4 * r8 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r8 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rbp + rbp] 
VMOVUPS ymm1 , ymmword ptr [rbp + rbp] 
VMOVUPS ymm1 , ymmword ptr [4 * rbp + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rbp + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rsp + rbp] 
VMOVUPS ymm1 , ymmword ptr [rsp + rbp] 
VMOVUPS ymm1 , ymmword ptr [4 * rsp + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rsp + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rdi + rdi] 
VMOVUPS ymm1 , ymmword ptr [rdi + rdi] 
VMOVUPS ymm1 , ymmword ptr [4 * rdi + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rdi + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rsi + rsi] 
VMOVUPS ymm1 , ymmword ptr [rsi + rsi] 
VMOVUPS ymm1 , ymmword ptr [4 * rsi + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rsi + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rdx + rdx] 
VMOVUPS ymm1 , ymmword ptr [rdx + rdx] 
VMOVUPS ymm1 , ymmword ptr [4 * rdx + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rdx + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rcx + rcx] 
VMOVUPS ymm1 , ymmword ptr [rcx + rcx] 
VMOVUPS ymm1 , ymmword ptr [4 * rcx + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rcx + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rbx + rbx] 
VMOVUPS ymm1 , ymmword ptr [rbx + rbx] 
VMOVUPS ymm1 , ymmword ptr [4 * rbx + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rbx + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rax + rax] 
VMOVUPS ymm1 , ymmword ptr [rax + rax] 
VMOVUPS ymm1 , ymmword ptr [4 * rax + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rax + 123456h] 
VMOVUPS ymm1 , ymmword ptr [ DataLAbelValue ] 
VMOVUPS ymm1 , ymm2 
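
The listing above is regular enough to be described in a few lines. The sketch below is not the generator itself (its source is not shown in this thread); it is a hypothetical Python reconstruction that emits four memory forms per base register, and the names REGS64, addressing_forms and uses_rsp_as_index are made up for illustration. One thing it makes visible: RSP cannot be encoded as a SIB index register, so forms that scale rsp are expected to be rejected by an assembler, which fits the remark that trying all lines shows which ones work.

```python
# Base registers in the order used by the listing above.
REGS64 = ["r15", "r14", "r13", "r12", "r11", "r10", "r9", "r8",
          "rbp", "rsp", "rdi", "rsi", "rdx", "rcx", "rbx", "rax"]


def addressing_forms(reg, disp="123456h"):
    """Four memory-operand forms per register: scaled index + base,
    base + index, scaled index + displacement, base + displacement."""
    return [
        f"[4 * {reg} + {reg}]",
        f"[{reg} + {reg}]",
        f"[4 * {reg} + {disp}]",
        f"[{reg} + {disp}]",
    ]


def generate(mnemonic="VMOVUPS", dest="ymm1", size="ymmword"):
    """Build one source line per register and addressing form."""
    return [
        f"{mnemonic} {dest} , {size} ptr {form}"
        for reg in REGS64
        for form in addressing_forms(reg)
    ]


def uses_rsp_as_index(line):
    """True when the operand needs rsp in the index slot, which has no
    encoding. Simplification: the original listing emits [rsp + rbp]
    instead of [rsp + rsp], which this sketch does not reproduce."""
    return "4 * rsp" in line or "[rsp + rsp]" in line


if __name__ == "__main__":
    for line in generate():
        note = "   ; not encodable: rsp as index" if uses_rsp_as_index(line) else ""
        print(line + note)
```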
Title: Re: VPERMILPS Bug?!
Post by: TimoVJL on August 10, 2019, 06:01:31 PM
A note:
The Opcodes (https://github.com/Maratyszcza/Opcodes) project has an opcode database in XML format.
Creating a CSV file from it could be useful, even though that database has its own limits.

asmdb (https://devhub.io/repos/asmjit-asmdb) is in JS/JSON format.

Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 10, 2019, 09:51:41 PM
Quote
A note:
The Opcodes (https://github.com/Maratyszcza/Opcodes) project has an opcode database in XML format.
Creating a CSV file from it could be useful, even though that database has its own limits.

asmdb (https://devhub.io/repos/asmjit-asmdb) is in JS/JSON format.

Hi TimoVJL,

that's exactly what I did: I used this file - https://github.com/golang/arch/blob/master/x86/x86.csv. It was built from the intelxed/xed datafiles (https://github.com/intelxed/xed/tree/master/datafiles), which in my opinion are the best database of all instructions. The CSV file itself was generated with Golang's x86avxgen. I tried to generate a new database that also covers AVX-512, but I'm stuck with the Go language  :undecided: :sad:

Alternatively, I can try to parse the instructions directly from https://github.com/intelxed/xed/blob/master/datafiles/avx/avx-isa.txt; they have a format like this:
Code: [Select]
AVX_INSTRUCTIONS()::
{
ICLASS    : VADDPD
EXCEPTIONS: avx-type-2
CPL       : 3
CATEGORY  : AVX
EXTENSION : AVX
ATTRIBUTES: MXCSR
PATTERN : VV1 0x58  V66 VL128 V0F MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()
OPERANDS  : REG0=XMM_R():w:dq:f64 REG1=XMM_N():r:dq:f64 MEM0:r:dq:f64

PATTERN : VV1 0x58  V66 VL128 V0F MOD[0b11] MOD=3 REG[rrr] RM[nnn]
OPERANDS  : REG0=XMM_R():w:dq:f64 REG1=XMM_N():r:dq:f64 REG2=XMM_B():r:dq:f64

PATTERN : VV1 0x58  V66 VL256 V0F MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()
OPERANDS  : REG0=YMM_R():w:qq:f64 REG1=YMM_N():r:qq:f64 MEM0:r:qq:f64

PATTERN : VV1 0x58  V66 VL256 V0F MOD[0b11] MOD=3 REG[rrr] RM[nnn]
OPERANDS  : REG0=YMM_R():w:qq:f64 REG1=YMM_N():r:qq:f64 REG2=YMM_B():r:qq:f64
}


{
ICLASS    : VADDPS
EXCEPTIONS: avx-type-2
CPL       : 3
CATEGORY  : AVX
EXTENSION : AVX
ATTRIBUTES: MXCSR
PATTERN : VV1 0x58  VNP VL128 V0F MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()
OPERANDS  : REG0=XMM_R():w:dq:f32 REG1=XMM_N():r:dq:f32 MEM0:r:dq:f32

PATTERN : VV1 0x58  VNP VL128 V0F MOD[0b11] MOD=3 REG[rrr] RM[nnn]
OPERANDS  : REG0=XMM_R():w:dq:f32 REG1=XMM_N():r:dq:f32 REG2=XMM_B():r:dq:f32

PATTERN : VV1 0x58  VNP VL256 V0F MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()
OPERANDS  : REG0=YMM_R():w:qq:f32 REG1=YMM_N():r:qq:f32 MEM0:r:qq:f32

PATTERN : VV1 0x58  VNP VL256 V0F MOD[0b11] MOD=3 REG[rrr] RM[nnn]
OPERANDS  : REG0=YMM_R():w:qq:f32 REG1=YMM_N():r:qq:f32 REG2=YMM_B():r:qq:f32
}


{
ICLASS    : VADDSD
EXCEPTIONS: avx-type-3
CPL       : 3
ATTRIBUTES : simd_scalar MXCSR
CATEGORY  : AVX
EXTENSION : AVX
PATTERN : VV1 0x58  VF2  V0F  MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()
OPERANDS  : REG0=XMM_R():w:dq:f64 REG1=XMM_N():r:dq:f64 MEM0:r:q:f64

PATTERN : VV1 0x58  VF2  V0F  MOD[0b11] MOD=3 REG[rrr] RM[nnn]
OPERANDS  : REG0=XMM_R():w:dq:f64 REG1=XMM_N():r:dq:f64 REG2=XMM_B():r:q:f64
}

Working with the CSV format is easier for me, and it serves the main goal: combining fake instructions to intentionally create the conditions for an error. Everything should be as simple as possible. There are three options: generate fake instructions in memory, use generic structures (currently used), or use a pre-prepared database of deliberately broken instructions. The first is too expensive; the other two are fine. A XED parser complicates the whole structure, but I think it can be done. XED has 5202 instructions in its description.
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 13, 2019, 04:18:55 AM
Hi,

I think I was wrong: the database file is generated by a different technique. They wanted to introduce the method above, but it needs improvement ...
Details are at https://godoc.org/golang.org/x/arch/x86/x86spec

Quote
Command x86spec
X86spec reads the “Intel® 64 and IA-32 Architectures Software Developer's Manual” to collect instruction encoding details and writes those details to standard output in CSV format.

Usage:

x86spec [-f file] [-u url] >x86.csv
The -f flag specifies the input file (default x86manual.pdf), the Intel instruction set reference manual in PDF form. If the input file does not exist, it will be created by downloading the manual.

The -u flag specifies the URL from which to download the manual (default https://golang.org/s/x86manual, which redirects to Intel's site). The URL is downloaded only when the file named by the -f flag is missing.

There are additional debugging flags, not shown. Run x86spec -help for the list.

Intel® 64 and IA-32 Architectures Software Developer’s Manual - Last updated 2019-05-30 - https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
All instructions can be found in the pdf file, the question is how to copy the tables correctly ?!
Title: Re: VPERMILPS Bug?!
Post by: TimoVJL on August 13, 2019, 04:34:18 PM
https://www.codeproject.com/articles/7056/code-to-extract-plain-text-from-a-pdf-file

EDIT: some streams need a bigger buffer:
size_t outsize = (streamend - streamstart)*105;
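
A fixed multiplier is a guess about the compression ratio. As a hedged illustration in Python (not the CodeProject code, which is C++ and calls zlib directly), the sketch below builds a deliberately extreme stream that expands by more than a factor of 105, and decompresses it piece by piece so that no output size has to be estimated up front. The function name inflate_stream and the test data are made up for the example.

```python
import zlib


def inflate_stream(data, chunk_size=4096):
    """Decompress a zlib (FlateDecode-style) stream piece by piece,
    so no output-size guess is needed in advance."""
    decomp = zlib.decompressobj()
    parts = []
    for start in range(0, len(data), chunk_size):
        parts.append(decomp.decompress(data[start:start + chunk_size]))
    parts.append(decomp.flush())
    return b"".join(parts)


# Hypothetical extreme case: one megabyte of zero bytes.
original = b"\x00" * 1_000_000
compressed = zlib.compress(original)

ratio = len(original) / len(compressed)
restored = inflate_stream(compressed)

if __name__ == "__main__":
    print(f"compressed size: {len(compressed)} bytes, ratio about {ratio:.0f}:1")
    print("fixed *105 buffer large enough:", len(compressed) * 105 >= len(original))
```

Real PDF content streams are usually far less compressible than this, so a larger multiplier may well be enough in practice; the point is only that any fixed factor can be exceeded.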
Title: Re: VPERMILPS Bug?!
Post by: AW on August 13, 2019, 05:38:52 PM
I would bet on moving the information from the XED text database into an SQLite database and manipulating it from there.
Records are already parsed and separated by {}
Record fields are: ICLASS, UNAME, VERSION, CATEGORY, .... etc
It will be easy to build the SQLite database.
Just an idea.  :biggrin:
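
A minimal sketch of that idea, assuming the record layout shown earlier in the thread (one {...} block per instruction, "KEY : value" lines, and one PATTERN/OPERANDS pair per encoding). The table and column names (instr, form) are invented for the example, and only three of the single-valued fields are stored.

```python
import re
import sqlite3

# One record copied from the avx-isa.txt excerpt posted above.
SAMPLE = """\
{
ICLASS    : VADDSD
EXCEPTIONS: avx-type-3
CPL       : 3
ATTRIBUTES : simd_scalar MXCSR
CATEGORY  : AVX
EXTENSION : AVX
PATTERN : VV1 0x58  VF2  V0F  MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()
OPERANDS  : REG0=XMM_R():w:dq:f64 REG1=XMM_N():r:dq:f64 MEM0:r:q:f64

PATTERN : VV1 0x58  VF2  V0F  MOD[0b11] MOD=3 REG[rrr] RM[nnn]
OPERANDS  : REG0=XMM_R():w:dq:f64 REG1=XMM_N():r:dq:f64 REG2=XMM_B():r:q:f64
}
"""


def parse_records(text):
    """Yield (fields, forms) for every {...} record.

    fields: dict of single-valued keys (ICLASS, CATEGORY, ...)
    forms:  list of (PATTERN, OPERANDS) pairs, one per encoding
    """
    for body in re.findall(r"\{(.*?)\}", text, flags=re.DOTALL):
        fields, forms, pattern = {}, [], None
        for line in body.splitlines():
            if ":" not in line:
                continue
            # Split on the first colon only: operand values contain colons.
            key, value = (part.strip() for part in line.split(":", 1))
            if key == "PATTERN":
                pattern = value
            elif key == "OPERANDS":
                forms.append((pattern, value))
            else:
                fields[key] = value
        yield fields, forms


def load(conn, text):
    """Create the two hypothetical tables and fill them from the text."""
    conn.execute("CREATE TABLE instr (id INTEGER PRIMARY KEY, iclass TEXT,"
                 " category TEXT, extension TEXT)")
    conn.execute("CREATE TABLE form (instr_id INTEGER, pattern TEXT, operands TEXT)")
    for fields, forms in parse_records(text):
        cur = conn.execute(
            "INSERT INTO instr (iclass, category, extension) VALUES (?, ?, ?)",
            (fields.get("ICLASS"), fields.get("CATEGORY"), fields.get("EXTENSION")),
        )
        conn.executemany(
            "INSERT INTO form VALUES (?, ?, ?)",
            [(cur.lastrowid, p, o) for p, o in forms],
        )


conn = sqlite3.connect(":memory:")
load(conn, SAMPLE)
rows = conn.execute(
    "SELECT i.iclass, f.operands FROM instr i"
    " JOIN form f ON f.instr_id = i.id ORDER BY f.rowid"
).fetchall()

if __name__ == "__main__":
    for iclass, operands in rows:
        print(iclass, "|", operands)
```

Keeping the encodings in a separate table is one possible design: an ICLASS usually has several PATTERN/OPERANDS pairs, so a single flat row per instruction would lose them.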
Title: Re: VPERMILPS Bug?!
Post by: jj2007 on August 13, 2019, 06:48:38 PM
Quote
https://www.codeproject.com/articles/7056/code-to-extract-plain-text-from-a-pdf-file

EDIT: some streams need a bigger buffer:
size_t outsize = (streamend - streamstart)*105;

It works, kind of, but it's light-years away from what AbleWord (http://www.ableword.net/) can extract from a PDF.

Quote
All instructions can be found in the pdf file, the question is how to copy the tables correctly ?!

Try AbleWord.
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 13, 2019, 09:15:51 PM
Quote
https://www.codeproject.com/articles/7056/code-to-extract-plain-text-from-a-pdf-file

EDIT: some streams need a bigger buffer:
size_t outsize = (streamend - streamstart)*105;

Hi TimoVJL,

 :thumbsup: great method; the Golang tool processes the file in the same way. Here is an example of a table to be extracted:

(https://i.imgur.com/Rza2kGC.png)

Here is what will be found in the text file ..
Code: [Select]
Opcode Instruction Op/

En

64-Bit

Mode

Compat/

Leg Mode

Description

F6 /5 IMUL r/m8* MV a l i d V a l i d A X AL r/m byte.

F7 /5 IMUL r/m16 MV a l i d V a l i d D X : A X AX r/m word.

F7 /5 IMUL r/m32 MV a l i d V a l i d E D X : E A X EAX r/m 32.

REX.W + F7 /5 IMUL r/m64 M Valid N.E. RDX:RAX RAX r/m 64.

0F AF / r IMUL r16, r/m16 RM Valid Valid word register word register r/m 16.

0F AF / r IMUL r32, r/m32 RM Valid Valid doubleword register doubleword register

r/m32.

REX.W + 0F AF / r IMUL r64, r/m64 RM Valid N.E. Quadword register Quadword register

r/m64 .

6B / r ib IMUL r16, r/m16, imm8 RMI Valid Valid word register r/m16 sign-extended

immediate byte.

6B / r ib IMUL r32, r/m32, imm8 RMI Valid Valid doubleword register r/m32 sign-

extended immediate byte.

REX.W + 6B / r ib IMUL r64, r/m64, imm8 RMI Valid N.E. Quadword register r/m64 sign-extended

immediate byte.

69 / r iw IMUL r16, r/m16, imm16 RMI Valid Valid word register r/m16 immediate word.

69 / r id IMUL r32, r/m32, imm32 RMI Valid Valid doubleword register r/m32 immediate

doubleword.

REX.W + 69 / r id IMUL r64, r/m64, imm32 RMI Valid N.E. Quadword register r/m64 immediate

doubleword.

The formatting is broken, but as a method it should work perfectly. Thanks!


Quote
I would bet on moving the information from the XED text database into an SQLite database and manipulating it from there.
Records are already parsed and separated by {}
Record fields are: ICLASS, UNAME, VERSION, CATEGORY, .... etc
It will be easy to build the SQLite database.
Just an idea.  :biggrin:

Hi AW,

the form of the records scared me. Before that I had studied a number of similar databases, and it turned out that, despite an almost perfect form, all operand information is lost: the operands are encoded in a representation that suits an instruction decoder. In other words, the records tell us nothing about the operands directly, and that information is moved into the C++ source code of the parser. I noticed the same form in XED right away, so I lost all desire to search for operand descriptors in the source code.

CMPXCHG8B
Code: [Select]
{
ICLASS    : CMPXCHG8B
CPL       : 3
CATEGORY  : SEMAPHORE
EXTENSION : BASE
ISA_SET   : PENTIUMREAL
ATTRIBUTES : LOCKABLE
FLAGS     : MUST [ zf-mod ]
PATTERN   : 0x0F 0xC7 MOD[mm] MOD!=3 REG[0b001] RM[nnn] not64 IMMUNE66() MODRM() nolock_prefix
OPERANDS  : MEM0:rcw:q REG0=XED_REG_EDX:rcw:SUPP REG1=XED_REG_EAX:rcw:SUPP REG2=XED_REG_ECX:r:SUPP REG3=XED_REG_EBX:r:SUPP
PATTERN   : 0x0F 0xC7 MOD[mm] MOD!=3 REG[0b001] RM[nnn] mode64 norexw_prefix IMMUNE66() MODRM() nolock_prefix
OPERANDS  : MEM0:rcw:q REG0=XED_REG_EDX:rcw:SUPP REG1=XED_REG_EAX:rcw:SUPP REG2=XED_REG_ECX:r:SUPP REG3=XED_REG_EBX:r:SUPP
}

(https://i.imgur.com/Glwxl7W.png)

Maybe I'm wrong, the instruction descriptor here is MEM0:rcw:q, CMPXCHG8B m64, m64 = MEM0:rcw:q ?! I will check the rest of the instructions and see if the information is lost there .. Thank you!


Quote
https://www.codeproject.com/articles/7056/code-to-extract-plain-text-from-a-pdf-file

EDIT: some streams need a bigger buffer:
size_t outsize = (streamend - streamstart)*105;

It works, kind of, but it's light-years away from what AbleWord (http://www.ableword.net/) can extract from a PDF.

All instructions can be found in the pdf file, the question is how to copy the tables correctly ?!

Try AbleWord.

Hi jj2007,

I agree that a clean text representation of the PDF makes rule-based processing easier. But in my experience this text always turns out messy, probably because the PDF format has its own descriptors for the placement of text  :sad:. I tried the program and it gave me an error...

(https://i.imgur.com/odq6ojH.png)

I would rather not depend on one particular text-conversion method, because each one produces its own text structure.  :angelic: Thank you!
Title: Re: VPERMILPS Bug?!
Post by: TimoVJL on August 13, 2019, 09:30:52 PM
FoxIt (https://www.foxitsoftware.com/pdf-reader/) can export to .txt

Title: Re: VPERMILPS Bug?!
Post by: AW on August 13, 2019, 10:11:10 PM
Quote
Maybe I'm wrong, the instruction descriptor here is MEM0:rcw:q, CMPXCHG8B m64, m64 = MEM0:rcw:q ?! I will check the rest of the instructions and see if the information is lost there .. Thank you!
I agree, disambiguating that appears difficult. The Golang guys must have done it somewhere, because they show "CMPXCHG8B m64","0F C7 /1","V","V","","operand16,operand32".
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 13, 2019, 10:38:39 PM
Quote
Maybe I'm wrong, the instruction descriptor here is MEM0:rcw:q, CMPXCHG8B m64, m64 = MEM0:rcw:q ?! I will check the rest of the instructions and see if the information is lost there .. Thank you!
I agree, disambiguating that appears difficult. The Golang guys must have done it somewhere, because they show "CMPXCHG8B m64","0F C7 /1","V","V","","operand16,operand32".

They took this data from the PDF file and did not capture AVX-512, since it uses a different table format. Otherwise the table of every instruction looks identical; one column needs a separator between the opcode and the instruction, as can be seen in the tables in the pictures above.

I took instructions with complex encoding ..
Code: [Select]
# EMITTING VMOVAPD (VMOVAPD-512-1)
{
ICLASS:      VMOVAPD
CPL:         3
CATEGORY:    DATAXFER
EXTENSION:   AVX512EVEX
ISA_SET:     AVX512F_512
EXCEPTIONS:     AVX512-E1
REAL_OPCODE: Y
ATTRIBUTES:  MASKOP_EVEX
PATTERN:    EVV 0x28 V66 V0F MOD[0b11] MOD=3 BCRC=0 REG[rrr] RM[nnn]  VL512  W1  NOEVSR
OPERANDS:    REG0=ZMM_R3():w:zf64 REG1=MASK1():r:mskw:TXT=ZEROSTR REG2=ZMM_B3():r:zf64
IFORM:       VMOVAPD_ZMMf64_MASKmskw_ZMMf64_AVX512
}

{
ICLASS:      VMOVAPD
CPL:         3
CATEGORY:    DATAXFER
EXTENSION:   AVX512EVEX
ISA_SET:     AVX512F_512
EXCEPTIONS:     AVX512-E1
REAL_OPCODE: Y
ATTRIBUTES:  MEMORY_FAULT_SUPPRESSION MASKOP_EVEX REQUIRES_ALIGNMENT DISP8_FULLMEM
PATTERN:    EVV 0x28 V66 V0F MOD[mm] MOD!=3 REG[rrr] RM[nnn] BCRC=0 MODRM()  VL512  W1  NOEVSR  ESIZE_64_BITS() NELEM_FULLMEM()
OPERANDS:    REG0=ZMM_R3():w:zf64 REG1=MASK1():r:mskw:TXT=ZEROSTR MEM0:r:zd:f64
IFORM:       VMOVAPD_ZMMf64_MASKmskw_MEMf64_AVX512
}


# EMITTING VMOVAPD (VMOVAPD-512-2)
{
ICLASS:      VMOVAPD
CPL:         3
CATEGORY:    DATAXFER
EXTENSION:   AVX512EVEX
ISA_SET:     AVX512F_512
EXCEPTIONS:     AVX512-E1
REAL_OPCODE: Y
ATTRIBUTES:  MASKOP_EVEX
PATTERN:    EVV 0x29 V66 V0F MOD[0b11] MOD=3 BCRC=0 REG[rrr] RM[nnn]  VL512  W1  NOEVSR
OPERANDS:    REG0=ZMM_B3():w:zf64 REG1=MASK1():r:mskw:TXT=ZEROSTR REG2=ZMM_R3():r:zf64
IFORM:       VMOVAPD_ZMMf64_MASKmskw_ZMMf64_AVX512
}

This information says nothing about the form of the real operands, presumably because it is meant for the decoder algorithm.  :undecided:  :sad:

Code: [Select]
xed_reg_enum_t ZMM_R3()::
mode16 | OUTREG=ZMM_R3_32()
mode32 | OUTREG=ZMM_R3_32()
mode64 | OUTREG=ZMM_R3_64()

xed_reg_enum_t ZMM_R3_32()::
REG=0 | OUTREG=XED_REG_ZMM0
REG=1 | OUTREG=XED_REG_ZMM1
REG=2 | OUTREG=XED_REG_ZMM2
REG=3 | OUTREG=XED_REG_ZMM3
REG=4 | OUTREG=XED_REG_ZMM4
REG=5 | OUTREG=XED_REG_ZMM5
REG=6 | OUTREG=XED_REG_ZMM6
REG=7 | OUTREG=XED_REG_ZMM7

xed_reg_enum_t ZMM_R3_64()::
REXRR=0 REXR=0 REG=0 | OUTREG=XED_REG_ZMM0
REXRR=0 REXR=0 REG=1 | OUTREG=XED_REG_ZMM1
REXRR=0 REXR=0 REG=2 | OUTREG=XED_REG_ZMM2
REXRR=0 REXR=0 REG=3 | OUTREG=XED_REG_ZMM3
REXRR=0 REXR=0 REG=4 | OUTREG=XED_REG_ZMM4
REXRR=0 REXR=0 REG=5 | OUTREG=XED_REG_ZMM5
REXRR=0 REXR=0 REG=6 | OUTREG=XED_REG_ZMM6
REXRR=0 REXR=0 REG=7 | OUTREG=XED_REG_ZMM7
REXRR=0 REXR=1 REG=0 | OUTREG=XED_REG_ZMM8
REXRR=0 REXR=1 REG=1 | OUTREG=XED_REG_ZMM9
REXRR=0 REXR=1 REG=2 | OUTREG=XED_REG_ZMM10
REXRR=0 REXR=1 REG=3 | OUTREG=XED_REG_ZMM11
REXRR=0 REXR=1 REG=4 | OUTREG=XED_REG_ZMM12
REXRR=0 REXR=1 REG=5 | OUTREG=XED_REG_ZMM13
REXRR=0 REXR=1 REG=6 | OUTREG=XED_REG_ZMM14
REXRR=0 REXR=1 REG=7 | OUTREG=XED_REG_ZMM15

REXRR=1 REXR=0 REG=0 | OUTREG=XED_REG_ZMM16
REXRR=1 REXR=0 REG=1 | OUTREG=XED_REG_ZMM17
REXRR=1 REXR=0 REG=2 | OUTREG=XED_REG_ZMM18
REXRR=1 REXR=0 REG=3 | OUTREG=XED_REG_ZMM19
REXRR=1 REXR=0 REG=4 | OUTREG=XED_REG_ZMM20
REXRR=1 REXR=0 REG=5 | OUTREG=XED_REG_ZMM21
REXRR=1 REXR=0 REG=6 | OUTREG=XED_REG_ZMM22
REXRR=1 REXR=0 REG=7 | OUTREG=XED_REG_ZMM23
REXRR=1 REXR=1 REG=0 | OUTREG=XED_REG_ZMM24
REXRR=1 REXR=1 REG=1 | OUTREG=XED_REG_ZMM25
REXRR=1 REXR=1 REG=2 | OUTREG=XED_REG_ZMM26
REXRR=1 REXR=1 REG=3 | OUTREG=XED_REG_ZMM27
REXRR=1 REXR=1 REG=4 | OUTREG=XED_REG_ZMM28
REXRR=1 REXR=1 REG=5 | OUTREG=XED_REG_ZMM29
REXRR=1 REXR=1 REG=6 | OUTREG=XED_REG_ZMM30
REXRR=1 REXR=1 REG=7 | OUTREG=XED_REG_ZMM31

Code: [Select]
xed_reg_enum_t ZMM_B3()::
mode16 | OUTREG=ZMM_B3_32()
mode32 | OUTREG=ZMM_B3_32()
mode64 | OUTREG=ZMM_B3_64()

xed_reg_enum_t ZMM_B3_32()::
RM=0 | OUTREG=XED_REG_ZMM0
RM=1 | OUTREG=XED_REG_ZMM1
RM=2 | OUTREG=XED_REG_ZMM2
RM=3 | OUTREG=XED_REG_ZMM3
RM=4 | OUTREG=XED_REG_ZMM4
RM=5 | OUTREG=XED_REG_ZMM5
RM=6 | OUTREG=XED_REG_ZMM6
RM=7 | OUTREG=XED_REG_ZMM7

xed_reg_enum_t ZMM_B3_64()::
REXX=0 REXB=0 RM=0 | OUTREG=XED_REG_ZMM0
REXX=0 REXB=0 RM=1 | OUTREG=XED_REG_ZMM1
REXX=0 REXB=0 RM=2 | OUTREG=XED_REG_ZMM2
REXX=0 REXB=0 RM=3 | OUTREG=XED_REG_ZMM3
REXX=0 REXB=0 RM=4 | OUTREG=XED_REG_ZMM4
REXX=0 REXB=0 RM=5 | OUTREG=XED_REG_ZMM5
REXX=0 REXB=0 RM=6 | OUTREG=XED_REG_ZMM6
REXX=0 REXB=0 RM=7 | OUTREG=XED_REG_ZMM7
REXX=0 REXB=1 RM=0 | OUTREG=XED_REG_ZMM8
REXX=0 REXB=1 RM=1 | OUTREG=XED_REG_ZMM9
REXX=0 REXB=1 RM=2 | OUTREG=XED_REG_ZMM10
REXX=0 REXB=1 RM=3 | OUTREG=XED_REG_ZMM11
REXX=0 REXB=1 RM=4 | OUTREG=XED_REG_ZMM12
REXX=0 REXB=1 RM=5 | OUTREG=XED_REG_ZMM13
REXX=0 REXB=1 RM=6 | OUTREG=XED_REG_ZMM14
REXX=0 REXB=1 RM=7 | OUTREG=XED_REG_ZMM15
REXX=1 REXB=0 RM=0 | OUTREG=XED_REG_ZMM16
REXX=1 REXB=0 RM=1 | OUTREG=XED_REG_ZMM17
REXX=1 REXB=0 RM=2 | OUTREG=XED_REG_ZMM18
REXX=1 REXB=0 RM=3 | OUTREG=XED_REG_ZMM19
REXX=1 REXB=0 RM=4 | OUTREG=XED_REG_ZMM20
REXX=1 REXB=0 RM=5 | OUTREG=XED_REG_ZMM21
REXX=1 REXB=0 RM=6 | OUTREG=XED_REG_ZMM22
REXX=1 REXB=0 RM=7 | OUTREG=XED_REG_ZMM23
REXX=1 REXB=1 RM=0 | OUTREG=XED_REG_ZMM24
REXX=1 REXB=1 RM=1 | OUTREG=XED_REG_ZMM25
REXX=1 REXB=1 RM=2 | OUTREG=XED_REG_ZMM26
REXX=1 REXB=1 RM=3 | OUTREG=XED_REG_ZMM27
REXX=1 REXB=1 RM=4 | OUTREG=XED_REG_ZMM28
REXX=1 REXB=1 RM=5 | OUTREG=XED_REG_ZMM29
REXX=1 REXB=1 RM=6 | OUTREG=XED_REG_ZMM30
REXX=1 REXB=1 RM=7 | OUTREG=XED_REG_ZMM31
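
Both 64-bit tables above are plain bit concatenations: the register number is the two extension bits followed by the 3-bit ModRM field. A small Python sketch of that arithmetic (the function names are made up here; only the XED_REG_ZMMn names come from the tables):

```python
def zmm_r3_64(rexrr, rexr, reg):
    """ZMM_R3_64(): register number = REXRR:REXR:REG as a 5-bit value."""
    return f"XED_REG_ZMM{(rexrr << 4) | (rexr << 3) | reg}"


def zmm_b3_64(rexx, rexb, rm):
    """ZMM_B3_64(): register number = REXX:REXB:RM as a 5-bit value."""
    return f"XED_REG_ZMM{(rexx << 4) | (rexb << 3) | rm}"


def zmm_32(field):
    """ZMM_R3_32() / ZMM_B3_32(): only the 3-bit field is used (ZMM0..ZMM7)."""
    return f"XED_REG_ZMM{field}"


if __name__ == "__main__":
    # Rebuild the 32 rows of the ZMM_R3_64() table.
    for rexrr in (0, 1):
        for rexr in (0, 1):
            for reg in range(8):
                print(f"REXRR={rexrr} REXR={rexr} REG={reg} | OUTREG={zmm_r3_64(rexrr, rexr, reg)}")
```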
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 14, 2019, 04:03:05 AM
 :azn:

I extracted all the tables from the PDF file; the parser's condition was at least two rows and at least five columns. Scrolling through the document, I did not find an exception to this rule. The structure of the tables is easier to study in Excel. The last task is to combine these tables into one; the difficulty lies in the structure of the file, where the description is split across two (or more) lines, just like the opcode and instruction  :eusa_boohoo:


(https://i.imgur.com/gBKIczv.png)


Where can I find a full comparison of Intel and AMD instructions?! Maybe here:
https://www.amd.com/en/support/tech-docs?f%5B0%5D=tech_docs_product_type%3Aprocessor
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 14, 2019, 07:24:02 AM
 :biggrin: second attempt

- added the pages that were skipped at the end (wrong interval)
- two-line entries were merged into one; in some places there are errors (a line was added as a new column), but they are few and easy to fix manually
- opcode and instruction are on the same line
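
The merging of wrapped lines can be sketched roughly as follows. This is not the code that produced the tables; it is a hypothetical Python heuristic that assumes a new row always starts with an opcode (a hex byte or a REX/VEX/EVEX prefix) and that anything else continues the previous row. A continuation that happens to begin with two hex digits would be misread as a new row, which is the kind of error that still needs manual fixing.

```python
import re

# Assumption: a new table row starts with an opcode, i.e. a hex byte
# or a REX/VEX/EVEX/NP prefix; anything else continues the previous row.
ROW_START = re.compile(r"^(?:E?VEX\.|REX\b|NP\b|[0-9A-F]{2}\b)")


def merge_wrapped_rows(lines):
    """Join continuation lines onto the row they belong to."""
    rows = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # the extractor emits blank lines between fragments
        if ROW_START.match(line) or not rows:
            rows.append(line)
        else:
            rows[-1] += " " + line
    return rows


# Fragments copied from the extracted IMUL table earlier in the thread.
SAMPLE = [
    "REX.W + 0F AF / r IMUL r64, r/m64 RM Valid N.E. Quadword register Quadword register",
    "",
    "r/m64 .",
    "",
    "6B / r ib IMUL r16, r/m16, imm8 RMI Valid Valid word register r/m16 sign-extended",
    "",
    "immediate byte.",
]

merged = merge_wrapped_rows(SAMPLE)

if __name__ == "__main__":
    for row in merged:
        print(row)
```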

The tables look promising. Now it would be nice to combine tables with different numbers of columns and delete the first line of each, which holds the table header. Our table will be even better  :eusa_dance:

The table header can be taken from this file - page-589-table-1.csv (first instruction)
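
One possible way to do the combining, sketched in Python with the standard csv module. Everything here is an assumption for illustration: the function name, the sample rows and the placeholder descriptions are made up, and columns are aligned purely by position, so mapping columns by meaning across the different table layouts is still left open.

```python
import csv
import io


def combine_tables(tables, header):
    """Merge CSV tables into one list of equally wide rows.

    tables: iterable of CSV texts, each starting with its own header row
    header: column names for the combined table
    """
    body = []
    for text in tables:
        rows = list(csv.reader(io.StringIO(text)))
        body.extend(rows[1:])  # drop each table's own header row
    width = max([len(header)] + [len(row) for row in body])

    def pad(row):
        # Narrower rows are filled with empty cells on the right.
        return list(row) + [""] * (width - len(row))

    return [pad(header)] + [pad(row) for row in body]


# Made-up miniature tables: one with six columns, one with five.
TABLE_A = (
    "Opcode,Instruction,Op/En,64-Bit Mode,Compat/Leg Mode,Description\n"
    "F6 /5,IMUL r/m8,M,Valid,Valid,placeholder description\n"
)
TABLE_B = (
    "Opcode/Instruction,Op/En,64/32 bit Mode Support,CPUID Feature Flag,Description\n"
    "hypothetical opcode and instruction,A,V/V,AVX,placeholder description\n"
)
HEADER = ["Opcode", "Instruction", "Op/En", "64-Bit Mode",
          "Compat/Leg Mode", "Description"]

combined = combine_tables([TABLE_A, TABLE_B], HEADER)

if __name__ == "__main__":
    writer = csv.writer(io.StringIO())
    for row in combined:
        print(row)
```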
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 15, 2019, 07:44:30 AM
 :eusa_dance:

The list of instructions is ready: a complete set of instructions from the Intel documentation. It still needs to be adapted for the generator, but this is no longer a problem.
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 21, 2019, 11:14:13 PM
Hi,

I'm doing 4-operand instructions now. For the test I took the VBLENDPD instruction  :angelic:, 20 736 lines of code; compilation results below:

Code: [Select]
Translated Windows SDK 10.0 64 bits
windef WIN_INTERNAL manque ENDIF
mywindow2.asm: 20806 lines, 2 passes, 178 ms, 0 warnings, 0 errors
Microsoft (R) Incremental Linker Version 14.22.27905.0
Copyright (C) Microsoft Corporation.  All rights reserved.

 Directory of E:\DATA\MASM64\HJWasm\Regression

21.08.2019  14:07         1 144 554 mywindow2.asm
21.08.2019  14:58           174 592 mywindow2.exe
21.08.2019  14:58             2 058 mywindow2.map
21.08.2019  14:58           487 424 mywindow2.pdb
               4 File(s)      1 808 628 bytes
               0 Dir(s)     157 351 936 bytes free
Press any key to continue . . .

There are a lot of instructions and some cases are not handled yet, so I decided to do the 3-operand instructions after the 4-operand ones, and after that add the other operand types we talked about above.
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 21, 2019, 11:50:27 PM
 :biggrin: we love real hardcore: VBLENDVPD, 331 776 lines. There are bugs in UASM:

mywindow2.obj : fatal error LNK1276: invalid directive '' found; does not start with '/'
mywindow2.obj : warning LNK4209: debugging information corrupt; recompile module; linking object as if no debug info


Code: [Select]
Translated Windows SDK 10.0 64 bits
windef WIN_INTERNAL manque ENDIF
mywindow2.asm: 331845 lines, 2 passes, 1247 ms, 0 warnings, 0 errors
Microsoft (R) Incremental Linker Version 14.22.27905.0
Copyright (C) Microsoft Corporation.  All rights reserved.

mywindow2.obj : fatal error LNK1276: invalid directive '' found; does not start with '/'
mywindow2.obj : warning LNK4209: debugging information corrupt; recompile module; linking object as if no debug info

 Directory of E:\DATA\MASM64\HJWasm\Regression

21.08.2019  15:28        19 077 160 mywindow2.asm
21.08.2019  14:58             2 058 mywindow2.map
21.08.2019  15:29         4 977 664 mywindow2.obj
21.08.2019  15:29            67 584 mywindow2.pdb
               4 File(s)     24 124 466 bytes
               0 Dir(s)     134 742 016 bytes free
Press any key to continue . . .


The macro assembler did the job, but the assembly slowed down for 7 seconds ...
Code: [Select]
Microsoft (R) Macro Assembler (x64) Version 14.22.27905.0
Copyright (C) Microsoft Corporation.  All rights reserved.

 Assembling: mywindowML2.asm
Microsoft (R) Incremental Linker Version 14.22.27905.0
Copyright (C) Microsoft Corporation.  All rights reserved.

Could Not Find E:\DATA\MASM64\HJWasm\Regression\*.lst
Could Not Find E:\DATA\MASM64\HJWasm\Regression\*.res

 Directory of E:\DATA\MASM64\HJWasm\Regression

21.08.2019  15:40        19 075 754 mywindowML2.asm
21.08.2019  15:41         2 755 072 mywindowML2.exe
21.08.2019  15:41             2 150 mywindowML2.map
21.08.2019  15:41         2 748 416 mywindowML2.pdb
               4 File(s)     24 581 392 bytes
               0 Dir(s)     114 884 608 bytes free
Press any key to continue . . .

The files are large, so I uploaded them to a file host: https://mega.co.nz/#!w1Bn2AgZ!AAAAAAAAAABkZMBqwd624wAAAAAAAAAAZGTAasHetuM
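
A side note on the sizes: both line counts are perfect fourth powers, 20 736 = 12^4 and 331 776 = 24^4. That would be consistent with four operand slots each enumerated over 12 (or 24) variants, but this is only an inference from the numbers; the thread does not say how the generator arrives at them. The Python sketch below shows the arithmetic and the kind of Cartesian product that grows this fast; the operand lists in it are placeholders.

```python
from itertools import product


def count_combinations(variants_per_operand):
    """Number of generated lines when every operand slot is enumerated."""
    total = 1
    for n in variants_per_operand:
        total *= n
    return total


def enumerate_forms(mnemonic, operand_variants):
    """Yield one source line per combination of operand variants."""
    for combo in product(*operand_variants):
        yield f"{mnemonic} " + ", ".join(combo)


if __name__ == "__main__":
    print(count_combinations([12] * 4))   # four slots, 12 variants each
    print(count_combinations([24] * 4))   # four slots, 24 variants each
    # Placeholder operands, two variants per slot, just to show the shape.
    demo = [["xmm0", "xmm1"], ["xmm2", "xmm3"], ["xmm4", "xmm5"], ["0", "1"]]
    for line in enumerate_forms("VBLENDPD", demo):
        print(line)
```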
Title: Re: VPERMILPS Bug?!
Post by: AW on August 22, 2019, 04:26:02 AM
@LiaoMi

 :biggrin:
I don't know whether it is a UASM bug or not, I will not investigate it further, but if we remove all things that are not necessary it will build the .exe (the KISS principle).

For example it will not build with this:
\masm32\bin\UASM64 /c -win64 -Zp8 /win64 /D_WIN64 /Cp /Cx /Cu /nologo /W2 -Zi0 -Zi1 -Zi2 -Zi3 /Zd -Zf %appname%.asm

But will build with this:
\masm32\bin\UASM64 /c -win64 -Zp8 %appname%.asm



Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 24, 2019, 06:32:37 PM
Quote
@LiaoMi

 :biggrin:
I don't know whether it is a UASM bug or not, I will not investigate it further, but if we remove all things that are not necessary it will build the .exe (the KISS principle).

For example it will not build with this:
\masm32\bin\UASM64 /c -win64 -Zp8 /win64 /D_WIN64 /Cp /Cx /Cu /nologo /W2 -Zi0 -Zi1 -Zi2 -Zi3 /Zd -Zf %appname%.asm

But will build with this:
\masm32\bin\UASM64 /c -win64 -Zp8 %appname%.asm

Hi AW,

it seems to me that this is clearly erroneous processing of the debugging information in the parser. Just look at the size of the PDB file and you can immediately see that it is corrupted  :angelic: The PDB from another test project looks fine.

Title: Re: VPERMILPS Bug?!
Post by: AW on August 24, 2019, 07:57:44 PM
The problem is that the mainCRTStartup procedure has too many lines.
With a reduced number of lines it debugs well with this configuration (the source file needs to be loaded by hand, it will not autoload  :sad:).
\masm32\bin\uasm64 -c -win64 -Zp8 /Sg /WX -Zi %appname%.asm
\masm32\bin\link.exe /SUBSYSTEM:console /MACHINE:X64 /FIXED /DEBUG %appname%.obj


(https://www.dropbox.com/s/qpj8ccv27ma2v4u/VPERMILPS.png?dl=1)
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 25, 2019, 01:35:52 AM
Changes
- Added two types of addressing
- Fixed some errors that occurred during the generation process
- Corrected database errors (related to PDF formatting); all tables should now be correct
- The instruction search works: type part of an instruction name and press Enter; clearing the field and pressing Enter shows the whole list again
- Added a program icon to make it a little prettier  :biggrin:
- 4-operand instructions should work
- I did not manage to add handlers for the other operand types (the readme lists all types) ... it's not difficult, it just needs a little more time  :skrewy:
- The GUI is slightly modified.

What is already planned ?!
- Add three operand instructions
- Add five operand instructions
- Make a handler for instructions without operands (randomly generate all at once)
- Add handlers for all remaining types ({k1}{z}, /m512/m32bcst etc ...)
- Add checkbox handlers

P.S.
What do you think about adapting this database for RadASM?! The idea is to implement autocomplete and help for assembler instructions in the editor  :eusa_boohoo:
Title: Re: VPERMILPS Bug?!
Post by: LiaoMi on August 28, 2019, 03:49:33 AM
Hi,

from now on I will post all updates in this old topic: http://masm32.com/board/index.php?topic=7833.0 Since I do not want to delete the first post there, updates will be in the third message of that topic.  :thumbsup: