VPERMILPS Bug?!

Started by LiaoMi, August 01, 2019, 10:57:02 PM


LiaoMi



This code from the generated list does not assemble:
;CRC32 r15 , word ptr [4 * r15w + r15w]  ;Error A2031: invalid addressing mode with current CPU setting

;CRC32 eax , word ptr [4 * r15w + r15w] 
;CRC32 eax , word ptr [r15w + r15w] 
CRC32 eax , word ptr [4 * r15w + 123456h] 
;CRC32 eax , word ptr [r15w + 123456h] 
;CRC32 eax , word ptr [4 * r14w + r14w] 
;CRC32 eax , word ptr [r14w + r14w] 
CRC32 eax , word ptr [4 * r14w + 123456h] 
;CRC32 eax , word ptr [r14w + 123456h] 
;CRC32 eax , word ptr [4 * r13w + r13w] 
;CRC32 eax , word ptr [r13w + r13w] 
;CRC32 eax , word ptr [4 * r13w + 123456h] 
;CRC32 eax , word ptr [r13w + 123456h] 
;CRC32 eax , word ptr [4 * r12w + r12w] 
;CRC32 eax , word ptr [r12w + r12w] 
;CRC32 eax , word ptr [4 * r12w + 123456h] 
;CRC32 eax , word ptr [r12w + 123456h] 
;CRC32 eax , word ptr [4 * r11w + r11w] 
;CRC32 eax , word ptr [r11w + r11w] 
;CRC32 eax , word ptr [4 * r11w + 123456h] 
;CRC32 eax , word ptr [r11w + 123456h] 
;CRC32 eax , word ptr [4 * r10w + r10w] 
;CRC32 eax , word ptr [r10w + r10w] 
;CRC32 eax , word ptr [4 * r10w + 123456h] 
;CRC32 eax , word ptr [r10w + 123456h] 
;CRC32 eax , word ptr [4 * r9w + r9w] 
;CRC32 eax , word ptr [r9w + r9w] 
;CRC32 eax , word ptr [4 * r9w + 123456h] 
;CRC32 eax , word ptr [r9w + 123456h] 
;CRC32 eax , word ptr [4 * r8w + r8w] 
;CRC32 eax , word ptr [r8w + r8w] 
;CRC32 eax , word ptr [4 * r8w + 123456h] 
;CRC32 eax , word ptr [r8w + 123456h] 
;CRC32 eax , word ptr [4 * bp + bp] 
;CRC32 eax , word ptr [bp + bp] 
;CRC32 eax , word ptr [4 * bp + 123456h] 
;CRC32 eax , word ptr [bp + 123456h] 
;CRC32 eax , word ptr [4 * sp + bp] 
;CRC32 eax , word ptr [sp + bp] 
;CRC32 eax , word ptr [4 * sp + 123456h] 
;CRC32 eax , word ptr [sp + 123456h] 
;CRC32 eax , word ptr [4 * di + di] 
;CRC32 eax , word ptr [di + di] 
;CRC32 eax , word ptr [4 * di + 123456h] 
;CRC32 eax , word ptr [di + 123456h] 
;CRC32 eax , word ptr [4 * si + si] 
;CRC32 eax , word ptr [si + si] 
;CRC32 eax , word ptr [4 * si + 123456h] 
;CRC32 eax , word ptr [si + 123456h] 
;CRC32 eax , word ptr [4 * dx + dx] 
;CRC32 eax , word ptr [dx + dx] 
;CRC32 eax , word ptr [4 * dx + 123456h] 
;CRC32 eax , word ptr [dx + 123456h] 
;CRC32 eax , word ptr [4 * cx + cx] 
;CRC32 eax , word ptr [cx + cx] 
;CRC32 eax , word ptr [4 * cx + 123456h] 
;CRC32 eax , word ptr [cx + 123456h] 
;CRC32 eax , word ptr [4 * bx + bx] 
;CRC32 eax , word ptr [bx + bx] 
;CRC32 eax , word ptr [4 * bx + 123456h] 
;CRC32 eax , word ptr [bx + 123456h] 
;CRC32 eax , word ptr [4 * ax + ax] 
;CRC32 eax , word ptr [ax + ax] 
;CRC32 eax , word ptr [4 * ax + 123456h] 
;CRC32 eax , word ptr [ax + 123456h] 
;CRC32 eax , word ptr [ DataLAbelValue ] 


You can look at the generated example (a version covering all possible instructions; the source code is in the archive). The Instruction Generator is still a beta version. I wrote the program over an evening, so there are still many bugs. The debugger produces strange code for this example, but that is probably a bug in the debugger itself.
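
The shape of the list above (four addressing forms per register) can be sketched in a few lines of C++. This is only an illustration and not the Instruction Generator's actual code: the names `base_for`, `addressing_forms` and `generate_listing` are made up for this sketch, and the rule that sp/esp/rsp is paired with bp/ebp/rbp is inferred from the listing itself.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: the listing uses bp as the second register for sp
// (e.g. [4 * sp + bp]); every other register is paired with itself.
std::string base_for(const std::string& reg)
{
    if (reg == "sp")  return "bp";
    if (reg == "esp") return "ebp";
    if (reg == "rsp") return "rbp";
    return reg;
}

// Builds the four memory-operand forms that appear per register in the listing.
std::vector<std::string> addressing_forms(const std::string& mnemonic,
                                          const std::string& dest,
                                          const std::string& size,
                                          const std::string& reg)
{
    const std::string head = mnemonic + " " + dest + " , " + size + " ptr [";
    const std::string base = base_for(reg);
    return {
        head + "4 * " + reg + " + " + base + "]",   // scaled index + register
        head + reg + " + " + base + "]",            // register + register
        head + "4 * " + reg + " + 123456h]",        // scaled index + displacement
        head + reg + " + 123456h]",                 // register + displacement
    };
}

// Concatenates the forms for every register, in the order given.
std::vector<std::string> generate_listing(const std::string& mnemonic,
                                          const std::string& dest,
                                          const std::string& size,
                                          const std::vector<std::string>& regs)
{
    std::vector<std::string> lines;
    for (const std::string& reg : regs) {
        for (const std::string& line : addressing_forms(mnemonic, dest, size, reg))
            lines.push_back(line);
    }
    return lines;
}
```

Calling `generate_listing("VMOVUPS", "ymm1", "ymmword", {"r15", "r14"})` yields eight lines in the same shape as the generated file. The sketch deliberately does not filter out forms the assembler rejects, since finding those is the point of the test.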



habran

I cannot assemble this because of missing include files:
        include C:\masm64\VS2017\include_x86_x64\translate64.inc
       include C:\masm64\sdkrc100\um\windows.inc

Those lines with byte ptr [ DataLAbelValue ]
cause the error "Illegal use of segment register".
Cod-Father

LiaoMi

Quote from: habran on August 07, 2019, 11:29:25 PM
I cannot assemble this because of missing include files:
        include C:\masm64\VS2017\include_x86_x64\translate64.inc
        include C:\masm64\sdkrc100\um\windows.inc

You can delete all the extra code and use your own batch file; copy only the code itself and the variable in the data section.
But if you really want to try, these header files can be downloaded here: https://mega.co.nz/#!g5x3hSLa!AAAAAAAAAAAtj2upDPBmFQAAAAAAAAAALY9rqQzwZhU (updated archive)

aw27

One approach to error checking would be to use one or more regular expressions for each Intel instruction.

For example, for the first part of a crc32 instruction (up to the comma):
(?i)(^crc32\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)|(((r|e)s|(r|e)d)i)|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\s*,\s*




There are regex libraries and regex-to-C converters (I have never tested them).
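
As a rough sketch of how such a "first part" check could feed a later stage, the C++ below matches the prefix with std::regex and returns whatever follows the comma, so that a second check can look at the remaining operand. The function name `match_crc32_prefix` is an assumption made for this example. The pattern is the one above without the leading (?i), because std::regex takes case-insensitivity as a flag.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true when `line` starts with a valid
// "crc32 <reg> ," prefix and stores the text after the comma in `rest`.
bool match_crc32_prefix(const std::string& line, std::string& rest)
{
    // Same pattern as in the post, minus the inline (?i) modifier;
    // std::regex_constants::icase makes the match case-insensitive.
    static const std::regex prefix(
        R"((^crc32\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)|(((r|e)s|(r|e)d)i)|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\s*,\s*)",
        std::regex_constants::icase);

    std::smatch m;
    if (!std::regex_search(line, m, prefix))
        return false;

    rest = m.suffix().str();   // unparsed remainder, e.g. the memory operand
    return true;
}
```

For `"crc32 eax , word ptr [rbx]"` the function succeeds and leaves `word ptr [rbx]` in `rest`, while `"crc32 bx, al"` is rejected because bx is not in the register list of the pattern.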

LiaoMi

Quote from: AW on August 08, 2019, 12:12:11 AM
One approach to error checking would be to use one or more regular expressions for each Intel instruction.

For example, for the first part of a crc32 instruction (up to the comma):
(?i)(^crc32\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)|(((r|e)s|(r|e)d)i)|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\s*,\s*

There are regex libraries and regex-to-C converters (I have never tested them).

I would like to be able to generate erroneous instructions as well, almost like fuzzing. In that case, though, you have to limit instruction generation in the settings; otherwise the limit on the number of displayed errors keeps you from seeing all of them. So far I have tested instructions with two operands:

r16/32/64;r/m16/32/64;;;
r/m16/32/64;r16/32/64;;;
r8;r/m8;;;

For the rest I need to add handlers to process the instructions:
imm8/16/32
xmm/m128
DIV;rDX;rAX;r/m16/32/64 // three operands
FDIVR;ST;m32real;;;
FDIVR;ST;STi;;;
FILD;ST;m32int;;;
FILD;ST;m16int;;;
FILD;ST;m64int;;;
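
A handler for these template lines could start from something like the C++ sketch below. It is an assumption-laden illustration, not the program's real parser: `split_fields` and `expand_sizes` are hypothetical names, fields are assumed to be separated by ';' with empty fields ignored, and a size list such as 16/32/64 is assumed to expand into one operand per size.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits a template line such as "r16/32/64;r/m16/32/64;;;" on ';'
// and drops the empty trailing fields.
std::vector<std::string> split_fields(const std::string& line)
{
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, ';')) {
        if (!field.empty())
            fields.push_back(field);
    }
    return fields;
}

// Expands the size list of one operand: "r/m16/32/64" -> r/m16, r/m32, r/m64.
// Operands without digits (ST, STi, rDX) are returned unchanged.
std::vector<std::string> expand_sizes(const std::string& operand)
{
    const std::size_t first_digit = operand.find_first_of("0123456789");
    if (first_digit == std::string::npos)
        return {operand};

    const std::string prefix = operand.substr(0, first_digit);
    std::vector<std::string> expanded;
    std::istringstream sizes(operand.substr(first_digit));
    std::string size;
    while (std::getline(sizes, size, '/'))
        expanded.push_back(prefix + size);
    return expanded;
}
```

With this, `r8;r/m8;;;` gives two fields that each expand to a single operand, and `m32real` stays as it is because it has only one size. Forms such as `mm - xmm/m64` would need their own handling.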


aw27

We can produce operand variations the same way you do it in Haskell, and we can group similar instructions and handle them in the same fashion, which reduces the number of regexes needed several times over.
BTW, I did not complete my crc32 regex because two other complementary regexes are needed: one for the REX case, the other to handle and validate values from memory.

The major problem is finding a nice regex C library. I tested SLRE; it is cute and small and builds fine with VS 2019, but it did not find matches with my test regex. It works fine with its own unit tests, though. There is probably a bug in there, but the code is small at around 400 lines, so it will not take long to find if it comes to that.

Anyway, these are only brainstorming ideas.

Yes, I understand your Haskell project is mostly suited to finding errors. It is a nice tool indeed.
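
To make the grouping idea a bit more concrete, here is a minimal C++ sketch that builds one pattern from shared pieces instead of writing a full regex per instruction. Everything in it is illustrative: `alternation`, `reg32_or_64` and `first_operand_pattern` are invented names, the register set simply mirrors the crc32 pattern posted earlier, and which mnemonics really share an operand form would have to come from the instruction database.

```cpp
#include <regex>
#include <string>
#include <vector>

// Joins alternatives into a non-capturing group: {"a", "b"} -> "(?:a|b)".
std::string alternation(const std::vector<std::string>& alternatives)
{
    std::string group = "(?:";
    for (std::size_t i = 0; i < alternatives.size(); ++i) {
        if (i != 0)
            group += '|';
        group += alternatives[i];
    }
    return group + ")";
}

// Register class reused by every instruction in the group: eax..edx, rax..rdx,
// esi/edi, rsi/rdi and r8..r15 with an optional d suffix.
std::string reg32_or_64()
{
    return alternation({"[re](?:[abcd]x|[sd]i)", "r(?:8|9|1[0-5])d?"});
}

// One pattern for the "mnemonic reg ," prefix of all mnemonics in the group.
std::regex first_operand_pattern(const std::vector<std::string>& mnemonics)
{
    return std::regex("^" + alternation(mnemonics) + R"(\s+)" + reg32_or_64() + R"(\s*,\s*)",
                      std::regex_constants::icase);
}
```

A call such as `first_operand_pattern({"crc32", "popcnt"})` then covers both mnemonics with one compiled pattern; adding an instruction to the group is a one-word change rather than a new regex.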


fearless

You could try the PCRE library for regex; Biterider has a compiled version of it in ObjAsm Beta 2 for both x86 and x64 (version 841S).

aw27

A final note about regex for ASM instruction syntax checking.

The C implementations I have seen appear buggy, despite being cute and small.
The C++ Boost library provides a robust regex implementation. It is in C++ but can be called from C through a small wrapper.
I tested the same regex I used previously and it worked as expected.

1- C source

#include "common.h"

const char regexp[] = "(?i)(^crc32\\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)|(((r|e)s|(r|e)d)i)|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\\s*,\\s*";
const char instructs[NUMBER_OF_STRING][MAX_STRING_SIZE] = { "crc32  ecx,", "CRC32 esi,  ", "crc32  r10d  , ", "cRC32 r15 , ","Crr32 rdi,", "CrC32 rdi,", "crc32 bx,", "crc32  ebx , " };

int main()
{
dotest(instructs, regexp);
}



2- Header

#pragma once
#define NUMBER_OF_STRING 8
#define MAX_STRING_SIZE 40

#ifdef __cplusplus
extern "C" {
#endif
int dotest(const char strArray[][MAX_STRING_SIZE], const char* pattern);
#ifdef __cplusplus
}
#endif


3- C++ file

#include <boost/regex.hpp>
#include <string>
#include <iostream>
#include "common.h"

using namespace std;

int dotest(const char strArray[][MAX_STRING_SIZE], const char* pattern)
{
    boost::regex pat(pattern);
    boost::smatch matches;
    for (int i = 0; i < NUMBER_OF_STRING; i++)
    {
        string str(strArray[i]);
        if (boost::regex_match(str, matches, pat))
            cout << matches[0] << "\t\t matches" << endl;
        else
            cout << str << "\t\t does not match." << endl;
    }
    return 0;
}


Output:

crc32  ecx,              matches
CRC32 esi,               matches
crc32  r10d  ,           matches
cRC32 r15 ,              matches
Crr32 rdi,               does not match.
CrC32 rdi,               matches
crc32 bx,                does not match.
crc32  ebx ,             matches


It is also possible to use std::regex instead of the Boost regex, but its pattern syntax does not support the inline (?i) case-insensitivity modifier. We need to set a flag for that when constructing the regular expression. Otherwise it works fine too.


TimoVJL

PCRE works with that pattern, so it works from both asm and C.
PCRE 6.4 with Pelles C:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//#include <pcre.h>
#pragma comment(lib, "pcre3s.lib")
typedef void* pcre; // opaque
typedef void* pcre_extra; // fake, not used
pcre __cdecl *pcre_compile(const char *, int, const char **, int *, const unsigned char *);
int __cdecl pcre_exec(const pcre *, const pcre_extra *, const char *, int, int, int, int *, int);

const char regexp[] = "(?i)(^crc32\\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)|(((r|e)s|(r|e)d)i)|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\\s*,\\s*";
const char *instructs[] = { "crc32  ecx,", "CRC32 esi,  ", "crc32  r10d  , ", "cRC32 r15 , ","Crr32 rdi,", "CrC32 rdi,", "crc32 bx,", "crc32  ebx , " };
#define OVECCOUNT 30    /* should be a multiple of 3 */
int main(int argC, char *argV[])
{
    pcre *re;
    const char *error;
    int erroffset;
    int ovector[OVECCOUNT];
    int rc;

    /* compile the pattern once, then run it against every test string */
    re = pcre_compile(regexp, 0, &error, &erroffset, NULL);
    if (re) {
        printf("pcre_compile\n");
        for (int i = 0; i < 8; i++) {
            int len = (int)strlen(instructs[i]);
            /* rc >= 0 means a match; negative values are PCRE error codes */
            rc = pcre_exec(re, NULL, instructs[i], len, 0, 0, ovector, OVECCOUNT);
            if (rc >= 0) printf("%s\t\t matches\n", instructs[i]);
            else printf("%s\t\t does not match.\n", instructs[i]);
        }
        /* pcre_free defaults to free(), so this works with the default allocator */
        free(re);
    }
    return 0;
}


EDIT: With T-Rex, remove (?i), as it doesn't support it.
The T-Rex link was in "Benchmark of Regex Libraries".
May the source be with you

LiaoMi

At the moment the program can only generate instructions with two operands. Instructions with one and three operands are not yet supported, but that is easy to finish. The check boxes also don't work yet. To generate commands, select an instruction with two operands and click Generate; a file with the source code will appear in the program folder.

The following parameters and their modifications are supported...
r16/32/64;r/m16/32/64;;;
r/m16/32/64;r16/32/64;;;
r8;r/m8;;;
imm8/16/32/64/128
xmm/m8/16/32/64/128
mm - xmm/m64

unsupported instructions ...
in eax, dx - with two operands - modifications with the specified register
DIV;rDX;rAX;r/m16/32/64 // three operands, rDX;rAX - modifications with the specified register
FDIVR;ST;m32real;;; - m32real, m64real, m80real
FDIVR;ST;STi;;;
FILD;ST;m32int;;;
FILD;ST;m16int;;;
FILD;ST;m64int;;;

Known bug: movq mm/m64, mm has an inheritance error in the variable; it will be fixed in the next version. There are a couple of extra characters in the file with the command forms; I have not cleaned that file up yet.

aw27

Is there a more up-to-date PCRE for Windows?
However, the std::regex version builds to only 55 KB (the Boost regex version to 145 KB).
This is the std::regex variation:


/*
Pattern passed (no inline (?i); case-insensitivity is set with the icase flag):
const char regexp[] ="(^crc32\\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)|(((r|e)s|(r|e)d)i)|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\\s*,\\s*";
*/
#include <regex>
#include <string>
#include <iostream>
#include "common.h"

using namespace std;

int dotestStd(const char strArray[][MAX_STRING_SIZE], const char* pattern)
{
    // std::regex takes case-insensitivity as a constructor flag
    regex pat(pattern, regex_constants::icase);
    smatch matches;

    cout << "\nStd Regex" << endl;
    for (int i = 0; i < NUMBER_OF_STRING; i++)
    {
        string str(strArray[i]);
        if (regex_match(str, matches, pat))
            cout << matches[0] << "\t\t matches" << endl;
        else
            cout << str << "\t\t does not match." << endl;
    }
    return 0;
}

LiaoMi

The database structure has been updated and AVX, AVX2, FMA, BMI, etc. instructions have been added. Apart from this, nothing has changed. The archive can be downloaded from the message above.

aw27

Almost invisible, but there is also PCRE2 10.33 for Windows.
Lots of things are different, but it looks like the way to go if you need to become a RegEx GrandMaster using the latest developments.

This is Timo's source, revised to build with PCRE2 10.33.


#define PCRE2_STATIC
#define PCRE2_CODE_UNIT_WIDTH 8
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "config.h"
#include "pcre2.h"
#pragma comment(lib, "libpcre2-8-static.lib")

const char regexp[] = "(?i)(^crc32\\s+((((r|e)a|(r|e)b|(r|e)c|(r|e)d)x)|(((r|e)s|(r|e)d)i)|((r8|r9|r10|r11|r12|r13|r14|r15)d?)))\\s*,\\s*";
const char* instructs[] = { "crc32  ecx,", "CRC32 esi,  ", "crc32  r10d  , ", "cRC32 r15 , ","Crr32 rdi,", "CrC32 rdi,", "crc32 bx,", "crc32  ebx , " };
int main(int argC, char* argV[])
{
    pcre2_code* re;
    pcre2_match_data* match_data;
    int errorcode;          /* PCRE2 reports a numeric error code, not a string */
    PCRE2_SIZE erroffset;
    PCRE2_SIZE len;
    int rc;
    int i;

    re = pcre2_compile((PCRE2_SPTR)regexp, PCRE2_ZERO_TERMINATED, 0, &errorcode, &erroffset, NULL);
    if (re) {
        printf("pcre2_compile\n");
        /* create the match data only after the pattern compiled successfully */
        match_data = pcre2_match_data_create_from_pattern(re, NULL);
        for (i = 0; i < 8; i++) {
            len = strlen(instructs[i]);

            rc = pcre2_match(re, (PCRE2_SPTR)instructs[i], len, 0, 0, match_data, NULL);

            if (rc >= 0) printf("%s\t\t matches\n", instructs[i]);
            else printf("%s\t\t does not match.\n", instructs[i]);
        }
        /* release with the PCRE2 functions rather than free() */
        pcre2_match_data_free(match_data);
        pcre2_code_free(re);
    }
    return 0;
}


The x86 .exe is 230 KB; libpcre2-8-static.lib is 4 MB.

LiaoMi

Partial support for AVX/AVX2 instructions has been added, along with partial support for single-operand commands. Some one-operand and two-operand commands require type definitions (for example imm32u), so these commands may not be generated. Partial processing of instructions with precisely defined registers has been added; it affects the xmm instructions, while base registers do not work yet. Instruction search will work in the next update. Instructions with three and four operands will be added at the final stage.

You can report bugs: just give the line number from the list of commands, and if you have a listing with incorrect instructions, you can copy the erroneous text. Once all the instructions are checked, it will be possible to activate multi-generation, for 20 or more instructions at a time. The update is in the message above.

P.S. If you try all the instructions in a row, you can find out which ones work and which don't.

VMOVUPS ymm1 , ymmword ptr [4 * r15 + r15] 
VMOVUPS ymm1 , ymmword ptr [r15 + r15] 
VMOVUPS ymm1 , ymmword ptr [4 * r15 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r15 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r14 + r14] 
VMOVUPS ymm1 , ymmword ptr [r14 + r14] 
VMOVUPS ymm1 , ymmword ptr [4 * r14 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r14 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r13 + r13] 
VMOVUPS ymm1 , ymmword ptr [r13 + r13] 
VMOVUPS ymm1 , ymmword ptr [4 * r13 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r13 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r12 + r12] 
VMOVUPS ymm1 , ymmword ptr [r12 + r12] 
VMOVUPS ymm1 , ymmword ptr [4 * r12 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r12 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r11 + r11] 
VMOVUPS ymm1 , ymmword ptr [r11 + r11] 
VMOVUPS ymm1 , ymmword ptr [4 * r11 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r11 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r10 + r10] 
VMOVUPS ymm1 , ymmword ptr [r10 + r10] 
VMOVUPS ymm1 , ymmword ptr [4 * r10 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r10 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r9 + r9] 
VMOVUPS ymm1 , ymmword ptr [r9 + r9] 
VMOVUPS ymm1 , ymmword ptr [4 * r9 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r9 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * r8 + r8] 
VMOVUPS ymm1 , ymmword ptr [r8 + r8] 
VMOVUPS ymm1 , ymmword ptr [4 * r8 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [r8 + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rbp + rbp] 
VMOVUPS ymm1 , ymmword ptr [rbp + rbp] 
VMOVUPS ymm1 , ymmword ptr [4 * rbp + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rbp + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rsp + rbp] 
VMOVUPS ymm1 , ymmword ptr [rsp + rbp] 
VMOVUPS ymm1 , ymmword ptr [4 * rsp + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rsp + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rdi + rdi] 
VMOVUPS ymm1 , ymmword ptr [rdi + rdi] 
VMOVUPS ymm1 , ymmword ptr [4 * rdi + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rdi + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rsi + rsi] 
VMOVUPS ymm1 , ymmword ptr [rsi + rsi] 
VMOVUPS ymm1 , ymmword ptr [4 * rsi + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rsi + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rdx + rdx] 
VMOVUPS ymm1 , ymmword ptr [rdx + rdx] 
VMOVUPS ymm1 , ymmword ptr [4 * rdx + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rdx + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rcx + rcx] 
VMOVUPS ymm1 , ymmword ptr [rcx + rcx] 
VMOVUPS ymm1 , ymmword ptr [4 * rcx + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rcx + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rbx + rbx] 
VMOVUPS ymm1 , ymmword ptr [rbx + rbx] 
VMOVUPS ymm1 , ymmword ptr [4 * rbx + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rbx + 123456h] 
VMOVUPS ymm1 , ymmword ptr [4 * rax + rax] 
VMOVUPS ymm1 , ymmword ptr [rax + rax] 
VMOVUPS ymm1 , ymmword ptr [4 * rax + 123456h] 
VMOVUPS ymm1 , ymmword ptr [rax + 123456h] 
VMOVUPS ymm1 , ymmword ptr [ DataLAbelValue ] 
VMOVUPS ymm1 , ymm2 

TimoVJL

A note:
The Opcodes project has an opcode database in XML format.
Creating a CSV file from it could be useful, even though that database has its own limits.

asmdb is another one, in JS/JSON format.

May the source be with you