UASM 2.48 Pre Release

Started by johnsa, February 14, 2019, 10:19:30 AM


johnsa

Hi all,

2.48 is ready for some user-testing :)

http://www.terraspace.co.uk/uasm248_x64.zip
http://www.terraspace.co.uk/uasm248_x86.zip

A few small changes:

Added support for MOVABS instruction.
Fixed declaration of .data RECORD inside proc (http://masm32.com/board/index.php?topic=7650.0).
Fixed a number of evex encoding issues.
Prevent a macro expansion from not resetting evex processing state.
General code refactoring and optimisation.
Added support for EVEX decorators suffixing a variable name and not just a raw memory address.
Added support for {EVEX} promotion of AVX instructions.
Add warnings about using BSS type data in plain bin outputs.

However there is one major new change:
A completely new code-gen engine.
Over time we've had a lot of issues patching and fixing up the existing code-gen, which has been inherited over the years through wasm and jwasm; it was just not fit for purpose.
It was never properly designed to deal with REX, VEX and EVEX encodings, so it was time for a complete rewrite into a clean, robust, logical structure with a lot more detail contained in the new instruction table.
In theory, as an end-user you won't notice anything different. If the new code-gen doesn't yet have a particular instruction, it simply falls back to the existing code-gen.
The new code-gen, however, allows us to support much more robust error and type checking, and is modular enough that in future it may even allow additional architecture targets.

Please give it a try; we'll gather all feedback and, if all is good, will make the packages available.
Source code is under the 2.48a branch on GitHub until release to master.

John


TimoVJL

EDIT: 2.48 patches for PellesC.
May the source be with you


jj2007

Tested successfully with my major sources :t

LiaoMi

It is possible that 'else' block was forgotten or commented out, thus altering the program's operation logics. invoke.c 2625

else
{
if (info->stackOfs % 16 != 0)
BuildCodeLine(info->stackOps[info->stackOpCount++], "%s ymmword ptr [%r+%u], ymm8", MOVE_UNALIGNED_INT, T_RSP, NUMQUAL info->stackAdj);
else

BuildCodeLine(info->stackOps[info->stackOpCount++], "sub %r, 32", T_RSP);
info->stackOfs += 32;
BuildCodeLine(info->stackOps[info->stackOpCount++], "%s ymm8, ymmword ptr %s", MOVE_UNALIGNED_INT, paramvalue);
}
return(1);


The variable 'i' is being used for this loop and for the outer loop. Check lines: 581, 710. preproc.c 710

The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 4533, 4535. proc.c 4533
for (regs = info->regslist, cnt = *regs++; cnt; cnt--, regs++)
    if (GetValueSp(*regs) & OP_XMM)
        cntxmm++;
    else if (GetValueSp(*regs) & OP_YMM)
        cntxmm += 2;
    else if (GetValueSp(*regs) & OP_YMM)
        cntxmm += 4;
    else
        cntstd++;
}
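
In the snippet above the second and third branches test the same flag, so the `+= 4` branch can never run. A minimal sketch of what the chain may have been meant to do, assuming the third test was intended for a wider register class. The flag names and values (`REG_XMM`, `REG_YMM`, `REG_ZMM`, `REG_STD`) and the function `count_xmm_slots` are hypothetical stand-ins, not UASM definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register-class flags; the values are illustrative only. */
enum { REG_STD = 1, REG_XMM = 2, REG_YMM = 4, REG_ZMM = 8 };

/* Counts 16-byte save slots for a list of register classes.
 * Each branch tests a different flag, so every branch is reachable. */
int count_xmm_slots(const int *classes, size_t n, int *cntstd)
{
    int cntxmm = 0;

    assert(classes != NULL && cntstd != NULL);
    *cntstd = 0;
    for (size_t i = 0; i < n; i++) {
        if (classes[i] & REG_XMM)
            cntxmm += 1;
        else if (classes[i] & REG_YMM)
            cntxmm += 2;
        else if (classes[i] & REG_ZMM)   /* assumed intent of the third branch */
            cntxmm += 4;
        else
            (*cntstd)++;
    }
    return cntxmm;
}
```

With one register of each class, the three vector branches contribute 1, 2 and 4 slots respectively, which the duplicated condition in the original could never produce.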


It is possible that 'else' block was forgotten or commented out, thus altering the program's operation logics. types.c 1720
if ( cntBits > 16 ) {
        if ( cntBits > 64 ) {
            newr->sym.total_size = 128 ;
            newr->sym.mem_type = MT_OWORD;
        } else

        if ( cntBits > 32 ) {
            newr->sym.total_size = sizeof( uint_64 );
            newr->sym.mem_type = MT_QWORD;
        } else {
            newr->sym.total_size = sizeof( uint_32 );
            newr->sym.mem_type = MT_DWORD;
        }
    } else if ( cntBits > 8 ) {
        newr->sym.total_size = sizeof( uint_16 );
        newr->sym.mem_type = MT_WORD;
    } else {
        newr->sym.total_size = sizeof( uint_8 );
        newr->sym.mem_type = MT_BYTE;
    }


There are identical sub-expressions '(CodeInfo->token == T_VEXTRACTPS)' to the left and to the right of the '||' operator. codegen.c 1488
(CodeInfo->token == T_VEXTRACTPS)|| (CodeInfo->token == T_VEXTRACTPS)){

Possible overflow. Consider casting operands of the 'sym->offset + (int) StackAdj' operator to the 'int_64' type, not the result. expreval.c 1295
opnd->llvalue = (int_64)(sym->offset + (int)StackAdj);
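
The point of this warning is where the widening happens: casting the result widens a sum that has already been computed in 32 bits. A small sketch of the difference, assuming 32-bit operands. It uses unsigned types so the wraparound is well defined (the quoted line uses signed `int`, where overflow is undefined behaviour), and the function names are made up for illustration:

```c
#include <stdint.h>

/* Casts after the 32-bit addition: the sum has already wrapped. */
int64_t widen_result(uint32_t offset, uint32_t adj)
{
    return (int64_t)(offset + adj);
}

/* Casts the operands first: the addition itself is done in 64 bits. */
int64_t widen_operands(uint32_t offset, uint32_t adj)
{
    return (int64_t)offset + (int64_t)adj;
}
```

For `offset = 0xFFFFFFF0` and `adj = 0x20`, the first form yields `0x10` while the second yields `0x100000010`. Whether the quoted line can ever see such values depends on the ranges of `sym->offset` and `StackAdj`, which this thread does not state.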

The 'memcpy' function doesn't copy the whole string. Use 'strcpy / strcpy_s' function to preserve terminal null. hll.c 1184
memcpy(p, px, strlen(px));
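
Whether this one is a real defect depends on intent: copying `strlen(px)` bytes is fine when overwriting part of an existing string in place, but it leaves the destination unterminated if `p` is meant to become a complete C string. A minimal sketch of the terminating variant, with a hypothetical helper name and an explicit capacity check added as an assumption:

```c
#include <assert.h>
#include <string.h>

/* Copies src into dst including the terminating '\0'.
 * dst_size is the capacity of dst.
 * Returns 0 on success, -1 if src (plus terminator) does not fit. */
int copy_with_terminator(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (len + 1 > dst_size)
        return -1;
    memcpy(dst, src, len + 1);   /* +1 copies the terminator as well */
    return 0;
}
```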

The '_alloca' function is not declared. Passing data to or from this function can be affected. listing.c 279
struct lstleft *next = myalloca( sizeof( struct lstleft ) );

johnsa


Fixed.
2 of those are valid and fine, the others have been corrected.

The msvc2003 patches have been applied.

In branch v2.48b

LiaoMi

The potential null pointer is passed into '_fread' function. Inspect the first argument. Check lines: 272, 269. directiv.c 272

/* v2.14 : Get File Size */
fseek( file, 0L, SEEK_END );
sz = ftell( file ) - fileoffset; // sz = total data size to load into segment/section.
fseek( file, 0L, SEEK_SET );
pBinData = (unsigned char*)malloc(sz);
if ( fileoffset )
fseek( file, fileoffset, SEEK_SET );  /* fixme: use fseek64() */
result = fread(pBinData, sz, 1, file);
OutputBinBytes(pBinData, sz);
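
The warning here is about `malloc` returning NULL and that pointer going straight into `fread`. A hedged sketch of the same "read from an offset to the end of the file" step with the checks added. `read_tail` is a hypothetical helper, not a UASM function, and it only uses standard C library calls:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads everything from `offset` to the end of the file into a new buffer.
 * Returns NULL on any failure; on success *out_size holds the byte count
 * and the caller frees the buffer. */
unsigned char *read_tail(FILE *file, long offset, size_t *out_size)
{
    long end;
    size_t sz;
    unsigned char *data;

    assert(file != NULL && out_size != NULL);
    if (fseek(file, 0L, SEEK_END) != 0)
        return NULL;
    end = ftell(file);
    if (end < 0 || offset < 0 || offset > end)
        return NULL;

    sz = (size_t)(end - offset);
    data = malloc(sz ? sz : 1);          /* avoid the malloc(0) ambiguity */
    if (data == NULL)
        return NULL;                     /* the check the warning asks for */

    if (fseek(file, offset, SEEK_SET) != 0 ||
        (sz > 0 && fread(data, 1, sz, file) != sz)) {
        free(data);
        return NULL;
    }
    *out_size = sz;
    return data;
}
```

Reading with an element size of 1 makes the return value of `fread` a byte count, so a short read can be told apart from a complete one.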


The 'alloca' function is used inside the loop. This can quickly overflow stack. listing.c 279
case LSTTYPE_TMACRO:
        ll.buffer[1] = '=';
        for ( p1 = sym->string_ptr, p2 = &ll.buffer[3], pll = &ll; *p1; ) {
            if ( p2 >= &pll->buffer[28] ) {
                struct lstleft *next = myalloca( sizeof( struct lstleft ) );
                pll->next = next;
                pll = next;
                pll->next = NULL;
                memset( pll->buffer, ' ', sizeof( pll->buffer) );
                p2 = &pll->buffer[3];
            }
            *p2++ = *p1++;
        }
        break;


The potential null pointer is passed into 'strcpy' function. Inspect the first argument. Check lines: 331, 330. macrolib.c 331
for (j = 0; j < macroLen[i]; j++)
{
    srcLines[j] = (char *)malloc(MAX_LINE_LEN);
    strcpy(srcLines[j], macCode[(start_pos + j)]);
}


It is possible that 'break' statement is missing in switch statement. invoke.c 4148
case 8:
#if AMD64_SUPPORT
if ((ModuleInfo.curr_cpu & P_CPU_MASK) >= P_64)
break;
#endif
/* v2.06: added support for double constants */
if (opnd.kind == EXPR_CONST || opnd.kind == EXPR_FLOAT) {
AddLineQueueX(" pushd %r (%s)", T_HIGH32, fullparam);
qual = T_LOW32;
instr = "d";
break;
}
default:
DebugMsg1(("PushInvokeParm(%u): error, CONST, asize=%u, psize=%u, pushsize=%u\n",
reqParam, asize, psize, pushsize));
EmitErr(INVOKE_ARGUMENT_TYPE_MISMATCH, reqParam + 1);


The 'alloca' function is used inside the loop. This can quickly overflow stack. macro.c 499
The 'alloca' function is used inside the loop. This can quickly overflow stack. macro.c 834

A call of the 'sprintf' function will lead to overflow of the buffer 'buffer + strlen(buffer)'. proc.c 2290
for (i = unw_info.CountOfCodes; i; i--) {
/* v2.11: use field FrameOffset */
//sprintf( buffer + strlen( buffer ), "%s 0%xh", pfx, unw_code[i-1] );
sprintf(buffer + strlen(buffer), "%s 0%xh", pfx, unw_code[i - 1].FrameOffset);
pfx = ",";
if (i == 1 || strlen(buffer) > 72) {
AddLineQueue(buffer);
buffer[0] = NULLC;
pfx = "dw";
}
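
One common way to address this kind of warning is to pass the remaining capacity to `snprintf` instead of appending with `sprintf`. A minimal sketch under that assumption; `append_code` and its parameters are hypothetical, and the format string simply mirrors the quoted one:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Appends "<pfx> 0<hex>h" to buf without writing past cap bytes.
 * Returns 0 on success, -1 if the text would not fit; on failure the
 * previous contents of buf are kept and stay terminated. */
int append_code(char *buf, size_t cap, const char *pfx, unsigned value)
{
    size_t used;
    int n;

    assert(buf != NULL && pfx != NULL);
    used = strlen(buf);
    if (used >= cap)
        return -1;
    n = snprintf(buf + used, cap - used, "%s 0%xh", pfx, value);
    if (n < 0 || (size_t)n >= cap - used) {
        buf[used] = '\0';   /* drop the truncated tail */
        return -1;
    }
    return 0;
}
```

A caller can flush the line and retry when the function reports that the next item does not fit, which is roughly what the `strlen(buffer) > 72` test in the quoted loop tries to approximate.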


Numeric Truncation Error. Return value of the 'strlen' function is written to the 8-bit variable. symbols.c 557
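
To illustrate what this diagnostic guards against: `strlen` returns a `size_t`, and storing it in an 8-bit field silently keeps only the low byte. A small sketch with a range check, using a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stores a name length in an 8-bit field only when it fits.
 * Returns 0 on success, -1 if the length would be truncated. */
int store_name_len(const char *name, uint8_t *out_len)
{
    size_t len;

    assert(name != NULL && out_len != NULL);
    len = strlen(name);
    if (len > UINT8_MAX)
        return -1;             /* e.g. 300 would otherwise be stored as 44 */
    *out_len = (uint8_t)len;
    return 0;
}
```

If the surrounding code already limits identifier length to 255 or less, the truncation cannot happen and the warning is only a reminder of that invariant.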

It is possible that 'break' statement is missing in switch statement. tokenize.c 649
    case '/' : /* 0x2F: binary operator */
        minuslbl:
/* all of these are themselves a token */
        p->input++;
        buf->token = symbol;
        buf->specval = 0; /* initialize, in case the token needs extra data */
        /* v2.06: use constants for the token string */
        buf->string_ptr = (char *)&stokstr1[symbol - '('];
        break;
    case '[' : /* T_OP_SQ_BRACKET operator - needs a matching ']' (0x5B) */
      a = '[';
    case ']' : /* T_CL_SQ_BRACKET (0x5D) */
        p->input++;


The potential null pointer is passed into '_fseek_nolock' function. Inspect the first argument. symbols.c 1023
It is possible that 'break' statement is missing in switch statement. tbyte.c 439

It is possible that 'break' statement is missing in switch statement. parser.c 3542
It is possible that 'break' statement is missing in switch statement. parser.c 3529
It is possible that 'break' statement is missing in switch statement. parser.c 2496
It is possible that 'break' statement is missing in switch statement. parser.c 2481
It is possible that 'break' statement is missing in switch statement. parser.c 299
It is possible that 'break' statement is missing in switch statement. data.c 999
It is possible that 'break' statement is missing in switch statement. condasm.c 628
It is possible that 'break' statement is missing in switch statement. branch.c 423
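
Many "missing break" reports turn out to be deliberate fall-through. Marking the intentional cases makes the remaining ones easier to review. A minimal sketch of a cumulative switch with every fall-through annotated; the feature bits and the function name are hypothetical and only illustrate the pattern:

```c
/* Hypothetical feature bits, for illustration only. */
enum { F_SSE1 = 1, F_SSE2 = 2, F_SSE3 = 4, F_SSE4 = 8 };

/* Returns the cumulative feature mask for a level in 1..4, or 0 otherwise.
 * Every fall-through is deliberate and marked, so a missing `break`
 * elsewhere stands out as a real defect. */
unsigned features_for_level(int level)
{
    unsigned mask = 0;

    switch (level) {
    case 4: mask |= F_SSE4; /* fall through */
    case 3: mask |= F_SSE3; /* fall through */
    case 2: mask |= F_SSE2; /* fall through */
    case 1: mask |= F_SSE1;
        break;
    default:
        break;
    }
    return mask;
}
```

Some compilers and analysers recognise a `/* fall through */` comment and suppress the diagnostic for that case, but the exact comment forms accepted vary by tool.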

Adamanteus

Yes, I found where the .xmm directive is handled: cpumodel.c 489

#if DOT_XMMARG
    if ( tokenarray[i].tokval == T_DOT_XMM && tokenarray[i+1].token == T_FINAL ) {
        struct expr opndx;
        i++;
        if ( EvalOperand( &i, tokenarray, Token_Count, &opndx, 0 ) == ERROR )
            return( ERROR );
        if ( opndx.kind != EXPR_CONST || opndx.value < 1 || opndx.value > 4 ) {
opndx.value = 4;
        }
if ((ModuleInfo.curr_cpu & P_686) != P_686)
            return EmitErr(CPU_OPTION_INVALID, tokenarray[i - 1].string_ptr);
        newcpu = ~P_SSEALL;
        switch ( opndx.value ) {
        case 4: newcpu |= P_SSE4;
        case 3: newcpu |= P_SSE3|P_SSSE3;
        case 2: newcpu |= P_SSE2;
        case 1: newcpu |= P_SSE1; break;
        }
    } else
#endif

TimoVJL

Things to check:
codegen.c 995: is this line correct? It uses a single '|':
                /* fix v2.46 */
                if (CodeInfo->token == T_VCVTPS2PD || CodeInfo->token == T_VCVTPH2PS ||
                    CodeInfo->token == T_VCVTPS2PH | CodeInfo->token == T_VCVTQQ2PD){
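
On the single `|` question: `==` binds tighter than `|`, and `|` binds tighter than `||`, so the condition parses as `A || B || (C | D)`. Because each comparison yields 0 or 1, the truth value is the same as with `||`; only the short-circuit on the last comparison is lost. A small sketch that checks this equivalence, with plain `int` tokens standing in for the real token constants:

```c
#include <stdbool.h>

/* Mixed form, parenthesised the way the compiler parses the quoted line. */
bool any_mixed(int tok, int a, int b, int c, int d)
{
    return tok == a || tok == b || ((tok == c) | (tok == d));
}

/* Form with logical OR throughout. */
bool any_logical(int tok, int a, int b, int c, int d)
{
    return tok == a || tok == b || tok == c || tok == d;
}
```

So the line looks harmless as written, though switching to `||` would match the rest of the condition and silence the question.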

parser.c 3064:
memset(&CodeInfo, 0, sizeof(CodeInfo));
...
3336: unnecessary zeroing again?
CodeInfo.prefix.rex         = 0
...
May the source be with you

Adamanteus

To process the .MODEL directive in Win64 mode, it looks like cpumodel.c line 280 needs to be as follows:

if (ModuleInfo.sub_format == SFORMAT_64BIT)
ModuleInfo.curr_cpu = (ModuleInfo.curr_cpu & ~P_CPU_MASK) | P_64;
    if (index >= 0) {
        if (ModuleInfo.model != MODEL_NONE) {

- and assemble.c line 810 :

            /* model = MODEL_FLAT; */
            if (ModuleInfo.langtype == LANG_NONE && Options.output_format == OFORMAT_COFF)
                ModuleInfo.langtype = LANG_FASTCALL;

LiaoMi

Maybe our developers will find this useful for documentation: a complete static analysis of the UASM 2.49 source code, in two versions, a quick analysis and a comprehensive in-depth one. In the report you can jump interactively from error to error; the structure is similar to an HTML e-book. Items marked green are clarifications that just deserve attention. Red marks potential errors. Not every flagged item is a real error, but the critical places are identified plausibly.

Download link https://mega.co.nz/#!1x4E0QrZ!AAAAAAAAAACr749Iz959pQAAAAAAAAAAq--PSM_efaU



Potential errors may be interdependent (CodeGenV2) ...