HJWasm Macro Library Suggestions

Started by johnsa, March 31, 2017, 08:00:16 AM


hutch--

> which can be a waste of memory when they don't contain SIMD instructions

This is not the case as stack memory is ALREADY allocated and all you are doing for the duration of the procedure and any further nested procedures is offsetting the stack usage until the instruction sequence returns to the caller. Now it could be a problem if you were writing a highly recursive algorithm that progressively used a large amount of stack address space but the solution to that problem would be to either set a large stack in the linker OR set a recursion depth limiter OR both.

The simple answer is you cannot waste what has already been wasted.
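
To put a number on the cost being debated: rounding a procedure's locals up to a 16-byte boundary adds at most 15 bytes of already-reserved stack for the duration of the call. A minimal sketch in C, where `align_up` and `frame_padding` are hypothetical helper names for illustration and are not taken from HJWasm:

```c
#include <assert.h>
#include <stddef.h>

/* Rounds size up to the next multiple of alignment (must be a power of two). */
static size_t align_up(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Padding bytes added when a frame holding `locals` bytes is rounded up to 16. */
static size_t frame_padding(size_t locals)
{
    return align_up(locals, 16) - locals;
}
```

For example, `frame_padding(40)` is 8 and `frame_padding(48)` is 0, so the padding only starts to add up when the procedure recurses deeply.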

aw27

Quote from: hutch-- on April 05, 2017, 05:54:09 PM
> > which can be a waste of memory when they don't contain SIMD instructions
>
> This is not the case as stack memory is ALREADY allocated and all you are doing for the duration of the procedure and any further nested procedures is offsetting the stack usage until the instruction sequence returns to the caller. Now it could be a problem if you were writing a highly recursive algorithm that progressively used a large amount of stack address space but the solution to that problem would be to either set a large stack in the linker OR set a recursion depth limiter OR both.
>
> The simple answer is you cannot waste what has already been wasted.

For ML64 you have no better alternative, you really have to do it with macros.

hutch--

 :biggrin:

That's the price of writing in a MACRO assembler: you can.  :P

jj2007

Quote from: hutch-- on April 05, 2017, 05:54:09 PM
> it could be a problem if you were writing a highly recursive algorithm

Right, although rarely relevant. Another argument is cache use, of course.

Not all XMMWORDs need align 16. On modern CPUs, movups and movaps are equally fast. Perhaps the coder could decide (with an option) if a misaligned XMMWORD should issue a warning, or throw an error.

johnsa

The case is more that movups will be "equally" fast if the data item happens to be aligned, but slower if not.

So movups gives you something as performant as movaps when the data is aligned, but doesn't explode in a heap when it is unaligned.

basically it works the way the damn thing should have in the first place and there should never have been an aligned/unaligned variant :) imho

aw27

Quote from: johnsa on April 05, 2017, 08:53:53 PM
> basically it works the way the damn thing should have in the first place and there should never have been an aligned/unaligned variant :) imho

It gave jobs to C++ programmers at Microsoft who invented data types to pass data already aligned.

jj2007

Yes, my wording was sloppy - and I fully agree with both of you :bgrin:

Anyway, warning or error for misaligned local xmmwords should be an option. While movups can replace movaps with no price to pay, there are many SIMD instructions that blow up when given an unaligned memory operand.
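
A rough sketch of what such an option could look like, written in C since that is what the assembler itself is written in. Everything here is an assumption for illustration: the `align_policy` enum, `is_aligned` and `check_xmmword_local` are hypothetical names and do not exist in HJWasm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical policy values; not an existing HJWasm option. */
enum align_policy { ALIGN_IGNORE, ALIGN_WARN, ALIGN_ERROR };

/* Nonzero when offset is a multiple of alignment (must be a power of two). */
static int is_aligned(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset & (alignment - 1)) == 0;
}

/*
 * Checks one 16-byte local at the given frame offset.
 * Returns 0 when assembly may continue, -1 when the policy
 * makes a misaligned local a fatal error.
 */
static int check_xmmword_local(const char *name, size_t offset,
                               enum align_policy policy)
{
    if (is_aligned(offset, 16) || policy == ALIGN_IGNORE)
        return 0;
    fprintf(stderr, "%s: XMMWORD local '%s' at offset %zu is not 16-byte aligned\n",
            policy == ALIGN_ERROR ? "error" : "warning", name, offset);
    return policy == ALIGN_ERROR ? -1 : 0;
}
```

With `ALIGN_WARN` a local at offset 24 only produces a message, while `ALIGN_ERROR` makes the same local stop the build, which leaves the decision to the coder.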

Adamanteus

My variant for increasing the macro library's flexibility and capabilities. Only basic improvements are implemented: Win16/32/64 universality, and more classes of macros, which can be switched on and off all at once or even individually by macro name, using the mlib and nomlib command-line options :eusa_boohoo:
So as not to create substitutes for macro library names, it may be better to change the code that searches for symbols:

cmdline.c :

line 42 :

#include "macrolib.h"

line 410 :

static void OPTQUAL Set_NOMLIB(void)
{
#if defined ML_SWN
    if (*OptName) noAutoMacrosAdd(OptName + 1);
    else
#endif
    Options.nomlib = TRUE;
}

static void OPTQUAL Set_MLIB(void)
{
#if defined ML_SWN
    if (*OptName) inAutoMacrosAdd(OptName + 1);
    else
#endif
    Options.nomlib = FALSE;
}

line 611 :

{ "nomlib=@", 0,      Set_NOMLIB },
{ "mlib=@", 0,        Set_MLIB },
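
The bodies of noAutoMacrosAdd and inAutoMacrosAdd are not shown in this post, so the following is only a guess at the kind of bookkeeping they might do: two name lists plus a global switch. The `name_list` struct, the helper names, and the rule that a per-name switch overrides the global one are all assumptions made for this sketch.

```c
#include <assert.h>
#include <string.h>

#define MAX_NAMES 32   /* arbitrary limit for this sketch */

/* Hypothetical stand-in for whatever the real functions maintain. */
struct name_list {
    const char *names[MAX_NAMES];
    int count;
};

static struct name_list disabled_macros;  /* filled by -nomlib=NAME */
static struct name_list enabled_macros;   /* filled by -mlib=NAME   */
static int nomlib_all;                    /* mirrors Options.nomlib */

/* Appends a name; returns -1 when the list is full. */
static int list_add(struct name_list *list, const char *name)
{
    assert(name != NULL);
    if (list->count >= MAX_NAMES)
        return -1;
    list->names[list->count++] = name;
    return 0;
}

static int list_contains(const struct name_list *list, const char *name)
{
    for (int i = 0; i < list->count; i++)
        if (strcmp(list->names[i], name) == 0)
            return 1;
    return 0;
}

/* Assumed precedence: an explicit per-name switch wins over the global one. */
static int macro_is_enabled(const char *name)
{
    if (list_contains(&disabled_macros, name))
        return 0;
    if (list_contains(&enabled_macros, name))
        return 1;
    return !nomlib_all;
}
```

Under these assumptions, `-nomlib` followed by `-mlib=FOO` would leave only a macro named FOO (a made-up name) available.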


expans.c

line 1164 : to make macro names case-insensitive

      if( tokenarray[i].token == T_ID || tokenarray[i].token == T_INSTRUCTION) {


line 1168 : to make macro names case-insensitive

    sym = SymFindDeclare(tokenarray[i].string_ptr);
else
#ifdef __SW_BD
    sym = SymFindToken(tokenarray[i].string_ptr, tokenarray[i].token);
#else
    sym = SymSearch( tokenarray[i].string_ptr );
#endif


symbols.c

line 309 : to make macro names case-insensitive

#ifdef __SW_BD
struct asym *SymFindToken( const char *name, int token)
/**************************************/
/* Find a symbol in the local/global symbol table.
 * Used to replace instructions with macros: when the token is an
 * instruction, global macro names are compared case-insensitively.
 * Returns NULL if not found; gsym is then left at the next free
 * entry in the global table.
 * Note: lsym must be global, thus if the symbol isn't
 * found and is to be added to the local table, there's no
 * second scan necessary.
 */
{
    int i;
    int len;

    len = strlen( name );
    i = hashpjw( name );

    if ( CurrProc ) {
        for( lsym = &lsym_table[ i % LHASH_TABLE_SIZE ]; *lsym; lsym = &((*lsym)->nextitem ) ) {
            if ( len == (*lsym)->name_size && SYMCMP( name, (*lsym)->name, len ) == 0 ) {
                DebugMsg1(("SymFind(%s): found in local table, state=%u, local=%u\n", name, (*lsym)->state, (*lsym)->scoped )); 
                (*lsym)->used = TRUE;
                return( *lsym );
            }
        }
    }

    for( gsym = &gsym_table[ i % GHASH_TABLE_SIZE ]; *gsym; gsym = &((*gsym)->nextitem ) ) {
        if ( len == (*gsym)->name_size &&
             ( ( token == T_INSTRUCTION && (*gsym)->state == SYM_MACRO )
               ? ( _memicmp( name, (*gsym)->name, len ) == 0 )
               : ( SYMCMP( name, (*gsym)->name, len ) == 0 ) ) ) {
            DebugMsg1(("SymFind(%s): found, state=%u memtype=%X lang=%u\n", name, (*gsym)->state, (*gsym)->mem_type, (*gsym)->langtype ));
            return( *gsym );
        }
    }

    return( NULL );
}
#endif
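
The matching rule in the global-table loop can be tried outside the assembler. Below is a standalone sketch of just that rule, with assumptions stated up front: `token_kind`, `sym_state`, `mem_icmp` and `name_matches` are hypothetical stand-ins, `mem_icmp` replaces `_memicmp` (which is MSVC-specific), and SYMCMP is assumed to be a case-sensitive compare here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the assembler's token and symbol-state values. */
enum token_kind { TOK_ID, TOK_INSTRUCTION };
enum sym_state  { STATE_OTHER, STATE_MACRO };

/* Portable case-insensitive compare of len bytes; returns 0 on a match. */
static int mem_icmp(const char *a, const char *b, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);
        if (ca != cb)
            return ca - cb;
    }
    return 0;
}

/*
 * Nonzero when `name` matches a symbol called `sym_name`.
 * Case is ignored only when an instruction token meets a macro symbol.
 */
static int name_matches(const char *name, enum token_kind token,
                        const char *sym_name, enum sym_state state)
{
    assert(name != NULL && sym_name != NULL);
    size_t len = strlen(name);
    if (len != strlen(sym_name))
        return 0;
    if (token == TOK_INSTRUCTION && state == STATE_MACRO)
        return mem_icmp(name, sym_name, len) == 0;
    return memcmp(name, sym_name, len) == 0;
}
```

So an instruction token `MOV` would pick up a macro defined as `mov`, while ordinary identifiers keep their usual case rules.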