Assembly code for a simple function (32 + 64 bit).

Started by zmau, March 25, 2020, 12:22:56 AM

zmau

Hello experts,

The following function mixes C and inline assembly (I inherited this code).
It gives me trouble when compiling, especially for 64 bit.
I want to move this code into a separate file that contains only assembly code, and then build that file separately, following examples which I have.
Can anyone help?


    __int64 gettsc()
    {
      __int64 u_timetag;
      __int64 *putimetag;
      putimetag = &u_timetag;
      u_timetag = 0;
      _asm
      {
      push edx;
      push eax;
      push esi;
      mov esi,putimetag;
      _emit 0x0f;
      _emit 0x31;
      mov [esi+4],edx;
      mov [esi],eax;
      pop esi;
      pop eax;
      pop edx;
      }
   
      return u_timetag;
    }


First, I need an assembly version that compiles cleanly for 32 bit.
Second, I need another assembly version that compiles cleanly for 64 bit.
See the two attached examples.


Thanks
zmau
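
For readers landing here with the same problem, here is a minimal sketch, assuming the only goal is to read the time-stamp counter and that the build targets x86 or x64. MSVC and GCC/MinGW both provide the __rdtsc() intrinsic (declared in <intrin.h> and <x86intrin.h> respectively). It emits the same RDTSC instruction that the _emit 0x0f / _emit 0x31 pair encodes, so one C source can serve both 32 bit and 64 bit builds without inline assembly. The uint64_t return type is a substitution for the original __int64.

```c
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>      /* MSVC: declares __rdtsc() */
#else
#include <x86intrin.h>   /* GCC, Clang, MinGW: declares __rdtsc() */
#endif

/* Read the CPU time-stamp counter (EDX:EAX combined into 64 bits).
   The same source compiles for 32 bit and 64 bit x86 targets. */
uint64_t gettsc(void)
{
    return (uint64_t)__rdtsc();
}
```

This is not the separate assembly file asked for above, only an alternative that sidesteps the inline assembly problem.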

Vortex

Hi zmau,

The MS VC compiler does not support 64-bit inline assembly. You can use MinGW for this purpose:

#include <windows.h>

char* UpperCase(char* szText)
{
asm(
"mov rax,rcx;"
"sub rax,1;"
"1:"
"add rax,1;"
"movzx rdx,BYTE PTR [rax];"
"test rdx,rdx;"
"je 2f;"
"cmp rdx,97;"
"jb 1b;"
"cmp rdx,122;"
"ja 1b;"
"sub BYTE PTR [rax],32;"
"jmp 1b;"
"2:"
"mov rax,rcx;"
/* "ret;"*/
);
}

int WINAPI WinMain(HINSTANCE hInstance,HINSTANCE hPrevInstance,
LPSTR lpCmdLine, int nCmdShow)
{
char msg[]="inline assembly programming";
MessageBox(0,UpperCase(msg),"Hello!",MB_OK);
return 0;
}


gcc -o InlineAsm.exe InlineAsm.c -masm=intel

askm

that is if you want a message box with no text ..
of course, 64 bit function code typo corrected ...


char* UpperCase(char* szText)
{
   asm(
      "mov   rcx, rax;"
      "sub   rax, 1;"
"1:"
      "add   rax, 1;"
      "movzx   rdx, BYTE PTR [rax];"
      "test   rdx, rdx;"
      "je      2f;"
      "cmp   rdx, 97;"
      "jb      1b;"
      "cmp   rdx, 122;"
      "ja      1b;"
      "sub   BYTE PTR [rax], 32;"
      "jmp   1b;"
"2:"
      "mov   rax, rcx;"
/*      "ret;"*/
      );
}

Vortex

mov   rcx, rax

Guess what is the value of rcx before that instruction?

Vortex

Here is another version :

#include <windows.h>

char* UpperCase(char* szText)
{
    asm(
        "mov    rax,rcx;"
        "sub    rcx,1;"
"1:"
        "add    rcx,1;"
        "movzx  rdx,BYTE PTR [rcx];"
        "test   rdx,rdx;"
        "je     2f;"
        "cmp    rdx,97;"
        "jb     1b;"
        "cmp    rdx,122;"
        "ja     1b;"
        "sub    BYTE PTR [rcx],32;"
        "jmp    1b;"
"2:"
);
}

int WINAPI WinMain(HINSTANCE hInstance,HINSTANCE hPrevInstance,
LPSTR lpCmdLine, int nCmdShow)
{
char msg[]="inline assembly programming";
MessageBox(0,UpperCase(msg),"Hello!",MB_OK);
return 0;
}

askm

I meant only to point out how one line makes the difference in
the example subroutine; the two versions differ only in the r32 or r64
operands and should give the same expected output.


char* UpperCase(char* szText)
{
    asm (
   
        // 64 bit                        // 32 bit or 64 bit
        "mov rcx, rax;"                  // "mov ecx, eax;"
        "sub rax, 1;"                    // "sub eax, 1;"
        "1:"                             // "1:"
        "add rax, 1;"                    // "add eax, 1;"
        "movzx rdx, BYTE PTR [rax];"     // "movzx edx, BYTE PTR [eax];"
        "test rdx, rdx;"                 // "test edx, edx;"
        "je  2f;"                        // "je  2f;"
        "cmp rdx, 97;"                   // "cmp edx, 97;"
        "jb  1b;"                        // "jb  1b;"
        "cmp rdx, 122;"                  // "cmp edx, 122;"
        "ja  1b;"                        // "ja  1b;"
                                         //
        "sub BYTE PTR [rax], 32;"        // "sub BYTE PTR [eax], 32;"
        "jmp 1b;"                        // "jmp 1b;"
        "2:"                             // "2:"
        "mov rax, rcx;"                  // "mov eax, ecx;"
        /* "ret;"*/
    );

}


Vortex

askm,

You are simply mixing apples and oranges. My original question was about 64-bit programming, nothing to do with 32-bit coding.

My code is working correctly. I already tested it.

I am asking you a simple question, please answer :

What's the purpose of the line mov rcx,rax at the top of your code?

char* UpperCase(char* szText)
{
    asm (
   
        // 64 bit
        "mov rcx, rax;"

askm

What I have seen is that to simply upsize or downsize registers
does not reliably produce expected results.

Look closely ...
compiler: gcc version 9.3.0
target: i686-w64-mingw32
debugger: x64dbg 32 bit mode
os: Windows 7 64bit

In same code with r32 registers,
at beginning--------------------------------------
eax : 0028FE94     "inline assembly programming"
ecx : ????????      // some other address
mov eax, ecx        // subroutine start line in question, downsize from 'mov rax, rcx '
eax : ????????    // now some other address

// code executes, eax incremented, ecx not changed
eax : ????????++    // some other address incremented

mov eax, ecx       // down size from 'mov rax, rcx'
at return--------------------------------------
eax : ????????    // now some other address again
ecx : ????????     // some other address
my MessageBox is blank, or unexpected, with r32 ...
eax is now ecx.

The alternative 'mov ecx, eax', ecx would hold the proper address til return,
giving proper text.

The code with r64, correctly gives text, as rcx and rax are equal at start.
rax gets the change. rcx, again does not.
There will be differences in OS and or compiler and or execution behaviors.

Results, observations from machine code operations do not always coincide with others.

This is what happens on my system.

Vortex

My MinGW version is 5.40 but this has nothing to do with the question I asked :

Quote: The alternative 'mov ecx, eax', ecx would hold the proper address til return,
giving proper text.

Do you know about the x64 calling convention ?

https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019

Can you understand now why mov rcx,rax is incorrect in general programming practice?

hutch--

Vortex is correct here: win32 and win64 use different calling conventions and different stack layouts. With the win64 FASTCALL calling convention, rcx holds the first argument, rdx the second, r8 the third and r9 the fourth, and any further arguments are passed on the stack.
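
As a point of reference for the register discussion, here is a plain C sketch of what the UpperCase routines in this thread are expected to do. The name upper_case and the variable p are hypothetical, chosen for illustration only. In Vortex's first version the walking pointer corresponds to rax, while the untouched argument (rcx under the win64 convention) supplies the return value.

```c
/* Hypothetical portable C reference for the UpperCase routines above. */
char *upper_case(char *text)
{
    char *p = text;                    /* walking pointer: the role rax plays in the loop */

    for (; *p != '\0'; ++p) {          /* stop at the terminating zero byte */
        if (*p >= 'a' && *p <= 'z')    /* ASCII 97..122 */
            *p -= 32;                  /* convert to upper case in place */
    }
    return text;                       /* original argument, which must not be overwritten */
}
```

Overwriting text before the loop starts, which is what "mov rcx, rax" does to the first argument in 64 bit, would lose the string address in the same way.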

Vortex

Here is a Masm64 example for askm :

include     \masm32\include64\masm64rt.inc

.data

string1     db 'This is a test.',0

.code

start PROC

    invoke  UpperCase,ADDR string1
    invoke  StdOut,rax
    invoke  ExitProcess,0

start ENDP


UpperCase PROC string:QWORD

    mov     rax,rcx ; is equal to
                    ; mov rax,string
    dec     rax
   
__repeat:

    add     rax, 1
    movzx   rdx,BYTE PTR [rax]
    test    rdx,rdx
    je      __end
    cmp     rdx,97
    jb      __repeat
    cmp     rdx,122
    ja      __repeat
    sub     BYTE PTR [rax],32
    jmp     __repeat

__end:

    mov rax,string
    ret

UpperCase ENDP

END

askm

As I understand here, the original poster posted around March 25, correct ?
Thirty minutes roughly later, someone has already posted with an answer !

These are experts, as zmau said, after all.

If zmau feels is up to the challenge, said could figure in a reasonable amount of
time ? Say three days ? A week ?

Yes, thirty minutes later, first responding poster posted code ... someone already has an answer !
But instead of original poster getting help, instead sees old code that has absolutely
nothing to do with the topic put forth, but after I post you have some problem with my observations.
Why? If your code works for you, do you now doubt it ? Is that the expert response ?
Does this fixation on 64 bit, or the fact that someone question that code bother you ?
For reading comprehension, remember ( 32 + 64 ).

It has been nearly two months this topic. Observe this fact.
Or is nearly two months different in 64 bit from 32 bit ?
Did you use the 64 bit fastcall to give answer too quickly that does not address the topic ?

Are you willing to read ?

Shows some sort of attention deficit, or reading comprehension deficit, or just plain failure to
read, thus leading to not answering to the original intent of the topic. As critical as code is
I would not think it is not something that you could treat as anything less than deserving critical
reading skills. Or is the intent not to answer ?
 
Anyone is capable of exploratory observation. These observations ? They are mine.
I am right about my own observations. On my system.

In the topic context ( 32 + 64 ), this topic obviously is not just about 64 bit is it ?

So far I see zmau has seen fit not to respond. There is no mystery here as to why.
In reality, I hope everyone that is not a bad person is ok at this time. zmau are you ok ?
If ok, why not ask zmau if said has been satisfactorily assisted ?

Use the same energy to help zmau, or others, as you have wasted with these ridiculous responses.

Posting code that pertains to the original topic would make zmau and rest of forum happy.

Anything else is clearly topic obfuscation, on any forum, on any platform.

Unless thats too difficult for experts.

Again for reading comprehension, it has been nearly two months this topic.
Observe this fact.

zmau waits.

hutch--

Tread carefully with smartarse wisecracks in here, members have tried to help allowing that there is some confusion in the original question that was not particularly obvious in what was posted. The failure to comprehend the difference between the 32 bit Intel ABI (calling convention) and the 64 bit ABI (FASTCALL calling convention) rendered the post unintelligible.

Now, if the posted inline assembler was originally from Microsoft VC or VS, then, as Vortex has already mentioned, 64 bit VC does not support inline assembler. Apart from using a different C compiler, the option is the way Microsoft supports 64 bit assembler: write the code in 64 bit MASM and link the module into the C code app.

Now the code that copied RAX into RCX has an obvious mistake in 64 bit: it overwrites the first argument passed in the RCX register. If you think you are in a position to lecture other people who know far more than you do, fix your own mistakes first.

Now instead of crapping on, here is how you write a 64 bit upper case algo in MASM.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

NOSTACKFRAME

szUpper proc

  ; rcx = address of string

    mov rax, rcx
    sub rax, 1

  @@:
    add rax, 1
    cmp BYTE PTR [rax], 0
    je @F
    cmp BYTE PTR [rax], "a"
    jb @B
    cmp BYTE PTR [rax], "z"
    ja @B
    sub BYTE PTR [rax], 32
    jmp @B
  @@:

    mov rax, rcx

    ret

szUpper endp

STACKFRAME

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

Vortex

Hi askm,

Quote: Is that the expert response ?

FYI, I am probably not an expert like you and I am falling under the amateur category but I would not talk about things I have no idea just as you did about 64-bit coding. Did you study the Windows 64 ABI? Probably no.

Quote: If your code works for you, do you now doubt it ? Is that the expert response ?

No expert response on my part. You surely know better how to handle those things.

Quote: Shows some sort of attention deficit, or reading comprehension deficit, or just plain failure to
read, thus leading to not answering to the original intent of the topic.

Exactly. Since I am facing some focusing problems, I didn't insist on the wrong parameter handling mechanism. Curiously, why did you start by trying to correct my code instead of directly helping the thread starter?

Quote: Did you use the 64 bit fastcall to give answer too quickly that does not address the topic ?

You know about 64-bit fastcall. Not bad, not at all bad... Keep going on...

Quote: Use the same energy to help zmau, or others, as you have wasted with these ridiculous responses.

Yes, it's very ridiculous trying to comply with the correct 64-bit ABI. I agree with you.

Quote: Anything else is clearly topic obfuscation, on any forum, on any platform.

Yes, I obfuscated the truth and you enlightened us.

Quote: Unless thats too difficult for experts.

Should I call you Master like Anakin Skywalker addressing Senator Palpatine ?

jj2007

Hi askm,

Hutch (he is the boss here) gave you a detailed answer, and Vortex just replied with an angry answer.

Compliments, I can't remember any thread in the last ten years that made Vortex angry. He is one of the calmest, most peaceful and most helpful gentlemen of the entire forum. Making him angry is a major achievement, really.

Back to the point: zmau posted a slightly confused question, and got a friendly and detailed answer - btw absolutely on topic. He did not even bother to say "thank you", which probably means he is a dumb little a**hole who wanted somebody to do his homework for him and found out we are no fools. Bad luck for him :tongue:

Then you jumped in with some code, good idea - we appreciate that. Vortex tried to gently push you onto the right track, you know, 64-bit ABI and usage of rcx as first argument, but instead of listening, you started complaining.

You seem talented, otherwise Hutch would have kicked you out already. I'll explain the group dynamics to you:
- dumb little boy asks for help
- dumb little boy gets polite answer, but ignores it
- other members forget about dumb little boy but start commenting Vortex' answer (in this case)
- more people join in and post code vaguely related to OP's question
- discussion starts whether posted code (e.g. szUpper by Hutch) is the best solution
- alternative algos get proposed
- benchmarking starts
- best algo may get adopted for the Masm** SDK
- everybody happy and celebrating the World's best Assembly forum :thumbsup:

Now it's up to you to decide if you want to become a member of the gang :thup:

szUpperShort:
  ; rcx = address of string
    mov rax, rcx
    jmp @go
sub32:
    sub BYTE PTR [rcx], 32
@@:
    inc rcx
@go:
    cmp BYTE PTR [rcx], "z" ; can't be zero, right?
    ja @B
    cmp BYTE PTR [rcx], "a"
    jae sub32
    cmp BYTE PTR [rcx], 0
    jne @B
  @@:
  ret

Timings for a Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
27      bytes for szUpperShort
35      bytes for szUpperLong

1029 ticks for warming up

LET US TRY A VERY SIMPLE STRING: 1732 ticks for short szUpper
LET US TRY A VERY SIMPLE STRING: 2465 ticks for long szUpper
LET US TRY A VERY SIMPLE STRING: 1716 ticks for short szUpper
LET US TRY A VERY SIMPLE STRING: 2418 ticks for long szUpper
LET US TRY A VERY SIMPLE STRING: 1747 ticks for short szUpper
LET US TRY A VERY SIMPLE STRING: 2340 ticks for long szUpper