Understanding a typical MASM32 program template from a C++ point of view

RedSkeleton007 · May 17, 2016, 07:42:10 PM

First, let's compare and clarify some confusing C++ and Assembly program terms:

Label = Variable Name
Proc = beginning of function body
.data or .code directives = ?

Next, here's a simple exchange program originally written by Hutch in another post with some slight modifications.
As always, my questions are in the comments:


include \masm32\include\masm32rt.inc

.data ; in C++, would this section be for prototype functions,
      ; or for global variable declarations? Do we need to include
      ; the .data directive in all masm32 programs?

.code
start: ; this simply indicates that the code action begins here, right?

    call main
    inkey     ; what's this inkey statement here for? Is it for the print operand?
    exit

main proc

    LOCAL var1  :DWORD
    LOCAL var2  :DWORD

    push esi 
    push edi

    mov var1, 12345678
    mov var2, 87654321

    print "Before XCHG",13,10
    print str$(var1)," var1",13,10
    print str$(var2)," var2",13,10,13,10

    mov esi, var1
    mov edi, var2

    xchg esi, edi  ; exchange the two registers

    mov var1, esi
    mov var2, edi

    print "After XCHG",13,10
    print str$(var1)," var1",13,10
    print str$(var2)," var2",13,10


    pop edi
    pop esi

    ret ; return 0?

main endp ; in C++, this would be the closing curly brace for main method, right?

exit ; in C++, this would be the closing curly brace for the class, right?

end start

mineiro · May 17, 2016, 09:28:12 PM

Keep in mind that a processor addresses things and can read or write whatever lives at those addresses.
So, what's a register? A register (built from flip-flops) is a storage location inside the processor that holds N bits. It is selected by name rather than by a memory address.

Label: you give a name to an address.
Proc: you give a name to an address too. The difference is that a proc is generally reached with a call instruction, so we expect a ret instruction at the end of the block (between main proc and main endp). A label is generally used as the target of jump instructions. Proc also automates more: you can do everything using only labels, but then you do all the work by hand, such as setting up the stack frame, local variables, function parameters and the calling convention.
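For readers coming from C++, the label/proc distinction can be sketched roughly like this. It is only an analogy: add_one and count_to are made-up names, and a C++ compiler is free to generate something quite different from a literal call/jmp sequence.

```cpp
#include <cassert>

// Hypothetical helper playing the role of a PROC:
// it is reached with a call and hands control back with a return (ret).
int add_one(int x) {
    return x + 1;
}

// Hypothetical function showing a plain label as a jump target.
int count_to(int limit) {
    int i = 0;
again:                      // a label only names a location in the code
    if (i < limit) {
        i = add_one(i);     // comparable to "call add_one"
        goto again;         // comparable to "jmp again"
    }
    return i;               // comparable to "ret"
}
```

Calling count_to(5) loops back to the label five times and returns 5. Nothing is pushed or popped for the goto, while each add_one call has a matching return.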

.data: global variables that all of your source code can access. It is used for initialized variables. If you need uninitialized variables, use .data? instead. If the initialized values never change, you can put them in the .code section, somewhere the processor will never reach them as instructions. But if you need to write to those variables, the .code section must have write access, and that is not good because it makes self-modifying code possible. So use .data and .data?, it's better; I mention the alternative only for clarification.
Imagine that you put data inside the .code section. The program executes instructions one after another, top to bottom. If you forget, for example, a simple jump instruction to skip over that data, the data will be executed as code and anomalies will happen. Not a good habit.

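start: yes, it's a label that marks the entry point of your program. You declare it as the entry point at the end of your source code with "end start".

inkey, exit and print are macros. They are expanded by the assembler. inkey waits for a key press, so the console window stays open until you have read the output. Search inside the masm32 folder (\masm32\macros\macros.asm) to see the definitions.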

include \masm32\include\masm32rt.inc

function proto only_one_argument:dword	;like a math function y = f(x) (Leibniz is the guy to flame if you don't like it)

.data	;global variables
initialized_data db "hello world",00h	;not used, only a example

.data?	;global variables
not_initialized_data dd ?	;I will later initialize this variable of 32 bits (dword,dd) inside code section

.const	;global constants
constant equ 918

.code
start: ;I know the code starts here because the end of the source has "end label_name"
       ;Some sources have a bare "end" with no label, so we know they are modules, probably library code.

    call main	;we're calling a PROCedure labeled main; it has no parameters, so we don't need a prototype
;after the main block has returned we reach here

    call other_procedure
;after the other_procedure block has returned we reach here

    jmp that_address                    ;GOTO that_address
    nothing db "hello world again",00h 	;data inside code, not good, because if we remove the jump above
                                        ;the processor will execute the data as code, as instructions
that_address:

    invoke ExitProcess,0		;the same as the exit macro
                                        ;invoke handles the parameters for us if it finds a prototype for this function
                                        ;prototypes are in the inc files in the masm32 folder
                                        ;push 0
                                        ;call ExitProcess


main proc		;some part of the code CALLed us, so we should RETurn

    LOCAL var1  :DWORD		;local variables can only be accessed inside this block (procedure)
				;they are destroyed after this block has executed

    mov var1,constant		
    mov eax,var1
    mov not_initialized_data,eax

    ret ; return to the caller
main endp


other_procedure proc		;every CALL instruction expects a matching RET by default
    ret
other_procedure endp		;endp must repeat the procedure name

end start        ;end entry_point

qWord · May 18, 2016, 03:59:27 AM

The code section is for code; constant values go in the .const section. Equates (EQU) are assembly-time constants (the C/C++ equivalent is roughly #define ...) and do not go into any section, so they can appear anywhere in the source.
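A rough C++ illustration of that difference, as a sketch only: CONSTANT mirrors the "constant equ 918" line from the earlier sample, while table and use_both are made-up names. Where a compiler finally stores const data depends on the toolchain.

```cpp
#include <cassert>

// Comparable to "constant equ 918": the name is replaced by its value
// before code generation and occupies no storage of its own.
#define CONSTANT 918

// Comparable to data declared under .const: the values are stored in the
// executable image, typically in memory the program cannot write to.
const int table[3] = {1, 2, 3};

// Hypothetical function using both kinds of constant.
int use_both() {
    return CONSTANT + table[2];   // 918 + 3
}
```

The equate disappears into the instruction stream as an immediate value, whereas table has an address you can take and inspect in a debugger.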

xanatose · May 21, 2016, 05:36:19 PM

.code:
Code goes here. Usually you cannot write to this section.

.data:
Initialized data goes here. The initial values are stored in the exe, making it larger. You can write to this section at runtime. You cannot execute code in this section.

.data?
Uninitialized data goes here. The initial values are not stored in the exe. You cannot execute code in this section.

.const
Constant data goes here. The initial values are stored in the exe, making it larger. You cannot write to or execute this section.

MASM was created in the 16-bit era, when it was necessary to divide a program into segments in order to access more than 64K.

Nowadays it is more about protecting the sections.
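From the C++ side, the four sections correspond roughly to the following kinds of declarations. This is a sketch with made-up names, and the exact section each object ends up in is decided by the compiler and linker, not by the language.

```cpp
#include <cassert>

int initialized_counter = 5;       // like .data: the initial value is stored in the exe, writable at runtime

int zeroed_buffer[1024];           // like .data?: no initial values stored in the exe
                                   // (C++ guarantees zeros here)

const char greeting[] = "hello";   // like .const: stored in the exe, not meant to be written

// Like .code: the instructions of this function live in the code section.
int bump() {
    return ++initialized_counter;  // writing to .data-style storage is allowed
}
```

Function prototypes, which the original question asked about, do not belong to any of these sections. Like PROTO in MASM, they only tell the translator how a function is called.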

Vortex · May 21, 2016, 06:11:14 PM

Hello

You can use the traditional C run-time functions to control keyboard input:

.386
.model flat,stdcall
option casemap:none

include     \masm32\include\windows.inc
include     \masm32\include\kernel32.inc
include     \masm32\include\msvcrt.inc

includelib  \masm32\lib\kernel32.lib
includelib  \masm32\lib\msvcrt.lib

.data

str1        db 'Press any key to exit.',13,10,0
str2        db 'You hit %c',0

.code

start:

    invoke  crt_printf,ADDR str1

    invoke  crt__getch

    invoke  crt_printf,ADDR str2,eax

    invoke  ExitProcess,0

END start

.386
.model flat,stdcall
option casemap:none

include     \masm32\include\windows.inc
include     \masm32\include\kernel32.inc
include     \masm32\include\msvcrt.inc

includelib  \masm32\lib\kernel32.lib
includelib  \masm32\lib\msvcrt.lib

.data

msg         db 'Press any key and hit RETURN',13,10,0
msg2        db 'You pressed %c before hitting RETURN',13,10,0

.code

start:

    invoke  crt_printf,ADDR msg

    invoke  crt_getchar

    invoke  crt_printf,ADDR msg2,eax

    invoke  ExitProcess,0

END start

mineiro · May 22, 2016, 03:03:12 AM

Quote from: xanatose on May 21, 2016, 05:36:19 PM
MASM was created in the 16-bit era, when it was necessary to divide a program into segments in order to access more than 64K.

Yes. When the processor operates in real mode, each segment can address 64K of offsets. If the program grows and needs more addresses, it switches to another segment. Today we have mostly forgotten about segments.

Mark Zbikowski designed the executable file format, and the header of that format starts with his initials, "MZ".
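The real-mode arithmetic behind that 64K limit can be sketched in C++. The function name linear_address is made up for illustration; the formula itself, segment times 16 plus offset, is how real-mode x86 forms an address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: real-mode address translation.
// The 16-bit segment is shifted left by 4 bits (multiplied by 16)
// and the 16-bit offset is added to it.
std::uint32_t linear_address(std::uint16_t segment, std::uint16_t offset) {
    return (static_cast<std::uint32_t>(segment) << 4) + offset;
}
```

With a fixed segment of 0x1000, offsets 0x0000 through 0xFFFF cover linear addresses 0x10000 through 0x1FFFF, which is exactly 64K. Reaching anything beyond that means loading a different segment value.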

The MASM Forum
