Well, I am hoping to get started with MASM someday, but I think I have hit a snag setting up the environment. I tried this basic program from "Microsoft Assembly x64 Programming modern coding for MASM, SSE & AVX" by Mike McGrath:
INCLUDELIB kernel32.lib
ExitProcess PROTO
.DATA
.CODE
main PROC
CALL ExitProcess
main ENDP
END
The environment I am trying to do this in is Visual Studio 2022 Express edition with the C++ workload installed. I set up an empty console project, set the build dependencies to the MASM option "masm(.targets, .props)", and set Linker->Advanced->Entry Point to "main". I created the file with the .asm extension, typed in the code above, and built the solution. It builds fine, but when I run it I get this:
(process 13440) exited with code -1073741819 (0xc0000005)
And from: https://www.febooti.com/products/automation-workshop/online-help/actions/run-cmd-command/exit-codes/
I am told that specific exit code means I am having a memory access violation, so the program is crashing! Not sure what I was expecting from the program, but later the book has me look at the exit code with echo %errorlevel% and I still get the same exit code. I've also tried compiling, linking, and running in the command prompt using ML64.exe and LINK.exe, and after doing echo %errorlevel% I get the same code.
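The two numbers in that message are the same 32-bit value, read once as signed and once as unsigned. A quick sketch in Python, used here purely as a calculator (the helper name is made up for illustration):

```python
def status_hex(exit_code):
    """Reinterpret a signed 32-bit exit code as an unsigned hex value."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"

print(status_hex(-1073741819))  # 0xC0000005, the access violation status
print(status_hex(78))           # 0x0000004E
```

So echo %errorlevel% printing -1073741819 and the debugger printing 0xC0000005 are reporting the same crash.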
I suspect that I am not setting up Visual Studio properly, or I have not allowed access to the kernel32.lib or KernelBase.dll in some way because the VS IDE debugger says the violation is related to functions in the KernelBase.dll.
Exception thrown at 0x00007FFACDC82AAA (KernelBase.dll) in Project1.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF.
I have now watched 3 videos that set up the IDE similarly to the book, and I still get the same results, while they get the return code that they put in the RCX register with "mov RCX, 78", which I have also tried to no avail. I usually program in C and C++, so assembly is new, and setting up the environment has always bitten me in the past, so I am hoping I am simply missing a step somewhere.
In 64-bit you have to declare the entry point of your program. You can do it manually or, as I do, use the default:
mainCRTStartup
The Win64 ABI requires a few extra things
INCLUDELIB kernel32.lib
ExitProcess PROTO
.DATA
.CODE
main PROC
sub rsp,28h ;<<<< 32 bytes for spill space plus 8 bytes for alignment
CALL ExitProcess
main ENDP
END
If you run it through the debugger, you get the faulting instruction
00007FFC73BE2AAA 0F 28 44 24 40 movaps xmm0,xmmword ptr [rsp+40h]
Without the stack adjustment, RSP isn't aligned to 16 bytes at the call, but movaps requires a 16-byte-aligned memory operand (the "a" in movaps stands for "aligned").
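The alignment arithmetic can be checked with a small Python sketch. It assumes, per the Win64 convention, that RSP is a multiple of 16 just before the CALL that enters main, so the pushed return address leaves it 8 bytes off on entry. The starting address and the helper names are made up for illustration.

```python
ENTRY_RSP = 0x14FF28  # hypothetical RSP on entry to main: 16-byte aligned, minus 8

def rsp_at_call(entry_rsp, adjust):
    """RSP as seen by the next CALL after 'sub rsp, adjust'."""
    return entry_rsp - adjust

def is_aligned(rsp):
    return rsp % 16 == 0

print(is_aligned(rsp_at_call(ENTRY_RSP, 0)))     # False: no adjustment at all
print(is_aligned(rsp_at_call(ENTRY_RSP, 0x20)))  # False: shadow space alone
print(is_aligned(rsp_at_call(ENTRY_RSP, 0x28)))  # True: shadow space plus 8
```

Only the 28h adjustment leaves RSP on a 16-byte boundary at the call, which is what an aligned load such as movaps inside the callee relies on.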
Hmmm. An ExitProcess call without an argument. That means that whatever(?) happens to be in rcx at the time is considered the exit code? Unless I am missing something. :tongue:
No, it means that whatever's on the stack will be taken to be the exit code. So, garbage in this case.
Quote from: NoCforMe on September 13, 2024, 02:49:49 PM
No, it means that whatever's on the stack will taken to be the exit code. So, garbage in this case.
No, 64-bit uses RCX for the first arg, not the stack.
Quote from: sinsi on September 13, 2024, 03:20:37 PM
No, 64-bit uses RCX for the first arg, not the stack.
Exactly, like I said above. :biggrin:
Quote from: zedd151 on September 13, 2024, 03:45:43 PM
Exactly, like I said above. :biggrin:
Maybe some people should stick to 32-bit code :badgrin:
Quote from: sinsi on September 13, 2024, 04:20:37 PM
Maybe some people should stick to 32-bit code :badgrin:
:biggrin: :cool:
Quote from: sinsi on September 13, 2024, 04:20:37 PM
Maybe some people should stick to 32-bit code
That would be me ...
Quote from: sinsi on September 13, 2024, 08:23:26 AM
The Win64 ABI requires a few extra things
Thank you this got me going! I also added:
mov RCX, 78
and now I am getting the exit code 78 which ... not knowing exactly what I am doing seems to be in the right direction. I appreciate the help because it means I can continue! Until the next roadblock at least..
I'm still not exactly clear on the reasoning for having to add this though and sounds like I need to read up on alignment and vector interrupts:
sub rsp, 28h
sub is the subtraction instruction, rsp is the general-purpose register that holds the address of the top of the stack, and the 28h is a vector interrupt, correct? If I am grasping the basics of what you're saying, the Windows Application Binary Interface (ABI) requires that I align the address of the top of the stack to 16 bytes, and that line of code somehow accomplishes that (I'm not sure exactly how subtracting whatever the vector interrupt contains from the address in rsp does it). Otherwise, if I don't do that, data is not aligned, memory access violations occur, and the program crashes? I'm asking because it seems like I will have to do this for every example in the book I am using to learn this.
No, not interrupt vectors; you might be thinking of the good old 8088 with its interrupt table in the first 256 doublewords of RAM. It's just a displacement into the stack.
Quote from: NoCforMe on September 14, 2024, 08:00:31 AM
No, not interrupt vectors; you might be thinking of the good old 8088 with its interrupt table in the first 256 doublewords of RAM. It's just a displacement into the stack.
Maybe that's what I would think if I knew anything about assembly? I have no clue really, thanks for clarifying. I just was searching for what that 28h means and interrupt vectors is what started to come up. Hopefully between the three books and other resources I got for assembly, displacing into the stack will make more sense as I go. For now I guess I will just keep it in my mind that there is a perfectly good reason why I need to do this, I just don't understand it yet :biggrin: It is always interesting when you have to do extra to get example code to work.
Looks like time for sinsi to explain spill space and all the other memory-layout weirdnesses of the 64-bit ABI ...
Nothing magic about 28H (= 40 decimal). It's just a byte count: 40 bytes, or five QWORDs.
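Since the question was whether this has to be done in every example: the 28h is not a fixed magic value but falls out of a small calculation. A hedged sketch in Python follows. The helper name and its parameters are invented for illustration, and it assumes nothing is pushed between function entry and the subtraction.

```python
def frame_size(max_call_args=0, local_bytes=0):
    """Bytes to subtract from RSP at the top of a function that calls others.

    Assumes RSP is 8 bytes off a 16-byte boundary on entry, because the
    return address was just pushed.
    """
    shadow = 32                                 # four 8-byte home slots
    stack_args = max(0, max_call_args - 4) * 8  # 5th argument onward
    needed = shadow + stack_args + local_bytes
    # Round up so that needed + 8 (the return address) is a multiple of 16.
    return -(-(needed + 8) // 16) * 16 - 8

print(hex(frame_size(max_call_args=1)))  # 0x28, enough for ExitProcess
print(hex(frame_size(max_call_args=6)))  # 0x38, two arguments go on the stack
```

For a program that only calls functions with up to four arguments and keeps no locals on the stack, the result is always 28h, which is why that number shows up in so many small examples.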
I certainly don't want to preempt sinsi or anyone else here who knows more than I do about X64 programming. However, I ran across an explanation of spill space (https://stackoverflow.com/questions/30190132/what-is-the-shadow-space-in-x64-assembly) (aka shadow space) which may be helpful. Apparently, there are two main reasons for providing this memory space: debugging and handling functions with variable-length argument lists (varargs):
Quote
The Shadow space (also sometimes called Spill space or Home space) is 32 bytes above the return address which the called function owns (and can use as scratch space), below stack args if any. The caller has to reserve space for their callee's shadow space before running a call instruction.
It is meant to be used to make debugging x64 easier.
Recall that the first 4 parameters are passed in registers. If you break into the debugger and inspect the call stack for a thread, you won't be able to see any parameters passed to functions. The values stored in registers are transient and cannot be reconstructed when moving up the call stack.
This is where the Home space comes into play: It can be used by compilers to leave a copy of the register values on the stack for later inspection in the debugger. This usually happens for unoptimized builds. When optimizations are enabled, however, compilers generally treat the Home space as available for scratch use. No copies are left on the stack, and debugging a crash dump turns into a nightmare.
Of course, the space can be used by a compiler, but we're not using one (this is assembly language), so the spill space just sits there unused, even though it must be "allocated" (by moving the stack pointer before calling the function).
Any better explanations welcomed here.
Quote
The Shadow space (also sometimes called Spill space or Home space) is 32 bytes above the return address
It's the 32 bytes directly above the return address, not something located 32 bytes above it.
Quote from: jj2007 on September 14, 2024, 04:35:34 PM
Quote from: NoCforMe on September 14, 2024, 10:11:49 AM
below stack args
Really?
Actually, yes. On entry
RSP+0 return address
RSP+8 space for 1st arg
RSP+16 space for 2nd arg
RSP+24 space for 3rd arg
RSP+32 space for 4th arg
RSP+40 actual 5th arg - first stack arg
I've never heard the debugging reason though, the main use for me is saving the args for later use, since the first WinAPI call will probably trash RCX etc.
If you don't use it then there's 32 bytes for any small locals for you to use.
I've seen code that uses it to save registers instead of pushing them.
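The layout above follows a simple rule: one 8-byte slot per argument, with the first four slots reserved even though those arguments travel in registers. A small Python sketch that regenerates the table (the function name is made up for illustration):

```python
def entry_layout(num_args):
    """(offset from RSP, description) pairs as seen on entry to a callee."""
    layout = [(0, "return address")]
    # The four home slots exist even when fewer than four arguments are passed.
    for n in range(1, max(num_args, 4) + 1):
        kind = "space for register arg" if n <= 4 else "actual stack arg"
        layout.append((8 * n, f"{kind} {n}"))
    return layout

for offset, what in entry_layout(5):
    print(f"RSP+{offset:<2} {what}")
```

Calling entry_layout(1), as for ExitProcess, still yields four reserved slots, which is the 32 bytes the caller has to set aside.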
Quote from: sinsi on September 14, 2024, 05:49:29 PM
Actually, yes. On entry
RSP+0 return address
RSP+8 space for 1st arg
RSP+16 space for 2nd arg
RSP+24 space for 3rd arg
RSP+32 space for 4th arg
RSP+40 actual 5th arg - first stack arg
You are right, as usual :thumbsup:
Thank you for the detailed explanation. I have come across something interesting that has confused me further. Everything we have discussed so far I attempted on my personal computer at home, where I installed Visual Studio 2022 Express edition. Today I got my work computer set up with Visual Studio 2022 Professional, and when I set up a project for MASM the same way as before, the code seems to work without the addition of:
sub rsp, 28h
When I say "work", I mean that I get an exit code other than -1073741819, which from other examples I have been doing seems to be a universal "you messed something up" code. Furthermore, if I add the:
mov rcx, 78
instruction, the program ends with code 78.
Any ideas as to what the difference is? It makes my head spin a bit that it works on one computer and not on the other! However, the code might not be working properly even though it runs: I'm learning to debug by stepping through and watching the registers, and everything seems to work, but I've noticed that, with or without the "sub rsp, 28h" instruction, the line:
call ExitProcess
Is throwing an exception :
Exception thrown at 0x00007FF6A28C1031 in TestASM.exe: 0xC0000005: Access violation writing location 0xFFFFFFFFFFFFFFF8
So me using the term "worked" might be incorrect. So two things that interest me at this point:
1. On one computer I need to add the "sub rsp, 28h" for the program to work at all, but the other computer's environment works without it (mind you, I haven't explored too many other programs, so perhaps spill space and such will become more important later as I program something more complex).
2. Why is "call ExitProcess" throwing an exception? Stepping through, everything seems to work until that point, and the program will even give me the expected exit code if I just run it, but I don't like getting an exception if it truly is an error.
Once again, thanks for the help, it feels like I am at least making baby steps toward something that seems like a new world to me!
Post the non-working code, my mind reading skills are pretty poor :badgrin:
Sorry for that!
It's code to just step through and watch the registers change as I use the debugger:
INCLUDELIB kernel32.lib
ExitProcess PROTO
.data
var QWORD 100 ; initialize variable mem
.code
main PROC
SUB RSP, 28H ; Reserve 32 bytes of shadow space and align the stack to 16 bytes
XOR RCX, RCX ; Clear register
XOR RDX, RDX ; Clear register
MOV RCX, 33 ; Assign reg/imm
MOV RDX, RCX ; Assign reg/reg
MOV RCX, var ; Assign reg/mem
MOV var, RDX ; assign mem/reg
MOV RCX, 78 ; Did it exit afterwards?
call ExitProcess
main ENDP
END
When I step down in the debugger of my work computer I can see everything transpiring between var, RCX, and RDX, but as soon as I hit call ExitProcess it throws an exception. I will note, that the same exact program does not throw the exception on my personal home computer, it exits as expected with no errors. Perhaps my work computer has a rights issue when 'ExitProcess' is called? The program seems to execute as expected and return an exit code 78 as expected, just throws an exception at the end :
Exception thrown at 0x00007FF691751039 in TestASM.exe: 0xC0000005: Access violation writing location 0x00000000542FF947.
asm file
include \masm64\include64\masm64rt.inc
.data
var QWORD 100 ; initialize variable mem
.code
main PROC
SUB RSP, 28H ; Byte align the top of the stack to 16 bytes
XOR RCX, RCX ; Clear registry
XOR RDX, RDX ; Clear registry
MOV RCX, 33 ; Assign reg/imm
MOV RDX, RCX ; Assign reg/reg
MOV RCX, var ; Assign reg/mem
MOV var, RDX ; assign mem/reg
MOV RCX, 78 ; Did it exit afterwards?
call ExitProcess
main ENDP
END
batch file to build it with
@echo off
set appname=test2
if exist %appname%.obj del %appname%.obj
if exist %appname%.exe del %appname%.exe
\masm64\bin64\ml64.exe /c /nologo %appname%.asm
\masm64\bin64\link.exe /SUBSYSTEM:WINDOWS /ENTRY:main /nologo %appname%.obj
dir %appname%.*
pause
ran in x64dbg debugger, no issues while running from the debugger, or while stepping the code in the debugger.
Something wrong on your end, it seems.
Yeah no problem here building with VS.
Interestingly the fault address (ending in 1039) is the address of the CALL ExitProcess line.
Could you go to Project>Properties and copy the ML64 and LINK command lines?
One thing, and pardon my ignorance of things 64-bit, but is this right in a 64-bit program?
INCLUDELIB kernel32.lib
I went to Project->Properties->Microsoft Macro Assembler->Command Line:
ml64.exe /c /nologo /Zi /Fo"TestASM\x64\Debug\%(FileName).obj" /W3 /errorReport:prompt /Ta
And in ...->Linker->Command Line:
/OUT:"C:\C++\TestASM\x64\Debug\TestASM.exe" /MANIFEST /NXCOMPAT /PDB:"C:\C++\TestASM\x64\Debug\TestASM.pdb" /DYNAMICBASE "kernel32.lib" "user32.lib" "gdi32.lib" "winspool.lib" "comdlg32.lib" "advapi32.lib" "shell32.lib" "ole32.lib" "oleaut32.lib" "uuid.lib" "odbc32.lib" "odbccp32.lib" /DEBUG /MACHINE:X64 /ENTRY:"main" /INCREMENTAL /PGD:"C:\C++\TestASM\x64\Debug\TestASM.pgd" /SUBSYSTEM:CONSOLE /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /ManifestFile:"TestASM\x64\Debug\TestASM.exe.intermediate.manifest" /LTCGOUT:"TestASM\x64\Debug\TestASM.iobj" /ERRORREPORT:PROMPT /ILK:"TestASM\x64\Debug\TestASM.ilk" /NOLOGO /TLBID:1
Apart from file paths/names they're my command lines too.
Can you zip the exe and pdb (or better still the whole solution) and attach it here?
Attached the whole project folder.
No problems running the exe you built and the exe the project built.
Quote
Exception thrown at 0x00007FF6A28C1031 in TestASM.exe: 0xC0000005: Access violation writing location 0xFFFFFFFFFFFFFFF8
Exception thrown at 0x00007FF691751039 in TestASM.exe: 0xC0000005: Access violation writing location 0x00000000542FF947.
Interesting the differences. Did you change any code?
Also why would ExitProcess write to memory?
One thing to try, add two lines, but I don't see how it will fix the problem
call ExitProcess
add rsp,28h
ret
main ENDP
Quote from: sinsi on September 18, 2024, 04:00:23 PM
Also why would ExitProcess write to memory?
Freeing handles? memory? GDI objects? other housekeeping?
Quote from: NoCforMe on September 18, 2024, 11:27:58 AM
One thing, and pardon my ignorance of things 64-bit, but is this right in a 64-bit program?
INCLUDELIB kernel32.lib
Yes, the linker (in VS at least) looks in the correct folder.
The "32" is probably Microsoft's attempt in the early days to make it easier to port 32 to 64.
The DLL is called kernel32.dll, even the 64-bit version.
As usual with Microsoft, the 64-bit DLL is in System32 and the 32-bit version is in SysWOW64 :rolleyes:
Quote from: NoCforMe on September 18, 2024, 04:23:30 PM
Freeing handles? memory? GDI objects? other housekeeping?
Or a typical Windows error code that doesn't actually tell you the error.
Nope, I didn't change the code, just ran the same program from my home computer to test the setup on my work computer. The only difference is that one is the community version of Visual Studio and the other is the professional version. I wonder if something about my work computer is configured differently, or if some security feature is causing the issue. I have no idea why ExitProcess would be writing anything.
Tried the two lines of code you added, but it still gives me a thrown exception when debugging, but runs. I attached a snippet image just for giggles. Just one of those things I guess...
And here is an attached image with the extra two lines of code, still doing the same thing. Weird...
Can you post the Output window text? Maybe something else is getting loaded.
'TestASM.exe' (Win32): Loaded 'E:\Desktop\TestASM\x64\Debug\TestASM.exe'. Symbols loaded.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\ntdll.dll'. Symbols loaded without source information.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\kernel32.dll'. Symbols loaded without source information.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\KernelBase.dll'. Symbols loaded without source information.
The thread 23284 has exited with code 0 (0x0).
I have to push the stop debugging button to get the output window. Here it is:
'TestASM.exe' (Win32): Loaded 'C:\C++\TestASM\x64\Debug\TestASM.exe'. Symbols loaded.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\ntdll.dll'.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\kernel32.dll'.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\KernelBase.dll'.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\ctiuser.dll'.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\advapi32.dll'.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\msvcrt.dll'.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\sechost.dll'.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\rpcrt4.dll'.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\bcrypt.dll'.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\fltLib.dll'.
'TestASM.exe' (Win32): Loaded 'C:\Windows\System32\ucrtbase.dll'.
The thread 28492 has exited with code 0 (0x0).
'TestASM.exe' (Win32): Loaded 'C:\Program Files\Avecto\Privilege Guard Client\PGHook.dll'.
The thread 26332 has exited with code 0 (0x0).
Exception thrown at 0x00007FF75D3D1039 in TestASM.exe: 0xC0000005: Access violation writing location 0x00000000B51CFCA7.
The program '[1100] TestASM.exe' has exited with code 0 (0x0).
Quote
Privilege Guard Client
That's probably the culprit.
Are you running VS as administrator?
Does the exception occur if you build a release version?
Sometimes this helps:
Dependency Walker 2.2 (https://dependencywalker.com/)
ExitProcess function won't return, so no need for additional code after it.
Probably don't need these:
/PGD:"C:\C++\TestASM\x64\Debug\TestASM.pgd"
/LTCGOUT:"TestASM\x64\Debug\TestASM.iobj"
/ILK:"TestASM\x64\Debug\TestASM.ilk"
Unfortunately I don't think I can edit the Privilege Guard as it is a security application for my computer. I did run debugging in release mode and it didn't throw the exception and the program seemed to work? So maybe I can look in the release configuration for a difference in settings and might find something.
Really appreciate the help, I don't know if I can dig anymore on my work computer to solve the issue because I don't have full rights to edit stuff as administrator and might have to ask IT for some help digging things up, maybe an exception to some firewall or to the Privilege guard needs to be made.
Okay so I don't know what this does:
Project Properties->Configuration Properties->Advanced "Whole Program Optimization"
I changed the setting from "No whole program optimization." to "Use Link Time Code Generation" because that was a setting that differed between debug and release. If I run the debugger without any breakpoints, it runs and doesn't throw the exception. If I step through the program, as soon as it hits the "call ExitProcess" it starts saying that kernel32.pdb and ntdll.dll are not loaded, but as soon as I step out it finishes without the exception.
"kernel32.pdb contains the debug information required to find the source for the module kernel32.dll"
"ntdll.pdb contains the debug information required to find the source for the module ntdll.dll"
Dunno if that is exactly a full success or not, but it keeps the exception from being thrown. Might find some other settings that are different that might help change things.
The problem is stack alignment in ExitProcess.
Add NOSTACKFRAME after include \masm64\include64\masm64rt.inc to prevent the creation of stack frame code by PROC. The stack frame code aligns the stack to 16-byte, so SUB RSP, 28H misaligns the stack.
include \masm64\include64\masm64rt.inc
NOSTACKFRAME ; <-- Add this line
.data
var QWORD 100 ; initialize variable mem
.code
main PROC
SUB RSP, 28H ; Byte align the top of the stack to 16 bytes (keep this line)
Example code with SUB RSP, 28H and without masm64rt.inc
Testing with Depends.exe
Started "TESTASM.EXE" (process 0x11EB8) at address 0x000000013FF80000. Successfully hooked module.
Loaded "NTDLL.DLL" at address 0x0000000077960000. Successfully hooked module.
Loaded "KERNEL32.DLL" at address 0x0000000077740000. Successfully hooked module.
Loaded "KERNELBASE.DLL" at address 0x000007FEFD4D0000. Successfully hooked module.
DllMain(0x000007FEFD4D0000, DLL_PROCESS_ATTACH, 0x0000000000000000) in "KERNELBASE.DLL" called.
DllMain(0x000007FEFD4D0000, DLL_PROCESS_ATTACH, 0x0000000000000000) in "KERNELBASE.DLL" returned 1 (0x1).
DllMain(0x0000000077740000, DLL_PROCESS_ATTACH, 0x0000000000000000) in "KERNEL32.DLL" called.
DllMain(0x0000000077740000, DLL_PROCESS_ATTACH, 0x0000000000000000) in "KERNEL32.DLL" returned 1073217537 (0x3FF80001).
Injected "DEPENDS.DLL" at address 0x0000000074F30000.
Entrypoint reached. All implicit modules have been loaded.
DllMain(0x0000000074F30000, DLL_PROCESS_ATTACH, 0x00000000001AF8A0) in "DEPENDS.DLL" called.
DllMain(0x0000000074F30000, DLL_PROCESS_ATTACH, 0x00000000001AF8A0) in "DEPENDS.DLL" returned 1 (0x1).
DllMain(0x0000000074F30000, DLL_PROCESS_DETACH, 0x0000000000000001) in "DEPENDS.DLL" called.
DllMain(0x0000000074F30000, DLL_PROCESS_DETACH, 0x0000000000000001) in "DEPENDS.DLL" returned 1 (0x1).
DllMain(0x0000000077740000, DLL_PROCESS_DETACH, 0x0000000000000001) in "KERNEL32.DLL" called.
DllMain(0x0000000077740000, DLL_PROCESS_DETACH, 0x0000000000000001) in "KERNEL32.DLL" returned 1 (0x1).
DllMain(0x000007FEFD4D0000, DLL_PROCESS_DETACH, 0x0000000000000001) in "KERNELBASE.DLL" called.
DllMain(0x000007FEFD4D0000, DLL_PROCESS_DETACH, 0x0000000000000001) in "KERNELBASE.DLL" returned 4294828033 (0xFFFDE001).
Exited "TESTASM.EXE" (process 0x11EB8) with code 78 (0x4E).
without SUB RSP, 28H
Started "TESTASM.EXE" (process 0x11F30) at address 0x000000013F700000. Successfully hooked module.
Loaded "NTDLL.DLL" at address 0x0000000077960000. Successfully hooked module.
Loaded "KERNEL32.DLL" at address 0x0000000077740000. Successfully hooked module.
Loaded "KERNELBASE.DLL" at address 0x000007FEFD4D0000. Successfully hooked module.
DllMain(0x000007FEFD4D0000, DLL_PROCESS_ATTACH, 0x0000000000000000) in "KERNELBASE.DLL" called.
DllMain(0x000007FEFD4D0000, DLL_PROCESS_ATTACH, 0x0000000000000000) in "KERNELBASE.DLL" returned 1 (0x1).
DllMain(0x0000000077740000, DLL_PROCESS_ATTACH, 0x0000000000000000) in "KERNEL32.DLL" called.
DllMain(0x0000000077740000, DLL_PROCESS_ATTACH, 0x0000000000000000) in "KERNEL32.DLL" returned 1064304641 (0x3F700001).
Injected "DEPENDS.DLL" at address 0x0000000074F30000.
Entrypoint reached. All implicit modules have been loaded.
DllMain(0x0000000074F30000, DLL_PROCESS_ATTACH, 0x00000000002FF8D0) in "DEPENDS.DLL" called.
DllMain(0x0000000074F30000, DLL_PROCESS_ATTACH, 0x00000000002FF8D0) in "DEPENDS.DLL" returned 1 (0x1).
DllMain(0x0000000074F30000, DLL_PROCESS_DETACH, 0x0000000000000001) in "DEPENDS.DLL" called.
Second chance exception 0xC0000005 (Access Violation) occurred in "NTDLL.DLL" at address 0x00000000779CB666.
Test code:
INCLUDELIB kernel32.lib
ExitProcess PROTO
.data
var QWORD 100 ; initialize variable mem
.code
mainCRTStartup PROC
SUB RSP, 28H ; Reserve 32 bytes of shadow space and align the stack to 16 bytes
XOR RCX, RCX ; Clear register
XOR RDX, RDX ; Clear register
MOV RCX, 33 ; Assign reg/imm
MOV RDX, RCX ; Assign reg/reg
MOV RCX, var ; Assign reg/mem
MOV var, RDX ; assign mem/reg
MOV RCX, 78 ; Did it exit afterwards?
call ExitProcess
mainCRTStartup ENDP
END
build command:
ml64.exe TestASM.asm -link -subsystem:console
Quote from: TimoVJL on September 19, 2024, 01:43:56 PM
Example code with SUB RSP, 28H and without masm64rt.inc
So I tried this today, and it works both with and without the "sub rsp, 28h". Thank you for the solution. The biggest difference is this addition:
mainCRTStartup PROC
and :
mainCRTStartup ENDP
I am not familiar enough to understand what "mainCRTStartup" is. In the previous code I simply used "main", and in the linker advanced properties of the project I put in "main" as the entry point of the program. Why does mainCRTStartup work even without setting it as the entry point in Linker -> Advanced properties, and why is no exception thrown like before?
Quote from: Nate523 on September 20, 2024, 03:43:25 AM
it works both with and without the "sub rsp, 28h"
Not aligning the stack correctly is a recipe for disaster. Many Windows APIs work just fine, but every now and then you will get mysterious crashes...
Quote from: jj2007 on September 20, 2024, 05:59:31 PM
Not aligning the stack correctly is a recipe for disaster. Many Windows APIs work just fine, but every now and then you will get mysterious crashes...
Some people don't like facts, and now it is very well proven.
If even depends.exe shows that the program crashes, who are those people?
Quote from: TimoVJL on September 20, 2024, 11:09:13 PM
... who are those people ?
New members that do not yet fully understand the concept of stack alignment, shadow space, etc. and other specs in the 64 bit ABI.
You should not assume that everyone knows what 'depends.exe' is or what it does. New members often need a lot of help understanding the "how" and "why" things must be done a certain way, in words that they (as newbies) can understand.
I've never heard of depends.exe. What and where is it?
(Doesn't have anything to do with diapers, does it?)
Quote from: TimoVJL on September 20, 2024, 11:09:13 PM
some people don't like facts and now it is very well proved.
If even depends.exe show, that program crash, who are those people ?
Actually, being the newbie, I missed the last line of your depends.exe output, which shows the crash in ntdll.dll when the stack is not aligned. Thank you for providing that. I will have to download depends.exe on my personal computer and learn how to use it, because my work computer (the one I was having the issue with) didn't like it. I will definitely say I think I used the word "work" incorrectly. I was just trying to say that whether or not I align the stack, it runs, I get the exit code I was expecting, and I no longer get the exception I was running into in the Visual Studio environment. I understand that doesn't mean it worked, and what you showed definitely shows it didn't. Sorry for the confusion.
Still reading up on stack alignment and trying to grasp it. I will take your word, given your experience, that it is a bad thing not to do.
Stack alignment is pretty damn important to any computer system.
Probably the worst that can happen is that a subroutine return address gets lost or trashed. Think about how the CALL instruction works, which is how you access a function: the first thing it does is push the return address (the address of whatever instruction immediately follows the CALL) on the stack. Then it jumps to the entry point of the subroutine and starts executing it.
So let's say that in the meantime, while the code in the subroutine is running, the stack pointer (RSP in 64 bit) gets somehow changed. When the subroutine code hits its RET instruction, it pops the stack and retrieves the return address that was pushed on the stack. But if the wrong thing gets popped, then the return will be to somewhere in never-neverland, and you'll get an exception.
Just one of the many ways you can get into trouble with code ...
Makes sense when you say it, but I'm only sorta familiar with pointers and stacked lists in C++, and even though it seems similar, the CPU hardware stack seems more complex. No assembly book or source that I have acquired so far has said "Hey, align the stack before you do any programming"; alignment is at the end of one of them, so I am reading up on it. I appreciate a forum like this that can tell me to do so :biggrin:
Umm and I know I'm a newbie but is this a typo and you meant 32 bit? :
Quote from: NoCforMe on September 21, 2024, 03:09:07 PM
... the stack pointer (ESP in 64 bit)
Cause I am reading about the registers and I got a table here that says the RSP register is the 64 bit stack pointer, and ESP is the 32-bit portion of that register? Once again thanks for the explanation!
RSP, sorry. Typo.
Stacks: Better to have some pictures to illustrate, but follow my description and you should get the idea. It's not rocket surgery.
A stack is a structure implemented in computer memory. It's an area of memory set aside for this use. (Nothing special about that memory: it's the same memory used for your program's code and data. It's just used as a stack.)
So what is a stack? The classic illustration is something that we old folks were familiar with but you youngsters probably aren't: in cafeterias, when you went to get your food the first stop was at the dish stack. This was literally a stack of dishes that had a big spring underneath it so as people took plates off the top it would stay at the same level. (Do they still use those things? I don't remember seeing one of those in a long while.)
What you had here was what's called a LIFO stack, meaning "last in-first out". Imagine instead of taking a dish off this stack, you put a dish on top of the stack--let's say it's your favorite dish and you want to make sure you get it off the stack. Then when you want the dish, you take it off the stack.
Moving to computer usage here, putting the dish on the stack is called a "push".
Taking a dish off the stack is called a "pop".
So why use a stack? What's the point?
Well, it's a very useful data storage scheme. Let's say you're coding a routine that uses some registers--let's call them A, B and C. But in that routine you need to call some other routine that, unfortunately, trashes those registers. What to do?
Well, you could set up some variables to store those registers in, say SaveA, SaveB and SaveC, move the registers to those variables, call the trashing routine, then when it returns restore the registers from those variables.
That would work fine. But using the stack is much easier. Because the processor has instructions designed specifically to deal with the stack. Like the dish example, there are two instructions:
- PUSH pushes a value (register, variable or immediate value) on the stack
- POP pops the top of the stack into a register or variable
So in the example with the trashing routine, you'd do this to save those registers (this is pseudocode for some imaginary processor):
PUSH A
PUSH B
PUSH C
CALL TrashingRoutine
POP C
POP B
POP A
This saves those 3 registers on the stack, calls the routine, then restores the registers from the stack.
Notice something important: the POPs are in reverse order of the PUSHes. Think about why this is; think about the stack of dishes and the order they are in when you put 3 dishes on the stack and then take those 3 dishes off; what order do they appear in? (Hint: LIFO.)
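For a concrete x64 version of the save/restore pseudocode, here is a minimal sketch in ML64 syntax. The file name pushpop.asm, the routine name Trasher and the values 11h and 22h are made up for illustration, and RCX and RDX stand in for the imaginary registers. The extra sub rsp, 28h before each call is an assumption of the Win64 rules discussed elsewhere in this thread (16-byte alignment plus 32 bytes of shadow space), not part of the save/restore idea itself.

```asm
; ml64 /c pushpop.asm
; link /entry:main /subsystem:console pushpop.obj
INCLUDELIB kernel32.lib
option casemap:none

EXTERN ExitProcess:PROC

.code
; Hypothetical routine that destroys RCX and RDX.
Trasher PROC
    xor  ecx, ecx
    xor  edx, edx
    ret
Trasher ENDP

main PROC
    mov  ecx, 11h
    mov  edx, 22h
    push rcx             ; RSP = RSP - 8, then [RSP] = RCX
    push rdx             ; RSP = RSP - 8, then [RSP] = RDX
    sub  rsp, 28h        ; shadow space + realign RSP to 16 for the call
    call Trasher         ; RCX and RDX are now zero
    add  rsp, 28h        ; undo the adjustment so the pops find our values
    pop  rdx             ; last pushed, first popped: RDX = 22h again
    pop  rcx             ; RCX = 11h again
    add  ecx, edx        ; ECX = 11h + 22h = 33h, used as the exit code
    sub  rsp, 28h        ; same adjustment before calling into Windows
    call ExitProcess
main ENDP
END
```

If the two pops were swapped, RCX and RDX would come back exchanged, which is the LIFO point made above.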
So moving to computer programming, the stack is used a lot. You can use it as shown above to save and restore registers (or memory variables). It's used every time you call a function: the CALL instruction pushes the return address, and the caller may also pass some of the parameters (the values passed to the function) on the stack.
Detail: how does the processor keep track of the stack?
RSP is the key here. It's called the stack pointer, and it always points to the top of the stack. It gets changed as PUSHes and POPs are executed. That's why it's so important to make sure that this register is set correctly.
So does that make sense? If you think this through you'll see it's not that complicated.
@Nate523
Just read what other users write; sinsi described the problem for you, and I just tried to give everyone a tool to check it.
Other users didn't do their best to support you, even though they are MASM programmers, which I am not.
Quote from: Nate523 on September 21, 2024, 02:26:02 PM
Still reading up on stack alignment and trying to grasp it, I will take your word with your experience that it is a bad thing to not do.
It's not just "a bad thing to not do", it's a requirement of the Win64 ABI.
In your code you can do what you want, but as soon as you talk to Windows you have to abide by their rules.
It is just easier to apply the rules even to your own internal code.
Quote from: NoCforMe on September 21, 2024, 05:20:38 PM
RSP, sorry. Typo.
Stacks: Better to have some pictures to illustrate, but follow my description and you should get the idea. It's not rocket surgery.
A stack is a structure implemented in computer memory. It's an area of memory set aside for this use. (Nothing special about that memory: ...
Thank you for taking the time to give me that detailed explanation. Took a bit to write I'm sure, and I appreciate everyone's time in helping me with these things. I think I might have misunderstood that the solution sinsi gave to align the stack was a one-time deal; after your explanation it's sounding more like it's something I have to constantly do when using assembly, to ensure that I don't, say, take a sledgehammer to the stack and misalign everything?
Quote from: sinsi on September 21, 2024, 10:56:49 PM
Quote from: Nate523 on September 21, 2024, 02:26:02 PM
Still reading up on stack alignment and trying to grasp it, I will take your word with your experience that it is a bad thing to not do.
It's not just "a bad thing to not do", it's a requirement of the Win64 ABI.
In your code you can do what you want, but as soon as you talk to Windows you have to abide by their rules.
It is just easier to apply the rules even to your own internal code.
I missed that it was an absolute requirement when you posted about it earlier, sorry for that, appreciate the clarification. I'll note it as a do or die thing for right now. Is the instruction that you gave:
sub RSP, 28h
the only thing that needs to be done to align the stack, or in the future will I have to align it in different ways based on how I am manipulating data?
Quote from: Nate523 on September 22, 2024, 01:35:11 AM
... the only thing that needs to be done to align the stack, or in the future will I have to align it in different ways based on how I am manipulating data?
The usual way is to allocate space for any WinAPI calls and align RSP to 16 on proc entry, then not alter RSP at all.
For example, CreateWindowEx takes 12 arguments, so you need to reserve 12*8 bytes, plus 8 for alignment.
Other functions might have fewer than 4 (GetProcessHeap for one) but you still must allocate 4*8 bytes as per the ABI.
The MASM64 include files have a custom prologue/epilogue which should take care of everything for you - I roll my own so I can't comment on its usefulness.
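As a rough illustration of "adjust RSP once on entry, then leave it alone", here is a minimal ML64 sketch. The file name heap.asm, the variable name hHeap and the 64-byte allocation size are assumptions made up for the example. GetProcessHeap takes no arguments and HeapAlloc and HeapFree take three each, so the largest call needs only the mandatory 4*8 bytes, plus 8 for alignment.

```asm
; ml64 /c heap.asm
; link /entry:main /subsystem:console heap.obj
INCLUDELIB kernel32.lib
option casemap:none

EXTERN GetProcessHeap:PROC
EXTERN HeapAlloc:PROC
EXTERN HeapFree:PROC
EXTERN ExitProcess:PROC

.data
hHeap dq 0               ; illustrative global holding the heap handle

.code
main PROC
    sub  rsp, 28h        ; 4*8 shadow space + 8 for alignment, done once
    call GetProcessHeap  ; no arguments, but the shadow space is still required
    mov  hHeap, rax
    mov  rcx, rax        ; hHeap
    xor  edx, edx        ; dwFlags = 0
    mov  r8d, 64         ; dwBytes (arbitrary size for the example)
    call HeapAlloc       ; RSP untouched between calls
    mov  rcx, hHeap      ; hHeap
    xor  edx, edx        ; dwFlags = 0
    mov  r8, rax         ; lpMem = block returned by HeapAlloc
    call HeapFree
    xor  ecx, ecx        ; exit code 0
    call ExitProcess
main ENDP
END
```

A procedure that also called a 12-argument function such as CreateWindowEx would instead reserve 12*8 + 8 = 68h bytes at the top and still never touch RSP afterwards.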
Quote from: sinsi on September 22, 2024, 02:09:49 AM
Quote from: Nate523 on September 22, 2024, 01:35:11 AM
... the only thing that needs to be done to align the stack, or in the future will I have to align it in different ways based on how I am manipulating data?
The usual way is to allocate space for any WinAPI calls and align RSP to 16 on proc entry, then not alter RSP at all.
For example, CreateWindowEx takes 12 arguments, so you need to reserve 12*8 bytes, plus 8 for alignment.
Other functions might have fewer than 4 (GetProcessHeap for one) but you still must allocate 4*8 bytes as per the ABI.
The MASM64 include files have a custom prologue/epilogue which should take care of everything for you - I roll my own so I can't comment on its usefulness.
Thank you for the clarification :thup:
Hi Nate523,
Quote by sinsi: The usual way is to allocate space for any WinAPI calls and align RSP to 16 on proc entry, then not alter RSP at all. For example, CreateWindowEx takes 12 arguments, so you need to reserve 12*8 bytes, plus 8 for alignment.
I agree with sinsi about "The Usual Way", but I use a trick from university - "The Non-Usual Way".
1. Don't touch the stack - it's aligned from the start.
2. Forget about push, pop, invoke and macros, sub rsp, number, add rsp, number
3. Use ONLY global variables for arguments
I'm using this in my big Example64 project and will post the source code when I'm done.
Quote from: ognil on September 22, 2024, 05:21:17 AM
2. Forget about push, pop, invoke
So your solution is to never use PUSH, POP or INVOKE?
How quaint. Your programs must be marvels of ingenuity (not).
And you look down on us? Feh ...
OP: Don't pay any attention to this guy.
Quote from: ognil on September 22, 2024, 05:21:17 AM
3. Use ONLY global variables for arguments
So you must never write any recursive code, yes?
Do you even know what that is?
Hi ognil,
Quote:
1. Don't touch the stack - it's aligned from the start.
Consider the simple 64-bit example below: why is 28h subtracted from rsp instead of 4*8 = 20h?
option casemap:none

EXTERN MessageBoxA:proc
EXTERN ExitProcess:proc

.data
text    db 'Hello world', 0
caption db '64-bit test', 0

.code
main PROC
    sub  rsp,28h        ; the stack adjustment in question
    xor  r9d,r9d        ; uType = MB_OK (0)
    lea  r8,caption     ; lpCaption
    lea  rdx,text       ; lpText
    xor  rcx,rcx        ; hWnd = NULL
    call MessageBoxA
    xor  ecx, ecx       ; exit code 0
    call ExitProcess
main ENDP
END
Thank you Vortex for the perfect example :thumbsup:
Quote from: ognil on September 22, 2024, 05:21:17 AM
1. Don't touch the stack - it's aligned from the start.
Only if you use a custom prologue which does it for you.
Quote from: ognil on September 22, 2024, 05:21:17 AM
2. Forget about push, pop, invoke and macros, sub rsp, number, add rsp, number
If you know what you are doing, these instructions are perfectly fine to use.
Quote from: ognil on September 22, 2024, 05:21:17 AM
3. Use ONLY global variables for arguments
Sorry, that's just wrong. Like NoCforMe said, what about recursion? Or multithreading?
We are talking about bare ASM, the mechanics of 64-bit programming using ML64.
ognil,
I hope you are aware of the difference of my code and yours.
Thank you Vortex,
The difference is explained well in tenkey's post on page 3.
ognil, the original post (and problem) didn't use masm64rt.inc so stack adjustment was needed.
Quote from: sinsi on September 22, 2024, 07:55:42 AM
We are talking about bare ASM, the mechanics of 64-bit programming using ML64.
Thank you sinsi, I already agreed with you - see:
Quote from sinsi: The Usual Way is to allocate m...
and my:
'I agree with sinsi about "The Usual Way".....'
:smiley:
Windows has a loader that loads your program from disk into memory. After that, it does some management, relocation if necessary, and calls the entry point of your program.
So, if you look at the RSP register value at the entry point (mainCRTStartup), it will be something like: RSP=???????????????8h
The question marks mean that the RSP digits on your machine can differ from those on mine.
But the Windows rule says: align the stack to 16. Remember, the stack grows from high addresses down to low ones.
Look at the rightmost hexadecimal digit of RSP. Is it 8 or 0?
If it is 8, the stack is not aligned to 16; if it is 0, the stack is aligned to 16.
Write down multiples of 16 and see the last nibble in hexadecimal base:
16*0 = 00h
16*1 = 10h
16*2 = 20h
16*3 = 30h
...
All the multiplications above end in ?0h. So, at the entry point of our program, we need to subtract (moving downward) 1*8 from the RSP register to align it.
So OK, the stack is aligned to 16.
It's not done yet. There is a thing called "shadow space". We need to reserve 4*8 bytes for it before calling an external function.
Conveniently, 4*8 in hexadecimal is 20h. The stack was already aligned, so if we subtract 20h from it, it stays aligned to 16.
This is why sub rsp,28h means:
sub rsp,(1*8)+(4*8)
After all is done, we need to restore the stack value, so we add:
add rsp,(1*8)+(4*8)
;ml64 /c test.asm
;link /entry:mainCRTStartup /subsystem:console test.obj
;test
option casemap:none
.code
mainCRTStartup PROC
                        ;rsp == ???????????????8
    sub rsp,1*8         ;rsp == ???????????????0
    sub rsp,4*8         ;rsp == ???????????????0
    ;The stack is aligned; from this point we can call an external function.
    ;If that function takes more than 4 parameters (rcx,rdx,r8,r9), the Windows rule says the remaining
    ;parameters are passed on the stack, which means we need to do the stack arithmetic (alignment) again.
    add rsp,4*8         ;rsp == ???????????????0
    add rsp,1*8         ;rsp == ???????????????8
    ret
mainCRTStartup ENDP
END
So, if you aligned the stack correctly, what problems can happen in the future?
Well, we need to call external functions contained in libraries (.dll).
Most of the parameters those functions take are qwords (8 bytes).
So, let's write it down again:
8*0 = 0h
8*1 = 8h
8*2 = 10h
8*3 = 18h
...
So, comparing the multiples of 8 and 16, we can conclude that we need an even number of parameter slots rather than an odd number.
If a function uses 6,8,10,12,... parameters, that's OK; the stack stays aligned.
If a function uses 5,7,9,11,... parameters, we need to store the last parameter(s) on the stack, and this will unalign our aligned stack.
So, be even: with an odd count, reserve one extra 8-byte slot as padding.
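To make the odd-parameter case concrete, here is a minimal ML64 sketch that calls WriteFile, which takes five parameters. The file name write5.asm and the names szMsg, MSG_LEN and dwWritten are made up for illustration. Following the scheme above, it reserves 8 bytes to realign plus six 8-byte slots (four shadow slots, the fifth parameter, one padding slot), which gives 38h.

```asm
; ml64 /c write5.asm
; link /entry:main /subsystem:console write5.obj
INCLUDELIB kernel32.lib
option casemap:none

EXTERN GetStdHandle:PROC
EXTERN WriteFile:PROC
EXTERN ExitProcess:PROC

STD_OUTPUT_HANDLE EQU -11

.data
szMsg     db 'Hello, stack', 13, 10
MSG_LEN   EQU $ - szMsg          ; byte count of szMsg
dwWritten dd 0

.code
main PROC
    sub  rsp, 38h                ; 1*8 to realign + 6*8 (5 parameters rounded up to even)
    mov  ecx, STD_OUTPUT_HANDLE
    call GetStdHandle
    mov  rcx, rax                ; parameter 1: hFile
    lea  rdx, szMsg              ; parameter 2: lpBuffer
    mov  r8d, MSG_LEN            ; parameter 3: nNumberOfBytesToWrite
    lea  r9, dwWritten           ; parameter 4: lpNumberOfBytesWritten
    mov  qword ptr [rsp+20h], 0  ; parameter 5: lpOverlapped = NULL, just above the shadow space
    call WriteFile
    xor  ecx, ecx                ; exit code 0
    call ExitProcess
main ENDP
END
```

The fifth parameter goes at [rsp+20h] because the first 20h bytes above RSP are the shadow space for the four register parameters.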
Hi Vortex,
Here is a small example.
Thank you!