Just for fun, a little demo showing the virtues of len() in combination with VirtualAlloc :biggrin:
include \masm32\include\masm32rt.inc
.code
start:
invoke VirtualAlloc, 0, 4096, MEM_COMMIT, PAGE_READWRITE
xchg eax, edi
mov esi, InputFile("\Masm32\include\Windows.inc")
invoke RtlMoveMemory, edi, esi , 4095 ; we leave a nullbyte at the end of the committed memory
print str$(len(esi)), " bytes in esi", 13, 10
print str$(len(edi)), " bytes in edi", 13, 10
invoke RtlMoveMemory, edi, esi , 4096 ; no nullbyte...
print str$(len(edi)), " bytes in edi", 13, 10
MsgBox 0, "q.e.d.", "Test of len():", MB_OK
exit
end start
977412 bytes in esi
4095 bytes in edi
4096 bytes in edi
???
include \masm32\include\masm32rt.inc
.code
start:
invoke GlobalAlloc, GPTR, 4097
xchg eax, edi
mov esi, InputFile("\Masm32\include\Windows.inc")
invoke RtlMoveMemory, edi, esi , 4095 ; we leave a nullbyte at the end of the committed memory
print str$(len(esi)), " bytes in esi", 13, 10
print str$(len(edi)), " bytes in edi", 13, 10
invoke RtlMoveMemory, edi, esi , 4096 ; no nullbyte...
print str$(len(edi)), " bytes in edi", 13, 10
invoke GlobalFree, edi
MsgBox 0, "q.e.d.", "Test of len():", MB_OK
exit
end start
Used GlobalAlloc, instead. :tongue:
edit= forgot to free the global allocated memory. :biggrin:
Also, and more importantly, the buffer needs to be big enough to hold the desired size plus one character for zero termination, else it will give erroneous results.
I changed the buffer size by one byte to accommodate the zero termination. Not a len() bug, imo.
Why do you think I used VirtualAlloc instead of GlobalAlloc? See the Instring thread (https://masm32.com/board/index.php?topic=11088.msg122376#msg122376).
Quote from: jj2007 on August 08, 2023, 09:46:04 PM
len() in combination with VirtualAlloc
Also, I didn't write it was a len() bug. I wrote that len() crashes under certain circumstances.
Quote from: jj2007 on August 09, 2023, 12:30:48 AM
Also, I didn't write it was a len() bug. I wrote that len() crashes under certain circumstances.
I seem to have inferred it. Apologies.
I rarely use anything other than GlobalAlloc btw. Hasn't failed me. (as long as the buffer is big enough for what I need it for)
The point is that GlobalAlloc and HeapAlloc are tolerant: there are some trailing bytes, typically eight 0ABh bytes, that you (or the len macro) can read ("DWO" is the end of the 4096 bytes):
Address Hex dump ASCII
0052D508 20 20 20 20|20 74 79 70|65 64 65 66|20 44 57 4F| typedef DWO
0052D518 AB AB AB AB|AB AB AB AB|00 00 00 00|00 00 00 00| ««««««««
0052D528 77 12 37 0C|B1 D6 00 00|C4 00 4F 00|00 61 4F 00| w␒7␌±Ö Ä O aO
0052D538 EE FE EE FE|EE FE EE FE|EE FE EE FE|EE FE EE FE| îþîþîþîþîþîþîþîþ
VirtualAlloc has no such tolerance, so len() crashes when trying to examine byte #4097 :sad:
Hi
It could be a canary to detect buffer overflows.
Certainly an undocumented feature that can change at any time.
Biterider
Quote from: Biterider on August 09, 2023, 02:20:10 AM
It could be a canary to detect buffer overflows.
Certainly an undocumented feature that can change at any time.
Attention, this is The Campus - please don't confuse n00bs who consult this area (this is neither a feature, nor can it be used to detect buffer overflows).
The len() macro crashes with the example code posted above because VirtualAlloc returns memory organised in 4096 byte pages. So if code (e.g. len) tries to see if byte #4097 is a nullbyte, i.e. the end of a string, it "hits the wall" and raises an exception.
This will not happen with strings allocated with GlobalAlloc or HeapAlloc, because (as explained above) they always return more bytes than requested.
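For what it's worth, here is a minimal sketch of one way around the crash, assuming you control the allocation: commit one page more than the data needs, so that the terminating nullbyte sits in committed memory. It relies on the documented behaviour that pages handed out by VirtualAlloc are zero-initialised. The 8192, the MEM_RESERVE flag and the NoMem label are my own choices for illustration, not taken from the demo above.

```asm
include \masm32\include\masm32rt.inc
.code
start:
  ; commit two pages: 4096 bytes for the data plus one spare page for the terminator
  invoke VirtualAlloc, 0, 8192, MEM_COMMIT or MEM_RESERVE, PAGE_READWRITE
  test eax, eax
  jz NoMem                                ; allocation failed, nothing to test
  xchg eax, edi
  mov esi, InputFile("\Masm32\include\Windows.inc")
  invoke RtlMoveMemory, edi, esi, 4096    ; fills the first page completely
  ; byte #4097 is the first byte of the second page, still zero from VirtualAlloc
  print str$(len(edi)), " bytes in edi", 13, 10
  invoke VirtualFree, edi, 0, MEM_RELEASE ; size must be 0 with MEM_RELEASE
NoMem:
  exit
end start
```

With the spare page in place, len() should find its nullbyte at offset 4096 instead of hitting the wall. The price is a whole extra page per buffer, which is one reason VirtualAlloc is an unusual choice for ordinary strings.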
Hi JJ
Please only refer to the official API documentation, in this particular case heapapi (https://learn.microsoft.com/en-us/windows/win32/api/heapapi/nf-heapapi-heapalloc#:~:text=If%20the%20HeapAlloc%20function%20succeeds%2C%20it%20allocates%20at%20least%20the%20amount%20of%20memory%20requested.). In the comments, it clearly says "at least", which means that it is possible that exactly the requested amount of memory will be allocated.
Everything else is pure speculation (which will confuse newcomers). Instead, encourage people to read the documentation carefully.
Biterider
Quote from: Biterider on August 09, 2023, 06:17:23 AM
it is possible that exactly the requested amount of memory will be allocated.
It is not only possible, it is normal. However, you can read approx. 30-40 bytes beyond the HeapAlloc'ed memory, otherwise tons of software would miserably fail.
include \masm32\include\masm32rt.inc
.code
start:
cls
if 0
invoke GlobalAlloc, 0, 4089
else
invoke VirtualAlloc, 0, 4096, MEM_COMMIT, PAGE_READWRITE
endif
xchg eax, edi
mov esi, InputFile("\Masm32\include\Windows.inc")
lea eax, [edi+4096-16]
; int 3 ; have a look in the debugger
invoke RtlMoveMemory, edi, esi , 4095 ; we leave a nullbyte at the end of the committed memory
invoke lstrlen, esi
print str$(eax), " lstrlen bytes in esi (the full Windows.inc)", 13, 10
invoke lstrlen, edi
print str$(eax), " lstrlen bytes in edi", 13, 10
invoke szLen, edi
print str$(eax), " szLen bytes in edi", 13, 10
invoke RtlMoveMemory, edi, esi , 4096 ; no nullbyte...
print "trying lstrlen:", 13, 10
invoke lstrlen, edi
print str$(eax), " lstrlen bytes in edi (0 is wrong, should be 4096)", 13, 10
print "trying szLen:", 13, 10
invoke szLen, edi
print str$(eax), " szLen bytes in edi", 13, 10
MsgBox 0, "q.e.d.", "Test of len():", MB_OK
exit
end start
Output:
977412 lstrlen bytes in esi (the full Windows.inc)
4095 lstrlen bytes in edi
4095 szLen bytes in edi
trying lstrlen:
0 lstrlen bytes in edi (0 is wrong, should be 4096)
trying szLen:
Both lstrlen and szLen yield wrong results with VirtualAlloc (but not with GlobalAlloc: it always adds some bytes extra...); szLen crashes with an exception, just as the len() macro; same for
invoke crt_strlen, edi
And the reason to use VirtualAlloc for strings is ... ¿?
Under Linux (Wine), your program stops with an access violation exactly in the part that compares bytes to zero:
trying lstrlen:
Why do you overwrite the end of the string?
.data
binary db "string"
string db "string",0
Quote from: HSE on August 09, 2023, 09:03:14 AM
And the reason to use VirtualAlloc for strings is ... ¿?
Yep, that's the whole point: Biterider says the fast pcmpistri algo is unsafe (https://masm32.com/board/index.php?topic=11088.msg122335#msg122335) because it goes beyond the allocated memory. So I made some tests, and it turns out that
a) he is right, if it's VirtualAlloc'ed memory, and
b) that applies also to szLen, crt_strlen and probably other functions that need to read until the last byte and beyond.
I made some tests with randomly HeapAlloc'ed memory, 80,000 allocations between 3 and 50k bytes, and you can read approximately 30 bytes beyond the allocated zone without risking an exception. If you write a single byte beyond, you are in trouble, though. Since functions like strlen() and instr() don't have to write, they are safe for HeapAlloc'ed strings.
Biterider is right, of course, that this is not documented. However, lstrlen is an official Windows function, and fails for VirtualAlloc'ed strings if there are no nullbytes in the last allocated page. If it also failed for HeapAlloc'ed strings, Windows would be in deep trouble. Therefore I consider pcmpistri to be safe for strings, as long as they are HeapAlloc'ed.
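A sketch of how a string scanner can read past the terminator without ever touching the next page, whichever allocator was used: read only aligned 16-byte blocks, because an aligned block never straddles a 4096-byte page boundary. This is the generic SSE2 technique, not the pcmpistri algo from the Instring thread. SafeLen and TestStr are hypothetical names, and the code assumes an assembler that knows SSE2 (e.g. ML 6.15 or later, UAsm, JWasm).

```asm
include \masm32\include\masm32rt.inc
.686p
.xmm

.data
TestStr db "Hello, Campus", 0

.code
SafeLen proc pStr:DWORD
  mov edx, pStr
  mov ecx, edx
  and edx, -16          ; edx = aligned 16-byte block that holds the first byte
  and ecx, 15           ; ecx = offset of the string inside that block
  pxor xmm0, xmm0       ; sixteen zero bytes to compare against
  movdqa xmm1, [edx]    ; aligned read, stays inside one page
  pcmpeqb xmm1, xmm0    ; 0FFh wherever a nullbyte was found
  pmovmskb eax, xmm1    ; one bit per byte
  shr eax, cl           ; discard the bits for bytes in front of the string
  test eax, eax
  jnz InFirstBlock
@@:
  add edx, 16           ; next aligned block
  movdqa xmm1, [edx]
  pcmpeqb xmm1, xmm0
  pmovmskb eax, xmm1
  test eax, eax
  jz @B                 ; no nullbyte yet, keep scanning
  bsf eax, eax          ; position of the nullbyte inside this block
  add eax, edx          ; address of the nullbyte
  sub eax, pStr         ; minus start address = length
  ret
InFirstBlock:
  bsf eax, eax          ; after the shift, bit n stands for byte n of the string
  ret
SafeLen endp

start:
  invoke SafeLen, addr TestStr
  print str$(eax), " bytes according to SafeLen", 13, 10
  exit
end start
```

The scanner only moves on to a new block when the current one holds no nullbyte, so it enters a new page only when the string itself continues there. A buffer with no terminator at all, like the 4096-byte case above, will still fault at the end of the committed memory.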
Quote
If the GlobalAlloc function succeeds, it allocates at least the amount of memory requested.
If the actual amount allocated is greater than the amount requested, the process can use the entire amount.
invoke GlobalAlloc, 0, 4089
850039 lstrlen bytes in esi (the full Windows.inc)
4095 lstrlen bytes in edi
4095 szLen bytes in edi
trying lstrlen:
4099 lstrlen bytes in edi (0 is wrong, should be 4096)
trying szLen:
4099 szLen bytes in edi
message box
invoke GlobalAlloc, 0, 0
850039 lstrlen bytes in esi (the full Windows.inc)
4095 lstrlen bytes in edi
4095 szLen bytes in edi
trying lstrlen:
4096 lstrlen bytes in edi (0 is wrong, should be 4096)
trying szLen:
4096 szLen bytes in edi
wine: Unhandled page fault on write access to 3D2D3D31 at address 7BC2A1A2 (thread 06b4), starting debugger...
06b4:err:seh:NtRaiseException Unhandled exception code c0000005 flags 0 addr 0x7bc2a1a2
no message box
Windows OS gives zeroed heap to program, so there will be zeroes in first use.
Quote from: mineiro on August 09, 2023, 11:57:21 AM
Quote
If the GlobalAlloc function succeeds, it allocates at least the amount of memory requested.
If the actual amount allocated is greater than the amount requested, the process can use the entire amount.
...
wine: Unhandled page fault on write access
Unfortunately, the docs are not very clear on that: truth is that with HeapAlloc, you can "use" more than you requested, but only reading the memory.
Quote from: TimoVJL on August 09, 2023, 04:36:42 PM
Windows OS gives zeroed heap to program, so there will be zeroes in first use.
If you used the HEAP_ZERO_MEMORY flag. If not, you may have zeroes but there is no guarantee.
I made a testbed demonstrating that you can read dword ptr [eax+28], where eax points to the first byte after a HeapAlloc'ed buffer. The "tolerance" is the same for Win XP, Win 7-64 and Win 10, but I would be grateful for tests on other machines.
As noted in previous posts, there is zero tolerance for VirtualAlloc'ed memory, and writing beyond the HeapAlloc'ed buffer is also not allowed by the OS. But reading is ;-)
mov edi, offset handles
push edi
push loops-1 ; 80,000 tests
.Repeat
lea eax, [1+Rand(50000)]
;and eax, 0FFFFFFF0h
stosd
add total, eax
invoke HeapAlloc, MbProHeap, flags, eax
stosd
Print Str$("%i MB", total/1048576), Cr$
dec stack
.Until Sign?
pop edx
pop esi
Print Str$("Allocated %3f GB - only extreme sizes are shown below\n", total/40000000h)
push loops-1
.Repeat
lodsd
mov elsize, eax
lodsd
xchg eax, edi
invoke HeapSize, MbProHeap, flags, edi
.if eax<=10 || eax>49990
fdeb 4, "size", eax ; only extreme sizes
.endif
lea ecx, [eax+edi]
mov edx, [eax+edi+28] ; read the ABABABAB behind the allocated memory; +28 is ok for Win XP, Win7-64 and Win 10
sub eax, elsize
.if dword ptr pcTable[-4]
; ifidni @Environ(oDebug), <1> ; OPT_Debug 1 ; 1=use \Masm32\MasmBasic\Res\DebugHeap.exe
echo debugging is ON
.if dword ptr [ecx+4]!=0ABABABABh ; the ABABs are there when a) Olly is launched or b) debugheap.exe is running
fdeb 1, "No ABAB found", x:edx, eax
; mov dword ptr [ecx], Mirror$("Ciao")
.endif
;else
; echo debugging is OFF
.endif
.if eax
Print Str$("allocated-requested=%i\n", eax)
.endif
invoke HeapFree, MbProHeap, flags, edi
; fdeb 7, "HF", $Err$()
dec stack
.Until Sign?
pop edx
Inkey Str$("Released %3f GB\n", total/40000000h)
Allocated 1.86 GB - only extreme sizes are shown below
size eax 2
size eax 49994
size eax 7
size eax 50000
size eax 1
size eax 6
size eax 49999
size eax 3
size eax 4
size eax 49991
size eax 2
size eax 2
size eax 49998
size eax 49997
size eax 10
size eax 3
size eax 49997
size eax 3
size eax 7
size eax 1
size eax 49999
size eax 6
size eax 6
size eax 2
size eax 49997
size eax 3
size eax 10
size eax 49993
size eax 49998
size eax 49997
size eax 49996
size eax 49999
size eax 9
Released 1.86 GB
Quote from: jj2007 on August 09, 2023, 10:35:59 AM
Yep, that's the whole point: Biterider says the fast pcmpistri algo is unsafe (https://masm32.com/board/index.php?topic=11088.msg122335#msg122335) because it goes beyond the allocated memory
No, that's not the point. I clearly wrote that if you try to access an uncommitted page, the processor will raise a fault.
This is enforced by the hardware!
Quote from: jj2007 on August 09, 2023, 06:45:20 AM
It is not only possible, it is normal. However, you can read approx. 30-40 bytes beyond the HeapAlloc'ed memory, otherwise tons of software would miserably fail.
It's not "normal", just a possible case. The allocator gives the user what best fits its allocation granularity. There may be some space before and after, a control block, or other structures to protect the heap, but this is the operating system's responsibility and its implementation is subject to change. The only thing that is certain is what the documentation dictates. It's like some kind of contract.
The history of software development is full of examples where this rule has been broken, with catastrophic consequences. If the documentation says you can do something, then do it. All other statements based on experiments may hold for this or another OS version, but there is absolutely no guarantee that they will continue to hold in the future.
Quote from: jj2007 on August 08, 2023, 09:29:31 PM
It's not so relevant for MasmBasic, because strings and buffers are all HeapAlloc'ed with a little reserve of 24 bytes that helps to avoid this scenario.
That may be true in the closed ecosystem you mention, if there is always enough additional committed memory after the buffer. Note that an overhead of 64 bytes will be required once the AVX512 ISA standard is established and appropriate instructions are available.
Quote from: jj2007 on August 08, 2023, 09:29:31 PM
See Peter Cordes' very detailed description of pcmpistri & pcmpestri performance at SOF (https://stackoverflow.com/questions/58901232/why-is-sse4-2-cmpstr-slower-than-regular-code)
Apart from the bottleneck discussion, they hit the nail on the head here
https://stackoverflow.com/questions/58901232/ (https://stackoverflow.com/questions/58901232/why-is-sse4-2-cmpstr-slower-than-regular-code#:~:text=.%20But%20without%20fault-suppression%20(for%20unaligned%20loads%20that%20potentially%20cross%20into%20a%20new%20page)%20they're%20hard%20to%20use%20without%20checking%20stuff%20about%20a%20pointer.)
Quote from: jj2007 on August 09, 2023, 06:29:09 PM
Unfortunately, the docs are not very clear on that: truth is that with HeapAlloc, you can "use" more than you requested, but only reading the memory.
The documentation is very clear. The allocator returns a piece of memory, sometimes even slightly larger than requested (see above). How much can be determined with the function HeapSize (https://learn.microsoft.com/en-us/windows/win32/api/heapapi/nf-heapapi-heapsize).
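To stay entirely within what the documentation promises, a minimal sketch could look like this: request one byte more than the data, let HEAP_ZERO_MEMORY supply the terminator, and ask HeapSize what the block really holds. The 4096+1 and the NoMem label are illustrative choices of mine, and the program uses the default process heap rather than any private heap.

```asm
include \masm32\include\masm32rt.inc
.code
start:
  invoke GetProcessHeap
  mov ebx, eax                            ; ebx = default process heap
  ; one byte more than the data, zeroed, so the terminator is part of the block
  invoke HeapAlloc, ebx, HEAP_ZERO_MEMORY, 4096+1
  test eax, eax
  jz NoMem                                ; allocation failed
  mov edi, eax
  mov esi, InputFile("\Masm32\include\Windows.inc")
  invoke RtlMoveMemory, edi, esi, 4096    ; leaves the final zero byte untouched
  invoke HeapSize, ebx, 0, edi            ; documented way to get the usable size
  print str$(eax), " bytes usable according to HeapSize", 13, 10
  print str$(len(edi)), " bytes in edi", 13, 10
  invoke HeapFree, ebx, 0, edi
NoMem:
  exit
end start
```

Whatever HeapSize reports is what the process may use. Anything beyond that is the allocator's business, whether or not a read happens to succeed there.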
Can you show where you got the information that you can only read this memory (official sources only, e.g. MSDN, Raymond Chen, etc.)? On x86 systems, the smallest unit of memory protection and management is the page. This means that memory permissions such as read, write and execute are managed at this level rather than for individual bytes or smaller units.
This means that this "read only" memory must be located on the next memory page... ???
I have made myself clear. If you want to continue, that should be fine for you, but think of all this misinformation that is being spread. As an experienced programmer, you have a duty to encourage newcomers to follow good practices. One of these is to read the documentation carefully. It's annoying, but there are good reasons why it's there.
Biterider
Come on guys, this isn't the Colosseum. :wink2:
Quote from: The Workshop
The Workshop is where general purpose questions and answers are posted. Any assembler programming topic is welcome and discussion is encouraged as long as its friendly.
Quote from: Biterider on August 10, 2023, 06:08:53 AM
think of all this misinformation that is being spread
HeapGames.zìp
24.52 KB
downloaded 0 times
You don't even bother to look at the testbed, compliments.