Buffer alignment

Started by sinsi, December 18, 2024, 03:16:28 PM


sinsi

I was looking at ReadDirectoryChangesW and came across something I've not seen in the MS documentation.
Quote:
[out] lpBuffer
A pointer to the DWORD-aligned formatted buffer in which the read results are to be returned.

I thought Windows was pretty relaxed about buffer alignment, but later:
Quote:
ReadDirectoryChangesW fails with ERROR_NOACCESS when the buffer is not aligned on a DWORD boundary.

Wonder why align 4 is so important?


Greenhorn

The alignment is important because the FILE_NOTIFY_INFORMATION structure is variable in length, owing to its last member, FileName.

The other structure members are DWORDs, so you must align to at least 4 bytes.

8-byte and 16-byte alignment should also work, but that is just a guess on my part.
Kole Feut un Nordenwind gift en krusen Büdel un en lütten Pint.
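
Greenhorn's guess about 8 and 16 can be checked with plain arithmetic: an address that is a multiple of 8 or 16 is necessarily a multiple of 4 as well. A minimal sketch in portable C follows; the helper names is_aligned and count_counterexamples are made up for illustration and are not part of any Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if addr is a multiple of alignment.
   alignment must be a nonzero power of two. */
static int is_aligned(uintptr_t addr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}

/* Counts addresses in [0, limit) that are 8- or 16-byte aligned
   but not 4-byte aligned. There should be none. */
static unsigned count_counterexamples(uintptr_t limit)
{
    unsigned n = 0;
    for (uintptr_t a = 0; a < limit; ++a) {
        if ((is_aligned(a, 8) || is_aligned(a, 16)) && !is_aligned(a, 4))
            ++n;
    }
    return n;
}
```

So a buffer declared with align 8 or align 16 satisfies the DWORD requirement automatically; the stricter alignment only costs a few padding bytes.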

LordAdef

Quote from: Greenhorn on December 19, 2024, 07:12:59 AM
The alignment is important because the FILE_NOTIFY_INFORMATION structure is variable in length, because of its last member FileName.

The other structure members are DWORDs, so you must align at 4 bytes (at least?).

8 byte and 16 byte alignment should also work ?! But this is just a guess by myself.

There was a discussion about this a couple of years ago. Marinus suggested that I align to 16 and 32 bytes for performance in some particular code.

I just checked, out of curiosity: I used align 32 in the most important loop, align 16 in some other ones, and align 4 everywhere else.

NoCforMe

Well, nobody has yet answered sinsi's question:
Quote:
Wonder why align 4 is so important?

My guess®™ is that somewhere in the code for ReadDirectoryChangesW() they explicitly check the buffer alignment, I'm guessing in the interest of speed.

Otherwise, how could a function possibly fail if data wasn't DWORD aligned?
Are there any x86/x64 instructions that would fail under that circumstance?
Assembly language programming should be fun. That's why I do it.

zedd151

Quote from: NoCforMe on March 30, 2025, 11:21:41 AM
Well, nobody has yet answered sinsi's question:
Quote:
Wonder why align 4 is so important?

Maybe nobody here knows for certain.

Quote:
My guess®™ is that somewhere in the code for ReadDirectoryChangesW() they explicitly check the buffer alignment, I'm guessing in the interest of speed.

https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-readdirectorychangesw
The function fails with ERROR_NOACCESS when the buffer is not aligned on a DWORD boundary.

But of course, it does not explain why.

The ultimate answer: Because Microsoft says so.
Raymond Chen might know; he seems pretty knowledgeable about MS inner workings.
¯\_(ツ)_/¯

'As we don't do "requests", show us your code first.'  -  hutch—

sinsi

Misaligning the buffer by 1 and following the code in a debugger, I get to NtNotifyChangeDirectoryFileEx.
That fails with NTSTATUS 0x80000002 which, funnily enough, is
Quote:
0x80000002
STATUS_DATATYPE_MISALIGNMENT
{EXCEPTION} Alignment Fault
A data type misalignment was detected in a load or store instruction.

Still not sure why it causes a fault, but I need a break from debugging.

NoCforMe

Quote from: sinsi on March 30, 2025, 03:04:09 PM
Misaligning the buffer by 1 and following the code in a debugger I get to NtNotifyChangeDirectoryFileEx.
That fails with NTSTATUS 0x80000002 which, funnily enough, is
Quote:
0x80000002
STATUS_DATATYPE_MISALIGNMENT
{EXCEPTION} Alignment Fault
A data type misalignment was detected in a load or store instruction.

Still not sure why it causes a fault, but I need a break from debugging.

Hmm; so is that a bona fide hardware fault, or something generated by Microsoft's code in the function?
What load or store instructions will fault because of misalignment?
Not just generic MOVs, I don't think.

_japheth

Quote from: NoCforMe on March 30, 2025, 05:51:33 PM
Hmm; so is that a bona fide hardware fault, or something generated by Microsoft's code in the function?
What load or store instructions will fault because of misalignment?
Not just generic MOVs, I don't think.

It's exception 0x11 (#AC, alignment check). It requires bit 18 (AM) to be set in register CR0; setting the AC flag (also bit 18) in register EFLAGS then activates the check. All MOVs, PUSHes, POPs and BTx instructions are affected, but only in ring 3.
Stupidity paired with audacity - a terrible power.
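
The three conditions _japheth lists can be written down as a small predicate. This is only a model of the rule as stated in the post, not code that reads real CPU registers (CR0 is not readable from ring 3 anyway); the function name alignment_check_active and the macro names are invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_AM    (UINT32_C(1) << 18)  /* alignment mask: bit 18 of CR0 */
#define EFLAGS_AC (UINT32_C(1) << 18)  /* alignment check flag: bit 18 of EFLAGS */
#define AC_VECTOR 0x11                 /* exception vector 17, #AC */

/* Hypothetical model: returns 1 if a misaligned data access would raise #AC,
   given the register values and the current privilege level (0..3). */
static int alignment_check_active(uint32_t cr0, uint32_t eflags, unsigned cpl)
{
    assert(cpl <= 3);
    return (cr0 & CR0_AM) != 0      /* the OS enabled the feature */
        && (eflags & EFLAGS_AC) != 0 /* the thread opted in */
        && cpl == 3;                 /* only checked in ring 3 */
}
```

All three conditions must hold at once, which is why kernel code in ring 0 never takes this fault no matter how the flags are set.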

jj2007

Quote from: NoCforMe on March 30, 2025, 11:21:41 AM
Are there any x86/x64 instructions that would fail under that circumstance?

Not that I am aware of, but I could be wrong. There are plenty of SIMD instructions that require align 16, but align 4? Never seen.

NoCforMe

Quote from: _japheth on March 30, 2025, 06:04:56 PM
Quote from: NoCforMe on March 30, 2025, 05:51:33 PM
Hmm; so is that a bona fide hardware fault, or something generated by Microsoft's code in the function?
What load or store instructions will fault because of misalignment?
Not just generic MOVs, I don't think.

It's exception 0x11; requires bit 18 to be set in register CR0; then setting flag AC ( also bit 18 ) in register EFL should activate the check. All MOVs, PUSHs, POPs, BTxs are affected, but just in ring 3.

So, are you saying that this exception must be enabled by setting those bits? Otherwise it won't occur?
Why would anyone do such a thing?
Seems to me (unless I'm quite wrong here) that Micro$oft is being a bit anal by penalizing the caller for having a misaligned buffer. Why not just let it be aligned wherever, and if the caller takes a performance penalty for a misaligned buffer, then it's on them?

And I forget just which rings things execute in: does the kernel run in ring 3?

_japheth

Quote from: NoCforMe on March 31, 2025, 09:55:30 AM
So, are you saying that this exception must be enabled by setting those bits? otherwise it won't occur?

Yes.

Quote:
Why would anyone do such a thing?

To detect misaligned memory accesses, perhaps? Those cause performance penalties.

Quote:
Seems to me (unless I'm quite wrong here) that Micro$oft is being a bit anal by penalizing the caller for having a misaligned buffer.

For Windows apps the alignment check is off in MS Windows. I guess too many apps would crash if it were on.

Quote:
And I forget just which rings things execute in: does the kernel run in ring 3?

No, your app runs in ring 3, the kernel in ring 0.


NoCforMe

Quote from: _japheth on March 31, 2025, 01:26:01 PM
Quote from: NoCforMe on March 31, 2025, 09:55:30 AM
Seems to me (unless I'm quite wrong here) that Micro$oft is being a bit anal by penalizing the caller for having a misaligned buffer.

For Windows Apps alignment check is off in MS Windows - I guess, too many apps would crash if it's on.

But but but ... now we're back to square one:
If that's true, then why did sinsi get a fault back there in this post (#5)?
Again: was that an actual hardware fault caused by a CPU instruction, or something somehow generated by the Win32 function? (Or a hardware fault that was caught by C++ code?)

Very confused here ...

adeyblue

Quote:
Again: was that an actual hardware fault caused by a CPU instruction, or something somehow generated by the Win32 function? (Or a hardware fault that was caught by C++ code?)

if (((ULONG_PTR)buffer & 3) != 0)   /* low two bits set: not DWORD-aligned */
{
    ExRaiseDatatypeMisalignment();
}

That's what happens in the kernel for pretty much every native function that takes a struct parameter.
You don't often see it fail like that in 32-bit apps on 64-bit Windows, because if you send in a misaligned buffer you're only sending it to WoW, and WoW will call the kernel without dodgy buffers. With native-bitness apps it's pretty easy to do. The struct passed to VirtualQuery, for instance, needs 8-byte alignment on 64-bit because its first member is a pointer rather than a DWORD like the struct for ReadDirectoryChangesW.
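
The difference between the two structs can be illustrated with C11's alignof. This is a sketch with two invented stand-in structs, not the real Windows definitions: one has only 32-bit members, the other starts with a pointer. A struct's required alignment is that of its most strictly aligned member, so the pointer-first struct needs pointer alignment (8 bytes on typical 64-bit targets).

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: every fixed member is 32 bits wide. */
struct dword_first {
    uint32_t next;
    uint32_t action;
    uint32_t length;
};

/* Hypothetical stand-in: the first member is a pointer. */
struct pointer_first {
    void    *base;
    uint32_t protect;
};

/* Compile-time checks: each struct inherits its widest member's alignment. */
static_assert(alignof(struct dword_first) == alignof(uint32_t),
              "DWORD-only struct aligns like a DWORD");
static_assert(alignof(struct pointer_first) >= alignof(void *),
              "pointer-first struct aligns at least like a pointer");

static size_t dword_struct_alignment(void)   { return alignof(struct dword_first); }
static size_t pointer_struct_alignment(void) { return alignof(struct pointer_first); }
```

On a 32-bit build both values are normally 4, which fits the observation that the stricter requirement only shows up in 64-bit processes.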

As for why, this:
Quote:
To detect misaligned memory accesses, perhaps? Those cause performance penalties

NT has been ported to loads of different architectures; it doesn't just run on PCs. Some of those don't do misaligned memory reads (Itanium, for one). Every platform does aligned reads, so you write one check to make sure the buffer is aligned, and then you don't have to care what the platform does if it isn't.
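
A user-mode imitation of that single check, reproducing sinsi's off-by-one experiment, might look like the sketch below. It is portable C, not kernel code: probe_dword_aligned and the _SKETCH constants are invented names, and the only value taken from the thread is 0x80000002.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_SUCCESS_SKETCH               UINT32_C(0x00000000)
#define STATUS_DATATYPE_MISALIGNMENT_SKETCH UINT32_C(0x80000002) /* value quoted earlier in the thread */

/* Hypothetical stand-in for the kernel-side check: reject any buffer
   whose address is not a multiple of 4, before touching its contents. */
static uint32_t probe_dword_aligned(const void *buffer)
{
    assert(buffer != NULL);
    if (((uintptr_t)buffer & 3u) != 0)
        return STATUS_DATATYPE_MISALIGNMENT_SKETCH;
    return STATUS_SUCCESS_SKETCH;
}

/* Storage forced to DWORD alignment. storage + 1 is the
   "misaligned by 1" case from the debugging session. */
static _Alignas(4) unsigned char storage[64];
```

Passing storage succeeds, while storage + 1, storage + 2 and storage + 3 are all rejected. No load or store ever touches the misaligned address, so the failure is a software decision and not a CPU fault.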

NoCforMe

Thanks for the excellent explanation.

TimoVJL

Also, that buffer stores a linked list of records, so it isn't a plain BYTE or WCHAR buffer.
May the source be with you
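
To make the linked-list point concrete, here is a sketch that builds and walks such a chain in portable C. The struct is a mock whose field names follow FILE_NOTIFY_INFORMATION; write_record and count_records are invented helpers, and rounding each record up to a multiple of 4 is an assumption made so that every record header stays DWORD-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable mock of the record layout (not the real Windows header). */
struct notify_record {
    uint32_t NextEntryOffset;  /* bytes to the next record, 0 on the last one */
    uint32_t Action;
    uint32_t FileNameLength;   /* length of FileName in bytes */
    uint16_t FileName[1];      /* variable length, not NUL-terminated */
};

#define HEADER_SIZE offsetof(struct notify_record, FileName)

/* Hypothetical helper: writes one record at buf + offset and returns its
   size rounded up to a multiple of 4. */
static size_t write_record(unsigned char *buf, size_t offset, uint32_t action,
                           const uint16_t *name, uint32_t name_units, int is_last)
{
    size_t size = HEADER_SIZE + (size_t)name_units * sizeof(uint16_t);
    size = (size + 3u) & ~(size_t)3u;
    uint32_t header[3] = { is_last ? 0u : (uint32_t)size, action,
                           name_units * (uint32_t)sizeof(uint16_t) };
    memcpy(buf + offset, header, sizeof header);
    memcpy(buf + offset + HEADER_SIZE, name, (size_t)name_units * sizeof(uint16_t));
    return size;
}

/* Walks the chain by following NextEntryOffset; returns the record count. */
static unsigned count_records(const unsigned char *buf)
{
    unsigned n = 0;
    size_t offset = 0;
    for (;;) {
        uint32_t next;
        assert(offset % 4 == 0);  /* each record starts on a DWORD boundary */
        memcpy(&next, buf + offset, sizeof next);
        ++n;
        if (next == 0)
            break;
        offset += next;
    }
    return n;
}
```

Because every offset is a multiple of 4, a DWORD-aligned buffer start keeps all the DWORD fields in every record aligned, which is presumably what the documented requirement protects.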