NT and XP classically used the high-order bit to determine whether an address was in kernel space. The /3GB option only works when drivers and applications are aware that kernel space starts at 0xC0000000, not at 0x80000000 (the sign bit). The kernel and non-paged pool live in the top 1 GB (always mapped up there). The kernel can't tolerate page faults (i.e. it can't deal with another page fault while handling a page fault), so it maps and locks user memory into these high regions so that kernel drivers can touch it regardless of whichever application or task is running below. You'd get pretty blue screens if your driver touched the lower user space after a context switch: IRQL_NOT_LESS_OR_EQUAL, or some such. That was a clear sign the driver writer had failed to test across a context switch and had only ever observed instant completion of their asynchronous I/O. You also can't DMA into context-switched memory; it needs pinning to a physical page or pages. And block and file system drivers need to keep all their structures in non-paged pool.
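To make the sign-bit point concrete, here is a rough sketch in Python. The function and constant names are made up for illustration and are not any Windows API; the only facts assumed are the two boundaries mentioned above (0x80000000 by default, 0xC0000000 with /3GB).

```python
# Illustrative only: hypothetical helpers, not a Windows API.
USER_KERNEL_SPLIT_2GB = 0x80000000  # default 32-bit layout: 2 GB user / 2 GB kernel
USER_KERNEL_SPLIT_3GB = 0xC0000000  # with /3GB: 3 GB user / 1 GB kernel


def is_kernel_by_high_bit(addr: int) -> bool:
    """Classic shortcut: treat any address with bit 31 set as kernel space."""
    return (addr & 0x80000000) != 0


def is_kernel_by_boundary(addr: int, split: int) -> bool:
    """Compare against the user/kernel boundary actually in effect."""
    return addr >= split


if __name__ == "__main__":
    for addr in (0x7FFF0000, 0xA0000000, 0xC0001000):
        print(
            hex(addr),
            "high-bit says kernel:", is_kernel_by_high_bit(addr),
            "/3GB boundary says kernel:", is_kernel_by_boundary(addr, USER_KERNEL_SPLIT_3GB),
        )
```

An address like 0xA0000000 is where the two checks disagree: the high-bit test calls it kernel space, but under /3GB it is ordinary user space. Code that relies on the shortcut is the kind that isn't /3GB aware.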
Per Michael's link:
"X86 client versions with PAE enabled do have a usable 37-bit (128 GB) physical address space. The limit that these versions impose is the highest permitted physical RAM address, not the size of the IO space. That means PAE-aware drivers can actually use physical space above 4 GB if they want. For example, drivers could map the "lost" memory regions located above 4 GB and expose this memory as a RAM disk."
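For anyone wanting to sanity-check the numbers in that quote, the arithmetic is just powers of two. A small sketch (the helper name is mine, purely for illustration):

```python
GIB = 1 << 30  # bytes in one GiB


def address_space_gib(bits: int) -> int:
    """Size in GiB of a physical address space that is `bits` wide."""
    return (1 << bits) // GIB


if __name__ == "__main__":
    # 32 bits is the familiar 4 GB line; 37 bits is the figure from the quote.
    for bits in (32, 37):
        print(f"{bits}-bit physical addressing -> {address_space_gib(bits)} GB")
```

So 37 bits does work out to the 128 GB in the quote, and 32 bits to the 4 GB line that the client versions enforce for RAM.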
I have an old Athlon64 box (HP DC5750) running XP 32-bit with PAE enabled, 4 slots of DDR2 with 8GB total memory, and yes, there are RAM disk drivers that can use this space. I think the BOOT.INI had /PAE /3GB; the PAE setting can be confirmed via right-click on My Computer, Properties.
Most Intel boxes of this era had a 2GB max, and at most 4GB unless you went to Xeon/server chipsets.
Now, if you have a modern Win7 x64 AMD box, you can easily and cheaply get 4 slots of DDR3 with 32GB total memory, and Windows will automatically use all uncommitted memory as disk cache.