Author Topic: xlat is pretty fast  (Read 3282 times)

jj2007

  • Member
  • *****
  • Posts: 13944
  • Assembly is fun ;-)
    • MasmBasic
Re: xlat is pretty fast
« Reply #30 on: September 29, 2022, 01:16:55 AM »
Some of this stuff sounds like it comes out of Alice In Wonderland.

Your rant is perfectly unrelated to the questions raised, but if it makes you happy to praise the 64-bit world, so be it :thumbsup:

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 10583
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: xlat is pretty fast
« Reply #31 on: September 29, 2022, 02:30:21 AM »
And worse, some of the claims sound like they were spoken by the mad hatter.

PE specs are clear cut. OS versions change over time and have different characteristics: the last OS version to support both 16 and 32 bit code was XP (from memory), and Win7 64 and up do not support 16 bit code natively. On 64 bit OS versions, 32 bit code is supported in both the hardware and the OS, and protected mode makes it all possible.  :biggrin:

PS: XLAT works fine and will keep most people happy most of the time.
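For anyone who has not used it, here is a minimal sketch of the classic pattern (32 bit MASM32-style code, and the table name and register usage are just made up for illustration): EBX points at a 256 byte table, AL is the index, and XLAT replaces AL with the table entry.

    ; translate the buffer at ESI in place, ECX = byte count,
    ; upmap = hypothetical 256-byte table filled elsewhere
        mov ebx, OFFSET upmap      ; XLAT reads from [EBX + AL]
      next_byte:
        mov al, [esi]              ; fetch the source byte
        xlatb                      ; AL = upmap[AL]
        mov [esi], al              ; store the translated byte
        inc esi
        dec ecx
        jnz next_byte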
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :skrewy:

NoCforMe

  • Member
  • *****
  • Posts: 1124
Re: xlat is pretty fast
« Reply #32 on: September 29, 2022, 07:54:38 AM »
So I'm pretty sure the correct answer is a combination of several opinions offered above:
  • Hutch's assertion that addresses given to ASM opcodes are, indeed, virtual addresses is correct. That's how the paging scheme I described works; if a virtual address doesn't point to actual, physical memory, the OS loads the page in question from "backing store" (a disk file). This, of course, is a simplification of the process, but it's basically how it works. (That's why page faults are a good thing in this case! See the sketch after this list.)
  • Other than that (address translation), the opcodes we specify in our ASM source code are actually, physically executed. At some point an actual MOV operation has to actually move something between physical memory and a register (or two registers, or two memory locations using DMA*.)
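A quick way to watch that "good" page fault happen is below. This is only a sketch (the size and labels are mine, not anything from the thread): MEM_COMMIT hands back a range of virtual addresses, but Windows does not attach physical pages until the first touch, so the first write to each page is what raises the soft fault that the OS services transparently.

    .data?
      pMem dd ?
    .code
        invoke VirtualAlloc, NULL, 100000h, MEM_COMMIT, PAGE_READWRITE
        mov pMem, eax              ; 1 MB of virtual addresses, no RAM wired yet
        mov edx, eax
        mov ecx, 256               ; 1 MB = 256 pages of 4 KB
      touch_page:
        mov byte ptr [edx], 0      ; first touch -> page fault, OS wires a zeroed page
        add edx, 1000h             ; step to the next 4 KB page
        dec ecx
        jnz touch_page
        invoke VirtualFree, pMem, 0, MEM_RELEASE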
Now that second point ignores the whole business of microcode, which is the actual, for-real, bona fide set of hardware operations that take place when we ask for, say, a REP STOSB. We can think of our opcodes as functions, and microcode as the actual (hardware) code inside those functions. But since nobody (that I know of) has ever seen this code, much less written it, it's only of academic interest. (It can't be read or written, so far as I know. At least not in the code stream.)

Speaking of which, does anyone know where one can find X86 microcode? I'm not sure if this is (Intel/AMD, etc.) proprietary information or not. I am curious to see how it works. And apparently it can be uploaded via BIOS.

* I may be thinking of the Olde Tymes, when the 2-parameter string instructions (MOVSB) used actual DMA to do the data transfer. Probably not true anymore with this newfangled hardware ...

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 10583
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: xlat is pretty fast
« Reply #33 on: September 29, 2022, 09:41:09 AM »
Opcodes are opcodes, and protected mode has no reason to interfere with them; opcodes work directly on OS-provided memory. Protected mode is, in effect, OS-controlled memory, which ensures that one app cannot write to memory in another app like it used to be in Win3.? where some piece of crap trashed the entire OS.

Win 3.? was in fact one single app that emulated multitasking in software and it was genuinely clever stuff but with the advent of hardware multitasking, its great limitation was bypassed and the hardware provided the capacity to fully isolate each app in its own memory space. Much of what you are paying for when you buy a Windows version is this capacity to reliably run multiple apps and this is apart from all of the rest of the facilities in an OS.

Now instruction encodings are another matter altogether. In the days of the 8088, the whole instruction set was directly encoded in silicon, but as the instruction set grew much larger on later hardware, x86 CISC became an interface to a RISC-like internal instruction set that looked nothing like the old stuff. Each iteration of Intel (and probably AMD) hardware did much the same: you had the preferred instructions (the simple ones), while the older, more complex instructions were dumped into much slower microcode.

The simple ones were much faster, and if you needed something from the older instructions it was still available, but you tried to avoid it for performance reasons. There are some special cases, like the instructions that take a REP prefix: while MOVSD is as slow as a wet week by itself, REP MOVSD is special-case circuitry that is competitively fast on modern hardware.
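As a concrete sketch (the labels are made up, and it assumes the byte count is a multiple of 4), the fast path is just REP MOVSD fed with a source, a destination and a dword count:

    ; copy bytecount bytes from src to dst with the special-cased REP MOVSD
        cld                        ; clear DF so ESI/EDI move forward
        mov esi, OFFSET src        ; DS:ESI = source buffer
        mov edi, OFFSET dst        ; ES:EDI = destination buffer
        mov ecx, bytecount         ; byte count, assumed multiple of 4
        shr ecx, 2                 ; -> dword count
        rep movsd                  ; block copy, 4 bytes per element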
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :skrewy:

daydreamer

  • Member
  • *****
  • Posts: 2395
  • my kind of REAL10 Blonde
Re: xlat is pretty fast
« Reply #34 on: September 29, 2022, 05:47:06 PM »
Could we go back to timing my word LUT instead of this debate?
http://masm32.com/board/index.php?topic=7938.msg103017#msg103017
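I haven't re-quoted the code from that link, so just as a placeholder, a generic byte-to-word LUT lookup looks like this (the table name and register use are my assumptions, not the actual code in the linked post):

    ; wtable = hypothetical 256-entry WORD table; ESI = source bytes,
    ; EDI = destination words, ECX = byte count
      next:
        movzx eax, byte ptr [esi]          ; zero-extend the index
        mov ax, word ptr [wtable + eax*2]  ; word lookup, no XLAT needed
        mov word ptr [edi], ax
        inc esi
        add edi, 2
        dec ecx
        jnz next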
my none asm creations
http://masm32.com/board/index.php?topic=6937.msg74303#msg74303
I am an Invoker
"An Invoker is a mage who specializes in the manipulation of raw and elemental energies."
Like SIMD coding

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 10583
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: xlat is pretty fast
« Reply #35 on: September 29, 2022, 06:40:40 PM »
magnus,

While I am pleased to see you writing code, this topic was originally about timing the XLAT instruction. It wandered due to a number of technical issues, but none of that was about the LUT you are investigating.
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :skrewy: