Topic: Application Binary Interface (ABI), calling conventions and the like

jj2007

Agner Fog on using FPU and xmm regs in Win7-64:
Quote
However, a public discussion forum quotes the following answers from Microsoft engineers
regarding this issue: "From: Program Manager in Visual C++ Group, Sent: Thursday, May
26, 2005 10:38 AM. It does preserve the state. It's the DDK page that has stale information,
which I've requested it to be changed. Let them know that the OS does preserve state of
x87 and MMX registers on context switches." and "From: Software Engineer in Windows
Kernel Group, Sent: Thursday, May 26, 2005 11:06 AM. For user threads the state of legacy
floating point is preserved at context switch. But it is not true for kernel threads.
Kernel mode drivers can not use legacy floating point instructions." (www.planetamd64.com/index.php?showtopic=3458&st=100).

The issue has finally been resolved with the long overdue publication of a more detailed ABI
for x64 Windows in the form of a document entitled "x64 Software Conventions", well hidden
in the bin directory (not the help directory) of some compiler packages. This document says:
"The MMX and floating-point stack registers (MM0-MM7/ST0-ST7) are preserved across
context switches. There is no explicit calling convention for these registers. The use of
these registers is strictly prohibited in kernel mode code." The same text has later appeared
at the Microsoft website (msdn2.microsoft.com/en-us/library/a32tsf7t(VS.80).aspx).
My tests indicate that these registers are saved correctly during task switches and thread
switches in 64-bit mode, even in an early beta version of x64 Windows.

I like the red part. It somehow implies that the very latest version of the Windows kernel uses, well, "legacy floating point instructions" :biggrin:


qWord

« Reply #1 on: June 14, 2012, 12:49:55 AM »
I think they want to speed up context switches in kernel land. Maybe they also want to keep the slow transcendental functions out, to improve the interruptibility of kernel code.
The question is: which kind of driver needs FPU stuff? Basic FP arithmetic is still available through SSEx.

hutch--

« Reply #2 on: June 14, 2012, 12:57:51 AM »
From memory, Microsoft abandoned FPU code some time ago for 64-bit versions. Over time SSE will probably do the job if they extend the maths to 128 bit. FPU code can still handle numbers in the 80-bit range, but it would seem that Intel also wants to shift most maths to SSE rather than the now ancient FPU.

qWord

« Reply #3 on: June 14, 2012, 01:02:40 AM »
Quote
From memory, Microsoft abandoned FPU code some time ago for 64-bit versions. Over time SSE will probably do the job if they extend the maths to 128 bit.

Only kernel code is affected. User mode applications can still use the FPU.

dedndave

« Reply #4 on: June 14, 2012, 01:16:11 AM »
Quote
From memory, Microsoft abandoned FPU code some time ago for 64-bit versions. Over time SSE will probably do the job if they extend the maths to 128 bit. FPU code can still handle numbers in the 80-bit range, but it would seem that Intel also wants to shift most maths to SSE rather than the now ancient FPU.

maybe that implies the future intentions of intel (assuming that intel and ms collaborate)
they may intend to phase it out over the next few generations of processors

jj2007

« Reply #5 on: June 14, 2012, 01:26:15 AM »
> Kernel mode drivers can not use legacy floating point instructions

They say "don't use them", not: "if you use them, preserve them". Who would be affected by "wrong" FPU values if not Kernel code itself? Or do I misunderstand something completely? Kernel-wise I am a noob...

dedndave

« Reply #6 on: June 14, 2012, 01:27:46 AM »
may have something to do with handling FPU exceptions
i thought you were an expert on that stuff   :biggrin:

qWord

« Reply #7 on: June 14, 2012, 01:51:12 AM »
For kernel threads, the FPU registers and status are not saved when a context switch occurs. That means the whole FPU contents can change from one instruction to the next if a context switch happens between them.

jj2007

« Reply #8 on: June 14, 2012, 04:00:27 AM »
OK, got it, thanks :t

Zen

« Reply #9 on: June 14, 2012, 05:20:46 AM »
This is mind-boggling...
As if kernel-mode programming wasn't confusing enough already...
Zen

jj2007

« Reply #10 on: June 14, 2012, 06:12:28 AM »
I've done some tests to check how much it costs to save & restore the xmm regs that Win7-64 so mercilessly trashes:

Code:
Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
89      cycles for fxsave
83      cycles for fxrstor
152     cycles for fsave
113     cycles for frstor

89      cycles for fxsave
83      cycles for fxrstor
152     cycles for fsave
113     cycles for frstor

172 cycles on my puter (89 for fxsave plus 83 for fxrstor). That looks like a lot, but effectively the save and restore are only needed around some Windows API calls, which are probably utterly slow anyway.
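
For anyone who wants to try the save/restore pair from C rather than assembler, here is a minimal sketch. It assumes an x86-64 compiler that exposes the `_fxsave`/`_fxrstor` intrinsics through `<immintrin.h>`. The names `fx_area`, `noisy_call`, `call_preserving_fp_state` and `saved_mxcsr` are made up for illustration, and `noisy_call` only stands in for an API call that changes SSE state. This is not the code that produced the timings above.

```c
#include <assert.h>
#include <immintrin.h>  /* _fxsave, _fxrstor, _mm_getcsr, _mm_setcsr */
#include <stdint.h>
#include <string.h>

/* FXSAVE/FXRSTOR require a 512-byte area aligned on a 16-byte boundary. */
typedef struct {
    _Alignas(16) uint8_t bytes[512];
} fx_area;

/* Hypothetical stand-in for a call that changes SSE state:
   it flips the flush-to-zero bit (bit 15) of MXCSR. */
static void noisy_call(void)
{
    _mm_setcsr(_mm_getcsr() ^ 0x8000u);
}

/* Save the x87/MMX/SSE state, run fn, then restore the saved state. */
static void call_preserving_fp_state(void (*fn)(void), fx_area *area)
{
    assert(((uintptr_t)area->bytes & 15u) == 0);  /* misalignment would fault */
    _fxsave(area->bytes);
    fn();
    _fxrstor(area->bytes);
}

/* MXCSR sits at byte offset 24 of the saved image. */
static uint32_t saved_mxcsr(const fx_area *area)
{
    uint32_t value;
    memcpy(&value, area->bytes + 24, sizeof value);
    return value;
}
```

Calling `call_preserving_fp_state(noisy_call, &area)` should leave MXCSR as it was before the call, because fxrstor reloads it from the saved image. Wrapping only the calls that really need it keeps the cost of the pair confined to those calls.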

dedndave

« Reply #11 on: June 14, 2012, 01:29:00 PM »
prescott w/htt - XP MCE2005 SP3
Code:
Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)
162     cycles for fxsave
243     cycles for fxrstor
530     cycles for fsave
576     cycles for frstor

158     cycles for fxsave
243     cycles for fxrstor
528     cycles for fsave
578     cycles for frstor

MichaelW

« Reply #12 on: June 14, 2012, 03:56:54 PM »
P4 Northwood w/ht XP SP3
Code:
Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE2)
87      cycles for fxsave
207     cycles for fxrstor
443     cycles for fsave
526     cycles for frstor

84      cycles for fxsave
202     cycles for fxrstor
434     cycles for fsave
529     cycles for frstor

Interesting to see the drop in IPC for the Prescott compared to the Northwood.

Well Microsoft, here’s another nice mess you’ve gotten us into.

hutch--

« Reply #13 on: June 14, 2012, 04:39:42 PM »
I had similar timing results back when I had both a Northwood and a Prescott as dev boxes. The 2.8 GHz Northwood was usually faster than the 3 GHz Prescott and had noticeably less lag. Apparently the Prescott has a much longer pipeline.

jj2007

« Reply #14 on: June 14, 2012, 04:54:46 PM »
Getting faster...
Interesting that fsave/frstor is always much slower, although the fx variants save 512 bytes to memory.

Code:
AMD Athlon(tm) Dual Core Processor 4450B (SSE3)
38      cycles for fxsave
73      cycles for fxrstor
166     cycles for fsave
130     cycles for frstor

38      cycles for fxsave
73      cycles for fxrstor
167     cycles for fsave
130     cycles for frstor
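
On the 512 bytes mentioned above: a rough C sketch of how the fxsave image is laid out, following the layout documented for the instruction. The struct and field names are my own, and the two pointer fields are shown in the 64-bit (FXSAVE64) format. In the other formats those 16 bytes hold offset and selector pairs instead. The fsave figure is the 32-bit protected-mode image size, included only for comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout of the 512-byte FXSAVE image (names are illustrative). */
typedef struct {
    uint16_t fcw;            /* x87 control word                        */
    uint16_t fsw;            /* x87 status word                         */
    uint8_t  ftw;            /* abridged tag word                       */
    uint8_t  reserved0;
    uint16_t fop;            /* last x87 opcode                         */
    uint64_t fip;            /* last instruction pointer (FXSAVE64)     */
    uint64_t fdp;            /* last data pointer (FXSAVE64)            */
    uint32_t mxcsr;          /* SSE control/status register             */
    uint32_t mxcsr_mask;
    uint8_t  st_mm[8][16];   /* ST0-ST7 / MM0-MM7, 10 bytes used each   */
    uint8_t  xmm[16][16];    /* XMM0-XMM15                              */
    uint8_t  reserved[96];   /* pads the image to 512 bytes             */
} fxsave_image;

/* 32-bit protected-mode FSAVE image: 28-byte environment
   plus eight 10-byte registers. */
enum { FSAVE_IMAGE_BYTES = 28 + 8 * 10 };

static_assert(sizeof(fxsave_image) == 512, "FXSAVE image must be 512 bytes");
static_assert(offsetof(fxsave_image, xmm) == 160, "XMM registers start at 160");
```

So the fx variants move roughly five times as much data as fsave/frstor and still come out faster in all the timings posted in this thread.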