Can anyone think of a reason not to use MMX registers in 64 bit? You have 8 x 64 bit registers that are not specified in Win64, and it seems to be a waste of a resource not to use them for something. Among other things, I wondered if there would be any problems in using them to pass extra arguments to procedures. It certainly will not work on API calls or any procedure that uses the Microsoft ABI, but it may be useful internally to avoid the stack layout of the conventional ABI.
Forget MMX and use SSE2 (16 x 128 bit in 64-bit mode), which is a native part of x64 and more powerful. In that case you can also use the FPU.
As we all know, using mmx registers trashes the FPU - search the forum for emms fpu
How serious a problem that is would certainly justify one of our usual mega-threads 8)
SSE2 is better if you can do parallel operations. In all other cases, the fpu is equally fast, and has some cute functions that SIMD doesn't offer. In case somebody doesn't know about them: fphelp.hlp is attached.
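For anyone who has not run into the emms issue mentioned above, here is a minimal sketch of it. It assumes plain ml64 without the masm64 SDK macros; `main` and `scratch` are made-up names, and the build line in the comment is an assumption about a typical setup. Per the Intel manuals, any MMX instruction other than emms marks all eight x87 tags as valid, so the next fld sees a full stack.

```asm
; assumed build: ml64 /c demo.asm
;                link demo.obj /entry:main /subsystem:console kernel32.lib
extrn ExitProcess : proc

.data?
scratch dq ?                    ; hypothetical qword used only as an MMX source

.code
main proc
    sub     rsp, 40             ; shadow space + stack alignment
    fninit                      ; known x87 state: empty stack, exceptions masked
    movq    mm0, scratch        ; MMX write: all eight x87 tags become "valid"
    ; emms                      ; enable this line and the fld1 below succeeds
    fld1                        ; no free slot: masked stack overflow
    fnstsw  ax                  ; fetch the x87 status word
    and     eax, 41h            ; keep IE (bit 0) and SF (bit 6)
    emms                        ; leave the FPU usable again
    mov     ecx, eax            ; exit code = the two flag bits
    call    ExitProcess
main endp
end
```

With the overflow flagged, the exit code should be 41h (65 in `echo %ERRORLEVEL%`); with the emms line enabled it should be 0.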
I think you guys missed the drift of the question, I already know that XMM is bigger, better and more plentiful but my question was related to the FP/MMX not being specified in Win64 and just sitting there going to waste. What I had in mind was being able to use them to pass up to 8 extra arguments to a procedure as you only have 4 integer registers available with the ABI. RE: Floating point calculations you have late SSE specified in Win64 and while I have not properly read up on them, AVX and AVX2 will probably add some pace to calculations.
I have not personally used MMX registers for many years as you can generally use XMM registers to do the same thing.
You could use MMX in your own code for that purpose. Personally I use FPU a lot so don't use MMX. But I'm pretty sure MS treats MMX / FPU as volatile, not guaranteed across MS calls, so that limits their usefulness.
Quote from: rrr314159 on July 07, 2016, 11:40:40 AM
not guaranteed across MS calls, so that limits their usefulness.
Sorry rrr, perhaps (for once :biggrin: ) you missed the word. How can M$ limit FPU usefulness?
Quote from: hutch-- on July 07, 2016, 09:55:01 AM
not being specified in Win64
That is the nasty point: xmm regs are "not being specified in Win32", and I used xmm0 a lot as scratch register, only to discover one day that, while they kept their values in proggies running on XP-32, they lost them on 7-64 :(
As long as you don't use API calls, mmx, st(x) and xmm are "safe". But when using them with a MsgBox, make sure you read the ABI 8)
Part of the brave new world. Here is a macro to write 64 bit integer registers OR 64 bit immediates to an MMX register.
; ----------------------------------------
; use this macro to load a 64 bit register
; or immediate into an MMX register
; NOTE: overwrites rax unless value is rax
; ----------------------------------------
lmm MACRO reg, value
    LOCAL dat64
    .data?
      dat64 dq ?              ;; staging slot, one per macro expansion
    .code
    IFDIFI <value>,<rax>      ;; insensitive comparison
      mov rax, value          ;; write value to rax if different
    ENDIF
    mov dat64, rax            ;; stage the value through memory
    movq reg, dat64           ;; load the MMX register from the slot
ENDM
Quote from: hutch-- on July 07, 2016, 09:55:01 AM
I think you guys missed the drift of the question, I already know that XMM is bigger, better and more plentiful but my question was related to the FP/MMX not being specified in Win64 and just sitting there going to waste. What I had in mind was being able to use them to pass up to 8 extra arguments to a procedure as you only have 4 integer registers available with the ABI.
Why not? The idea is good and worth testing.
Gunther
Quote from: HSE on July 07, 2016, 12:02:28 PM
Quote from: rrr314159 on July 07, 2016, 11:40:40 AM
not guaranteed across MS calls, so that limits their usefulness.
Sorry rrr, perhaps (for once :biggrin: ) you missed the word. How can M$ limit FPU usefulness?
AFAIK if you make an API call it can trash contents of FPU. Even if it doesn't you can't be sure it won't in a later version of Windows, because behavior isn't specified in ABI. That makes it even worse, your prog may work for now but break later on. So, you just have to be careful to do all your FPU / MMX work without inserting an API call. That limits their usefulness. In particular it means, AFAIK, hutch's idea is no good. Even if you test and find out MMX's are still there after an API call it may change with the next update.
Thanks, HSE, for reading my post!
Relying on any register that is not specified as non-volatile in the ABI is dangerous, MMX are no exception. They will be useful for passing arguments around, that's mainly what I had in mind.
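A minimal sketch of what passing extra arguments in MMX registers to your own procedure could look like. It assumes plain ml64 without the masm64 SDK macros; `add_extra`, `scratch` and `main` are made-up names, the build line is an assumption, and values are staged through memory in the same way as the lmm macro does. The only contract is a private one between caller and callee: nothing that might touch MMX runs between loading the registers and the call.

```asm
; assumed build: ml64 /c demo.asm
;                link demo.obj /entry:main /subsystem:console kernel32.lib
extrn ExitProcess : proc

.data?
scratch dq ?                    ; hypothetical staging slot for GPR <-> MMX moves

.code

; add_extra (hypothetical): rcx and rdx arrive per the Win64 ABI,
; two private extras arrive in mm0 and mm1. Returns the sum in rax.
add_extra proc
    lea     rax, [rcx+rdx]      ; sum of the two ABI arguments
    movq    scratch, mm0        ; first extra argument
    add     rax, scratch
    movq    scratch, mm1        ; second extra argument
    add     rax, scratch
    emms                        ; hand the x87 stack back empty
    ret
add_extra endp

main proc
    sub     rsp, 40             ; shadow space + stack alignment
    mov     rax, 30
    mov     scratch, rax
    movq    mm0, scratch        ; extra argument 1
    mov     rax, 40
    mov     scratch, rax
    movq    mm1, scratch        ; extra argument 2
    mov     rcx, 10             ; ABI argument 1
    mov     rdx, 20             ; ABI argument 2
    call    add_extra           ; no API call between the loads and this call
    mov     ecx, eax            ; exit code = 10 + 20 + 30 + 40 = 100
    call    ExitProcess
main endp
end
```

The emms in the callee is a design choice, not a requirement: it keeps FPU code elsewhere in the program working, at the price of one extra instruction per call.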
Thanks rrr!! I just never think of calling APIs in the middle of calculations.
New processors keep adding more and more registers. If APIs preserved all registers they would perhaps become very inefficient; it is better that programmers decide which registers to preserve.
Quote from: HSE on July 08, 2016, 12:50:52 AM
Thanks rrr!! I just never think of calling APIs in the middle of calculations.
New processors keep adding more and more registers. If APIs preserved all registers they would perhaps become very inefficient; it is better that programmers decide which registers to preserve.
Which registers should be volatile and which non-volatile is a key issue in designing an ABI. You're advocating making the caller responsible for preserving everything, which is one approach but of course it has drawbacks. Anyway what MS actually did was not specify behavior for MMX (AFAIK), which means you should assume the worst and always preserve MMX. Which invalidates hutch's idea AFAIK.
As for "never calling API in middle of calculations", that's what I do also. The point is you'd better be sure of that. What if you want to print a value to check calculations? printf works, and probably always will, but I wouldn't count on MessageBox.
BTW with brand new registers another issue pops up: preserving them across context switches. A while ago Windows didn't do that with AVX512, I suppose they do now?
That's an old story, of course. IMHO M$ should have declared non-volatile ecx at least, because it has several special instructions; that's why MB preserves it. A push/pop pair more in slow API code would cost absolutely nothing.
In 64-bit code, passing paras seems to be a big issue. I would exclude MMX because I like the FPU, but one could declare it a matter of taste.
OTOH, why aren't the additional r8...r15 enough? Do we really need so many regs?? Besides, with movd, movlps and movhps, one has really plenty of additional dword and qword registers. Wasn't one of the "important" arguments of 64-bit code that you have more registers? I never was short of regs in 32-bit code, so I really can't understand why suddenly we need three or four times as many...
:biggrin:
> Anyway what MS actually did was, not specify behavior for MMX (AFAIK) which means you should assume the worst and always preserve MMX. Which invalidates hutch's idea AFAIK.
I am still astounded at the level of ignorance here. Not being specified in an ABI means you and everyone else can do anything they like with MMX registers. Some 10 or more years ago there was a lot of MMX code around in 32 bit, yet no-one who knew how to write it was having these imaginary problems. The obvious point is that when the usage is not specified in the OS based ABI, you don't make the assumption that you can leave data in them and expect it to be there the next time you want to use that data.
When you want to leave data where you can reliably find it the next time you need it, store it AS data. That is what memory is used for. It is not as if you need MMX registers for calculations any longer, as you have XMM and YMM registers, but there is no reason to leave them to waste either if you can in fact find a use for them. Even if you want to use FP in Win64, there has long been a normal method to do that.
Quote from: hutch-- on July 08, 2016, 10:44:04 AM
I am still astounded at the level of ignorance here,
True I am somewhat ignorant about the issue, that's why I kept saying "AFAIK". Here's what I've been assuming,
1) FPU and MMX share same storage so can't be used at the same time.
2) You can't assume MS won't use FPU during some API calls (in both older and new Windows).
3) You should clear the FPU stack when making (most) API calls, if only just to be on the safe side.
4) You can't count on MMX registers preserved across API calls.
Since I don't use MMX, and finish FPU calcs and leave all registers free before returning from the procedure anyway, I've never bothered to find out for sure if these assumptions are correct. So, are they?
1) FPU and MMX share same storage so can't be used at the same time.
Although I'm not familiar with the MMX instructions, I think you can specify which registers are to be used. It could thus be possible to use both the FPU and MMX at the same time. However, since the MMX is used primarily for MultiMedia programming and ALL registers would normally be used, there may not be any usefulness for sharing with the FPU.
2) You can't assume MS won't use FPU during some API calls (in both older and new Windows).
I'm not so sure that MS programmers know how to use FPU instructions. However, they seem to be using the MMX instructions (which use the same hardware as the FPU instructions) for all their computations, especially those required for window sizing.
3) You should clear the FPU stack when making (most) API calls, if only just to be on the safe side.
In my opinion, there is NO need for such precaution. As many have already noticed, MMX instructions don't seem to require free registers (as do the FPU instructions). If float computations should be required by the OS, their compiler must certainly free whatever register(s) may be needed in order to avoid "invalid operation exceptions".
4) You can't count on MMX registers preserved across API calls.
TRUE. However, my interpretation of the intent of this thread is that the use of MMX registers is NOT for long term storage of data but simply as very temporary storage for transmitting some extra data to your own procedures which would be written specifically to accept such transmission.
Thanks raymond,
I assume you're kidding about MS programmers not knowing how to use FPU. I always leave FPU cleared out of habit, regardless of API calls, seems safest and cleanest that way; but agree it shouldn't cause the OS trouble if I don't. I thought we were talking about (possibly) using MMX registers for storage across API calls. Apart from those points we're on the same page.
Quote from: raymond on July 09, 2016, 04:59:53 AM
4) You can't count on MMX registers preserved across API calls.
TRUE. However, my interpretation of the intent of this thread is that the use of MMX registers is NOT for long term storage of data but simply as very temporary storage for transmitting some extra data to your own procedures which would be written specifically to accept such transmission.
Are you sure? The FXSAVE and XSAVE area contains room for FPU/MMX registers. That's necessary for task switches. Why not for API calls?
Gunther
Personally I'm not sure MMX is not preserved by ABI. But MS doesn't say you can count on that anywhere. FPU stack is preserved, BTW, by printf family. But AFAIK even if you test all your API calls you can't be sure they won't break later. Anyway FXSAVE and XSAVE don't guarantee anything. A task switch preserves rax and rcx (etc) also, but they can still be trashed by API calls.
Quote from: rrr314159 on July 12, 2016, 04:32:10 AM
But MS doesn't say you can count on that anywhere. FPU stack is preserved, BTW, by printf family.
That's the tricky point. You're right. It would be an enormous amount of work to test all the API calls.
Gunther
I wonder at the logic here, if the 64 bit ABI says the MM registers are not specified, then you treat them as unspecified registers in much the same way as the volatile registers in either win32 or 64. Leaving data in an unspecified register when there is the potential for them to be used in intervening code is just bad programming practice. I used to see these donkeys testing code under win32 and if they could get away with avoiding the specs of the ABI OR lack of them on one Windows version, they thought they could do it on all versions. There were many confused programmers when their code went BANG on another OS version.
The simple rule is ANY volatile register is unsafe when it can be used by other code.
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
include \masm64\include\masm64rt.inc
.code
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
entry_point proc
.stack
    LOCAL .mm0 :QWORD
    LOCAL .mm1 :QWORD
    LOCAL .mm2 :QWORD
    LOCAL .mm3 :QWORD
    LOCAL .mm4 :QWORD
    LOCAL .mm5 :QWORD
    LOCAL .mm6 :QWORD
    LOCAL .mm7 :QWORD
    movq .mm0, mm0
    movq .mm1, mm1
    movq .mm2, mm2
    movq .mm3, mm3
    movq .mm4, mm4
    movq .mm5, mm5
    movq .mm6, mm6
    movq .mm7, mm7
  ; -------------------------------------
  ; use and abuse your MMX registers here
  ; -------------------------------------
    movq mm0, .mm0
    movq mm1, .mm1
    movq mm2, .mm2
    movq mm3, .mm3
    movq mm4, .mm4
    movq mm5, .mm5
    movq mm6, .mm6
    movq mm7, .mm7
    void(ExitProcess,0)
    ret
entry_point endp
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
end
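As an alternative to eight movq pairs, the FXSAVE / FXRSTOR instructions mentioned earlier in the thread can save and restore the whole x87/MMX state in one go. The following is only a sketch under stated assumptions: plain ml64 without the masm64 SDK macros, a made-up procedure name `with_saved_state`, an assumed build line, and no unwind data (.pushreg and friends), so it is not meant for code that must survive SEH unwinding.

```asm
; assumed build: ml64 /c demo.asm
;                link demo.obj /entry:main /subsystem:console kernel32.lib
extrn ExitProcess : proc

.code

; with_saved_state (hypothetical): preserves x87/MMX state around its body
with_saved_state proc
    push    rbp
    mov     rbp, rsp
    sub     rsp, 512+32         ; 512-byte FXSAVE image above 32 bytes of shadow space
    and     rsp, -16            ; FXSAVE and FXRSTOR need a 16-byte aligned address
    fxsave  [rsp+32]            ; x87/MMX registers, XMM registers, MXCSR, x87 control/status/tag

    ; use and abuse your MMX registers here; [rsp]..[rsp+31] is
    ; valid shadow space if an API call is needed in between

    fxrstor [rsp+32]            ; restore everything saved above, tag word included
    mov     rsp, rbp
    pop     rbp
    ret
with_saved_state endp

main proc
    sub     rsp, 40             ; shadow space + stack alignment
    call    with_saved_state
    xor     ecx, ecx            ; exit code 0
    call    ExitProcess
main endp
end
```

Note that FXRSTOR also puts the XMM registers and MXCSR back, so any SSE result produced between the two instructions has to be stored to memory or a general purpose register before the restore. The individual movq version above is lighter when only the MMX registers matter.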