FPU exceptions

jj2007 · June 02, 2012, 03:53:10 AM

Quote from: raymond on May 24, 2012, 11:02:24 AM in the old Forum ;)
From the description of the Status Word,
Quote
Bits 6-0 are flags raised by the FPU whenever it detects an exception. Those exception flags are cumulative in the sense that, once set (bit=1), they are not reset (bit=0) by the result of a subsequent instruction which, by itself, would not have raised that flag. Those flags can only be reset by either initializing the FPU (FINIT instruction) or by explicitly clearing those flags (FCLEX instruction).

Which means that if you check for an invalid operation at the end of a series of FPU instructions, that flag could have been raised by any of those previous instructions and not necessarily by the last one. When you need to find where the invalid operation occurs, you need to check for it after each operation which could be the culprit.
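A minimal sketch of the sticky behaviour described in the quote, assuming the MASM32 SDK in its default location (as in the snippets later in this thread) and the default Control Word, i.e. all exceptions masked. The messages are illustrative only.

```asm
include \masm32\include\masm32rt.inc

.code
start:
	finit			; default Control Word: all six exceptions masked
	fldpi
	fldz
	fdiv			; pi/0: result is infinity, ZE (bit 2) is raised
	fstp st			; discard the infinity

	fld1
	fld1
	fadd			; 1+1 raises nothing by itself, but ZE stays set
	fstp st

	fnstsw ax		; copy the Status Word to ax
	test ax, 4		; bit 2 = ZE (zero divide)
	.if !ZERO?
		print "ZE is still set after the harmless fadd", 13, 10
	.endif

	fnclex			; explicitly clear the exception flags
	fnstsw ax
	test ax, 4
	.if ZERO?
		print "ZE is clear after fnclex", 13, 10
	.endif

	inkey "Press a key"
	exit

end start
```

To locate the culprit in a longer sequence, the same fnstsw/test pair would have to follow each suspect instruction.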

Strangely enough, it seems that the OS does the job for you: It reports the line where the flag was set, not the line where the exception was triggered...

   FpuSet MbNear64, exall   ; set rounding=near, 64 bits precision, all flags set except precision flag
Try "fdiv zero is not a good idea"   ; you can pass your own message with try
   fldpi   ; push 3.14
   fldz      ; push 0

; the deb macro can display all kinds of numerical values (reg8/16/32, immediate constants, xmm regs, fpu regs etc.)
; you can use it with the console (deb 4), with messageboxes (deb 1-3) and even with logfiles (deb 5)
   deb 4, "First, let's have a look at the FPU:", ST(0), ST(1)

   PrintLine CrLf$, "And now the illegal fdiv instruction:"
   fdiv      ; 3.1415/0 -> sets Z flag but does not yet trigger the exception
   nop      ; line 18
   nop      ; line 19
   fstp st   ; triggers exception (check the line reported by the OS!!)
Catch
   PrintLine "LastEx= ", Tb$, Hex$(LastEx(code))
   PrintLine "Address= ", Tb$, Hex$(LastEx(addr))
   PrintLine Str$("Source line=\t%i ", LastEx(line)), String$(25, 60)
   .if LastEx(user)
      PrintLine "Your coder says ", eax
   .endif
   Inkey CrLf$, "The OS reports:", CrLf$, LastEx(info)

Source & exe attached; see also the Exceptions, runtime errors and debugging in Masm32 thread.

Code:

First, let's have a look at the FPU:
ST(0)           0.0
ST(1)           3.14159265358979324

And now the illegal fdiv instruction:
LastEx=         C000008E
Address=        004011BE
Source line=    17 <<<<<<<<<<<<<<<<<<<<<<<<<
Your coder says fdiv zero is not a good idea

The OS reports:
{ERRORE DI EXCEPTION}
Divisione a virgola mobile per zero.
EIP     004011BE
Code    C000008E

raymond · June 02, 2012, 10:18:21 AM

Quote
FpuSet MbNear64, exall ; set rounding=near, 64 bits precision, all flags set except precision flag

My apology Jochen if I don't have a copy of all your multiple macros and it's almost impossible for me to understand all your High Level Language.

Before I make other comments based on false assumptions, I have to assume that the flags you are talking about are the interrupt masks of the Control Word (since the rounding and precision control bits are part of that register).

Without any alteration, those masks are all set by default upon opening a program. Under those conditions, the FPU would handle ALL exceptions without passing any to the OS. When one (or more) of those masks is cleared, the related exception(s) would then be passed to the OS for exception handling. (This can then be intercepted by the program to handle it instead of the OS.)

If you keep all masks set except the precision mask, only precision exceptions would be passed to the OS. All other exceptions including Division-by-Zero and Invalid Operations would be handled by the FPU itself and the OS would never be informed about them.
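A minimal sketch of clearing a single mask, here the zero-divide mask ZM (bit 2 of the Control Word), by read-modify-write. It assumes the MASM32 SDK in its default location; the variable name cw is only illustrative.

```asm
include \masm32\include\masm32rt.inc

.data?
cw	dw ?			; scratch copy of the Control Word (illustrative name)

.code
start:
	finit			; start from the default: all exception masks set (037Fh)
	fnstcw cw		; read the current Control Word
	and cw, 0FFFBh		; clear bit 2 (ZM): zero-divide is now unmasked
	fldcw cw		; write it back; the other five masks stay set

	movzx eax, cw
	print hex$(eax), 13, 10	; show the new Control Word (037Bh expected)

	inkey "Press a key"
	exit

end start
```

With this setting only a division by zero would be passed on for exception handling; invalid operations, overflows etc. would still be handled by the FPU itself.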

jj2007 · June 02, 2012, 05:47:14 PM

Quote from: raymond on June 02, 2012, 10:18:21 AM
Quote
FpuSet MbNear64, exall ; set rounding=near, 64 bits precision, all flags set except precision flag
My apology Jochen if I don't have a copy of all your multiple macros and it's almost impossible for me to understand all your High Level Language.

Sorry, Raymond, I had assumed that Try/Catch, Print etc were self-explanatory, but it seems I was wrong all the time ;)

Quote
Before I make other comments based on false assumptions, I have to assume that the flags you are talking about are the interrupt masks of the Control Word (since the rounding and precision control bits are part of that register).

Without any alteration, those masks are all set by default upon opening a program. Under those conditions, the FPU would handle ALL exceptions without passing any to the OS. When one (or more) of those masks is cleared, the related exception(s) would then be passed to the OS for exception handling. (This can then be intercepted by the program to handle it instead of the OS.)

If you keep all masks set except the precision mask, only precision exceptions would be passed to the OS. All other exceptions including Division-by-Zero and Invalid Operations would be handled by the FPU itself and the OS would never be informed about them.

Yes, that's correct, of course. Passing exall to the FpuSet macro means "trigger all FPU exceptions" (and the precision flag remains set because it would be triggered all the time). My wording above was incorrect; what I meant is indeed "all flags set to zero except precision flag".

The whole point was a different one, though: In case of an invalid operation, the FPU sets a flag but non-FPU code continues operating until it hits another FPU instruction, e.g. fstp st. So the line where the invalid operation occurs is different from the line where the exception is being passed to the OS - bad news for bug-chasers.

However, it seems that through some magic that I don't yet understand, the OS knows the line where the invalid operation occurred, and reports it correctly. With /MAP /MAPINFO:LINES for the linker, the RichMasm editor is able to grab that info and thus can highlight, in the source, the line where the invalid FPU instruction occurred. Which makes bug-chasing a little bit easier ;-)

raymond · June 03, 2012, 09:40:52 AM

Now that my first assumption has been cleared, let's clear my second assumption.

Quote
fdiv ; 3.1415/0 -> sets Z flag but does not yet trigger the exception

You have confirmed that the Divide-by-Zero interrupt mask had been cleared. THEREFORE, the divide by zero DOES trigger an exception which is sent to the OS for handling and may be the one retrieved by LastEx.

Quote
fstp st ; triggers exception (check the line reported by the OS!!)

As far as I know, this fstp instruction should NOT trigger any exception. The value of INFINITY is a valid floating point value and storing it in another FPU register (or even memory) should thus be considered valid (even if you pop it immediately).

If you had used the fist instruction to store the INFINITY value as an integer to memory, that would have raised an exception.

jj2007 · June 03, 2012, 04:55:04 PM

Quote from: raymond on June 03, 2012, 09:40:52 AM
THEREFORE, the divide by zero DOES trigger an exception ... this fstp instruction should NOT trigger any exception.

That was also my first assumption, but I was wrong all the time ;)
Below is a snippet showing what happens...

Code:

include \masm32\include\masm32rt.inc

.code
start:
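	push 1000001100100000b	; control word: all masks cleared except precision (bit 5), 64-bit precision, round to nearest
	fldcw [esp]		; load it into the FPU from the stack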

	fldpi
	fldz
	fdiv      ; 3.1415/0 -> sets Z flag but does not yet trigger the exception
	nop
	nop
	fstp st   ; triggers exception - comment out to see...

	inkey "Hi Ray"
	exit

end start

dedndave · June 03, 2012, 11:06:04 PM

i don't suppose you guys have looked at the Intel Programmer's Reference Manual ?

raymond · June 04, 2012, 11:32:01 AM

Quote
fstp st ; triggers exception - comment out to see...

I disagree with the comment.

i) An exception was triggered by the fdiv instruction due to the division by zero.
ii) Your exception mask was cleared for the Division-by-Zero exceptions.
iii) The FPU exception was sent to the OS to be resolved. Meanwhile, while it gets resolved, CPU instructions can continue to be processed.
iv) The OS did not find a reference where the FPU exception could be handled and resolved. Since the OS is not designed to resolve it by itself, it could not clear the IR bit on the Status Word of the FPU. The program thus simply halts as soon as the FPU is sent another instruction to process, regardless of whether it may be valid or not.

Quote
The IR field (bit 7) or Interrupt Request gets set to 1 by the FPU while an exception is being handled and gets reset to 0 when the exception handling is completed. When the interrupt is masked in the Control Word for the FPU to handle the exception, this bit may never be seen to be set while stepping through the instructions with a debugger. However, if the programmer handles the interrupt, that bit should remain set until the interrupt handling routine is completed.

I should be able to provide more input/insight when I start playing with the exception handler.

jj2007 · June 04, 2012, 04:50:34 PM

Quote from: dedndave on June 03, 2012, 11:06:04 PM
i don't suppose you guys have looked at the Intel Programmer's Reference Manual ?

You mean this part?

When a floating-point exception is unmasked and the exception condition occurs, the x87 FPU stops further execution of the floating-point instruction and signals the exception event. On the next occurrence of a floating-point instruction or a WAIT/FWAIT instruction in the instruction stream, the processor checks the ES flag in the x87 FPU status word for pending floating-point exceptions. If floating-point exceptions are pending, the x87 FPU makes an implicit call (traps) to the floating-point software exception handler. The exception handler can then execute recovery procedures for selected or all floating-point exceptions.

Check for yourself:

Code:

include \masm32\include\masm32rt.inc

.code
start:
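	push 1000001100100000b	; control word: all masks cleared except precision (bit 5), 64-bit precision, round to nearest
	fldcw [esp]		; load it into the FPU from the stack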

	fldpi
	fldz
	fdiv      ; 3.1415/0 -> sets Z flag but does not yet trigger the exception
	nop
	nop

	testit = 4	; check which ones crash the proggie

	if testit eq 1
		fstp st
	elseif testit eq 2
		fclex
	elseif testit eq 3
		fnclex
	elseif testit eq 4
		nop
	endif
	inkey "That tickled a bit but I liked it"
	exit

end start

Hint: One of them is present just before the call to ErrLineFromMap in \Masm32\MasmBasic\MasmBasic.inc ;)

RuiLoureiro · June 05, 2012, 04:51:51 AM

Jochen,
I decided to compute a division by 0 and not to stop when we
find a division by 0. It gives infinity and doesn't raise the
invalid operation flag.
Better: I decided to compute an expression until it raises
the invalid operation flag. When we get an invalid operation we
call a procedure to examine that case.

If it is an "Indeterminate form" it stops. If not it continues.
In this way, after fsub, fadd, fmul and fdiv, if we get
invalid operation, then goto_examine_sub,goto_examine_add,
goto_examine_mul, goto_examine_div.

If it is one of these cases:

+INFINITY-INFINITY, -INFINITY+INFINITY, INFINITY*0, 0*INFINITY
INFINITY / INFINITY or 0/0

it stops and gives "Indeterminate form".
It seems to work well ;)

EDIT:
(1/0)+(1/0)+(1/(ln(e)-1))+(1/0)+(1/0)+(1/(ln(e)-1))+(1/0)-(1/0)+(1/(ln(e)-1))
= Indeterminate form

(1/0)+(1/0)+(1/(ln(e)-1))+(1/0)+(1/0)+(1/(ln(e)-1))+(1/0)+(1/0)+(1/(ln(e)-1))
=INFINITY
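A minimal sketch of the idea for one of the listed cases (INFINITY - INFINITY), assuming the MASM32 SDK in its default location and the default Control Word with all exceptions masked. It is not Rui's actual code, only an illustration that the division by zero raises ZE while the subtraction of the two infinities raises the invalid operation flag IE (bit 0 of the Status Word).

```asm
include \masm32\include\masm32rt.inc

.code
start:
	finit			; all exceptions masked
	fld1
	fldz
	fdiv			; 1/0 = infinity, raises ZE only
	fld st(0)		; duplicate it: ST(0) = ST(1) = infinity
	fnclex			; forget the ZE so that only the next result counts

	fsub			; infinity - infinity: invalid operation
	fnstsw ax
	test ax, 1		; bit 0 = IE (invalid operation)
	.if !ZERO?
		print "Indeterminate form", 13, 10
	.else
		print "No invalid operation", 13, 10
	.endif
	fstp st			; discard the result (a NaN in the invalid case)

	inkey "Press a key"
	exit

end start
```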

raymond · June 05, 2012, 12:45:43 PM

I strongly believe that we are essentially arguing on semantics, i.e. the meaning of the word "trigger".

My understanding is that, when the exception masks are cleared, instructions such as a divide by zero would "trigger" an exception and set the ES flag in the Status Word. Thereafter, the FPU would not process any other FPU instruction until the exception "triggered" previously gets resolved and the ES flag is cleared.

jj2007 · June 05, 2012, 04:37:19 PM

Quote from: raymond on June 05, 2012, 12:45:43 PM
I strongly believe that we are essentially arguing on semantics, i.e. the meaning of the word "trigger".

I agree, we are arguing over semantics. The word "trigger" implies for me some kind of strong action, like in "pull the trigger" - but then, I am not a native English speaker, right?

The process seems pretty clear by now:
- when it encounters fdiv zero or similar, the FPU sets a flag and tells the OS "there might be a problem ahead"
- the OS registers the EIP and the code of the exception but otherwise does nothing 8)
- if no further FPU instruction follows, no problems (i.e. ExitProcess etc would not raise an exception)
- however, if a second FPU instruction follows, (almost any, but see testit = 3 above in reply #7), then the FPU calls ("traps" in Intel speak) the OS and says "please handle my problem now".
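The steps above can be sketched as follows: with the zero-divide mask cleared, the no-wait instruction fnstsw reads the Status Word without trapping, so the pending exception (ES, bit 7) can be detected and discarded with fnclex before the next ordinary FPU instruction. This assumes the MASM32 SDK in its default location; the variable name cw is only illustrative.

```asm
include \masm32\include\masm32rt.inc

.data?
cw	dw ?			; scratch copy of the Control Word (illustrative name)

.code
start:
	finit
	fnstcw cw
	and cw, 0FFFBh		; clear ZM (bit 2): zero-divide is unmasked
	fldcw cw

	fldpi
	fldz
	fdiv			; pi/0: ZE and ES get set, but nothing traps yet
	nop			; non-FPU code continues to run

	fnstsw ax		; no-wait form: does not check for pending exceptions
	test ax, 80h		; bit 7 = ES (an unmasked exception is pending)
	.if !ZERO?
		fnclex		; no-wait form: clears the flags, the exception is discarded
		print "Pending exception found and cleared", 13, 10
	.endif

	fstp st			; would have trapped without the fnclex
	inkey "No crash"
	exit

end start
```

Removing the fnclex should reproduce the crash at the fstp st, as in the testit = 1 case.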

So can we conclude that with the combined force of our almost 100 years of programming experience, the two of us solved one of the last mysteries in the history of FPU programming? ;)

In the meantime, I tested FpuSet MbNear64, exall on my currently biggest source, the RichMasm editor with some 14,000 lines, and it works just as before, i.e. although all masks except P are cleared, no exceptions happen - but then, an editor does not have many FPU instructions that could cause havoc. For Rui's Calcula, one could expect quite a number of (legitimate) crashes :lol:

RuiLoureiro · June 06, 2012, 03:50:33 AM

Quote
For Rui's Calcula, one could expect quite a number of (legitimate) crashes

Yes, Jochen, tons of crashes one after another !

Now, my business is "crashes"
