FPU exceptions

Started by jj2007, June 02, 2012, 03:53:10 AM


jj2007

Quote from: raymond on May 24, 2012, 11:02:24 AM in the old Forum ;)
From the description of the Status Word,
Quote
Bits 6-0 are flags raised by the FPU whenever it detects an exception. Those exception flags are cumulative in the sense that, once set (bit=1), they are not reset (bit=0) by the result of a subsequent instruction which, by itself, would not have raised that flag. Those flags can only be reset by either initializing the FPU (FINIT instruction) or by explicitly clearing those flags (FCLEX instruction).

Which means that if you check for an invalid operation at the end of a series of FPU instructions, that flag could have been raised by any of those previous instructions and not necessarily by the last one. When you need to find where the invalid operation occurs, you need to check for it after each operation which could be the culprit.
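The cumulative behaviour described above can be sketched with a toy model. This is plain Python, not FPU code: the bit positions follow the Intel manuals, while the `StatusWord` class and its method names are made up for illustration.

```python
# Toy model of the x87 status word's exception flags (bits 0-6).
# Bit positions follow the Intel manuals; the class is purely illustrative.
FLAG_BITS = {
    "IE": 0,  # invalid operation
    "DE": 1,  # denormalized operand
    "ZE": 2,  # zero divide
    "OE": 3,  # overflow
    "UE": 4,  # underflow
    "PE": 5,  # precision (inexact result)
    "SF": 6,  # stack fault
}


class StatusWord:
    def __init__(self):
        self.value = 0

    def raise_flags(self, *names):
        """OR the given flags in; flags set earlier are never cleared here."""
        for name in names:
            self.value |= 1 << FLAG_BITS[name]

    def clear(self):
        """Roughly what FCLEX does: clear bits 0-7 (flags, SF, ES) and B (bit 15)."""
        self.value &= ~0x80FF

    def flags(self):
        return [n for n, bit in FLAG_BITS.items() if (self.value >> bit) & 1]


sw = StatusWord()
sw.raise_flags("IE")   # some early instruction was invalid
sw.raise_flags("PE")   # a later, harmless one was merely inexact
print(sw.flags())      # IE is still reported at the end of the sequence
sw.clear()
print(sw.flags())
```

Checking only at the end shows IE without saying which instruction raised it, which is why the check has to follow each suspect instruction.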

Strangely enough, it seems that the OS does the job for you: It reports the line where the flag was set, not the line where the exception was triggered...

   FpuSet MbNear64, exall   ; set rounding=near, 64 bits precision, all flags set except precision flag
  Try "fdiv zero is not a good idea"   ; you can pass your own message with try
   fldpi   ; push 3.14
   fldz      ; push 0

; the deb macro can display all kinds of numerical values (reg8/16/32, immediate constants, xmm regs, fpu regs etc)
; you can use it with the console (deb 4), with messageboxes (deb 1-3) and even with logfiles (deb 5)
   deb 4, "First, let's have a look at the FPU:", ST(0), ST(1)

   PrintLine CrLf$, "And now the illegal fdiv instruction:"
   fdiv      ; 3.1415/0 -> sets Z flag but does not yet trigger the exception
   nop      ; line 18
   nop      ; line 19
   fstp st   ; triggers exception (check the line reported by the OS!!)
  Catch
   PrintLine "LastEx=  ", Tb$, Hex$(LastEx(code))
   PrintLine "Address=  ", Tb$, Hex$(LastEx(addr))
   PrintLine Str$("Source line=\t%i ", LastEx(line)), String$(25, 60)
   .if LastEx(user)
      PrintLine "Your coder says ", eax
   .endif
     Inkey CrLf$, "The OS reports:", CrLf$, LastEx(info)

Source & exe attached; see also the Exceptions, runtime errors and debugging in Masm32 thread.

First, let's have a look at the FPU:
ST(0)           0.0
ST(1)           3.14159265358979324

And now the illegal fdiv instruction:
LastEx=         C000008E
Address=        004011BE
Source line=    17 <<<<<<<<<<<<<<<<<<<<<<<<<
Your coder says fdiv zero is not a good idea

The OS reports:
{EXCEPTION ERROR}
Floating-point division by zero.
EIP     004011BE
Code    C000008E

raymond

Quote
FpuSet MbNear64, exall   ; set rounding=near, 64 bits precision, all flags set except precision flag
My apology Jochen if I don't have a copy of all your multiple macros and it's almost impossible for me to understand all your High Level Language.

Before I make other comments based on false assumptions, I have to assume that the flags you are talking about are the interrupt masks of the Control Word (since the rounding and precision control bits are part of that register).

Without any alteration, those masks are all set by default upon opening a program. Under those conditions, the FPU would handle ALL exceptions without passing any to the OS. When one (or more) of those masks is cleared, the related exception(s) would then be passed to the OS for exception handling. (This can then be intercepted by the program to handle it instead of the OS.)

If you keep all masks set except the precision mask, only precision exceptions would be passed to the OS. All other exceptions including Division-by-Zero and Invalid Operations would be handled by the FPU itself and the OS would never be informed about them.
Whenever you assume something, you risk being wrong half the time.
https://masm32.com/masmcode/rayfil/index.html

jj2007

Quote from: raymond on June 02, 2012, 10:18:21 AM
Quote
FpuSet MbNear64, exall   ; set rounding=near, 64 bits precision, all flags set except precision flag
My apology Jochen if I don't have a copy of all your multiple macros and it's almost impossible for me to understand all your High Level Language.

Sorry, Raymond, I had assumed that Try/Catch, Print etc were self-explanatory, but it seems I was wrong all the time  ;)

Quote
Before I make other comments based on false assumptions, I have to assume that the flags you are talking about are the interrupt masks of the Control Word (since the rounding and precision control bits are part of that register).

Without any alteration, those masks are all set by default upon opening a program. Under those conditions, the FPU would handle ALL exceptions without passing any to the OS. When one (or more) of those masks is cleared, the related exception(s) would then be passed to the OS for exception handling. (This can then be intercepted by the program to handle it instead of the OS.)

If you keep all masks set except the precision mask, only precision exceptions would be passed to the OS. All other exceptions including Division-by-Zero and Invalid Operations would be handled by the FPU itself and the OS would never be informed about them.

Yes, that's correct, of course. Passing exall to the FpuSet macro means "trigger all FPU exceptions" (and the precision mask remains set because that exception would be triggered all the time). My wording above was incorrect; what I meant is indeed "all flags set to zero except precision flag".

The whole point was a different one, though: In case of an invalid operation, the FPU sets a flag but non-FPU code continues operating until it hits another FPU instruction, e.g. fstp st. So the line where the invalid operation occurs is different from the line where the exception is being passed to the OS - bad news for bug-chasers.

However, it seems that through some magic that I don't yet understand, the OS knows the line where the invalid operation occurred, and reports it correctly. With /MAP /MAPINFO:LINES for the linker, the RichMasm editor is able to grab that info and thus can highlight, in the source, the line where the invalid FPU instruction occurred. Which makes bug-chasing a little bit easier ;-)

raymond

Now that my first assumption has been cleared, let's clear my second assumption.

Quote
fdiv      ; 3.1415/0 -> sets Z flag but does not yet trigger the exception

You have confirmed that the Divide-by-Zero interrupt mask had been cleared. THEREFORE, the divide by zero DOES trigger an exception which is sent to the OS for handling and may be the one retrieved by LastEx.

Quote
fstp st   ; triggers exception (check the line reported by the OS!!)

As far as I know, this fstp instruction should NOT trigger any exception. The value of INFINITY is a valid floating point value and storing it in another FPU register (or even memory) should thus be considered valid (even if you pop it immediately).

If you had used the fist instruction to store the INFINITY value as an integer to memory, that would have raised an exception.

jj2007

Quote from: raymond on June 03, 2012, 09:40:52 AM
THEREFORE, the divide by zero DOES trigger an exception ... this fstp instruction should NOT trigger any exception.

That was also my first assumption, but I was wrong all the time ;)
Below is a snippet showing what happens...

include \masm32\include\masm32rt.inc

.code
start:
push 1000001100100000b
fldcw [esp]

fldpi
fldz
fdiv      ; 3.1415/0 -> sets Z flag but does not yet trigger the exception
nop
nop
fstp st   ; triggers exception - comment out to see...

inkey "Hi Ray"
exit

end start
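For readers who do not have the control word layout in their head, the constant 1000001100100000b loaded by fldcw can be decoded with a few lines of Python. The field positions are the documented ones (mask bits 0-5, precision control bits 8-9, rounding control bits 10-11); the function name `decode_control_word` is just an illustrative choice.

```python
# Decode an x87 control word. Field layout per the Intel manuals;
# the helper name and the dictionary keys are illustrative only.
MASK_NAMES = ["invalid", "denormal", "zero_divide",
              "overflow", "underflow", "precision"]   # bits 0..5
PRECISION = {0b00: "24-bit (single)", 0b01: "reserved",
             0b10: "53-bit (double)", 0b11: "64-bit (extended)"}
ROUNDING = {0b00: "nearest", 0b01: "down", 0b10: "up", 0b11: "truncate"}


def decode_control_word(cw):
    # A mask bit of 1 means the FPU handles that exception itself.
    masked = [n for bit, n in enumerate(MASK_NAMES) if (cw >> bit) & 1]
    unmasked = [n for n in MASK_NAMES if n not in masked]
    return {
        "masked": masked,
        "unmasked": unmasked,
        "precision": PRECISION[(cw >> 8) & 0b11],
        "rounding": ROUNDING[(cw >> 10) & 0b11],
    }


print(decode_control_word(0b1000001100100000))  # the value pushed in the snippet
print(decode_control_word(0x037F))              # control word after FINIT
```

For the snippet's value, only the precision exception stays masked, so a divide by zero is one of the exceptions that can reach the OS.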

dedndave

i don't suppose you guys have looked at the Intel Programmer's Reference Manual ?   :biggrin:

raymond

Quote
fstp st   ; triggers exception - comment out to see...

I disagree with the comment.

i) An exception was triggered by the fdiv instruction due to the division by zero.
ii) Your exception mask was cleared for the Division-by-Zero exceptions.
iii) The FPU exception was sent to the OS to be resolved. Meanwhile, while it gets resolved, CPU instructions can continue to be processed.
iv) The OS did not find a reference where the FPU exception could be handled and resolved. Since the OS is not designed to resolve it by itself, it could not clear the IR bit on the Status Word of the FPU. The program thus simply halts as soon as the FPU is sent another instruction to process, regardless of whether it may be valid or not.

Quote
The IR field (bit 7) or Interrupt Request gets set to 1 by the FPU while an exception is being handled and gets reset to 0 when the exception handling is completed. When the interrupt is masked in the Control Word for the FPU to handle the exception, this bit may never be seen to be set while stepping through the instructions with a debugger. However, if the programmer handles the interrupt, that bit should remain set until the interrupt handling routine is completed.

I should be able to provide more input/insight when I start playing with the exception handler.

jj2007

Quote from: dedndave on June 03, 2012, 11:06:04 PM
i don't suppose you guys have looked at the Intel Programmer's Reference Manual ?   :biggrin:

You mean this part?

When a floating-point exception is unmasked and the exception condition occurs, the x87 FPU stops further execution of the floating-point instruction and signals the exception event. On the next occurrence of a floating-point instruction or a WAIT/FWAIT instruction in the instruction stream, the processor checks the ES flag in the x87 FPU status word for pending floating-point exceptions. If floating-point exceptions are pending, the x87 FPU makes an implicit call (traps) to the floating-point software exception handler. The exception handler can then execute recovery procedures for selected or all floating-point exceptions.

Check yourself :biggrin:
include \masm32\include\masm32rt.inc

.code
start:
push 1000001100100000b
fldcw [esp]

fldpi
fldz
fdiv      ; 3.1415/0 -> sets Z flag but does not yet trigger the exception
nop
nop

testit = 4 ; check which ones crash the proggie

if testit eq 1
fstp st
elseif testit eq 2
fclex
elseif testit eq 3
fnclex
elseif testit eq 4
nop
endif
inkey "That tickled a bit but I liked it"
exit

end start


Hint: One of them is present just before the call to ErrLineFromMap in \Masm32\MasmBasic\MasmBasic.inc  ;)

RuiLoureiro

Jochen,
I decided to compute a division by 0 rather than stop when one is found. It gives infinity and doesn't raise the invalid operation flag.
Better: I decided to compute an expression until it raises the invalid operation flag. When we get an invalid operation, we call a procedure to examine that case.

If it is an "Indeterminate form", it stops. If not, it continues. In this way, after fsub, fadd, fmul and fdiv, if we get an invalid operation, we jump to goto_examine_sub, goto_examine_add, goto_examine_mul or goto_examine_div, respectively.

If it is one of these cases:

+INFINITY-INFINITY, -INFINITY+INFINITY, INFINITY*0, 0*INFINITY, INFINITY/INFINITY or 0/0

it stops and gives "Indeterminate form".
It seems to work well  ;)

EDIT:
(1/0)+(1/0)+(1/(ln(e)-1))+(1/0)+(1/0)+(1/(ln(e)-1))+(1/0)-(1/0)+(1/(ln(e)-1))
= Indeterminate form

(1/0)+(1/0)+(1/(ln(e)-1))+(1/0)+(1/0)+(1/(ln(e)-1))+(1/0)+(1/0)+(1/(ln(e)-1))
=INFINITY
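The operand check described in this post can be sketched as follows. This is Python rather than FPU code, and `is_indeterminate` is a hypothetical helper, not a routine from Calcula; it only classifies the six operand combinations listed above.

```python
import math


def is_indeterminate(op, a, b):
    """Return True if `a op b` is one of the indeterminate forms listed above."""
    if op == "+":   # +INF + -INF or -INF + +INF
        return math.isinf(a) and math.isinf(b) and (a > 0) != (b > 0)
    if op == "-":   # INF - INF with equal signs
        return math.isinf(a) and math.isinf(b) and (a > 0) == (b > 0)
    if op == "*":   # INF * 0 or 0 * INF
        return (math.isinf(a) and b == 0) or (a == 0 and math.isinf(b))
    if op == "/":   # INF / INF or 0 / 0
        return (math.isinf(a) and math.isinf(b)) or (a == 0 and b == 0)
    raise ValueError(f"unsupported operator: {op!r}")


inf = math.inf
print(is_indeterminate("-", inf, inf))   # as in (1/0)-(1/0)
print(is_indeterminate("+", inf, inf))   # as in (1/0)+(1/0)
print(is_indeterminate("/", 1.0, 0.0))   # plain division by zero gives infinity
```

The first EDIT expression contains (1/0)-(1/0) and is therefore indeterminate, while the second only ever adds infinities of the same sign.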

raymond

I strongly believe that we are essentially arguing on semantics, i.e. the meaning of the word "trigger".

My understanding is that, when the exception masks are cleared, instructions such as a divide by zero would "trigger" an exception and set the ES flag in the Status Word. Thereafter, the FPU would not process any other FPU instruction until the exception "triggered" previously gets resolved and the ES flag is cleared.

jj2007

Quote from: raymond on June 05, 2012, 12:45:43 PM
I strongly believe that we are essentially arguing on semantics, i.e. the meaning of the word "trigger".

I agree, we are arguing over semantics. The word "trigger" implies for me some kind of strong action, like in "pull the trigger" - but then, I am not a native English speaker, right? :biggrin:

The process seems pretty clear by now:
- when it encounters fdiv zero or similar, the FPU sets a flag and tells the OS "there might be a problem ahead"
- the OS registers the EIP and the code of the exception but otherwise does nothing 8)
- if no further FPU instruction follows, no problems (i.e. ExitProcess etc would not raise an exception)
- however, if a second FPU instruction follows, (almost any, but see testit = 3 above in reply #7), then the FPU calls ("traps" in Intel speak) the OS and says "please handle my problem now".
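The sequence above can be replayed with a small simulation. It is a toy model in Python, not a description of what Windows does internally: `ToyFpu`, `FpuTrap` and `run` are invented names, and the idea that the reported line comes from the FPU remembering the faulting instruction (the x87 keeps a pointer to its last non-control instruction) is an assumption, not something established in this thread.

```python
class FpuTrap(Exception):
    """Stands in for the exception finally delivered to the OS handler."""


class ToyFpu:
    def __init__(self):
        self.pending = False     # plays the role of the ES flag
        self.fault_line = None   # line of the instruction that set the flag

    def execute(self, line, op):
        if op == "nop":
            return                # not an FPU instruction: nothing is checked
        if op == "fnclex":
            self.pending = False  # no-wait form: clears without checking first
            return
        # Any other FPU instruction (fstp, or fclex = fwait + fnclex)
        # checks for a pending unmasked exception before doing its work.
        if self.pending:
            raise FpuTrap(f"trap at line {line}, caused by line {self.fault_line}")
        if op == "fdiv_by_zero":  # zero-divide exception assumed unmasked
            self.pending = True
            self.fault_line = line


def run(last_op):
    fpu = ToyFpu()
    program = [(17, "fdiv_by_zero"), (18, "nop"), (19, "nop"), (20, last_op)]
    try:
        for line, op in program:
            fpu.execute(line, op)
    except FpuTrap as trap:
        return str(trap)
    return "no trap"


for op in ("fstp", "fclex", "fnclex", "nop"):   # testit = 1..4
    print(op, "->", run(op))
```

In this model fstp and fclex trap at line 20 while pointing back at line 17, and fnclex and nop let the program finish, which matches the four testit cases.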

So can we conclude that with the combined force of our almost 100 years of programming experience, the two of us solved one of the last mysteries in the history of FPU programming?  ;)

In the meantime, I tested FpuSet MbNear64, exall on my currently biggest source, the RichMasm editor with some 14,000 lines, and it works just as before, i.e. although all masks except P are cleared, no exceptions happen - but then, an editor does not have many FPU instructions that could cause havoc. For Rui's Calcula, one could expect quite a number of (legitimate) crashes :lol:

RuiLoureiro

 :biggrin:
Quote
For Rui's Calcula, one could expect quite a number of (legitimate) crashes

        Yes, Jochen, tons of crashes one after another !  :greensml:
        Now, my business is "crashes"  :badgrin: