I am doing some FPU programming and have encountered some errors. I am using the Visual Studio 2008 debugger. If I run the program and step through the various stages of the calculations (not single-stepping, but by blocks of code), then I do not have problems. But if I let it run and stop at the end of the calculations, one or more of the FPU registers end up with a displayed value of "ST5 = 1#IND" instead of some normal floating-point value such as "ST0 = +1.8518518518518518e-0002". The question is, "what does this indicate?"
I am also running a C program (ENT) against the same files, and I always get the same output values from my MASM program as from the C program.
Dave.
IND stands for "indeterminate" type
it is a special NaN value (the x87 "real indefinite" quiet NaN)
operations that produce a NaN include:
The divisions 0/0 and ±∞/±∞
The multiplications 0×±∞ and ±∞×0
The additions ∞ + (−∞), (−∞) + ∞ and equivalent subtractions
The standard has alternative functions for powers:
The standard pow function and the integer exponent pown function define 0^0, 1^∞, and ∞^0 as 1.
The powr function defines all three indeterminate forms as invalid operations and so returns NaN.
you can probably run up to the point just before the instruction that generates the NaN and examine the FPU registers
the fact that it works in single-step and not in normal execution might indicate you need an FWAIT someplace
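To make the list above concrete, here is a minimal sketch of how a masked invalid operation leaves the indefinite NaN in a register. It assumes a standard MASM32 installation (masm32rt.inc with its print, hex$ and exit macros); the variable name fpresult is made up for the example and is not from Dave's code.

```asm
include \masm32\include\masm32rt.inc

.data?
fpresult REAL8 ?                ; illustrative name, not from the original program

.code
start:
    finit                       ; control word = 037Fh, all six exceptions masked
    fldz                        ; ST(0) = 0.0
    fldz                        ; ST(0) = 0.0, ST(1) = 0.0
    fdivp   st(1), st(0)        ; 0/0: invalid operation, masked, so ST(0) = indefinite QNaN
    fstp    fpresult            ; as a REAL8 the indefinite is FFF8000000000000h
    mov     eax, dword ptr [fpresult+4]
    print   hex$(eax), 13, 10   ; show the high dword of the stored value
    exit
end start
```

Because the exception is masked, nothing faults; the NaN simply propagates through later calculations, which is why no exception is seen in the debugger.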
I wonder why I sometimes get the error and at other times everything is good? Does it have to do with having to put WAITs before or after some of the instructions?
Dave.
yes - lol
i just updated my previous post :P
that is a very likely reason
refer to Ray's tutorial - some instructions encode an FWAIT, already
Unfortunately, I do not get an exception so I do not know which calculation is going bad.
Is there any tutorial that explains where an FWAIT is needed?
Dave
well - maybe you can show us the series of instructions
maybe this will help, from Ray's tutorial...
Quote
This instruction prevents the CPU from executing its next instruction if the Busy bit of the FPU's Status Word is set.
The FPU does not have direct access to the stream of instructions. It will start executing an instruction only when the CPU detects one in the code bits and transmits the information to the FPU. While the FPU is executing that instruction, the CPU can continue to execute in parallel other instructions which are not related to the FPU. Some of the FPU instructions being relatively slow, the CPU could execute several of its own instructions during that period.
The CPU also has some read/write access to the FPU's Status and Control registers even while the FPU is executing an instruction. In some cases, it may be desirable to read/write these registers without delay. But, in most cases, it is preferable or even necessary to wait until the FPU has completed the current instruction before proceeding with the read/write of these registers.
For example, a comparison by the FPU may require several clock cycles to execute and set the bits in the proper fields of the Status Word register. Waiting until the FPU has completed the operation and effectively updated the Status Word is a necessity if reading that Status Word is to have any meaning.
Many of the instructions which read or write to the FPU's Status and Control registers are automatically encoded with the fwait code.
Whenever the FPU is instructed to store a result (such as an integer) intended to be used by a CPU instruction, the latter should be preceded by an explicit FWAIT instruction. (For example, storing the content of a data register to memory as an integer may take some 30 clock cycles to complete.)
Note: Original versions of MASM were inserting an FWAIT instruction in front of every FPU instruction except those specified as "no-wait". This may have been necessary with the earlier co-processors but not on the more modern ones. Such coding may be observed if some old code is disassembled. The CPU can now poll the Busy bit of the Status Word whenever necessary before sending data processing instructions to the FPU.
typically, you don't need FWAIT between FPU "internal" instructions
the FPU isn't multi-core or multi-threaded or anything
one FPU instruction has to complete before it can work on the next
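The rule quoted from Ray's tutorial (wait before the CPU reads something the FPU has just stored) could look like the sketch below. It assumes MASM32 (masm32rt.inc); fpval and ival are illustrative names. On current processors the FWAIT is most likely redundant for ordering and mainly matters when exceptions are unmasked.

```asm
include \masm32\include\masm32rt.inc

.data
fpval   REAL8 1234.0            ; illustrative input value

.data?
ival    DWORD ?                 ; integer result written by the FPU

.code
start:
    finit
    fld     fpval               ; ST(0) = 1234.0
    fistp   ival                ; FPU stores the value to memory as an integer
    fwait                       ; wait for the store (and any pending exception)
    mov     eax, ival           ; only now read the result with a CPU instruction
    print   str$(eax), 13, 10
    exit
end start
```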
Intel and AMD both say that the [F]WAIT instruction is only useful if you have unmasked FPU exceptions.
For your problem, store the FP result (or intermediate results) in memory and test the exponent field for all bits set - this indicates a NaN (or an infinity). At this point it may be helpful to print out all input values of the formula or step into an INT3.
qWord;
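A minimal sketch of the exponent test qWord describes, assuming MASM32 (masm32rt.inc). The helper name IsNanOrInf and the scratch variable probe are hypothetical. Note that the test also fires for infinities, and that a very large finite extended-precision value can turn into an infinity when it is stored as a REAL8.

```asm
include \masm32\include\masm32rt.inc

.data?
probe   REAL8 ?                 ; scratch slot, illustrative name

.code
; Hypothetical helper: returns EAX = 1 if ST(0) is a NaN or an infinity, else 0.
; ST(0) stays on the FPU stack.
IsNanOrInf PROC
    fst     probe               ; store a copy of ST(0) without popping
    mov     eax, dword ptr [probe+4]
    and     eax, 7FF00000h      ; isolate the 11-bit exponent field
    cmp     eax, 7FF00000h      ; all ones means NaN or infinity
    sete    al
    movzx   eax, al
    ret
IsNanOrInf ENDP

start:
    finit
    fldz
    fldz
    fdivp   st(1), st(0)        ; 0/0 gives the indefinite NaN
    call    IsNanOrInf
    .if eax != 0
        print "NaN or infinity in ST(0)", 13, 10   ; an int 3 could go here instead
    .endif
    fstp    st(0)               ; clean up the FPU stack
    exit
end start
```

Calling such a check after each block of calculations narrows down where the bad value first appears.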
Thank you both for the info.
Dave
Dave,
The deb macro may help you to find the point where it happens. Only the include line and the deb 4 ... line are interesting, all the rest of your code can stay "as is".
deb 4 prints to console, deb 5 writes to DebLog.txt
You can add more variables, like registers, local, global variables, even xmm regs.
; this line replaces "include \masm32\include\masm32rt.inc"
include \masm32\MasmBasic\MasmBasic.inc ; download (http://masm32.com/board/index.php?topic=94.0)
Init
push 123
xor ecx, ecx
.Repeat
fild dword ptr [esp]
deb 4, "FPU", ST(0), ST(1), ST(2), ST(3), ST(4), ST(5), ST(6), ecx
Inkey "more (y)?"
.Until eax!="y"
pop eax
Exit
end start
Output:
FPU
ST(0) 123.000000000000000
ST(1) 0.0
ST(2) 0.0
ST(3) 0.0
ST(4) 0.0
ST(5) 0.0
ST(6) 0.0
ecx 1
more (y)?
FPU
ST(0) 123.000000000000000
ST(1) 123.000000000000000
ST(2) 0.0
ST(3) 0.0
ST(4) 0.0
ST(5) 0.0
ST(6) 0.0
ecx 2
well - if you do as qWord suggests, you may not see the error
because you will be inserting FWAITS to store the results
what you might try is to enable interrupts in the control word
depends on how many FPU instructions you have
if there aren't too many, stick FWAITS in there and take them out until the problem pops up again :P
I doubt that fwait has anything to do with it. In 99% of my bugs, it's either a missing ffree st(7), or division by zero. Which is a fairly likely case, and difficult to chase, if you work with random numbers.
If there is a bug at all: You write that "one or some of the FPU registers end up with a displayed value of "ST5 = 1#IND"" - is the result in ST(0) ok for you? I am using Olly, and the higher FPU regs contain most of the time some trash from intermediate calculations.
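The "missing ffree st(7)" case can be reproduced with the sketch below, assuming MASM32 (masm32rt.inc); the label name is illustrative. Loading a ninth value onto a full FPU stack is a stack overflow, and with exceptions masked the FPU answers by putting the indefinite NaN into ST(0).

```asm
include \masm32\include\masm32rt.inc

.code
start:
    finit                       ; empty stack, all exceptions masked
    mov     ecx, 8
fillstack:
    fld1                        ; after eight loads all registers are in use
    dec     ecx
    jnz     fillstack
    fldpi                       ; ninth load: stack overflow, ST(0) = indefinite NaN
    fstsw   ax                  ; read the status word
    movzx   eax, ax
    and     eax, 41h            ; keep IE (bit 0) and SF (bit 6), the stack-fault flags
    print   hex$(eax), 13, 10
    exit
end start
```

An ffree st(7) before the fldpi would free the register being overwritten and avoid the overflow.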
Quote from: dedndave on December 19, 2012, 07:49:23 AM
well - if you do as qWord suggests, you may not see the error
because you will be inserting FWAITS to store the results
why should he insert FWAIT? In which way should a synchronization instruction cause an error?
Quote from: dedndave on December 19, 2012, 07:49:23 AM
what you might try is to enable interrupts in the control word
exactly in that case ( unmasked exceptions) he needs FWAIT for synchronization (according to the manuals)
the FWAIT's won't cause an error
just the opposite - they will mask the problem he is trying to find :P
as to the interrupts - i mis-spoke - lol
what i meant was....
when you initialize the FPU with FINIT, all the exceptions are set up to be handled by the FPU
that is why a NaN is being generated
by altering the control word, you can allow the FPU to generate exceptions that can be captured by a JIT debugger
interrupts - lol - i am showing my age :P
Remember we had a long thread about FPU exceptions (http://masm32.com/board/index.php?topic=186.msg956#msg956) ;-)
Just made a test with this:
Rand(-5.5, 9.3) ; push a random number into ST(0)
fdivr ; divide
It crashes after 165000 iterations. Difficult to chase with a debugger, I did it with the deb macro but setting the control bit is also an option.
I guess I need to modify the control word so I can trap any exceptions. I'll check Raymond's FpuLib documentation.
Dave.
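this will enable all 6 exception types
push eax ; make room on the stack
fstcw word ptr [esp] ; store the current control word
fwait
pop eax ; control word now in AX
and al,0C0h ; clear the six exception mask bits (bits 0-5)
push eax
fldcw word ptr [esp] ; load the modified control word
fwait
pop eax ; rebalance the stack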
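A variant that may be more practical is sketched below, assuming MASM32 (masm32rt.inc); ctrlword and fp3 are illustrative names. It unmasks everything except the precision exception, since almost any inexact result (1.0/3.0, for instance) would otherwise fault immediately.

```asm
include \masm32\include\masm32rt.inc

.data
fp3      REAL8 3.0              ; illustrative divisor

.data?
ctrlword WORD ?                 ; scratch copy of the FPU control word

.code
start:
    finit
    fstcw   ctrlword            ; read the current control word
    and     ctrlword, 0FFC0h    ; clear bits 0-5: unmask all six exceptions
    or      ctrlword, 0020h     ; set PM (bit 5) again: precision stays masked
    fldcw   ctrlword
    fld1
    fld     fp3
    fdivp   st(1), st(0)        ; 1/3 is inexact: sets PE but does not fault while PM is masked
    fstp    st(0)               ; discard the result
    exit
end start
```

With this control word, a 0/0 or a load onto a full stack raises an exception that a debugger can catch at the next waiting FPU instruction or FWAIT.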
That's an undocumented unstable state of the FPU after some long cycles of calculations; it is solved by initialising the FPU before the block of critical code. I'm using these two functions, one without exceptions and one with exceptions enabled :eusa_boohoo:
OPTION PROLOGUE : NONE
OPTION EPILOGUE : NONE
feinit PROC
; reset the coprocessor
FINIT ; reset the coprocessor status word
FSTSW AX ; save the coprocessor status word
FWAIT ; wait for the coprocessor operation to finish
TEST AX, 0FFFFh ; check that the coprocessor was reset
MOV EAX, 0 ; false return value
JNZ Exit1 ; reset failed - no coprocessor
; disable interrupts (mask exceptions), set precision and rounding (RC = 0, round to nearest)
ifdef NO_LD
PUSH WORD PTR 027Fh
else
PUSH WORD PTR 037Fh
endif
FLDCW [ESP]
POP AX ; true return value
Exit1 :
RET
feinit ENDP
feninit PROC
; reset the coprocessor
FNINIT ; reset the coprocessor status word
FNSTSW AX ; save the coprocessor status word
FWAIT ; wait for the coprocessor operation to finish
TEST AX, 0FFFFh ; check that the coprocessor was reset
MOV EAX, 0 ; false return value
JNZ Exit2 ; reset failed - no coprocessor
; enable interrupts (unmask exceptions), set precision and rounding (RC = 0, round to nearest)
ifdef NO_LD
PUSH WORD PTR 02A0h
else
PUSH WORD PTR 0380h
endif
FLDCW [ESP]
POP AX ; true return value
Exit2 :
RET
feninit ENDP
OPTION PROLOGUE : PrologueDef
OPTION EPILOGUE : EpilogueD
Quote
if there aren't too many, stick FWAITS in there and take them out until the problem pops up again
I would do it a bit differently if you know which parameters always generate that NAN (and the time it takes to arrive at those parameters is relatively short).
a) Put a break point after the last FPU instruction. I would also change that instruction to exit the program; the first run will be to make sure that the NAN gets produced.
b) Leave that first break point and insert a second break point after the preceding FPU instruction, and restart the program; check at that breakpoint whether the NAN got produced.
c) Remove that second breakpoint and keep moving it back, one FPU instruction at a time, until the NAN stops being produced. The next FPU instruction would then be the main culprit. Check the content of all the FPU/CPU registers along with the expected effect of the next FPU/CPU instructions on such content. You may also check if the NAN still got produced at the end of the computation; it could be an important clue.
Have fun.
Edit:
If you have several possible branches in your algo, you may also need to check those other branches if the program doesn't stop at the "second" break point.
Solved!
I enabled all exceptions and immediately ran into a precision exception doing an FDIV. This should be expected, so I re-enabled the PM mask and tried again. I immediately got a fault and determined that the loaded value in ST(0) was way wrong. I had recently converted my code from using real8 values to using tbytes, but failed to convert the array access for the change in size of the variables. I am using tbytes in an oword array to keep alignment, and just increment the array pointer by 16 for each entry.
Now it all seems to work again.
I will have to re-test all that I did before to ensure that I have not introduced another error, but feel confident that all will work.
Dave.
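For anyone hitting the same thing, the corrected access pattern might look like the sketch below. It assumes MASM32 (masm32rt.inc) and is not Dave's actual code; fptable, isum, ENTRY_COUNT and SLOT_SIZE are made-up names. Each REAL10 sits in a 16-byte slot, so the pointer must advance by 16 rather than by the 8 that was right for REAL8.

```asm
include \masm32\include\masm32rt.inc

ENTRY_COUNT equ 3
SLOT_SIZE   equ 16              ; one oword per entry, not 8 as with REAL8

.data
; three REAL10 values, each padded from 10 to 16 bytes
fptable REAL10 1.0
        BYTE 6 dup(0)
        REAL10 2.0
        BYTE 6 dup(0)
        REAL10 3.0
        BYTE 6 dup(0)

.data?
isum    DWORD ?

.code
start:
    finit
    mov     esi, offset fptable
    mov     ecx, ENTRY_COUNT
    fldz                        ; running sum in ST(0)
sumloop:
    fld     tbyte ptr [esi]     ; load the 10-byte value at the start of the slot
    faddp   st(1), st(0)        ; add it to the running sum
    add     esi, SLOT_SIZE      ; step to the next 16-byte slot
    dec     ecx
    jnz     sumloop
    fistp   isum                ; store the sum of the three entries as an integer
    print   str$(isum), 13, 10
    exit
end start
```

Stepping by 8 here would make every second load read the tail of one entry plus padding, which is the kind of garbage value that then faults or turns into a NaN.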
:t
:t :t
But, don't feel bad. We've all done something similar (or worse) in the past and will probably do it again sometime in the future. :icon_redface: