Yesterday I started a topic:

http://masm32.com/board/index.php?topic=487.msg3698#msg3698. What follows is some technical and historical background on that research field.

First of all: the basic idea behind the software is an extra-long dot product accumulator, which accumulates partial results without the catastrophic cancellation that can quickly render a scientific computation completely inaccurate. The thread above shows one such example, and also how to avoid these errors. The image below illustrates the principle with five-bit numbers.
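To make the principle concrete, here is a minimal sketch of such a long accumulator in Python (my own illustration, not the software discussed in the thread): every IEEE double is an integer multiple of a tiny power of two, so a sufficiently wide fixed-point register, simulated below with Python's arbitrary-precision integers, can hold every product of two doubles exactly and sum them with no rounding at all. Only the final conversion back to a double rounds, once.

```python
from fractions import Fraction
from math import frexp

OFFSET = 2252  # units of 2**-OFFSET cover even the product of two subnormals

def exact_dot(xs, ys):
    """Exact dot product via a simulated Kulisch-style long accumulator."""
    acc = 0  # arbitrary-precision integer stands in for the wide register
    for x, y in zip(xs, ys):
        mx, ex = frexp(x)                        # x = mx * 2**ex, 0.5 <= |mx| < 1
        my, ey = frexp(y)
        mant = int(mx * 2**53) * int(my * 2**53)  # exact 106-bit integer product
        acc += mant << (ex + ey - 106 + OFFSET)   # exact fixed-point addition
    return float(Fraction(acc, 2**OFFSET))        # a single rounding at the end

# The cancellation trap: the naive sum loses the 1.0 completely,
# the long accumulator keeps it.
xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
print(sum(x * y for x, y in zip(xs, ys)))  # 0.0 -- catastrophic cancellation
print(exact_dot(xs, ys))                   # 1.0 -- exact
```

A hardware long accumulator does the same thing with a wide register and a shifter instead of big integers, which is why it can run at the speed of ordinary floating-point arithmetic.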

But do we really need such techniques? The only possible answer is: yes. There is, for example, the ASCI program; ASCI stands for Accelerated Strategic Computing Initiative, with the following goal:

It aims to replace physical nuclear weapons testing with computer simulations.

There is a critical report by John Gustafson for the US Department of Energy with the title "Computational Verifiability and Feasibility of the ASCI Program". The author explains that overconfidence in conventional calculations, combined with error-prone software, could lead to a nuclear disaster. Moreover, he writes:

This forces us to a very different style of computing, one with which very few people have much experience, where results must come with guarantees. This is a new kind of science ....

That's for sure. A single wrong operation can ruin the entire calculation.

Since 2002 there have been discussions about the revision of the floating-point standard. I have a copy of a letter (from September 2004) to Bob Davis, Chairman of the IEEE Microprocessor Standards Committee; the authors include, among others, Ulrich Kulisch and William Kahan (the father of floating-point arithmetic)

https://en.wikipedia.org/wiki/William_Kahan. Here is one quote:

We think that the tremendous progress in computer technology and the great increase in computer speed should be accompanied by extensions of the mathematical capacity of the computer. Beyond what has already been done by IEEE754R, IFIP WG 2.5 expresses its desire that the following two requirements are included in the future arithmetic standard.

- For the data format double precision, interval arithmetic should be made available at the speed of simple floating-point arithmetic. Most processors on the market are equipped with arithmetic for multimedia applications. On these processors we believe that it is likely that only 0.1% more silicon in the arithmetic circuitry would suffice to realise this capability.
- High speed arithmetic for dynamic precision should be made available for real and for interval data. The basic tool to achieve high speed dynamic precision arithmetic for real and interval data is an exact multiply and accumulate operation for the data format double precision.
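The first of these two requirements is easy to demonstrate in software, even if slowly. Here is a hedged sketch (my own toy code, with names I invented): an interval is a lower/upper bound pair, and after every operation we round outward so the true real result always stays enclosed. Python gives no control over the rounding mode, so I widen by one ulp with math.nextafter (Python 3.9+), which is crude but safe; hardware directed rounding would make this as fast as plain floating point.

```python
import math
from fractions import Fraction

def outward(lo, hi):
    """Widen by one ulp on each side to guarantee enclosure."""
    return math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf)

def iadd(a, b):
    """Interval addition: add the bounds, then round outward."""
    return outward(a[0] + b[0], a[1] + b[1])

def imul(a, b):
    """Interval multiplication: the result is bounded by the
    extremes of the four endpoint products."""
    products = [x * y for x in a for y in b]
    return outward(min(products), max(products))

# 1/3 is not representable in binary; the outward-rounded point
# interval provably encloses the exact real value.
lo, hi = outward(1/3, 1/3)
print(Fraction(lo) < Fraction(1, 3) < Fraction(hi))  # True
```

With such intervals every result comes with the guarantee Gustafson asks for: the true value is certain to lie between the two bounds.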

Both authors are nearly 80 years old, but they see the entire question very clearly. I'm not sure whether the final version of the new floating-point standard will contain these two requirements; I have my doubts. The next question is: will the processor manufacturers follow the standard? The current floating-point hardware we have is far from ideal. What should we think of Intel's and AMD's design decision not to support BCD arithmetic in 64-bit long mode?

Anyway, the idea of an extra-long accumulator is not very new. I first came into contact with it during my time as a student at the Technical University Dresden. We had an IBM/370 mainframe with the IBM high-accuracy library installed. That library provided real and complex arithmetic and a lot of other fine stuff, and it used the long accumulator.

In 1983 I got hold of a book by Kulisch and others with the title "PASCAL-XSC", which stands for PASCAL for Extended Scientific Computing. It was impressive: they used the long accumulator, too. The only computer I had at the time was a Z80 machine (8 bits wide!) with 64 KB RAM (4 KB for the operating system) and a BASIC interpreter. I didn't have an assembler or compiler (too expensive for me), but in one of Berlin's public libraries I found a book with the Z80 opcodes. That was enough. I was assembler and linker in one person: I wrote the instructions down with pencil and paper and POKEd the bytes into memory from DATA lines in a BASIC program. After six weeks and countless system crashes I had my first working long accumulator for REAL4 values (the REAL8 format was unthinkable in those years, because my interpreter didn't support it). Those were wild times!

What else? There's also an English version of Kulisch's book.

And this is for Alex: a Russian version is also available. The third Russian edition appeared in 2006, and the compiler is still in use there.

So, that was a bit of background information, and I hope other forum members find these questions interesting.

By the way, I've found the books above together with the Z80 code in my bookshelf in the cellar.

Gunther