Getting the sign of 80000000h correct

allynm · May 30, 2012, 02:06:18 AM

Hello everyone,

I have run into a curious (for me, anyway) anomaly when trying to correctly compute the value of 80000000h. Suppose that eax has this value. Then crt_printf will treat it as a negative number. If, however, I know the value should be treated as positive, then I don't know how to get the sign right. The following code snippet detects and corrects the sign for any other integer that has the sign bit set. It works fine for 80000001h and higher, but it fails for 80000000h.

	mov	eax, 080000000h
	push	eax			      ;save eax: crt_printf overwrites it with its return value
	fn	crt_printf, OFFSET fmt1, eax  ;print original eax
	pop	eax
	mov	ebx, 07FFFFFFFh		      ;largest positive signed 32-bit value
	sub	ebx, eax		      ;check sign bit: borrow (carry) if eax > 7FFFFFFFh
	jc	comp
	jmp	fin
comp:	imul	eax, -1			      ;get 2's complement if sign bit set in eax
					      ;FAILS if eax is 80000000h
	fn 	crt_printf, OFFSET fmt1, eax  ;check result.
	
fin:	exit
end	start

Since I'm still pretty green with masm I imagine that others will have created more robust routines for fixing up errant signs.

Thanks very much,
Mark Allyn

jj2007 · May 30, 2012, 02:37:50 AM

It depends on what you see as "errant" signs...

80000000h=-2147483648	; printed as signed integer
80000000h=2147483648	; printed as unsigned integer
80000001h=-2147483647	; printed as signed integer
80000001h=2147483649	; printed as unsigned integer

So what exactly do you want to achieve? I.e., what is "right" for you?

include \masm32\MasmBasic\MasmBasic.inc   ; download
   Init
   Print Str$("80000000h=%i\n", 80000000h)
   Print Str$("80000000h=%u\n", 80000000h)
   Print Str$("80000001h=%i\n", 80000001h)
   Print Str$("80000001h=%u\n", 80000001h)
   Inkey
   Exit
end start
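
The same point can be sketched in C, since crt_printf is the C runtime's printf. This is only an illustration, not MASM code from the thread: the helper names format_signed and format_unsigned are made up here, and the bits are copied with memcpy so the sketch does not rely on implementation-defined integer conversions.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print the 32 bits as a signed decimal (like %i / %d). */
void format_signed(uint32_t bits, char *out, size_t out_size)
{
    int32_t as_signed;
    /* Reinterpret the same 32 bits as a two's complement value. */
    memcpy(&as_signed, &bits, sizeof as_signed);
    snprintf(out, out_size, "%" PRId32, as_signed);
}

/* Hypothetical helper: print the same 32 bits as an unsigned decimal (like %u). */
void format_unsigned(uint32_t bits, char *out, size_t out_size)
{
    snprintf(out, out_size, "%" PRIu32, bits);
}
```

The register contents never change between the two calls; only the format specifier decides which reading of the bits ends up on screen.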

qWord · May 30, 2012, 02:40:16 AM

A common approach would be to use the correct format specifier:

frmt db "%lu",0

Regardless of this, you should take a look at the NEG instruction, which negates a two's complement value, e.g.:

neg eax ; eax = (NOT eax)+1

Also keep in mind that 80000000h can't be negated correctly, because there is no way to represent the result (=2147483648) as a signed 32-bit integer.
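
A small C sketch of what NEG computes, assuming a 32-bit register is modelled as a uint32_t (the function name neg32 is hypothetical). It shows why 80000000h is the one non-zero value that comes back unchanged:

```c
#include <stdint.h>

/* Models x86 NEG on a 32-bit register: (NOT x) + 1, wrapping modulo 2^32. */
uint32_t neg32(uint32_t x)
{
    return (uint32_t)(~x + 1u);
}

/* neg32(0x00000001u) is 0xFFFFFFFFu  (1 becomes -1)
   neg32(0x80000001u) is 0x7FFFFFFFu  (-2147483647 becomes 2147483647)
   neg32(0x80000000u) is 0x80000000u  (+2147483648 does not fit in 32 signed bits) */
```

On the real CPU, NEG also sets the overflow flag for this one operand, so the failure can be detected after the instruction.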

allynm · May 30, 2012, 03:09:42 AM

Qword and jj,

jj first:
By "errant" what I mean is that 80000000h in this particular instance should be interpreted as a positive number. Of course, it could be the other way round in a different application or different instance of the same application, i.e. it should in fact be interpreted as negative, but turns out to be misinterpreted as positive.

qword next:

I looked at neg before I posted. As you say, neg fails on this particular number to do two's complement.

Regards,
Mark

jj2007 · May 30, 2012, 04:19:48 AM

The keyword is "interpretation". 80000000h can be interpreted as a negative or a positive number. Just tell the assembler, pardon: the printf or Str$() macro what you want...

ebx=2147483648
ebx=2147483648
ebx=2147483648

ebx=-2147483648
ebx=-2147483648
ebx=-2147483648

include \masm32\MasmBasic\MasmBasic.inc   ; download
   Init
   mov ebx, 80000000h
   Print Str$("ebx=%u\n", ebx)   ; print unsigned
   neg ebx
   Print Str$("ebx=%u\n", ebx)
   not ebx
   inc ebx
   Print Str$("ebx=%u\n\n", ebx)

   mov ebx, 80000000h
   Print Str$("ebx=%i\n", ebx)   ; print signed
   neg ebx
   Print Str$("ebx=%i\n", ebx)
   not ebx   ; qWord: neg eax ; eax = (NOT eax)+1
   inc ebx
   Print Str$("ebx=%i\n", ebx)

   Inkey
   Exit
end start

allynm · May 30, 2012, 11:21:56 AM

Hi Qword and JJ,

Let's forget about the crt_printf call. I agree that by using the appropriate format statement one can get either the "correct" or "incorrect" answer by using "%u" or "%d". But, suppose that the programmer does not know the "correct" value. Suppose the programmer has written code which results in 080000000h being moved to a register. Suppose that the application code uses the value in this register differently depending upon whether it is positive or negative. In this circumstance how would the application "know" whether to treat the register as containing a positive or negative value? It seems to me that the value (80000000h) is completely ambiguous.

Sorry for these stupid noob questions.

Regards,
Mark

jj2007 · May 30, 2012, 11:34:21 AM

Quote from: allynm on May 30, 2012, 11:21:56 AM
It seems to me that the value (80000000h) is completely ambiguous.

Not really... it's pretty straightforward:

	mov ebx, 80000000h
	test ebx, ebx
	.if Sign?
		Print "ebx is negative"
	.else
		Print "ebx is positive"
	.endif

The only thing that is surprising/not intuitive is that neg 80000000h produces 80000000h.
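
For comparison, here is a rough C equivalent of the sign test, plus one way around the NEG surprise: return the magnitude as an unsigned value, where 2147483648 does fit. Both function names are invented for this sketch, and it assumes the caller has already decided the bits are to be read as signed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same check as "test ebx, ebx" followed by ".if Sign?": is bit 31 set? */
bool sign_bit_set(uint32_t bits)
{
    return (bits & 0x80000000u) != 0;
}

/* Magnitude of the value when the bits are read as signed.
   The result is unsigned, so 80000000h yields 2147483648 without overflow. */
uint32_t signed_magnitude(uint32_t bits)
{
    return sign_bit_set(bits) ? (uint32_t)(0u - bits) : bits;
}
```

Keeping the sign in a flag and the magnitude in an unsigned value is what makes the 80000000h case representable here.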

dedndave · May 30, 2012, 01:02:31 PM

Quote
In this circumstance how would the application "know" whether to treat the register as containing a positive or negative value? It seems to me that the value (80000000h) is completely ambiguous.

ok - the question is not "is it positive or negative"
the question should be "is it signed or unsigned"

it's an issue of context

let's step back and examine the use of signed vs unsigned values
this applies to bytes, words, dwords - any size integer, actually
but - let's use dwords

a dword is 32 bits wide
we may choose to use those 32 bits to represent an unsigned integer from 0 to 4294967295

well - Intel, as well as the designers of many other CPUs, wanted to be able to use the same
register or memory location to store signed integers, at the programmer's discretion
and - they wanted many instructions to work correctly, regardless of which was in use
so - two's complement was born

now, we may use the 32 bits to represent values from -2147483648 to +2147483647
instructions like ADD and SUB still work for either integer type

that gives you a little background to help understand that whether the value represents a signed or unsigned integer is up to the programmer

so, the value 80000000h may represent a positive value of 2147483648 (unsigned)
or, it may represent a negative value of -2147483648 (signed)
it depends on the context in which the value is used
which is a choice the programmer makes when he writes the code
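
a small C sketch of that idea, with made-up helper names (add32, as_signed32): one 32-bit addition produces one bit pattern, and that pattern is the right answer under either reading

```c
#include <stdint.h>
#include <string.h>

/* One 32-bit ADD: the CPU does not know or care whether the operands are signed. */
uint32_t add32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);   /* wraps modulo 2^32, like the register does */
}

/* Read the same 32 bits as a two's complement signed value. */
int32_t as_signed32(uint32_t bits)
{
    int32_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* FFFFFFFEh + FFFFFFFFh gives FFFFFFFDh:
   unsigned reading: 4294967294 + 4294967295 wraps to 4294967293
   signed reading:   -2 + -1 = -3 */
```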

clive · May 31, 2012, 12:37:14 AM

Quote from: allynm
But, suppose that the programmer does not know the "correct" value. Suppose the programmer has written code which results in 080000000h being moved to a register. Suppose that the application code uses the value in this register differently depending upon whether it is positive or negative. In this circumstance how would the application "know" whether to treat the register as containing a positive or negative value? It seems to me that the value (80000000h) is completely ambiguous.

It's more a matter of documentation. If someone creates an API which steps outside of normal numerical representation or ranges for certain cases, you'd hope that would be expressed somewhere. In binary, as in decimal, there are some numbers that cannot be represented exactly, e.g. 1/3 in decimal or 1/5 in binary.

If you have a signed 32-bit number scheme where 0x80000000 is positive, you would need to special-case all of the occasions where standard 2's complement math is being applied. Alternatively you could use more bits, or an architecture that treats numbers differently. Things like floating point add a host more issues.
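
As a sketch of the "use more bits" option, assuming a 64-bit integer type is available: widening the 32-bit pattern forces the signed/unsigned decision to be made once, explicitly, and afterwards negation is no longer a problem. The function names below are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Zero-extend: treat the 32 bits as unsigned (0 .. 4294967295). */
int64_t widen_unsigned(uint32_t bits)
{
    return (int64_t)bits;
}

/* Sign-extend: treat the 32 bits as two's complement (-2147483648 .. 2147483647). */
int64_t widen_signed(uint32_t bits)
{
    int32_t value;
    memcpy(&value, &bits, sizeof value);
    return (int64_t)value;
}
```

With 64 bits, -widen_signed(0x80000000u) is simply 2147483648, because the wider type has room for it.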

At the end of the day machines are pretty dumb; you have to pay attention to how data is represented, the type and size of numbers you will encounter, and the conditions and circumstances where they might overflow or underflow and how this might impact the accuracy and precision of the computations. People often ignore these conditions, and compilers/coprocessors often hold intermediate results to higher precision.

raymond · May 31, 2012, 10:37:07 AM

Quote
It seems to me that the value (80000000h) is completely ambiguous.

It would be completely ambiguous if, and only IF, the programmer himself doesn't take the necessary precautions to prevent it from being ambiguous.

allynm · May 31, 2012, 09:51:34 PM

Hello everyone,

Thanks for your comments. They were very interesting and helpful. Clive zeroed in on my concern very clearly. Raymond's very succinct statement begs the question: yes, but HOW BEST to take precautions? In principle this could mean that at any point in the program where a sign change (btw, thanks Dave for making this important distinction) MIGHT occur, and where its occurrence would matter to the outcome, we must test the sign bit. Maybe so. I guess one could write a simple macro that would make the coding less complex.

In any event, thanks to all who helped. A wonderful forum!

Mark

dedndave · May 31, 2012, 10:05:16 PM

well - you want to learn a little detail about how the flags work

in general - the carry flag extends an unsigned value by one bit
the overflow flag is the signed counterpart of the carry flag
the sign flag tells you if the high bit is set or not
the zero flag tells you if the result is zero
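
a rough C model of those four flags for a 32-bit ADD, just to make the rules concrete (the AddFlags struct and add32_flags name are invented for this sketch, the real flags live in EFLAGS)

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t result;
    bool carry;     /* unsigned result did not fit in 32 bits */
    bool overflow;  /* signed result did not fit in 32 bits */
    bool sign;      /* high bit of the result */
    bool zero;      /* result is zero */
} AddFlags;

AddFlags add32_flags(uint32_t a, uint32_t b)
{
    AddFlags f;
    f.result = (uint32_t)(a + b);
    /* A wrapped unsigned sum is always smaller than either operand. */
    f.carry = f.result < a;
    /* Signed overflow: operands share a sign bit, result has the other one. */
    f.overflow = (((uint32_t)~(a ^ b) & (a ^ f.result)) >> 31) != 0;
    f.sign = (f.result >> 31) != 0;
    f.zero = f.result == 0;
    return f;
}
```

for example 7FFFFFFFh + 1 sets overflow and sign but not carry, while FFFFFFFFh + 1 sets carry and zero but not overflow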

K_F · June 20, 2012, 01:55:18 AM

Quote from: raymond on May 31, 2012, 10:37:07 AM
.... if, and only IF.....

ah! .. maths theorem 1.0s 'introduction'.

The MASM Forum
