The MASM Forum

Miscellaneous => 16 bit DOS Programming => Topic started by: bugthis on July 08, 2024, 11:29:54 AM

Title: Why does the second string output produce garbage?
Post by: bugthis on July 08, 2024, 11:29:54 AM
Here's the code:

Code (asm)
.MODEL SMALL, C

.DATA
  preTest DB 1 DUP(1)  ; If I comment out this line, the output seems to be partly correct.
                       ; With this line, it isn't. But why?
    ; (EDIT: DUP is fixed now in the forum code here. It was DUB before. But in the actual program code it was always DUP. )
  VbeInfoBlock DB "VBE2"  ; Vesa VBE 2.0 needs the string VBE2 to detect the VbeInfoBlock
  vesastring DB 508 DUP(48)  ; the VbeInfoBlock needs 4 + 508 = 512 bytes in total.
  ENDL DB "EOL", 10, 13, '$'  ; The ENDL and its '$' is a safety feature.
                              ; I use it also to end the output of the first comment.

.STACK 256

.CODE
start:
  MOV AX, @DATA   ; setup DS and ES
  MOV DS, AX
  MOV ES, AX

  MOV DX, OFFSET VbeInfoBlock
  MOV AH, 09h
  INT 21h         ; first output of VbeInfoBlock

  MOV DI, DX  ; There is probably a better way to set up ES:DI, but this seems to work.
              ; ES:DI needs to point to the beginning of the VbeInfoBlock starting with VBE2
              ; before calling int 10h. This does the trick.
  MOV AX, 4F00h  ; VESA VBE function 00h
  INT 10h

  MOV DI, 00h    ; setting DI back to 0 manually.
  MOV AX, 0900h

  ; Information:
  ; At this point everything looks fine in DEBUG. When I dump the memory contents with "d es:dx",
  ; which points to VbeInfoBlock, the dump shows the string "VESA" as it should.
  ; This also means the VESA function call worked: VBE2 is changed to VESA as expected.
  ; AH and AL are also swapped after calling int 10h, as they should be.

  ; The following interrupt should display this string, like it did before, but it doesn't,
  ; even though DX, AX and DI all have the same values as in the first output.
  INT 21h     ; second output of VbeInfoBlock displaying garbage

  MOV AX, 4C00h
  INT 21h     ; return to DOS
  end start

The second output produces garbage. See the screenshots.
When the preTest variable is commented out, the output looks much better.

Output of vesa.exe. The second output (after EOL) is wrong.
(https://i.postimg.cc/3yYDKRZ0/vesa-problem.png) (https://postimg.cc/3yYDKRZ0)

In debug a dump of ds:dx shows the correct values:
(https://i.postimg.cc/Ty05h6k7/correct-in-ram.png) (https://postimg.cc/Ty05h6k7)

But the output is still broken:
(https://i.postimg.cc/GHb0xnzJ/but-produces-garbage-with-second-ah09-string-output.png) (https://postimg.cc/GHb0xnzJ)
Title: Re: Why does the second string output produce garbage?
Post by: NoCforMe on July 08, 2024, 12:01:59 PM
At first glance:
You set DX = OFFSET VbeInfoBlock before two interrupt calls (INT 21h and INT 10h), by which time DX is, I'm pretty sure, trashed. (I forget just what the ABI specification is for DOS INT 21 calls, but I think only SI, DI and (maybe) BX are preserved, but certainly not AX, CX or DX.)

Try setting DX to the offset of what you want to print just before that INT 21h/AH = 9 and see if that works better.

Also, shouldn't your preTest DB 1 DUB(1) be preTest DB 1 DUP(1)? And if DUP is 1, there's no need for it anyhow. Just write preTest DB 1 if that's the value you want.
Title: Re: Why does the second string output produce garbage?
Post by: bugthis on July 08, 2024, 12:33:03 PM
Quote from: NoCforMe on July 08, 2024, 12:01:59 PM
At first glance:
You set DX = OFFSET VbeInfoBlock before two interrupt calls (INT 21h and INT 10h), by which time DX is, I'm pretty sure, trashed. (I forget just what the ABI specification is for DOS INT 21 calls, but I think only SI, DI and (maybe) BX are preserved, but certainly not AX, CX or DX.)

Try setting DX to the offset of what you want to print just before that INT 21h/AH = 9 and see if that works better.
I had a
MOV DX, OFFSET VbeInfoBlock
at the beginning in my code before all int 21h calls, but DX never changed. And the error still occurred. Since DX remained the same all the time and the command was therefore not necessary, I removed it to save memory.
So this isn't the reason for the error.


Quote:
Also, shouldn't your preTest DB 1 DUB(1) be preTest DB 1 DUP(1)?
The DUB is just a transcription error from retyping the code for the forum. Unfortunately, I can't copy & paste from QEMU. The program assembles; in the actual code it is DUP, as it should be. See the running program in the screenshots. With DUB it wouldn't assemble.
But I will correct it above. So thanks for the hint.

Quote:
And if DUP is 1, there's no need for it anyhow.
That's not the point. I had DUP = 80 before for testing purposes. I just decreased it to the absolute minimum to still be able to trigger the bug.
Title: Re: Why does the second string output produce garbage?
Post by: sinsi on July 08, 2024, 12:35:48 PM
What are you trying to print? If you're trying to dump the bytes you need to convert them to characters first.
DOS function 9 prints a string of characters, a byte of 00 would be converted to two characters, 3030h.
Also, the function needs a $ to terminate the string.
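
A minimal sketch of that byte-to-characters conversion, in case it helps. The procedure names PrintHexByte and PrintNibble are made up for this example and are not part of the original program; it assumes 16-bit MASM and uses INT 21h/AH=02h (write the character in DL) for the output.

```asm
; Print the byte in AL as two hex digits.
; PrintHexByte and PrintNibble are hypothetical helper names.
PrintHexByte PROC
  PUSH AX
  PUSH CX
  PUSH DX
  MOV  DH, AL        ; keep a copy of the byte
  MOV  CL, 4
  SHR  AL, CL        ; high nibble -> low 4 bits (8086-compatible shift)
  CALL PrintNibble
  MOV  AL, DH
  AND  AL, 0Fh       ; low nibble
  CALL PrintNibble
  POP  DX
  POP  CX
  POP  AX
  RET
PrintHexByte ENDP

; Print the value 0..15 in AL as one hex digit. Changes AX and DL.
PrintNibble PROC
  CMP  AL, 10
  JB   isDigit
  ADD  AL, 7         ; skip the gap between '9' and 'A'
isDigit:
  ADD  AL, '0'
  MOV  DL, AL
  MOV  AH, 02h       ; DOS: write character in DL to standard output
  INT  21h
  RET
PrintNibble ENDP
```

Called with AL = 00h this prints the two characters "00", and with AL = 0Dh it prints "0D" instead of moving the cursor.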
Title: Re: Why does the second string output produce garbage?
Post by: bugthis on July 08, 2024, 12:40:14 PM
Quote from: sinsi on July 08, 2024, 12:35:48 PM
What are you trying to print? If you're trying to dump the bytes you need to convert them to characters first.
DOS function 9 prints a string of characters, a byte of 00 would be converted to two characters, 3030h.
The program is still in its early stages. I actually just wanted to output the VBE info quick & dirty. The program is not finished yet and values are not yet extracted correctly from the info block, but it should at least display the string "VESA" correctly, starting the output at VESA and ending at the $ of ENDL at the latest. It doesn't do that at the moment unless preTest is commented out.

Quote:
Also, the function needs a $ to terminate the string.
There is a $, see the ENDL. It's an old programming trick.

EDIT
The variable names are for the assembler only. In principle, all bytes are a single string up to the first $.
So this:
str1 db "This "
str2 db "is "
str3 db "a "
str4 db "long "
str5 db "string. "
str6 db "With "
str7 db "an "
str8 db "end."
str9 db 13, 10, "$"

becomes:
MOV DX, OFFSET str1
MOV AH, 09h
INT 21h
with an output of:
This is a long string. With an end.

But you can also do this:
MOV DX, OFFSET str6
MOV AH, 09h
INT 21h
This will result in:
With an end.

So basically these are not string variables like in high level programming languages like C that require some sort of terminator (in C '\0'). This is assembly. You only need a terminator somewhere at the end. However, not every character string assigned to a variable name needs such a terminator. It is sufficient if one comes later in the data segment.
Title: Re: Why does the second string output produce garbage?
Post by: sinsi on July 08, 2024, 12:46:05 PM
Well according to your rules, the output is correct. DOS is printing the bytes, all 512 of them.
Bytes from 00-1F are not characters as such, e.g. 13 (0Dh) is a carriage return, 07 is a beep, ...
Title: Re: Why does the second string output produce garbage?
Post by: bugthis on July 08, 2024, 12:58:42 PM
Quote from: sinsi on July 08, 2024, 12:46:05 PM
Well according to your rules, the output is correct. DOS is printing the bytes, all 512 of them.
Bytes from 00-1F are not characters as such, e.g. 13 (0Dh) is a carriage return, 07 is a beep, ...

Have you even looked at the DEBUG output? Screenshot 2
The rule should be that everything from DS:DX up to the first $ character is output. But that is not the case.
What should actually be output can be seen in the screenshot with the DEBUG, the DEBUG dump starts with VESA.
But the program output does not start with VESA instead it starts with garbage.
And now explain to me why it doesn't output the data range from DS:DX until $ correctly, even though DS and DX point to the right place.

And one more thing. Of course, the VESA query could insert a $ somewhere before and then the output would end there. But it should at least start with VESA.
Title: Re: Why does the second string output produce garbage?
Post by: sinsi on July 08, 2024, 01:06:50 PM
You are looking at a hex dump of bytes. Debug has converted the buffer of bytes into characters for you.
The VESA string is printed, but if you look at the byte at 0982:0035 it is 0dh which, when printed by DOS, is the CR (carriage return) character, so the cursor moves to the start of the line, then DOS prints the next characters overwriting what's there.
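
One possible way to look at the buffer on screen without the cursor jumping around is to replace non-printable bytes with a dot, much like the right-hand column of a DEBUG dump. This is only a sketch: the count of 16 bytes and the labels dumpLoop, useDot and emit are assumptions for the example, and DS is assumed to point to the data segment already.

```asm
  MOV  SI, OFFSET VbeInfoBlock ; DS:SI -> start of the buffer
  MOV  CX, 16                  ; assumption: show only the first 16 bytes
  CLD                          ; make LODSB count upwards
dumpLoop:
  LODSB                        ; AL = byte at DS:SI, then SI = SI + 1
  CMP  AL, 20h
  JB   useDot                  ; control character (CR, LF, BEL, ...)
  CMP  AL, 7Eh
  JBE  emit                    ; printable ASCII, print it unchanged
useDot:
  MOV  AL, '.'                 ; stand-in for anything non-printable
emit:
  MOV  DL, AL
  MOV  AH, 02h                 ; DOS: write character in DL to standard output
  INT  21h
  LOOP dumpLoop                ; repeat until CX = 0
```

Because the loop is driven by CX rather than by a '$', a '$' byte inside the block does not end the output early either.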
Title: Re: Why does the second string output produce garbage?
Post by: bugthis on July 08, 2024, 01:12:53 PM
Thanks, I just found that out too.

So this answers my question.

BTW:
Quote:
You are looking at a hex dump of bytes. Debug has converted the buffer of bytes into characters for you.
I know that. This is obvious. When I say it should output the stuff from the dump, I mean just the string.
Title: Re: Why does the second string output produce garbage?
Post by: BugCatcher on July 08, 2024, 11:48:38 PM
3. INT 21h Function 09h: Write a $-terminated string to standard output
- The string must be terminated by a '$' character.
- DS must point to the string's segment, and DX must contain the string's offset:
.data
string BYTE "This is a string$"
.code
mov ah,9
mov dx,OFFSET string
int 21h
Title: Re: Why does the second string output produce garbage?
Post by: bugthis on July 09, 2024, 12:49:22 AM
Quote from: BugCatcher on July 08, 2024, 11:48:38 PM
3. INT 21h Function 09h: Write a $-terminated string to standard output
- The string must be terminated by a '$' character.
- DS must point to the string's segment, and DX must contain the string's offset:
.data
string BYTE "This is a string$"
.code
mov ah,9
mov dx,OFFSET string
int 21h

This is the same as this:

.data
string DB "T"
hello DB "his "
world DB "is "
maria DB "a "
loves DB "string"
me DB "$"
.code
mov ah,9
mov dx,OFFSET string
int 21h

So, no, not every string requires a terminator. It's enough if a later string in the sequence has a terminator and everything up to it belongs to what you want to print.
Title: Re: Why does the second string output produce garbage?
Post by: NoCforMe on July 09, 2024, 02:56:58 AM
Quote from: bugthis on July 09, 2024, 12:49:22 AM
So, no, not every string requires a terminator.
Wrong.
Every string does require a terminator. How that terminator gets put there is up to you, the programmer, but it's got to be there, otherwise you'll get a "run-on" string, most likely w/garbage at the end.

But you knew this.
Title: Re: Why does the second string output produce garbage?
Post by: zedd151 on July 09, 2024, 03:10:39 AM
Quote from: bugthis on July 09, 2024, 12:49:22 AM
So, no, not every string requires a terminator.
Since there is one terminator, everything before it is considered to be part of the same string - no matter whether or not you declare each part of the string with its own variable as you have done.

True you may print any portion of the string (from a given "variable" address to terminator) that has a variable name as in your example but it will always rely on the fact that a terminator is present.

 :rolleyes:
Title: Re: Why does the second string output produce garbage?
Post by: NoCforMe on July 09, 2024, 03:12:02 AM
Is there an echo in here?
Title: Re: Why does the second string output produce garbage?
Post by: bugthis on July 09, 2024, 04:09:01 AM
Quote from: NoCforMe on July 09, 2024, 02:56:58 AM
Quote from: bugthis on July 09, 2024, 12:49:22 AM
So, no, not every string requires a terminator.
Wrong.
Every string does require a terminator. How that terminator gets put there is up to you, the programmer, but it's got to be there, otherwise you'll get a "run-on" string, most likely w/garbage at the end.

But you knew this.
You obviously didn't understand what i wrote.
Title: Re: Why does the second string output produce garbage?
Post by: bugthis on July 09, 2024, 04:34:42 AM
Quote from: zedd151 on July 09, 2024, 03:10:39 AM
Quote from: bugthis on July 09, 2024, 12:49:22 AM
So, no, not every string requires a terminator.
Since there is one terminator, everything before it is considered to be part of the same string - no matter whether or not you declare each part of the string with its own variable as you have done.

True you may print any portion of the string (from a given "variable" address to terminator) that has a variable name as in your example but it will always rely on the fact that a terminator is present.

 :rolleyes:
As I said before, this is an old trick from old programmers the kids seem not to understand here.

A string is not defined by a terminator; a string is simply a sequence of characters assigned to a variable and accessible through that variable. Therefore, subsequent strings are not counted as part of the first string.
A terminator is required for the output, and it's absolutely okay to have that terminator in a later string as long as everything you want to output comes before it.

It's an old programming trick because it saves some bytes.
For example, you can do the following.

.data
question DB "Say yes or no?"
endl DB 13, 10, "$"
yesMsg DB "You said: yes.$"
noMsg DB "You said: no.$"

.code
ASKQ:
mov ah,9
mov dx,OFFSET question ; output will end, thanks to the following endl string sequence. 1-3 bytes are saved.
int 21h

...; some code that does the input, compare and branching (conditional jump)

no:
mov ah,9
mov dx,OFFSET endl ; We want one more line of space.
int 21h
mov dx,OFFSET noMsg
int 21h
mov ah,9
mov dx,OFFSET endl ; reusing endl again, so we have CRLF
int 21h
jmp ASKQ

yes:
mov ah,9
mov dx,OFFSET endl ; We want one more line of space.
int 21h
mov ah,9
mov dx,OFFSET yesMsg
int 21h
mov dx,OFFSET endl ; reusing endl again, so we have CRLF
int 21h

...; some code to end program

And by the way, yes, this trick is from an old assembly language programming book from a time when RAM was expensive and in short supply. So I'm not going to argue with kids about it. Everything has been already said about it. Just terminate every string if you think you have to do it that way. But then you might as well just use C. Because then you don't need the power of assembly language anyway.

Title: Re: Why does the second string output produce garbage?
Post by: NoCforMe on July 09, 2024, 04:48:52 AM
Quote from: bugthis on July 09, 2024, 04:09:01 AM
You obviously didn't understand what i wrote.
I did. You were just wrong in what you stated.
Title: Re: Why does the second string output produce garbage?
Post by: sinsi on July 09, 2024, 07:13:04 AM
Quote from: bugthis on July 09, 2024, 04:34:42 AM
As I said before, this is an old trick from old programmers the kids seem not to understand here.
Insults work so well here  :badgrin:

There seems to be one person here who doesn't understand things here.
Title: Re: Why does the second string output produce garbage?
Post by: bugthis on July 09, 2024, 07:54:55 AM
Quote from: sinsi on July 09, 2024, 07:13:04 AM
There seems to be one person here who doesn't understand things here.

I know at least one person here whose definition of strings is wrong and who thinks they must be all like C-strings.

Even in the C programming language, one regrets that a terminator character was used when defining C strings instead of simply defining the length of the C string, for example, in the first element.

Some reading:

https://queue.acm.org/detail.cfm?id=2010365

https://www.reddit.com/r/programming/comments/j6s62/the_most_expensive_onebyte_mistake_did_ken_dennis/?rdt=52626

Title: Re: Why does the second string output produce garbage?
Post by: NoCforMe on July 09, 2024, 08:03:11 AM
Quote from: bugthis on July 09, 2024, 07:54:55 AM
Quote from: sinsi on July 09, 2024, 07:13:04 AM
There seems to be one person here who doesn't understand things here.

I know at least one person here whose definition of strings is wrong and who thinks they must be all like C-strings.
Dunno if you're referring to me, but yes: apart from Pascal-type strings to which you refer (where the first element is the length of the string), all strings must "be like C strings" in that they must have a terminator. A NULL terminator in most cases, a '$' in the case of DOS, as in your code. Otherwise we all know what happens when you try to display a terminator-less string ...

Oh, sure, you can define a "string" internally in your code any way you like, but if you're going to use standard functions with it, Win32, Unix or DOS, it had better have a terminator!**

Your bringing up your "trick" (which is trivial and which anyone who ever wrote DOS assembly-language programs, as I did back in the day, was familiar with) doesn't change that. At some point in the string, there needs to be a terminator.

Now, have you fixed your problem with your code yet?

** With certain exceptions, like ExtTextOut() which requires the length of the string as one of its parameters.
Title: Re: Why does the second string output produce garbage?
Post by: sinsi on July 09, 2024, 08:16:35 AM
A string is a list of bytes. A computer needs a limit, which can be either a terminator (any byte, as long as it is consistent) or a length. Most Windows functions require C-type strings (null-terminated), but a function like WriteConsole requires a length. Windows kernel functions also use counted Unicode strings (the UNICODE_STRING structure), which may or may not be null-terminated.
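
DOS has a length-based output call as well. A small sketch, assuming 16-bit MASM; the names msg and msgLen are invented for the example. INT 21h/AH=40h writes CX bytes from DS:DX to the handle in BX, so no '$' is involved.

```asm
.DATA
msg    DB "Counted, not terminated", 13, 10
msgLen EQU $ - msg          ; length computed by the assembler

.CODE
  ; fragment: assumes DS already points to the data segment
  MOV BX, 1                 ; handle 1 = standard output
  MOV CX, msgLen            ; number of bytes to write
  MOV DX, OFFSET msg        ; DS:DX -> buffer
  MOV AH, 40h               ; DOS: write to file or device
  INT 21h                   ; CF set on error, otherwise AX = bytes written
```

The `$ - msg` expression only works directly after the data it measures, since `$` is the current location counter.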
Title: Re: Why does the second string output produce garbage?
Post by: NoCforMe on July 09, 2024, 08:26:47 AM
Quote from: bugthis on July 09, 2024, 07:54:55 AM
Even in the C programming language, one regrets that a terminator character was used when defining C strings instead of simply defining the length of the C string, for example, in the first element.

Well, since you brought this up, yes, there was some controversy over the way strings should be structured in the dim, dark past, but that has pretty much been resolved in favor of terminated strings.

And I for one am glad that our overlords and masters decided on that. As an assembly-language programmer, can you imagine the nightmare of having to deal with the alternative? (In a higher-level language, it would be six of one or half a dozen of the other from the programmer's point of view.)

Instead of simply defining a string thus
SomeString  DB "ThIs ArE a StRiNg????", 0

you'd have to write a macro to define a string, which would insert the length element before the text. And you'd have to deal with that intrusive element every time you did anything with the string. So no regret here that we didn't go the other way.
Title: Re: Why does the second string output produce garbage?
Post by: bugthis on July 09, 2024, 09:40:25 AM
Quote from: NoCforMe on July 09, 2024, 08:03:11 AM
... apart from Pascal-type strings to which you refer (where the first element is the length of the string), all strings must "be like C strings" in that they must have a terminator. A NULL terminator in most cases, a '$' in the case of DOS, as in your code. Otherwise we all know what happens when you try to display a terminator-less string ...

Oh, sure, you can define a "string" internally in your code any way you like, but if you're going to use standard functions with it, Win32, Unix or DOS, it had better have a terminator!**
That applies if you use standard functions, but that isn't what defines a string. A string is just a sequence of usually printable characters, with a few exceptions like CR, LF etc. As the trick shows, you can define a string without a terminator and let the function use the terminator of the following string definition.

Quote:
Your bringing up your "trick" (which is trivial and which anyone who ever wrote DOS assembly-language programs, as I did back in the day, was familiar with) doesn't change that. At some point in the string, there needs to be a terminator.
You still don't understand that this applies only to the string you pass to a function that relies on the terminator. But the definition itself doesn't have that requirement, because you can append another string in memory that does have this terminator, thus your code will run fine.
Title: Re: Why does the second string output produce garbage?
Post by: NoCforMe on July 09, 2024, 10:00:58 AM
Quote from: bugthis on July 09, 2024, 09:40:25 AM
But the definition itself doesn't have that requirement, because you can append another string in memory that does have this terminator, thus your code will run fine.
But the fact remains that ultimately the string (whatever you pass to INT 21H/AH=9, f'rinstance) must have a terminator. How it (the terminator) gets there is irrelevant.
Title: Re: Why does the second string output produce garbage?
Post by: bugthis on July 09, 2024, 11:31:17 AM
Quote from: NoCforMe on July 09, 2024, 10:00:58 AM
... (whatever you pass to INT 21H/AH=9, f'rinstance) must have a terminator.
Nobody here disputes that.

Quote:
How it (the terminator) gets there is irrelevant.
Exactly. And that's my point.

Title: Re: Why does the second string output produce garbage?
Post by: NoCforMe on July 09, 2024, 11:35:28 AM
Quote from: bugthis on July 09, 2024, 11:31:17 AM
Quote:
How it (the terminator) gets there is irrelevant.
Exactly. And that's my point.
OK. We can move on now.

You never answered: did you fix the problems with your code?
Title: Re: Why does the second string output produce garbage?
Post by: bugthis on July 09, 2024, 11:44:43 AM
Quote from: NoCforMe on July 09, 2024, 11:35:28 AM
Quote from: bugthis on July 09, 2024, 11:31:17 AM
Quote:
How it (the terminator) gets there is irrelevant.
Exactly. And that's my point.
OK. We can move on now.
I'm glad we got this settled.
Quote:
You never answered: did you fix the problems with your code?

Well, I now know the cause of the bug. And ultimately I just need to evaluate the parameters in a structured way. But I haven't gotten around to that today and I probably won't get around to it any time soon either, as I want to do a lot of other work first.

My program from yesterday was just a small test anyway, because my curiosity to read the VBE info block was too great. So it was just a quick and dirty mini test.
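
For the structured evaluation, something along these lines might be a starting point. It is a sketch only: the field layout follows the VbeInfoBlock description in the VESA VBE 2.0 specification (34 bytes of fields, 222 reserved bytes and 256 bytes of OEM data, 512 in total), while the names VBEINFO_T, vbeBuf and vbeFailed are invented for this example.

```asm
; VBEINFO_T, vbeBuf and vbeFailed are hypothetical names for this sketch.
VBEINFO_T STRUC
  VbeSignature      DB "VBE2"        ; in: "VBE2", out: "VESA"
  VbeVersion        DW ?             ; BCD, e.g. 0200h for VBE 2.0
  OemStringPtr      DD ?             ; far pointer to the OEM string
  Capabilities      DB 4 DUP(?)
  VideoModePtr      DD ?             ; far pointer to the video mode list
  TotalMemory       DW ?             ; video memory in 64 KB blocks
  OemSoftwareRev    DW ?
  OemVendorNamePtr  DD ?
  OemProductNamePtr DD ?
  OemProductRevPtr  DD ?
  Reserved          DB 222 DUP(?)
  OemData           DB 256 DUP(?)
VBEINFO_T ENDS

.DATA
  vbeBuf VBEINFO_T <>                ; signature defaults to "VBE2"

.CODE
  ; fragment: assumes DS = ES = @DATA, as in the original program
  MOV DI, OFFSET vbeBuf              ; ES:DI -> info block
  MOV AX, 4F00h                      ; VBE function 00h
  INT 10h
  CMP AX, 004Fh                      ; 004Fh = supported and successful
  JNE vbeFailed
  MOV AX, vbeBuf.VbeVersion          ; read fields by name instead of by offset
  MOV BX, vbeBuf.TotalMemory
vbeFailed:
  ; error handling or program exit would go here
```

The pointer fields are segment:offset far pointers, so printing the strings they refer to would need DS loaded from the upper word first.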