Tiny Com Program problem

Started by JnZn558, November 14, 2020, 08:35:17 PM

JnZn558

I am using the normal (full) segment definitions. I know there are other methods, such as the simplified segment directives (.code), but let us keep it this way. My questions:


  • Is there also an old-style way to define the memory model, in the same way that "code segment" is the old-style form of ".code"?
  • Can I mix the styles, e.g. .code, "data segment" and .stack?
  • Below I have specified org 100h, but the linker still warns that the start address is not equal to 0x100. Why?


code segment
assume cs: code

org 100h

message db "Hello World!$"

_start:
mov dx, offset message
mov ah, 09h
int 21h
mov ah, 4ch
int 21h

code ends
end _start


compile: ml.exe /c /AT /Fo HelloWorld.obj HelloWorld.asm
link      : link16.exe /TINY HelloWorld.obj, HelloWorld.com, HelloWorld.map

LINK : warning L4055: start address not equal to 0x100 for /TINY

Vortex

#1
Hi JnZn558,

Welcome to the Masm Forum.

In my modest opinion, it would be better to keep things simple :

.model tiny

.data

string db 'Hello world!$'

.code

org 100h

start:

    mov     dx,OFFSET string

    mov     ah,9
    int     21h

    mov     ax,4C00h
    int     21h

END start


If you are using a 64-bit version of Windows, you can try this tool to run 16-bit applications :

MS-DOS Player for Win32-x64

http://takeda-toshiya.my.coocan.jp/msdos/index.html
Quote
This is MS-DOS emulator running on Win32-x64 command prompt.
16bit MS-DOS compatible commands can be executed on Win32-x64 environment.

Mikl__

#2
Hi JnZn558! There are several methods to print the string "Hello world!" on the screen from a COM file:
1) DOS function 09h ('$'-terminated string):
.286
.model tiny
.code
org 100h
start: mov    ah,9
    mov     dx,OFFSET string
    int     21h
    ret
string db 'Hello world!$'
END start
2) int 29h (fast console output), one character at a time:
    mov cx,sizeof string
    mov si,offset string
@@: lodsb
        int 29h
    loop @b
3) Writing directly to the text-mode video memory at segment B800h:
    push 0B800h
    pop es
    mov ax,3
    int 10h
    mov di,0
    mov si,offset string
    mov cx,sizeof string
    mov ah,0Ch
@@: lodsb
        stosw
    loop @b
4) DOS function 40h (write to handle 1, standard output):
mov ah,40h
mov bx,1
mov cx,sizeof string
mov dx,offset string
int 21h
5) BIOS int 10h, function 13h (write string):
    mov ah,13h
    xor dx,dx
    mov cx,sizeof string
    mov bp,offset string
    mov bx,0Ch
    int 10h
6) BIOS int 10h, function 0Eh (teletype output), zero-terminated string:
    mov ah,0Eh
   mov si,offset string
next: lodsb
   int 10h
   test al,al
   jnz next
   ret
string db 'Hello world!',0
7) BIOS int 10h, function 13h, mode 3 (character/attribute pairs):
.286
.model tiny
.code
org 100h

start: mov bp,offset ABC
    mov ax,1303h
    mov bx,7
    mov cx,15       ; ABC holds 15 character/attribute pairs
    xor DX,DX
    int 10h
    retn
ABC db 'H',0Ah,'e',0Bh,'l',0Dh,'l',0Ch
    db 'o',0Bh,',',0Ah,' ',0Ah,'W',09h
    db 'o',08h,'r',07h,'l',06h,'d',05h
    db '!',02h,'!',02h,'!',02h
end start
8) DOS function 02h (character output):
.286
.model tiny
.code
org 100h
start:  mov si,offset string
    mov cx,N
    mov ah,2
@@: lodsb
    mov dl,al
    int 21h
    loop @b
    retn
string db 'Hello, world!'
N = $ - string
end start
9) DOS function 06h (direct console output):
.286
.model tiny
.code
org 100h
start:  mov si,offset string
    mov cx,N
    mov ah,6
@@: lodsb
    mov dl,al
    int 21h
    loop @b
    retn
string db 'Hello, world!'
N = $ - string
end start
10) Far jump to 0000:00C0h (function number in CL):
.286
.model tiny
.code
org 100h
start: push offset RETURN   
        push cs                   
        pushf                     
        mov cl,9                 
        mov dx,offset MESSAGE   
        db 0EAh,0C0h,0,0,0     ;jmp far ptr 0:0C0h
RETURN: mov ah,4Ch       
        int  21h                   
MESSAGE db 'Hello, world!$'
end start
11) Calling the int 21h handler through its vector in the interrupt table:
.286
.model tiny
.code
org 100h
start:    push 0
    pop es     
    mov di,es:[21h*4]
    mov si,es:[21h*4+2]
    mov dx,offset string
    mov ah,9     
    pushf           
        push cs
        push offset @f 
    push si     ;cs for int 21h
    push di     ;ip for int 21h
    retf       
@@: mov ah,4Ch     
    int 21h
string db 'Hello, world!$'
end start

mineiro

Welcome, sir JnZn558.
In MS-DOS there are usually two types of executables: .com and .exe (.sys device drivers are a separate case; they may be raw binaries or use the .exe format).
DOS stores some information at segment:0 to segment:0FFh, and the code starts at segment:100h. If you look closely, you will even find a program-exit instruction (int 20h) at the very start of that area. Because DOS pushes a zero word on the stack before it starts a .com program, a simple "ret" jumps there and ends the program.
Org 100h is generally used for .com programs; if you ever write a boot sector, for example, you would use "org 7c00h".
There is a program called "exe2bin"; it can be found easily on several FTP servers. You build an .exe from the style of code you presented and then convert it to a .com.
If you are not dealing with multiple segments, that is, if all segment registers (ds, cs, es) hold the same value, then, because offsets are 16 bits, your whole program has to fit in one 64 KB segment (65536 bytes), minus the PSP (those 100h reserved bytes) that DOS needs so that your program can return control to it.
Try moving the data below your code or, as an alternative, insert a "jmp my_code" right below "org 100h". Otherwise your data gets executed as code and ... .
Generally, though not necessarily, "int 20h" is used as the exit in .com programs; in .exe programs we usually use "int 21h" (function 4Ch).
.com programs are in essence raw files, i.e. plain binary images. They have no header like .exe programs. DOS sets up a region in memory for them, but the file itself is just raw bytes.
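
The "jmp my_code" idea could look roughly like this with the full segment definitions from the first post. It is only a sketch: the label my_code comes from the suggestion above, and ds:code is added to the assume on the assumption that all segment registers are equal when a .com program starts.

```asm
code segment
        assume cs:code, ds:code

        org 100h

_start:
        jmp short my_code        ; entry point at 100h, skip over the data

message db "Hello World!$"       ; data now starts at 102h, never executed

my_code:
        mov dx, offset message   ; DS:DX -> '$'-terminated string
        mov ah, 09h              ; DOS: print string
        int 21h
        int 20h                  ; DOS: terminate a .com program

code ends
        end _start
```

Because _start is the first thing after org 100h, the start address the linker sees is 100h again, while the string can stay near the top of the source.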
I'd rather be this ambulant metamorphosis than to have that old opinion about everything

16bitPM

Quote from: JnZn558 on November 14, 2020, 08:35:17 PM
I am using the normal (full) segment definitions. I know there are other methods, such as the simplified segment directives (.code), but let us keep it this way. My questions:


  • Is there also an old-style way to define the memory model, in the same way that "code segment" is the old-style form of ".code"?
  • Can I mix the styles, e.g. .code, "data segment" and .stack?
  • Below I have specified org 100h, but the linker still warns that the start address is not equal to 0x100. Why?


  • Yes: .MODEL tiny
  • Yes. If you use .DATA you can add data just about anywhere in your source: the assembler will put it in the right segment (which, in the tiny model, is later combined with the _TEXT segment where your code is). You didn't do that in your code though, which means the assembler assumed you wanted your text in the code segment.
  • ORG 100h only makes the assembler assume that the starting address is 100h (instead of 0). It doesn't really move code around; that is the linker's and/or loader's job. If your code were just JMP start, then without ORG 100h you would see "jmp near 0" in your debugger; with ORG 100h it would be "jmp near 100h". However, the code can still be anywhere. You can use this, for example, with SEGMENT AT to define multiple fixed locations in a particular segment (for example system-specific ones).
  • DOS just jumps to <segment>:0100h. In your code the text string sits at that address and _start only comes 13 bytes later, at 10Dh, which is why LINK warns that the start address is not 0x100. Executing the string as code is of course nonsense.
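
To make the last point concrete, here is a sketch of the program from the first post with the string moved behind the code, so that the entry label sits at offset 100h. The offsets in the comments are worked out by hand from the instruction sizes, not taken from a listing file, and ds:code is an addition that assumes DS equals CS, as it does when a .com program starts.

```asm
code segment
        assume cs:code, ds:code

        org 100h

_start:                          ; 100h: first byte of the .com file
        mov dx, offset message   ; 100h (3 bytes)
        mov ah, 09h              ; 103h (2 bytes) DOS: print string
        int 21h                  ; 105h (2 bytes)
        mov ax, 4C00h            ; 107h (3 bytes) DOS: exit, return code 0
        int 21h                  ; 10Ah (2 bytes)

message db "Hello World!$"       ; 10Ch: data after the code

code ends
        end _start
```

It is built with the same ml.exe and link16.exe command lines as before; with _start at 100h the L4055 warning should no longer appear.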

JnZn558

Thanks for all of your help. I managed to compile and link the source code free of errors and warnings. It was tested successfully in DOSBox.
But the test under MS-DOS 6.22 in VMware failed. Running the .com files hangs with no output, and DOS does not react anymore.

Quote
.model tiny
code segment
   assume cs: code
   org 100h

_start:
   mov dx, offset message
   mov ah, 09h
   int 21h
   mov ax, 4c00h
   int 21h
   ret

   message db "Hello World!$"

code ends
   end _start

Quote
.model tiny
.code
   org 100h

_start:
   mov dx, offset message
   mov ah, 09h
   int 21h
   int 20h
   ret

   message db "Hello World!$"

   end _start

Both versions compile and link free of errors and warnings. But I noticed that the size of these .com files is only about 23-25 bytes. Shouldn't the size be 279 (281) bytes, because org 100h = 256 and 256 + 23 (25) = 279 (281)? Or am I wrong?

jj2007

The first example assembles at 26 bytes, the second at 23. Both run fine with MS-DOS Player.

This one assembles at 8 bytes. To see some output, you must supply a command-line argument.

.model tiny
.code
        org 100h
start:
        mov ah, 09h             ; write string to STDOUT
        mov dx, 82h             ; get command line
        int 21h                 ; show it... ;-)
        ret                     ; an extract of Ralf Brown's list of DOS interrupts is available here
end start

mineiro

When you type a program name at the MS-DOS command line, the OS loader does some preparation in RAM and then loads your file into memory.
DOS chooses a segment for your program, so that can be different each time you run or debug it, but the offset is always the same: 100h for .com programs.
If your file has 23 bytes, those 23 bytes are loaded into memory starting at address (segment:offset) ???:100h. The first byte of your code is at offset 100h and the last one at offset 0FFh+23 = 116h.
The memory area from seg:0 to seg:0FFh is called the PSP (Program Segment Prefix). Remember that MS-DOS is single-tasking, so only one program can be running at a time. There are some techniques to install a TSR (terminate and stay resident) program; they generally involve creating your own interrupt handler.

A good suggestion I can give you: create that simple program using debug. You will "feel" that you do not yet know the address of "offset message", because the message comes after your code. So you note the address of that instruction with pen and paper, put a junk value in the operand and continue coding. The rest of the program has fixed sizes, so after the "ret" instruction you know where the message string is. You then need to write that address back into "mov dx, ????". This way you start to understand what the assembler is doing, and you see that "org 100h" tells the assembler to do its address arithmetic relative to a bias (start point).

JnZn558

Thanks very much, it was all very helpful. I finally got it to work. It was a problem with the virtual floppy disk.

16bitPM

Quote from: JnZn558 on November 15, 2020, 04:19:40 AM
But I realized that the size of these com files is about 23-25 bytes. The size of the com files should be 279 (281) bytes, because of org 100h = 256 + 23 (25) = 279 (281), or am I wrong?

That is not correct. As I explained earlier, ORG does not move anything at all. It's just a directive that tells the assembler to make all offsets relative to 100h instead of 0h. DOS loads your code at offset 100h (0-0FFh is used for the PSP), so all offsets must start at 100h and not at zero; hence the ORG directive.
Internally the assembler just adds 100h to all pointers relative to the beginning of the segment and writes that to the OBJ file. The linker then fixes up all offsets and outputs a binary. Voilà.

To be perfectly clear:  this means you don't need ORG if you don't use any pointers at all. Consider:



   .286
   .MODEL tiny
   .CODE
start:
   push word ptr 0b800h   ; load segment of text mode screen buffer
   pop es
   ; xor ax,ax  ; not needed (already zero)
   mov cx,200h
   rep stosw   ; clear top part of the screen
   ret
   END start


This rather trivial piece of code doesn't need ORG, because it contains no reference to any location relative to the start of the segment.
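
Going back to the file-size question, the second program from the earlier post can be counted byte by byte. The encodings in the comments are the standard 8086 ones, written out by hand as an illustration rather than copied from a listing file:

```asm
.model tiny
.code
        org 100h

_start:                          ; offset  bytes
        mov dx, offset message   ; 0100h   BA 0A 01   (3 bytes)
        mov ah, 09h              ; 0103h   B4 09      (2 bytes)
        int 21h                  ; 0105h   CD 21      (2 bytes)
        int 20h                  ; 0107h   CD 20      (2 bytes)
        ret                      ; 0109h   C3         (1 byte)

message db "Hello World!$"       ; 010Ah   13 bytes of text

        end _start
```

That is 10 + 13 = 23 bytes in the file. ORG 100h only shows up in the operand of the first instruction (0Ah 01h, i.e. 010Ah); without it those three bytes would read BA 0A 00 and the file would still be 23 bytes long.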


mineiro

Yes, like that.
When you try Linux code, for example, in the near future, you will see patterns and anomalies. Let me explain.
Most (not all) 32-bit Windows/Linux programs start at a typical fixed address (401000h for a Windows .exe, for example). Oops! Nice, but the user can change that start (entry) point, for example to the end of the code (in an .exe file).
After that they evolved and created a thing called PIC (position-independent code). This means that your code may start at a different address every time. Hmm, how can you code that? "Address 0" is the solution: the code is built relative to 0 and the loader/OS remaps the addresses of your program.
These things change; keep an open mind, but get a feel for what is going on.

Going back to 16 bits (real mode).
When you turn on your computer, the BIOS code is executed, and if that firmware finds a "valid" boot sector it transfers execution to the code located at offset 7C00h in segment 0h. To decide whether the boot sector is valid, the BIOS looks for a "signature" (specific bytes at a specific location). If these bytes are found, your code (the OS) is executed.
So, happily, you start by using the int 21h that you have learned. You will find that it doesn't work. Why? Because it was MS-DOS that installed those interrupts for you. So you try a lower level.
You try to create your own OS that displays a hello world, nothing more. You don't have the MS-DOS interrupts, but you do have the BIOS interrupts. So you learn again what those offer and discover "int 10h" (video) and "int 16h" (keyboard).
You do your tests under MS-DOS but without using MS-DOS interrupts, and because a .com file is not stored as a structure but raw, like a bin file, you get the point.
After that you try again, but now without using interrupts at all. Is there a way to show a string on the screen and read a keypress without using a BIOS interrupt? You start reading about the machine architecture and its specific addresses. Video memory (your screen) lives at a specific address, so if you write something to that address, something is displayed on the screen. You search again for the address where typed keys are stored and find a circular buffer (a snake eating its own tail). Nice.
After that you ask again: is there an even lower level?
You learn that you can do "in" and "out" from/to the chips on the motherboard to reach the same results.
And the questions remain.
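
As a sketch of the "test under MS-DOS but without MS-DOS interrupts" step, a .com file can print through the BIOS teletype function (int 10h, AH=0Eh) and wait for a key with int 16h. The labels and the message text are made up for this example, and the final ret still relies on DOS to end the program:

```asm
.model tiny
.code
        org 100h

start:
        cld                      ; make lodsb walk forward through the string
        mov si, offset hello
next:
        lodsb                    ; AL = next character, SI = SI + 1
        test al, al
        jz  waitkey              ; a zero byte ends the string
        mov ah, 0Eh              ; BIOS video: teletype output
        mov bx, 0007h            ; BH = page 0, BL = colour (graphics modes only)
        int 10h
        jmp next

waitkey:
        xor ah, ah               ; BIOS keyboard: wait for a keypress
        int 16h
        ret                      ; back to PSP:0, where DOS placed int 20h

hello   db 'Hello, world!', 13, 10, 0

        end start
```

Apart from that last ret, nothing here depends on DOS, so the printing loop is the part that would carry over to code running without an operating system.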

mineiro

I forgot to tell you one thing.
You said that you're using a virtual machine, so your tests are being done with a virtual floppy disk.
A virtual floppy disk is just a "raw" file, mostly filled with zeros, with a size of 1440 KB for example (a formatted floppy disk).
Create an empty file of 1440 KB completely filled with zeros and try to boot from it. It will not work. Why? Hmm, signature not found. After that: hmm, no code to execute found.
The rest of the process is the same.
A hint: you can open that "file" (the virtual floppy disk device) in a hexadecimal editor.

Vortex

Starman's website is very good to study the boot records :

https://thestarman.pcministry.com/asm/mbr/DOS50FDB.htm