if you are creating a 16-bit EXE, you don't have to set the stack segment and stack pointer
the initial SS:SP values are stored in the EXE header
the initial SS will point to the stack segment
and SP will point just past the end of that segment, so the first push lands on the last word
if you have 512 bytes of stack space, SP will start at 0200h and the first push writes to 01FEh (512 - 2, in hex)
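for reference, here is a sketch of where the header keeps those values. MZHDR is a made-up name for illustration and the field names are borrowed from the IMAGE_DOS_HEADER naming in the Windows headers; MASM does not define any of this for you, and only the first part of the header is shown

```asm
; sketch only: start of the MZ EXE header, offsets in hex
; MZHDR and the e_* names are illustrative, not built into MASM
MZHDR       STRUC
e_magic     dw ?    ; 00h  signature 'MZ'
e_cblp      dw ?    ; 02h  bytes used in the last 512-byte page
e_cp        dw ?    ; 04h  number of 512-byte pages in the file
e_crlc      dw ?    ; 06h  number of relocation entries
e_cparhdr   dw ?    ; 08h  header size in paragraphs
e_minalloc  dw ?    ; 0Ah  minimum extra paragraphs needed
e_maxalloc  dw ?    ; 0Ch  maximum extra paragraphs wanted
e_ss        dw ?    ; 0Eh  initial SS, relative to the load segment
e_sp        dw ?    ; 10h  initial SP
e_csum      dw ?    ; 12h  checksum
e_ip        dw ?    ; 14h  initial IP
e_cs        dw ?    ; 16h  initial CS, relative to the load segment
e_lfarlc    dw ?    ; 18h  file offset of the relocation table
e_ovno      dw ?    ; 1Ah  overlay number
MZHDR       ENDS
```

the linker fills in e_ss and e_sp from the segment declared as the stack, and the loader adds the load segment to e_ss before it jumps to your entry point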
and, 8 words of stack space isn't enough :P
i wouldn't even think of using less than 512 bytes of stack space in 16-bit code
because DOS is a single-task, single-user operating system,
INTs and whatever else goes on in the background use the same stack
finally, to simplify your code, use the shortcuts (the simplified segment directives: .MODEL, .STACK, .DATA, .CODE)
here's a template for small model 16-bit EXE's
.MODEL Small
.STACK 4096
.DOSSEG
.386
OPTION CaseMap:None
;####################################################################################
.DATA
s$Msg db 'Hello World !',0Dh,0Ah,24h
;************************************************************************************
.DATA?
;####################################################################################
.CODE
;************************************************************************************
_main PROC FAR
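        mov     dx,@data        ; the loader leaves DS pointing at the PSP,
        mov     ds,dx           ; so load the data segment yourself
        mov     dx,offset s$Msg ; DS:DX -> '$'-terminated string
        mov     ah,9            ; DOS function 09h: print string
        int     21h
        mov     ax,4C00h        ; DOS function 4Ch: terminate, return code 0
        int     21h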
_main ENDP
;####################################################################################
END _main
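
for comparison, here is a rough sketch of the same program written with full segment definitions, which is the typing the shortcuts save you. it is not an exact expansion of the template: the segment name STK is made up for illustration, .386 and OPTION are left out so the segments default to 16-bit, and there is no DGROUP, so the data segment is loaded by name instead of through @data

```asm
; sketch only: hand-written segments instead of .MODEL/.STACK/.DATA/.CODE
; STK is an illustrative name; the STACK combine type is what tells
; the linker to put this segment's SS:SP into the EXE header
STK     SEGMENT PARA STACK 'STACK'
        db      4096 dup (?)            ; 4096 bytes of stack space
STK     ENDS

_DATA   SEGMENT WORD PUBLIC 'DATA'
s$Msg   db      'Hello World !',0Dh,0Ah,24h
_DATA   ENDS

_TEXT   SEGMENT WORD PUBLIC 'CODE'
        ASSUME  cs:_TEXT, ds:_DATA, ss:STK
_main   PROC FAR
        mov     dx,_DATA                ; DS still has to be set by hand
        mov     ds,dx
        mov     dx,offset s$Msg         ; DS:DX -> '$'-terminated string
        mov     ah,9                    ; DOS function 09h: print string
        int     21h
        mov     ax,4C00h                ; DOS function 4Ch: terminate, return code 0
        int     21h
_main   ENDP
_TEXT   ENDS
        END     _main
```

note that SS and SP are still never touched in the code; the stack segment declaration alone is enough for the linker and loader to set them up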