Hi folks. Once in a while, I work up the courage to test 32-bit MASM code. One attempt was to use ALIGN 32 (for the YMM registers) in my code. It turns out there's no easy way to do that in 32-bit MASM with my normal .code/.data setup. So after an hour of digging, I found the DSEG/CSEG approach on the internet. I heard of it for the first time today. LOL. I tried it out and the assembler didn't complain about alignment at all.
.xmm
.model flat,c
extrn printf:proc
option casemap:none
DSEG SEGMENT PARA PUBLIC 'DATA'
x dq 56.334,10.436,11.982,71.324
z dq 89.123,90.123,11.128,55.567
outstring db '%f',10,0
val1 dq 0.0
DSEG ENDS
CSEG SEGMENT PUBLIC 'CODE'
WinMain@16 proc
push ebp
mov ebp,esp
vmovdqa ymm0,ymmword ptr[x]
vmovdqa ymm1,ymmword ptr[z]
vshufpd ymm2,ymm1,ymm0,0
movq val1,xmm2
push dword ptr val1+4
push dword ptr val1
push offset outstring
call printf
add esp,12
leave
ret
WinMain@16 endp
CSEG ENDS
END
It works, but I still don't quite understand how it works, and I don't know what questions to ask either. Something must be wrong somewhere because I got it right the first time. LOL LOL. Could you give me as much information as possible on this type of code configuration? I'd really appreciate your help.
Doesn't look very aligned ::)
include \masm32\include\masm32rt.inc
DSEG SEGMENT PARA PUBLIC 'DATA'
var1 dd 111h
DSEG ENDS
DSEG SEGMENT PARA PUBLIC 'DATA'
var2 dd 111h
DSEG ENDS
DSEG SEGMENT PARA PUBLIC 'DATA'
var3 dd 111h
DSEG ENDS
.code
start:
mov eax, offset var1
print str$(eax), 13, 10
mov ecx, offset var2
print str$(ecx), 13, 10
mov edx, offset var3
print str$(edx), 13, 10
exit
end start
Output:
4210688
4210692
4210696
Hi JJ. But you see, the code works though. I am on Win7 32-bit (64-bit PC). No complaint of a segfault at all. Curious though.
I can't get it to build in 32 bit MASM.
Quote from: nidud on April 29, 2017, 02:18:49 AM
The OS will load your application on a 16-byte aligned address. This may from time to time be 32 or higher but this will be random so you have to apply alignment-code to be sure it's properly aligned to any granularity above 16.
Hi Nidud. Does the word 'para' have anything to do with it? Is that a page boundary?
I applied "USE32" to DSEG moments ago, and now it complains about something else. Confusing.
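On the 'para' question, a hedged note: going by the MASM documentation for the SEGMENT directive, PARA means a 16-byte paragraph rather than a page, PAGE means 256 bytes, and ALIGN(n) takes an explicit power of two (COFF output only). The sketch below is an illustration only; the segment and variable names (SEG16, SEG32, SEG256, a16, pad, a32, a256) are made up and do not come from the posts above.

```asm
; Sketch only: segment alignment keywords side by side.
; All names here are hypothetical, chosen for illustration.
.686p
.xmm
.model flat,c
option casemap:none

SEG16  SEGMENT PARA PUBLIC 'DATA'        ; PARA = 16-byte "paragraph"
a16    dq 1.0, 2.0
SEG16  ENDS

SEG32  SEGMENT ALIGN(32) PUBLIC 'DATA'   ; explicit 32-byte segment alignment
pad    db 1                              ; deliberately knock the offset off 32
       ALIGN 32                          ; allowed: does not exceed the segment's alignment
a32    dq 1.0, 2.0, 3.0, 4.0             ; starts on a 32-byte boundary
SEG32  ENDS

SEG256 SEGMENT PAGE PUBLIC 'DATA'        ; PAGE = 256 bytes in MASM terms
a256   dq 1.0
SEG256 ENDS
end
```

MASM rejects an ALIGN value larger than the alignment of the enclosing segment, which would explain why ALIGN 32 fails in a default .data segment but should be accepted inside a segment declared with ALIGN(32) or PAGE.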
Quote from: hutch-- on April 29, 2017, 02:21:18 AM
I can't get it to build in 32 bit MASM.
What's the complaint, Hutch? Misalignment or something else? It works on mine though.
Oh, I link with "gcc -m32 myprog.obj -o myprog.exe" just to be clear.
In both cases, I can't use ALIGN 32, except by manual padding after ALIGN 16.
I have got this much of it to build. Re: data alignment for the YMM code: allocate dynamic memory, align it, then write your data to it.
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
.686p ; maximum processor model
.XMM ; SIMD extensions
.model flat, stdcall ; memory model & calling convention
option casemap :none ; case sensitive
; ---------------------------------------------------------
; Write the prototype for the procedure name below ensuring
; that the parameter count & size match and put it in your
; library include file.
; ---------------------------------------------------------
.data ; initialised data section
x dq 56.334,10.436,11.982,71.324
z dq 89.123,90.123,11.128,55.567
outstring db '%f',10,0
val1 dq 0.0
.code ; code section
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
;;; extrn printf:proc
.code
WinMain@16 proc
push ebp
mov ebp,esp
vmovdqa ymm0,ymmword ptr[x]
vmovdqa ymm1,ymmword ptr[z]
vshufpd ymm2,ymm1,ymm0,0
movq val1,xmm2
; push dword ptr val1+4
; push dword ptr val1
; push offset outstring
; call printf
add esp,12
leave
ret
WinMain@16 endp
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end
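The "allocate, align, then write your data" advice can be sketched roughly as follows. This is a minimal illustration, not hutch's code: it uses an over-sized static buffer instead of a real allocator to stay self-contained, and the names rawbuf, pAligned and align_demo are hypothetical. The same add/and rounding would apply to a pointer returned by any allocator, provided the allocation is at least 31 bytes larger than the data.

```asm
; Sketch only: round an address up to the next 32-byte boundary.
; rawbuf, pAligned and align_demo are hypothetical names.
.686p
.xmm
.model flat,c
option casemap:none

.data?
rawbuf   db (64+31) dup (?)      ; 64 usable bytes plus worst-case padding
pAligned dd ?                    ; receives the aligned address

.code
align_demo proc
    mov    eax, offset rawbuf
    add    eax, 31               ; move past the boundary unless already on one
    and    eax, -32              ; clear the low 5 bits: multiple of 32
    mov    pAligned, eax
    vxorpd ymm0, ymm0, ymm0
    vmovapd ymmword ptr [eax], ymm0   ; aligned store, would fault on a misaligned EAX
    vzeroupper
    ret
align_demo endp
end
```

Keep the original (unaligned) pointer around if the memory came from an allocator, since that is the value that has to be passed back when freeing it.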
Quote from: coder on April 29, 2017, 02:19:00 AM
Hi JJ. But you see, the code works though. I am on Win7 32-bit (64-bit PC). No complaint of a segfault at all. Curious though.
My first assumption there was that every time you create a new segment it starts at a 512-byte boundary. Apparently the linkers are clever enough to prevent this, unless there is another trick. There is another thread on align macros but I can't find it right now. It's messy, be careful.
Not sure JJ. I think Windows has a fixed ImageBase policy. With that in mind, I can safely assume that every new section or segment is allocated page-aligned memory. So whether it's a .data or a DSEG, alignment doesn't really become a concern at offset 0 (the first data item). 512 bytes is for a flat layout (no section or segment), IMHO.
I made the code shorter for easy reference.
;ml /c /coff myprog.asm
;gcc -m32 myprog.obj -o myprog.exe
.686p
.xmm
.model flat,c
option casemap:none
printf PROTO C:vararg
;DSEG SEGMENT PAGE PUBLIC 'DATA'
.data
x dq 56.33,10.36,11.98,71.24
z dq 89.23,90.12,11.28,55.57
v1 dq 0.0
outstring db '%f',10,0
confirm db 'x = %p, WinMain@16 = %p',10,0
;DSEG ENDS
;CSEG SEGMENT PUBLIC 'CODE'
.code
WinMain@16 proc
enter 0,0
invoke printf,addr confirm,addr x, addr WinMain@16
vmovdqa ymm0,ymmword ptr x
vmovdqa ymm1,ymmword ptr z
vshufpd ymm2,ymm1,ymm0,0
movq v1,xmm2
invoke printf,addr outstring,v1
leave
ret
WinMain@16 endp
;CSEG ENDS
END
The addresses with CSEG and DSEG are page-aligned, but not with .code and .data.
Output with CSEG/DSEG pair
x = 00405000, WinMain@16 = 00403000
89.230000
Output with .code/.data pair
x = 00403010, WinMain@16 = 004026E0
;missing output due to misalignment
Quote from: coder on April 29, 2017, 03:42:17 AM
Not sure JJ. I think Windows has a fixed ImageBase policy. With that in mind, I can safely assume that every new section or segment is allocated page-aligned memory.
Yes, that's right. Except that the bloody linker doesn't give you a new segment, as demonstrated in reply #1. Instead, it merges several segments into one. That's why the addresses of the variables increase only by a DWORD.
Interesting that DSEG/CSEG allows you to set the alignment manually (although it's aligned to a page anyway). Below is a modified version that uses XMM instead, for those who don't have AVX. I added SSEG. All sections are page-aligned (and therefore also aligned to 32). I think it's more convenient this way for any alignment larger than 16.
;ml /c /coff myprog.asm
;gcc -m32 myprog.obj -o myprog.exe
.686p
.xmm
.model flat,c
option casemap:none
printf PROTO C:vararg
DSEG SEGMENT ALIGN(32) PUBLIC 'DATA'
x dq 56.33,10.36,11.98,71.24
z dq 89.23,90.12,11.28,55.57
v1 dq 0.0
outstring db '%f',10,0
confirm db 'x = %p, WinMain@16 = %p, t = %p',10,0
DSEG ENDS
SSEG SEGMENT STACK 'STACK'
t dq 0
SSEG ENDS
CSEG SEGMENT PUBLIC 'CODE'
WinMain@16 proc
enter 0,0
invoke printf,addr confirm,addr x, addr WinMain@16, addr t
movdqa xmm0,xmmword ptr x
movdqa xmm1,xmmword ptr z
shufpd xmm1,xmm0,0
movq v1,xmm1
invoke printf,addr outstring,v1
leave
ret
WinMain@16 endp
CSEG ENDS
END
The output for all three markers
x = 00405000, WinMain@16 = 00403000, t = 00406000
89.230000
I'm just not sure if SSEG really means anything to ESP though.
The crude technique for testing if a variable (ESP-based) is aligned correctly for an SSE or AVX instruction is to make a test piece and use an instruction that must be aligned to a minimum boundary. You will find out soon enough if it's correct or not. :P
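A gentler alternative to waiting for the fault is to test the low bits of the address directly. The routine below is only a sketch under the thread's own ".model flat,c" setup; is_aligned32 is a hypothetical name, and the argument is read straight off the stack in cdecl style.

```asm
; Sketch only: returns EAX = 1 if the pointer argument is 32-byte aligned, else 0.
; is_aligned32 is a hypothetical helper, not part of MASM32 or the thread.
.686p
.model flat,c
option casemap:none

.code
is_aligned32 proc
    mov   eax, [esp+4]       ; pointer argument (cdecl, no stack frame set up)
    test  eax, 31            ; any of the low 5 bits set?
    setz  al                 ; AL = 1 when all five are clear
    movzx eax, al            ; widen the flag to a full 0/1 result
    ret
is_aligned32 endp
end
```

For a 16-byte check the mask would be 15 instead of 31. Inside a procedure, the same test can be applied to ESP itself ("mov eax, esp" followed by "test eax, 31").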
Yeah, hutch. I still can't figure out whether SSEG has anything to do with ESP. Using ASSUME SS doesn't seem to help either. ESP is still only 4-byte aligned.
This should do the job.
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
include \masm32\include\masm32rt.inc
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
comment * -----------------------------------------------------
Build this template with
"CONSOLE ASSEMBLE AND LINK"
----------------------------------------------------- *
.data?
__esp dd ?
.code
start:
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
call main
inkey
exit
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
main proc
call aligned
ret
main endp
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
aligned proc
mov __esp, esp ; save ESP
memalign esp, 1024
sub esp, 1024
; -------------------------------
; see what your alignment is here
; -------------------------------
mov esp, __esp ; restore ESP
ret
aligned endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
end start
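For anyone without the MASM32 macros, the same idea can be sketched with plain instructions: save ESP, round it down to the wanted boundary, reserve space, and restore it before returning. This is an assumption-laden illustration rather than a drop-in replacement for the template above; stack_demo is a hypothetical name, and EBP is used here as the saved copy of ESP instead of a global variable.

```asm
; Sketch only: carve a 32-byte aligned scratch area out of the stack.
; stack_demo is a hypothetical name.
.686p
.xmm
.model flat,c
option casemap:none

.code
stack_demo proc
    push   ebp
    mov    ebp, esp              ; EBP keeps the unaligned ESP for later
    and    esp, -32              ; round ESP down to a 32-byte boundary
    sub    esp, 32               ; reserve one aligned YMM-sized slot
    vxorpd ymm0, ymm0, ymm0
    vmovapd ymmword ptr [esp], ymm0   ; aligned store, ESP is a multiple of 32 here
    vzeroupper
    mov    esp, ebp              ; drop the aligned area
    pop    ebp
    ret
stack_demo endp
end
```

Rounding down with AND is the safe direction on the stack, because it only ever moves ESP into unused space. Any pushes made after the alignment step shift ESP by 4 again, so aligned slots are best addressed relative to the value ESP had right after the SUB.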