Hi all,
I've heard about code alignment. How do I do it? How do I align code and data? What happens when, for example, I use the directive ALIGN 4? I've heard that the CPU's cache line size (in bytes) is an important parameter here.
thanks in advance.
Hi flipflop,
Quote from: flipflop on February 21, 2013, 05:27:52 AM
I've heard about code alignment. How do I do it? How do I align code and data? What happens when, for example, I use the directive ALIGN 4? I've heard that the CPU's cache line size (in bytes) is an important parameter here.
The code (or data) that follows the directive will start at an address that is evenly divisible by 4.
Gunther
So that means, for example, that a DWORD variable "VAR1" which would be placed at address 403003H without alignment will be placed at address 403008H when I use ALIGN 4. Right?
for data, it is often important that data is aligned to the word-size of the machine
(even-aligned for 16-bit code, 4-aligned for 32-bit code, 8-aligned for 64-bit code)
this helps speed up accesses to the data in code
in some cases, it is required that the data be aligned
this is true for many SSE instructions
as for aligning code, it seems to help CALLs if the target is 16-aligned
for NEAR loops, 16-aligning the top of the loop also seems to help
for SHORT loops, alignment doesn't seem to be critical
this is done using the ALIGN directive
ALIGN 4
in the data sections, the assembler may insert zero bytes as padding in order to achieve alignment
in the code section, the assembler may use NOPs or JMPs to achieve alignment
I didn't immediately understand how this should be done either, so, in more detail:
1 - segment alignment set by the dot segment directives is always appropriate for the system
2 - when code and data are in one module, an ALIGN directive is needed before PROCs and some data to prevent misalignment
3 - structure fields are aligned manually with reserved db fields
4 - stack alignment must not be broken by pushing word variables (always push them in pairs, or use dword variables instead)
5 - a label that is the target of far jmps needs NOPs before it for proper alignment
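As a rough illustration of point 3, here is the same idea sketched in C rather than MASM. The structure name `record` and its fields are made up for this sketch and do not come from the thread; the `reserved` array plays the role of manually inserted db bytes so that the 32-bit field lands on a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure, for illustration only. */
struct record {
    uint8_t  flag;         /* offset 0 */
    uint8_t  reserved[3];  /* manual padding, comparable to "db 3 dup(?)" */
    uint32_t count;        /* lands at offset 4 because of the padding above */
};

/* Compile-time check that the padded field is 4-aligned within the structure. */
static_assert(offsetof(struct record, count) % 4 == 0,
              "count must start on a 4-byte boundary");
```

A C compiler would normally insert this padding by itself; spelling it out only makes the layout explicit, which is what the reserved db fields do in assembly.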
Quote from: flipflop on February 21, 2013, 06:51:00 AM
So that means, for example, that a DWORD variable "VAR1" which would be placed at address 403003H without alignment will be placed at address 403008H when I use ALIGN 4. Right?
not quite
403004h is the next 4-aligned address
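The arithmetic behind that answer can be sketched in C. The helper names `align_up` and `pad_bytes` are invented for this sketch; it assumes the alignment is a power of two, as it is for the values discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of a; a must be a power of two. */
uintptr_t align_up(uintptr_t addr, uintptr_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (addr + (a - 1)) & ~(a - 1);
}

/* Number of pad bytes needed in front of addr to reach that boundary. */
uintptr_t pad_bytes(uintptr_t addr, uintptr_t a)
{
    return align_up(addr, a) - addr;
}
```

With these definitions, align_up(0x403003, 4) gives 0x403004 and pad_bytes(0x403003, 4) gives 1, so a single pad byte is enough. An address that is already aligned is returned unchanged.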
Adamanteus, I'm not so good at assembly. The things you're writing about are not familiar to me. I wanted to learn the basics of alignment. Could you explain each of them a bit?
i think those are questions :biggrin:
a little hard to understand exactly what he is asking, though
Hi flipflop,
as Dave explained, you can align code and you can align your data. In some cases, for example the hot spot of a loop, an alignment of 16 is advisable for speed. Another example is SSE instructions. To use, for example:
movaps xmm0, [value]
the variable value must be aligned by 16, otherwise the CPU generates an exception. You can avoid that by using:
movups xmm0, [value]
but that can be slower, especially on older CPUs.
Gunther
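For comparison, here is a minimal C sketch of the 16-byte requirement described above. The array name `value` mirrors the operand in the post; the helper `is_aligned` is hypothetical, and `alignas(16)` stands in for an `align 16` placed before the data label.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

/* 16-byte aligned storage, comparable to "align 16" before a data label. */
alignas(16) float value[4] = {1.0f, 2.0f, 3.0f, 4.0f};

/* True if p is a multiple of a; a must be a power of two. */
bool is_aligned(const void *p, uintptr_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return ((uintptr_t)p & (a - 1)) == 0;
}
```

Here is_aligned(value, 16) holds, so an aligned load from `value` would be safe. The address 4 bytes further in is only 4-aligned, which is the kind of operand that needs the unaligned form of the instruction.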
Quote from: flipflop on February 21, 2013, 07:10:57 AM
Adamanteus, I'm not so good at assembly. The things you're writing about are not familiar to me. I wanted to learn the basics of alignment. Could you explain each of them a bit?
That's the answer: follow the good examples given here. I could add that the easiest way to align everything properly is to put each procedure and each variable in a separate file, so the dot segment directives will do everything for you.
flipflop:
This code shows the effect of the alignments.
;==============================================================================
include \masm32\include\masm32rt.inc
;==============================================================================
;----------------------------------------
; Returns the maximum alignment of _ptr.
;----------------------------------------
alignment MACRO _ptr
push ecx
xor eax, eax
mov ecx, _ptr
bsf ecx, ecx
jz @F
mov eax, 1
shl eax, cl
@@:
pop ecx
EXITM <eax>
ENDM
;==============================================================================
.data
D0 dd 0
db 0
D1 db 0
align 16
D2 db 0
db 0
D3 db 0
align 8
D4 db 0
db 0
D5 db 0
align 4
D6 db 0
db 0
.code
;==============================================================================
start:
;==============================================================================
;------------------------------------------------------------------------
; The OFFSET operator specifies the offset address of a memory location.
; To get the offset address of a data label you must use the OFFSET
; operator, but to get the offset address of a code label you can omit
; the operator.
;
; The align directive aligns the next variable or instruction on a byte
; address that is a multiple of the specified number. This ensures that
; the minimum alignment will be as specified, but note that the actual
; alignment can be greater than specified. At least for ML 6.15 the
; number must be 1, 2, 4, 8, or 16.
;------------------------------------------------------------------------
printf("start\t%Xh\t%d\n", start, alignment(start))
L1:
align 16
L2:
nop ; 1-byte
L3:
align 8
L4:
nop
L5:
align 4
L6:
printf("L1\t%Xh\t%d\n", L1, alignment(L1))
printf("*L2\t%Xh\t%d\n", L2, alignment(L2))
printf("L3\t%Xh\t%d\n", L3, alignment(L3))
printf("*L4\t%Xh\t%d\n", L4, alignment(L4))
printf("L5\t%Xh\t%d\n", L5, alignment(L5))
printf("*L6\t%Xh\t%d\n\n", L6, alignment(L6))
printf("D0\t%Xh\t%d\n", OFFSET D0, alignment(OFFSET D0))
printf("D1\t%Xh\t%d\n", OFFSET D1, alignment(OFFSET D1))
printf("*D2\t%Xh\t%d\n", OFFSET D2, alignment(OFFSET D2))
printf("D3\t%Xh\t%d\n", OFFSET D3, alignment(OFFSET D3))
printf("*D4\t%Xh\t%d\n", OFFSET D4, alignment(OFFSET D4))
printf("D5\t%Xh\t%d\n", OFFSET D5, alignment(OFFSET D5))
printf("*D6\t%Xh\t%d\n\n", OFFSET D6, alignment(OFFSET D6))
inkey "Press any key to exit..."
exit
;==============================================================================
end start
start   401000h   4096
L1      401029h   1
*L2     401030h   16
L3      401031h   1
*L4     401038h   8
L5      401039h   1
*L6     40103Ch   4

D0      403000h   4096
D1      403005h   1
*D2     403010h   16
D3      403012h   2
*D4     403018h   8
D5      40301Ah   2
*D6     40301Ch   4
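The last column of that output is the largest power of two that divides the address, which is what the alignment macro computes with BSF and SHL. A C sketch of the same calculation follows; the names `max_alignment` and `check_listed_addresses` are made up here, and isolating the lowest set bit is an equivalent technique swapped in for the macro's bit scan, not a translation of it.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of two that divides addr; returns 0 for addr == 0,
   which matches what the macro returns for a zero pointer. */
uint32_t max_alignment(uint32_t addr)
{
    return addr & (0u - addr);  /* isolates the lowest set bit */
}

/* Re-checks a few of the address/alignment pairs from the output above. */
void check_listed_addresses(void)
{
    assert(max_alignment(0x401000u) == 4096u);  /* start */
    assert(max_alignment(0x401029u) == 1u);     /* L1 */
    assert(max_alignment(0x401030u) == 16u);    /* L2 */
    assert(max_alignment(0x403012u) == 2u);     /* D3 */
    assert(max_alignment(0x40301Cu) == 4u);     /* D6 */
}
```

This also shows why a label can report a larger alignment than the one requested: the directive guarantees only a minimum.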
As Dave stated, to achieve the alignment the assembler pads the data or code, using zero bytes to pad the data and various forms of NOP to pad the code. In the example code I used the actual 1-byte NOP instruction to disturb the alignment, but depending on the size of the required pad the assembler may use various combinations of selected instructions. Since the padding instructions may become part of the instruction stream and may be executed, the goal of the selection is to pick instructions that have no adverse effects on the code they are placed in. That goal is not necessarily met: the 5-byte sequence in the list below is add eax,0, which leaves EAX unchanged but modifies the flags. This is a listing of the NOP sequences used by ML 6.14:
; -------------------------------------------------------------
; No-op sequences inserted for align, MASM 6.14, 1 to 15 bytes:
; -------------------------------------------------------------
00401001 90 nop
00401006 8BFF mov edi,edi
00401009 8D4900 lea ecx,[ecx]
00401014 8D642400 lea esp,[esp]
0040101B 0500000000 add eax,0
00401022 8D9B00000000 lea ebx,[ebx]
00401029 8DA42400000000 lea esp,[esp]
00401038 8DA42400000000 lea esp,[esp]
0040103F 90 nop
00401047 8DA42400000000 lea esp,[esp]
0040104E 8BFF mov edi,edi
00401056 8DA42400000000 lea esp,[esp]
0040105D 8D4900 lea ecx,[ecx]
00401065 8DA42400000000 lea esp,[esp]
0040106C 8D642400 lea esp,[esp]
00401074 8DA42400000000 lea esp,[esp]
0040107B 0500000000 add eax,0
00401083 8DA42400000000 lea esp,[esp]
0040108A 8D9B00000000 lea ebx,[ebx]
00401092 8DA42400000000 lea esp,[esp]
00401099 8DA42400000000 lea esp,[esp]
004010A1 8DA42400000000 lea esp,[esp]
004010A8 8DA42400000000 lea esp,[esp]
004010AF 90 nop
I recall that the GNU assembler at some point used different sequences, and the 15-byte sequence consisted of some number of NOPs preceded by a jump instruction that effectively jumped over the NOPs instead of executing them.