Hi guys.
I have the following routine, which replaces each of the first 4 bytes at a pointer with the array value indexed by that byte. I would like to know whether it can be made more efficient in execution cycles, not in code size.
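movzbq (%rsp), %rax       # rax = byte 0, zero-extended
movzbq 1(%rsp), %rbx      # rbx = byte 1
movzbq 2(%rsp), %rdi      # rdi = byte 2
movzbq 3(%rsp), %rcx      # rcx = byte 3
movb arr(%rax), %al       # al = arr[byte 0]
movb arr(%rbx), %ah       # ah = arr[byte 1]
movb arr(%rdi), %bl       # bl = arr[byte 2]
movb arr(%rcx), %bh       # bh = arr[byte 3]
movw %ax, (%rsp)          # store new bytes 0 and 1
movw %bx, 2(%rsp)         # store new bytes 2 and 3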
arr: .quad 0x0807060504030201
The array can be larger, up to the 256 elements that a byte can index. This is the best approach that has occurred to me. Any ideas on how it could be made more efficient? Of course, you can't look up all 4 bytes at once this way, since the effective address would be wrong.
Regards.
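For readers less used to assembly, the lookup described above can be modelled in C. This is only a reference sketch for checking results, not the poster's code: the function name substitute_bytes is hypothetical, and it assumes a full 256-entry table so that every byte value is a valid index.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: replace each of the first n bytes at p
 * with the table entry that the byte indexes.
 * Assumes table has 256 entries, so any byte value is in bounds. */
static void substitute_bytes(uint8_t *p, size_t n, const uint8_t table[256])
{
    for (size_t i = 0; i < n; i++) {
        /* Corresponds to: zero-extend the byte, then load arr[byte]. */
        p[i] = table[p[i]];
    }
}
```

Calling substitute_bytes(buf, 4, table) does the same work as the ten instructions in the post: four indexed loads followed by writing the four results back.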
It looks like AT&T notation, and I am not sure of the array member size, as different-sized data is being copied into the array members.
Rough guess: it's something like this in Intel notation.
mov QWORD PTR [rsp], rax
mov QWORD PTR [rsp+8], rbx
mov QWORD PTR [rsp+16], rdi
mov QWORD PTR [rsp+24], rcx
mov BYTE PTR [rsp+32], al
mov BYTE PTR [rsp+40], ah
mov BYTE PTR [rsp+48], bl
mov BYTE PTR [rsp+56], bh
Yes, it is AT&T syntax; it was a mistake on my part to post the code like that. Here it is in Intel syntax:
code: file format elf64-x86-64-freebsd
Disassembly of section .text:
0000000000201158 <_start>:
201158: 48 0f b6 04 24 movzx rax,BYTE PTR [rsp]
20115d: 48 0f b6 5c 24 01 movzx rbx,BYTE PTR [rsp+0x1]
201163: 48 0f b6 7c 24 02 movzx rdi,BYTE PTR [rsp+0x2]
201169: 48 0f b6 4c 24 03 movzx rcx,BYTE PTR [rsp+0x3]
20116f: 8a 80 8f 11 20 00 mov al,BYTE PTR [rax+0x20118f]
201175: 8a a3 8f 11 20 00 mov ah,BYTE PTR [rbx+0x20118f]
20117b: 8a 9f 8f 11 20 00 mov bl,BYTE PTR [rdi+0x20118f]
201181: 8a b9 8f 11 20 00 mov bh,BYTE PTR [rcx+0x20118f]
201187: 66 89 04 24 mov WORD PTR [rsp],ax
20118b: 66 89 1c 24 mov WORD PTR [rsp+0x2],bx
000000000020118f <arr>:
...
The array (arr) can be of any length. As the Intel listing shows more clearly, I add each byte to the address of arr to form an effective address, load the array value found there into an 8-bit register, and then write those values back through the pointer.
I guess there is no way to make this process any more efficient. Does anyone have any ideas?
Thanks.
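One variation that might be worth measuring is to read the 4 bytes with a single 32-bit load, pull the indices out with shifts, and write the result back with a single 32-bit store. The table lookups themselves still have to happen one byte at a time. The sketch below is in C for clarity; the name substitute4_word is hypothetical, a 256-entry table is assumed, and nothing here shows that it is faster on any given CPU.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical variant: one 4-byte load, four table lookups, one 4-byte store.
 * Assumes table has 256 entries. Each byte is looked up and put back in the
 * same position within the word, so the result does not depend on endianness. */
static void substitute4_word(uint8_t *p, const uint8_t table[256])
{
    uint32_t w;
    memcpy(&w, p, sizeof w);                      /* single 4-byte load */

    uint32_t out = (uint32_t)table[w & 0xFF]
                 | (uint32_t)table[(w >> 8) & 0xFF] << 8
                 | (uint32_t)table[(w >> 16) & 0xFF] << 16
                 | (uint32_t)table[w >> 24] << 24;

    memcpy(p, &out, sizeof out);                  /* single 4-byte store */
}
```

This trades three of the byte loads and one of the stores for shift, mask and OR operations, so whether it saves cycles would have to be checked by timing both versions.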
I have thought about it, and I see that there is no other way to do it, so my question was unnecessary.
That's all.