[Solved] Help with SSE instructions (hangs on pshufb)

Started by mulu64, March 20, 2025, 01:32:05 AM

mulu64

Hello,

I'm trying to display the value of a register (here: RAX) in hexadecimal form.
After doing it with classical x86 code, I decided to do it with SSE.
The problem is that it's not so simple. I asked ChatGPT whether this already existed somewhere on the internet, and it answered with the code below (the data is mine, from my previous attempt; it only added `shuffle_mask`).

This code hangs on Linux (tested on several more or less old CPUs. The program either dumps core or gets some kind of bus error; CPU too old, I guess).

It doesn't go beyond:
  pshufb xmm0,xmmword ptr [shuffle_mask]  # segfault

I don't understand how an instruction could kill the program.
I was thinking of the memory access to shuffle_mask, but how could this be wrong?
pshufb just gets values from the mask, it does not dereference them again.

Anyway, I think the code given below is wrong.
I understand that pshufb will shuffle the nibble values into whole bytes, but the « and » done just after seems very suspect.

Any ideas?

Thank you.


   .data
number: .dc.b '0','x'
        .ds.b 16,'0' # 8 bytes, so we need 16 chars to show them
        .dc.b '\n'
number_len: .quad .-number
hex_digits: .ascii "0123456789ABCDEF"
# nibble re-arrangement order
shuffle_mask: .dc.b 0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15

  .text

_start:
  movdqu xmm1, [hex_digits]
  mov rax,0x123456789abcdef0
  movq xmm0, rax
  pshufb xmm0,xmmword ptr [shuffle_mask]  # segfault
  pand xmm0, xmm1
  movdqu [number + 2], xmm0

zedd151

Is SSE really necessary to display the value in RAX? Or are you asking for help with SSE instructions?
SSE may be a little advanced for the Campus, imo.

If it is help with SSE that you need, maybe edit the title? The way the title is worded, it sounds like you are using SSE to display the hex value in RAX.

edit to add:
Quote: "This code hangs on Linux"
:rolleyes:
I missed that bit when I first read the post.
In that case, this should be in the Linux Assembly board.
Or at the very least, put Linux in the title to avoid confusion.  :wink2:

I removed my Windows Masm64 example and attachment. It wasn't apparent at first, that this is for Linux.
¯\_(ツ)_/¯   :azn:

'As we don't do "requests", show us your code first.'  -  hutch—

mulu64

Yes, I'd like to use SSE instructions. I know how to do it with « classical » instructions, but there is a loop involved that could be avoided. I feel it can be serialized, but I have almost no knowledge of SSE (or AVX).

And it's not a Linux problem. I was careful to post only generic, multi-OS code.
I'd bet my SSE unit that it would be the same on Windows (you might get a BSOD instead, but you get the idea).

zedd151

Ah, okay.
I misinterpreted your intention.

There are a few members fluent in the use of SSE who might be able to help you.
To me, it's a foreign language that I do not fully understand.   :biggrin:

TimoVJL

May the source be with you

mulu64

Well done, TimoVJL! I have just added
.align 16 before the shuffle_mask declaration.
It doesn't hang anymore!
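
A note on why the .align 16 fix helps, as a minimal sketch and not TimoVJL's actual answer: legacy (non-VEX) SSE instructions such as pshufb require a 128-bit memory operand to be 16-byte aligned and raise a general-protection fault otherwise, which Linux reports as a segfault. In the original listing shuffle_mask sits 43 bytes after the start of .data (19 + 8 + 16), so it is almost certainly misaligned, whereas the earlier movdqu load of hex_digits is allowed at any address. Separately, pshufb is an SSSE3 instruction, so a CPU without SSSE3 would fault on it as an illegal instruction regardless of alignment. The example assumes GNU as with `.intel_syntax noprefix`, RIP-relative addressing and the Linux exit syscall; the `.byte 0` padding is only there to make the effect of `.align 16` visible.

```asm
    .intel_syntax noprefix
    .globl _start

    .data
    .byte 0                         # deliberately push the next label off alignment
    .align 16                       # restore it: legacy SSE memory operands need 16-byte alignment
shuffle_mask: .dc.b 0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15

    .text
_start:
    mov     rax, 0x123456789abcdef0
    movq    xmm0, rax               # low qword = value, high qword = 0
    movdqa  xmm3, xmm0              # second copy for option B

    # Option A: memory operand, safe only because of the .align 16 above
    pshufb  xmm0, xmmword ptr [rip + shuffle_mask]

    # Option B: unaligned load, then the register form; works at any address
    movdqu  xmm2, xmmword ptr [rip + shuffle_mask]
    pshufb  xmm3, xmm2

    mov     eax, 60                 # sys_exit (Linux x86-64, assumption)
    xor     edi, edi                # exit status 0
    syscall
```

It can be assembled and linked with `as -o align.o align.s && ld -o align align.o` (the file names are placeholders). Removing the `.align 16` line should bring the fault back for option A only.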

For the moment, it prints out « 0x048@ », so not exactly a success.

Now I can debug, print the XMM registers and so on.

I will post here when I have news.
(If anyone has ideas on this, don't hesitate  :azn: )

Thank you.
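
On the « 0x048@ » output: it is consistent with what the posted sequence computes. pshufb moves whole bytes, not nibbles, so the mask interleaves the eight value bytes with the eight zero bytes from the upper half of xmm0, and the pand then ANDs those bytes with the ASCII codes of "0123456789ABCDEF" instead of using them as indices. For 0x123456789abcdef0 that yields '0', '4', '8' and '@' among non-printable bytes. One possible approach, sketched below under assumptions and not taken from the thread, is to split each byte into its two nibbles, interleave them, and then use pshufb the other way round, with hex_digits as the table and the nibbles as the indices. The `nibble_mask` constant and the `bswap` step are additions of this sketch; it assumes GNU as with `.intel_syntax noprefix`, an SSSE3-capable CPU and Linux syscalls.

```asm
    .intel_syntax noprefix
    .globl _start

    .data
    .align 16
hex_digits:  .ascii "0123456789ABCDEF"   # lookup table, indexed by nibble value
nibble_mask: .fill 16, 1, 0x0F           # sixteen 0x0F bytes (added for this sketch)
number:      .ascii "0x"
             .fill 16, 1, 0x30           # sixteen '0' characters, overwritten below
             .ascii "\n"
number_len:  .quad . - number

    .text
_start:
    mov     rax, 0x123456789abcdef0
    bswap   rax                          # most significant byte first in memory order
    movq    xmm0, rax
    movdqa  xmm1, xmm0
    psrlw   xmm1, 4                      # high nibbles move down (bits 4-7 hold leftovers)
    movdqa  xmm2, xmmword ptr [rip + nibble_mask]
    pand    xmm0, xmm2                   # xmm0 = low nibble of each byte
    pand    xmm1, xmm2                   # xmm1 = high nibble of each byte
    punpcklbw xmm1, xmm0                 # interleave: hi0, lo0, hi1, lo1, ...
    movdqa  xmm3, xmmword ptr [rip + hex_digits]
    pshufb  xmm3, xmm1                   # xmm3[i] = hex_digits[xmm1[i]]
    movdqu  xmmword ptr [rip + number + 2], xmm3   # number+2 is not 16-byte aligned

    mov     eax, 1                       # sys_write (Linux x86-64, assumption)
    mov     edi, 1                       # stdout
    lea     rsi, [rip + number]
    mov     rdx, qword ptr [rip + number_len]
    syscall
    mov     eax, 60                      # sys_exit
    xor     edi, edi
    syscall
```

Built with `as -o hex.o hex.s && ld -o hex hex.o` (placeholder file names), this should print 0x123456789ABCDEF0. The two pand instructions here only strip the unwanted upper bits; the conversion to ASCII is done entirely by the table lookup.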