The MASM Forum

64 bit assembler => 64 Bit Assembler => Topic started by: HSE on April 27, 2022, 01:07:01 AM

Title: qword to unicode?
Post by: HSE on April 27, 2022, 01:07:01 AM
Hi All!

Does anybody have a 64-bit procedure to convert signed qword integers to a Unicode string?

Thanks in advance, HSE.
Title: Re: qword to unicode?
Post by: hutch-- on April 27, 2022, 01:10:26 AM
Hector,

What about one of the MSVC functions ?
Title: Re: qword to unicode?
Post by: HSE on April 27, 2022, 01:35:38 AM
Hi Hutch!

What about one of the MSVC functions ?

No, because I want that for programs running directly from UEFI.

It's not a critical problem, because I can load the integer into the FPU and the float-to-Unicode conversion works well, but I think a more direct solution would be better.
Title: Re: qword to unicode?
Post by: Biterider on April 27, 2022, 02:22:01 AM
Hi HSE
What format? Decimal, Hex, Bin?


Biterider
Title: Re: qword to unicode?
Post by: HSE on April 27, 2022, 04:28:56 AM
Hi Biterider!

What format? Decimal, Hex, Bin?

Decimal
Title: Re: qword to unicode?
Post by: jj2007 on April 27, 2022, 05:48:44 AM
Code: [Select]
include \Masm32\MasmBasic\Res\JBasic.inc ; ## builds in 32- or 64-bit mode with UAsm, ML, AsmC ##
q2aBuffer db 80 dup(?)
.code
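q2a:
  push rsi
  push rdi
  mov rsi, offset q2aBuffer+32      ; scratch area for the packed BCD value
  lea rdi, [rsi-32]                 ; output starts at q2aBuffer
  FBSTP REAL10 ptr [rsi]            ; store ST(0) as packed BCD and pop
  mov ecx, REAL10                   ; size of a REAL10, used as byte index
@@:
  movzx edx, byte ptr [rsi+rcx]     ; walk from the high bytes down
  test edx, edx
  je NoNumber                       ; a zero byte is skipped
  mov al, dl
  sar al, 4
  and al, 15                        ; high digit of the byte
  add al, "0"
  stosw
  mov al, dl
  and al, 15                        ; low digit of the byte
  add al, "0"
  stosw
NoNumber:
  dec ecx
  jns @B
  pop rdi
  pop rsi
  ret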
MyQ QWORD 123456789012345678
Init ; OPT_64 1 ; put 0 for 32 bit, 1 for 64 bit assembly
  PrintLine Chr$("This program was assembled with ", @AsmUsed$(1), " in ", jbit$, "-bit format.")
  fild MyQ
  call q2a
  jinvoke printf, Chr$("Result=%ls"), offset q2aBuffer
EndOfCode

Output:
Code: [Select]
This program was assembled with ml64 in 64-bit format.
Result=123456789012345678
Title: Re: qword to unicode?
Post by: HSE on April 27, 2022, 07:02:04 AM
Hi JJ!

Very strange:
Code: [Select]
\Masm32\MasmBasic\Res\JBasic.inc(554) : fatal error A1000:cannot open file : \Masm32\MasmBasic☺
That is from the command line, because RichMasm has some path problem.

Using the procedure:
Code: [Select]
  numero  qword 15
result:
Code: [Select]
q2aBuffer = 1F5F [q2u.asm, 226]
Title: Re: qword to unicode?
Post by: jj2007 on April 27, 2022, 07:19:53 AM
Which assembler?
Does \Masm32\MasmBasic\Res\JBasic.inc exist?
Can you post the code that produces garbage for numero 15?
Here everything works fine, and I see no reason why it shouldn't work :rolleyes:
Title: Re: qword to unicode?
Post by: HSE on April 27, 2022, 07:42:24 AM
Which assembler?
Code: [Select]
*** Start D:\masm32\MasmBasic\Res\bldallRM.bat ***
*** 64-bit assembly ***

*** Assemble, link and run q2a ***

*** Assemble using \masm32\bin64\ml64  ***
The system cannot find the path specified.
*** Assembly error ***

Does \Masm32\MasmBasic\Res\JBasic.inc exist?
The error is in JBasic.inc

Can you post the code that produces garbage for numero 15?
Code: [Select]
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

% include @Environ(OBJASM_PATH)\Code\Macros\Model.inc   ;Include & initialize standard modules
SysSetup OOP, WIDE_STRING, NUI64, DEBUG(CON)            ;Load OOP files and basic OS support

    .data

        ConInput    CHR 10 DUP(0)                       ;Get some space for the console input buffer
        dBytesRead  DWORD       0

        numero      qword     15
        q2aBuffer   db 80 dup(?)
       
    .code

q2a:
    push rsi
    push rdi
    mov rsi, offset q2aBuffer+32
    lea rdi, [rsi-32]
    FBSTP REAL10 ptr [rsi]
    mov ecx, REAL10
@@: movzx edx, byte ptr [rsi+rcx]
test edx, edx
je NoNumber
mov al, dl
sar al, 4
and al, 15
add al, "0"
stosw
mov al, dl
and al, 15
add al, "0"
stosw
NoNumber:
dec ecx
jns @B
  pop rdi
  pop rsi
  ret

start proc

    SysInit
    DbgClearAll                                           

    fild numero
    call q2a

    DbgStrA q2aBuffer
 
    DbgText "Press [ENTER] to continue..."
    invoke CreateFile, $OfsCStr("CONIN$"), GENERIC_READ, FILE_SHARE_READ, 0, OPEN_EXISTING, 0, 0
    invoke ReadFile, xax, addr ConInput, sizeof(ConInput), addr dBytesRead, NULL

    SysDone
    invoke ExitProcess,0

    ret

start endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end

Title: Re: qword to unicode?
Post by: jj2007 on April 27, 2022, 09:21:43 AM
Code: [Select]
*** Start D:\masm32\MasmBasic\Res\bldallRM.bat ***
*** 64-bit assembly ***

*** Assemble, link and run q2a ***

*** Assemble using \masm32\bin64\ml64  ***
The system cannot find the path specified.
*** Assembly error ***
So you are using an OPT_Assembler \masm32\bin64\ml64 in your source... sorry, that won't work. RichMasm assumes that all your tools reside in \masm32\bin\*. Copy ml64.exe there, and use OPT_Assembler ml (or let RichMasm use the default \masm32\bin\UAsm64.exe) :cool:

Quote
Does \Masm32\MasmBasic\Res\JBasic.inc exist?
The error is in JBasic.inc

Right - sorry. So what is at the error line 554 in your JBasic.inc, causing fatal error A1000:cannot open file : \Masm32\MasmBasic☺?

Code: [Select]
  Open "I", #0, repargA(fname)
  xchg rsi, rax
  jinvoke GetFileSize, rsi, addr bytesWritten
  inc rax
  xchg rax, rdi           <<<<<<<<<<<<<<<<<<<<<<<<< line 554 <<<<<<<<<<<<<<<<<<
  jinvoke HeapAlloc, MbProHeap, HEAP_GENERATE_EXCEPTIONS, rdi
  mov MbFileReadPtr, rax
  jinvoke ReadFile, rsi, rax, rdi, addr bytesWritten, 0
  mov rdx, MbFileReadPtr

Quote
Can you post the code that produces garbage for numero 15?
Code: [Select]
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
...

I can't see where it fails, can you post the exe, please?
Title: Re: qword to unicode?
Post by: HSE on April 27, 2022, 09:50:16 AM
... sorry, that won't work. RichMasm assumes that all your tools reside in \masm32\bin\*

All 64-bit tools are in the bin64 folder because that is the Masm64 SDK standard, but no problem. We can wait until you make the corrections  :biggrin: :biggrin: :biggrin:

Right - sorry. So what is at the error line 554 in your JBasic.inc, causing fatal error A1000:cannot open file : \Masm32\MasmBasic☺?

 :biggrin: I have DualMacs.inc, but I lost DualWin.inc somewhere, and I have never seen any pt.inc. I must have missed some update.

I can't see where it fails, can you post the exe, please?
Attached.
Title: Re: qword to unicode?
Post by: jj2007 on April 27, 2022, 06:25:30 PM
I can't see where it fails, can you post the exe, please?
Attached.
Quote
@@:   movzx ecx, byte ptr [rsi+rdx]
   jecxz NoNumber
   mov eax, ecx
   sar al, 4
   and al, 15
   add al, "0"
   stosw
Title: Re: qword to unicode?
Post by: jj2007 on April 27, 2022, 08:21:48 PM
In this forum, beating the CRT by at least a factor 5 is our favourite pastime :biggrin:

Code: [Select]
This program was assembled with ml64 in 64-bit format.
561 ticks for crt swprintf
Result=123456789012345678
109 ticks for q2a
Result=123456789012345678

Code: [Select]
This program was assembled with ml in 32-bit format.
687 ticks for crt swprintf
Result=123456789012345678
109 ticks for q2a
Result=123456789012345678
Title: Re: qword to unicode?
Post by: HSE on April 27, 2022, 09:54:53 PM
 :biggrin:
Code: [Select]
numero      qword 5000
Code: [Select]
q2aBuffer = 50 [q2u.asm, 69]
Title: Re: qword to unicode?
Post by: jj2007 on April 27, 2022, 10:24:43 PM
:biggrin:
Code: [Select]
numero      qword 5000
Code: [Select]
q2aBuffer = 50 [q2u.asm, 69]

Code: [Select]
Result=5000 :biggrin:

Post your exe...
Title: Re: qword to unicode?
Post by: HSE on April 28, 2022, 12:13:30 AM
I found a very interesting procedure from bitRAKE for ASCII, which is pretty easy to adapt for Unicode:
Code: [Select]
.data
    align 64
    digit_table dw '0','1','2','3','4','5','6','7','8','9'
                dw 'A','B','C','D','E','F','G','H','I','J'
                dw 'K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'
.code
      ;-------------------------------------------------------------------------------
      ;  Proc UINT64__Baseform
      ;  Modification from bitRAKE's fasmg_playground
      ;  https://github.com/bitRAKE/fasmg_playground/blob/master/string/baseform.asm
      ;-------------------------------------------------------------------------------

UINT64__Baseform:
; RAX number to convert
; RCX number base to use [2,36]
; RDI string buffer of length [65,14] bytes
       push rbx
       push rdi
       lea rdi, q2aBuffer
       lea rbx, digit_table
push 0
A: xor edx,edx
div rcx
push qword ptr [rbx+rdx*2]
test rax,rax
jnz A

B: pop rax
stosw
test al,al
jnz B
       mov rax, rdi         ; comment for timing
       pop rdi
       pop rbx
ret
; RCX unchanged
; RAX end of null-terminated string

Code: [Select]
    mov rax, 1500
    mov rcx, 10
    call UINT64__Baseform
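
The divide-and-stack idea may be easier to follow in a high-level model. The following C sketch is only an illustration under assumptions, not bitRAKE's code: the function name u64_to_base_w, the use of uint16_t as the wide character type, and the local array standing in for the CPU stack are all made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of the routine above: divide repeatedly by the base,
   stack the digits, then unstack them so the most significant comes first.
   Returns a pointer just past the terminating zero, like RAX in the asm. */
uint16_t *u64_to_base_w(uint16_t *dst, uint64_t value, unsigned base)
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    uint16_t stack[65];              /* terminator + up to 64 binary digits */
    int top = 0;

    assert(base >= 2 && base <= 36);
    stack[top++] = 0;                /* like "push 0": marks the end */
    do {
        stack[top++] = (uint16_t)digits[value % base];
        value /= base;
    } while (value != 0);

    do {                             /* pop until the zero has been stored */
        *dst = stack[--top];
    } while (*dst++ != 0);
    return dst;
}
```

Calling u64_to_base_w(buf, 1500, 10) with a 65-element uint16_t buffer should leave the wide string "1500" in buf and return buf + 5, the address just past the terminator.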

LATER: JJ, your algorithm fails because it cannot handle a "00" byte  :thdn:    :sad:
Title: Re: qword to unicode?
Post by: HSE on April 28, 2022, 01:15:25 AM
JJ, this works:
Code: [Select]
q2a:
  push rsi
  push rdi
  mov rsi, offset q2aBuffer+32
  lea rdi, [rsi-32]
  FBSTP REAL10 ptr [rsi]
 
  push REAL10
  pop rdx
  mov r8, 0
@@:
movzx ecx, byte ptr [rsi+rdx]
        add r8, rcx
        test r8, r8
        je NoNumber
        mov r8, 1
mov eax, ecx
shr al, 4
or al, "0"
stosw
mov al, cl
and al, 15
or al, "0"
stosw
NoNumber:
dec rdx
jns @B
  pop rdi
  pop rsi
  ret

Once a nonzero byte has been seen, a "00" byte is a valid pair of digits.
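
The same rule can be written down in C. This is a minimal sketch under assumptions: pack_bcd is a hypothetical stand-in for FBSTP (18 packed BCD digits, least significant pair first, sign in byte 9), and the sign and all-zero handling in bcd_to_wide are additions of the sketch that the routine above does not have.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for FBSTP: pack |value| (at most 18 decimal digits) into 9 bytes
   of packed BCD, least significant pair first; byte 9 holds the sign. */
void pack_bcd(uint8_t bcd[10], int64_t value)
{
    uint64_t mag = value < 0 ? 0 - (uint64_t)value : (uint64_t)value;
    assert(mag <= 999999999999999999ULL);
    for (int i = 0; i < 9; ++i) {
        unsigned pair = (unsigned)(mag % 100);
        bcd[i] = (uint8_t)(((pair / 10) << 4) | (pair % 10));
        mag /= 100;
    }
    bcd[9] = value < 0 ? 0x80 : 0x00;
}

/* Walk from the most significant byte down. A zero byte is skipped only
   while no nonzero byte has been seen; afterwards "00" is two real digits.
   dst needs room for 20 wide characters. Returns the character count. */
int bcd_to_wide(uint16_t *dst, const uint8_t bcd[10])
{
    int n = 0, started = 0;
    if (bcd[9] & 0x80) dst[n++] = '-';
    for (int i = 8; i >= 0; --i) {
        started |= bcd[i];               /* same role as "add r8, rcx" */
        if (!started) continue;
        dst[n++] = (uint16_t)('0' + (bcd[i] >> 4));
        dst[n++] = (uint16_t)('0' + (bcd[i] & 15));
    }
    if (!started) dst[n++] = '0';        /* all bytes zero: emit a single 0 */
    dst[n] = 0;
    return n;
}
```

With this logic 5000 comes out as "5000" rather than "50". Because digits are emitted in pairs, a value with an odd digit count such as 123 still comes out as "0123".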
Title: Re: qword to unicode?
Post by: jj2007 on April 28, 2022, 02:19:52 AM
Clever :thumbsup:

Do you really need the test r8, r8?
Title: Re: qword to unicode?
Post by: HSE on April 28, 2022, 02:23:50 AM
Do you really need the test r8, r8?

Clever :thumbsup:
Title: Re: qword to unicode?
Post by: Biterider on April 28, 2022, 07:09:06 AM
Hi
I checked both procs, UINT64__Baseform (bitRAKE) and q2a.
Apart from the ugly leading "0" of q2a, UINT64__Baseform is faster (depending on argument size) and the representation base can be changed.

Argument = 123 => 2x faster (Base = 10)
Argument = 1234567890 => same performance

A signed version is not too hard to code.

Biterider
Title: Re: qword to unicode?
Post by: jj2007 on April 28, 2022, 10:44:42 AM
Very nice, Biterider :thumbsup:

Code: [Select]
This program was assembled with ml64 in 64-bit format.
87      bytes for q2a
125     bytes for UINT64

1482 ticks for crt swprintf
Result=123456789
452 ticks for q2a
484 ticks for q2a
452 ticks for q2a
468 ticks for q2a
Result=123456789
468 ticks for UINT64__Baseform
484 ticks for UINT64__Baseform
468 ticks for UINT64__Baseform
468 ticks for UINT64__Baseform
Result=123456789

For short strings up to 12345678, your 64-bit code is faster; above 123456789 mine is faster. Your 32-bit version is significantly faster.

The leading zero problem is solved. My routine is signed, but that's a minor difference, of course.

Attached source and executables (built with ML64, but I recommend UAsm64 (http://www.terraspace.co.uk/uasm.html#p2)).
Title: Re: qword to unicode?
Post by: HSE on April 28, 2022, 12:03:54 PM
Hi Biterider!

UINT64__Baseform is faster (depending on argument size) and the representation base can be changed.

Yes, very elegant and versatile. I think uq2baseW is a descriptive enough name.

A signed version is not too hard to code.

That could be sq2baseW.

HSE
Title: Re: qword to unicode?
Post by: HSE on April 28, 2022, 12:11:28 PM
JJ:

Do you need to adjust your glasses? :biggrin:

your 64-bit code is faster; above 123456789 mine is faster. Your 32-bit version is significantly faster.

bitRAKE may sound similar to Biterider, but they are two different, well-known people. I also deserve some credit: essentially, I changed some "e" to "r"  :biggrin: :biggrin: :biggrin:
Title: Re: qword to unicode?
Post by: TimoVJL on April 28, 2022, 04:53:49 PM
Old AMD
Code: [Select]

This program was assembled with ml in 32-bit format.
81      bytes for q2a
126     bytes for UINT64

281 ticks for crt swprintf
Result=123456789
31 ticks for q2a
47 ticks for q2a
47 ticks for q2a
47 ticks for q2a
Result=123456789
46 ticks for UINT64__Baseform
63 ticks for UINT64__Baseform
47 ticks for UINT64__Baseform
62 ticks for UINT64__Baseform
Result=111111111

--- hit any key ---
Code: [Select]

This program was assembled with ml64 in 64-bit format.
87      bytes for q2a
125     bytes for UINT64

219 ticks for crt swprintf
Result=123456789
31 ticks for q2a
47 ticks for q2a
46 ticks for q2a
32 ticks for q2a
Result=123456789
62 ticks for UINT64__Baseform
47 ticks for UINT64__Baseform
62 ticks for UINT64__Baseform
63 ticks for UINT64__Baseform
Result=111111111

--- hit any key ---
Title: Re: qword to unicode?
Post by: Biterider on April 29, 2022, 06:09:02 AM
Hi
While coding the signed version of UINT64__Baseform, I became unsure what we expect to see from the conversion from let's say -123 (decimal) to base 16 or to base 2. Are minus signs allowed on bases other than 10?
Does anyone know for sure the correct answer?

Biterider
Title: Re: qword to unicode?
Post by: HSE on April 29, 2022, 06:46:53 AM
Are minus signs allowed on bases other than 10?
:biggrin: Maybe that is the wrong question, because the answer is obvious.

A negative number is negative in any base; the number is always the same.

Perhaps the question is: are negative numbers actually written in bases other than 10?  :thumbsup: 

It's just that in computing, a negative number written in base 2 is not what is normally called a binary number, and a negative number written in base 16 is not hexadecimal (because binary and hexadecimal use the complement and the fixed size of a register).
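
To make the two readings concrete, here is a small C sketch under assumptions; both function names are hypothetical, and narrow characters are used for brevity. It formats a signed qword once as sign plus magnitude in any base from 2 to 36, and once as the raw two's complement bit pattern in hexadecimal.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: writes value in the given base as sign + magnitude.
   dst needs room for 66 characters. Returns the string length. */
int i64_to_base(char *dst, int64_t value, unsigned base)
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    char tmp[64];
    int n = 0, len = 0;
    /* Negate in unsigned arithmetic so that INT64_MIN does not overflow. */
    uint64_t mag = value < 0 ? 0 - (uint64_t)value : (uint64_t)value;

    assert(base >= 2 && base <= 36);
    do {
        tmp[n++] = digits[mag % base];
        mag /= base;
    } while (mag != 0);
    if (value < 0) dst[len++] = '-';
    while (n > 0) dst[len++] = tmp[--n];
    dst[len] = '\0';
    return len;
}

/* Two's complement view: the same 64 bits read as unsigned, 16 hex digits. */
void i64_to_hex_bits(char dst[17], int64_t value)
{
    uint64_t bits = (uint64_t)value;
    for (int i = 15; i >= 0; --i) {
        dst[i] = "0123456789ABCDEF"[bits & 15];
        bits >>= 4;
    }
    dst[16] = '\0';
}
```

For -123, the first function gives "-7B" in base 16 and "-1111011" in base 2, while the second gives "FFFFFFFFFFFFFF85".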
Title: Re: qword to unicode?
Post by: jj2007 on April 29, 2022, 09:42:14 AM
Are minus signs allowed on bases other than 10?

They are not forbidden but highly unusual. In the meantime, I gave my routines a little speed boost - grateful for some timings:

Code: [Select]
This program was assembled with UAsm64 in 64-bit format.
87      bytes for q2a
72      bytes for q2asc
117     bytes for UINT64

2699 ticks for crt swprintf
2699 ticks for crt swprintf
    Result=123456789012345678

499 ticks for q2a
515 ticks for q2a
499 ticks for q2a
    Result=123456789012345678

187 ticks for q2asc
187 ticks for q2asc
187 ticks for q2asc
    Result=123456789012345678

1030 ticks for UINT64__Baseform
1045 ticks for UINT64__Baseform
1014 ticks for UINT64__Baseform
    Result=123456789012345678


686 ticks for crt swprintf
671 ticks for crt swprintf
    Result=123

453 ticks for q2a
436 ticks for q2a
453 ticks for q2a
    Result=123

31 ticks for q2asc
31 ticks for q2asc
31 ticks for q2asc
    Result=123

156 ticks for UINT64__Baseform
156 ticks for UINT64__Baseform
172 ticks for UINT64__Baseform
    Result=123
Title: Re: qword to unicode?
Post by: HSE on April 29, 2022, 10:02:48 AM
Code: [Select]
2265 ticks for crt swprintf
2157 ticks for crt swprintf
    Result=123456789012345678

407 ticks for q2a
390 ticks for q2a
422 ticks for q2a
    Result=123456789012345678

125 ticks for q2asc
141 ticks for q2asc
109 ticks for q2asc
    Result=123456789012345678

766 ticks for UINT64__Baseform
781 ticks for UINT64__Baseform
766 ticks for UINT64__Baseform
    Result=123456789012345678


578 ticks for crt swprintf
609 ticks for crt swprintf
    Result=123

359 ticks for q2a
375 ticks for q2a
375 ticks for q2a
    Result=123

32 ticks for q2asc
15 ticks for q2asc
16 ticks for q2asc
    Result=123

109 ticks for UINT64__Baseform
110 ticks for UINT64__Baseform
125 ticks for UINT64__Baseform
    Result=123

--- hit any key ---
Title: Re: qword to unicode?
Post by: TimoVJL on April 29, 2022, 04:01:09 PM
Old AMD
Code: [Select]

This program was assembled with UAsm64 in 64-bit format.
87      bytes for q2a
72      bytes for q2asc
117     bytes for UINT64

4103 ticks for crt swprintf
4056 ticks for crt swprintf
    Result=123456789012345678

468 ticks for q2a
484 ticks for q2a
468 ticks for q2a
    Result=123456789012345678

265 ticks for q2asc
265 ticks for q2asc
281 ticks for q2asc
    Result=123456789012345678

1607 ticks for UINT64__Baseform
1607 ticks for UINT64__Baseform
1622 ticks for UINT64__Baseform
    Result=123456789012345678


936 ticks for crt swprintf
936 ticks for crt swprintf
    Result=123

375 ticks for q2a
358 ticks for q2a
359 ticks for q2a
    Result=123

63 ticks for q2asc
31 ticks for q2asc
32 ticks for q2asc
    Result=123

140 ticks for UINT64__Baseform
140 ticks for UINT64__Baseform
141 ticks for UINT64__Baseform
    Result=123



--- hit any key ---
Title: Re: qword to unicode?
Post by: jj2007 on April 29, 2022, 04:12:32 PM
Thanks, Timo & Hector :thup:
Title: Re: qword to unicode?
Post by: InfiniteLoop on April 30, 2022, 07:36:33 AM
I've been working on this very problem.
AVX512 for 16 digits: 2400 MB/s
The idea was stolen  :toothy:
Code: [Select]
;==============================================================
;Integer to String Using AVX512. RCX=unsigned long long. RDX=ptr to char string.
;==============================================================
IntToChar_4 proc
mov r8,rdx
mov rdx,12379400392853802749
mov rax,rcx
mulx rax,rax,rax
mov rdx,rcx
shr rax,26
mov rdx,rax
imul rdx,100000000
sub rcx,rdx
vpxor xmm2,xmm2,xmm2 ;maintain integer domain
vpxor xmm3,xmm3,xmm3
vpbroadcastq zmm0, rax
vpbroadcastq zmm1, rcx
;vmovq xmm2, zeroZ ;original code. don't understand purpose. < 52-bit cutoff.
;vmovdqa64 zmm3, zmm2
vpmadd52luq zmm2, zmm0, zmmword ptr iFMAZ
vpmadd52luq zmm3, zmm1, zmmword ptr iFMAZ
vpbroadcastq zmm4, qword ptr TenZ
vpbroadcastq zmm5, qword ptr CharZ
vmovdqa64 zmm0, zmm5
vpmadd52huq zmm0, zmm4, zmm2
vpmadd52huq zmm5, zmm4, zmm3
vpxor xmm1,xmm1,xmm1 ;not necessary
vmovdqu xmm1, xmmword ptr permZ
vpermi2b zmm1,zmm5,zmm0
vmovdqu xmmword ptr [r8],xmm1
vzeroupper
ret
permZ BYTE 78h,70h,68h,60h,58h,50h,48h,40h,38h,30h,28h,20h,18h,10h,8h,0 ;selects bytes from 2 zmmwords. 0 to 127.
iFMAZ QWORD 0000199999999999ah,0000028f5c28f5c29h, 0000004189374bc6bh, 000000068db8bac72h, 00000000a7c5ac472h,0000000010c6f7a0ch, 0000000001ad7f29bh,00000000002af31dch ;2^52/10^y
;zeroZ QWORD 1A1A400h ;Serves no purpose. Does this become zero?
TenZ QWORD 10
CharZ QWORD '0'
IntToChar_4 endp

No idea how to efficiently do all 20 digits or remove the leading zeros.
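
IntToChar_4 begins by splitting the input into two eight-digit halves without a div instruction. The constant 12379400392853802749 is 2^90 / 10^8 rounded up, so the high 64 bits of the product, shifted right by a further 26 bits, give n / 10^8. The C sketch below models only that step; mulhi64 and split_1e8 are hypothetical helper names, and the multiply built from 32-bit halves is a portable stand-in for MULX.

```c
#include <assert.h>
#include <stdint.h>

/* High 64 bits of a 64x64-bit product, assembled from 32-bit halves. */
uint64_t mulhi64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;
    return p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

/* Split n into n / 10^8 (top) and n % 10^8 (bottom) without dividing. */
void split_1e8(uint64_t n, uint64_t *top, uint64_t *bottom)
{
    const uint64_t magic = 12379400392853802749ULL;  /* ceil(2^90 / 10^8) */
    *top = mulhi64(n, magic) >> 26;      /* 64 + 26 = 90 bits dropped in total */
    *bottom = n - *top * 100000000ULL;
    assert(*bottom < 100000000ULL);
}
```

For a 16-digit input such as 1234567890123456 this yields top = 12345678 and bottom = 90123456, the two halves that the vector code then expands into digits.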


Title: Re: qword to unicode?
Post by: jj2007 on April 30, 2022, 10:08:11 AM
No idea how to efficiently do all 20-figures or remove the 0's.

Keep trying :thumbsup:
Title: Re: qword to unicode?
Post by: Biterider on May 03, 2022, 12:46:29 AM
Hi
I wrote a routine that is a combination of the bitRAKE and the JJ algorithm.

It has the advantage of writing to the beginning of the destination buffer, which avoids an extra string copy in most cases. It also returns the number of bytes written to the buffer.

Performance is much better than UINT64__Baseform and slightly slower than q2asc, but when you add a string copy to the latter, it far outperforms both.

The attached file contains the unsigned and signed versions of the routine.


Biterider



Title: Re: qword to unicode?
Post by: HSE on May 03, 2022, 02:03:42 AM
The attached file contains the unsigned and signed versions of the routine.

Fantastic  :thumbsup:

Meanwhile, I added to ObjMemEFI the bitRAKE algorithm:
Code: [Select]
; ==================================================================================================
; Title:      uq2baseW.asm
; Author:     Héctor S. Enrique
; Version:    C.1.0
; Notes:      Version C.1.0, April 2022
;               - First release.
; ---------------------------------------------------------------
;  Modification from bitRAKE's Proc UINT64__Baseform in fasmg_playground
;  https://github.com/bitRAKE/fasmg_playground/blob/master/string/baseform.asm
; ==================================================================================================

% include @Environ(OBJASM_PATH)\Code\OA_SetupEFI.inc
% include &ObjMemPath&ObjMem.cop
.data

    align 64
    digit_table dw '0','1','2','3','4','5','6','7','8','9'
                dw 'A','B','C','D','E','F','G','H','I','J'
                dw 'K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'

.code

; ——————————————————————————————————————————————————————————————————————————————————————————————————
; Procedure:  uq2baseW
; Purpose:    Converts a QWORD to its base WIDE string representation.
; Arguments:  Arg1: -> Destination WIDE string buffer.
;             Arg2: QWORD value.
;             Arg3: QWORD base.
; Return:     rax -> end of the null-terminated string.
; Notes:      In code


align ALIGN_CODE
uq2baseW proc uses xbx xsi xdi lpBuffer:POINTER, uqValue:QWORD, uqBase:QWORD
    mov rax, uqValue        ; RAX number to convert
    mov rcx, uqBase         ; RCX number base to use [2,36]
    mov rdi, lpBuffer       ; RDI string buffer of length [65,14] bytes
    lea rbx, digit_table
push 0
A: xor edx,edx
div rcx
push qword ptr [rbx+rdx*2]
test rax,rax
jnz A

B: pop rax
stosw
test al,al
jnz B
       mov rax, rdi         ; comment for timing
ret

; RCX unchanged
; RAX end of null-terminated string

uq2baseW endp

end
Title: Re: qword to unicode?
Post by: jj2007 on May 03, 2022, 09:18:04 AM
I wrote a routine that is a combination of the bitRAKE and the JJ algorithm.

Your uqw2dec is pretty fast :thumbsup:

Code: [Select]
This program was assembled with UAsm64 in 64-bit format.
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz

328 ticks for uqw2dec
312 ticks for uqw2dec
312 ticks for uqw2dec
    Result=1234567890123456789

249 ticks for q2asc
250 ticks for q2asc
249 ticks for q2asc
    Result=1234567890123456789

172 ticks for uqw2dec
156 ticks for uqw2dec
172 ticks for uqw2dec
    Result=1234567890

202 ticks for q2asc
125 ticks for q2asc
125 ticks for q2asc
    Result=1234567890

250 ticks for q2asc
249 ticks for q2asc
    Result=1234567890123456789

1420 ticks for Baseform (bitRAKE)
1388 ticks for Baseform (bitRAKE)
    Result=1234567890123456789

96      bytes for uqw2dec
72      bytes for q2a32
88      bytes for q2asc
80      bytes for UINT64

It has the advantage of writing to the beginning of the destination buffer, which avoids an extra string copy in most cases.

That has been solved some time ago for q2asc.
Title: Re: qword to unicode?
Post by: Biterider on May 03, 2022, 03:31:32 PM
Hi JJ

That has been solved some time ago for q2asc.

That's great!
I got my q2asc version from here: http://masm32.com/board/index.php?topic=10022.15#:~:text=q2asc.zip%20(6.04%20kB%20%2D%20downloaded%207%20times.) Is there a new one?

Biterider
Title: Re: qword to unicode?
Post by: jj2007 on May 03, 2022, 07:28:04 PM
Is there a new one?

Hi Biterider,

There are many new ones, it's a mess :badgrin:

This is the tail of my current q2asc version. As you can see, it does indeed perform a 3*16 = 48 byte copy, but it's fast:

Code: [Select]
  movups xmm0, [rdi] ; src
  movups xmm1, [rdi+16]
  if @64
movups xmm2, [rdi+32] ; not needed in 32-bit code because limited to DWORD
  endif
  movaps [rax], xmm0 ; dest is align 16
  movaps [rax+16], xmm1
  if @64
movaps [rax+32], xmm2
  endif
  pop rbx
  ife @64
pop rcx
  endif
  pop rdi
  ret
Title: Re: qword to unicode?
Post by: InfiniteLoop on May 04, 2022, 07:48:22 AM
I tested Biterider's code. VS2022 doesn't like macro statements.
For reference, SPrintf(): ~100 MB/s using random xorshift64 unsigned long longs.
The naive 20-digit "divide by 10" loop achieves 287 MB/s and removes the leading zeros.
This "SWAR" algorithm is the fastest scalar one yet at ~973 MB/s, although it still handles only 16 digits and keeps the leading zeros.
Code: [Select]
;==============================================================
;Integer to String using SWAR method. RCX=num RDX=str
;==============================================================
EncodeTens proc ;rcx,rdx
shl rdx,32
or rcx,rdx
mov rax,20972
imul rax,rcx
shr rax,21
mov r8, 7f0000007fh ;((merged * 10486ULL) >> 20) & ((0x7FULL << 32) | 0x7FULL);
and rax,r8 ;top
mov rdx,100
imul rdx,rax
sub rcx,rdx ;bottom
shl rcx,16
add rcx, rax ;hundreds
mov rax,103
imul rax,rcx
shr rax, 10 ;tens
mov r8,0f000f000f000fh
and rax,r8
lea rdx, [rax+rax]
lea rdx, [rdx*4+rdx]
sub rcx,rdx
shl rcx,8
add rax,rcx
ret
EncodeTens endp

IntToChar_SWAR proc
mov r11,rdx
mov rdx,12379400392853802749
mov r8, 100000000
mov rax,rcx
mulx rax,rax,rax
shr rax,26 ;top
imul r8,rax
sub rcx,r8 ;bottom
push rcx
mov ecx,3518437209
imul rcx,rax
shr rcx,45 ;top\10^4
mov edx,10000
imul edx,ecx
sub eax,edx
mov edx,eax
call EncodeTens
mov r10,3030303030303030h
add rax,r10
mov qword ptr [r11],rax
pop rax
mov ecx,3518437209
imul rcx,rax
shr rcx,45 ;top\10^4
mov edx,10000
imul edx,ecx
sub eax,edx
mov edx,eax
call EncodeTens
add rax,r10
mov qword ptr [r11+8],rax
ret
IntToChar_SWAR endp
;==============================================================
Title: Re: qword to unicode?
Post by: jj2007 on May 04, 2022, 08:44:59 AM
Looks interesting, but can you post working code? What does RCX=num RDX=str mean?
Title: Re: qword to unicode?
Post by: jj2007 on May 05, 2022, 04:57:06 AM
See new Lab post The joy of beating the CRT by a factor 10 (http://masm32.com/board/index.php?topic=10037.0).

When debugging some benchmarks, I stumbled over some code that looked very familiar:
Code: [Select]
    .while (eax > 0)
      mov ebx,eax
      mul ecx
      shr edx, 3
      mov eax,edx
      lea edx,[edx*4+edx]
      add edx,edx
      sub ebx,edx
      add bl,'0'
      mov [edi],bl
      add edi, 1
    .endw

I'm sure Biterider will recognise it, too :biggrin:

Check \Masm32\m32lib\dwtoa.asm :cool:
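
The loop divides by 10 with a multiply and a shift instead of a div. A C sketch of the same step follows, assuming ECX holds the constant 0CCCCCCCDh, which the excerpt does not show; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* n / 10 without a divide: multiply by ceil(2^35 / 10) and drop 35 bits
   (32 from taking EDX, 3 more from "shr edx, 3"). */
uint32_t div10(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 35);
}

/* Emits the digits of n least significant first, as the loop above does,
   and returns the digit count. n = 0 yields no digits, as in the loop;
   reversing the digits is left to the caller. */
int digits_reversed(char *dst, uint32_t n)
{
    int len = 0;
    while (n > 0) {
        uint32_t q = div10(n);
        uint32_t r = n - q * 10;     /* lea edx,[edx*4+edx] / add edx,edx / sub ebx,edx */
        assert(r < 10);
        dst[len++] = (char)('0' + r);
        n = q;
    }
    return len;
}
```

For 1234 the digits come out as '4', '3', '2', '1', so a caller still has to reverse them to get the usual order.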
Title: Re: qword to unicode?
Post by: Biterider on May 05, 2022, 06:45:45 AM
Hi

I'm sure Biterider will recognise it, too :biggrin:
Seems quite familiar to me :biggrin:

Today I found some time to play with this procedure a bit more. I have tried to combine all the code pieces we discussed before, using all available x64 registers, removing all unnecessary frame instructions and interleaving other instructions.
I came up with a combination that gave the best results on my machine and looks like this:
Code: [Select]
OPTION PROC:NONE
uqw2dec2W proc pBuffer:POINTER, qNumber:QWORD
  sub rsp, 32h
  lea r9, [rsp + 30h]
  mov rax, rdx
  mov word ptr [r9], 0
  mov r10, 0CCCCCCCCCCCCCCCDh
@@:
  sub r9, 2
  mov r8, rax
  mul r10
  shr rdx, 3
  mov rax, rdx
  lea rdx, [4*rdx + rdx]
  lea rdx, [2*rdx - "0"]
  sub r8, rdx
  mov [r9], r8w
  test rax, rax
  jne @B
  movups xmm0, [r9]
  movups xmm1, [r9 + 16]
  movups xmm2, [r9 + 32]
  add rsp, 32h
  movups [rcx], xmm0
  mov rax, rsp
  movups [rcx + 16], xmm1
  sub rax, r9
  movups [rcx + 32], xmm2
  ret
uqw2dec2W endp
OPTION PROC:DEFAULT

The only requirement is that the destination buffer has room for at least 48 bytes.
On return, rax contains the number of bytes written, including the zero termination character.
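
The buffer handling can be modelled in C as follows. This is a sketch under assumptions, not a translation: uqw2dec_model is a hypothetical name, the scratch array plays the role of the stack area, and plain / and % are used because compilers typically turn them into the same multiply-and-shift by 0CCCCCCCCCCCCCCCDh.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of uqw2dec2W. dst must have room for 48 bytes
   (24 wide characters). Returns the bytes written, terminator included. */
size_t uqw2dec_model(uint16_t *dst, uint64_t number)
{
    /* 20 digits + terminator, plus slack so that the fixed 48-byte copy
       never reads outside the array. */
    uint16_t scratch[44] = {0};
    uint16_t *p = &scratch[20];      /* terminator position, already zero */

    assert(dst != NULL);
    do {                             /* digits are produced last to first */
        *--p = (uint16_t)('0' + number % 10);
        number /= 10;
    } while (number != 0);

    memcpy(dst, p, 48);              /* fixed-size copy, like the movups pairs */
    return (size_t)(&scratch[21] - p) * sizeof(uint16_t);
}
```

For 1234567890 the model returns 22: ten wide digits plus the terminator. One difference from the assembly version is that the bytes copied after the terminator are zeros here, whereas the original copies whatever happens to follow on the stack.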

Biterider
Title: Re: qword to unicode?
Post by: jj2007 on May 05, 2022, 07:21:41 AM
Looks good, Biterider :thumbsup:

Code: [Select]
This program was assembled with UAsm64 in 64-bit format.
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz

905 ticks for uqw2dec
889 ticks for uqw2dec
890 ticks for uqw2dec
    Result=1234567890123456789

436 ticks for uqw2dec2W
437 ticks for uqw2dec2W
453 ticks for uqw2dec2W
    Result=1234567890

421 ticks for q2asc
421 ticks for q2asc
421 ticks for q2asc
    Result=1234567890

842 ticks for q2asc
858 ticks for q2asc
796 ticks for q2asc
    Result=1234567890123456789

105     bytes for uqw2dec
88      bytes for q2asc
Title: Re: qword to unicode?
Post by: Biterider on May 05, 2022, 04:18:29 PM
Hi JJ
Thank you for sharing your recent modifications.  :thumbsup:

I tried the lea-not-lea sequence in the main loop, but didn't get the improvement you see on your machine.

The last real boost came from the XMM copy you introduced recently. I didn't use the aligned write because I often concatenate strings and the target isn't guaranteed to be aligned. I timed the change and didn't see a disadvantage, but that may be different on other CPUs.
The returned value of bytes written is very useful for a general API and has little impact on timing.

Biterider
Title: Re: qword to unicode?
Post by: jj2007 on May 05, 2022, 06:25:53 PM
I tried the lea-not-lea sequence in the main loop, but didn't get the improvement you see on your machine.

The lea-not-lea boost is not that big, but it saves one test rax, rax, of course. The current MasmBasic Str$() uses it in the DWORD and QWORD to Ansi versions. My DWORD to Ansi is roughly 30% faster than the Masm32 SDK dwtoa.

Quote
The last real boost came from the XMM copy you introduced recently. I didn't use the aligned write because I often concatenate strings and the target isn't guaranteed to be aligned. I timed the change and didn't see a disadvantage, but that may be different on other CPUs.
The returned value of bytes written is very useful for a general API and has little impact on timing.

Unaligned write, too, for the latest MasmBasic version (http://masm32.com/board/index.php?topic=94.0). However, I chose to return the end position of the last write, as demonstrated in the Lab post The joy of beating the CRT by a factor 10 (http://masm32.com/board/index.php?topic=10037.0).
Title: Re: qword to unicode?
Post by: Biterider on May 23, 2022, 04:19:19 AM
Hi
Searching through previously written code I found one by P. Dixon. It's not new as it's been discussed here several times (check the old forum).

It took me some time to build and test the x64 version.
Testing the code is not easy because checking all possible input values takes ages.
The performance is outstanding, surpassing the previously discussed routines by a factor of ~2.

Regards, Biterider
Title: Re: qword to unicode?
Post by: HSE on May 23, 2022, 06:48:06 AM
 :thumbsup: It's working.
Title: Re: qword to unicode?
Post by: HSE on May 24, 2022, 01:43:28 AM
Hi Biterider!

It looks like there is a problem with the ZTC (zero termination character): the previous string in the buffer remains, shifted to the right (at least when running from UEFI).

Regards, HSE.
Title: Re: qword to unicode?
Post by: Biterider on May 24, 2022, 02:53:04 AM
Hi HSE
Thanks for the feedback.
For a better understanding, could you please write 3-4 lines of code showing the problem?

Biterider

PS: I don't think that it is a UEFI thing  :tongue:
Title: Re: qword to unicode?
Post by: Biterider on May 24, 2022, 05:42:41 AM
Hi HSE
I think I found the problem. There was a typo when setting the ZTC in the wide version of the proc.
I replaced the download from the post above (reply #44).

Biterider
Title: Re: qword to unicode?
Post by: HSE on May 24, 2022, 07:18:37 AM
Hi Biterider

Perfect now  :thumbsup:

HSE