Dword to ascii (dw2a, dwtoa, dw2str, Str$, ...)

Started by jj2007, April 01, 2024, 03:42:50 AM

jj2007

Quote from: six_L on April 02, 2024, 08:42:46 PM
Only 10, 19, 28, 37, 46 and 55 are magic numbers in 2-60.

Oops, that's bad news :sad:

So we'd better stick with Ray's div ebx. It's a very elegant algo indeed.

ahsat

Quote from: jj2007 on April 02, 2024, 08:58:34 PM
So we'd better stick with Ray's div ebx. It's a very elegant algo indeed.

But now you will have to make another decision: which is valid, 0x0A or 0x0a?

ahsat

Recursion can also be used to very simply handle unzipping zips within zips, or any like problem.

daydreamer

Quote from: ahsat on April 03, 2024, 12:22:46 AM
Recursion can also be used to very simply handle unzipping zips within zips, or any like problem.

I wonder if you have experience with recursive chess code that climbs up and down a tree until it finds the best move?
A simpler exercise is to do the same with tic tac toe.
my none asm creations
https://masm32.com/board/index.php?topic=6937.msg74303#msg74303
I am an Invoker
"An Invoker is a mage who specializes in the manipulation of raw and elemental energies."
Like SIMD coding

six_L

Hi,jj2007

Quote
that's bad news

That's good news, because it saves you from wasting your time unnecessarily.

Regards
Say you, Say me, Say the codes together for ever.

jimg

Finally got something to beat dwtoa consistently using Ray's recursion technique.  (Apologies to Ray, we're trying to beat dwtoa in this thread, not trash your fine general purpose algorithm.)

ctAnyToAnyMul:                  ; in: eax = signed dword, edi -> output buffer
    test eax, eax
    jnz sign
    mov word ptr [edi], 30h     ; zero: store '0' plus a terminating 0 byte
    ret

sign:
    jns pos                     ; flags are still those of test eax, eax
    mov byte ptr [edi], '-'
    neg eax
    add edi, 1
pos:
    push ebx
    mov ecx, 0CCCCCCCDh         ; magic number: 2^35 / 10, rounded up
    call ctanydoit              ; note: this path stores no terminating 0 byte
    pop ebx
    ret

align 16
ctanydoit:
    push edx                    ; save edx, or a digit
    mov ebx, eax                ; save original
    mul ecx                     ; edx:eax = num * magic number
    shr edx, 3                  ; magic number fixup: edx = num / 10
    mov eax, edx                ; eax = quotient
    lea edx, [edx*4+edx]        ; quotient * 5
    add edx, edx                ; quotient * 10
    sub ebx, edx                ; remainder = original - quotient * 10
    mov edx, ebx                ; dl = current digit
    test eax, eax
    .if !zero?
        call ctanydoit          ; generate the higher digits first
    .endif
    add dl, '0'
    mov [edi], dl
    inc edi
    pop edx                     ; restore edx, or get next digit
    ret

Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz (SSE4)

5984    cycles for 100 * dwtoa
5921    cycles for 100 * Ray+mul

5979    cycles for 100 * dwtoa
5905    cycles for 100 * Ray+mul

5976    cycles for 100 * dwtoa
5877    cycles for 100 * Ray+mul

5978    cycles for 100 * dwtoa
5882    cycles for 100 * Ray+mul

Averages:
5978    cycles for dwtoa
5894    cycles for Ray+mul

20      bytes for dwtoa
104     bytes for Ray+mul

dwtoa                                  -123456789
Ray+mul                                -123456789

jimg

Okay, those tests are probably invalid.  I added a duplicate in TestG, called it Ray+mul2, and it ran slower than the other two, even though the code was identical.  Maybe a page crossing?  Which one is faster depends on where it sits in the program and in memory, I guess.  But the new routine is at least as fast as dwtoa.

6014    cycles for 100 * dwtoa
5975    cycles for 100 * Ray+mul
6083    cycles for 100 * Ray+mul2

6038    cycles for 100 * dwtoa
5914    cycles for 100 * Ray+mul
6111    cycles for 100 * Ray+mul2

6041    cycles for 100 * dwtoa
5886    cycles for 100 * Ray+mul
6074    cycles for 100 * Ray+mul2

6032    cycles for 100 * dwtoa
5870    cycles for 100 * Ray+mul
6108    cycles for 100 * Ray+mul2

Averages:
6035    cycles for dwtoa
5900    cycles for Ray+mul
6096    cycles for Ray+mul2

20      bytes for dwtoa
104     bytes for Ray+mul
104     bytes for Ray+mul2

dwtoa                                   -123456789
Ray+mul                                 -123456789
Ray+mul2                                -123456789

ahsat

Quote from: daydreamer on April 03, 2024, 12:54:31 AM
I wonder if you have skill with chess recursive code

No, but I did a tic tac toe in Fortran once, without recursion.

six_L

Hi,all
Perhaps Ray's algorithm has important research and application potential in compression, encryption and large-number calculation, since it includes "any".

Regards

ahsat

Quote from: jimg on April 03, 2024, 02:46:08 AM
Maybe a page crossing?

I think you can force either or both to a page boundary with an "Align 4096" and then a "db 4090 dup (?)" and then the code.

NoCforMe

Can someone here explain how these magic numbers work (for example, how you can use them to turn a divide into a multiply)? In a simple way? My non-higher-math head hurts from all this.
Assembly language programming should be fun. That's why I do it.

Biterider

Hi
The web is full of good examples. 
This technique is now part of compiler optimisations.

https://www.bartol.udel.edu/mri/sam/Athlon_code_optimization_guide.pdf   Integer Optimizations
https://www.complang.tuwien.ac.at/papers/ertl19kps.pdf  Good intro
https://gmplib.org/~tege/divcnst-pldi94.pdf  Original publication by Torbjörn Granlund and Peter L. Montgomery (PLDI 1994)

Biterider


NoCforMe

Thanks, but all of those are too technical or too mathematically abstract. Can't someone here explain, in simple terms, how it works?

Biterider

Hi
In chapter "2 Background" of the second link, it is explained very well.

Biterider