The MASM Forum

Microsoft 64 bit MASM => MASM64 SDK => Topic started by: HSE on June 09, 2024, 08:03:22 AM

Title: cmd_tail
Post by: HSE on June 09, 2024, 08:03:22 AM
Hi all!

I used to think the cmd_tail function didn't work at all. I don't remember what test I ran, but perhaps it failed over almost nothing  :biggrin:

The function in the SDK doesn't restore the saved registers on its normal exit path:
Code ("Original cmd_tail.asm")
    mov rax, r14                        ; return output buffer in RAX
    ret
  zeroexit:
    mov rax, r14
    mov BYTE PTR [rax], 0              ; zero 1st byte of r14
    mov rax, r14                        ; return zeroed buffer in RAX
    RestoreRegs
    ret

That can be changed to:
Code ("Modification cmd_tail.asm")
    mov rax, r14                        ; return output buffer in RAX
    jmp exit                            ; skip zeroexit, go to the shared epilogue
  zeroexit:
    mov rax, r14                        ; return zeroed buffer in RAX
    mov BYTE PTR [rax], 0              ; zero 1st byte of r14
  exit:
    RestoreRegs
    ret

And it works pretty well  :thumbsup:

Regards, HSE.
Title: Re: cmd_tail
Post by: jj2007 on June 11, 2024, 10:04:59 PM
The second instruction here (the mov byte ptr line) works around a serious problem with cmd_tail:
  lbl:
    call cmd_tail ; get commandline
    mov byte ptr [rax+r13], 0 ; put zero delimiter
    mov pFile, rax ; store rax in variable

Among other things, the missing zero delimiter was the reason the Project macros didn't work when teditor was invoked from the command line with the filename as an argument.
Title: Re: cmd_tail
Post by: sinsi on June 12, 2024, 12:01:27 AM
    mov r12, rvcall(GetCommandLine)
    mov r13, len(r12)
    mov r14, alloc(r13)
Does len take into account the zero terminator?
Otherwise alloc would be one byte short (which would probably be swallowed by alignment, but...)
Title: Re: cmd_tail
Post by: jj2007 on June 12, 2024, 03:24:30 AM
The proper solution would clearly be to fix the library source, i.e. cmd_tail.asm
Title: Re: cmd_tail
Post by: HSE on June 12, 2024, 11:59:09 PM
Quote from: sinsi on June 12, 2024, 12:01:27 AM
Does len take into account the zero terminator?

:thumbsup: Fantastic, sinsi. That is the point.

Then it must be:
    mov r12, rvcall(GetCommandLine)
    mov r13, len(r12)
    add r13, 1
    mov r14, alloc(r13)
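
For readers more at home in C, the same off-by-one can be sketched as follows. This is only an analogue, not the SDK's code: copy_tail is a hypothetical helper, and it assumes that len, like strlen, does not count the terminating zero, which is what this thread concludes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the MASM64 SDK.
   Copies src into a freshly allocated buffer that has room
   for the text plus its terminating zero byte. */
static char *copy_tail(const char *src)
{
    assert(src != NULL);

    size_t n = strlen(src);       /* like len(r12): excludes the terminator */
    char *buf = malloc(n + 1);    /* like "add r13, 1" before alloc(r13)    */
    if (buf == NULL)
        return NULL;

    memcpy(buf, src, n);          /* copy the n visible characters          */
    buf[n] = '\0';                /* explicit zero delimiter at offset n    */
    return buf;
}
```

With malloc(n) instead of malloc(n + 1), the write to buf[n] would land one byte past the allocation. As noted above, allocator alignment may hide that, but it is still out of bounds.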
Title: Re: cmd_tail
Post by: Rockphorr on August 30, 2024, 01:34:40 AM
Quote from: HSE on June 12, 2024, 11:59:09 PM
Quote from: sinsi on June 12, 2024, 12:01:27 AM
Does len take into account the zero terminator?

:thumbsup: Fantastic, sinsi. That is the point.

Then it must be:
    mov r12, rvcall(GetCommandLine)
    mov r13, len(r12)
    add r13, 1
    mov r14, alloc(r13)

Why use add instead of inc?
Title: Re: cmd_tail
Post by: NoCforMe on August 31, 2024, 04:57:24 AM
Because it'll make it 0.00001% faster?
Title: Re: cmd_tail
Post by: sinsi on August 31, 2024, 09:06:53 AM
Quote from: Rockphorr on August 30, 2024, 01:34:40 AM
Why use add instead of inc?

Long ago, a guru told us to
Quote
16.2 INC and DEC
The INC and DEC instructions do not modify the carry flag but they do modify the other
arithmetic flags. Writing to only part of the flags register costs an extra µop on some CPUs.
It can cause a partial flags stalls on some older Intel processors if a subsequent instruction
reads the carry flag or all the flag bits. On all processors, it can cause a false dependence
on the carry flag from a previous instruction.
Use ADD and SUB when optimizing for speed. Use INC and DEC when optimizing for size or
when no penalty is expected.
Title: Re: cmd_tail
Post by: NoCforMe on August 31, 2024, 09:15:17 AM
So what I wrote was true ...
Title: Re: cmd_tail
Post by: sinsi on August 31, 2024, 10:31:39 AM
Quote from: NoCforMe on August 31, 2024, 09:15:17 AM
So what I wrote was true ...
Not sure about the percentage, but yes, for a particular series of CPUs (early P4s, I think).
Title: Re: cmd_tail
Post by: jj2007 on August 31, 2024, 03:28:21 PM
Quote from: sinsi on August 31, 2024, 09:06:53 AM
Use ADD and SUB when optimizing for speed. Use INC and DEC when optimizing for size or
when no penalty is expected.

We've tested that several times in the Lab. No difference.
Title: Re: cmd_tail
Post by: Rockphorr on September 01, 2024, 04:11:38 PM
Quote from: jj2007 on August 31, 2024, 03:28:21 PM
Quote from: sinsi on August 31, 2024, 09:06:53 AM
Use ADD and SUB when optimizing for speed. Use INC and DEC when optimizing for size or
when no penalty is expected.

We've tested that several times in the Lab. No difference.

So it is just a matter of preference.