"Hello masm32", not a BOT, new member

Started by LordAdef, January 22, 2017, 09:42:24 AM


LordAdef

JJ:
Quote:
So you get an assembler error. Where did you find numbers.asm? Which error, which line? Same error in RichMasm?

Numbers.asm is the demo 5 in masm32 tutorials. Hutch already pinpointed the issue. masm32rt is the way to go!

LordAdef

Hutch
Quote:
If you wanted to use the older form in the original, it would look like this.

Thanks! I'm fine with the modern way. I bloody love masm32rt.inc!!!!

Sorry to bother you guys!

jj2007

Quote from: LordAdef on January 26, 2017, 12:03:12 PM
Numbers.asm is the demo 5 in masm32 tutorials.

Glad you solved it :t

Tmp_File.asm(73) : Error A2159: INVOKE requires prototype for procedure
sval(2)[macros.asm]

      sval MACRO lpstring
        IFNDEF __UNICODE__
          invoke crt_atol,reparg(lpstring)
        ELSE
          invoke crt__wtol,reparg(lpstring)
        ENDIF
        EXITM <eax>
      ENDM

LordAdef

Hi guys,

So... I've been doing my homework... and I wrote this small (and useless) program. My first!

I wanted to do something very simple, so that I could finish it and show it to you gurus.

It DOES work!!

Your comments will be invaluable, from every point of view... Not just on ways to optimize it but also on how it's commented and presented.

It was all coded in qEditor and I didn't need debugging for this one.

Please, be kind, I'm actually quite proud of my small little beast  :biggrin:

Any comment will be of great help.
Cheers

Alex
ps: .zip file attached

jj2007

Hi Alex,
Nice effect, it works like a charm :t

Little suggestions:
    mov esi, offset Ln1
 
    ; xor eax, eax ; will be immediately overwritten
    mov eax, SIZEOF Ln1 ; no Ssize needed                  ;Get item based on LoopC index
    mul LoopC
    add esi, eax                    ;ADD size => next line

    print esi, 13, 10               ;PRINT next

    inc LoopC                       ;increment Loop
    ; xor edx, edx                    ;Get =>LoopC mod LINES
    mov eax, LoopC
    cdq ; sign-extends eax into edx: edx becomes zero as long as eax is non-negative; one byte shorter than xor edx, edx
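
To make the cdq remark concrete, here is a minimal sketch of the modulus step as a complete program. The value LINES = 5 and the starting value of LoopC are assumptions chosen for illustration, not taken from the attached source. Note that cdq copies the sign bit of eax into every bit of edx, so it only clears edx while eax is non-negative.

```asm
include \masm32\include\masm32rt.inc

LINES equ 5                     ; assumed line count, for illustration only

.data
  LoopC dd 7                    ; assumed starting value

.code
start:
    mov eax, LoopC
    cdq                         ; edx = 0 here because eax is non-negative
    mov ecx, LINES
    div ecx                     ; unsigned divide of edx:eax by ecx: eax = quotient, edx = remainder
    mov LoopC, edx              ; LoopC = LoopC mod LINES
    print str$(LoopC), 13, 10   ; show the remainder
    exit
end start
```

If LoopC could ever have its top bit set, xor edx, edx would be the safe choice before an unsigned div.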


For assigning small numbers (-128 ... +127) to registers, there is the m2m macro:
m2m eax, 123
mov eax, 123


Same result, but m2m is 2 bytes shorter (3 bytes instead of 5). Do not use it in a speed-critical innermost loop (i.e. one with more than a million iterations).
m2m means "memory to memory", it does a
  push 123
  pop eax
which is not exactly mem to mem but it works :P
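
As a rough sketch of where the 2 bytes come from, the program below writes out both forms by hand with their standard 32-bit encodings in the comments. The push/pop pair is spelled out instead of calling the macro, so that the expansion is visible; the comparison and the message are only there to make the example a complete program.

```asm
include \masm32\include\masm32rt.inc

.code
start:
    mov eax, 123            ; B8 7B 00 00 00 : 5 bytes (opcode + 32-bit immediate)
    mov ebx, eax            ; keep the first result for comparison

    push 123                ; 6A 7B          : 2 bytes (opcode + sign-extended 8-bit immediate)
    pop eax                 ; 58             : 1 byte

    .if eax == ebx
        print "same value", 13, 10
    .endif
    exit
end start
```

The saving only exists for immediates in the range -128 ... +127, because larger values force push to carry a full 32-bit immediate.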

You can eliminate one jump, but pay attention to the logic:
    .ELSE
        inc LineC


We all love eliminating jumps 8)

LordAdef

Thank you so much JJ!!!!

I made all the changes, and also managed to eliminate the jump:

Quote:
.IF LineC==LINES                ;AFTER lines are printed
        inc LineOffSet              ;increment LineOffSet and make it modular
        xor edx, edx
        mov eax, LineOffSet
        mov ebx, LINES
        div ebx
        mov LoopC, edx              ;add LineOffSet to LoopC, so to increment the starting line
       
        mov LineC, 0                ;Set to 0 instead of 1, to eliminate the jump
        loc 0,0                     ;Return console cursor to line 0
    .ENDIF
   
    inc LineC   

I'm thinking that maybe I could eliminate LineC. I'm using LoopC for the modulus and LineC to keep track of my lines. Maybe I could increment a LOCAL variable (or a safe register) based on LoopC and throw LineC away. Not sure if it will take off any considerable load, though.


LordAdef

Yes, I eliminated LineC and am using edi for the job. I guess I get some boost from this.

Any place you think I should use m2m in the code?

jj2007

Quote from: LordAdef on January 30, 2017, 09:50:05 PM
Any place you think I should use m2m in the code?

m2m eax, SIZEOF Ln1

But really, this is just a habit of old assembler programmers who like optimising ;)

Other example:
         and LineC, 0                ;reset LineC
3 bytes shorter.

(some may argue that this is a) slower and b) useless, because there is more than enough RAM around. Truth is that it's no good in a tight innermost loop; but anywhere else it is good, because the instruction cache is very small, and it may make a difference if a whole loop fits into that cache because you saved a byte here and there)
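
A small sketch of the size difference, with the standard 32-bit encodings in the comments. The starting value of LineC is an assumption for illustration. Keep in mind that and on a memory operand has to read the variable before writing it back, while mov only writes, which is one reason it can be the slower choice in a hot loop.

```asm
include \masm32\include\masm32rt.inc

.data
  LineC dd 5                    ; assumed non-zero start value

.code
start:
    mov LineC, 0                ; C7 05 <addr32> 00 00 00 00 : 10 bytes
    mov LineC, 5                ; make it non-zero again for the second variant
    and LineC, 0                ; 83 25 <addr32> 00          :  7 bytes
    print str$(LineC), 13, 10   ; LineC is zero again at this point
    exit
end start
```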

LordAdef

Thanks JJ, ALL comments well noted and learned!

hutch--

Just a comment on the macro "m2m". The rough distinction is you use m2m in high level code to make it more compact and easier to read. In any code that is time critical it is faster to use a register.

mov reg, memory2
mov memory1, reg
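
For comparison, a minimal sketch showing both ways of copying one variable to another. The variable names source, dest1 and dest2 are made up for this example; m2m is the masm32 macro discussed above, which expands to a push of the source followed by a pop into the destination.

```asm
include \masm32\include\masm32rt.inc

.data
  source dd 1234                ; hypothetical variables, for illustration only
  dest1  dd 0
  dest2  dd 0

.code
start:
    m2m dest1, source           ; compact form: push source / pop dest1

    mov eax, source             ; register form: load into a register...
    mov dest2, eax              ; ...then store it

    print str$(dest1), 13, 10
    print str$(dest2), 13, 10
    exit
end start
```

Both variables end up holding the same value; the difference is only in code size and, in tight loops, in speed.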

It's common from the 16-bit DOS era to pinch bytes here and there, but we have very different hardware and OS conditions these days, and you generally chase speed before size: the size difference is trivial, while the speed difference is worth having. Speed comes from picking the right algorithm, coding the algorithm carefully with the right choice of instructions, minimising the number of branches and keeping memory access to a minimum (memory is slow in comparison to registers).

This stuff comes with practice and the CLOCK; when speed matters, your only real friend is the CLOCK. It's like in motor racing: when the flag drops, the bullshit stops.

LordAdef

Thanks Hutch! I see what you and JJ mean. I incorporated "m2m" and "and" in my code. I imagine these principles become more relevant in larger programs, where saving bytes here and there will count (as pointed out by JJ).

LordAdef

My first prog was reading the lines as variables from the data section. I thought it would be nice if it read the data from a file.

So I coded my prog2. This time I made a procedure, since I intend to integrate this LoadFile proc into the other code.
I used m2m and AND, following JJ's observation.

It was a nice exercise. I had to deal with things I haven't touched before.
I am also learning that Assembly is a severe father! It took me half an hour to realize I had written "map1.text" instead of "map1.txt". I felt like a complete idiot, since I almost went nuts trying to understand what was wrong...

I am sending the proc an offset of my string (the file name).
I'm using:
Quote:
pLoad proc, string:LPSTR

I tried DWORD and it worked too. I found out about LPSTR searching our forum, and it was the recommended one. Could you clarify this for me?

Once again, thank you for all this support.
Alex

ps: I may not touch any C for the next decade.... Assembly is beautiful :eusa_boohoo:

jj2007

Quote from: hutch-- on January 31, 2017, 10:47:46 AM
In any code that is time critical it is faster to use a register ... your only real friend is the CLOCK

Correct. Just to illustrate this:

include \masm32\MasmBasic\MasmBasic.inc      ; download
  Init
  loopcount=100000000
  PrintCpu 0
  Print Str$("\nTimings for %i loops:\n", loopcount)
  REPEAT 5
      NanoTimer()
      mov ecx, loopcount
      .Repeat
            mov eax, 127
            dec ecx
      .Until Zero?
      Print Str$("%i ms for mov\n", NanoTimer(ms))
      NanoTimer()
      mov ecx, loopcount
      .Repeat
            m2m eax, 127
            dec ecx
      .Until Zero?
      Print Str$("%i ms for m2m\n\n", NanoTimer(ms))
  ENDM
EndOfCode


Results:
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz

Timings for 100000000 loops:
35 ms for mov
69 ms for m2m

34 ms for mov
69 ms for m2m

35 ms for mov
103 ms for m2m

35 ms for mov
70 ms for m2m

34 ms for mov
68 ms for m2m


In short: You need ONE-HUNDRED MILLION iterations to demonstrate that m2m is "slower" than mov.
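
For readers without MasmBasic, the same comparison can be sketched with plain masm32 and GetTickCount. This is an assumption-laden variant, not the code that produced the timings above: GetTickCount has a coarse resolution (typically 10 to 16 ms), so only large loop counts give usable numbers, and the constant name LOOP_COUNT is made up here.

```asm
include \masm32\include\masm32rt.inc

LOOP_COUNT equ 100000000        ; same loop count as in the post above

.code
start:
    invoke GetTickCount
    mov esi, eax                ; start time; esi is preserved across API calls
    mov ecx, LOOP_COUNT
  @@:
    mov eax, 127
    dec ecx
    jnz @B
    invoke GetTickCount
    sub eax, esi                ; elapsed milliseconds
    print str$(eax), " ms for mov", 13, 10

    invoke GetTickCount
    mov esi, eax
    mov ecx, LOOP_COUNT
  @@:
    push 127                    ; the pair that m2m eax, 127 expands to
    pop eax
    dec ecx
    jnz @B
    invoke GetTickCount
    sub eax, esi
    print str$(eax), " ms for push/pop", 13, 10
    exit
end start
```

Run it several times and compare the spread between runs before drawing conclusions from any single pair of numbers.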

LordAdef

Quote:
In short: You need ONE-HUNDRED MILLION iterations to demonstrate that m2m is "slower" than mov

Incredible! Noted
Also, you just taught me how to profile my code... I was missing this, JJ. This may be my entry point to MasmBasic

jj2007

Quote from: LordAdef on February 01, 2017, 07:44:37 PM
I found out about LPSTR searching our forum, and it was the recommended one. Could you clarify this for me?

Windows.inc: LPSTR typedef DWORD

It's exactly the same type. 90% of all code in assembler deals with dwords, the rest being REAL4/8/10, byte and word variables.
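
A minimal sketch of that equivalence: two procedures that differ only in how the parameter is typed, both called with the same offset. The procedure names PrintViaLPSTR and PrintViaDWORD and the string are hypothetical, made up for this example.

```asm
include \masm32\include\masm32rt.inc

PrintViaLPSTR PROTO :LPSTR      ; hypothetical procs, for illustration only
PrintViaDWORD PROTO :DWORD

.data
  fname db "map1.txt", 0

.code
start:
    invoke PrintViaLPSTR, offset fname
    invoke PrintViaDWORD, offset fname
    exit

PrintViaLPSTR proc pText:LPSTR  ; LPSTR is a typedef for DWORD in Windows.inc
    print pText, 13, 10
    ret
PrintViaLPSTR endp

PrintViaDWORD proc pText:DWORD  ; same 4-byte argument, only the name of the type differs
    print pText, 13, 10
    ret
PrintViaDWORD endp

end start
```

Both calls push the same 4-byte address, so the choice between the two spellings is about documenting intent, not about the generated code.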

Now what is right here? Raise the question and find yourself in an ideological war between the old "hey, it's 4 bytes, isn't it?" faction and those who come from C/C++ and see their world crumbling if you "mistype" LPSTR as DWORD :P

Btw if there is a real problem, the assembler will shout at you, at least in the 32-bit world. So don't worry.

Re LoadFile: Works like a charm, well done :t

You over-optimise a little bit, though. Once your code is running, it's always nice to see it through the eyes of a debugger...
pLoad proc, string:LPSTR       ;=============== Load, Parse file & fill array ========
    int 3                                       ;make Olly stop here when hitting F9
    and ebx,0                                   ;loop
    xor ebx, ebx                                ;shorter way of zeroing a register
    m2m esi, InputFile(string)                  ;Source file (ecx ret. size)         
    m2m edi, offset Map                         ;Destination
    mov edi, offset Map                         ;shorter - offsets are big numbers ;-)
   @@:
    m2m ecx, cWidth                             ;shorter, cWidth is a small number
    mov ecx, cWidth                             ;length of line
    rep movsb                                   ;copy
    inc edi                                     ;Skips 0s
    inc ebx                                     ;loop
    cmp ebx, cHeight
    jb @B
    free esi                                    ;closes it
    ret
pLoad endp



00401032        │.  CC              int3
00401033        │.  83E3 00         and ebx, 00000000
00401036        │.  33DB            xor ebx, ebx
00401038        │.  68 14224000     push offset 00402214           ; ┌Arg3 = LoadFile.402214
0040103D        │.  68 10224000     push offset 00402210           ; │Arg2 = LoadFile.402210
00401042        │.  FF75 08         push dword ptr [ebp+8]         ; │Arg1 => [Arg1]
00401045        │.  E8 C6000000     call 00401110                  ; └LoadFile.00401110
0040104A        │.  8B0D 14224000   mov ecx, [402214]
00401050        │.  A1 10224000     mov eax, [402210]
00401055        │.  50              push eax                       ; m2m esi, InputFile(string)
00401056        │.  5E              pop esi
00401057        │.  68 00204000     push offset 00402000
0040105C        │.  5F              pop edi
0040105D        │.  BF 00204000     mov edi, offset 00402000
00401062        │>  6A 0A           ┌push 0A
00401064        │.  59              │pop ecx
00401065        │.  B9 0A000000     │mov ecx, 0A
0040106A        │.  F3:A4           │rep movsb


Note, for example, that m2m esi, InputFile(string) results in a push eax, pop esi sequence; mov esi, eax has the same size (2 bytes), so there is nothing to gain here. In fact, the only valid cases for m2m are 1. memory-to-memory transfers between variables (that's why the macro was written) and 2. mov reg32, small integer (-128...+127).

A hard-core optimiser, btw, would save one byte by using xchg InputFile(string), esi:
00401045        │.  E8 C6000000     call 00401110                  ; └LoadFile.00401110, read_disk_file
0040104A        │.  8B0D 14224000   mov ecx, [402214]
00401050        │.  A1 10224000     mov eax, [402210]
00401055        │.  96              xchg eax, esi                  ; m2m esi, InputFile(string)


Don't make this a habit, though: You risk losing a lot of time chasing single bytes instead of concentrating on good code.

To better understand the disassembly, here is an excerpt from \masm32\macros\macros.asm:

      InputFileA MACRO lpFile
...
        invoke read_disk_file,reparg(lpFile),
               ADDR ipf@__@mem@__@PtrA,
               ADDR ipf@__file__@lenA

        mov ecx, ipf@__file__@lenA   ;; file length returned in ECX
        mov eax, ipf@__@mem@__@PtrA  ;; address of memory returned in EAX
        EXITM <eax>
      ENDM