Author Topic: MasmBasic  (Read 183265 times)

2B||!2B

  • Regular Member
  • *
  • Posts: 7
Re: MasmBasic
« Reply #435 on: April 19, 2018, 04:51:57 PM »
I have a question. Why do you use FPU/MMX/SSE instructions set so often? are there any advantages by using them rather than using the standard x86 instructions set? like faster execution, smaller code size maybe?

jj2007

  • Moderator
  • Member
  • *****
  • Posts: 8230
  • Assembler is fun ;-)
    • MasmBasic
Re: MasmBasic
« Reply #436 on: April 19, 2018, 05:32:39 PM »
There are no MMX instructions in MasmBasic (MMX code is inefficient, and there are conflicts with the FPU).

For calculations, the FPU is generally being used; it's not faster (and not slower) than the corresponding SSE* instructions, but internally it gives REAL10 precision, while SSE* offers only double precision, i.e. REAL8.

SSE* code is generally not shorter than ordinary x86 (but there are exceptions). However, it's often a lot faster. Take a simple example, getting the length of a zero-delimited string. There is a little thread here showing how the Masm32 team developed the current MasmBasic Len() macro. The thread has 149 replies, meaning it took a while to find the fastest option. Check yourself how fast it is, compared to the strlen function of the C runtime library (remember C is the fastest language on Earth :P):

Code: [Select]
include \masm32\MasmBasic\MasmBasic.inc
  Init
  Recall "\Masm32\include\Windows.inc", L$() ; get a string array
  push eax ; #elements
  xor ebx, ebx
  xor edi, edi
  NanoTimer()
  .Repeat
invoke crt_strlen, L$(ebx)
add edi, eax ; calculate total len
inc ebx
  .Until ebx>=stack
  Print "The CRT needs ", NanoTimer$(), Str$(" for calculating the length of %i strings", ebx), Str$(", a total of %i bytes\n", edi)
  xor ebx, ebx
  xor edi, edi
  NanoTimer()
  .Repeat
add edi, Len(L$(ebx)) ; calculate total len
inc ebx
  .Until ebx>=stack
  pop edx
  Inkey " MB Len needs ", NanoTimer$(), Str$(" for calculating the length of %i strings", ebx), Str$(", a total of %i bytes\n", edi)
EndOfCode

Note that Windows.inc has relatively short strings. The difference between the speed of Len vs crt_strlen is much bigger for long strings. For example, if you apply the code above to a bible.txt file, with longer strings, the difference is twice as much. I won't tell you how much because there a C/C++ coders here in the forum, and they might fall into a deep depression if they see the results :icon_cool: 

Attached a full example, comparing also the Instr() performances of MasmBasic and the CRT (if that sounds interesting, see Benchmark 2: Alice in Wonderland for a CRT strstr vs Boyer-Moore comparison)

2B||!2B

  • Regular Member
  • *
  • Posts: 7
Re: MasmBasic
« Reply #437 on: April 20, 2018, 03:35:12 PM »
Ah ok i got it now thanks for clarifying that. Actually i used some FPU instructions for floating point result (like progress bar 0.# for example) but have never used SSE instructions.  :biggrin:

I won't tell you how much because there a C/C++ coders here in the forum, and they might fall into a deep depression if they see the results :icon_cool: 

Lol, they can still use inline ASM though  :biggrin: