Title: New HJWasm release
Post by: habran on May 17, 2016, 06:30:15 AM

Hello everyone,
there is a new HJWasm release on the Terraspace (http://www.terraspace.co.uk/hjwasm.html) with some sophisticated improvements 8)
Built in two new options for the .SWITCH block:
option SWITCHSTYLE : ASMSTYLE (default)
option SWITCHSTYLE: CSTYLE (optional)
hutch, I hope you will be happy with this one, now you have ASMSTYLE .SWITCH block, thank you for your suggestion :t
Multiple cases can be listed on one .CASE line, separated by commas ',', and continued on a new line if they don't fit on one.
If you don't need .DEFAULT, it can be omitted in both ASMSTYLE and CSTYLE.
This version also reduces memory consumption for multiple cases: it creates only one jump table for all cases in both styles.
You can switch to the other style and back to the first one as many times as you want using option SWITCHSTYLE.
Here are some examples:

Quote

mov eax, 184h
.switch eax
.case 179h,180h,1c5h,17bh,17dh,
182h,184h,185h
mov edx,1d5h
.case 1d3h
mov edx, 1d3h
.case 1f4h
mov edx, 1f4h
.case 200h
mov edx, 200h
.case 201h
mov edx, 201h
.case 202h
mov edx, 202h
.case 203h
mov edx, 203h
.default
mov edx, 0
.endswitch

option SWITCHSTYLE: CSTYLE

mov eax, 184h
.switch eax
.case 179h, 180h, 1c5h, 17bh, 17dh,
182h, 184h, 185h
mov edx, 1d5h
.break
.case 1d3h
mov edx, 1d3h
.break
.case 1f4h
mov edx, 1f4h
.break
.case 200h
mov edx, 200h
.break
.case 201h
mov edx, 201h
.break
.case 202h
mov edx, 202h
.break
.case 203h
mov edx, 203h
.break
.default
mov edx, 0
.break
.endswitch

option SWITCHSTYLE : ASMSTYLE

.switch bl
.case 202
mov edx, 202
.case 203
mov edx, 203
.case 213
mov edx, 213
.endswitch

Title: Re: New HJWasm release
Post by: jj2007 on May 17, 2016, 09:32:47 AM

Well done :t

With my 800 lines testbed, it's much faster than ML and considerably faster than Japheth's last JWasm version:

Code Select

  OxPT_Assembler  	mlv615	; 44.0kB, 1070 ms
  OxPT_Assembler  	mlv10	; 44.0kB, 1070 ms
  OxPT_Assembler  	JWasm	; 44.0kB, 650 ms
  OxPT_Assembler  	HJWasm32	; 44.0kB, 580 ms
  OxPT_Assembler  	HJWasm64	; 44.0kB, 580 ms
  OxPT_Assembler  	asmc	; 44.0kB, 480 ms

Note that there is no measurable speed difference between the 32-bit and 64-bit versions. The latter is 63% fatter, though ;)

Same pattern with the RichMasm source (17k lines):

Code Select

OxPT_Assembler	mlv10	; 1200 ms
OxPT_Assembler	mlv615	; 1200 ms
OxPT_Assembler	JWasm		; 880 ms
OxPT_Assembler	HJWasm32	; 820 ms
OPT_Assembler	HJWasm64	; 820 ms
OxPT_Assembler	AsmC	; 740 ms

Now the surprise when building the MasmBasic library (28k lines):

Code Select

OPT_Assembler	mlv615		; 6.9 secs
OxPT_Assembler	JWasm	; 5.5 secs
OxPT_Assembler	HJWasm32	; 2.8
OxPT_Assembler	HJWasm64	; 4.2 secs
OxPT_Assembler	AsmC		; 2.1 secs

The 32-bit version is consistently 50% faster 8)
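
A quick way to read the "50% faster" figure is to work it out from the library build times in the table above (2.8 s for HJWasm32, 4.2 s for HJWasm64). The Python sketch below does just that; the function names are made up for illustration and are not part of any tool mentioned in this thread.

```python
def speed_ratio(fast_secs: float, slow_secs: float) -> float:
    """How many times faster the quicker build is (throughput ratio)."""
    return slow_secs / fast_secs


def time_saved(fast_secs: float, slow_secs: float) -> float:
    """Fraction of wall-clock time saved by using the quicker build."""
    return (slow_secs - fast_secs) / slow_secs


hjwasm32 = 2.8  # seconds, MasmBasic library build (from the table above)
hjwasm64 = 4.2  # seconds, same build

print(f"speed ratio: {speed_ratio(hjwasm32, hjwasm64):.2f}x")
print(f"time saved:  {time_saved(hjwasm32, hjwasm64):.0%}")
```

So "50% faster" holds as a throughput ratio (1.5x), while the wall-clock time saved is about one third.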

Title: Re: New HJWasm release
Post by: habran on May 17, 2016, 01:58:39 PM

Thanks jj2007 :biggrin:
HJWasm32 has to deal with less code, so that may be the reason,
but I suspect that your machine is more experienced in running 32 bit and hence the speed ;)
Here is an example of how much the .SWITCH block has been improved:

Quote

This code:
mov eax, 1c5h
.switch eax
.case 179h,17bh,17dh,182h,184h,187h,18bh,191h,198h,
1a0h,1a2h,1a4h,1a7h,1ach,1afh,1b3h,1b5h,1b8h,
1bch,1c5h,1c8h,1cbh,1cdh,1cfh,1d1h,1d3h,1d5h,
1d7h,1d9h,1dbh,1f2h,1f4h,200h
mov edx, 200h
.case 201h
mov edx, 201h
.case 202h
mov edx, 202h
.case 203h
mov edx, 203h
.default
mov edx, 0
.endswitch

Now it makes this:
?_021 LABEL NEAR
sub rsp, 472 ; 40001165 _ 48: 81. EC, 000001D8
mov eax, 453 ; 4000116C _ B8, 000001C5
jmp ?_023 ; 40001171 _ EB, 32

; Note: No jump seems to point here
mov edx, 512 ; 40001173 _ BA, 00000200
jmp ?_026 ; 40001178 _ E9, 000000FF

; Note: No jump seems to point here
mov edx, 513 ; 4000117D _ BA, 00000201
jmp ?_026 ; 40001182 _ E9, 000000F5

; Note: No jump seems to point here
mov edx, 514 ; 40001187 _ BA, 00000202
jmp ?_026 ; 4000118C _ E9, 000000EB

; Note: No jump seems to point here
mov edx, 515 ; 40001191 _ BA, 00000203
jmp ?_026 ; 40001196 _ E9, 000000E1

?_022: mov edx, 0 ; 4000119B _ BA, 00000000
jmp ?_026 ; 400011A0 _ E9, 000000D7

?_023: cmp eax, 515 ; 400011A5 _ 3D, 00000203
ja ?_022 ; 400011AA _ 77, EF
sub eax, 377 ; 400011AC _ 2D, 00000179
jc ?_022 ; 400011B1 _ 72, E8
lea rdx, ptr [?_025] ; 400011B3 _ 48: 8D. 15, 00000037(rel)
movzx rax, byte ptr [rax+rdx] ; 400011BA _ 48: 0F B6. 04 10
lea rdx, ptr [?_024] ; 400011BF _ 48: 8D. 15, 00000003(rel)
jmp qword ptr [rdx+rax*8] ; 400011C6 _ FF. 24 C2

?_024 label qword ; switch/case jump table
dq Unnamed_80000000_0 ; 400011C9 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400011D1 _ 000000014000117D (d)
dq Unnamed_80000000_0 ; 400011D9 _ 0000000140001187 (d)
dq Unnamed_80000000_0 ; 400011E1 _ 0000000140001191 (d)
dq Unnamed_80000000_0 ; 400011E9 _ 000000014000119B (d)
[?_025] label byte
db 00 04 00 04 00 04 04 04 04 00 04 00 04 04 00 04 04 04 00 04 04 04
db 04 04 00 04 04 04 04 04 04 00 04 04 04 04 04 04 04 00 04 00 04 00
db 04 04 00 04 04 04 04 00 04 04 00 04 04 04 00 04 00 04 04 00 04 04
db 04 00 04 04 04 04 04 04 04 04 00 04 04 00 04 04 00 04 00 04 00 04
db 00 04 00 04 00 04 00 04 00 04 00 04 04 04 04 04 04 04 04 04 04 04
db 04 04 04 04 04 04 04 04 04 04 04 00 04 00 04 04 04 04 04 04 04 04
db 04 04 04 00 01 02 03 b8 44 00 00 00 eb 66 e8 10 fe ff ff e9 aa 00

Before:

?_021 LABEL NEAR
sub rsp, 472 ; 40001165 _ 48: 81. EC, 000001D8
mov eax, 453 ; 4000116C _ B8, 000001C5
jmp ?_023 ; 40001171 _ EB, 32

; Note: No jump seems to point here
mov edx, 512 ; 40001173 _ BA, 00000200
jmp ?_026 ; 40001178 _ E9, 000001FF

; Note: No jump seems to point here
mov edx, 513 ; 4000117D _ BA, 00000201
jmp ?_026 ; 40001182 _ E9, 000001F5

; Note: No jump seems to point here
mov edx, 514 ; 40001187 _ BA, 00000202
jmp ?_026 ; 4000118C _ E9, 000001EB

; Note: No jump seems to point here
mov edx, 515 ; 40001191 _ BA, 00000203
jmp ?_026 ; 40001196 _ E9, 000001E1

?_022: mov edx, 0 ; 4000119B _ BA, 00000000
jmp ?_026 ; 400011A0 _ E9, 000001D7

?_023: cmp eax, 515 ; 400011A5 _ 3D, 00000203
ja ?_022 ; 400011AA _ 77, EF
sub eax, 377 ; 400011AC _ 2D, 00000179
jc ?_022 ; 400011B1 _ 72, E8
lea rdx, ptr [?_025] ; 400011B3 _ 48: 8D. 15, 00000137(rel)
movzx rax, byte ptr [rax+rdx] ; 400011BA _ 48: 0F B6. 04 10
lea rdx, ptr [?_024] ; 400011BF _ 48: 8D. 15, 00000003(rel)
jmp qword ptr [rdx+rax*8] ; 400011C6 _ FF. 24 C2

?_024 label qword ; switch/case jump table
dq Unnamed_80000000_0 ; 400011C9 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400011D1 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400011D9 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400011E1 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400011E9 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400011F1 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400011F9 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001201 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001209 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001211 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001219 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001221 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001229 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001231 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001239 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001241 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001249 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001251 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001259 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001261 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001269 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001271 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001279 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001281 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001289 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001291 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 40001299 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400012A1 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400012A9 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400012B1 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400012B9 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400012C1 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400012C9 _ 0000000140001173 (d)
dq Unnamed_80000000_0 ; 400012D1 _ 000000014000117D (d)
dq Unnamed_80000000_0 ; 400012D9 _ 0000000140001187 (d)
dq Unnamed_80000000_0 ; 400012E1 _ 0000000140001191 (d)
dq Unnamed_80000000_0 ; 400012E9 _ 000000014000119B (d)

db 00 24 01 24 02 24 24 24 24 03 24 04 24 24 05 24 24 24 06 24 24 24
db 24 24 07 24 24 24 24 24 24 08 24 24 24 24 24 24 24 09 24 0a 24 0b
db 24 24 0c 24 24 24 24 0d 24 24 0e 24 24 24 0f 24 10 24 24 11 24 24
db 24 12 24 24 24 24 24 24 24 24 13 24 24 14 24 24 15 24 16 24 17 24
db 18 24 19 24 1a 24 1b 24 1c 24 1d 24 24 24 24 24 24 24 24 24 24 24
db 24 24 24 24 24 24 24 24 24 24 24 1e 24 1f 24 24 24 24 24 24 24 24
db 24 24 24 20 21 22 23 b8 44
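
The two dumps above differ mainly in the size of the jump table: 5 qword entries now versus 37 before, while the byte index table keeps one byte per value in the range 179h..203h. The Python model below is only a sketch of that idea, not HJWasm's actual implementation; `build_tables`, `dispatch` and the string targets are hypothetical names chosen for illustration.

```python
def build_tables(case_groups, dedupe=True):
    """Build (index_table, jump_table, low) for a two-level switch dispatch.

    case_groups: ordered list of (values, target), one pair per .case line.
    With dedupe=True all values of one .case share a single jump-table slot;
    with dedupe=False every value gets its own slot (the older layout).
    The default target always occupies the last slot.
    """
    all_values = [v for values, _ in case_groups for v in values]
    low, high = min(all_values), max(all_values)

    jump_table = []
    slot_of_value = {}
    for values, target in case_groups:
        if dedupe:
            jump_table.append(target)
        for v in values:
            if not dedupe:
                jump_table.append(target)
            slot_of_value[v] = len(jump_table) - 1
    jump_table.append("default")
    default_slot = len(jump_table) - 1

    # One small index per value in [low, high]; gaps point at the default slot.
    index_table = [slot_of_value.get(v, default_slot)
                   for v in range(low, high + 1)]
    return index_table, jump_table, low


def dispatch(value, index_table, jump_table, low):
    """Range check, then index lookup, then jump-table lookup."""
    offset = value - low
    if offset < 0 or offset >= len(index_table):
        return "default"
    return jump_table[index_table[offset]]


FIRST = [0x179, 0x17B, 0x17D, 0x182, 0x184, 0x187, 0x18B, 0x191, 0x198,
         0x1A0, 0x1A2, 0x1A4, 0x1A7, 0x1AC, 0x1AF, 0x1B3, 0x1B5, 0x1B8,
         0x1BC, 0x1C5, 0x1C8, 0x1CB, 0x1CD, 0x1CF, 0x1D1, 0x1D3, 0x1D5,
         0x1D7, 0x1D9, 0x1DB, 0x1F2, 0x1F4, 0x200]
GROUPS = [(FIRST, "mov edx, 200h"), ([0x201], "mov edx, 201h"),
          ([0x202], "mov edx, 202h"), ([0x203], "mov edx, 203h")]

new_index, new_jump, low = build_tables(GROUPS, dedupe=True)
old_index, old_jump, _ = build_tables(GROUPS, dedupe=False)
print(len(new_jump) * 8, "bytes of jump table with shared slots")
print(len(old_jump) * 8, "bytes of jump table with one slot per value")
print(dispatch(0x1C5, new_index, new_jump, low))
```

In this model the index table has 139 entries in both layouts, which matches the number of bytes before the trailing code bytes in the dumps; only the qword table shrinks.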

Title: Re: New HJWasm release
Post by: jj2007 on May 17, 2016, 04:46:59 PM

Quote from: habran on May 17, 2016, 01:58:39 PM
Thanks jj2007 :biggrin:
HJWasm32 has to deal with less code, so that may be the reason,
but I suspect that your machine is more experienced in running 32 bit and hence the speed ;)
Here is an example of how much the .SWITCH block has been improved:

I would argue about the "more code" logic if my 27k lines of code consisted significantly of .switch structures, but for compatibility reasons I am still using good ol' Switch_ (http://masm32.com/board/index.php?topic=94.msg57249#msg57249) (remind me to set up a speed & size comparison between Switch_ and .switch ...)

But the "more experienced in running 32 bit" argument is certainly logical :lol:

Title: Re: New HJWasm release
Post by: TWell on May 17, 2016, 05:49:44 PM

@jj2007
here is PellesC 8 x64 version for speed test.

Title: Re: New HJWasm release
Post by: jj2007 on May 17, 2016, 09:18:53 PM

@TWell: library build exactly the same as HJWasm32, RichMasm a bit faster:

Code Select

OxPT_Assembler	JWasm		; 880 ms
OxPT_Assembler	HJWasm32	; 820 ms
OxPT_Assembler	HJWasm64	; 820 ms
OxPT_Assembler	HJwasm64poc	; 770 ms

Title: Re: New HJWasm release
Post by: habran on May 18, 2016, 06:15:47 AM

jj2007,
Test this one please

Title: Re: New HJWasm release
Post by: jj2007 on May 18, 2016, 08:00:58 AM

Quote from: habran on May 18, 2016, 06:15:47 AM
jj2007,
Test this one please

As fast as the 32-bit version with the RichMasm source,
but with the library, the 32-bit version is exactly 50% faster.

Title: Re: New HJWasm release
Post by: TWell on May 18, 2016, 08:16:14 AM

My test with m32lib *.asm files

Code Select

HJWasm64.exe -c -coff -q \masm32\m32lib\*.asm

Code Select


HJWasm32.exe	16.982s
HJWasmGcc.exe	16.712s
HJWasm64poc.exe	21.224s
HJWasm.exe	29.547s
HJWasm64.exe	30.14s

This is odd :icon_confused:

Title: Re: New HJWasm release
Post by: habran on May 18, 2016, 08:32:05 AM

OK, the last one was built with VS15 with full optimization,
this one is built with GCC via C:B.
Let's see which one is faster.

Title: Re: New HJWasm release
Post by: jj2007 on May 18, 2016, 09:13:03 AM

This one, MB library:
last one 2.77 secs (version 504866 bytes, 18 May)
32-bit 3.15 secs (version 361472 bytes, 16 May)

VS15 has fantastic optimisations :eusa_boohoo:

Btw the last one loads remarkably well with OllyDbg, a well-known 32-bit debugger 8)

Title: Re: New HJWasm release
Post by: habran on May 18, 2016, 09:25:01 AM

Does that mean that GCC is the best version?

Title: Re: New HJWasm release
Post by: TWell on May 18, 2016, 04:25:26 PM

32-bit gcc version is fast.
m32lib again with another PC.

Code Select

HJWasmGcc.exe    7.368s 7.659s
HJWasm32.exe     9.153s 9.729s
HJWasm64poc.exe  10.626s 10.664s
HJWasm64Gcc.exe  16.941s 17.214s

64-bit not so fast, compiled with v 5 with option -O2

Code Select

@SET SRC=..\HJWasm-master
@SET CC=gcc -c -O2 -m64 -I..\HJWasm-master\H -DWIN64=1
@REM hjwasm64gcc.exe:
	for %%c in (%SRC%\*.c) do %CC% %%c

gcc -s main.o apiemu.o assemble.o assume.o atofloat.o backptch.o bin.o branch.o cmdline.o codegen.o coff.o condasm.o context.o cpumodel.o data.o dbgcv.o directiv.o elf.o end.o equate.o errmsg.o expans.o expreval.o extern.o fastpass.o fixup.o fpfixup.o hll.o input.o invoke.o label.o linnum.o listing.o loop.o lqueue.o macro.o mangle.o memalloc.o msgtext.o omf.o omffixup.o omfint.o option.o parser.o posndir.o preproc.o proc.o queue.o reswords.o safeseh.o segment.o simsegm.o string.o symbols.o tbyte.o tokenize.o types.o -o hjwasm64gcc.exe

@DEL *.o

Is something wrong in my test? :icon_confused:

Title: Re: New HJWasm release
Post by: habran on May 18, 2016, 05:01:55 PM

I built mine with GCC using -O3 and you can see a reduction in size.
Are you saying that -O2 produces faster code than -O3?

I can also see that Pelle's C produces even less code than GCC.
Are you sure that HJWasmPoc.64 is the fastest 64 bit build?

Let's show only 64 bit speed with all versions

Title: Re: New HJWasm release
Post by: jj2007 on May 18, 2016, 05:18:07 PM

Quote from: habran on May 18, 2016, 05:01:55 PM
Let's show only 64 bit speed with all versions

OK, but don't distort the competition by smuggling in (Reply #9) 32-bit versions called HJWasm64 :eusa_naughty:

GCC yes, but no doping please :t

Title: Re: New HJWasm release
Post by: habran on May 18, 2016, 05:30:15 PM

I was not aware of that :icon_eek:
Here it is:
mingw32-gcc.exe -O3 -DWIN64 -DNDEBUG -I.\H -I"C:\Program Files (x86)\mingw-w64\i686-4.9.2-posix-dwarf-rt_v3-rev0\mingw32\i686-w64-mingw32\bin" -I"C:\Program Files (x86)\CodeBlocks\MinGW\bin" -c "C:\Users\Brane

Title: Re: New HJWasm release
Post by: TWell on May 18, 2016, 05:38:08 PM

-m64 isn't in the command line, so the result was 32-bit.

Title: Re: New HJWasm release
Post by: jj2007 on May 18, 2016, 05:38:56 PM

It seems you picked the right options this time - almost as fast as AsmC :t

Code Select

OxPT_Assembler	HJWasm32	; 3.15
OxPT_Assembler	HJWasm64	; 2.80 secs

Btw where are your bottlenecks? Is loading string arrays and finding matches in these strings one of them?

Code Select

	- the FAST option is typically about twice as fast as CRT strstr, but 3..4 times as fast when used with
	  MasmBasic string arrays (Intel Core i5 timings for counting a rare word in a file with 800 MB, 6 Mio lines):
		232 ms for fast Instr_
		795 ms for "normal" Instr_
		999 ms for Masm32 InString
		929 ms for CRT strstr

;)

Title: Re: New HJWasm release
Post by: jj2007 on May 18, 2016, 05:40:34 PM

Quote from: TWell on May 18, 2016, 05:38:08 PM
-m64 isn't in the command line, so the result was 32-bit.

Oops, you are right - doping alarm :dazzled:

Btw you could check if you have an old GCC version. Google finds plenty of hits for gcc 64-bit slower than 32-bit, many of them around spring 2014. So maybe the developers have saved the honour of 64-bit compilers in the meantime 8)

Title: Re: New HJWasm release
Post by: habran on May 18, 2016, 07:34:55 PM

OK, let's see this one, is it x64 ::)

Title: Re: New HJWasm release
Post by: jj2007 on May 18, 2016, 08:08:48 PM

Quote from: TWell on May 18, 2016, 04:25:26 PM
Is something wrong in my test? :icon_confused:

Code Select

OPT_Assembler	hjwasm32gcc3	; 2.7
OxPT_Assembler	hjwasm64gcc	; 6.1
OPT_Assembler	AsmC		; 2.1 secs

::)

Title: Re: New HJWasm release
Post by: jj2007 on May 18, 2016, 08:32:01 PM

Quote from: habran on May 18, 2016, 07:34:55 PM
OK, let's see this one, is it x64 ::)

2.55 secs, so far your best one :t

(one little problem: Olly says it's 32-bit code...)
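
Whether a given .exe is really 64-bit can also be checked without a debugger by reading the Machine field in its PE header. The Python sketch below does that using the documented PE layout (the `e_lfanew` offset at 3Ch, then the `PE\0\0` signature followed by the 16-bit Machine value); the function names and the 4 KB read size are assumptions made for this example.

```python
import struct

# Machine values from the PE/COFF format: 0x014C = i386, 0x8664 = AMD64.
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}


def pe_machine(data: bytes) -> int:
    """Return the COFF Machine field of a PE image held in `data`."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return machine


def describe(path: str) -> str:
    """Read the start of the file and name its target architecture."""
    with open(path, "rb") as f:
        data = f.read(4096)  # assumption: the headers sit within the first 4 KB
    machine = pe_machine(data)
    return MACHINE_NAMES.get(machine, hex(machine))
```

Calling `describe()` on one of the builds discussed here (for example a local copy named HJWasm64.exe) would report which kind of code it contains.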

Title: Re: New HJWasm release
Post by: nidud on May 18, 2016, 09:02:29 PM

deleted

Title: Re: New HJWasm release
Post by: jj2007 on May 18, 2016, 09:48:54 PM

Quote from: nidud on May 18, 2016, 09:02:29 PM
The switch still trashes registers

That seems not the only problem. Here is a snippet - try with the latest AsmC and HJWasm versions 8)

Code Select

include \masm32\include\masm32rt.inc

.code
start:
  m2m edi, -5
  .Repeat
	print chr$(13, 10)
	print str$(edi), 9
	.switch edi
	.case -4
		print "case -4 "
		; .break
	.case -2
		print "case -2 "
		; .break
	.case 0
		print "case 0 "
		.break
	.case 2
		print "case +2 "
		.break
	.case 4
		print "case +4 "
		.break
	.Default
		print "default"
	.Endsw
	inc edi
  .Until sdword ptr edi>5
  inkey chr$(13, 10, "--- ok? ---")
  exit
end start
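
For reference, here is a small Python model of what the snippet above would be expected to print under two possible readings of the switch. It assumes that a C-style switch falls through to the next case until a .break is reached, and that an ASM-style switch ends every case implicitly; neither assumption has been checked against HJWasm or AsmC, and `run_switch` is a made-up helper.

```python
def run_switch(value, cases, c_style):
    """Return the labels 'printed' for one pass through the switch.

    cases: ordered list of (case_value, label, has_break); a case_value of
    None marks the default branch. In the ASM-style model every case ends
    implicitly; in the C-style model execution falls through until a break.
    """
    start = next((i for i, c in enumerate(cases) if c[0] == value), None)
    if start is None:
        start = next((i for i, c in enumerate(cases) if c[0] is None), None)
        if start is None:
            return []  # no matching case and no default branch
    printed = []
    for _, label, has_break in cases[start:]:
        printed.append(label)
        if has_break or not c_style:
            break
    return printed


# Same case order as the test snippet; the two commented-out .break lines
# are modelled as has_break=False.
CASES = [
    (-4, "case -4", False),
    (-2, "case -2", False),
    (0, "case 0", True),
    (2, "case +2", True),
    (4, "case +4", True),
    (None, "default", False),
]

for edi in range(-5, 6):  # the .Repeat loop runs edi from -5 to 5
    print(edi, run_switch(edi, CASES, c_style=True))
```

Comparing this against the real output of each assembler shows quickly whether negative case values and fall-through behave as in the model.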

Title: Re: New HJWasm release
Post by: nidud on May 18, 2016, 10:38:12 PM

deleted

Title: Re: New HJWasm release
Post by: jj2007 on May 18, 2016, 11:32:39 PM

With AsmC, no options but .break instead, the .break causes an exit of the .Repeat loop. Is that "by design"?

Title: Re: New HJWasm release
Post by: habran on May 18, 2016, 11:35:35 PM

Hi nidud,
The code below is a brilliant idea and I like it a lot. However, it is not applicable in 64 bit because it would force a linker LARGEADDRESS error. While I was writing the code my mind was focused on 64 bit, and 32 bit was just a conversion from 64 bit. Now I see that I have to start thinking from the 32 bit side if I write 32 bit code.

Code Select


	cmp	eax,min
	jl	endsw
	cmp	eax,max
	jg	endsw
	push	eax
	movzx	eax,index[eax-min]
	mov	eax,table[eax*4]
	xchg	eax,[esp]
	retn

In the case of jj2007's source, I don't think that any ASM programmer would write code that trashes registers that are needed for the next iteration.
You can, of course, push a register on the stack, then purposely overwrite that memory on the stack, and then complain that the assembler is not good because it lets you overwrite the stack space ::)
If that happened I would not feel sorry for that "programmer"

Title: Re: New HJWasm release
Post by: nidud on May 19, 2016, 12:07:59 AM

deleted

Title: Re: New HJWasm release
Post by: jj2007 on May 19, 2016, 12:11:07 AM

Quote from: habran on May 18, 2016, 11:35:35 PM
In the case of jj2007's source, I don't think that any ASM programmer would write code that trashes registers that are needed for the next iteration ... would not feel sorry for that "programmer"

So far, I have tried to be helpful. Show me one occasion where I insulted you as a "programmer" or similar.

Besides, this is obviously valid code, since edi is a non-volatile register. I perfectly understand why you are pissed off, but please concentrate on your homework instead of attacking others.

Title: Re: New HJWasm release
Post by: habran on May 19, 2016, 12:21:33 AM

My intention was not to attack you jj2007, I am grateful for your help and cooperation, and I already said before how much I appreciate you as a programmer, however, I am sure that you would never write this construction in your programs, and you have to admit that :biggrin:

Title: Re: New HJWasm release
Post by: habran on May 19, 2016, 12:25:15 AM

"programmer" was not pointed to you but to someone who would write some program to delete his stack

Title: Re: New HJWasm release
Post by: nidud on May 19, 2016, 01:06:14 AM

deleted

Title: Re: New HJWasm release
Post by: jj2007 on May 19, 2016, 01:59:36 AM

Quote from: habran on May 19, 2016, 12:21:33 AM
I am sure that you would never write this construction in your programs, and you have to admit that :biggrin:

OK, let's declare it a misunderstanding. But now I am curious: Where in my code do I trash the stack that I need later on?

Title: Re: New HJWasm release
Post by: jj2007 on May 19, 2016, 06:59:53 AM

Quote from: nidud on May 19, 2016, 12:07:59 AM
Ah, finally: Did you RTFM :lol:

No, I was busy reading the rest of the Internet 8)

Anyway, latest results from my switch testbed:

Code Select

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Assembled with HJWasm32
24 ms   case 260, MB Switch_ table
230 ms  case 260, MB Switch_ chain
455 ms  case 260, Masm32 switch
31 ms   case 260, HJWasm .Switch
6 ms    case 260, AsmC .Switch

24 ms   case 196, MB Switch_ table
178 ms  case 196, MB Switch_ chain
341 ms  case 196, Masm32 switch
38 ms   case 196, HJWasm .Switch
6 ms    case 196, AsmC .Switch

23 ms   case 132, MB Switch_ table
127 ms  case 132, MB Switch_ chain
229 ms  case 132, Masm32 switch
22 ms   case 132, HJWasm .Switch
6 ms    case 132, AsmC .Switch

23 ms   case 68, MB Switch_ table
76 ms   case 68, MB Switch_ chain
120 ms  case 68, Masm32 switch
40 ms   case 68, HJWasm .Switch
7 ms    case 68, AsmC .Switch

23 ms   case 4, MB Switch_ table
24 ms   case 4, MB Switch_ chain
6 ms    case 4, Masm32 switch
38 ms   case 4, HJWasm .Switch
6 ms    case 4, AsmC .Switch

2989    bytes for MbTable
4840    bytes for MbChain
4799    bytes for Masm32
6978    bytes for hjwasm
4208    bytes for asmc

The last AsmC row was added "by hand" because obviously you can't assemble the source with both assemblers at the same time. If you want to build it yourself, open the source in RichMasm and press Ctrl End. The OPT_Assembler rows should speak for themselves. OxPT is a disabled one (RichMasm looks for a case-sensitive OPT_). If two options are active, the last one is valid (I know, I know, modern IDEs have somewhere a project options menu where you can set the assembler if you find the right menu item; RM is very old fashioned, sorry).
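
The selection rule described above (only a case-sensitive OPT_ prefix counts, and the last active row wins) can be modelled in a few lines of Python. This is a sketch of the rule as stated in the post, not RichMasm's actual parser; `active_assembler` is a hypothetical name.

```python
def active_assembler(lines):
    """Return the assembler named by the last active OPT_Assembler row."""
    chosen = None
    for line in lines:
        fields = line.split(";", 1)[0].split()  # drop the trailing comment
        # Case-sensitive match: "OxPT_Assembler" is treated as disabled.
        if len(fields) >= 2 and fields[0] == "OPT_Assembler":
            chosen = fields[1]  # later rows override earlier ones
    return chosen


rows = [
    "OPT_Assembler\thjwasm32gcc3\t; 2.7",
    "OxPT_Assembler\thjwasm64gcc\t; 6.1",
    "OPT_Assembler\tAsmC\t\t; 2.1 secs",
]
print(active_assembler(rows))
```

With the three rows shown, two are active and the model picks AsmC, the later of the two.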

Title: Re: New HJWasm release
Post by: TWell on May 19, 2016, 07:20:44 PM

DELETED

Title: Re: New HJWasm release
Post by: jj2007 on May 19, 2016, 08:35:00 PM

64-bit version is faster:

Code Select

OxPT_Assembler	hjwasm32msvcrt	; 7.6
OxPT_Assembler	hjwasm64msvcrt	; 6.2
OxPT_Assembler	AsmC		; 2.5 secs

Title: Re: New HJWasm release
Post by: TWell on May 19, 2016, 09:43:14 PM

DELETED

Title: Re: New HJWasm release
Post by: jj2007 on May 19, 2016, 10:03:10 PM

2.66 secs :t

Here are all my current timings:

Code Select

HJWasmTWell	; 7.8 secs (9 May)
mlv10		; 7.8 secs
mlv615		; 7.0 secs - use for release version
JWasm		; 5.5 secs
HJWasm32	; 3.15
HJWasm64	; 2.80 secs
HJwasm64poc	; 2.75
hjwasm32gcc3	; 2.7
hjwasm64gcc	; 6.1
HJWasm64Habran	; 2.55 secs, but it's 32-bit code
hjwasm32msvcrt	; 7.6
hjwasm64msvcrt	; 6.2
hjwasm32msv13	; 2.65
AsmC		; 2.5 secs (used to be 2.1...)

In practice, I use AsmC for testing, not only because it's fastest but also because it gives direct feedback, i.e. you can see

Code Select


Assembling: C:\Masm32\MasmBasic\libtmpAA.asm
Assembling: C:\Masm32\MasmBasic\libtmpAB.asm
Assembling: C:\Masm32\MasmBasic\libtmpAC.asm
Assembling: C:\Masm32\MasmBasic\libtmpAD.asm

while it is assembling. JWasm and ML 6.15 do the same, most others let you wait until everything is complete, which is less nice to watch. But that is a very personal preference, of course 8)

Btw it would be nice if Nidud or Habran or both could identify the innermost loop that makes the assembly slow. We are experts here in speeding up C code... :badgrin:

Title: Re: New HJWasm release
Post by: jj2007 on May 20, 2016, 09:00:52 AM

Quote from: jj2007 on May 19, 2016, 10:03:10 PM
Btw it would be nice if Nidud or Habran or both could identify the innermost loop that makes the assembly slow. We are experts here in speeding up C code... :badgrin:

Thanks, Tim :t

Code Select

 2669 ms, 9459844 time(s): address 004386A0	_SymFind                   004386a0 f   symbols.obj
 2647 ms, 4850279 time(s): address 00420400	_my_fgets                  00420400 f   input.obj
 1383 ms, 10972418 time(s): address 0043A300	_get_id                    0043a300 f   tokenize.obj
 1242 ms, 4800861 time(s): address 0043A6A0	_Tokenize                  0043a6a0 f   tokenize.obj
 1136 ms, 10357549 time(s): address 00434DF0	_FindResWord               00434df0 f   reswords.obj
 1041 ms, 9461863 time(s): address 004384D0	_hashpjw                   004384d0 f   symbols.obj
  921 ms, 16118880 time(s): address 0043A540	_GetToken                  0043a540 f   tokenize.obj
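
In the profile, _SymFind and _hashpjw are called almost the same number of times (about 9.46 million each), which suggests one hash computation per symbol lookup. The Python sketch below shows that general pattern with the classic textbook PJW hash and a chained hash table. It is not HJWasm's code: the exact hash variant, the bucket count and the class name are assumptions made for illustration.

```python
def pjw_hash(name: str) -> int:
    """Classic 32-bit PJW hash (textbook form, not HJWasm's exact code)."""
    h = 0
    for ch in name.encode("ascii"):
        h = ((h << 4) + ch) & 0xFFFFFFFF
        high = h & 0xF0000000
        if high:
            h ^= high >> 24
            h &= ~high & 0xFFFFFFFF  # clear the top four bits again
    return h


class SymbolTable:
    """Chained hash table: one hash call and one bucket scan per lookup."""

    def __init__(self, buckets: int = 2003):  # bucket count is arbitrary here
        self.buckets = [[] for _ in range(buckets)]

    def _bucket(self, name):
        return self.buckets[pjw_hash(name) % len(self.buckets)]

    def add(self, name, value):
        self._bucket(name).append((name, value))

    def find(self, name):
        for key, value in self._bucket(name):
            if key == name:
                return value
        return None


table = SymbolTable()
table.add("start", 0x401000)
print(hex(table.find("start")), table.find("missing"))
```

In a structure like this, lookup cost depends on the hash itself and on how long the bucket chains get, so both are natural places to look when tuning.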

Title: Re: New HJWasm release
Post by: TWell on May 20, 2016, 09:54:19 PM

DELETED

Title: Re: New HJWasm release
Post by: jj2007 on May 20, 2016, 11:32:35 PM

3.0 secs, so far the best 64-bit version (AsmC: 2.5 secs).
But 32-bit version is 10% faster:

Code Select

OxPT_Assembler	hjwasm642005DDK	; 3.0
OPT_Assembler	hjwasm322005DDK	; 2.7
OxPT_Assembler	AsmC		; 2.5 secs

What about _SymFind and _my_fgets? Long and complicated, or is there a chance to give them a boost?

Title: Re: New HJWasm release
Post by: johnsa on May 23, 2016, 11:04:21 PM

Hey,

Do we have any clear indication as to why a C project compiled with an 11 year old version of MSVC is so much faster than if compiled with VS2015 ??
Is it purely down to the CRT inclusions being more bloated/less performant?

Title: Re: New HJWasm release
Post by: TWell on May 23, 2016, 11:10:51 PM

MS MT CRT fault.
~~Here are cl v19 compiled version with 2003 DDK libc.lib~~

Code Select

5.325s        asmc
6.521s        hjwa64-2015clib.exe
7.180s        hjwa32-2015clib.exe

Title: Re: New HJWasm release
Post by: jj2007 on May 24, 2016, 12:18:45 AM

Quote from: johnsa on May 23, 2016, 11:04:21 PM
why a C project compiled with an 11 year old version of MSVC is so much faster than if compiled with VS2015 ??

Compilers develop. We are all running extremely old CPUs, new compilers optimise for the latest CPUs 8)

Title: Re: New HJWasm release
Post by: habran on May 24, 2016, 12:25:47 AM

Hi TWell,
There is a new HJWasm on Terraspace built with your tools, thank you :t
as well as improved source on Github
This one you built above doesn't debug on source level in 32 bit

Title: Re: New HJWasm release
Post by: habran on May 24, 2016, 12:33:47 AM

Sorry JJ, I was busy these days to fix HJWasm :biggrin:
Try it now please, it is updated on Terraspace 8)

Title: Re: New HJWasm release
Post by: TWell on May 24, 2016, 01:40:06 AM

Quote from: habran on May 24, 2016, 12:25:47 AM
This one you built above doesn't debug on source level in 32 bit

That AVX was missing too :redface:
I was only testing the m32lib compile, as it involves a lot of file access.

PS: does anyone have an old Vista DDK? Does it contain libc.lib (VS 2005?)

Title: Re: New HJWasm release
Post by: jj2007 on May 24, 2016, 02:09:47 AM

Quote from: habran on May 24, 2016, 12:33:47 AM
Sorry JJ, I was busy these days to fix HJWasm :biggrin:
Try it now please, it is updated on Terraspace 8)

Code Select

OxPT_Assembler	HJWasm32	; 2.7
OxPT_Assembler	HJWasm64	; 3.0 secs, and yes, it's 64-bit code
OxPT_Assembler	AsmC		; 2.5 secs

Title: Re: New HJWasm release
Post by: habran on May 24, 2016, 06:04:42 AM

Thanks JJ :t
So, taking into consideration that ASMC has its most important parts translated to asm, and has less code to run, that is amazing speed.
With HJWasm64 you are testing 32 bit code optimized for 32 bit, that is why it is slower than HJWasm32,
we should try the opposite. I am planning to write some code optimized for x64 and then we will see how it performs 8)

Title: Re: New HJWasm release
Post by: TWell on May 24, 2016, 04:04:29 PM

From VC6 samples:
# When building single-threaded applications you can link your executable
# with either LIBC, LIBCMT, or CRTDLL, although LIBC will provide the best
# performance.

Title: Re: New HJWasm release
Post by: habran on May 24, 2016, 04:11:05 PM

AFAIK LIBCMT is for static builds and LIBC for dynamic builds.
Where is that LIBC.LIB which you gave me the link for?
Isn't it from msvc2005?

Title: Re: New HJWasm release
Post by: TWell on May 24, 2016, 04:37:18 PM

libc.lib is a static library.
libc.lib can be found in the Windows NT 5.2 DDK (Server 2003 SP1), and the x64 one in 5.2.3790.2075.51.PlatformSDK_Svr2003R2_rtm too.

I can't inspect Vista DDK :(

Title: Re: New HJWasm release
Post by: habran on May 25, 2016, 07:54:54 PM

Hi TWell,
The last one you built works fine, but I can't test its speed.
I have fixed some minor errors in hll.c and added a new feature to the .SWITCH block, so we will upgrade tonight or tomorrow.
If you give me your email address I will send you new hll.c for testing

Title: Re: New HJWasm release
Post by: TWell on May 27, 2016, 05:45:57 PM

With VisualCppBuildTools2015

Code Select

5.418s        asmc
8.596s        HJWasm64.exe
12.372s       hjwa64-2015.exe

Not bad eh??

Title: Re: New HJWasm release
Post by: habran on May 27, 2016, 07:32:05 PM

VS2015 sucks :(
I prefer this one:

Code Select

OxPT_Assembler	HJWasm32	; 2.7
OxPT_Assembler	HJWasm64	; 3.0 secs, and yes, it's 64-bit code
OxPT_Assembler	AsmC		; 2.5 secs

Title: Re: New HJWasm release
Post by: jj2007 on May 29, 2016, 06:35:58 AM

Quote from: jj2007 on May 19, 2016, 10:03:10 PM
In practice, I use AsmC for testing, not only because it's fastest but also because it gives direct feedback, i.e. you can see

Code Select

Assembling: C:\Masm32\MasmBasic\libtmpAA.asm
Assembling: C:\Masm32\MasmBasic\libtmpAB.asm
Assembling: C:\Masm32\MasmBasic\libtmpAC.asm
Assembling: C:\Masm32\MasmBasic\libtmpAD.asm

while it is assembling. JWasm and ML 6.15 do the same, most others let you wait until everything is complete, which is less nice to watch. But that is a very personal preference, of course 8)

Btw it would be nice if Nidud or Habran or both could identify the innermost loop that makes the assembly slow. We are experts here in speeding up C code... :badgrin:

Re "others let you wait until everything is complete": would fflush(..) after each module help?
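
The idea behind that question, sketched in Python rather than C: write one progress line per module and flush the stream right away, so the line shows up while the next module is still being processed. In C the analogous call would be `fflush(stdout)`. Whether buffering is what actually delays the output in HJWasm is an assumption, and `report_progress` is a made-up name.

```python
import sys


def report_progress(modules, stream=sys.stdout):
    """Write one 'Assembling:' line per module and flush after each."""
    for path in modules:
        stream.write(f"Assembling: {path}\n")
        stream.flush()  # push the line out now instead of when the buffer fills


report_progress([
    r"C:\Masm32\MasmBasic\libtmpAA.asm",
    r"C:\Masm32\MasmBasic\libtmpAB.asm",
])
```

The per-line flush costs very little next to assembling a module, so the main effect is earlier feedback.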

Title: Re: New HJWasm release
Post by: habran on June 01, 2016, 07:36:14 PM

New HJWasm uploaded on Terraspace (http://www.terraspace.co.uk/hjwasm.html) with some bug fixes and hopefully some speed improvement for the .SWITCH block hll

Title: Re: New HJWasm release
Post by: jj2007 on June 01, 2016, 11:33:21 PM

Timings for building the MB library are unchanged.

Code Select

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Assembled with HJWasm32
20 ms   case 260, MB Switch_ table
197 ms  case 260, MB Switch_ chain
370 ms  case 260, Masm32 switch
46 ms   case 260, HJWasm .Switch

19 ms   case 196, MB Switch_ table
146 ms  case 196, MB Switch_ chain
277 ms  case 196, Masm32 switch
46 ms   case 196, HJWasm .Switch

19 ms   case 132, MB Switch_ table
104 ms  case 132, MB Switch_ chain
186 ms  case 132, Masm32 switch
46 ms   case 132, HJWasm .Switch

19 ms   case 68, MB Switch_ table
62 ms   case 68, MB Switch_ chain
98 ms   case 68, Masm32 switch
46 ms   case 68, HJWasm .Switch

19 ms   case 4, MB Switch_ table
20 ms   case 4, MB Switch_ chain
5 ms    case 4, Masm32 switch
46 ms   case 4, HJWasm .Switch

2989    bytes for MbTable
4840    bytes for MbChain
4799    bytes for Masm32
5729    bytes for hjwasm

Title: Re: New HJWasm release
Post by: habran on June 02, 2016, 12:00:39 AM

Thanks JJ,
It looks like we have at least stable speed; considering it is written in C, that is pretty good.
It is probably possible to make it a little bit faster with some more optimization.

Title: Re: New HJWasm release
Post by: habran on June 03, 2016, 09:33:15 PM

Hi JJ,
Can you please test this build with the same sources you did with the last one to see if there is a difference in speed?

Title: Re: New HJWasm release
Post by: johnsa on June 03, 2016, 10:34:51 PM

Aren't the timings JJ provided run-time execution of the switch rather than compile-time related?

Title: Re: New HJWasm release
Post by: habran on June 03, 2016, 11:19:44 PM

Yes, that is what I want to be tested.
I think that maybe I have faster sorting routine in this one.

Title: Re: New HJWasm release
Post by: jj2007 on June 03, 2016, 11:26:36 PM

Quote from: habran on June 03, 2016, 09:33:15 PM
Can you please test this build with the same sources you did with the last one to see if there is a difference in speed?

Hi Habran,
Build speed for MasmBasic library is very good, only 10% slower now than AsmC :t
Would be nice to flush the console after each assembly, though.

Quote from: johnsa on June 03, 2016, 10:34:51 PM
Aren't the timings JJ provided run-time execution of the switch rather than compile-time related?

Here is run-time execution of the switch:

Code Select

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Assembled with AsmC
20 ms   case 260, MB Switch_ table
193 ms  case 260, MB Switch_ chain
374 ms  case 260, Masm32 switch
5 ms    case 260, AsmC .Switch

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Assembled with HJWasm32
20 ms   case 260, MB Switch_ table
190 ms  case 260, MB Switch_ chain
374 ms  case 260, Masm32 switch
48 ms   case 260, HJWasm .Switch

And the size of the generated switch codes:

Code Select

2989    bytes for MbTable
4840    bytes for MbChain
4799    bytes for Masm32
4201    bytes for asmc
5729    bytes for hjwasm

The AsmC .Switch is clearly fastest, while the MasmBasic Switch_ macro generates the smallest code for high case numbers (where it auto-selects the compact table version).

Title: Re: New HJWasm release
Post by: habran on June 03, 2016, 11:43:07 PM

Thanks JJ, it looks like former one is 2 ms faster :(

Title: Re: New HJWasm release
Post by: Raistlin on June 08, 2016, 08:04:18 PM

So I downloaded HJWasm - and lo and behold - when I tried to email it to myself
Google (Gmail scanner) says it contains malware.....seeesh

So I know Habran would never include such - but what file could be the cause of the false positive ?

Title: Re: New HJWasm release
Post by: johnsa on June 08, 2016, 08:33:44 PM

There is definitely no malware in the archives.. I would assume its heuristic scanning possibly picks up that the exe generates code which it doesn't like (possibly based on an unfamiliar name or origin) or alternatively the fact that the archive contains asm files..
Is there any way to get a more detailed report from the scan as to exactly what it's not happy with?

Title: Re: New HJWasm release
Post by: jj2007 on June 08, 2016, 09:07:03 PM

If you are in doubt, upload the file to Jotti: nothing for HJWasm32 (https://virusscan.jotti.org/en-US/filescanjob/auaptrk7ij). As johnsa wrote, the AV scanners use heuristics, and those are a PITA (Pain In The A**). As soon as they see something that was apparently not built with MSVC or GCC, they make racist remarks about assembler code being dangerous etc 8)

Btw we have a dedicated sub-forum for that: AV Software sh*t list (http://masm32.com/board/index.php?board=23.0)

Title: Re: New HJWasm release
Post by: habran on June 09, 2016, 07:35:20 PM

New HJWasm uploaded on Terraspace with some more bug fixes in the .SWITCH block.

The MASM Forum

64 bit assembler => UASM Assembler Development => Topic started by: habran on May 17, 2016, 06:30:15 AM