Recent Posts

1
UASM Assembler Development / Re: MOVBE missing
« Last post by AW on Today at 12:37:11 AM »
2
UASM Assembler Development / Re: MOVBE missing
« Last post by hutch-- on November 17, 2018, 11:58:22 PM »
From memory, there is an SSE4 instruction where you use a mask to determine the altered order of the data, and as I recall it was a lot faster than BSWAP.

> I will use it as a justification to get the ok for new hardware from my wife. Any ideas?

Best of luck with that one, it sounds like an attempt to wring blood out of a stone.
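
Editor's sketch: the instruction recalled above is probably PSHUFB, which arrived with SSSE3 rather than SSE4 (that identification is an assumption, the post does not name it). The C below is only a scalar model of a mask-driven 16-byte shuffle, not the instruction itself; `shuffle_bytes16` and `BSWAP32_MASK` are hypothetical names made up for illustration.

```c
#include <stdint.h>

/* Hypothetical mask: reverses the bytes inside each 32-bit lane,
   i.e. the effect of four BSWAPs done in one shuffle. */
static const uint8_t BSWAP32_MASK[16] = {
    3, 2, 1, 0,   7, 6, 5, 4,   11, 10, 9, 8,   15, 14, 13, 12
};

/* Scalar model of a 16-byte shuffle:
   dst[i] = src[mask[i] & 0x0F], or 0 when bit 7 of mask[i] is set.
   A temporary buffer keeps the result correct when dst and src alias. */
void shuffle_bytes16(uint8_t dst[16], const uint8_t src[16],
                     const uint8_t mask[16])
{
    uint8_t tmp[16];
    for (int i = 0; i < 16; ++i)
        tmp[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
    for (int i = 0; i < 16; ++i)
        dst[i] = tmp[i];
}
```

Calling `shuffle_bytes16(dst, src, BSWAP32_MASK)` byte-swaps four dwords at once, which is where a speed advantage over one BSWAP per register would come from.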
3
UASM Assembler Development / UASM 2.47 Available
« Last post by johnsa on November 17, 2018, 11:54:15 PM »
Hi,

UASM 2.47 is now available. Packages on the site and repository updated!

Changes:

Fixed symbol undefined with IF
Add -PIE command line switch for position independent executable. (not used yet)
Fixed vxorpd evex warning with missing register.
Corrected encoding for vmovsd.
Fix vgather/scatter encoding issue.
Fix vectorcall register over-write ordering.
Fix codegen integer shift encoding for SSE vs. AVX/EVEX.
Add m512 built-in types
Fix error reporting for evex compare instructions (http://masm32.com/board/index.php?topic=7320.0) => VPCMPB,VPCMPUB,VPCMPD,VPCMPUD,VPCMPQ,VPCMPUQ,VPCMPW,VPCMPUW
Clang and GCC 8+ compiler compatibility fixes
Name mangling for vectorcall
Include improved build files for osx/linux
Fix generation of instructions when in data section (due to lbl: conversion to lbl label byte regression from 2.19)
Add error when literal string used in invoke and parameter is NOT of type PTR or VARARG.
Fixed transfer of language type info from PROC to equate (http://masm32.com/board/index.php?topic=7137.0)
Ensure that an equate used with invoke redirects to the actual PROC symbol to ensure proper relocations are generated (http://masm32.com/board/index.php?topic=7137.0)
Fix static method HLL invocation without parameters.
Fixed console colour background to match active setting under Windows.
Remove duplicated string literals.
Add PLT and GOT relocation types provisionally.
Fix register ordering in memory operands where base and index could be swapped when no scale present. (Legacy jwasm bug)
AVX-512 encoding fix for VPGATHERDD in 32-bit.
Added MOVBE instruction support.

Cheers !
4
UASM Assembler Development / Re: MOVBE missing
« Last post by habran on November 17, 2018, 11:51:43 PM »
 .if (hardware == old)
    bswap eax
 .else
   movbe rax, qword ptr wife
 .endif
 ;)
5
UASM Assembler Development / Re: MOVBE missing
« Last post by jj2007 on November 17, 2018, 10:00:25 PM »
Yes indeed! If I ever encounter an application with an innermost time-critical loop where bswap is too slow, I will use it as a justification to get the ok for new hardware from my wife. Any ideas?
6
UASM Assembler Development / Re: MOVBE missing
« Last post by habran on November 17, 2018, 09:38:05 PM »
It can be quite useful, here is an example:
Code:
.data
szReplace db 'Replaced',0

.code

   mov   rdx,qword ptr szReplace
   movbe rax,qword ptr szReplace
output:

RAX = 5265706C61636564
RDX = 646563616C706552
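
Editor's sketch: the two register values above can be checked by hand or with a few lines of C. This is not UASM code; it only models what a plain little-endian `mov` and a byte-reversing `movbe` load from the same 8 bytes. `load_le64` and `load_be64` are hypothetical helper names.

```c
#include <stdint.h>

/* Hypothetical helper: combine 8 bytes as a little-endian integer,
   which is what a plain MOV from memory yields on x86-64. */
uint64_t load_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; --i)
        v = (v << 8) | p[i];   /* highest address ends up in the top byte */
    return v;
}

/* Hypothetical helper: combine the same 8 bytes as big-endian,
   modelling the byte-swapped result of a MOVBE load. */
uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v = (v << 8) | p[i];   /* lowest address ends up in the top byte */
    return v;
}
```

For the bytes of 'Replaced' (52 65 70 6C 61 63 65 64), `load_le64` gives 646563616C706552 (the RDX value) and `load_be64` gives 5265706C61636564 (the RAX value).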

7
UASM Assembler Development / Re: MOVBE missing
« Last post by jj2007 on November 17, 2018, 09:06:49 PM »
> MOVBE is now implemented in UASM and I hope John will include it in this release 8)

A winning team :t
8
UASM Assembler Development / Re: MOVBE missing
« Last post by johnsa on November 17, 2018, 09:01:41 PM »
Now included, packages coming shortly :)
9
UASM Assembler Development / Re: MOVBE missing
« Last post by habran on November 17, 2018, 08:00:26 PM »
MOVBE is now implemented in UASM and I hope John will include it in this release 8)
10
UASM Assembler Development / Re: MOVBE missing
« Last post by LiaoMi on November 17, 2018, 07:59:34 PM »
Hi  :P,

Code:
Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz (SSE4)

22      cycles for 100 * mov+bswap
11      cycles for 100 * movbe

23      cycles for 100 * mov+bswap
37      cycles for 100 * movbe

23      cycles for 100 * mov+bswap
8       cycles for 100 * movbe

21      cycles for 100 * mov+bswap
16      cycles for 100 * movbe

35      cycles for 100 * mov+bswap
18      cycles for 100 * movbe

9       bytes for mov+bswap
9       bytes for movbe


--- ok ---
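
Editor's sketch: the movbe timings above swing between 8 and 37 cycles, so a single run says little. One hedged way to summarise the five runs is to compare mean and median; the helpers below (`mean_int`, `median_int`) are hypothetical names, and five samples are far too few for a firm conclusion.

```c
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of ints. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts v in place and returns the middle element
   (the lower of the two middle elements when n is even). */
int median_int(int *v, size_t n)
{
    qsort(v, n, sizeof v[0], cmp_int);
    return v[(n - 1) / 2];
}

/* Arithmetic mean of n samples. */
double mean_int(const int *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += v[i];
    return sum / (double)n;
}
```

Fed the posted numbers, mov+bswap (22, 23, 23, 21, 35) has mean 24.8 and median 23, while movbe (11, 37, 8, 16, 18) has mean 18.0 and median 16.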