Title: Intel SHA - Instruction Set Extensions
Post by: LiaoMi on January 21, 2018, 03:33:36 AM

Hello everybody,

UASM does not support some instruction set extensions, for example sha256rnds2 (Intel SHA extensions). With AES everything seems to be okay :lol:

Documentation package:

SHA

SHA Docs - https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf
ASM Source (Intel® SHA Extensions Implementations) - https://software.intel.com/sites/default/files/article/402126/intel-sha-extensions_1.zip

AES may be interesting for tests

Intel AESNI Sample Library - Assembler & C Source code (intel-aesni-sample-library-v1.2.zip) - https://web.archive.org/web/20170713153528/https://software.intel.com/sites/default/files/article/181731/intel-aesni-sample-library-v1.2.zip
AES-NI white paper - Intel® Developer Zone - https://software.intel.com/sites/default/files/article/165683/aes-wp-2012-09-22-v01.pdf

Intel® Architecture Instruction Set Extensions Programming Reference
https://web.archive.org/web/20130929035331if_/http://download-software.intel.com/sites/default/files/319433-015.pdf

It would be cool if UASM supported the SHA instruction extensions. As you can see from Intel's assembler source code, yasm already assembles them.

Best regards, LiaoMi

Title: Re: Intel SHA - Instruction Set Extensions
Post by: habran on January 21, 2018, 04:12:49 PM

Will look at that ASAP; however, take into consideration the Australian Open ;)

Title: Re: Intel SHA - Instruction Set Extensions
Post by: habran on January 22, 2018, 05:52:44 AM

done 8):

Code Select


00007ff6c08d1807 0F 38 CC CA                      sha256msg1			xmm1, xmm2  
00007ff6c08d180b 0F 38 CD CA                      sha256msg2			xmm1, xmm2  
00007ff6c08d180f 0F 38 CC 09                      sha256msg1			xmm1, xmmword ptr [rcx]  
00007ff6c08d1813 0F 38 CD 09                      sha256msg2			xmm1, xmmword ptr [rcx]  
00007ff6c08d1817 0F 3A CC CA 0C                   sha1rnds4			xmm1, xmm2, 0xc  
00007ff6c08d181c 0F 3A CC 09 0C                   sha1rnds4			xmm1, xmmword ptr [rcx], 0xc  
00007ff6c08d1821 0F 38 C8 CA                      sha1nexte			xmm1, xmm2  
00007ff6c08d1825 0F 38 C8 09                      sha1nexte			xmm1, xmmword ptr [rcx]  
00007ff6c08d1829 0F 38 C9 CA                      sha1msg1			xmm1, xmm2  
00007ff6c08d182d 0F 38 C9 09                      sha1msg1			xmm1, xmmword ptr [rcx]  
00007ff6c08d1831 0F 38 CA CA                      sha1msg2			xmm1, xmm2  
00007ff6c08d1835 0F 38 CA 09                      sha1msg2			xmm1, xmmword ptr [rcx]  
00007ff6c08d1839 0F 38 CB CA                      sha256rnds2			xmm1, xmm2  
00007ff6c08d183d 0F 38 CB 09                      sha256rnds2			xmm1, xmmword ptr [rcx]

It will be in the next release.
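For anyone who wants to cross-check the listing above by hand, here is a minimal Python sketch (not UASM code; the names `SHA_OPCODES`, `modrm` and `encode` are made up for illustration). It rebuilds the bytes from the opcodes documented in the Intel SDM: the SHA instructions use the legacy encoding with no mandatory prefix, `0F 38 xx /r` for the two-operand forms and `0F 3A CC /r ib` for sha1rnds4. It assumes registers xmm0-xmm7 and a plain `[base]` memory operand without SIB byte or displacement.

```python
# Opcode bytes of the SHA extensions (no mandatory prefix), per the Intel SDM.
SHA_OPCODES = {
    "sha1nexte":   (0x0F, 0x38, 0xC8),
    "sha1msg1":    (0x0F, 0x38, 0xC9),
    "sha1msg2":    (0x0F, 0x38, 0xCA),
    "sha256rnds2": (0x0F, 0x38, 0xCB),  # third operand xmm0 is implicit
    "sha256msg1":  (0x0F, 0x38, 0xCC),
    "sha256msg2":  (0x0F, 0x38, 0xCD),
    "sha1rnds4":   (0x0F, 0x3A, 0xCC),  # followed by an imm8
}


def modrm(mod, reg, rm):
    """Pack the three ModRM fields into one byte."""
    return (mod << 6) | ((reg & 7) << 3) | (rm & 7)


def encode(mnemonic, dst, src_kind, src_num, imm8=None):
    """Encode 'mnemonic xmm<dst>, xmm<src_num>' (src_kind == "reg")
    or 'mnemonic xmm<dst>, [base]' (src_kind == "mem", src_num = base
    register number, e.g. rax=0, rcx=1)."""
    if not (0 <= dst <= 7 and 0 <= src_num <= 7):
        raise ValueError("sketch only handles registers 0-7 (no REX)")
    if src_kind == "mem" and src_num in (4, 5):
        raise ValueError("rsp/rbp bases need SIB or displacement")
    mod = 0b11 if src_kind == "reg" else 0b00
    out = list(SHA_OPCODES[mnemonic]) + [modrm(mod, dst, src_num)]
    if mnemonic == "sha1rnds4":
        if imm8 is None:
            raise ValueError("sha1rnds4 needs an immediate")
        out.append(imm8 & 0xFF)
    return bytes(out)


print(encode("sha256msg1", 1, "reg", 2).hex())        # sha256msg1 xmm1, xmm2
print(encode("sha1rnds4", 1, "mem", 1, 0x0C).hex())   # sha1rnds4 xmm1, [rcx], 0xc
```

The two calls print `0f38ccca` and `0f3acc090c`, which match the first and sixth lines of the listing.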

Title: Re: Intel SHA - Instruction Set Extensions
Post by: johnsa on January 22, 2018, 08:12:35 PM

Hi,

This will be included in 2.46.8, which should be up tonight or tomorrow, along with the DEREF fix for com->Release() and support for typedef'ed PROC return types.

Title: Re: Intel SHA - Instruction Set Extensions
Post by: LiaoMi on January 22, 2018, 11:55:15 PM

Hi, habran & johnsa,

thanks for the work, this is great news!

The CLMUL instruction set is also not fully supported - https://en.wikipedia.org/wiki/CLMUL_instruction_set

Code Select

pclmulqdq xmm1, xmm2, 5
pclmulqdq xmm1, [rax], byte 5
pclmulqdq xmm1, dqword [rax], 5
vpclmulqdq xmm1, xmm2, 0x10
vpclmulqdq xmm1, dqword [rbx], 0x10
vpclmulqdq xmm0, xmm1, xmm2, 0x10
vpclmulqdq xmm0, xmm1, dqword [rbx], 0x10

pclmullqlqdq xmm1, xmm2
pclmullqlqdq xmm1, [rax]
pclmullqlqdq xmm1, dqword [rax]
vpclmullqlqdq xmm1, xmm2
vpclmullqlqdq xmm1, dqword[rbx]
vpclmullqlqdq xmm0, xmm1, xmm2
vpclmullqlqdq xmm0, xmm1, dqword[rbx]

pclmulhqlqdq xmm1, xmm2
pclmulhqlqdq xmm1, [rax]
pclmulhqlqdq xmm1, dqword [rax]
vpclmulhqlqdq xmm1, xmm2
vpclmulhqlqdq xmm1, dqword[rbx]
vpclmulhqlqdq xmm0, xmm1, xmm2
vpclmulhqlqdq xmm0, xmm1, dqword[rbx]

pclmullqhqdq xmm1, xmm2
pclmullqhqdq xmm1, [rax]
pclmullqhqdq xmm1, dqword [rax]
vpclmullqhqdq xmm1, xmm2
vpclmullqhqdq xmm1, dqword[rbx]
vpclmullqhqdq xmm0, xmm1, xmm2
vpclmullqhqdq xmm0, xmm1, dqword[rbx]

pclmulhqhqdq xmm1, xmm2
pclmulhqhqdq xmm1, [rax]
pclmulhqhqdq xmm1, dqword [rax]
vpclmulhqhqdq xmm1, xmm2
vpclmulhqhqdq xmm1, dqword[rbx]
vpclmulhqhqdq xmm0, xmm1, xmm2
vpclmulhqhqdq xmm0, xmm1, dqword[rbx]

RDSEED and RDRAND instructions (Edited - they work completely, it was my mistake)

Code Select

rdrand cx
rdrand ecx
rdrand rcx

Sample x86-64 asm code that checks for RDRAND support and prints one random number:

Code Select

; using NASM syntax

section .data
	msg db "0x00000000",10

section .text
global _start
_start:
	mov eax,1
	cpuid
	bt ecx,30
	mov edi,1 ; exit code: failure
	jnc .exit

	; rdrand sets CF=0 if no random number
	; was available. Intel documentation
	; recommends 10 retries in a tight loop
	mov ecx,11
.loop1:
	sub ecx, 1
	jz .exit ; exit code is set already
	rdrand eax
	jnc .loop1

	; convert the number to ASCII
	mov edi,msg+9
	mov ecx,8
.loop2:
	mov edx,eax
	and edx,0Fh
	; add 7 to nibbles of 0xA and above
	; to align with ASCII code for 'A'
	; ('A' - '0') - 10 = 7
	xor r9d, r9d
	lea r8d, [r9+7] ; r8=7
	cmp dl,9
	cmova r9,r8
	add edx,r9d
	add [rdi],dl
	shr eax,4
	sub edi, 1
	sub ecx, 1
	jnz .loop2

	mov eax,1 ; SYS_WRITE
	mov edi,eax ; stdout=SYS_WRITE=1
	mov esi,msg
	mov edx,11
	syscall

	xor edi,edi ; exit code zero: success
.exit:
	mov eax,60 ; SYS_EXIT
	syscall
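The nibble-to-ASCII loop (`.loop2`) is the least obvious part of the sample, so here is the same logic as a rough Python sketch. The function name `to_hex_buffer` is made up, and the RDRAND result is replaced by an ordinary integer argument because Python has no direct access to the instruction.

```python
def to_hex_buffer(value):
    """Mirror of .loop2: fill the eight '0' digits of "0x00000000\\n"
    from right to left with the hex digits of a 32-bit value."""
    msg = bytearray(b"0x00000000\n")
    pos = 9                        # mov edi, msg+9  (last digit)
    eax = value & 0xFFFFFFFF
    for _ in range(8):             # mov ecx, 8
        nibble = eax & 0x0F        # and edx, 0Fh
        # cmova r9, r8: add 7 for 0xA..0xF so that '0' + 10 + 7 == 'A'
        adjust = 7 if nibble > 9 else 0
        msg[pos] += nibble + adjust    # add [rdi], dl
        eax >>= 4                  # shr eax, 4
        pos -= 1                   # sub edi, 1
    return bytes(msg)


print(to_hex_buffer(0x1A))
```

This prints `b'0x0000001A\n'`. The buffer starts out holding ASCII '0' characters, which is why the asm version can simply add the digit value to each byte instead of storing it.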

Here is a document that can be useful as a reference - Intel Advanced Vector Extensions Programming Reference: https://software.intel.com/file/36945 (save the file as PDF)

Title: Re: Intel SHA - Instruction Set Extensions
Post by: johnsa on January 23, 2018, 02:53:12 AM

We'll check on the CLMUL completeness and add them too :)

I'm busy adding the regression tests for both sets now anyway.

Title: Re: Intel SHA - Instruction Set Extensions
Post by: LiaoMi on January 23, 2018, 03:32:32 AM

The above applies only to Intel processors. For AMD, yasm supports other sets too: the XOP (eXtended Operations) instruction set, the FMA4 instruction set, and TBM (Trailing Bit Manipulation). None of these three is supported on Intel processors, and AMD apparently wants to drop TBM as well.

Quote: It is uncertain whether future Intel processors will support FMA4, due to Intel's announced change to FMA3.

This means that these sets don't make much sense. Agner's CPU blog - Stop the instruction set war - http://www.agner.org/optimize/blog/read.php?i=25

Quote: The incompatibility between Intel's FMA3 and AMD's FMA4 is due to both companies changing plans without coordinating coding details with each other. AMD changed their plans from FMA3 to FMA4 while Intel changed their plans from FMA4 to FMA3 almost at the same time.

Does anybody have any experience with an AMD processor? Can ml64.exe assemble these sets: XOP, FMA3 (Intel), FMA3 (AMD), TBM?

Title: Re: Intel SHA - Instruction Set Extensions
Post by: habran on January 23, 2018, 07:56:22 AM

CLMUL instructions were implemented already:

Code Select


00007ff770711807 66 0F 3A 44 08 05                pclmulqdq			xmm1, xmmword ptr [rax], 0x5  
00007ff77071180d 66 0F 3A 44 CA 05                pclmulqdq			xmm1, xmm2, 0x5  
00007ff770711813 C4 E3 69 44 CB 05                vpclmulqdq			xmm1, xmm2, xmm3, 0x5  
00007ff770711819 C4 E3 69 44 08 05                vpclmulqdq			xmm1, xmm2, xmmword ptr [rax], 0x5
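In the two VEX-encoded lines above, the middle operand (xmm2) does not appear in the ModRM byte at all; it is stored, inverted, in the `vvvv` field of the prefix. A minimal Python sketch of how the three-byte prefix `C4 E3 69` splits into fields, following the VEX layout in the Intel SDM (the function name `decode_vex3` is made up for illustration):

```python
def decode_vex3(b0, b1, b2):
    """Split a three-byte VEX prefix (C4 xx xx) into its fields."""
    if b0 != 0xC4:
        raise ValueError("not a three-byte VEX prefix")
    return {
        "R": ((b1 >> 7) & 1) ^ 1,      # register extension bits are stored inverted
        "X": ((b1 >> 6) & 1) ^ 1,
        "B": ((b1 >> 5) & 1) ^ 1,
        "map": b1 & 0x1F,              # 1 = 0F, 2 = 0F 38, 3 = 0F 3A
        "W": (b2 >> 7) & 1,
        "vvvv": (~(b2 >> 3)) & 0xF,    # extra source register, stored inverted
        "L": (b2 >> 2) & 1,            # 0 = 128-bit operation
        "pp": b2 & 3,                  # 1 = implied 66 prefix
    }


fields = decode_vex3(0xC4, 0xE3, 0x69)
print(fields["map"], fields["vvvv"], fields["pp"])
```

For `C4 E3 69` this prints `3 2 1`: opcode map 0F 3A, `vvvv` = xmm2, implied 66 prefix. That is the same `66 0F 3A 44` opcode as the legacy form, with xmm2 as the extra source; the ModRM byte `CB` then supplies xmm1 and xmm3.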

Title: Re: Intel SHA - Instruction Set Extensions
Post by: habran on January 24, 2018, 05:46:38 AM

We have added the pseudo-ops; when using them you don't need the immediate, just like this:

Code Select


PCLMULLQLQDQ xmm1, xmm2
PCLMULHQLQDQ xmm1, xmm2
PCLMULLQHQDQ xmm1, xmm2
PCLMULHQHQDQ xmm1, xmm2
VPCLMULLQLQDQ xmm1, xmm2,xmm3
VPCLMULHQLQDQ xmm1, xmm2,xmm3
VPCLMULLQHQDQ xmm1, xmm2,xmm3
VPCLMULHQHQDQ xmm1, xmm2,xmm3

The SDE debugger doesn't recognise the pseudo-ops, but I have MSVS 2013;
maybe SDE with MSVS 2017 does:

Code Select


00007ff64a421807 66 0F 3A 44 CA 00                pclmulqdq            xmm1, xmm2, 0x0  
00007ff64a42180d 66 0F 3A 44 CA 01                pclmulqdq            xmm1, xmm2, 0x1  
00007ff64a421813 66 0F 3A 44 CA 10                pclmulqdq            xmm1, xmm2, 0x10  
00007ff64a421819 66 0F 3A 44 CA 11                pclmulqdq            xmm1, xmm2, 0x11  
00007ff64a42181f C4 E3 69 44 CB 00                vpclmulqdq            xmm1, xmm2, xmm3, 0x0  
00007ff64a421825 C4 E3 69 44 CB 01                vpclmulqdq            xmm1, xmm2, xmm3, 0x1  
00007ff64a42182b C4 E3 69 44 CB 10                vpclmulqdq            xmm1, xmm2, xmm3, 0x10  
00007ff64a421831 C4 E3 69 44 CB 11                vpclmulqdq            xmm1, xmm2, xmm3, 0x11
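The immediates in the listing above show how the pseudo-op names map onto pclmulqdq: per the Intel SDM, imm8 bit 0 selects the low or high quadword of the first source and bit 4 selects the quadword of the second source. A rough Python model of that selection and of the carry-less multiply itself (the names `PSEUDO_OP_IMM8`, `clmul64` and `pclmulqdq` are made up here, and 128-bit registers are modelled as plain integers):

```python
# Pseudo-op -> imm8, as seen in the disassembly above.
PSEUDO_OP_IMM8 = {
    "pclmullqlqdq": 0x00,   # low  qword of src1 x low  qword of src2
    "pclmulhqlqdq": 0x01,   # high qword of src1 x low  qword of src2
    "pclmullqhqdq": 0x10,   # low  qword of src1 x high qword of src2
    "pclmulhqhqdq": 0x11,   # high qword of src1 x high qword of src2
}


def clmul64(a, b):
    """Carry-less (polynomial over GF(2)) product of two 64-bit values."""
    result = 0
    for i in range(64):
        if (b >> i) & 1:
            result ^= a << i    # XOR instead of ADD: no carries propagate
    return result


def pclmulqdq(src1, src2, imm8):
    """Model of pclmulqdq on 128-bit integers: pick one quadword from
    each operand according to imm8 bits 0 and 4, then multiply."""
    mask = (1 << 64) - 1
    a = (src1 >> 64) & mask if imm8 & 0x01 else src1 & mask
    b = (src2 >> 64) & mask if imm8 & 0x10 else src2 & mask
    return clmul64(a, b)


x = (7 << 64) | 3    # high qword 7, low qword 3
y = (5 << 64) | 2    # high qword 5, low qword 2
for name, imm in PSEUDO_OP_IMM8.items():
    print(name, pclmulqdq(x, y, imm))
```

With these inputs the four pseudo-ops give 6, 14, 15 and 27. The VEX forms (`vpclmullqlqdq` and so on) use the same four immediates.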

The MASM Forum

64 bit assembler => UASM Assembler Development => Topic started by: LiaoMi on January 21, 2018, 03:33:36 AM