Author Topic: Modular multiplicative inverse  (Read 439 times)

mabdelouahab

  • Member
  • ***
  • Posts: 428
Modular multiplicative inverse
« on: September 09, 2019, 07:31:44 AM »
What is the rule to get the multiplicative inverse of  X (8 bits)?
What is the condition that is contrived in 02BH to be the multiplicative inverse  for 015H?

Quote
This affine transformation is the sum of multiple rotations of the byte as a vector, where addition is the XOR operation:

    s = b ⊕ ( b ⋘ 1 ) ⊕ ( b ⋘ 2 ) ⊕ ( b ⋘ 3 ) ⊕ ( b ⋘ 4 ) ⊕ 63

where b represents the multiplicative inverse, ....
Rijndael S-box

mabdelouahab

  • Member
  • ***
  • Posts: 428
Re: Modular multiplicative inverse
« Reply #1 on: September 09, 2019, 05:39:56 PM »
Quote
a ≡ b (mod n) means :
    - a and b have the same remainder when divided by n
    - a = K.n + b
    - (a - b) is multiple of n

How it was applied to the Rijndael table?
017h≡05Fh? or 06Dh≡093h? ...

Code: [Select]
.data ;00 , 01   02   03   04   05   06   07   08   09   0A   0B   0C   0D   0E   0F
MulInv DB 001h,001h,08Dh,0F6h,0CBh,052h,07Bh,0D1h,0E8h,04Fh,029h,0C0h,0B0h,0E1h,0E5h,0C7h ;00
DB 074h,0B4h,0AAh,04Bh,099h,02Bh,060h,05Fh,058h,03Fh,0FDh,0CCh,0FFh,040h,0EEh,0B2h ;01
DB 03Ah,06Eh,05Ah,0F1h,055h,04Dh,0A8h,0C9h,0C1h,00Ah,098h,015h,030h,044h,0A2h,0C2h ;02
DB 02Ch,045h,092h,06Ch,0F3h,039h,066h,042h,0F2h,035h,020h,06Fh,077h,0BBh,059h,019h ;03
DB 01Dh,0FEh,037h,067h,02Dh,031h,0F5h,069h,0A7h,064h,0ABh,013h,054h,025h,0E9h,009h ;04
DB 0EDh,05Ch,005h,0CAh,04Ch,024h,087h,0BFh,018h,03Eh,022h,0F0h,051h,0ECh,061h,017h ;05
DB 016h,05Eh,0AFh,0D3h,049h,0A6h,036h,043h,0F4h,047h,091h,0DFh,033h,093h,021h,03Bh ;06
DB 079h,0B7h,097h,085h,010h,0B5h,0BAh,03Ch,0B6h,070h,0D0h,006h,0A1h,0FAh,081h,082h ;07
DB 083h,07Eh,07Fh,080h,096h,073h,0BEh,056h,09Bh,09Eh,095h,0D9h,0F7h,002h,0B9h,0A4h ;08
DB 0DEh,06Ah,032h,06Dh,0D8h,08Ah,084h,072h,02Ah,014h,09Fh,088h,0F9h,0DCh,089h,09Ah ;09
DB 0FBh,07Ch,02Eh,0C3h,08Fh,0B8h,065h,048h,026h,0C8h,012h,04Ah,0CEh,0E7h,0D2h,062h ;0A
DB 00Ch,0E0h,01Fh,0EFh,011h,075h,078h,071h,0A5h,08Eh,076h,03Dh,0BDh,0BCh,086h,057h ;0B
DB 00Bh,028h,02Fh,0A3h,0DAh,0D4h,0E4h,00Fh,0A9h,027h,053h,004h,01Bh,0FCh,0ACh,0E6h ;0C
DB 07Ah,007h,0AEh,063h,0C5h,0DBh,0E2h,0EAh,094h,08Bh,0C4h,0D5h,09Dh,0F8h,090h,06Bh ;0D
DB 0B1h,00Dh,0D6h,0EBh,0C6h,00Eh,0CFh,0ADh,008h,04Eh,0D7h,0E3h,05Dh,050h,01Eh,0B3h ;0E
DB 05Bh,023h,038h,034h,068h,046h,003h,08Ch,0DDh,09Ch,07Dh,0A0h,0CDh,01Ah,041h,01Ch ;0F

LiaoMi

  • Member
  • ****
  • Posts: 573
Re: Modular multiplicative inverse
« Reply #2 on: September 09, 2019, 06:46:14 PM »
Hi mabdelouahab,

4.2 Multiplication
5.1.1 SubBytes()Transformation
http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.197.pdf

AW

  • Member
  • *****
  • Posts: 2243
  • Let's Make ASM Great Again!
Re: Modular multiplicative inverse
« Reply #3 on: September 10, 2019, 12:09:01 AM »
Your question appears to be: How is the Multiplicative Inverse Table produced for the Rijndael characteristic 2 finite field which uses the following polynomial for multiplication: x^8 + x^4 + x^3 + x + 1?

There are a few ways, but you can try the Euclid-Wallis Algorithm, mentioned here on the chapter Simple algebraic field extensions.
If you follow the mentioned example where a= x^6+x^4+x+1 (binary 0101 0011 = 53h) you will get the inverse as x^7 + x^6 + x^3 + x (binary 1100 1010 = CAh). Confirm in the table that it is correct!

If you don't understand, I can't explain better because the person that explained it to me will not be available in the short term.  :biggrin:

manuel

  • Regular Member
  • *
  • Posts: 2
Re: Modular multiplicative inverse
« Reply #4 on: September 10, 2019, 03:47:28 AM »
Translated by Google sorry.
I think you want to calculate the multiplicative inverse of x (Mod 255) you need to calculate
MCDE with the values of x, 255.  If the mcde (x, 255) is different from 1, then x has no inverse
If the mcde (x, 255) == 1 then the algorithm also returns the inverse.
I have the function written in C # for the BigInteger class. If you want it, you tell me.

manuel

  • Regular Member
  • *
  • Posts: 2
Re: Modular multiplicative inverse
« Reply #5 on: September 10, 2019, 04:05:45 AM »
Mcd(2Bh,FFh) = 1 OK. The condition       inv(2B) = ACh 
Mcd(15h,FFh) = 3 != 1   No inverse exist.

AW

  • Member
  • *****
  • Posts: 2243
  • Let's Make ASM Great Again!
Re: Modular multiplicative inverse
« Reply #6 on: September 10, 2019, 04:17:48 AM »
Mcd(2Bh,FFh) = 1 OK. The condition       inv(2B) = ACh 
Mcd(15h,FFh) = 3 != 1   No inverse exist.

We are talking about a different thing,we are talking about the Finite Field GF(2^8). Things are much more complicated.
Introduction to Finite Fields.
http://www.cs.utsa.edu/~wagner/laws/FFM.html

mabdelouahab

  • Member
  • ***
  • Posts: 428
Re: Modular multiplicative inverse
« Reply #7 on: September 10, 2019, 06:54:30 AM »
Hi mabdelouahab,

4.2 Multiplication
5.1.1 SubBytes()Transformation
http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.197.pdf
Hi LiaoMi
"For  the  AES  algorithm,  this  irreducible polynomial is m(x) = x^8 + x^4 + x^3 + x +1 or {01}{1b} in hexadecimal notation."
These explain a lot, Thanks

If you don't understand, I can't explain better because the person that explained it to me will not be available in the short term.  :biggrin:

I think we will not need your friend, because we have LiaoMi  :biggrin:
But the algorithm did not work with me، It gives me in the first step : quotient=x +1 , but the author confirms that the first step : quotient=x^2 +1
 
"Moreover, div is an auxiliary function that computes the quotient of the Euclidean division."
Code: [Select]
function inverse(a, p)
    t := 0;     newt := 1;   
    r := p;     newr := a;   
    while newr ≠ 0
        quotient := r div newr
        (r, newr) := (newr, r - quotient * newr)
        (t, newt) := (newt, t - quotient * newt)
    if degree(r) > 0 then
        return "Either p is not irreducible or a is a multiple of p"
    return (1/r) * t

Code: [Select]
inversefunc proc a;,p
LOCAL _t,_newt,_r,_newr,_quotient
mov _t,0
mov _newt,1
mov eax,011BH ;p in my case p Always equal 011bh
mov _r,eax
mov eax,a
mov _newr,eax
@@:
xor edx,edx
mov eax,_r
mov ecx,_newr
div ecx
mov _quotient,eax ;        quotient := r div newr
mov edx,_newr
mov ebx,_r
mov _r,edx ; r:= newr
mul ecx
sub ebx,eax
mov _newr,ebx ; newr:=r - quotient * newr
mov eax,_quotient
imul _newt
mov ebx,_t
mov edx,_newt
mov _t,edx ; t:=newt
sub ebx,eax
mov _newt,ebx ; newt:=t - quotient * newt)
cmp _newr,0
jne @B
mov eax,_t
ret

inversefunc endp

AW

  • Member
  • *****
  • Posts: 2243
  • Let's Make ASM Great Again!
Re: Modular multiplicative inverse
« Reply #8 on: September 10, 2019, 01:20:47 PM »
@mabdelouahab
You can't divide polynomials that way!  :badgrin:
Check with Liao Mi, if you don't believe. :biggrin:

Here is another example of using the Euclid-Wallis Algorithm, to build the table.
We want the inverse of x^5+1 (which is 10 0001 = 21h). The result is x^6+x^5+x^3+x^2+x (which means 110 1110 = 6Eh)
Confirm in the table that the result is correct. I did, and can tel you it is!  :thumbsup:

mabdelouahab

  • Member
  • ***
  • Posts: 428
Re: Modular multiplicative inverse
« Reply #9 on: September 13, 2019, 01:31:30 AM »
Final code

Code: [Select]
include masm32rt.inc
include GF28.inc
v=053h
.code
Start:
invoke gf28GetMultiplicativeInverse,v
printf("Multiplicative inverse of {%xh}=%Xh\n",v,eax)
inkey
exit

end Start

GF28.inc:
Code: [Select]
tPolyType struct
coeff db 16 dup(0)
tPolyType ends
.data
  gf28IrreduciblePolynomial db 1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0 ; =11Bh

.code
bytetopoly proc b  :byte ,p :ptr poly8
mov esi,p
dec esi
xor eax,eax
mov al,b
mov ecx,-1
bytetopolystrt:
inc cl
cmp cl,8
je bytetopolyend__
inc esi
bt eax,ecx
jnc bytetopolyz00
mov byte ptr [esi],1
jmp bytetopolystrt
bytetopolyz00:
mov byte ptr [esi],0
jmp bytetopolystrt
bytetopolyend__:
ret
bytetopoly endp
wordtopoly proc b  :word ,p :ptr poly16
mov esi,p
dec esi
xor eax,eax
mov ax,b
mov ecx,-1
wordtopolystrt:
inc cl
cmp cl,16
je wordtopolyend__
inc esi
bt eax,ecx
jnc wordtopolyz00
mov byte ptr [esi],1
jmp wordtopolystrt
wordtopolyz00:
mov byte ptr [esi],0
jmp wordtopolystrt
wordtopolyend__:
ret
wordtopoly endp
polytobyte proc pb :ptr byte,p :ptr poly8
mov esi,p
mov edi,pb
mov byte ptr [edi],0
xor ecx,ecx
mov eax,1
polytobytestrt:
cmp byte ptr [esi],1
jne @F
or byte ptr [edi],al
@@:
inc esi
shl al,1
inc cl
cmp cl,8
jne polytobytestrt
ret
polytobyte endp
polytoword proc pb :ptr word,p :ptr poly16
mov esi,p
mov edi,pb
mov word ptr [edi],0
xor ecx,ecx
mov eax,1
polytowordstrt:
cmp byte ptr [esi],1
jne @F
or word ptr [edi],ax
@@:
inc esi
shl ax,1
inc cl
cmp cl,16
jne polytowordstrt
ret
polytoword endp
gf28resetpoly proc ptPolyType,deg
mov edi,ptPolyType
xor ecx,ecx
@@: mov byte ptr [edi+ecx],0
inc ecx
cmp ecx,deg
jne @B
ret
gf28resetpoly endp

gf28MovPolyToPoly proc pdes,psrc,l
mov edi,psrc
mov esi,pdes
xor ecx,ecx
@@:
mov al,[edi+ecx]
mov [esi+ecx],al
inc ecx
cmp ecx,l
jne @B
ret
gf28MovPolyToPoly endp

gf28GetDegreePoly proc uses esi edi ecx edx pfx
LOCAL d,notzero_
mov d,0
mov notzero_,0
mov esi,pfx

xor ecx,ecx
@@:
.if byte ptr [esi+ecx] !=0
mov notzero_,1
mov d,ecx
.endif
inc ecx
cmp ecx,15
jne @B
mov eax,d
.if notzero_
inc eax
.endif
ret

gf28GetDegreePoly endp

gf28AddPoly proc pfxgx:ptr,pfx:ptr,pgx:ptr,l
LOCAL tmppoly:tPolyType
invoke gf28resetpoly,addr tmppoly,16

lea esi,tmppoly
mov edi,pfx
mov edx,pgx
xor ecx,ecx

@@: mov al,[edx+ecx]
xor al,[edi+ecx]
mov [esi+ecx],al
inc ecx
cmp ecx,l
jne @B

invoke gf28MovPolyToPoly,pfxgx,addr tmppoly,16

ret
gf28AddPoly endp

gf28mulPoly proc pfxgx:ptr,pfx:ptr,pgx:ptr
LOCAL indx1,indx2
LOCAL tmppoly:tPolyType
invoke gf28resetpoly,addr tmppoly,16
lea esi,tmppoly
mov edi,pfx
mov indx1,0
gf28mulPolyf:
mov edx,pgx
mov indx2,0
gf28mulPolyg:
;*************************
mov al,[edx]
and al,[edi]
mov ecx,indx1
add ecx,indx2
xor al,[esi+ecx]
mov [esi+ecx],al
;*************************
inc edx
inc indx2
cmp indx2,8
jne gf28mulPolyg
inc edi
inc indx1
cmp indx1,8
jne gf28mulPolyf

lea edi,gf28IrreduciblePolynomial

gf28mulPolymod:
invoke gf28GetDegreePoly,addr tmppoly

sub eax,9
mov indx1,eax
.if sdword ptr eax < 0
invoke gf28MovPolyToPoly,pfxgx,addr tmppoly,16
ret
.endif
xor ecx,ecx
@@:
mov edx,indx1
add edx,ecx
mov al,[esi+edx]
xor al,[edi+ecx]
mov [esi+edx],al
inc ecx
cmp ecx,9
jne @B
jmp gf28mulPolymod

ret
gf28mulPoly endp
gf28divpoly proc  pfxgx:ptr,pfx:ptr,pgx:ptr,pmod:ptr
LOCAL indx1,indx2
LOCAL tmppolymod:tPolyType
LOCAL tmppolyqot:tPolyType
.if pfxgx==0 && pmod==0
ret
.endif
.if pfxgx != 0
invoke gf28resetpoly, addr tmppolyqot,16
lea esi,tmppolyqot
.endif
invoke gf28resetpoly,addr tmppolymod,16
lea ebx,tmppolymod
mov edi,pfx
xor ecx,ecx
@@:
mov al,[edi+ecx]
mov [ebx+ecx],al
inc ecx
cmp ecx,9
jne @B

mov edi,ebx
mov edx,pgx

mov indx2,rv(gf28GetDegreePoly,pgx)
gf28divmod:
mov indx1,rv(gf28GetDegreePoly,edi)
sub eax,indx2
mov indx1,eax
.if (sdword ptr eax < 0)
.if pmod != 0
invoke gf28MovPolyToPoly,pmod,addr tmppolymod,9
.endif
.if pfxgx != 0
invoke gf28MovPolyToPoly,pfxgx,addr tmppolyqot,9
.endif
ret
.endif
.if pfxgx !=0
mov byte ptr [esi+eax],1
.endif
push edx
push ebx
push edi
pop edi
pop ebx
pop edx

xor ecx,ecx
@@:
mov ebx,indx1
add ebx,ecx
mov al,[edi+ebx]
xor al,[edx+ecx]
mov [edi+ebx],al
inc ecx
cmp ecx,indx2
jne @B
jmp gf28divmod

ret

gf28divpoly endp

gf28invPoly proc pinv,pfx
LOCAL _t :tPolyType
LOCAL _r :tPolyType
LOCAL _newt :tPolyType
LOCAL _newr :tPolyType
LOCAL _quotient :tPolyType
LOCAL _tmp :tPolyType

invoke gf28MovPolyToPoly, addr _r,addr gf28IrreduciblePolynomial,16 ; r := p; invoke wordtopoly,011bh,addr _r       
invoke wordtopoly,0h,addr _t ;    t := 0;       
invoke wordtopoly,1h,addr _newt ;    newt := 1;   
invoke gf28MovPolyToPoly, addr _newr,pfx,16 ;    newr := a

@@:
invoke gf28GetDegreePoly,addr _newr
.if eax == 0
invoke gf28MovPolyToPoly, pinv,addr _t,16
ret
.endif
invoke gf28MovPolyToPoly, addr _tmp,addr _newr,16    
invoke gf28divpoly,addr _quotient,addr _r,addr _newr,addr _newr
invoke gf28MovPolyToPoly, addr _r,addr _tmp,16 ;    r := _newr;   

invoke gf28mulPoly,addr _tmp,addr _newt,addr _quotient
invoke gf28AddPoly,addr _tmp,addr _tmp,addr _t,8
invoke gf28MovPolyToPoly, addr _t,addr _newt ,8 ;    t := _newt; ;   
invoke gf28MovPolyToPoly,addr _newt, addr _tmp,8   
jmp @B ; loop
ret
gf28invPoly endp
gf28GetMultiplicativeInverse proc a:byte
LOCAL _tmpPoly :tPolyType
LOCAL _tmpInvPoly :tPolyType
LOCAL _byteInverse :byte

invoke gf28resetpoly,addr _tmpPoly,16
invoke bytetopoly,a,addr _tmpPoly
invoke gf28invPoly,addr _tmpInvPoly,addr _tmpPoly
invoke polytobyte,addr _byteInverse,addr _tmpInvPoly
xor eax,eax
mov al,_byteInverse
ret
gf28GetMultiplicativeInverse endp

AW

  • Member
  • *****
  • Posts: 2243
  • Let's Make ASM Great Again!
Re: Modular multiplicative inverse
« Reply #10 on: September 13, 2019, 03:58:43 AM »
It works but is a confusing mess, it appears like you converted it from some HLL.
All you need are 3 (three) ASM functions. Polynomial divide, GF(2^8) multiply and GF(2^8) add.

Code: [Select]
.model flat, C

.code

polyDivision proc uses esi edi num:dword, den:dword, quocPtr:ptr, remPtr:ptr
LOCAL remainder
LOCAL term

mov eax, num
mov remainder, eax
bsr esi, remainder
bsr edi, den
xor edx, edx

.while esi>=edi
mov ecx, esi
sub ecx, edi
mov eax, den
shl eax, cl
mov term, eax
mov eax, 1
shl eax, cl
xor edx, eax
mov eax, remainder
xor eax, term
mov remainder, eax

bsr esi, remainder
bsr edi, den
.endw
mov eax, quocPtr
mov dword ptr [eax], edx
mov eax, remPtr
mov ecx, remainder
mov dword ptr [eax], ecx

ret
polyDivision endp

polyMultiply proc mult1:byte, mult2:byte, res : ptr
LOCAL result : byte
mov result, 0
.while (mult1 != 0) && (mult2 !=0)
movzx eax,mult2
and al, 1
.if al !=0
movzx eax, result
movzx ecx, mult1
xor eax, ecx
mov result, al
.endif
movzx eax, mult1
and eax, 80h
.if al != 0
movzx eax, mult1
shl eax, 1
xor eax, 11Bh
mov mult1, al
.else
mov al, mult1
shl al, 1
mov mult1, al
.endif
mov al, mult2
shr al,1
mov mult2, al
.endw
mov eax, res
mov cl, result
mov byte ptr [eax], cl
ret
polyMultiply endp

polyAdd proc add1:byte, add2:byte
movzx eax, add1
movzx ecx, add2
xor eax, ecx
ret
polyAdd endp

end

When you use the above procedures with the following C you can make fast a beautiful table:

Code: [Select]
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>

extern
void polyDivision(unsigned num, unsigned den, unsigned* polyDivision, unsigned* remPtr);
void polyMultiply(unsigned char mult1, unsigned char mult2, unsigned char* res);
unsigned char polyAdd(unsigned char add1, unsigned char add2);

int main()
{
int i, j;
int t, newt, r, newr, quotient, remain;
unsigned char tempByte, tempNew, printValue, tempNewr;

for (i = 0; i < 16; i++)
printf("\t%.2x", i);

printf("\n\n");

for (i = 0; i < 16; i++)
{
printf("%.2x\t", i);
for (j = 0; j < 16; j++)
{
tempNewr = i * 16 + j;
newr = tempNewr;
r = 0x11B;
t = 0;
newt = 1;
if (newr>1)
{
while (newr != 1)
{
polyDivision(r, newr, &quotient, &remain);
r = newr;
newr = remain;
polyMultiply(quotient, newt, &tempByte);
tempNew = polyAdd(t, tempByte);
t = newt;
newt = tempNew;
}
}
if (tempNewr < 2)
printValue = tempNewr;
else
printValue = tempNew;
printf("%.2x\t", printValue);

}
printf("\n");
}

return 0;
}


Output:

Code: [Select]
        00      01      02      03      04      05      06      07      08      09      0a      0b      0c      0d     0e       0f

00      00      01      8d      f6      cb      52      7b      d1      e8      4f      29      c0      b0      e1     e5       c7
01      74      b4      aa      4b      99      2b      60      5f      58      3f      fd      cc      ff      40     ee       b2
02      3a      6e      5a      f1      55      4d      a8      c9      c1      0a      98      15      30      44     a2       c2
03      2c      45      92      6c      f3      39      66      42      f2      35      20      6f      77      bb     59       19
04      1d      fe      37      67      2d      31      f5      69      a7      64      ab      13      54      25     e9       09
05      ed      5c      05      ca      4c      24      87      bf      18      3e      22      f0      51      ec     61       17
06      16      5e      af      d3      49      a6      36      43      f4      47      91      df      33      93     21       3b
07      79      b7      97      85      10      b5      ba      3c      b6      70      d0      06      a1      fa     81       82
08      83      7e      7f      80      96      73      be      56      9b      9e      95      d9      f7      02     b9       a4
09      de      6a      32      6d      d8      8a      84      72      2a      14      9f      88      f9      dc     89       9a
0a      fb      7c      2e      c3      8f      b8      65      48      26      c8      12      4a      ce      e7     d2       62
0b      0c      e0      1f      ef      11      75      78      71      a5      8e      76      3d      bd      bc     86       57
0c      0b      28      2f      a3      da      d4      e4      0f      a9      27      53      04      1b      fc     ac       e6
0d      7a      07      ae      63      c5      db      e2      ea      94      8b      c4      d5      9d      f8     90       6b
0e      b1      0d      d6      eb      c6      0e      cf      ad      08      4e      d7      e3      5d      50     1e       b3
0f      5b      23      38      34      68      46      03      8c      dd      9c      7d      a0      cd      1a     41       1c

mabdelouahab

  • Member
  • ***
  • Posts: 428
Re: Modular multiplicative inverse
« Reply #11 on: September 13, 2019, 04:51:22 AM »
 :thumbsup:

nidud

  • Member
  • *****
  • Posts: 1717
    • https://github.com/nidud/asmc
Re: Modular multiplicative inverse
« Reply #12 on: September 13, 2019, 07:26:16 AM »
It works but is a confusing mess, it appears like you converted it from some HLL.

Indeed :biggrin:

The divide function should return a value so the last argument is not needed.

unsigned polyDivision(unsigned num, unsigned den, unsigned* polyDivision);

The function is 89 bytes and bench at 100106 cycles.

A reduced version with less confusing mess:

polyDivision proc uses edi num:dword, den:dword, quocPtr:ptr
    mov eax,num
    bsr edi,den
    bsr ecx,eax
    xor edx,edx
    .while ecx >= edi
        sub ecx,edi
        mov edi,den
        shl edi,cl
        xor eax,edi
        mov edi,1
        shl edi,cl
        xor edx,edi
        bsr ecx,eax
        bsr edi,den
    .endw
    mov ecx,quocPtr
    mov [ecx],edx
    ret
polyDivision endp

57 bytes and 63363 cycles.

You should try to avoid using memory operands if possible. Bear metal is shorter and faster.

polyMultiply proc mult1:byte, mult2:byte
    xor eax,eax
    mov dl,mult1
    mov dh,mult2
    .while dl && dh
        .if dh & 1
            xor al,dl
        .endif
        .if dl & 80h
            shl dl,1
            xor dl,1Bh
        .else
            shl dl,1
        .endif
        shr dh,1
    .endw
    ret
polyMultiply endp

106 byte and 100392 cycles versus 48 byte and 37749 cycles.

AW

  • Member
  • *****
  • Posts: 2243
  • Let's Make ASM Great Again!
Re: Modular multiplicative inverse
« Reply #13 on: September 13, 2019, 08:38:47 AM »
A reduced version with less confusing mess:

At least you understood the mess enough to roll up your sleeves and spend ten minutes looking for a way to save 2 milliseconds.  :thumbsup:

Quote
Bear metal is shorter and faster.

Do you mean beer metal?   :rolleyes:

HSE

  • Member
  • *****
  • Posts: 1081
  • <AMD>< 7-32>
Re: Modular multiplicative inverse
« Reply #14 on: September 13, 2019, 09:53:20 AM »
Bare  :biggrin: