SMAC : Compression with Arithmetic Coding+Context Mixing

qWord · December 02, 2013, 08:34:45 AM

Quote from: Jean-Marie on December 01, 2013, 12:15:40 PM
I'm replacing the Squash function by Qword's one, but I have some trouble to convert the Double in xmm0 to an integer.

yepp, only SDWORDs are supported for x32 - it may be possible to change the type in ADAPTIVEMODEL to REAL8? Even these integer <-> FP conversions are generally "slow" and should be avoided as far as possible.

dedndave · December 02, 2013, 08:52:16 AM

somewhere, i saw a snippet that used a shuffle to get it done

Jean-Marie · December 02, 2013, 04:00:00 PM

Sadly, we need integer versions of Upper & Lower limits since HighRange, LowRange & Ranges are kept as integers.
Once the most significant byte in High & Low matches, we output it as an encoded byte in our destination buffer.

I'm going to test a quick & dirty solution for the moment till I find something better.
Like:
movsd Temp,xmm0
fld Temp
Fistp qword ptr [edi]

Jean-Marie · December 02, 2013, 11:47:32 PM

Well that's impressive : the compression ratio is practically the same. Only 2 bytes are lost on enwik8, which is 100 000 000 bytes long! I clocked a gain of nearly 30 seconds on the same file. And the code can still be improved.
Much obliged, pals

Code Select


@@Squash:
	fstp	qword ptr Temp
	lea	edi,Ranges.ModelStart
	movsd	xmm0,Temp		;xmm0=weighted average
	movsd	xmm1,@@minus16
	movsd	xmm2,@@plus16
	comisd	xmm0,xmm1		;test if p<-15.999988
	jb	@@SQ2			;if so, bound it
	comisd	xmm0,xmm2		;test if p>15.999988
	ja	@@SQ3			;if so, bound it
@@SQ1:	movaps	xmm1,xmm0
	addsd	xmm0,@@16
	addsd 	xmm0,xmm0
	cvttsd2si ecx,xmm0
	imul 	ecx,48
	lea 	ecx,@@tbl1[ecx]
	movsd 	xmm0,[ecx+0*8]
	mulsd 	xmm0,xmm1
	addsd 	xmm0,[ecx+1*8]
	mulsd 	xmm0,xmm1
	addsd 	xmm0,[ecx+2*8]
	mulsd 	xmm0,xmm1
	addsd 	xmm0,[ecx+3*8]
	mulsd 	xmm0,xmm1
	addsd 	xmm0,[ecx+4*8]
	mulsd 	xmm0,xmm1
	addsd 	xmm0,[ecx+5*8]		;xmm0=2^32/(1+2^-p)
	movsd	Temp,xmm0		;st=bit 0 upper limit
	fld	Temp
	fistp	qword ptr [edi]		;store bit 0 upper limit
	mov	dword ptr [edi+4],-1	;Upper limit of bit 1 always FFFFFFFFH
...
@@SQ2:	movaps	xmm0,xmm1		;p=-15.999988
	jmp	@@SQ1
@@SQ3:	movaps	xmm0,xmm2		;p=+15.999988
	jmp	@@SQ1
	INCLUDE SQUASHTB.INC
@@TwoPower32	DQ 4294967296.0		;2^32 (float)
@@16		DQ 16.0
@@plus16	DQ 15.999988
@@minus16	DQ -15.999988

qWord · December 03, 2013, 04:31:02 AM

Comparing your version from the first post with the latest, the speed grows about 6% (Intel i7QM, file=enwik8). :t

Are the compares -16>x>16 really needed? Also, why 15.999988 and not 16?

regards, qWord

Jean-Marie · December 03, 2013, 10:24:51 AM

I have been quite inaccurate in my post #2. Actually, p generally ranges from -15,999978 to +15,999978 (cf enclosed picture). But in some rare cases, because the weights are unbounded, p might fall out from this range so we need to correct it.
The objective is to have a value of Squash(p) between [10000H; FFFF0000H], otherwise the calculation of the ranges would be subject to an underflow or overflow situation.
This is why I used those weird decimals : I calculated that Squash(-15.99998) returns 10000H, and Squash(+15.99998) returns FFFF0000H.
Squash(-16) returns FFFFH, and Squash(+16) returns a NAN, but may be this can be corrected by slightly modifying the tables.

Out of curiosity, qWord, what are the steps to transform the Squash formula in (Taylor?) polynomials approximation?
I'm wondering if a step of 1 would be sufficient.

qWord · December 03, 2013, 12:12:16 PM

Quote from: Jean-Marie on December 03, 2013, 10:24:51 AMOut of curiosity, qWord, what are the steps to transform the Squash formula in (Taylor?) polynomials approximation?
I'm wondering if a step of 1 would be sufficient.

For the polynomials I use the minimax approximation, which minimize the maximal error for a specific range. It would also work with Taylor series, but that require higher degree polynomials and/or finer intervals to get the same result.
In the attachment some more tables (range -16<=x<=16, degree 3, 5 and 7; step 1/2, 1 and 2). The maximal relative error can be found at top of each file.

Jean-Marie · December 03, 2013, 02:10:03 PM

Thanks a bunch qWord : I'm gonna look into all this :icon14:

The MASM Forum

News:

SMAC : Compression with Arithmetic Coding+Context Mixing

qWord

dedndave

Jean-Marie

Jean-Marie

qWord

Jean-Marie

qWord

Jean-Marie