How difficult is it to build a shrinking/deflating routine?

Started by frktons, December 15, 2012, 10:58:49 AM

Donkey

For that you could set the most significant bit of the nRepeats field and put a 7-bit index in the char field. For example, "mov eax,0" would reduce to the following, given that an index entry is command #1 and it is the first entry in the index:

0000000:1000001:0000001

All index entries would reduce to 21 bits total length regardless of size (plus the actual index entry itself and the lookup table).
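
A minimal sketch of the 21-bit token layout described above, written in Python for readability rather than MASM. The function names `pack_run` and `pack_index` are hypothetical, and the sketch assumes the input is 7-bit text in which the value 0 never occurs, so that 0000000 can serve as the escape marker:

```python
ESCAPE = 0  # 7-bit marker 0000000 that introduces a token (assumed unused in the text)

def pack_run(n_repeats: int, char: int) -> str:
    """Run token: escape, then 0 + 6-bit repeat count, then the 7-bit character."""
    if not 3 <= n_repeats <= 63:
        raise ValueError("run length must be between 3 and 63")
    if not 0 < char < 128:
        raise ValueError("char must be a non-zero 7-bit value")
    return f"{ESCAPE:07b}:{n_repeats:07b}:{char:07b}"

def pack_index(command: int, index: int) -> str:
    """Index token: escape, then 1 + 6-bit command number, then a 7-bit index."""
    if not 0 <= command <= 63:
        raise ValueError("command must fit in 6 bits")
    if not 0 <= index <= 127:
        raise ValueError("index must fit in 7 bits")
    field = 0b1000000 | command  # set the most significant bit of the nRepeats field
    return f"{ESCAPE:07b}:{field:07b}:{index:07b}"

if __name__ == "__main__":
    print(pack_index(1, 1))         # 0000000:1000001:0000001, the "mov eax,0" example
    print(pack_run(5, ord("A")))    # five repeats of "A"
```

The colons are only there to mirror the notation in the post; a real packer would write the three fields as 21 consecutive bits.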
"Ahhh, what an awful dream. Ones and zeroes everywhere...[shudder] and I thought I saw a two." -- Bender
"It was just a dream, Bender. There's no such thing as two". -- Fry
-- Futurama

Donkey's Stable

frktons

Quote from: jj2007 on December 16, 2012, 02:36:06 PM

Yes, the source is available here. If you can convince your C compiler to translate LZMA\C\Util\Lzma\LzmaUtil.c into a DLL or LIB, then I will offer you the CompressBuffer function.

My Visual Studio 2010 refuses to convert any "project" older than two years or so, sometimes with loads of cryptic error messages, sometimes with a long silence. Pelles C utters "internal errors" without specifying what's wrong, so it's the usual C mess. But try your luck.

And now I start to understand your reasons. They probably use a port of GCC for
Windows. I'm not going to ask you to write the CompressBuffer function; there is
no reason for that.
If you need to know which compiler they used, for any reason, I think it is quite easy
for you to find out. You know that better than I do.  :t

Quote from: jj2007 on December 16, 2012, 02:42:41 PM
Quote from: Donkey on December 16, 2012, 02:15:47 PM
For repeat characters, any run of 3 or more repeated characters can be replaced by 0000000:nRepeats:char, where nRepeats would be 6 bits preceded by 0 (allowing up to 63 repeats) and char would be 7 bits.

Go a step further and set the rule "everything above 127 is an index into the dictionary of frequently used words"

So the three-byte string chr$(3+128, 32, 7+128) becomes <invoke lstrcpy>, 14 bytes

1 mov
2 add
3 invoke
4 proc
5 endp
6 call
7 lstrcpy

This is more or less the approach of archivers nowadays - find long frequently used byte sequences.
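
A rough sketch of that rule in Python, assuming plain 7-bit ASCII input and the seven-word dictionary listed above. The names `WORDS`, `shrink` and `expand` are made up for illustration, and the greedy longest-match search is just one simple way to pick tokens:

```python
# Dictionary from the post: byte value 128 + n stands for word n.
WORDS = {1: "mov", 2: "add", 3: "invoke", 4: "proc",
         5: "endp", 6: "call", 7: "lstrcpy"}

def expand(data: bytes) -> str:
    """Decode: bytes above 127 are dictionary indexes, the rest are literal ASCII."""
    return "".join(WORDS[b - 128] if b > 127 else chr(b) for b in data)

def shrink(text: str) -> bytes:
    """Encode: replace dictionary words with one byte each, longest word first."""
    text.encode("ascii")  # raises UnicodeEncodeError if the input is not 7-bit ASCII
    by_length = sorted(WORDS.items(), key=lambda kv: len(kv[1]), reverse=True)
    out = bytearray()
    i = 0
    while i < len(text):
        for number, word in by_length:
            if text.startswith(word, i):
                out.append(128 + number)
                i += len(word)
                break
        else:
            out.append(ord(text[i]))  # no dictionary word here, keep the literal
            i += 1
    return bytes(out)

if __name__ == "__main__":
    packed = bytes([3 + 128, 32, 7 + 128])
    print(expand(packed))                 # invoke lstrcpy
    print(len(shrink("invoke lstrcpy")))  # 3
```

With single-byte tokens the dictionary is limited to 128 entries; real archivers build their dictionary from the data instead of using a fixed word list.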

Do you remember the simple ASCII program I wrote 2 years ago?
One of the routines drew boxes on the screen, and another displayed
all the numbers from 1 to 255 with their associated chars. In that case a couple of
procs created data about 1200*4 bytes long, and the routines themselves
were no bigger than 100 bytes.

In my opinion, if you use a standard optimized algo you can get good
performance in the average case, but sometimes you have to choose alternative
approaches, if you don't mind doing it and are not in a hurry.
There are only two days a year when you can't do anything: one is called yesterday, the other is called tomorrow, so today is the right day to love, believe, do and, above all, live.

Dalai Lama

qWord

Quote from: jj2007 on December 16, 2012, 02:36:06 PM
Yes, the source is available here. If you can convince your C compiler to translate LZMA\C\Util\Lzma\LzmaUtil.c into a DLL or LIB, then I will offer you the CompressBuffer function.

My Visual Studio 2010 refuses to convert any "project" older than two years or so, sometimes with loads of cryptic error messages, sometimes with a long silence. Pelles C utters "internal errors" without specifying what's wrong, so it's the usual C mess. But try your luck.
You also need to compile the modules in LZMA\C. A compiled but untested version is in the attachment.
MREAL macros - when you need floating point arithmetic while assembling!

hutch--

If general purpose compression is what you are after and you are not trying to design this from scratch yourself, have a look at the masm32 example that uses Jibz's aPlib compression library. A quick scruffy test on WINDOWS.INC compressed it from 977426 bytes down to 207882 bytes. It is primarily designed for binary compression but works fine on plain text.

exampl02\appack\appack.asm

It's all there and works.

Here is a link to Jibz's site where you can download his compression software legally and for free.

http://www.ibsensoftware.com/index.html


jj2007

Quote from: qWord on December 16, 2012, 04:24:01 PM
You also need to compile the modules in LZMA\C. A compiled but untested version is in the attachment.
Thanks. With PeView, it throws an exception, unfortunately.

In the meantime, I could build the lib (attached) with Pelles C, with some "warning #2117: Old-style function definition for 'GetError'" messages, plus one that looks a bit more serious (see attachment):
D:\LZMA_SDK\C\Threads.c(37): warning #2018: Undeclared function '_beginthreadex'; assuming 'extern' returning 'int'.
But I have no clue how to get from there to the \util result. There are two makefiles around, see attachment, but Pelles C help does not know the word "makefile" ::)

jj2007

Quote from: hutch-- on December 16, 2012, 05:55:56 PM
.. Jibz's aPlib compression library. A quick scruffy test on WINDOWS.INC compressed it from 977426 bytes down to 207882 bytes.

Hutch,

Sure it works, but as a forum of "aliens" we are morally obliged to achieve the best compression rate the world has ever seen, with small & fast assembler code of course ;-)

dedndave


frktons

Quote from: hutch-- on December 16, 2012, 05:55:56 PM
If general purpose compression is what you are after and you are not trying to design this from scratch yourself, have a look at the masm32 example that uses Jibz's aPlib compression library. A quick scruffy test on WINDOWS.INC compressed it from 977426 bytes down to 207882 bytes. It is primarily designed for binary compression but works fine on plain text.

exampl02\appack\appack.asm

It's all there and works.

Here is a link to Jibz's site where you can download his compression software legally and for free.

http://www.ibsensoftware.com/index.html



Thanks Hutch, I suspected there was something like that packed inside MASM32.

Quote from: jj2007 on December 16, 2012, 08:41:42 PM

Hutch,

Sure it works, but as a forum of "aliens" we are morally obliged to achieve the best compression rate the world has ever seen, with small & fast assembler code of course ;-)

:dazzled:  :t

Quote from: dedndave on December 16, 2012, 09:35:11 PM
aliens - you mean like sergiu...


:greenclp:
dedndave


wjr

Quote from: jj2007 on December 16, 2012, 08:28:09 PM
Thanks. With PeView, it throws an exception, unfortunately.

The file LzmaUtil.dll has long section names (debug related) which, although allowed in OBJ files, shouldn't be making it into an EXE/DLL file (according to the Microsoft specs).

I fixed this in the latest version of PEview 0.9.9 which does not throw an exception (however, it does not display these long section names).

frktons

Quote from: dedndave on December 17, 2012, 04:05:06 AM
don't know if you remember him, Frank
he promised us 100 to 1 compression - lol

http://www.masmforum.com/board/index.php?topic=13454.0

Yes Dave, I remember him. I also partially agreed with him because
under some conditions you really can get a compression ratio of 100:1, but
I didn't buy the whole story, of course.
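
To illustrate the "some conditions" part, here is a small Python sketch using the standard zlib module (not the scheme discussed in this thread). It assumes the extreme case of a buffer holding a single repeated byte, and compares it with random bytes, which a general-purpose compressor cannot shrink at all. The helper name `ratio` is made up for the example:

```python
import os
import zlib

def ratio(data: bytes) -> float:
    """Original size divided by compressed size, using zlib at its highest level."""
    return len(data) / len(zlib.compress(data, 9))

# One byte repeated 100000 times: about as compressible as data can get.
repetitive = b"A" * 100_000

# Random bytes: no redundancy to exploit, so the "compressed" output is no smaller.
random_like = os.urandom(100_000)

if __name__ == "__main__":
    print(f"repetitive:  {ratio(repetitive):.1f}:1")
    print(f"random-like: {ratio(random_like):.3f}:1")
```

The first ratio comes out well above 100:1 and the second at roughly 1:1, so a fixed 100:1 promise can only hold for inputs chosen to be extremely redundant.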
jj2007

Quote from: wjr on December 17, 2012, 05:31:21 AM
I fixed this in the latest version of PEview 0.9.9 which does not throw an exception (however, it does not display these long section names).

Hi Wayne,
Thanks for this. Do you have a download link for the binary? Google has no clue :bgrin:


qWord

Quote from: wjr on December 17, 2012, 05:31:21 AM
The file LzmaUtil.dll has long section names (debug related) which, although allowed in OBJ files, shouldn't be making it into an EXE/DLL file (according to the Microsoft specs).
gcc - you get what you pay for  :icon_mrgreen:

hutch--

 :biggrin:

> gcc - you get what you pay for  :icon_mrgreen:

:P