How difficult is it to build a shrinking/deflating routine?

Started by frktons, December 15, 2012, 10:58:49 AM


dedndave

RtlDecompressBuffer
http://msdn.microsoft.com/en-us/library/windows/hardware/ff552191%28v=vs.85%29.aspx

RtlCompressBuffer
http://msdn.microsoft.com/en-us/library/windows/hardware/ff552127%28v=vs.85%29.aspx


ok
use RtlCompressBuffer to create a raw binary
add it to your program as a resource
use the LoadFromRsrc routine i posted above to get it into a buffer
use RtlDecompressBuffer to decompress it

BANG - 50%
which is, you guessed it, $50

of course, you could just create a DB list like you have of the compressed binary
then, you don't need my routine - just put it in the .DATA section
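The Rtl* routines above are Windows-only and use the LZNT1 format, so here is just an illustration of the same pattern (compress once at build time, embed the blob, decompress at runtime) sketched in Python with zlib as a stand-in for the Rtl* calls:

```python
import zlib

def compress_resource(data: bytes) -> bytes:
    """Build-time step: compress raw bytes for embedding.
    Stand-in for RtlCompressBuffer (zlib's format differs from LZNT1)."""
    return zlib.compress(data, level=9)

def decompress_resource(blob: bytes) -> bytes:
    """Runtime step: recover the original bytes from the embedded blob.
    Stand-in for RtlDecompressBuffer."""
    return zlib.decompress(blob)

# Highly repetitive data compresses very well
original = b"blabal" * 100
blob = compress_resource(original)
assert decompress_resource(blob) == original
print(len(original), "->", len(blob))
```

The ratio depends entirely on how repetitive the data is; a string like the one being discussed here shrinks far below 50%.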

dedndave

you will want to use
        INCLUDE    \masm32\include\masm32rt.inc
        INCLUDE    \masm32\include\ntoskrnl.inc
        INCLUDELIB \masm32\lib\ntoskrnl.lib

frktons

Yes Dave, I've thought about these systems as well.

But where is the fun in using precooked meals?

Here we have something that will help in the process of
building our own code.  :lol:
There are only two days a year when you can't do anything: one is called yesterday, the other is called tomorrow, so today is the right day to love, believe, do and, above all, live.

Dalai Lama

dedndave

ok
write your own code
hopefully, it is smaller than the 606 bytes you are trying to save   :biggrin:
that's the advantage of using the API function

frktons

Quote from: dedndave on December 15, 2012, 01:50:35 PM
ok
write your own code
hopefully, it is smaller than the 606 bytes you are trying to save   :biggrin:
that's the advantage of using the API function

The point is, I'm thinking about multiple long strings, so the code will
earn its 50% many times inside the program.

nidud

deleted

frktons

Hi nidud.

Will you also post the algo to obtain:

bcode db 0,6,5*6 ; 'blabalblabalblabalblabalblabal'
db -3,3,3 ; 'bal'
db -3,2,2 ; 'ba'
db 6,2,2 ; 'hn'
db -4,2,2 ; 'ba'
db 8,3,3 ; 'gbg'
db -2,2,2 ; 'bg'
db 11,3,3 ; 'abg'
db -10,3,3*3 ; 'bagbagbag'
db -56,56,2*56 ; repeat line * 2
db -1 ; 45 byte


I'm missing the compression phase.
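nidud's exact token format isn't documented here (his post was deleted), but the db list above reads like LZ77-style back-references: a distance, a match length, and a total output count that can exceed the distance (which is how `-10,3,3*3` expands to 'bagbagbag'). A generic greedy compressor and decompressor in that flavor might look like this sketch; the names and token layout are illustrative, not nidud's:

```python
def lz77_compress(data: bytes, window: int = 255, min_len: int = 3):
    """Greedy LZ77: emit either a literal byte or a (distance, length) back-reference.
    Overlapping matches are allowed, so runs compress like RLE for free."""
    out, i = [], 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for j in range(max(0, i - window), i):
            dist, l = i - j, 0
            # source wraps modulo the distance when the match overlaps position i
            while i + l < len(data) and data[j + (l % dist)] == data[i + l]:
                l += 1
            if l > best_len:
                best_len, best_dist = l, dist
        if best_len >= min_len:
            out.append(("ref", best_dist, best_len))
            i += best_len
        else:
            out.append(("lit", data[i]))
            i += 1
    return out

def lz77_decompress(tokens) -> bytes:
    out = bytearray()
    for t in tokens:
        if t[0] == "lit":
            out.append(t[1])
        else:
            _, dist, length = t
            for _ in range(length):      # byte-by-byte copy handles overlap
                out.append(out[-dist])
    return bytes(out)
```

On 'blabal' repeated five times this emits six literals and a single overlapping back-reference, so the compression phase is mostly a question of how hard you search for the longest match.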

dedndave

LZW data compression is not rocket science
basically, you break up the data and replace sections with tokens
in the token table, you put the original "string"
i think the token table can only hold something like ~254 tokens
when it gets full, and you need to create a new token, you trash it and start a new token table
the tokens then replace the strings in the compressed data stream
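The steps above can be made concrete with a minimal textbook LZW pair. One caveat on the numbers: common implementations cap the table at a fixed code width (e.g. 4096 entries at 12 bits) rather than ~254, but the reset-when-full idea is the same. This is a generic sketch, not any particular library's API:

```python
def lzw_compress(data: bytes):
    """Textbook LZW: the table starts with all 256 single bytes and
    grows one entry per emitted code."""
    table = {bytes([i]): i for i in range(256)}
    next_code, out, w = 256, [], b""
    for b in data:
        wb = w + bytes([b])
        if wb in table:
            w = wb                      # keep extending the current phrase
        else:
            out.append(table[w])        # emit the longest known phrase
            table[wb] = next_code       # learn the new phrase
            next_code += 1
            w = bytes([b])
    if w:
        out.append(table[w])
    return out

def lzw_decompress(codes):
    """Rebuilds the same table on the fly, so no table is transmitted."""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    prev = table[codes[0]]
    out = bytearray(prev)
    for code in codes[1:]:
        entry = table[code] if code in table else prev + prev[:1]  # cScSc corner case
        out.extend(entry)
        table[next_code] = prev + entry[:1]
        next_code += 1
        prev = entry
    return bytes(out)
```

Note the decompressor never receives the token table; it reconstructs it from the code stream itself, which is the part that makes LZW attractive for small embedded data.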

so - you can write your own if you like
there are also pre-written libraries like gzip, etc
i seem to recall someone making a LIB and INC for gzip a while back
don't remember if it was in the new forum or the old one

frktons

Quote from: dedndave on December 15, 2012, 02:03:58 PM
LZW data compression is not rocket science
so - you can write your own if you like
there are also pre-written libraries like gzip, etc
i seem to recall someone making a LIB and INC for gzip a while back
don't remember if it was in the new forum or the old one

Probably it was on the old forum. By the way, as you say, I can write a simplified
algo that suits my needs. I've thought about it on and off over the last two years,
whenever the matter came back to mind, and I've got some ideas to try.
And I think writing some code shouldn't hurt in the process of learning a bit of
Assembly :lol:

dedndave

i updated my previous post
LZW compression...

Quote
basically, you break up the data and replace sections with tokens
in the token table, you put the original "string"
i think the token table can only hold something like ~254 tokens
when it gets full, and you need to create a new token, you trash it and start a new token table
the tokens then replace the strings in the compressed data stream

frktons

I think the algo was translated by Jochen:
http://www.masmforum.com/board/index.php?topic=15470.0

Quote from: dedndave on December 15, 2012, 02:12:01 PM
LZW compression...

basically, you break up the data and replace sections with tokens
in the token table, you put the original "string"
i think the token table can only hold something like ~254 tokens
when it gets full, and you need to create a new token, you trash it and start a new token table
the tokens then replace the strings in the compressed data stream

Yes Dave, this is one of the few things I understand about compression methods.
And I think it'll be enough for the time being.   :t

nidud

deleted

dedndave

yes - the trick is to pick good strings to tokenize
token selection should match the type of data
that's the key to getting good compression ratios

plain text like this is probably the easiest data type to work with (except for a shit-load of zeros)
it's somewhat predictable   :P

frktons

Quote from: nidud on December 15, 2012, 02:19:38 PM

:biggrin:

There are many algos to do that
You scan the output buffer for duplicate strings

The most common is to use { WORD offset, BYTE length }
The minimum string length is then 3 bytes

The string buffer is also converted to bits:
a = 0
b = 10
l = 1100
h = 1101
n = 1110
g = 1111

bla = 10 1100 0 = 7 bits


OK nidud, if you feel like doing it, show me your idea in the code
I posted, and tell me what % you get shrinking the string.
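nidud's letter table quoted above is a prefix (Huffman-style) code: no code is a prefix of another, so the bit stream decodes unambiguously without separators. An encoder/decoder for exactly that table can be sketched as:

```python
# nidud's table from the quote above: frequent letters get short codes
CODES = {"a": "0", "b": "10", "l": "1100", "h": "1101", "n": "1110", "g": "1111"}

def encode(text: str) -> str:
    """Concatenate the variable-length codes into one bit string."""
    return "".join(CODES[ch] for ch in text)

def decode(bits: str) -> str:
    """Greedy prefix decode: accumulate bits until they match a code."""
    rev = {v: k for k, v in CODES.items()}
    out, cur = [], ""
    for bit in bits:
        cur += bit
        if cur in rev:          # safe because no code is a prefix of another
            out.append(rev[cur])
            cur = ""
    return "".join(out)

assert encode("bla") == "1011000"            # the 7-bit example from the quote
assert decode(encode("blabalbag")) == "blabalbag"
```

A real implementation would pack the bit string into bytes, but the 7-bit 'bla' example checks out: 10 + 1100 + 0.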

Quote from: dedndave on December 15, 2012, 02:32:13 PM
yes - the trick is to pick good strings to tokenize
token selection should match the type of data
that's the key to getting good compression ratios

That's also the most difficult thing to do in my opinion.