My point being: check the file format and switch/case to a different entropy estimate depending on bit depth? Simplest with BMPs.
Maybe it needs a different algorithm to handle JPG's lossy compression?
Good point; it's hard to answer because it depends on the data rather than on a specific algorithm.
Try a one-color-only figure and save it as JPG and as GIF: the GIF will compress better.
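You can see the same effect with any lossless coder: perfectly uniform data collapses to almost nothing, while noise barely shrinks at all. A minimal sketch using Python's `zlib` as a stand-in lossless compressor (the variable names are just for illustration):

```python
import os
import zlib

flat = bytes(100_000)        # 100 kB of a single byte value (like a one-color image)
noisy = os.urandom(100_000)  # 100 kB of random bytes (no structure to exploit)

# The uniform buffer compresses to a tiny fraction of its size;
# the random buffer stays at (or slightly above) its original size.
print(len(zlib.compress(flat)))
print(len(zlib.compress(noisy)))
```

This is why the file format matters less than the data inside it: the achievable ratio is bounded by the entropy of the source.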
Depending on the data, a Huffman tree can do as well as an arithmetic encoder; on other data it can fall short.
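The gap shows up with skewed distributions: Huffman must spend at least one whole bit per symbol, while the entropy (which arithmetic coding approaches) can be far below one bit. A small sketch, with a toy Huffman construction written for this example:

```python
import heapq
from math import log2

def entropy(probs):
    """Shannon entropy in bits per symbol."""
    return -sum(p * log2(p) for p in probs if p > 0)

def huffman_avg_length(probs):
    """Average code length of a Huffman code for the given distribution."""
    lengths = [0] * len(probs)
    # Heap entries: (probability, unique tiebreak, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    tiebreak = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every merge adds one bit to all members
        heapq.heappush(heap, (p1 + p2, tiebreak, s1 + s2))
        tiebreak += 1
    return sum(p * lengths[i] for i, p in enumerate(probs))

# A heavily skewed two-symbol source:
skewed = [0.95, 0.05]
print(entropy(skewed))             # ~0.286 bits/symbol
print(huffman_avg_length(skewed))  # 1.0 bit/symbol -- Huffman's floor
```

On a near-uniform distribution the two would be almost identical, which is why the answer depends on the data.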
JPEG has two versions, a lossy and a lossless one. When the JPEG format was released, arithmetic coding was heavily patented, so they used Huffman coding; those patents have since expired. I have seen people export a compressed JPEG image as raw data, recompress it with an arithmetic encoder, and restructure it back into JPEG form, with some gain. JPEG actually supports arithmetic coding, but it went largely unused because of the patents.
There are a lot of compression algorithms: many Lempel-Ziv variations (Zip, Rar, ...), and transforms like Burrows-Wheeler (used by bzip2) that are useful for text files, followed by a move-to-front pass so the most recently seen symbols sit at the start of a 'pseudo' alphabet, with the output of all this going to Huffman or arithmetic coding.
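The BWT + move-to-front pipeline can be sketched in a few lines. This is the naive sorted-rotations construction (real implementations use suffix arrays), with a `'\0'` sentinel assumed to terminate the string:

```python
def bwt(s):
    """Burrows-Wheeler transform via sorted rotations ('\\0' as sentinel)."""
    s = s + "\0"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)

def move_to_front(data, alphabet):
    """Encode each symbol as its index in a table that moves each
    used symbol to the front, so recently seen symbols get small ranks."""
    table = list(alphabet)
    out = []
    for c in data:
        i = table.index(c)
        out.append(i)
        table.insert(0, table.pop(i))
    return out

transformed = bwt("banana")
ranks = move_to_front(transformed, sorted(set(transformed)))
# The BWT groups equal characters together, so the MTF ranks are mostly
# small numbers and zeros -- easy prey for a Huffman or arithmetic coder.
print(transformed.replace("\0", "$"), ranks)
```

The transform itself compresses nothing; it only reorders the data so the entropy coder at the end of the chain sees long runs of low values.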
Others start with a specific dictionary of symbol frequencies, then digraphs, trigraphs, and so on, raising the context order and trying a dictionary prediction (prediction by partial matching, PPM), and feed the result to an arithmetic encoder.
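The core idea of a context model can be shown with the simplest case, an order-1 (digraph) model. This is only the counting step of a PPM-style coder, leaving out the escape mechanism and the arithmetic-coding backend:

```python
from collections import Counter, defaultdict

def order1_model(text):
    """Count digraph frequencies: for each context character, how often
    each following character occurs. A PPM coder would turn these counts
    into probabilities (with an escape symbol for unseen characters) and
    hand them to an arithmetic encoder."""
    contexts = defaultdict(Counter)
    for prev, cur in zip(text, text[1:]):
        contexts[prev][cur] += 1
    return contexts

model = order1_model("the theory of the thing")
# In this sample every 't' is followed by 'h', so after seeing 't' the
# model is nearly certain of the next symbol and the coder can spend
# far less than 8 bits on it.
print(model["t"].most_common())  # [('h', 4)]
```

Raising the order (trigraphs and beyond) sharpens the predictions at the cost of memory and of more frequent escapes to lower orders.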
Arithmetic coding gets very close to the entropy; a few bits are lost because of the magnitude problem, that is, the carry. It's not viable to hold the whole file in memory and propagate a carry across a lot of bytes. Generally we work in a register-sized window, like 32 or 64 bits, and handle only the carries that would overflow that register. We can predict that this may happen when the two leftmost bits of the interval bounds are opposite to each other, and take action at that point.
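That "take an action" step is the classic renormalization loop. A sketch for a 32-bit register, with illustrative names (real codecs differ in the details): when the interval sits entirely in one half, the top bit is decided and shifted out; when `low` and `high` straddle the midpoint with opposite second bits, the bit is undecided because a carry may still flip it, so it is counted as pending:

```python
HALF = 1 << 31     # top bit of the 32-bit register
QUARTER = 1 << 30  # second bit

def renormalize(low, high, out_bits, pending=0):
    """Shift out decided bits while the interval [low, high] is confined
    to one half of the register; count undecided (carry-pending) bits."""
    while True:
        if high < HALF:                              # lower half: bit is 0
            out_bits.append(0)
            out_bits.extend([1] * pending); pending = 0
        elif low >= HALF:                            # upper half: bit is 1
            out_bits.append(1)
            out_bits.extend([0] * pending); pending = 0
            low -= HALF; high -= HALF
        elif low >= QUARTER and high < HALF + QUARTER:
            pending += 1                             # straddles midpoint: defer
            low -= QUARTER; high -= QUARTER
        else:
            return low, high, pending                # interval wide enough again
        low <<= 1
        high = (high << 1) | 1                       # keep high's new low bit at 1

bits = []
print(renormalize(0, HALF - 1, bits), bits)  # one 0 bit shifted out
```

Once the deferred bit is finally decided, the pending bits are emitted as its opposites, which is exactly how the carry is resolved without touching bytes already written.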
The tendency is to move toward artificial intelligence: teach a network to learn about the data and its own prediction errors and correct itself, with the output of that model layer going into some data compression algorithm.
It's hard to answer because if you change the input file being tested, one compressor can beat another. The assumption here is that the same data type (text, for example) is used as the source input.