ZenHAX
http://zenhax.com/

How to recognize the compression algorithms with your eyes
http://zenhax.com/viewtopic.php?f=4&t=27
Page 1 of 2

Author:  aluigi [ Thu Aug 07, 2014 5:29 pm ]
Post subject:  How to recognize the compression algorithms with your eyes

The example is a PNG used in the previous thread (the one showing "QUICKBMS") and attached to this thread.

zlib
It starts with 0x78 (rarely also with 0x58).
Use offzip to test if it's really zlib.
Code:
78 da ed 8f 6b 48 53 61 1c c6 df 65 35 ed 32 8d   x...kHSa...e5.2.
4a 49 9d 65 20 88 93 2d 13 ba a0 53 ab 85 5a b9   JI.e ..-...S..Z.
96 49 a2 76 d0 32 d7 cd a8 b9 72 a9 1d 2d fd 60   .I.v.2....r..-.`
65 84 a5 cd 4a 26 eb 2a 76 d9 64 5e 32 87 a7 bc   e...J&.*v.d^2...
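
Besides running offzip, the two-byte header can be checked directly. The following is only a minimal sketch based on the zlib header rules of RFC 1950 (method nibble 8, window exponent at most 7, and the two bytes read as a big-endian number being a multiple of 31); the function name looks_like_zlib is made up for this example.

```python
def looks_like_zlib(data: bytes) -> bool:
    """Return True if the first two bytes form a valid zlib (RFC 1950) header."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    if cmf & 0x0F != 8:      # low nibble: compression method, 8 = deflate
        return False
    if cmf >> 4 > 7:         # high nibble: window size exponent, 7 = 32 KiB
        return False
    # FCHECK bits in the second byte make the 16-bit value a multiple of 31
    return ((cmf << 8) | flg) % 31 == 0


print(looks_like_zlib(bytes.fromhex("78da")))  # True, the dump above
print(looks_like_zlib(bytes.fromhex("8950")))  # False, start of a PNG
```

A passing check only says that the header is plausible (0x78 means a 32 KiB window, 0x58 an 8 KiB one); actually inflating the data is still the real test.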


deflate
Usually starts with 0xe*.
Use "offzip -z -15" to test if it's really deflate.
Code:
ed 8f 6b 48 53 61 1c c6 df 65 35 ed 32 8d 4a 49   ..kHSa...e5.2.JI
9d 65 20 88 93 2d 13 ba a0 53 ab 85 5a b9 96 49   .e ..-...S..Z..I
a2 76 d0 32 d7 cd a8 b9 72 a9 1d 2d fd 60 65 84   .v.2....r..-.`e.
a5 cd 4a 26 eb 2a 76 d9 64 5e 32 87 a7 bc 2c 69   ..J&.*v.d^2...,i
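
The same idea as "offzip -z -15" can be sketched with Python's zlib module, where a negative wbits value selects headerless deflate. The helper name try_raw_deflate is hypothetical, and the round trip uses data generated on the spot rather than the dump above.

```python
import zlib


def try_raw_deflate(data: bytes):
    """Try to inflate data as a headerless deflate stream; return bytes or None."""
    inflater = zlib.decompressobj(wbits=-15)  # negative wbits = raw deflate
    try:
        return inflater.decompress(data)
    except zlib.error:
        return None


# Round trip with a locally produced raw deflate stream.
packer = zlib.compressobj(wbits=-15)
raw = packer.compress(b"QUICKBMS" * 100) + packer.flush()
print(try_raw_deflate(raw) == b"QUICKBMS" * 100)
print(try_raw_deflate(b"\xff\xff\xff\xff"))  # invalid block type, so None
```

Note that a truncated stream does not raise an error here, it simply returns the bytes decoded so far.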


lzo1x
Parts of the original data are uncompressed.
Code:
25 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44   %.PNG........IHD
52 00 00 01 f4 6e 00 08 02 44 02 01 44 b4 48 dd   R....n...D..D.H.
58 00 06 09 70 48 59 73 00 00 0e c4 6c 00 00 32   X...pHYs....l..2
01 95 2b 0e 1b 00 00 07 81 49 44 41 54 78 da ed   ..+......IDATx..


lzss
Parts of the original data are uncompressed.
Code:
ff 89 50 4e 47 0d 0a 1a 0a ff 00 00 00 0d 49 48   ..PNG.........IH
44 52 6f 00 00 01 f4 fe f1 08 02 f6 f0 ef 44 b4   DRo...........D.
48 dd f6 f0 09 70 48 bf 59 73 00 00 0e c4 17 01   H....pH.Ys......
01 ff 95 2b 0e 1b 00 00 07 81 ff 49 44 41 54 78   ...+.......IDATx
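
The 0xff bytes in this dump are flag bytes: each bit says whether the next item is a literal byte or a back-reference. Below is a rough decoder sketch that assumes the classic Okumura LZSS layout (4096-byte window pre-filled with spaces, start position 4096-18, flags read from the lowest bit, 1 = literal). That layout is an assumption that happens to fit this dump; games often use variants with different window sizes or bit orders.

```python
def lzss_decode(data: bytes) -> bytes:
    """Decode classic (Okumura-style) LZSS; stops quietly on truncated input."""
    N, F, THRESHOLD = 4096, 18, 2
    window = bytearray(b" " * N)   # ring buffer, pre-filled with spaces
    r = N - F                      # initial write position in the window
    out = bytearray()
    i = 0
    while i < len(data):
        flags = data[i]
        i += 1
        for bit in range(8):
            if i >= len(data):
                break
            if flags & (1 << bit):             # 1 = literal byte
                chunk = data[i:i + 1]
                i += 1
            else:                              # 0 = (position, length) pair
                if i + 1 >= len(data):
                    i = len(data)
                    break
                lo, hi = data[i], data[i + 1]
                i += 2
                pos = lo | ((hi & 0xF0) << 4)
                length = (hi & 0x0F) + THRESHOLD + 1
                chunk = b""
                for k in range(length):        # byte by byte: may overlap r
                    c = window[(pos + k) % N]
                    window[r] = c
                    r = (r + 1) % N
                    out.append(c)
            for c in chunk:                    # store literal in the window
                window[r] = c
                r = (r + 1) % N
                out.append(c)
    return bytes(out)


# First 32 bytes of the lzss dump above.
sample = bytes.fromhex(
    "ff 89 50 4e 47 0d 0a 1a 0a ff 00 00 00 0d 49 48"
    "44 52 6f 00 00 01 f4 fe f1 08 02 f6 f0 ef 44 b4"
)
print(lzss_decode(sample).hex(" "))
```

With these assumptions the output begins with the PNG signature and the IHDR chunk, and the pair "fe f1" expands to a second copy of "00 00 01 f4".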


Xmemcompress / LZX
Usually it starts with 0xff.
There are also some file formats created with the xbcompress tool, they start with 0x0F 0xF5 0x12 0xEE (lzx native) or 0x0F 0xF5 0x12 0xED (lzx decode).
Code:
ff 07 cf 03 4a 00 10 f3 7c 00 00 42 00 50 22 00   ....J...|..B.P".
00 5f 00 c1 41 0c 02 bb bd 70 b9 29 b3 1b db 8c   ._..A....p.)....
38 f3 dc 0e b0 54 59 32 67 c5 9c 1b cf 8f 9c 2f   8....TY2g....../
7b cc 73 26 2a 81 59 4f 89 2e 4a 11 da 90 31 03   {.s&*.YO..J...1.


Bzip2
Fixed signature, BZh91.
Code:
42 5a 68 39 31 41 59 26 53 59 c3 87 b9 ea 00 02   BZh91AY&SY......
9e ff ff ff ff ef bf f2 5d f9 ef fe ff fd be ff   ........].......
fe ff ff f8 fd 7f fb 7f bf df fb ff b5 f7 bf 9f   ................
ff ff ff c0 02 9c 1a cc db 02 2a 90 d0 d0 1a 03   ..........*.....


gzip
It starts with 0x1f 0x8b. It usually contains deflate data, rarely lzma, and a few rare games also use other types of compression (quickbms automatically handles all of them).
Code:
1f 8b 08 00 00 00 00 00 00 00 eb 0c f0 73 e7 e5   .............s..
92 e2 62 60 60 e0 f5 f4 70 09 62 60 60 fc 02 c2   ..b``...p.b``...
1c 4c 40 11 97 2d 1e 77 81 14 67 81 47 64 31 03   .L@..-.w..g.Gd1.
03 df 11 10 66 9c aa cd 27 cd c0 c0 de e8 e9 e2   ....f...'.......


JCalg
It starts with JC.
Code:
4a 43 cf 07 00 00 84 a9 56 bb 14 6a e2 20 15 36   JC......V..j. .6
3c ea 03 45 26 1a 12 45 9a 14 1f 90 03 a4 20 42   <..E&..E...... B
09 a0 da 44 93 40 3b a0 3b 52 ac 18 b8 09 71 87   ...D.@;.;R....q.
d0 dc 95 03 4a 00 81 8d 96 95 49 03 1f 24 97 6a   ....J.....I..$.j


LZMA
0x5d at the beginning (it's a flag), a 32bit field (size of the dictionary) and the lzma data.
The raw lzma stream usually starts with a 0x00 (offset 0x5)
Note: if you use "comtype lzma_compress" in QuickBMS to compress data, your output will start with 0x2c instead of 0x5d; I modified the dump to make everything easier for you.
Code:
5d 00 00 00 08 00 44 94 a6 b1 a9 14 37 65 03 e8   ].....D.....7e..
61 4e b5 0a 29 f7 bc f4 0a 39 10 76 ec 9c fe 41   aN..)....9.v...A
1a 6a 07 81 ce e1 e0 58 3f 2f a1 6a c9 03 2d 24   .j.....X?/.j..-$
38 74 b0 3d 19 ab 33 0c 73 57 75 94 da 8a ac 7e   8t.=..3.sWu....~
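
The 0x5d "flag" is the LZMA properties byte, which packs the lc, lp and pb parameters as (pb * 5 + lp) * 9 + lc. A small sketch of how the 5-byte header can be decoded (the function name is invented for this example):

```python
import struct


def parse_lzma_props(header: bytes):
    """Decode the 5-byte LZMA header: properties byte plus dictionary size."""
    props, dict_size = struct.unpack_from("<BI", header, 0)
    if props >= 9 * 5 * 5:
        raise ValueError("invalid LZMA properties byte")
    lc = props % 9           # literal context bits
    lp = (props // 9) % 5    # literal position bits
    pb = props // 45         # position bits
    return lc, lp, pb, dict_size


sample = bytes.fromhex("5d 00 00 00 08 00 44 94")
print(parse_lzma_props(sample))  # (3, 0, 2, 134217728)
```

0x5d corresponds to the default lc=3, lp=0, pb=2, which is why it shows up so often; a stream compressed with other parameters starts with a different byte.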


lzma 86 head
As before with a 64bit uncompressed size field before the compressed data.
Code:
5d 00 00 00 08 cf 07 00 00 00 00 00 00 00 44 94   ].............D.
a6 b1 a9 14 37 65 03 e8 61 4e b5 0a 29 f7 bc f4   ....7e..aN..)...
0a 39 10 76 ec 9c fe 41 1a 6a 07 81 ce e1 e0 58   .9.v...A.j.....X
3f 2f a1 6a c9 03 2d 24 38 74 b0 3d 19 ab 33 0c   ?/.j..-$8t.=..3.


lzma 86 dec
One byte more than lzma.
Code:
5d 00 00 00 08 00 00 44 94 a6 b1 a9 14 37 65 03   ]......D.....7e.
e8 61 4e b5 0a 29 f7 bc f4 0a 39 10 76 ec 9c fe   .aN..)....9.v...
41 1a 6a 07 81 ce e1 e0 58 3f 2f a1 6a c9 03 2d   A.j.....X?/.j..-
24 38 74 b0 3d 19 ab 33 0c 73 57 75 94 da 8a ac   $8t.=..3.sWu....


lzma 86 dec head
All the fields seen before.
Code:
5d 00 00 00 08 00 cf 07 00 00 00 00 00 00 00 44   ]..............D
94 a6 b1 a9 14 37 65 03 e8 61 4e b5 0a 29 f7 bc   .....7e..aN..)..
f4 0a 39 10 76 ec 9c fe 41 1a 6a 07 81 ce e1 e0   ..9.v...A.j.....
58 3f 2f a1 6a c9 03 2d 24 38 74 b0 3d 19 ab 33   X?/.j..-$8t.=..3


lzma efs
Used by the ZIP file format.
Code:
5d 00 00 00 08 00 00 05 00 00 44 94 a6 b1 a9 14   ].........D.....
37 65 03 e8 61 4e b5 0a 29 f7 bc f4 0a 39 10 76   7e..aN..)....9.v
ec 9c fe 41 1a 6a 07 81 ce e1 e0 58 3f 2f a1 6a   ...A.j.....X?/.j
c9 03 2d 24 38 74 b0 3d 19 ab 33 0c 73 57 75 94   ..-$8t.=..3.sWu.


lzma without prop / headerless
Code:
00 44 94 a6 b1 a9 14 37 65 03 e8 61 4e b5 0a 29   .D.....7e..aN..)
f7 bc f4 0a 39 10 76 ec 9c fe 41 1a 6a 07 81 ce   ....9.v...A.j...
e1 e0 58 3f 2f a1 6a c9 03 2d 24 38 74 b0 3d 19   ..X?/.j..-$8t.=.
ab 33 0c 73 57 75 94 da 8a ac 7e 5d 55 f3 19 4d   .3.sWu....~]U..M


RNC
"RNC" magic, version (1 and 2), uncompressed size.
Code:
52 4e 43 01 00 00 07 cf 00 00 03 06 d5 26 b5 99   RNC..........&..
00 00 20 21 12 9a 21 06 60 45 22 32 00 a6 40 64   .. !..!.`E"2..@d
04 80 80 64 00 42 89 50 4e 47 0d 0a 1a 0a 00 00   ...d.B.PNG......
00 0d 49 48 44 52 00 00 01 f4 5f 2d 08 02 02 01   ..IHDR...._-....
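
The fields described above can be read like this. It is only a sketch: the size is taken as big endian because that is how it appears in the dump (00 00 07 cf), and the function name is made up.

```python
import struct


def parse_rnc_header(data: bytes):
    """Return (version, uncompressed_size) or None if the magic is missing."""
    if len(data) < 8 or data[:3] != b"RNC":
        return None
    version = data[3]                                   # 1 or 2
    (unpacked_size,) = struct.unpack_from(">I", data, 4)  # big endian
    return version, unpacked_size


sample = bytes.fromhex("52 4e 43 01 00 00 07 cf 00 00 03 06")
print(parse_rnc_header(sample))  # (1, 1999)
```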


Zpaq
"zPQ" magic, currently I have never seen this compression used in games.
Code:
7a 50 51 01 01 c4 00 05 09 00 00 16 01 a0 03 05   zPQ.............
08 0d 01 08 10 02 08 12 03 08 13 04 08 13 05 08   ................
14 06 04 16 18 03 11 08 13 09 03 0d 03 0d 03 0d   ................
03 0e 07 10 00 0f 18 ff 07 08 00 10 0a ff 06 00   ................


Snappy
Uncompressed size before the data.
Code:
cf 0f 4c 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49   ..L.PNG........I
48 44 52 00 00 01 f4 01 04 50 08 02 00 00 00 44   HDR......P.....D
b4 48 dd 00 00 00 09 70 48 59 73 00 00 0e c4 01   .H.....pHYs.....
04 f0 46 01 95 2b 0e 1b 00 00 07 81 49 44 41 54   ..F..+......IDAT
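
In Snappy the uncompressed size is stored as a variable-length integer: 7 bits per byte, lowest bits first, with the high bit set when another byte follows. A short sketch of decoding it (read_varint is just a name chosen for this example):

```python
def read_varint(data: bytes, pos: int = 0):
    """Decode a little-endian base-128 varint; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:      # high bit clear: this was the last byte
            return result, pos
        shift += 7


sample = bytes.fromhex("cf 0f 4c 89 50 4e 47")
print(read_varint(sample))  # (1999, 2)
```

The value 1999 is the size of the example PNG, the same number stored as "cf 07" in several of the other headers in this post.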


Gipfeli
Small header with 32bit uncompressed size.
Code:
02 cf 07 60 00 35 da 0c 80 28 06 40 13 03 75 00   ...`.5...(.@..u.
2a 01 02 38 00 d2 01 a8 04 ea 00 34 40 05 54 0d   *..8.......4@.T.
6a 0b 40 73 0d 35 50 07 a0 2d a4 03 50 cc 35 48   j.@s.5P..-..P.5H
45 48 2d 00 cc ff 16 80 e6 1a b4 2d 00 07 c0 df   EH-........-....


LZG
"LZG" magic and uncompressed size.
Code:
4c 5a 47 00 00 07 cf 00 00 03 26 94 ed 70 6b 01   LZG.......&..pk.
12 18 23 24 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d   ..#$.PNG........
49 48 44 52 00 00 01 f4 24 62 08 02 23 0a 44 b4   IHDR....$b..#.D.
48 dd 24 c1 09 70 48 59 73 00 00 0e c4 24 62 01   H.$..pHYs....$b.


Doboz
Small header with uncompressed size.
Code:
08 cf 07 5a 03 00 00 90 90 89 50 4e 47 0d 0a 1a   ...Z......PNG...
0a 00 00 00 0d 49 48 44 52 00 00 01 f4 06 01 08   .....IHDR.......
02 48 44 b4 48 dd 1c 09 70 80 00 00 80 48 59 73   .HD.H...p....HYs
00 00 0e c4 06 01 01 95 2b 0e 1b 00 00 07 81 49   ........+......I


SFL block
Code:
40 00 00 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49   @...PNG........I
48 44 52 09 08 00 00 01 f4 00 41 08 02 01 20 44   HDR.......A... D
b4 48 dd 00 70 09 70 48 02 00 59 73 00 00 0e c4   .H..p.pH..Ys....
00 41 01 95 2b 0e 1b 00 00 07 81 00 00 49 44 41   .A..+........IDA


SFL bits
Code:
0f 89 50 4e 47 0d 0a 1a 0a 84 0c 0d 49 48 44 52   ..PNG.......IHDR
83 00 08 f4 83 00 08 f4 03 01 84 0b 44 b4 48 dd   ............D.H.
84 0c 09 70 48 59 73 83 09 0e c4 83 09 0e c4 00   ...pHYs.........
0b 95 2b 0e 1b 83 15 07 81 49 44 41 54 78 da ed   ..+......IDATx..


LZF
Code:
13 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44   ..PNG........IHD
52 00 00 01 f4 40 03 01 08 02 20 11 03 44 b4 48   R....@.... ..D.H
dd 20 06 08 09 70 48 59 73 00 00 0e c4 40 03 1f   . ...pHYs....@..
01 95 2b 0e 1b 00 00 07 81 49 44 41 54 78 da ed   ..+......IDATx..


Brieflz
Code:
89 00 00 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48   ...PNG........IH
44 52 00 00 10 00 01 f4 03 08 02 00 00 00 44 b4   DR............D.
48 04 00 dd 00 00 00 09 70 48 59 73 00 00 0e c4   H.......pHYs....
00 00 03 01 95 2b 0e 1b 00 00 07 81 49 44 41 54   .....+......IDAT


Falcom (used in the Ys series)
32bit compressed size at the beginning.
Code:
3d 03 00 00 89 50 4e 47 0d 0a 1a 0a 0a 4a 00 01   =....PNG.....J..
0d 49 48 44 52 07 01 f4 04 24 21 08 02 12 44 b4   .IHDR....$!...D.
48 dd 07 41 89 09 70 48 59 73 07 0e c4 04 a0 00   H..A..pHYs......
01 95 2b 0e 1b 09 07 81 49 44 41 54 78 da 00 00   ..+.....IDATx...


LZ4
Usually it starts with a 0xf* byte.
Code:
f0 05 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48   ...PNG........IH
44 52 00 00 01 f4 04 00 f0 06 08 02 00 00 00 44   DR.............D
b4 48 dd 00 00 00 09 70 48 59 73 00 00 0e c4 04   .H.....pHYs.....
00 f5 37 01 95 2b 0e 1b 00 00 07 81 49 44 41 54   ..7..+......IDAT
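
The 0xf* byte is the LZ4 token: the high nibble is the number of literal bytes and the low nibble the match length minus 4, with 15 meaning that extra length bytes follow. Files usually begin with 15 or more literals, hence the 0xf*. Here is a sketch that parses only the first sequence of a block, following the LZ4 block format; the function names are invented for this example.

```python
def _extend(block: bytes, pos: int, length: int):
    """Add length-extension bytes: keep reading while a byte equals 255."""
    while True:
        extra = block[pos]
        pos += 1
        length += extra
        if extra != 255:
            return length, pos


def parse_first_sequence(block: bytes):
    """Return (literals, match_offset, match_length) of the first sequence."""
    token, pos = block[0], 1
    literal_len = token >> 4
    if literal_len == 15:
        literal_len, pos = _extend(block, pos, literal_len)
    literals = block[pos:pos + literal_len]
    pos += literal_len
    offset = int.from_bytes(block[pos:pos + 2], "little")
    pos += 2
    match_len = token & 0x0F
    if match_len == 15:
        match_len, pos = _extend(block, pos, match_len)
    return literals, offset, match_len + 4   # minimum match length is 4


# First 32 bytes of the LZ4 dump above.
sample = bytes.fromhex(
    "f0 05 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48"
    "44 52 00 00 01 f4 04 00 f0 06 08 02 00 00 00 44"
)
literals, offset, match_len = parse_first_sequence(sample)
print(len(literals), offset, match_len)  # 20 4 4
```

In the dump, "f0 05" means 15 + 5 = 20 literals, and the following "04 00" copies 4 bytes from 4 bytes back, which repeats "00 00 01 f4".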


Yappy
Code:
1f 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44   ..PNG........IHD
52 00 00 01 f4 00 00 01 f4 08 02 00 00 00 44 b4   R.............D.
48 1f dd 00 00 00 09 70 48 59 73 00 00 0e c4 00   H......pHYs.....
00 0e c4 01 95 2b 0e 1b 00 00 07 81 49 44 41 54   .....+......IDAT


NitroSDK (Nintendo)
The first byte is the type of compression: 0x00, 0x10, 0x11, 0x20, 0x40.
Code:
10 cf 07 00 00 89 50 4e 47 0d 0a 1a 0a 00 00 00   ......PNG.......
00 0d 49 48 44 52 09 00 00 01 f4 10 03 08 02 00   ..IHDR..........
11 08 44 b4 48 dd 00 18 09 70 48 02 59 73 00 00   ..D.H....pH.Ys..
0e c4 10 03 01 00 95 2b 0e 1b 00 00 07 81 00 49   .......+.......I
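
A sketch of reading this header. The list of type bytes comes from the description above; treating the next three bytes as a little-endian uncompressed size is an assumption based on the dump (cf 07 00), and the function name is made up.

```python
KNOWN_TYPES = {0x00, 0x10, 0x11, 0x20, 0x40}   # type bytes listed above


def parse_nitro_header(data: bytes):
    """Return (type_byte, uncompressed_size) or None if the type is unknown."""
    if len(data) < 4 or data[0] not in KNOWN_TYPES:
        return None
    # Assumption: 24-bit little-endian uncompressed size after the type byte.
    size = int.from_bytes(data[1:4], "little")
    return data[0], size


sample = bytes.fromhex("10 cf 07 00 00 89 50 4e")
print(parse_nitro_header(sample))  # (16, 1999)
```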


LZMA2
Code:
18 e0 07 ce 02 fa 5d 00 44 94 05 c4 7a 27 f6 f7   ......].D...z'..
ee 89 8e 50 90 88 b3 aa cc 1b 2e 9b 5a d1 1a 08   ...P........Z...
c2 69 96 f7 ad ab 24 88 1f 78 89 db 47 9f ab 1e   .i....$..x..G...
d5 ee e0 c1 8b b2 c9 82 e1 c5 12 78 20 65 03 85   ...........x e..


LZMA2 headerless
Code:
e0 07 ce 02 fa 5d 00 44 94 05 c4 7a 27 f6 f7 ee   .....].D...z'...
89 8e 50 90 88 b3 aa cc 1b 2e 9b 5a d1 1a 08 c2   ..P........Z....
69 96 f7 ad ab 24 88 1f 78 89 db 47 9f ab 1e d5   i....$..x..G....
ee e0 c1 8b b2 c9 82 e1 c5 12 78 20 65 03 85 04   ..........x e...
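
No description is given for the two LZMA2 dumps, so here is a hedged sketch of how the first chunk header could be read, based on the LZMA2 chunk layout used by xz: a control byte, big-endian sizes stored minus one, and an optional properties byte. The function name and the returned dictionary are choices made for this example only.

```python
def parse_lzma2_chunk_header(data: bytes):
    """Decode the header of one LZMA2 chunk from a headerless stream."""
    control = data[0]
    if control == 0x00:
        return {"kind": "end"}
    if control < 0x80:
        if control > 0x02:
            raise ValueError("invalid LZMA2 control byte")
        # 0x01 / 0x02: stored (uncompressed) chunk
        size = int.from_bytes(data[1:3], "big") + 1
        return {"kind": "uncompressed", "unpacked_size": size}
    # LZMA chunk: 5 low bits of control are the top bits of the size
    unpacked = (((control & 0x1F) << 16) | int.from_bytes(data[1:3], "big")) + 1
    packed = int.from_bytes(data[3:5], "big") + 1
    props = data[5] if control >= 0xC0 else None   # new props present
    return {"kind": "lzma", "unpacked_size": unpacked,
            "packed_size": packed, "props": props}


sample = bytes.fromhex("e0 07 ce 02 fa 5d 00 44 94")
print(parse_lzma2_chunk_header(sample))
```

On the headerless dump this gives an uncompressed size of 1999 and the familiar properties byte 0x5d. The extra leading 0x18 of the first dump is presumably the one-byte dictionary size property.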


Oodle
Usually it starts with the byte 0x8c.
Code:
8c 0b 43 03 61 df 01 00 12 19 83 e0 b4 78 4b e0   ..C.a........xK.
ab 74 91 77 97 86 13 9d 40 07 b7 d4 0d 76 5c 7d   .t.w....@....v\}
56 81 8f 7c f0 33 c0 1a 9a fc 0d ad 47 80 4b fc   V..|.3......G.K.
49 93 f9 fc 4c a6 b7 80 17 a4 bc 8c 07 f9 8d 31   I...L..........1


zstd
It starts with a little-endian 32bit magic number. When seen with a hex editor only the first byte (the low 8 bits) is different, because it depends on the version of the algorithm:
Code:
1e b5 2f fd    v0.1
22 b5 2f fd    v0.2
23 b5 2f fd    v0.3
24 b5 2f fd    v0.4
25 b5 2f fd    v0.5
26 b5 2f fd    v0.6
27 b5 2f fd    v0.7
28 b5 2f fd    v0.8, current version
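
The fixed signatures from this post, including the zstd magic numbers just listed, can be collected in a small lookup. This is only a sketch: the function and table names are invented, only formats with a real magic are covered, and short magics such as "JC" will give false positives on arbitrary data.

```python
ZSTD_VERSIONS = {
    0x1E: "v0.1", 0x22: "v0.2", 0x23: "v0.3", 0x24: "v0.4",
    0x25: "v0.5", 0x26: "v0.6", 0x27: "v0.7", 0x28: "v0.8",
}

MAGICS = [
    (b"BZh", "bzip2"),
    (b"\x1f\x8b", "gzip"),
    (b"RNC", "RNC"),
    (b"zPQ", "Zpaq"),
    (b"LZG", "LZG"),
    (b"\x0f\xf5\x12\xee", "xbcompress (lzx native)"),
    (b"\x0f\xf5\x12\xed", "xbcompress (lzx decode)"),
    (b"JC", "JCalg"),
]


def identify(data: bytes):
    """Return a format name if data starts with a known signature, else None."""
    # zstd: bytes 1..3 are fixed, byte 0 depends on the version
    if data[1:4] == b"\xb5\x2f\xfd" and data[0] in ZSTD_VERSIONS:
        return "zstd " + ZSTD_VERSIONS[data[0]]
    for magic, name in MAGICS:
        if data.startswith(magic):
            return name
    return None


print(identify(bytes.fromhex("42 5a 68 39 31 41 59")))  # bzip2
print(identify(bytes.fromhex("26 b5 2f fd")))           # zstd v0.6
print(identify(b"\x89PNG"))                             # None
```

Formats that begin with flag bytes (zlib, deflate, lzma, Oodle, LZ4) are left out on purpose, since a single byte is not enough to identify them reliably.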


Attachments:
File comment: Example image used for the compressions
quickbms.png [1.95 KiB]

Author:  shekofte [ Mon Aug 11, 2014 3:07 pm ]
Post subject:  Re: How to recognize the compression algorithms with your ey

Thanks a lot for sharing this valuable knowledge.
I hope this reference grows exponentially with the support of other experts too.

Author:  aluigi [ Mon Aug 11, 2014 3:37 pm ]
Post subject:  Re: How to recognize the compression algorithms with your ey

Yeah, having more experts is just half of the goal of the Zenhax community.
The bigger part is providing material that is very simple and easy to read, so that anyone with minimal or no skills can become an "expert" in a short time.

Let me know if you have suggestions about specific topics.

Author:  shekofte [ Mon Aug 11, 2014 4:33 pm ]
Post subject:  Re: How to recognize the compression algorithms with your ey

aluigi wrote:
Let me know if you have suggestions about specific topics.


1st suggestion
I am very interested in the source code you ship with Quickbms. I think that by manually editing a few lines of code we could adapt it to specific purposes... do you see any promise in this strategy?
Maybe some examples or a guide to recompiling it?
I am not a programmer, but I am familiar with C++ programming styles, I have skills in mathematics and I usually learn everything quickly...

I owe you gratitude, Luigi Auriemma, for the many things I have already learned from you on xentax.
Yours sincerely, Fereydoon Shekofte

Author:  aluigi [ Mon Aug 11, 2014 5:00 pm ]
Post subject:  Re: How to recognize the compression algorithms with your ey

Recompiling the source code of quickbms every time you need a customization is not a good idea; additionally, it would create tons of incompatibilities when people use it.
What is good is writing plugins for quickbms, and this is already possible with the calldll command.

Basically you write a simple dll that accepts an input buffer, its size and additional arguments you desire.
Your dll performs some operations on this data and can return the data to quickbms or can dump the output directly on disk.

I use this feature every time I have a decryption function that is too slow to write in bms code.

The other good thing is that the dll can be embedded directly in the bms script, so people don't need to download it separately.

Quickbms even allows you to execute a program on each extracted file, for example if you want to convert them to mp3 you can use:
quickbms -S "lame -b 192 -t --quiet #INPUT#" extractor.bms archive.dat output_folder

Author:  aluigi [ Tue Aug 12, 2014 7:07 am ]
Post subject:  Re: How to recognize the compression algorithms with your ey

Ah, I forgot: for any suggestion and improvement of quickbms just post them in the following topic:
http://zenhax.com/viewtopic.php?f=13&t=19

Or in the relevant section:
http://zenhax.com/viewforum.php?f=13

Author:  aluigi [ Thu Nov 06, 2014 5:22 pm ]
Post subject:  Re: How to recognize the compression algorithms with your ey

Added some other examples, but they don't have important fields like magic signatures and similar.

Author:  michalss [ Thu Nov 06, 2014 10:13 pm ]
Post subject:  Re: How to recognize the compression algorithms with your ey

man this is perfect :D

Author:  aluigi [ Thu Mar 02, 2017 4:56 pm ]
Post subject:  Re: How to recognize the compression algorithms with your eyes

Added the example of Oodle because it's easy to recognize (byte 0x8c) and it's starting to be used more often (Project Cars 2, Telltale Games, granny2 library and so on).
Note that the network packets are different, so the provided example is the output of the main functions, which are used mostly for files.

Author:  chrrox [ Sun Mar 05, 2017 1:48 pm ]
Post subject:  Re: How to recognize the compression algorithms with your eyes

Oodle can use different starting values; quickbms currently crashes on the non-0x8C versions.
viewtopic.php?f=13&t=556&start=169

Author:  aluigi [ Sun Mar 05, 2017 2:09 pm ]
Post subject:  Re: How to recognize the compression algorithms with your eyes

Oh come on, I simply forgot the word "usually" in the description! :) [now fixed]
0x8c is just a mix of flags just like the 0x78 of zlib and 0x5d of lzma.

Author:  BCGhost [ Tue Jan 02, 2018 11:55 am ]
Post subject:  Re: How to recognize the compression algorithms with your eyes

Might sound like a dumb question but how should I handle headerless LZMA compression with QuickBMS?

Author:  aluigi [ Tue Jan 02, 2018 12:03 pm ]
Post subject:  Re: How to recognize the compression algorithms with your eyes

comtype lzma0

If in doubt about size and type of lzma use: comtype lzma_dynamic

Author:  BCGhost [ Tue Jan 02, 2018 3:30 pm ]
Post subject:  Re: How to recognize the compression algorithms with your eyes

I found some strings in a library file of Asphalt 8 which said it's LZMA compressed, and I tried lzma0 and lzma_dynamic, but the former threw an error while the latter gave me wrong output, far smaller than the expected decompressed size.

Here I attach a sample file if you don't mind: :P
Attachment:
tex.rar [250.02 KiB]


The attachment contains two files that come from different versions of the game: one contains uncompressed data while the other is compressed. They're the same file, but I've renamed them just to make it less confusing.

Author:  aluigi [ Tue Jan 02, 2018 4:12 pm ]
Post subject:  Re: How to recognize the compression algorithms with your eyes

That's zstd compression (version 0.6).
Anyway, here you are off-topic, and zstd pvr files have been known for a long time, since I found a reference to them on the zstd issues page https://github.com/facebook/zstd/issues/94
Please open a new topic in the Graphics section.

Author:  BCGhost [ Wed Jan 03, 2018 12:56 am ]
Post subject:  Re: How to recognize the compression algorithms with your eyes

aluigi wrote:
That's zstd compression (version 0.6).
Anyway, here you are off-topic, and zstd pvr files have been known for a long time, since I found a reference to them on the zstd issues page https://github.com/facebook/zstd/issues/94
Please open a new topic in the Graphics section.


Oh sorry about that :mrgreen:
I didn't think too much when I was looking at the string "zstdjetc", though I did guess right that "zstd" is something like "zip standard". I didn't google it first, my bad no doubt.

I tested some data with zstd compression and it works perfectly.

In fact I don't have to start a new topic since I'm not actually concerned about the textures, but the compressed model data.
The only reason I provide this texture file as a sample is that I'm sure that the decompressed size recorded is correct.
I think such a request here is off-topic anyway.

Again thanks for the cue, sincerely. :P

Author:  aluigi [ Wed Jan 03, 2018 9:24 am ]
Post subject:  Re: How to recognize the compression algorithms with your eyes

Good. Anyway, I have added zstd to the list in the first post since it uses a magic number that allows it to be identified with 100% certainty.

Author:  BCGhost [ Wed Jan 03, 2018 3:20 pm ]
Post subject:  Re: How to recognize the compression algorithms with your eyes

QuickBMS (v0.8.1) failed to decompress the data, though the current version is newer than v0.6.
The latest version (v1.3.3) I used works correctly.

Author:  aluigi [ Wed Jan 03, 2018 4:29 pm ]
Post subject:  Re: How to recognize the compression algorithms with your eyes

Yeah, it will be fixed in quickbms 0.8.2
The reason is that you need to support zstd legacy to work with older versions of the algorithm, it's like a separate thing.

Author:  barumbads [ Sat Jan 13, 2018 12:03 am ]
Post subject:  Re: How to recognize the compression algorithms with your eyes

+1 :mrgreen:

All times are UTC