Free Game Research Forum | Official QuickBMS support | twitter @zenhax | SSL HTTPS://zenhax.com
It is currently Sun May 19, 2019 8:26 am

All times are UTC

Post new topic  Reply to topic  [ 2 posts ] 
Author Message
PostPosted: Thu Mar 21, 2019 3:26 am 

Joined: Wed Mar 20, 2019 7:50 pm
Posts: 2
Hey guys. I have a request that would probably be pretty easy for some of you, so would be cool if you can assist!
I have a small file from PS3. It looks like lz77 but I was not able to properly decompress it with the algorithms I've found online.

So here are my questions:
Is this really lz77 or it is a common variation that has some specific name?
Is there a code somewhere that can properly decompress this file? Preferably not C++, "bare" and as a single function. I'd like to see a code without any frills not required to perform the actual decompression because I am mainly interested in understanding how to decompress this file.

Bonus question:
If decompression solution will be found, in case I want to create a fake-compressed file that would work when fed to a compliant decoder, what would a stream of such a file look like?
Would it be something like this: [uncompsize-long]FFdata-here-data-hereFFdata-here-data-here-and-more-dataFFeven-more-data ?
I'd like to know what exactly these "inserted" bytes would be, would they always be the same, how often they should occur and what are the "rules" of adding them in general?

Additional info:
Provided file has a header of 4 bytes that looks like an uncompressed size.
I don't know where the actual compressed data starts. In other words I am not 100% positive that there is no some kind of additional header beyond 1st 4 bytes and what may look like a header may in fact be the header of an uncompressed file and thus not relevant.
The very last human-readable sentence in file should be something like this "ことも好き" (sjis). I suspect that if it would be possible to obtain it, the result can be declared correct.

EDIT:never mind I am stoopid. It is not lz77, it is classical lzss. Still curious about bonus question though!

PostPosted: Tue Mar 26, 2019 2:56 pm 
Site Admin
User avatar

Joined: Wed Jul 30, 2014 9:32 pm
Posts: 10231
Exist many implementations for all the various compression algorithms.
Basically may exist 1000 ways for using lz77, all incompatible between them.

Regarding the bonus question, if the algorithm is byte-based it's very easy to write a fake compressor just by using the flags for non-compressed data followed by the original data as-is.
It depends by the flags and how they are used in the algorithm, so there is no general rule.

In some algorithms 0xff does the job as flag:
int moh_lzss_compress_fake(unsigned char *in, int insz, unsigned char *out) {
    unsigned char   b = 0;
    while(in < inl) {
        b <<= 1;
        if(!b) {
            *o++ = 0xff;
            b = 1;
        *o++ = *in++;
    return o - out;

In RLE algorithms you need to use a specific byte (maybe 0x80) followed by the original data byte, or a certain range.

In other algorithms a certain amount of bits of that byte are the length and the remaining ones are the flag:
int dr_enc_fake(unsigned char *in, int insz, unsigned char *out) {
    unsigned char   *inl = in + insz;
    unsigned char   *o = out;

    while(in < inl) {
        int     len = 0x1f;
        if((in + len) >= inl) len = inl - in;
        unsigned char flags = 0 | len;
        *o++ = flags;
        memcpy(o, in, len);
        o  += len;
        in += len;
    return o - out;

And so on...

For bit-based algorithms... good luck :D

Display posts from previous:  Sort by  
Post new topic  Reply to topic  [ 2 posts ] 

All times are UTC

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Powered by phpBB® Forum Software © phpBB Limited