ZenHAX

All times are UTC




 Post subject: Unknown Compression
Posted: Tue Jul 09, 2019 10:14 pm

Joined: Thu Aug 14, 2014 3:29 am
Posts: 46
I have a compressed file and what may be its uncompressed counterpart. It looks like a custom compression scheme; I ran comtype_scan.bat without any result.
The uncompressed file is from the PC version of the game, and the compressed file is the same file from the PS4 version.
The uncompressed size of the PS4 version is 37066 bytes.


Attachments:
survive1.survive1.debug.7z [10.65 KiB]
Downloaded 23 times
 Post subject: Re: Unknown Compression
Posted: Wed Jul 10, 2019 1:53 am

Joined: Fri Aug 26, 2016 3:11 pm
Posts: 61
rengareng wrote:
I have a compressed file and what may be its uncompressed counterpart. It looks like a custom compression scheme; I ran comtype_scan.bat without any result.
The uncompressed file is from the PC version of the game, and the compressed file is the same file from the PS4 version.
The uncompressed size of the PS4 version is 37066 bytes.


This looks more like a bare LZ-style back-reference stage than a full compressor (back-referencing is usually one component of compression, not the whole thing). A lot of the text is still readable, which makes me think the literals are stored raw and there is no dictionary to start with. You can see it back-referencing bytes where the same letters appeared earlier, and only 2 bytes at the start don't line up with the original file. I'll see if I can figure out which variant it is, but I would look at the LZ matching that deflate/inflate builds on rather than at full compressors.

I'm thinking it's LZ77; I'm going to make a script to test.
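
To illustrate the back-referencing described above, here is a minimal Python sketch of how an LZ77-style stream expands. The token representation (a `bytes` object for literals, a `(distance, length)` tuple for a back reference) and the function name `lz77_expand` are assumptions made for the example; they are not the on-disk encoding of the attached file, which is still unknown at this point in the thread.

```python
def lz77_expand(tokens):
    """Expand a list of illustrative LZ77 tokens into bytes.

    A token is either a bytes object (literals, copied as-is) or a
    (distance, length) tuple meaning "go back `distance` bytes in the
    output and copy `length` bytes from there".
    """
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, bytes):
            out += tok  # literals stay readable in the compressed data
        else:
            distance, length = tok
            if not 0 < distance <= len(out):
                raise ValueError("distance points before start of output")
            # Copy byte by byte so a match may overlap its own output.
            for _ in range(length):
                out.append(out[-distance])
    return bytes(out)


example = lz77_expand([b"survive1.", (9, 8), b".debug"])
print(example)  # b'survive1.survive1.debug'
```

Because literals are copied through unchanged, text in such a stream stays readable until the first repeat, which is the pattern observed in the file.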


 Post subject: Re: Unknown Compression
Posted: Wed Jul 10, 2019 2:33 am

Joined: Thu Aug 14, 2014 3:29 am
Posts: 46
LokiReborn wrote:
This looks more like a bare LZ-style back-reference stage than a full compressor (back-referencing is usually one component of compression, not the whole thing). A lot of the text is still readable, which makes me think the literals are stored raw and there is no dictionary to start with. You can see it back-referencing bytes where the same letters appeared earlier, and only 2 bytes at the start don't line up with the original file. I'll see if I can figure out which variant it is, but I would look at the LZ matching that deflate/inflate builds on rather than at full compressors.

I'm thinking it's LZ77; I'm going to make a script to test.

I saw similar compression in some EA games, as described here: http://wiki.niotso.org/RefPack
The problem is that I don't know how the distance/length pairs are encoded. I would use a disassembler if the decompressor were in the PC build rather than the PS4 one.


 Post subject: Re: Unknown Compression
Posted: Wed Jul 10, 2019 2:53 am

Joined: Fri Aug 26, 2016 3:11 pm
Posts: 61
rengareng wrote:
I saw similar compression in some EA games, as described here: http://wiki.niotso.org/RefPack
The problem is that I don't know how the distance/length pairs are encoded. I would use a disassembler if the decompressor were in the PC build rather than the PS4 one.


Yeah, I don't think it will be that difficult. The part that's tripping me up right now is that the byte order seems to be big endian; I would try rerunning the regular script with big endian set.
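
To make the byte-order point concrete, here is a small Python illustration. The two bytes are made up for the example: they encode 37066, the size mentioned in the first post, as a 16-bit value, and are not claimed to appear in the attached file.

```python
import struct

# Hypothetical 2-byte field; chosen so the big-endian reading is 37066.
raw = bytes([0x90, 0xCA])

little = struct.unpack("<H", raw)[0]  # low byte first
big = struct.unpack(">H", raw)[0]     # high byte first

print(little)  # 51856 (0xCA90)
print(big)     # 37066 (0x90CA)
```

Reading a size or offset field with the wrong byte order gives a plausible-looking but wrong number, which is enough to make a scanner miss an otherwise matching algorithm.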


 Post subject: Re: Unknown Compression
Posted: Wed Jul 10, 2019 8:57 pm

Joined: Thu Aug 14, 2014 3:29 am
Posts: 46
It looks easy, but I still have no idea what the actual encoding is.
This compression is used in the fat/dat files of the PS4 version of Watch Dogs.


 Post subject: Re: Unknown Compression
Posted: Thu Jul 11, 2019 10:58 pm

Joined: Thu Aug 14, 2014 3:29 am
Posts: 46
LokiReborn wrote:
Yeah, I don't think it will be that difficult. The part that's tripping me up right now is that the byte order seems to be big endian; I would try rerunning the regular script with big endian set.


I found the algorithm. If I delete the first byte, I can use the following quickbms script to unpack the compressed file:
Code:
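# lz77ea_970 is the QuickBMS comtype that matched once the first byte was removed
comtype lz77ea_970
# SIZE = size of the input file, NAME = its file name
get SIZE asize
get NAME filename
string NAME += ".unpacked"
# clog NAME OFFSET ZSIZE SIZE: 10000000 is only a generous guess at the unpacked size
clog NAME 0 SIZE 10000000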

I don't know the purpose of the first byte. Any guesses?
Unfortunately, the script fails for the attached file even after deleting the first byte, so the first byte could hold some options for the algorithm. The uncompressed size should be 103475 bytes.
However, it extracts to only 94550 bytes.


Attachments:
resultsapp.feu_compressed.unpacked.zip [28.44 KiB]
Downloaded 15 times
 Post subject: Re: Unknown Compression
Posted: Sat Jul 13, 2019 1:45 am

Joined: Fri Aug 26, 2016 3:11 pm
Posts: 61
rengareng wrote:
I found the algorithm. If I delete the first byte, I can use the following quickbms script to unpack the compressed file:
Code:
comtype lz77ea_970
get SIZE asize
get NAME filename
string NAME += ".unpacked"
clog NAME 0 SIZE 10000000

I don't know the purpose of the first byte. Any guesses?
Unfortunately, the script fails for the attached file even after deleting the first byte, so the first byte could hold some options for the algorithm. The uncompressed size should be 103475 bytes.
However, it extracts to only 94550 bytes.


The first 3 bytes of this file are FEU. That matches the file extension, so it is probably safe to treat it as the magic number, and I don't think there is anything in front of it to remove. As for the other file, even the -pc was removed from the Lua name, so I'm not sure they're exactly the same file. Going on that premise, the extra byte could be something meant for the Lua script itself and not garbage data. So maybe try removing nothing, or, if that doesn't work, get a better understanding of what that LZ77 variation is doing. I might be off base, but the simpler explanation is usually the correct one.

Edit:
Actually, looking at the FEU file again, I'm seeing multiple repeated strings that are not LZ-compressed, so you may already have the file in its correct form.


 Post subject: Re: Unknown Compression
Posted: Sat Jul 13, 2019 5:25 am

Joined: Thu Aug 14, 2014 3:29 am
Posts: 46
LokiReborn wrote:
The first 3 bytes of this file are FEU. That matches the file extension, so it is probably safe to treat it as the magic number, and I don't think there is anything in front of it to remove. As for the other file, even the -pc was removed from the Lua name, so I'm not sure they're exactly the same file. Going on that premise, the extra byte could be something meant for the Lua script itself and not garbage data. So maybe try removing nothing, or, if that doesn't work, get a better understanding of what that LZ77 variation is doing. I might be off base, but the simpler explanation is usually the correct one.

Edit:
Actually, looking at the FEU file again, I'm seeing multiple repeated strings that are not LZ-compressed, so you may already have the file in its correct form.


It's definitely LZ4, which is explained here (https://fastcompression.blogspot.com/20 ... ained.html). I've found that they use a trick to allow offsets >= 65536. Here is the 010 Editor template that worked for that file:
Code:
// author: celikeins
// watch_dogs 1, cmp type 4 in fat/dat files
LittleEndian();

int read() {
    local int a = 0;
    do {
        struct { ubyte b; } n;
        a += n.b;
    } while (n.b == 0xFF);
    return a;
}
local int out = 0;
local int infile = GetFileNum();
local int outfile = FileNew();
local int i;

byte unknown;
while (!FEof()) {
    struct {
        local int outpos = out;
        ubyte hl;
        local int proceed = hl >> 4, copy = hl & 0x0F;
        if (proceed == 0x0F) {
            proceed += read();
        }
        if (proceed > 0) {
            ubyte proceed_from_input[proceed];
            FileSelect(outfile);
            WriteBytes(proceed_from_input, out, proceed);
            out += proceed;
            FileSelect(infile);
        }
        if (FEof()) { break; };
        ushort offset0;
        local int offset = offset0;
        // offset can be beyond 64KB
        if (offset >= 0xE000) {
            ubyte offset1;
            offset += offset1 * 0x2000;
        }
        Assert(offset > 0 && (out - offset) >= 0);
        if (copy == 0x0F) {
            copy += read();
        }
        copy += 4;
        FileSelect(outfile);
        // copy from output to output
        for (i = 0; i < copy; ++i) {
            WriteByte(out, ReadByte(out - offset));
            ++out;
        }
        FileSelect(infile);
    } block;
}

However, one problem is that some files have 2 extra bytes at the beginning (for example menu_selfshadow.xbt_compressed).
Another problem is that in some files there is uncompressed data at the end, after the LZ4 sequences.
For example, in the attached barkconfig_37fd2f17.obj file, the last 0x49 bytes do not belong to any LZ4 sequence.
I suspect the decompressor in the game uses the extra bytes at the beginning of the file to know when to stop decompressing.
Example files are attached.
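
For anyone without 010 Editor, here is a rough Python port of the template above. It is only a sketch that mirrors the template's logic: the names `read_ext`, `unpack_type4` and the `header_len` parameter are made up for this example, the single skipped leading byte is the same unexplained byte the template reads as `unknown`, and the code has only been checked against small hand-built streams, not against the game files. It does not handle the 2-byte prefix or the trailing uncompressed data mentioned above.

```python
def read_ext(data, pos):
    """Sum length-extension bytes until one is not 0xFF.

    Returns (total, new_pos), like read() in the template.
    """
    total = 0
    while True:
        b = data[pos]
        pos += 1
        total += b
        if b != 0xFF:
            return total, pos


def unpack_type4(data, header_len=1):
    """Decode the LZ4-like stream the same way the template does."""
    out = bytearray()
    pos = header_len  # skip the unknown leading byte(s)
    n = len(data)
    while pos < n:
        token = data[pos]
        pos += 1
        lit_len, match_len = token >> 4, token & 0x0F
        if lit_len == 0x0F:
            extra, pos = read_ext(data, pos)
            lit_len += extra
        out += data[pos:pos + lit_len]  # literals copied from the input
        pos += lit_len
        if pos >= n:  # the last sequence carries literals only
            break
        offset = data[pos] | (data[pos + 1] << 8)  # little-endian ushort
        pos += 2
        if offset >= 0xE000:  # the trick for offsets beyond 64 KB
            offset += data[pos] * 0x2000
            pos += 1
        if offset <= 0 or offset > len(out):
            raise ValueError("bad offset %d before input position %d" % (offset, pos))
        if match_len == 0x0F:
            extra, pos = read_ext(data, pos)
            match_len += extra
        match_len += 4  # minimum match length
        for _ in range(match_len):  # byte-wise, so overlapping copies work
            out.append(out[-offset])
    return bytes(out)


# Hand-built stream: 1 unknown byte, "abcd" + 4-byte match at offset 4, then "xy".
sample = bytes([0x00, 0x40]) + b"abcd" + bytes([0x04, 0x00, 0x20]) + b"xy"
print(unpack_type4(sample))  # b'abcdabcdxy'
```

If the 2 extra bytes in some files turn out to be part of a header, passing a different `header_len` would be one way to experiment with them, but that is a guess and not something the files have confirmed.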


Attachments:
example-files.zip [41.5 KiB]
Downloaded 16 times