ZenHAX


All times are UTC




PostPosted: Fri Jun 02, 2017 2:24 pm 

Joined: Fri Jun 02, 2017 2:15 pm
Posts: 4
Hi guys,

I ran the Compression Detection BMS script on a file that I knew to be a compressed plaintext document. The script reported that the compression algorithm is ZLIBX.

Can anyone please enlighten me as to what ZLIBX is, and how it differs from ZLIB? I can barely find anything on it via Google. The best I can determine is that it's built upon the deflatex algorithm, which I assume may be part of PKZip.

If someone could please shine some light on this, it'd be very much appreciated.

Thanks


 Post subject: Re: ZLIBX Compression
PostPosted: Fri Jun 02, 2017 3:28 pm 
Site Admin

Joined: Wed Jul 30, 2014 9:32 pm
Posts: 6163
zlibx is used only by the games developed by Aeria Games as far as I know.
The difference between what I called deflatex and deflate is the size of the length and dist tables used by the algorithm: 30 in the original and 288 in this one.
The source code is in src/compression/tinflatex.c of quickbms, and the only difference from the original code is that ALUIGI_30.


 Post subject: Re: ZLIBX Compression
PostPosted: Sun Jun 04, 2017 4:02 am 

Joined: Fri Jun 02, 2017 2:15 pm
Posts: 4
Wow, that's wonderful, everything I needed to know and more :-) Thanks so much for your help.


 Post subject: Re: ZLIBX Compression
PostPosted: Mon Jun 05, 2017 2:20 pm 

Joined: Fri Jun 02, 2017 2:15 pm
Posts: 4
As a follow-on from the above: I was working with files in the CAB archives generated by InstallShield. Even though Wikipedia indicates the compression is ZLib, I found the data would only decompress with your ZLibX variant. Further to that, you indicate that ZLibX is just DeflateX with a 2-byte header. For the archive I was examining, the 2-byte header was the length of the compressed data in the block, along these lines...

Code:
do {
  2 Bytes - Compressed Block Length
  X Bytes - Compressed Data (ZLibX)
  }
until (you reach the total length of the decompressed file)
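
For anyone wanting to try this outside QuickBMS, the block loop above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the 2-byte length is taken to be unsigned little-endian (not confirmed here), and `decompress_block` is a hypothetical callable standing in for a ZLibX decoder such as the one in QuickBMS, since Python's standard zlib module does not implement that variant.

```python
import io
import struct
from typing import BinaryIO, Callable


def read_blocks(stream: BinaryIO, decompressed_length: int,
                decompress_block: Callable[[bytes], bytes]) -> bytes:
    """Read length-prefixed compressed blocks until enough output is produced."""
    out = bytearray()
    while len(out) < decompressed_length:
        prefix = stream.read(2)
        if len(prefix) < 2:
            raise EOFError("ran out of data before reaching the decompressed length")
        # Assumption: the block length is an unsigned little-endian 16-bit value.
        (block_length,) = struct.unpack("<H", prefix)
        block = stream.read(block_length)
        if len(block) < block_length:
            raise EOFError("truncated compressed block")
        # decompress_block is a placeholder for a real ZLibX decoder.
        out += decompress_block(block)
    return bytes(out)
```

To check the framing alone, an identity function can stand in for the decoder: `read_blocks(io.BytesIO(b"\x03\x00abc\x02\x00de"), 5, lambda b: b)` returns `b"abcde"`.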


The InstallShield archives are a CAB file with a HDR file of the same name. The CAB file contains the file data; the HDR contains the directory and other information. The very rough structure I have for this archive is below. Note there is still quite a bit of other data in these archives that I haven't bothered to analyse; I was just interested in getting the files out of there...

Code:
+-----------------------------+
| InstallShield *.cab + *.hdr |
+-----------------------------+

// The *.cab file contains the file data, the *.hdr file contains the directory
// Uses a variant of ZLib Compression (ZLibX, as defined by Aluigi from ZenHax)
// Not all files have filenames

// HDR FILE...

// HEADER
  4 - File Header ("ISc(")
  4 - Unknown (16798209)
  4 - null
  4 - Strings Directory Offset (512)
  4 - Strings Directory Length
  4 - Length of this HDR file
 
X - Unknown Stuff to offset 512

// STRINGS DIRECTORY
  4 - English Strings Tables Offset
  4 - null
  4 - Unknown Length/Offset
  4 - Strings Directory Length
  4 - null
  4 - Unknown (337)
  4 - Unknown (337)
  4 - Unknown (1)
  4 - null
  4 - Unknown (4)
  4 - Unknown (5)
  4 - Unknown (4)

X - Unknown Stuff

// FILES DIRECTORY
  4 - Files Directory Length
 
  // for each file
    4 - Offset to File Details (relative to the start of the Files Directory)
 
  // for each file (58 bytes per entry)
    4 - Filename Offset (relative to the start of the Files Directory) (or null if not a file)
    4 - null
    2 - Entry Type (12=Unknown, 4=File)
    4 - Decompressed File Length (or null if not a file)
    4 - Compressed File Length (or null if not a file)
    4 - Unknown (128) (or null if not a file)
    4 - Unknown (or null if not a file)
    4 - Unknown (or null if not a file)
    4 - null
    4 - null
    4 - File Data Offset in CAB file (first entry starts at 512, non-file entries retain the previous offset value)
    8 - Checksum? Hash? Time?
    8 - Checksum? Hash? Time?
   
  // for each filename
    X - Filename
    1 - null Filename Terminator
   
  1 - null End of HDR File Terminator?
     
 
// CAB FILE...

// HEADER
  4 - File Header ("ISc(")
  4 - Unknown (16798209)
  4 - null
  4 - File Data Offset (512)
  4 - null
  4 - File Data Offset (512)
  4 - null
  4 - null
  4 - Unknown (4)

X - Unknown Stuff

// FILE DATA
  // for each file
    // for each compressed block in the file (keep reading until you reach the DecompressedFileLength or the CompressedFileLength, depending on what you're recording...)
      2 - Compressed Data Block Length
      X - File Data (compressed)
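
As a rough illustration of the HDR layout above, the fixed-size header fields and one 58-byte file entry could be unpacked with Python's `struct` module. This is a sketch under assumptions, not a tested extractor: all integers are assumed to be little-endian, and the names `HDR_HEADER`, `FILE_ENTRY`, `FileEntry`, `parse_hdr_header` and `parse_file_entry` are made up for this example.

```python
import struct
from typing import NamedTuple

# Assumption: every multi-byte field is little-endian.
HDR_HEADER = struct.Struct("<4s5I")       # first 24 bytes of the HDR file
FILE_ENTRY = struct.Struct("<IIH8I8s8s")  # 58 bytes per directory entry


class FileEntry(NamedTuple):
    filename_offset: int      # relative to the start of the Files Directory
    entry_type: int           # 4 = file, 12 = unknown
    decompressed_length: int
    compressed_length: int
    data_offset: int          # offset of the file data in the CAB file


def parse_hdr_header(buf: bytes) -> dict:
    """Unpack the six header fields and check the "ISc(" signature."""
    magic, unknown, _null, str_off, str_len, hdr_len = HDR_HEADER.unpack_from(buf, 0)
    if magic != b"ISc(":
        raise ValueError("missing ISc( signature")
    return {
        "unknown": unknown,
        "strings_directory_offset": str_off,
        "strings_directory_length": str_len,
        "hdr_length": hdr_len,
    }


def parse_file_entry(buf: bytes, offset: int) -> FileEntry:
    """Unpack one 58-byte entry located at an absolute offset in buf."""
    f = FILE_ENTRY.unpack_from(buf, offset)
    # f[1], f[8], f[9] are the null fields; f[5..7] and f[11..12] are unknown.
    return FileEntry(filename_offset=f[0], entry_type=f[2],
                     decompressed_length=f[3], compressed_length=f[4],
                     data_offset=f[10])
```

`FILE_ENTRY.size` evaluates to 58, which matches the per-entry size given in the listing.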


If you're interested in the archives I was examining, this particular one was the demo of the game Project Eden (downloaded from https://archive.org/details/ProjectEdenDemo). I used WinRAR to extract the files from the EXE wrapper. There were 3 InstallShield CAB+HDR archives within it, all of which parsed and extracted fine with the above structure.

Thanks so much Aluigi for your help, you're welcome to use the above info to create a BMS if you want.


 Post subject: Re: ZLIBX Compression
PostPosted: Mon Jun 05, 2017 9:37 pm 
Site Admin

Joined: Wed Jul 30, 2014 9:32 pm
Posts: 6163
Very good, I have made a script on the fly and it works perfectly with that Project Eden sample:
http://aluigi.org/bms/installshield_cab.bms

I'm just curious whether it also works correctly with other cabs :)


PostPosted: Tue Jun 06, 2017 1:43 pm 

Joined: Fri Jun 02, 2017 2:15 pm
Posts: 4
Hi Aluigi,

Some of the earlier InstallShield CAB archives appear to have a different format (the files in the ProjectEden EXE say it's from 2001 - I just found another InstallShield EXE with files from 1999). The format is quite similar, but fundamentally there is no separate HDR file - it's just a single CAB archive. The directory entries are also missing their CRCs and/or Timestamps in this version.

See a rough spec below...

Code:
+-------------------------------------+
| InstallShield *.cab (no *.hdr file) |
+-------------------------------------+

// Uses a variant of ZLib Compression (ZLibX, as defined by Aluigi from ZenHax)
// Not all files have filenames

// HEADER
  4 - File Header ("ISc(")
  2 - Version? (4)
  2 - Unknown (256)
  4 - null
  4 - Unknown Data Block Offset (512)
  4 - Unknown Data Block Length (approximate only) [-8 or -16]
  4 - Unknown
  4 - null
  4 - null
  4 - Number of Files [+1] (including files of type=8)
 
X - Unknown Stuff to offset 512

// UNKNOWN DATA BLOCK
  4 - Unknown
  4 - null
  4 - Unknown
  4 - Unknown Data Block Length (including these header fields)
  X - Unknown
 
// FILES DIRECTORY
  4 - Files Directory Length
 
  // for each file and each directory
    4 - Offset to File Details (relative to the start of the Files Directory) (not necessarily in offset order!) (or offset to directory name)
 
  // for each file (42 bytes per entry)
    4 - Filename Offset (relative to the start of the Files Directory) (or -35520232 if not a file)
    4 - Directory Name ID Number (index starts at 0)
    2 - Entry Type (8=Unknown, 4=File)
    4 - Decompressed File Length (or null if not a file)
    4 - Compressed File Length (or null if not a file)
    4 - Unknown (32/33) (or null if not a file)
    4 - Unknown (or null if not a file)
    4 - Unknown (or null if not a file)
    4 - null
    4 - null
    4 - File Data Offset
   
  // for each filename
    X - Filename
    1 - null Filename Terminator
 
  // for each directory name
    X - Directory Name (can be empty - ie 0 bytes)
    1 - null Directory Name Terminator
   
// FILE DATA
  // for each file
    // for each compressed block in the file (keep reading until you reach the DecompressedFileLength or the CompressedFileLength, depending on what you're recording...)
      2 - Compressed Data Block Length
      X - File Data (compressed)


Reading the directory was a little more complicated too (the little 4-byte loop at the start of the Files Directory), as the offsets are not necessarily in the correct order, and it also includes entries for directory names.
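
A possible way to handle the out-of-order offsets is to read the whole offset table first and then seek to each entry individually rather than reading sequentially. The Python sketch below assumes little-endian integers, and assumes "start of the Files Directory" means the position of the 4-byte Files Directory Length field. The names `OLD_FILE_ENTRY`, `NOT_A_FILE`, `read_offset_table`, `read_cstring` and `parse_old_entry` are invented for illustration. It does not try to tell file-entry offsets apart from directory-name offsets in the table; that is left to the caller.

```python
import struct
from typing import List

# Assumption: little-endian; the filename offset is read as signed so the
# -35520232 "not a file" marker compares directly.
OLD_FILE_ENTRY = struct.Struct("<iIH8I")  # 42 bytes per entry
NOT_A_FILE = -35520232


def read_offset_table(buf: bytes, dir_start: int, count: int) -> List[int]:
    """Read `count` 4-byte offsets that follow the Files Directory Length."""
    return list(struct.unpack_from(f"<{count}I", buf, dir_start + 4))


def read_cstring(buf: bytes, pos: int) -> bytes:
    """Return the bytes from pos up to (not including) the null terminator."""
    return buf[pos:buf.index(b"\x00", pos)]


def parse_old_entry(buf: bytes, dir_start: int, rel_offset: int) -> dict:
    """Unpack one 42-byte entry at an offset relative to the directory start."""
    f = OLD_FILE_ENTRY.unpack_from(buf, dir_start + rel_offset)
    is_file = f[2] == 4 and f[0] != NOT_A_FILE
    return {
        "is_file": is_file,
        "name": read_cstring(buf, dir_start + f[0]) if is_file else b"",
        "directory_id": f[1],
        "decompressed_length": f[3],
        "compressed_length": f[4],
        "data_offset": f[10],
    }
```

Because each entry is located through its own offset, the order of the table no longer matters; sorting the offsets is only needed if you want to process entries in on-disk order.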

For reference/validation, I was working with the PBA Bowling 2 demo that I downloaded from http://download.cnet.com/PBA-Bowling-2- ... 34625.html . As before, extract the CAB files from the self-extracting EXE using WinRAR, then run through the above specs. This CAB has some good files in there for testing - MP3/WAV audio, INI files, and TGA images.


PostPosted: Tue Jun 06, 2017 4:00 pm 
Site Admin

Joined: Wed Jul 30, 2014 9:32 pm
Posts: 6163
Script 0.2 :D

