ZenHAX

Free Game Research Forum | Official QuickBMS support | twitter @zenhax | SSL HTTPS://zenhax.com
It is currently Tue Dec 11, 2018 1:45 pm

All times are UTC




Posted: Sun Nov 11, 2018 6:07 am

Joined: Sun Nov 11, 2018 5:54 am
Posts: 2
Hello! I've hit a wall trying to rip the assets from Harvest Moon - A Wonderful Life on GameCube. The issue is that most of the assets are compressed in .clz files, which are an absolute mystery to me. I assumed it was some variation of the LZ compression that seems to be common in a lot of Nintendo games, but the tools available for those formats indicate that it's a non-standard compression. As far as I can tell, this compression was only used in this game, in the release of the same game with a female main character, and in the PS2 port.

I'm sorry this isn't terribly specific. If someone could point me in a helpful direction, that would be great. I'm still playing around with some of the QuickBMS tools, but I have a growing sense of dread that I'll need to reverse engineer the compression myself. Luckily, there are test files included on the disc that have the same information, where only one is compressed! I've included one example here. Optimistically, this should make it easy to determine which compression the file uses.

I'm sure you could tell, but this is my first exploration into reverse engineering something like this. Please pardon my ignorance.

Thanks! :)


Attachments:
File comment: Same file, uncompressed
Dummy_Uncompressed.txt [188 Bytes]
Downloaded 20 times
File comment: CLZ compressed file
Dummy_Compressed.txt [65 Bytes]
Downloaded 21 times
Posted: Mon Nov 12, 2018 9:21 am
Site Admin

Joined: Wed Jul 30, 2014 9:32 pm
Posts: 9438
Can you provide bigger compressed samples?
The only good results I obtained (bcl_rice, lzfu_raw and SCUMMVM39) look like false positives, so I don't think I have a ready solution.

Dummy_Uncompressed.txt is not related to Dummy_Compressed.txt.


Posted: Mon Nov 12, 2018 5:54 pm

Joined: Sun Nov 11, 2018 5:54 am
Posts: 2
Sure. Most of the data for the game is stored in .arc.clz files. I think an .arc is similar to a tarball, so after decompression the result should be a valid .arc file. I've used the comtype_scan2.bat tool and likewise found a couple of results that were close to being a valid .arc file, but none that worked.

Thank you so much for looking at this, and sorry for the unrelated files. I assumed they were related because of their file sizes, which in hindsight was a bad assumption.


Attachments:
mainchapter1.txt [4.27 MiB]
Downloaded 18 times
Posted: Sun Dec 02, 2018 6:48 am

Joined: Sun Dec 02, 2018 5:53 am
Posts: 3
I've also been looking into the clz compression from A Wonderful Life.

From what I can tell, the file header is composed of several parts:
  1. 4 bytes at 0x00000000: the CLZ identifier (i.e. 43 4C 5A 00)
  2. 4 bytes at 0x00000004: the size in bytes of the decompressed data, stored big-endian (e.g. 00 53 54 90 = 0x00535490 = 5,461,136 bytes [5.46 MB] for AWL’s commonall.arc)
    This is only speculation for now. I can't confirm what this field actually is until I successfully decompress a clz file.
  3. 4 bytes at 0x00000008: blank space (i.e. 00 00 00 00)
  4. 4 bytes at 0x0000000C: a repeat of the size in bytes (e.g. 00 53 54 90 for the above file)
  5. 1 byte at 0x00000010: a null byte (i.e. 00)
  6. The compressed file data, starting at 0x00000011 (e.g. 55 AA 38 2D, as this file contains a U8 [arc] archive)
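
For anyone who wants to poke at the files, the layout listed above could be read with something like the following Python. This is only a sketch of the fields as described in this post: the names `ClzHeader` and `parse_clz_header` are made up for illustration, the big-endian byte order is taken from the example bytes, and the meaning of the two size fields is still unconfirmed.

```python
import struct
from dataclasses import dataclass

CLZ_MAGIC = b"CLZ\x00"   # 43 4C 5A 00
PAYLOAD_OFFSET = 0x11    # where the compressed data appears to start


@dataclass
class ClzHeader:
    size_a: int    # offset 0x04, speculated decompressed size
    reserved: int  # offset 0x08, zero in the samples seen so far
    size_b: int    # offset 0x0C, appears to repeat size_a
    flag: int      # offset 0x10, null byte in the samples seen so far


def parse_clz_header(data: bytes) -> ClzHeader:
    """Read the assumed 17-byte CLZ header (hypothetical helper)."""
    if len(data) < PAYLOAD_OFFSET:
        raise ValueError("file is too short to hold a CLZ header")
    if data[:4] != CLZ_MAGIC:
        raise ValueError("missing CLZ identifier")
    # Three big-endian 32-bit fields followed by a single byte.
    size_a, reserved, size_b, flag = struct.unpack_from(">IIIB", data, 4)
    return ClzHeader(size_a, reserved, size_b, flag)


# Synthetic header built from the example bytes quoted in the list.
sample = bytes.fromhex("434C5A00 00535490 00000000 00535490 00 55AA382D")
header = parse_clz_header(sample)
print(header)
print(sample[PAYLOAD_OFFSET:].hex())
```

On a real file, `data` would come from `open(path, "rb").read()`. If `size_a` and `size_b` ever disagree, or `reserved` is non-zero, that would be a hint that the guesses above are incomplete.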

I ran signsrch on the game executables and got the following results:
Quote:
A Wonderful Life: dvdroot/&&systemdata/Start.dol
Code:
  offset   num  description [bits.endian.size]
  --------------------------------------------
  0024bc70 3049 DMC compression [32.be.16&]
  0024bee1 1038 padding used in hashing algorithms (0x80 0 ... 0) [..64]
  002521c8 2304 zinflate_distanceExtraBits [32.be.120]
  002521cb 2303 zinflate_distanceExtraBits [32.le.120]
  0028e19b 1040 SSL3 #define [32.le.176&]
  0028e7a8 2417 MBC2 [32.le.248&]
  0028e7ab 2418 MBC2 [32.be.248&]
  002939c8 1563 libavcodec ff_zigzag_direct [..64]

- 8 signatures found in the file in 1 seconds


Another Wonderful Life (girl version of the game): dvdroot/&&systemdata/Start.dol
Code:
  offset   num  description [bits.endian.size]
  --------------------------------------------
  0023bd54 2417 MBC2 [32.le.248&]
  0023c36b 2418 MBC2 [32.be.248&]
  0024d3c4 3049 DMC compression [32.be.16&]
  0024d5d1 1038 padding used in hashing algorithms (0x80 0 ... 0) [..64]
  00250dd0 2304 zinflate_distanceExtraBits [32.be.120]
  00250dd3 2303 zinflate_distanceExtraBits [32.le.120]
  0028ebb8 1563 libavcodec ff_zigzag_direct [..64]

- 7 signatures found in the file in 1 seconds


Interestingly, the PS2 version of A Wonderful Life Special Edition contains both a compressed and uncompressed version of what appears to be the same file (mainchapter0.arc.clz and mainchapter0.arc).
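
If that pair really is the same file, it gives a cheap way to test the size-field guess: read the 4 bytes at offset 0x04 of the .clz and compare them with the length of the uncompressed .arc. A rough Python sketch follows. The helper names are hypothetical, it assumes the PS2 files use the same header layout as the GameCube ones, and since the PS2 byte order is not known here it simply tries both.

```python
import struct


def size_field_candidates(clz: bytes) -> dict:
    """Interpret the 4 bytes at offset 0x04 in both byte orders."""
    return {
        "big": struct.unpack_from(">I", clz, 4)[0],
        "little": struct.unpack_from("<I", clz, 4)[0],
    }


def matching_byte_orders(clz: bytes, raw: bytes) -> list:
    """Return the byte orders for which the field equals len(raw)."""
    candidates = size_field_candidates(clz)
    return [order for order, value in candidates.items() if value == len(raw)]


# Tiny synthetic stand-in for the real pair (not actual game data).
raw = b"A" * 300
clz = (b"CLZ\x00" + struct.pack("<I", len(raw)) + bytes(4)
       + struct.pack("<I", len(raw)) + b"\x00" + b"??")
print(matching_byte_orders(clz, raw))
```

With the real files, `clz` and `raw` would be the contents of mainchapter0.arc.clz and mainchapter0.arc. An empty list would mean either the field is not the decompressed size or the two files are not actually the same data.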


Posted: Mon Dec 10, 2018 12:26 am

Joined: Sun Dec 02, 2018 5:53 am
Posts: 3
I ran another one of the files (preload.arc.clz) through comtype_scan2 and it seems like the best candidate would be some variant of either LZFU (most likely) or FIN (less likely).
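
One way to rank candidate dumps like these is to check how much of the expected U8 identifier (55 AA 38 2D) they start with and whether their length matches the size stored in the CLZ header. The Python below is a minimal sketch of that idea. `score_candidate` is a made-up helper, and both checks rest on assumptions from earlier in the thread: that the decompressed data is a U8 archive and that the header field really is the decompressed size.

```python
U8_MAGIC = bytes.fromhex("55AA382D")


def score_candidate(dump: bytes, expected_size: int) -> dict:
    """Score one decompression attempt against the two assumed checks."""
    # Count how many leading bytes agree with the U8 identifier.
    prefix = 0
    for got, want in zip(dump, U8_MAGIC):
        if got != want:
            break
        prefix += 1
    return {
        "magic_prefix": prefix,
        "magic_ok": prefix == len(U8_MAGIC),
        "size_ok": len(dump) == expected_size,
    }


# Synthetic examples, not real comtype_scan2 output.
good = U8_MAGIC + bytes(12)
bad = bytes.fromhex("55AA0000") + bytes(4)
print(score_candidate(good, 16))
print(score_candidate(bad, 16))
```

A dump that passes neither check is probably not worth inspecting by hand, while one with a correct identifier but the wrong size may point to a close but incorrect variant.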


Attachments:
File comment: LZFU.dmp output from comtype_scan2 analysis of preload.arc.clz
LZFU.dmp.zip [275.75 KiB]
Downloaded 1 time
File comment: FIN.dmp output from comtype_scan2 analysis of preload.arc.clz
FIN.dmp.zip [461.51 KiB]
Downloaded 1 time
File comment: CLZ-Compressed version of preload.arc (U8 archive).
preload.arc.clz.zip [372.7 KiB]
Downloaded 1 time
Posted: Mon Dec 10, 2018 12:49 am

Joined: Sun Dec 02, 2018 5:53 am
Posts: 3
I also tried scanning the above file (preload.arc.clz) using offzip, and got the following results:
Attachment:
File comment: Output of "offzip.exe -z -15 -S preload.arc.clz 0x00000010"
offzip_output_preload.arc.clz.txt [4.24 KiB]
Downloaded 2 times


Summary of valid compressed streams:
Code:
+------------+-----+----------------------------+----------------------+
| hex_offset | ... | zip -> unzip size / offset | spaces before | info |
+------------+-----+----------------------------+----------------------+
  0x00000fd1  61201 -> 61187 / 0x0000fee2 _ 4049
  0x00019447  45209 -> 45199 / 0x000244e0 _ 38245
  0x0002d8b0  36618 -> 36608 / 0x000367ba _ 37840
  0x00037206  46 -> 321 / 0x00037234 _ 2636
  0x0003a263  65375 -> 65365 / 0x0004a1c2 _ 12335
  0x0004aeca  42 -> 342 / 0x0004aef4 _ 3336
  0x0004d330  34 -> 347 / 0x0004d352 _ 9276
  0x00052f48  55 -> 665 / 0x00052f7f _ 23542
  0x00057771  37 -> 47 / 0x00057796 _ 18418
  0x0005ab5a  36 -> 85 / 0x0005ab7e _ 13252
  0x0005f9b6  34 -> 103 / 0x0005f9d8 _ 20024
 
- 11 valid compressed streams found
- 0x00032f2f -> 0x0003355d bytes covering the 51% of the file
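
As a sanity check on the summary, the per-stream sizes can be added up and compared with the totals offzip prints on its last line. The short Python below does that with the numbers copied from the table. It also counts how many of the covered bytes sit in "streams" whose output is not larger than their input. Reading that as a warning sign is my own interpretation, not something offzip reports.

```python
# (zip_size, unzip_size) pairs copied from the offzip summary above.
streams = [
    (61201, 61187), (45209, 45199), (36618, 36608), (46, 321),
    (65375, 65365), (42, 342), (34, 347), (55, 665),
    (37, 47), (36, 85), (34, 103),
]

total_zip = sum(z for z, _ in streams)
total_unzip = sum(u for _, u in streams)
print(hex(total_zip), hex(total_unzip))  # compare with the last summary line

# Streams where inflating produced no more bytes than were consumed.
no_gain = [(z, u) for z, u in streams if u <= z]
no_gain_bytes = sum(z for z, _ in no_gain)
print(len(no_gain), "streams,", no_gain_bytes, "of", total_zip, "covered bytes")
```

The totals agree with offzip's 0x00032f2f and 0x0003355d, and almost all of the covered bytes belong to the four large streams that come out slightly smaller than they went in.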


Posted: Mon Dec 10, 2018 7:46 am
Site Admin
User avatar

Joined: Wed Jul 30, 2014 9:32 pm
Posts: 9438
Raw deflate is prone to many false positives because it's just the compressed data, without the header and checksum that a zlib stream has.
So you can ignore those results.
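
To illustrate the difference, here is a small Python sketch using the standard `zlib` module. It shows that a zlib stream is raw deflate wrapped in a 2-byte header and a 4-byte Adler-32 trailer, and that a two-byte sequence as ordinary as 03 00 already passes as a complete raw deflate stream. The helper `inflates_completely` is made up for this example, and this is not how offzip itself scans.

```python
import zlib

data = b"U8 archive bytes " * 16

# zlib stream = 2-byte header + raw deflate data + 4-byte Adler-32 trailer.
wrapped = zlib.compress(data)
header_ok = ((wrapped[0] << 8) | wrapped[1]) % 31 == 0  # header check bits
trailer_ok = int.from_bytes(wrapped[-4:], "big") == zlib.adler32(data)

# Stripping header and trailer leaves raw deflate, which has neither.
raw = wrapped[2:-4]
roundtrip_ok = zlib.decompress(raw, wbits=-15) == data


def inflates_completely(buf: bytes, wbits: int) -> bool:
    """True if buf decodes without error and reaches the end of a stream."""
    try:
        decoder = zlib.decompressobj(wbits=wbits)
        decoder.decompress(buf)
        return decoder.eof
    except zlib.error:
        return False


# 03 00 is a valid raw deflate stream (one empty final block), so a raw
# scan accepts it, while the zlib header check rejects it right away.
print(inflates_completely(b"\x03\x00", wbits=-15))
print(inflates_completely(b"\x03\x00", wbits=15))
print(header_ok, trailer_ok, roundtrip_ok)
```

The header and checksum give a scanner two extra things to verify, which is why hits found in zlib mode are more trustworthy than hits found in raw deflate mode.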

