ZenHAX


All times are UTC




Posted: Thu Aug 14, 2014 5:45 pm
Site Admin

Joined: Wed Jul 30, 2014 9:32 pm
Posts: 6436
A real example/tutorial about a not-so-simple archive format.
I think this is a bit advanced for beginners but I wanted to do something based on a recent file format I analyzed.

Tools:
QuickBMS http://quickbms.aluigi.org
A hex editor: if you have no idea which one to pick, try XVI32
Read hex, speak hex, eat hex: forget decimal notation and think only in 0xNUMBER, it's what will help you during reversing... so 10 is 0xa.

The sample is attached:
download/file.php?mode=view&id=47


First step, open the file with a hex editor and check its content:
Image


Things to notice:

Do we have a magic number?
It's a string/signature or number that is usually used to identify a file format, for example ZIP archives have "PK".

In this case we have a 0x00 byte followed by "CAP", which looks just like a magic number.
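
A signature check like this is easy to script. The snippet below is only a minimal Python sketch (Python is my choice here, the tutorial itself uses a hex editor and QuickBMS); `has_magic` is a hypothetical helper name and the 16 header bytes are copied from the sample's hex view.

```python
def has_magic(data: bytes, magic: bytes = b"\x00CAP") -> bool:
    """Return True if the buffer starts with the expected signature."""
    return data[:len(magic)] == magic


# First 16 bytes of the sample, copied from the hex editor.
header = bytes.fromhex("00434150 00000000 0000000c 00000000")

print(has_magic(header))                 # True
print(has_magic(b"PK\x03\x04", b"PK"))   # True: ZIP archives start with "PK"
print(has_magic(b"DDS |"))               # False: not a \0CAP archive
```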


What is the endianness?
The endianness is the byte order in which numbers are stored in the archive.
Big endian of 0x11223344 is 11 22 33 44
Little endian of 0x11223344 is 44 33 22 11

The trick is to look at the data as blocks of 4 bytes (32 bits), so in this case after the 0x00 "CAP" signature we have:
00 00 00 00
00 00 00 0c

That second field looks just like 0xc, so the format is big endian.
It wouldn't make sense for it to be 0x0c000000 in little endian :)
We don't know yet what this 0xc is, we'll check it later.
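
To make the two byte orders concrete, here is a small Python sketch (my own illustration, not part of the original analysis). It converts 0x11223344 to bytes both ways, then reads the field at offset 8 of the sample in both byte orders; `hexbytes` is a hypothetical helper.

```python
def hexbytes(raw: bytes) -> str:
    """Format bytes the way a hex editor shows them."""
    return " ".join(f"{b:02x}" for b in raw)


# Number -> bytes, in both byte orders.
print(hexbytes((0x11223344).to_bytes(4, "big")))     # 11 22 33 44
print(hexbytes((0x11223344).to_bytes(4, "little")))  # 44 33 22 11

# Bytes -> number: the 32-bit field at offset 8 of the sample.
field = bytes.fromhex("0000000c")
as_big = int.from_bytes(field, "big")
as_little = int.from_bytes(field, "little")

print(hex(as_big))     # 0xc
print(hex(as_little))  # 0xc000000
```

The little endian reading is about 200 million, far too large for a counter or offset in a file of a few megabytes, which is why big endian is the sensible guess.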


Then we have another 00 00 00 00, skip the fields set to zero.

And now 00 00 00 e0, so 0xe0.

We are at the beginning of the format so it may be an offset, or the size of a section or the number of files or maybe nothing important.

In your hex editor press CTRL-G, select hexadecimal and type e0:
Image


The data at that offset looks just like a DDS image, so make a note of it.
We can tell that the file is not compressed because there are many zeroes and its format is easy to identify.
Image


The next field is 00 3e db 38, so 0x3edb38.
Repeat the same operation as before and at that offset you will see a sequence of complete filenames (path + name):
Image


Now go back to the beginning of the file, because there are still a lot of fields between the ones we have already read and that DDS image at 0xe0:
Code:
00 43 41 50 00 00 00 00 00 00 00 0c 00 00 00 00   .CAP............
00 00 00 e0 00 3e db 38 00 00 00 00 00 00 00 00   .....>.8........
00 00 00 00 00 10 00 80 00 00 00 00 00 00 00 2f   .............../
00 10 00 80 00 10 00 80 00 00 00 30 00 00 00 24   ...........0...$
00 20 01 00 00 01 cc a8 00 00 00 58 00 00 00 1c   . .........X....
00 21 cd a8 00 00 6a 5f 00 00 00 78 00 00 00 1c   .!....j_...x....
00 22 38 08 00 02 a9 11 00 00 00 98 00 00 00 1c   ."8.............
00 24 e1 1c 00 04 13 83 00 00 00 b8 00 00 00 1c   .$..............
00 28 f4 a0 00 04 70 93 00 00 00 d8 00 00 00 1c   .(....p.........
00 2d 65 34 00 03 85 79 00 00 00 f8 00 00 00 1c   .-e4...y........
00 30 ea b0 00 04 e2 dc 00 00 01 18 00 00 00 1c   .0..............
00 35 cd 8c 00 01 8b 5a 00 00 01 38 00 00 00 1c   .5.....Z...8....
00 37 58 e8 00 07 7a 40 00 00 01 58 00 00 00 14   .7X...z@...X....
00 3e d3 28 00 00 07 2f 00 00 01 70 00 00 00 2b   .>.(.../...p...+
44 44 53 20 7c 00 00 00 07 10 00 00 00 04 00 00   DDS |...........

If you look carefully you can notice a certain "pattern" starting at offset 0x20.
Basically a sequence of fields that gets repeated: 4 32-bit numbers, then 4 more, then 4 more...

So let's try to identify this pattern by splitting the fields in our mind:
Image
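
As a rough illustration of that split, here is a Python sketch (Python and every name in it are my own choice, not part of the original analysis). It cuts the bytes from 0x20 to 0xe0 into records of four big-endian 32-bit numbers; the bytes are copied from the dump above.

```python
import struct

# The first 0xe0 bytes of the sample, copied from the hex dump.
DUMP = bytes.fromhex("""
00 43 41 50 00 00 00 00 00 00 00 0c 00 00 00 00
00 00 00 e0 00 3e db 38 00 00 00 00 00 00 00 00
00 00 00 00 00 10 00 80 00 00 00 00 00 00 00 2f
00 10 00 80 00 10 00 80 00 00 00 30 00 00 00 24
00 20 01 00 00 01 cc a8 00 00 00 58 00 00 00 1c
00 21 cd a8 00 00 6a 5f 00 00 00 78 00 00 00 1c
00 22 38 08 00 02 a9 11 00 00 00 98 00 00 00 1c
00 24 e1 1c 00 04 13 83 00 00 00 b8 00 00 00 1c
00 28 f4 a0 00 04 70 93 00 00 00 d8 00 00 00 1c
00 2d 65 34 00 03 85 79 00 00 00 f8 00 00 00 1c
00 30 ea b0 00 04 e2 dc 00 00 01 18 00 00 00 1c
00 35 cd 8c 00 01 8b 5a 00 00 01 38 00 00 00 1c
00 37 58 e8 00 07 7a 40 00 00 01 58 00 00 00 14
00 3e d3 28 00 00 07 2f 00 00 01 70 00 00 00 2b
""")

# ">4I" = big endian, four unsigned 32-bit integers per record.
records = list(struct.iter_unpack(">4I", DUMP[0x20:]))

for first, second, third, fourth in records:
    print(f"{first:08x} {second:08x} {third:08x} {fourth:08x}")
```

Printed this way, each row of the table is one record, which makes the repeating structure much easier to see than in the raw hex view.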


Now it's time to do some math.

The number 0xc at the beginning of the file is also the number of these 4-field patterns, so that 0xc is probably the number of files.
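
The count can be double-checked with a line of arithmetic. This tiny Python sketch assumes, as the dump suggests, that the table starts at 0x20 and ends exactly where the DDS data begins at 0xe0; the constant names are my own.

```python
TABLE_START = 0x20   # where the repeating pattern begins
DATA_START = 0xe0    # where the first DDS image begins
ENTRY_SIZE = 4 * 4   # four 32-bit fields per entry

entries = (DATA_START - TABLE_START) // ENTRY_SIZE
print(hex(entries))  # 0xc
```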

The first field of the first entry is zero, so if it's an offset it is surely a relative one:
OFFSET + 0xe0 = file offset.

The second field seems related to the first one.
For example:
OFFSET 0x00 and second field 0x100080
OFFSET 0x100080 (previous offset + previous size) and second field 0x100080
OFFSET 0x200100 (previous offset + previous size) and second field 0x1cca8
and so on.
So let's say it's a SIZE.
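
The OFFSET/SIZE guess can be tested on all 12 entries at once. The Python sketch below is my own check, not part of the original analysis. One caveat: for some later entries offset + size falls 1 to 3 bytes short of the next offset, which is consistent with the data being padded to 4 bytes. That padding is my reading of the numbers in the dump, the post does not state it.

```python
DATA_START = 0xe0

# (relative offset, size) of the 12 entries, copied from the hex dump.
ENTRIES = [
    (0x000000, 0x100080), (0x100080, 0x100080), (0x200100, 0x01cca8),
    (0x21cda8, 0x006a5f), (0x223808, 0x02a911), (0x24e11c, 0x041383),
    (0x28f4a0, 0x047093), (0x2d6534, 0x038579), (0x30eab0, 0x04e2dc),
    (0x35cd8c, 0x018b5a), (0x3758e8, 0x077a40), (0x3ed328, 0x00072f),
]


def align4(n: int) -> int:
    """Round n up to the next multiple of 4 (assumed padding)."""
    return (n + 3) & ~3


def chain_holds(entries) -> bool:
    """True if every entry ends (after padding) where the next one starts."""
    for (off, size), (next_off, _) in zip(entries, entries[1:]):
        if align4(off + size) != next_off:
            return False
    return True


print(chain_holds(ENTRIES))  # True

# The end of the last file lands exactly on the names table.
last_off, last_size = ENTRIES[-1]
print(hex(DATA_START + align4(last_off + last_size)))  # 0x3edb38
```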

When we work with archives we need at least 3 parameters:
  • OFFSET
  • SIZE
  • NAME

The missing one is NAME, and considering the ascending numbers in the third field (0, 0x30, 0x58, 0x78) it may be a relative offset into the names table we saw earlier at offset 0x3edb38.
Check it:
0x3edb38 + 0x30 = "BattleRes/talk/st_bg/screen_bg02.dds"
0x3edb38 + 0x58 = "se/talk/07Vat/Vat00_0001.msf"
0x3edb38 + 0x78 = "se/talk/07Vat/Vat00_0002.msf"
Ok we have the NAME relative offset :)
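
The same lookup in Python, as a hedged sketch: `read_cstring` is a hypothetical helper, and it assumes each name ends with a 0x00 byte. The buffer below is only a stand-in for the bytes at 0x3edb38; it holds the three names quoted above at their relative offsets, with everything else (including the first name, which the post does not quote) left as zeroes.

```python
def read_cstring(table: bytes, rel_off: int) -> str:
    """Read a NUL-terminated ASCII string starting at rel_off."""
    end = table.index(b"\x00", rel_off)
    return table[rel_off:end].decode("ascii")


# Stand-in for the names table (archive[0x3edb38:] in the real file).
buf = bytearray(0x98)
for rel_off, name in [
    (0x30, b"BattleRes/talk/st_bg/screen_bg02.dds"),
    (0x58, b"se/talk/07Vat/Vat00_0001.msf"),
    (0x78, b"se/talk/07Vat/Vat00_0002.msf"),
]:
    buf[rel_off:rel_off + len(name)] = name
names_table = bytes(buf)

for rel_off in (0x30, 0x58, 0x78):
    print(hex(0x3edb38 + rel_off), read_cstring(names_table, rel_off))
```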

And what may the last field be?
It's not important, but if you check it you will notice that it's related to the increasing name offset: it's the name size.
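
A quick Python check of that claim, as a sketch of my own. The (name offset, name size) pairs are copied from the hex dump. The spacing rule used here, one terminator byte plus padding to 4 bytes, is an assumption that happens to fit the numbers; the post does not spell it out.

```python
# (name offset, name size) of the 12 entries, copied from the hex dump.
NAME_FIELDS = [
    (0x000, 0x2f), (0x030, 0x24), (0x058, 0x1c), (0x078, 0x1c),
    (0x098, 0x1c), (0x0b8, 0x1c), (0x0d8, 0x1c), (0x0f8, 0x1c),
    (0x118, 0x1c), (0x138, 0x1c), (0x158, 0x14), (0x170, 0x2b),
]

# The fourth field matches the length of the names quoted earlier.
print(hex(len("BattleRes/talk/st_bg/screen_bg02.dds")))  # 0x24
print(hex(len("se/talk/07Vat/Vat00_0001.msf")))          # 0x1c


def padded(size: int) -> int:
    """Name size plus one terminator byte, rounded up to 4 (assumed)."""
    return (size + 1 + 3) & ~3


spacing_ok = all(
    off + padded(size) == next_off
    for (off, size), (next_off, _) in zip(NAME_FIELDS, NAME_FIELDS[1:])
)
print(spacing_ok)  # True
```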


The reversing of the file format is finished: we can now extract all the files with their original filenames.
Let's check the corresponding script:
http://aluigi.org/papers/bms/others/uniel.bms
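
For readers who prefer to see the whole layout in one place, here is a minimal Python sketch of an extractor built only from the fields identified above. It is not the linked uniel.bms script and not QuickBMS syntax; `Entry`, `parse_cap` and `extract_all` are hypothetical names, and the header positions (count at 0x08, data start at 0x10, names table at 0x14, entries from 0x20) are taken from this one sample.

```python
import struct
from typing import Dict, List, NamedTuple


class Entry(NamedTuple):
    name: str
    offset: int  # absolute offset of the file data in the archive
    size: int


def parse_cap(data: bytes) -> List[Entry]:
    """Parse the file table of a \\0CAP archive (layout assumed from the sample)."""
    if data[:4] != b"\x00CAP":
        raise ValueError("missing \\0CAP signature")
    # Fields at 0x04..0x17: zero, file count, zero, data start, names table.
    _zero, count, _zero2, data_base, names_base = struct.unpack_from(">5I", data, 4)
    entries = []
    for i in range(count):
        # Each entry: relative offset, size, name offset, name size.
        rel_off, size, name_off, name_size = struct.unpack_from(
            ">4I", data, 0x20 + i * 16
        )
        start = names_base + name_off
        name = data[start:start + name_size].decode("ascii", errors="replace")
        entries.append(Entry(name, data_base + rel_off, size))
    return entries


def extract_all(data: bytes) -> Dict[str, bytes]:
    """Return a mapping of stored filename -> file contents."""
    return {e.name: data[e.offset:e.offset + e.size] for e in parse_cap(data)}
```

To try it, read the whole archive into memory with `open(path, "rb").read()` and pass the bytes to `extract_all`; writing the results to disk is left out to keep the sketch short.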


And in the next post we will see how to write the script from scratch with these parameters.


Attachments:
img5.png [8.13 KiB]
img4.png [19.26 KiB]
img3.png [13.01 KiB]
img2.png [3.03 KiB]
img0.png [12.26 KiB]
File comment: Sample
talk_Vat_00.zip [2.04 MiB]
   
Posted: Thu Nov 06, 2014 3:44 pm

Joined: Fri Oct 24, 2014 3:13 pm
Posts: 71
Very good tutorial, thanks for sharing


   
Posted: Sun May 31, 2015 10:27 am

Joined: Sat Feb 21, 2015 2:09 pm
Posts: 4
aluigi wrote:
And in the next post we will see how to write the script from scratch with these parameters.

Is there any "next post"? I can't find it.


   
Posted: Sun May 31, 2015 2:02 pm
Site Admin

Joined: Wed Jul 30, 2014 9:32 pm
Posts: 6436
Unfortunately I haven't spent any more time on the Tutorials section.
The main reason is that there are so many example scripts for quickbms that I think and hope people can learn by reading them and checking the related sample files with a hex editor.

