Talk:Project Gotham Racing 2

== Reverse engineering notes for file formats ==

--[[User:Zaykho|Zaykho]] ([[User talk:Zaykho|talk]]) 10:34, 22 September 2017 (PDT)

Project Gotham Racing 2 uses a .PAK container for storing 3D elements, textures and 3D configuration files.

Multiple types of .PAK names can be seen in PGR2; the names are related to their function and indicate what type of elements are stored:

.PAK for objects ( Cache\Objects & Cache\Cars )
 .pak

.PAK for cars ( Cache\Cars )
 .pak_cth
 .pak_hrd
 .pak_opn ( only for roadster : open mode)
 .pak_sft ( only for roadster : closed mode for rain)

.PAK for maps ( Game\Areas )
 .pak_common
 .pak_day
 .pak_night
 .pak_overcast
 .pak_stream (This uses a non-standard PAK format)

To extract those .PAK files, a tool called [http://aluigi.altervista.org/quickbms.htm quickbms] and a [http://aluigi.altervista.org/bms/project_gotham_2_pak.bms PGR2 bms script] (both made by Luigi Auriemma) can be used to get most of the contents stored in the archive.

When extracted, the content stored in the archive is split into sections: a folder is created for each section, together with a file holding the actual contents, mostly with a '''.dat''' suffix.

Here is an example for objects:

'''.PAK for object : red_cone.pak'''
----
 WMSH \ 00000000.dat
 MAT \ 00000001.dat
 GPUD \ 00000002.dat
 TEXT \ 00000003.nfc
 VB \ 00000004.dat
 END ( nothing, no folder, no files )
+ | |||
+ | ---- | ||
+ | |||

=== PAK File format ===

The PAK format is a chunked format. Each chunk starts with:

* u32 chunk type (can be interpreted as a 4-byte ASCII magic)
* u32 unknown
* u32 size of the chunk (excluding this field?)

The chunk can contain compressed or uncompressed data. Compression is probably indicated by the upper bits of the size field (0xC0000000) being set.
Compression seems to use a zlib stream (header bytes <code>0x78, 0xDA</code>).

Data can be inline, directly following the header. However, in INDX chunks the data is pointed to by an extra u32 field after each header. A minimal reader for both cases is sketched below.
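
A minimal sketch of reading a chunk header and its payload, based on the layout above and on the extraction script further down this page. The meaning of the second u32 ("unknown") and the exact handling of the compressed case are assumptions:

<pre>
import struct
import zlib

def read_chunk_header(f):
    # 4-byte ASCII magic, unknown u32, u32 size
    magic = f.read(4)                                # e.g. b'INDX', b'GPUD', b'END\0'
    unknown, size = struct.unpack('<II', f.read(8))
    return magic, unknown, size

def read_chunk_payload(f, size):
    if size & 0xC0000000:                            # upper bits set -> compressed (assumption)
        zsize = size & 0x3FFFFFFF                    # stored size, including the u32 read below
        usize = struct.unpack('<I', f.read(4))[0]    # extra u32: uncompressed size?
        d = zlib.decompressobj()
        data = d.decompress(f.read(zsize - 4))       # zlib stream, starts with 0x78 0xDA
        return data, usize
    return f.read(size), size
</pre>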
+ | |||
+ | |||
+ | ==== INDX ==== | ||
+ | |||
+ | * u32: number of chunks | ||
+ | * array of chunk headers, each with additional u32 field pointing to the data | ||
+ | * array of chunk data | ||
+ | |||

==== WMSH ====

The ''00000000.dat'' file contains the '''face indices''' and, ''supposedly'', the '''strip''' layout and how the vertices are to be read '''( float, word ? )'''.
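
A sketch of pulling the index list out of a WMSH/MESH payload, mirroring the guesses in the script below (index count at offset 42, followed by two unknown fields, then 16-bit indices); all offsets here are assumptions:

<pre>
import struct

def read_mesh_indices(data):
    # data: decompressed WMSH/MESH chunk payload
    offset = 42                                            # unknown 42-byte prefix (assumption)
    count = struct.unpack('<I', data[offset:offset+4])[0]  # possibly the index count
    offset += 4
    offset += 4                                            # unknown u32
    offset += 2                                            # unknown u16, seems to be zero
    indices = []
    for _ in range(count):
        indices.append(struct.unpack('<H', data[offset:offset+2])[0])
        offset += 2
    return indices
</pre>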
+ | |||
+ | ==== MESH ==== | ||
+ | |||
+ | Same as WMSH? | ||
+ | |||
+ | ==== MAT ==== | ||
+ | |||
+ | Material properties | ||
+ | |||
+ | The ''00000001.dat'' file contain the '''material''' properties : diffuse, specular, ambient etc... of each material applied to the 3D file. | ||
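
A sketch of dumping the start of each MAT record the way the script below does; the record is clearly longer than the four u16 values read here, so the real record size and the mapping to diffuse/specular/ambient are unknown:

<pre>
import struct

def dump_materials(data):
    # data: decompressed MAT chunk payload
    count = struct.unpack('<I', data[0:4])[0]
    offset = 4
    for i in range(count):
        a, b, c, d = struct.unpack('<HHHH', data[offset:offset+8])
        offset += 8                                        # FIXME: rest of the record is skipped
        print("Material %d: 0x%04X 0x%04X 0x%04X 0x%04X" % (i, a, b, c, d))
</pre>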
+ | |||
+ | ==== SKY ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== WRAP ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== INST ==== | ||
+ | |||
+ | Header has some field set to != 0. | ||
+ | Seems to contain subchunks which end with "END" chunk? | ||
+ | |||
+ | ==== TIME ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== RCAM ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== ANIC ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== ROUT ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== RPRM ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== RTMP ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== GPUD ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== GPUS ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== TEX ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== LGHT ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== ACT ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== INFO ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== DRVP ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== AUDI ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== TVC ==== | ||
+ | |||
+ | |||
+ | |||
+ | ==== END ==== | ||
+ | |||
+ | Marks the end of the file. Can also be inside a file and might mark the end of a subchunk there? | ||
+ | |||
+ | ==== GPUD ==== | ||
+ | |||
+ | GPU Data? | ||
+ | |||
+ | The ''00000002.dat'' file contain all the '''textures''', each of them followed by a small part of data ''( texture configuration ? mipmaps ? )''. Finally, after all the textures, the '''vertices and UV position''' are stored in one chunk of data. | ||
+ | |||
+ | ==== TEXT ==== | ||
+ | |||
+ | Texture information | ||
+ | |||
+ | The ''00000003.dat'' file contain all the '''text and name''' related to the texture, it also indicate where each textures are located inside '''GPUD'''. | ||
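
A sketch of parsing the TEXT records and slicing the texture data out of GPUD, following the guesses in the script below. The field meanings (32-byte name, u16 format, u16 packed log2 width/height, u32 offset into GPUD) and the DXT interpretation of the format codes are assumptions:

<pre>
import struct
from PIL import Image

def read_text_chunk(data):
    # data: decompressed TEXT chunk payload
    count = struct.unpack('<I', data[0:4])[0]
    offset = 4
    textures = []
    for _ in range(count):
        name = data[offset:offset+32].decode('ascii').rstrip('\0')
        fmt, dim = struct.unpack('<HH', data[offset+32:offset+36])
        gpud_offset = struct.unpack('<I', data[offset+36:offset+40])[0]
        offset += 40
        textures.append({
            'name': name,
            'format': fmt,
            'width': 1 << ((dim >> 4) & 0xF),    # packed log2 sizes (assumption)
            'height': 1 << (dim & 0xF),
            'offset': gpud_offset,               # byte offset inside the GPUD chunk
        })
    return textures

def decode_texture(gpud, tex):
    # Map format codes the way the script below does:
    # 0x04xx -> BC3/DXT5-style, 0x02xx -> BC1/DXT1-style (assumption)
    size = tex['width'] * tex['height']          # rough upper bound, as used by the script
    data = gpud[tex['offset']:tex['offset'] + size]
    bcn_mode = 3 if tex['format'] & 0x0400 else 1
    return Image.frombytes('RGBA', (tex['width'], tex['height']), data, 'bcn', bcn_mode)
</pre>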
+ | |||
+ | ==== VB ==== | ||
+ | |||
+ | Vertex buffer information | ||
+ | |||
+ | The ''00000004.dat'' file contain the '''start address offset''' of the vertices and UV section in the '''GPUD'''. If the 3D mesh have multiples groups/sections, the VB file will store each address. | ||
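
A sketch of reading the VB offsets and decoding vertices from GPUD with the layout the script below guesses for ''red_cone.pak'' (float x/y/z, 16-bit UVs scaled by 1/8192, 24 bytes per vertex). The vertex layout clearly differs between files, so this is only one possible interpretation:

<pre>
import struct

def read_vb_offsets(data):
    # data: decompressed VB chunk payload, an array of u32 offsets into GPUD
    return [struct.unpack('<I', data[i:i+4])[0] for i in range(0, len(data), 4)]

def read_vertices(gpud, offset, count):
    # 24-byte vertices: 3 floats, 2 s16 UVs, 8 unknown bytes (assumption, red_cone.pak only)
    vertices = []
    for _ in range(count):
        x, y, z = struct.unpack('<fff', gpud[offset:offset+12])
        u, v = struct.unpack('<hh', gpud[offset+12:offset+16])
        offset += 24
        vertices.append((x, y, z, u / 8192, v / 8192))
    return vertices
</pre>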
+ | |||
+ | == Hacky python script to extract PAK textures == | ||
+ | |||
+ | <pre> | ||
#!/usr/bin/env python3

# Project Gotham Racing 2 (.pak)
# Originally a script for QuickBMS http://quickbms.aluigi.org

# comtype unzip_dynamic

import sys
import struct
import zlib
from PIL import Image

# Array entries to export for debug purposes
N = 20

def readLong(f):
    return struct.unpack('<I', f.read(4))[0]

def clog(f, NAME, OFFSET, ZSIZE, SIZE):
    print('Exporting ' + NAME + ' (Compressed) from ' + str(OFFSET)) #FIXME: Export
    f.seek(OFFSET)
    compressed = f.read(ZSIZE)
    #print(compressed)
    data = bytes()
    try:
        decompress = zlib.decompressobj(15)
        for b in compressed:
            data += decompress.decompress(compressed)
        raise
    except:
        print("Decompression failed after " + str(len(data)) + ' / ' + str(SIZE) + ' bytes')
    with open(NAME, 'wb') as e:
        e.write(data)
    #print(data)
    return data

def log(f, NAME, OFFSET, ZSIZE):
    f.seek(OFFSET)
    data = f.read(ZSIZE)
    print('Exporting ' + NAME) #FIXME: Export
    with open(NAME, 'wb') as e:
        e.write(data)
    return data

textures = []
vbs = []
meshs = []
gpud = bytes([])

def readChunks(f):
    while True:
        chunk = readChunk(f)
        if chunk['type'] == b'END\0':
            break

def readChunk(f, indexed = False):
    global gpud
    global textures
    global vbs
    global meshs

    print("At " + str(f.tell()))

    chunkType = f.read(4)
    NAME = chunkType.decode('ascii').rstrip('\0') # FIXME: Remove?!
    print("Found chunk " + NAME)

    DUMMY = readLong(f)
    SIZE = readLong(f)

    chunk = {}
    chunk['type'] = chunkType
    chunk['size'] = SIZE
    if DUMMY:
        chunk['children'] = []

    print("Dummy " + str(DUMMY))
    print("Size " + str(SIZE))

    if indexed == True:
        OFFSET = readLong(f) + 0xC
        print("Offset " + str(OFFSET))
        chunk['offset'] = OFFSET
    else:
        OFFSET = f.tell()

    TMP = f.tell()

    if chunkType == b'INDX':
        print("Reading index?!")
        FILES = readLong(f)

        for i in range(0, FILES):
            readChunk(f, True)

        f.read(SIZE - FILES * 16 - 4)

        return chunk

    elif chunkType == b'WMSH':
        pass
    elif chunkType == b'SKY\0':
        pass
    elif chunkType == b'WRAP':
        pass
    elif chunkType == b'INST':
        pass
    elif chunkType == b'MAT\0':
        pass
    elif chunkType == b'TIME':
        pass
    elif chunkType == b'RCAM':
        pass
    elif chunkType == b'ANIC':
        pass
    elif chunkType == b'ROUT':
        pass
    elif chunkType == b'RPRM':
        pass
    elif chunkType == b'RTMP':
        # Much like INDX?
        pass
    elif chunkType == b'GPUD':
        pass
    elif chunkType == b'GPUS':
        pass
    elif chunkType == b'TEX\0':
        pass
    elif chunkType == b'LGHT':
        pass
    elif chunkType == b'ACT\0':
        pass
    elif chunkType == b'VB\0\0':
        pass
    elif chunkType == b'INFO':
        pass
    elif chunkType == b'DRVP':
        pass
    elif chunkType == b'TVC\0':
        pass
    elif chunkType == b'MESH':
        pass
    elif chunkType == b'COLR':
        pass
    elif chunkType == b'TEXT':
        pass
    elif chunkType == b'AUDI':
        pass
    elif chunkType == b'END\0':
        assert(SIZE == 0)
        return chunk

    else:
        print("Unknown chunk type: " + NAME)
        print(chunkType)
        assert(False)

    f.seek(TMP)

    if indexed == True:
        f.seek(OFFSET)

    # Nested
    if(DUMMY == 1):
        chunk['children'] = readChunks(f)
        f.seek(TMP)
        return chunk

    #FIXME: RTMP has DUMMY = 2!
    if DUMMY == 2:
        print("Not sure how to handle DUMMY = 2")

    if SIZE & 0xC0000000: #FIXME: Figure out actual use of these 2 bits
        ZSIZE = SIZE & 0x3FFFFFFF
        SIZE = readLong(f)
        print("c-zSize is " + str(ZSIZE))
        print("c-Size is " + str(SIZE))
        OFFSET = f.tell()
        data = clog(f, NAME, OFFSET, ZSIZE - 4, SIZE)
    else:
        data = log(f, NAME, OFFSET, SIZE)

    if chunkType == b'GPUD':

        gpud = data

        offset = 87424
        for i in range(0, N):
            if False:
                x = struct.unpack('<f', data[offset:offset+4])[0]
                offset += 4
                y = struct.unpack('<f', data[offset:offset+4])[0]
                offset += 4
                z = struct.unpack('<f', data[offset:offset+4])[0]
                offset += 4
                color = 0
                print("v %f, %f, %f, 0x%08X" % (x, y, z, color))

            # The start of the file is closer to this
            if False:
                x = struct.unpack('<f', data[offset:offset+4])[0]
                offset += 4
                y = struct.unpack('<f', data[offset:offset+4])[0]
                offset += 4
                z = struct.unpack('<f', data[offset:offset+4])[0]
                offset += 4
                color = struct.unpack('<I', data[offset:offset+4])[0]
                offset += 4
                print("v %f, %f, %f, 0x%08X" % (x, y, z, color))
    elif chunkType == b'TEXT':
        offset = 0
        count = struct.unpack('<I', data[offset:offset+4])[0]
        offset += 4
        print(str(count) + " car texture(s):")
        for i in range(count):
            name = data[offset:offset+32]
            offset += 32
            fmt = struct.unpack('<H', data[offset:offset+2])[0]
            offset += 2
            dim = struct.unpack('<H', data[offset:offset+2])[0]
            offset += 2
            dataOffset = struct.unpack('<I', data[offset:offset+4])[0]
            offset += 4
            texture = {}
            texture['name'] = name.decode('ascii').rstrip('\0')
            print("0x%04X" % dim)
            texture['width'] = 1 << ((dim >> 4) & 0xF)
            texture['height'] = 1 << (dim & 0xF)
            texture['format'] = fmt
            texture['offset'] = dataOffset
            textures += [texture]
            print(" Texture name: " + texture['name'])

        #FIXME: There is more data here, at least 4 byte!
        #count = struct.unpack('<I', data[offset:offset+4])[0]
        #print(str(count) + " Unknown(s):")
    elif chunkType == b'TEX\0':
        offset = 0
        count = struct.unpack('<I', data[offset:offset+4])[0]
        offset += 4
        print(str(count) + " world texture(s):")
        for i in range(count):
            name = data[offset:offset+32]
            offset += 32
            a1 = struct.unpack('<H', data[offset:offset+2])[0]
            offset += 2
            a2 = struct.unpack('<H', data[offset:offset+2])[0]
            offset += 2
            b = struct.unpack('<I', data[offset:offset+4])[0]
            offset += 4
            c = struct.unpack('<I', data[offset:offset+4])[0]
            offset += 4
            d = struct.unpack('<I', data[offset:offset+4])[0]
            offset += 4

            h = 1 << (a2 & 0xF)
            w = 1 << ((a2 >> 4) & 0xF)
            print("Size: " + str(w) + "x" + str(h) + " (End: 0x%08X)" % (b + (w * h * 4)//3))

            someH = 1 << (d & 0xF)
            someW = 1 << ((d >> 4) & 0xF)
            print("Max. Size (?): " + str(someW) + "x" + str(someH))

            #FIXME: a?
            gpudOffset = b
            #FIXME: c?
            #FIXME: d?

            texture = {}
            texture['name'] = name.decode('ascii').rstrip('\0')
            texture['offset'] = gpudOffset
            texture['width'] = w
            texture['height'] = h
            texture['format'] = a1
            textures += [texture]

            print(" Texture name: " + texture['name'] + "\n @ 0x%04X 0x%04X 0x%08X 0x%08X 0x%08X " % (a1,a2,b,c,d))

        print("\n")
        #FIXME: There is more data here, at least 4 byte!
        #count = struct.unpack('<I', data[offset:offset+4])[0]
        #print(str(count) + " Unknown(s):")
    elif chunkType == b'VB\0\0':
        vb = {}
        for i in range(0, SIZE, 4):
            vb['offset'] = struct.unpack('<I', data[i:i+4])[0]
            vbs += [vb]
    elif chunkType == b'MESH' or chunkType == b'WMSH':
        # Originally written for MESH, also might work for WMSH

        offset = 42
        count = struct.unpack('<I', data[offset:offset+4])[0]
        offset += 4
        print("Index count might be " + str(count))

        unk = struct.unpack('<I', data[offset:offset+4])[0]
        offset += 4
        print("Unk " + str(unk))

        #FIXME: what is this? always zero?!
        offset += 2

        indices = []
        for i in range(0, count): #FIXME: Number of indices
            j = struct.unpack('<H', data[offset:offset+2])[0]
            indices += [j]
            offset += 2

        if False:
            #FIXME: Very much WIP.. only developing this for WMSH
            # (MESH seems to be slightly different)

            # Align to next 4 byte barrier
            #offset += 3
            #offset &= ~3

            batchCount = struct.unpack('<H', data[offset:offset+2])[0]
            offset += 2
            print("Batches " + str(batchCount))

            if batchCount > 0:

                # Format and len until first primitive restart or something?
                for i in range(0, batchCount):
                    #FIXME: Number of indices per batch in 32 bit?!
                    x = struct.unpack('<I', data[offset:offset+4])[0]
                    print(" A: " + str(x))
                    offset += 4
                for i in range(0, batchCount):
                    #FIXME: Same as before but in 16 bit?!
                    x = struct.unpack('<H', data[offset:offset+2])[0]
                    print(" B: " + str(x))
                    offset += 2

            # We need at least one batch
            batchCount = max(1, batchCount)

            for i in range(0, batchCount):
                x = struct.unpack('<BBBBBB', data[offset:offset+6])
                offset += 6
                print("data: %02X %02X %02X %02X %02X %02X" % x)

            # FIXME: How to get here on our own?!
            assert(offset == (len(data) - batchCount * 2))
            totalSize = 0
            for i in range(0, batchCount):
                batchSize = struct.unpack('<H', data[offset:offset+2])[0]
                offset += 2
                totalSize += batchSize
                print(" Batch size is " + str(batchSize) + " total: " + str(totalSize))

        mesh = {}
        mesh['indices'] = indices
        meshs += [mesh]
    elif chunkType == b'AUDI':
        offset = 0
        count = struct.unpack('<I', data[offset:offset+4])[0]
        offset += 4
        print(str(count) + " audio sample(s) (???):")
        for i in range(count):
            name = data[offset:offset+4]
            offset += 4
            print(" Name: " + str(name))
            #FIXME: Read rest of data
            offset += 12
    elif chunkType == b'MAT\0':
        offset = 0
        count = struct.unpack('<I', data[offset:offset+4])[0]
        offset += 4
        print(str(count) + " material(s):")
        for i in range(count):
            a = struct.unpack('<H', data[offset:offset+2])[0]
            offset += 2
            b = struct.unpack('<H', data[offset:offset+2])[0]
            offset += 2
            c = struct.unpack('<H', data[offset:offset+2])[0]
            offset += 2
            d = struct.unpack('<H', data[offset:offset+2])[0]
            offset += 2
            print(" Material: 0x%04X, 0x%04X, 0x%04X, 0x%04X" % (a, b, c, d))
            #FIXME: Read rest of data

    if indexed == True:
        f.seek(TMP)

    return chunk

with open(sys.argv[1], 'rb') as f:
    while True:
        tmp = f.tell()
        try:
            if f.read(1) == b'':
                raise
        except:
            print("EOF?!")
            break
        f.seek(tmp)
        readChunks(f)

with open('test.obj', 'w') as e:
    for vb in vbs:
        print("VB: " + str(vb['offset']))
        offset = vb['offset']
        for i in range(0, 138):

            print(i) # Helper so we can figure out when the parser hangs

            x = 0
            y = 0
            z = 0
            u = 0
            v = 0

            if False: # /tmp/240Z.pak_hrd
                x = struct.unpack('<h', gpud[offset:offset+2])[0]
                offset += 2
                y = struct.unpack('<h', gpud[offset:offset+2])[0]
                offset += 2
                z = struct.unpack('<h', gpud[offset:offset+2])[0]
                offset += 2

                offset += 4

                #FIXME: Fix scale
                u = struct.unpack('<h', gpud[offset:offset+2])[0] / 512
                offset += 2
                v = struct.unpack('<h', gpud[offset:offset+2])[0] / 512
                offset += 2

            if False: # /tmp/sharkfin.pak

                # Presumably: 17? verts, each 20? bytes
                # 25 indices

                # xyz uv

                x = struct.unpack('<i', gpud[offset:offset+4])[0]
                offset += 4
                y = struct.unpack('<i', gpud[offset:offset+4])[0]
                offset += 4
                z = struct.unpack('<i', gpud[offset:offset+4])[0]
                offset += 4

                #FIXME: Fix scale
                u = struct.unpack('<i', gpud[offset:offset+4])[0]
                offset += 4
                v = struct.unpack('<i', gpud[offset:offset+4])[0]
                offset += 4

            if True: # /tmp/Red Cone.pak

                # Presumably: 137 verts, each 24 bytes
                # 205 indices

                #FIXME: Broken still! Mostly works but contains garbage data

                x = struct.unpack('<f', gpud[offset:offset+4])[0]
                offset += 4
                y = struct.unpack('<f', gpud[offset:offset+4])[0]
                offset += 4
                z = struct.unpack('<f', gpud[offset:offset+4])[0]
                offset += 4

                u = struct.unpack('<h', gpud[offset:offset+2])[0]
                offset += 2
                v = struct.unpack('<h', gpud[offset:offset+2])[0]
                offset += 2

                offset += 4
                offset += 4

                #FIXME: Fix scale
                u /= 8192
                v /= 8192

            print("xyz: %f %f %f; uv: %f %f" % (x,y,z, u,v))
            e.write('v %f %f %f\n' % (x,y,z))
            e.write('vt %f %f\n' % (u,v))

    for mesh in meshs:
        c = -1
        b = -1
        a = -1
        for i in mesh['indices']:
            c = b
            b = a
            a = i + 1
            if c >= 0:
                e.write('f %d/%d %d/%d %d/%d\n' % (c,c,b,b,a,a))

for texture in textures:
    #with open(str(texture['name']) + ".raw",'wb') as e:
    # DXT3 = 1 byte per pixel
    print("Exporting " + texture['name'] + " (%d x %d)" % (texture['width'], texture['height']))
    size = texture['width'] * texture['height']
    if texture['width'] < 4:
        print("Texture too small!") # FIXME!!!
        continue
    data = gpud[texture['offset']:texture['offset']+size]
    if texture['format'] == 0x0414:
        image = Image.frombytes('RGBA', (texture['width'], texture['height']), data, 'bcn', 3)
    elif texture['format'] == 0x040C:
        image = Image.frombytes('RGBA', (texture['width'], texture['height']), data, 'bcn', 3)
    elif texture['format'] == 0x041C:
        image = Image.frombytes('RGBA', (texture['width'], texture['height']), data, 'bcn', 3)
    elif texture['format'] == 0x021C:
        image = Image.frombytes('RGBA', (texture['width'], texture['height']), data, 'bcn', 1)
    elif texture['format'] == 0x020C:
        image = Image.frombytes('RGBA', (texture['width'], texture['height']), data, 'bcn', 1)
    else:
        print("Unknown format! 0x%04X" % texture['format'])
        image = None
    if image:
        image.save('textures/' + texture['name'] + "-0x%04X" % texture['format'] + ".png")
        #e.write()
</pre>
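
As written, the script takes the path of a single .pak file as its only argument (e.g. <code>python3 pgr2_pak.py red_cone.pak</code>, where the script filename is just a placeholder), dumps each chunk payload to a file named after the chunk magic in the current directory, writes the guessed geometry to ''test.obj'', and saves decoded textures as PNG files into an existing ''textures/'' directory.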
+ | |||
+ | --[[User:JayFoxRox|JayFoxRox]] ([[User talk:JayFoxRox|talk]]) 09:38, 23 September 2017 (PDT) |