Difference between revisions of "HAL DAT (File Format)"

From Custom Mario Kart

Revision as of 14:11, 19 January 2015

Under Construction
This article is not finished. Help improve it by adding accurate information or correcting grammar and spelling.


Introduction

This page describes the HAL Laboratory DAT file format, as found in games such as Super Smash Bros. Melee and Kirby Air Ride.

NOTE: The format is so simple that it is hard to interpret, and much of it has yet to be deciphered.


Beginner information

If you are unfamiliar with how game resource archives are structured, please read XeNTaX's Definitive Guide to Exploring File Formats.
It covers everything you should know before trying to understand how the Melee DAT files work.


File Format

The Melee DAT format can be described as an unordered multi-root hierarchy (tree) of various data structures.

All pointers are relative to the start of the Data Block (file offset 0x20), except for string pointers, which are relative to the String Table.

NOTE: pointers and offsets are the same thing, but the term "Offset" is only used to describe the data at that location.

File Header

Offset Size Format Description
0x00 4 unsigned File Size
0x04 4 pointer Pointer Table Offset
0x08 4 unsigned Pointer Count
0x0C 4 unsigned Root-Node Count
0x10 4 unsigned Reference-Node Count
0x14 4 string Unknown ("001B" when used)
0x18 4 unsigned Unknown (padding?)
0x1C 4 unsigned Unknown (padding?)
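
The header table above can be read with a few lines of plain Python 3. This is only a minimal sketch using the standard `struct` module (not UMC-script); the names `DatHeader`, `read_header` and `DATA_BLOCK` are made up for illustration, and the values are assumed to be big-endian, as the `bu32` calls used later on this page suggest.

```python
import struct
from collections import namedtuple

DATA_BLOCK = 0x20  # the header is 0x20 bytes long; the Data Block starts right after it

# Hypothetical container for the known header fields (the two trailing unknowns are skipped).
DatHeader = namedtuple(
    "DatHeader",
    "file_size pointer_table pointer_count root_count ref_count version",
)

def read_header(data):
    """Unpack the known fields of the 0x20-byte file header (big-endian)."""
    file_size, ptr_tbl, ptr_cnt, root_cnt, ref_cnt, version = struct.unpack_from(
        ">5I4s", data, 0
    )
    # The Pointer Table Offset is a pointer, so it is relative to the Data Block.
    return DatHeader(file_size, ptr_tbl + DATA_BLOCK, ptr_cnt, root_cnt, ref_cnt, version)
```

`pointer_table` is returned as an absolute file offset so that later reads do not have to add 0x20 again.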


Pointer Table

The Pointer Table lists the location of every valid pointer in the Data Block (a pointer value of 0x00000000 is invalid and is not listed).
The entries appear in exactly the same order as the pointers occur in the block.

Offset Size Format Description
Pointer Table Offset Pointer Count * 4 pointer Pointer Offset
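
As a rough sketch of how the table can be used (plain Python 3, hypothetical function names, big-endian values assumed): each entry is the Data Block offset of a pointer field, and following a field means reading the value stored there and adding 0x20 again.

```python
import struct

DATA_BLOCK = 0x20  # pointers are relative to this file offset

def read_pointer_table(data, table_offset, pointer_count):
    """Return the absolute file offset of every pointer field listed in the table.

    table_offset is the absolute offset of the table (header value + 0x20).
    """
    entries = struct.unpack_from(">%dI" % pointer_count, data, table_offset)
    return [entry + DATA_BLOCK for entry in entries]

def read_pointer(data, field_offset):
    """Read the pointer stored at field_offset; None stands for the invalid value 0."""
    (value,) = struct.unpack_from(">I", data, field_offset)
    return value + DATA_BLOCK if value else None
```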


Root-Nodes

The Root-Nodes immediately follow the Pointer Table (8 bytes each) and are where the journey through the data begins: each one holds a data offset and a string offset.

Offset Size Format Description
0x00 4 pointer Data Offset
0x04 4 pointer String Offset (relative to String Table)

How the strings are used is currently unknown, as they do not follow any obvious pattern.
Each string is null-terminated: read until a stop character of '\x00'.

NOTE: The strings here are the only strings in the entire file, and they say very little about the file's contents.
The Root-Nodes are believed to be a dictionary where the string is the (hashed) key to identify it.
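
A minimal sketch of reading the Root-Nodes into a dictionary, in plain Python 3 with hypothetical names. It assumes that the Root-Nodes start right after the Pointer Table and that the String Table follows the Reference-Nodes. Because Reference-Nodes have not been seen yet, treating them as 8-byte entries is an assumption, not a known fact.

```python
import struct

DATA_BLOCK = 0x20

def read_cstring(data, offset):
    """Read bytes up to (not including) the next '\\x00'."""
    end = data.index(b"\x00", offset)
    return data[offset:end].decode("ascii", errors="replace")

def read_root_nodes(data, pointer_table, pointer_count, root_count, ref_count=0):
    """Return {root string: absolute data offset} for every Root-Node."""
    nodes_offset = pointer_table + pointer_count * 4
    # ASSUMPTION: Reference-Nodes use the same 8-byte layout as Root-Nodes.
    string_table = nodes_offset + (root_count + ref_count) * 8
    roots = {}
    for i in range(root_count):
        data_ptr, str_ptr = struct.unpack_from(">2I", data, nodes_offset + i * 8)
        name = read_cstring(data, string_table + str_ptr)  # relative to the String Table
        roots[name] = data_ptr + DATA_BLOCK                 # relative to the Data Block
    return roots
```

Using the string as the dictionary key follows the belief stated above; if two roots ever shared a string, the later one would overwrite the earlier one here.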


Reference-Nodes

- Yet to be seen -

Following these is the String Table, which has no definite size and no padding.


Structures

Identifying the Root Structure --

So far, two methods have been used to decide what the root structure is.


-- Method 1

This method, investigated by Revel8n for SSB Melee only, analyzes the string of the Root-Node and searches it for keywords that identify the structure.
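
The idea could be sketched as follows. Note that the keyword table is purely hypothetical: the actual keywords Revel8n used are not documented on this page, so `matanim` and `joint` are placeholders chosen to match the structure names used in Method 2.

```python
# HYPOTHETICAL keyword table (placeholders, not Revel8n's actual list).
# Order matters: more specific keywords must come before the ones they contain.
KEYWORDS = [
    ("matanim", "matanim"),
    ("joint", "bone"),
]

def identify_by_name(root_name):
    """Return the structure name for the first keyword found in the root string."""
    lowered = root_name.lower()
    for keyword, struct_name in KEYWORDS:
        if keyword in lowered:
            return struct_name
    return None  # no keyword matched; the structure stays unidentified
```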


-- Method 2

This method, designed by Tcll, tests structure paths, sizes, pointer locations, and pointer offsets against predefined (known) structures.
Every structure that matches all of these criteria is tried, and a path is discarded as soon as anything fails.
(This method works with both SSB Melee and Kirby AR)

Code: ( Python 2.7 / UMC-script v3.0a )
full code: https://copy.com/GhS6PAbuBH7MedUI

Step 1: Create a dictionary that holds the information for each known structure (structure size, function reference, and pointers).
The functions referenced here call the functions for the rest of the structures.

   def _pass(offset): pass # for following standards.
   
   def _MatAnim(offset): pass # not known yet
   
   def _Bone_Set(offset,rig = 'joint_obj'):
       ugeSetObject(rig)
       _Bone(offset, rig=rig) # I'm not posting my entire script
   
   def _Struct2(offset):
       Jump(offset, label=' -- Unknown Struct 2')
   
       ptr0 = bu32(    label=' -- Unknown Pointer')+32
       ptr1 = bu32(    label=' -- Struct 2 Pointer')+32
       ptr2 = bu32(    label=' -- Unknown Pointer')+32
       matrix = bu32(  label=' -- 3x4 Matrix Pointer')+32
       ptr4 = bu32(    label=' -- Struct 3 Pointer')+32
       ptr5 = bu32(    label=' -- Unknown Pointer')+32
       ptr6 = bu32(    label=' -- Struct 7 Pointer')+32
       
       if ptr1>32: _Struct2(ptr1) # I'm not posting my entire script
   
   roots = { # { struct-size: [ struct-name, ... ] }
       12: ['matanim'],
       64: ['bone'],
       48: ['unk2'],
   }
   
   structs = { # { struct-name: [ expected-size, struct-function, [ [ pointer-addr, struct-key ], ... ] ] }
   
       'pass':[ -1,_pass,[]], # the logic will continue testing until this is hit.
   
       'matanim': [ 12, _MatAnim,[
           [0,'pass'],
           [8,'pass']
       ]],
   
       'bone': [ 64, _Bone_Set, [
           [0,'pass'],
           [8,'bone'],
           [12,'bone'],
           [16,'object'],
           [56,'pass'] # (matrix) no pointers, no need to test.
       ]],
   
       # code cut here (see the link for the full code)
   
   }

Step 2: Gather and sort the pointers to each structure using the pointer table and root structure pointers.

   jump(pointer_tbl)
   relocations = bu32(['']*pointer_cnt, label=' -- Pointer Reference Address')
   pointers = set([pointer_tbl]) # the pointer table bounds the last structure
   for addr in relocations: # every pointer listed in the pointer table
       jump(addr+32); pointers.add(bu32( label=' -- Pointer')+32)
   for i in range(root_cnt): # include the root pointers
       jump(pointer_tbl+(pointer_cnt*4)+(i*8)); pointers.add(bu32( label=' -- Root Pointer')+32)
   pointers = sorted(pointers)
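
For readers without UMC, the same step might look like this in plain Python 3 working on the raw file bytes. This is a sketch with hypothetical function names, not the author's script. The point of sorting is that the size of a structure can then be estimated as the distance to the next known offset.

```python
import struct

DATA_BLOCK = 0x20

def gather_struct_offsets(data, pointer_table, pointer_count, root_count):
    """Return the sorted, de-duplicated absolute offsets of all pointed-to structures."""
    relocations = struct.unpack_from(">%dI" % pointer_count, data, pointer_table)
    targets = {pointer_table}  # the table itself bounds the last structure
    for field in relocations:  # every pointer listed in the pointer table
        (value,) = struct.unpack_from(">I", data, field + DATA_BLOCK)
        targets.add(value + DATA_BLOCK)
    roots_offset = pointer_table + pointer_count * 4
    for i in range(root_count):  # include the root pointers
        (value,) = struct.unpack_from(">I", data, roots_offset + i * 8)
        targets.add(value + DATA_BLOCK)
    return sorted(targets)

def struct_size(offsets, offset):
    """Estimate a structure's size as the gap to the next known offset."""
    i = offsets.index(offset)
    return offsets[i + 1] - offset
```

The estimate includes any padding between structures, which is why the validation step also has to accept oversized structures whose extra bytes are all zero.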

Step 3: Validate the structure paths (also testing padded structures) and parse the file.

   def test_path(struct_name, given_size, offset, pointers=pointers, relocations=relocations):
       expected_size, func, ptrs = structs[struct_name]
       if expected_size == -1: return True # allow for ignorance
       if given_size != expected_size:
           if given_size > expected_size:
               jump(offset+expected_size, label=' -- validating struct %s oversize'%struct_name)
               if sum(bu8(['']*(given_size-expected_size))):
                   LABEL('\npathfinder: Error: Structure size %i > %i for struct %s'%(given_size,expected_size,struct_name)); return False
           else: LABEL('\npathfinder: Error: Structure size %i < %i for struct %s'%(given_size,expected_size,struct_name)); return False
       
       for pid,(ptrLoc,name) in enumerate(ptrs):
           pointer=offset+ptrLoc
           jump(pointer, label=' -- struct %s'%struct_name)
           location = bu32(label=' -- pointer %i of struct %s'%(pid,struct_name))+32
           if pointer-32 not in relocations:
               if location==32: continue # 0-pointer
               else: LABEL('\npathfinder: Error: 0-pointer expected, but found data'); return False
           
           if location in pointers: size = pointers[pointers.index(location)+1]-location
           else: LABEL('\npathfinder: Error: Could not determine the structure size.'); return False
           
           if test_path(name, size, location, pointers=pointers, relocations=relocations): continue
           else: return False # breaks the loop
           
       return True
       
   for i in range(root_cnt): # TODO: test 2nd-priority root structures of undefined size, such as image data (found in Melee stages)
       LABEL('\npathfinder: Finding path %i\n'%i)
       Jump(pointer_tbl+(pointer_cnt*4)+(i*8), label=' -- Root Structs') # get to the root node
       
       root_offset = bu32(label=' -- Data Offset')+32
       string_offset = bu32(label=' -- String Offset') # could be a dictionary key: { str(key): value }
       root_size = pointers[pointers.index(root_offset)+1]-root_offset
       
       found = False
       _size = root_size
       if _size not in roots: # test padding
           closest = min(roots, key=lambda x:abs(x-root_size))
           jump(root_offset+closest, label=' -- validating root struct oversize')
           if sum(bu8(['']*(root_size-closest))): pass
           else: _size = closest
       if _size not in roots: LABEL('\npathfinder: Could Not Identify Path %i Root\n'%i)
       else:
           for struct_id,struct_name in enumerate(roots[_size]):
               if test_path(struct_name, _size, root_offset): LABEL('\npathfinder: Path %i Found! Parsing Data:\n'%i); found=True; structs[struct_name][1](root_offset)
       if not found: LABEL('\npathfinder: Could Not Find Path %i\n'%i)
       

Tcll: Sorry if this is difficult to understand; I will try to work on this.
If you can't understand something here, please post about it on the discussion page. :)


Structure Layout --

Currently, little is known about the full structure layout, but here's what's known so far.

Mesh layout: ( SSBM Pl*.dat, Ty*.dat )
Root
└ Bone
  ├ Bone
  └ Object
    ├ Material
    │ ├ Colors
    │ └ Texture
    │   ├ Image
    │   │ └ (Image Data)
    │   ├ Palette
    │   │ └ (Palette Data)
    │   ├ Unknown1
    │   └ Texture
    ├ Mesh
    │ ├ Attributes
    │ │ └ (Vector Data)
    │ ├ Influence Matrix Array
    │ │ └ Weight Array
    │ │   └ Weight
    │ └ Display List
    │   └ (Sub-Vector Data (if any) and/or Indexes)
    └ Object

Structure Definitions ----

Found in SSB Melee—————————————————————————————————————————————————————————

---- Bone Structures (Root Structure)

Offset Size Format Description
0x00 4 pointer Unknown Offset (typically 0)
0x04 4 unsigned Unknown Flags
0x08 4 pointer Child Bone Struct Offset
0x0C 4 pointer Next Bone Struct Offset
0x10 4 pointer Object Struct Offset
0x14 4 float Rotation X
0x18 4 float Rotation Y
0x1C 4 float Rotation Z
0x20 4 float Scale X
0x24 4 float Scale Y
0x28 4 float Scale Z
0x2C 4 float Location X
0x30 4 float Location Y
0x34 4 float Location Z
0x38 4 pointer Inverse Bind Matrix Offset (use parent matrix if 0)
0x3C 4 Unknown
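
Reading one Bone structure and walking the child/next links could look like the sketch below (plain Python 3, hypothetical names, big-endian values assumed). The traversal order, a bone first, then its children, then its siblings, is a design choice for the example rather than something the format dictates.

```python
import struct
from collections import namedtuple

DATA_BLOCK = 0x20

Bone = namedtuple("Bone", "flags child next obj rotation scale location inv_bind")

def _ptr(value):
    """Convert a Data Block pointer to an absolute offset; 0 means 'no pointer'."""
    return value + DATA_BLOCK if value else None

def read_bone(data, offset):
    """Unpack the known fields of the 0x40-byte Bone structure."""
    _unk0, flags, child, nxt, obj = struct.unpack_from(">5I", data, offset)
    floats = struct.unpack_from(">9f", data, offset + 0x14)
    (inv_bind,) = struct.unpack_from(">I", data, offset + 0x38)
    return Bone(flags, _ptr(child), _ptr(nxt), _ptr(obj),
                floats[0:3], floats[3:6], floats[6:9], _ptr(inv_bind))

def walk_bones(data, offset, depth=0):
    """Yield (depth, offset, Bone) for a bone, its children, then its siblings."""
    while offset is not None:
        bone = read_bone(data, offset)
        yield depth, offset, bone
        if bone.child is not None:
            yield from walk_bones(data, bone.child, depth + 1)
        offset = bone.next
```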


Inverse Matrix (3x4)

Offset Size Format Description
0x00 4 float Rotation 1 1
0x04 4 float Rotation 1 2
0x08 4 float Rotation 1 3
0x0C 4 float Translation X
0x10 4 float Rotation 2 1
0x14 4 float Rotation 2 2
0x18 4 float Rotation 2 3
0x1C 4 float Translation Y
0x20 4 float Rotation 3 1
0x24 4 float Rotation 3 2
0x28 4 float Rotation 3 3
0x2C 4 float Translation Z
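
As a small illustration of the layout above, the 12 floats can be read as three rows of three rotation values plus one translation value. The sketch below (plain Python 3, hypothetical names) also shows how such a matrix would be applied to a point, assuming the usual affine convention with an implicit last row of [0, 0, 0, 1]; the page itself does not state this convention.

```python
import struct

def read_matrix_3x4(data, offset):
    """Read 12 big-endian floats as three rows of [r1, r2, r3, translation]."""
    values = struct.unpack_from(">12f", data, offset)
    return [list(values[row * 4:row * 4 + 4]) for row in range(3)]

def to_4x4(matrix_3x4):
    """Append the implicit [0, 0, 0, 1] row (ASSUMED convention)."""
    return [list(row) for row in matrix_3x4] + [[0.0, 0.0, 0.0, 1.0]]

def transform_point(matrix_3x4, point):
    """Apply rotation and translation to an (x, y, z) point."""
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in matrix_3x4)
```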

---- Object Structures

Offset Size Format Description
0x00 4 pointer Unknown Offset (typically 0)
0x04 4 pointer Next Object Struct Offset
0x08 4 pointer Material Struct Offset
0x0C 4 pointer Mesh Struct Offset

---- Material Structures

Offset Size Format Description
0x00 4 pointer Unknown Offset (typically 0)
0x04 4 unsigned Unknown Flags
0x08 4 pointer Texture Struct Offset
0x0C 4 pointer Colors Struct Offset
0x10 4 unsigned Unknown
0x14 4 unsigned Unknown

---- Texture Structures

Offset Size Format Description
0x00 4 pointer Unknown Offset (typically 0)
0x04 4 pointer Next Texture Struct Offset
0x08 4 unsigned Unknown
0x0C 4 unsigned Layer Flags
0x10 4 unsigned Unknown
0x14 4 unsigned Unknown
0x18 4 unsigned Unknown
0x1C 4 unsigned Unknown
0x20 4 float? Unknown (maybe: TexClamp Max X)
0x24 4 float? Unknown (maybe: TexClamp Max Y)
0x28 4 float Unknown (maybe: TexClamp Angle)
0x2C 4 float Unknown (maybe: TexClamp Min X)
0x30 4 float Unknown (maybe: TexClamp Min Y)
0x34 4 unsigned Wrap S
0x38 4 unsigned Wrap T
0x3C 1 unsigned Scale X
0x3D 1 unsigned Scale Y
0x3E 2 unsigned Unknown
0x40 4 unsigned Unknown (possibly 2 16-bit values)
0x44 4 float Unknown
0x48 4 unsigned Unknown
0x4C 4 pointer Image Struct Offset
0x50 4 pointer Palette Struct Offset
0x54 4 unsigned Unknown
0x58 4 pointer Unknown(1) Struct Offset

---- Image Structures

Offset Size Format Description
0x00 4 pointer Data Offset (typically 0)
0x04 2 unsigned Width
0x06 2 unsigned Height
0x08 4 unsigned Format (see "Image formats" below)
0x0C 12 padding

Image formats:

Format Stride Description
0x0 0x1 (2 pixels) I4 (Intensity 4-bit)
0x1 0x1 I8 (Intensity 8-bit)
0x2 0x1 IA4 (Intensity 4-bit, Alpha 4-bit)
0x3 0x2 IA8 (Intensity 8-bit, Alpha 8-bit)
0x4 0x2 RGB565 (Red 5-bit, Green 6-bit, Blue 5-bit)
0x5 0x2 RGB5A3 (Flag 1-bit; Flag True: Red 5-bit, Green 5-bit, Blue 5-bit; Flag False: Red 4-bit, Green 4-bit, Blue 4-bit, Alpha 3-bit)
0x6 0x4 RGBA8(888) (Red 8-bit, Green 8-bit, Blue 8-bit, Alpha 8-bit)
0x8 0x1 (2 pixels) CI4 (Color Index 4-bit)
0x9 0x1 CI8 (Color Index 8-bit)
0xA 0x2 CI14x2 (Color Index ?)
0xE 0x8 (16 pixels) CMPR (S3TC Compressed, 2-color palette)

NOTE: the two RGB5A3 flag cases may be backwards.
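
To make the two 16-bit color formats more concrete, here is a small decoding sketch in plain Python 3 (hypothetical function names). For RGB5A3 it assumes the layout commonly described for GameCube textures: the flag is the top bit; when it is set the remaining bits are opaque RGB555, and when it is clear they are a 3-bit alpha followed by 4-bit red, green and blue. This matches the "Flag True" reading in the table, but given the note there, treat it as an assumption to verify against real files.

```python
def expand(value, bits):
    """Scale a channel that is `bits` wide to the 0..255 range."""
    return value * 255 // ((1 << bits) - 1)

def decode_rgb565(pixel):
    """Decode a 16-bit RGB565 value to an (R, G, B, A) tuple."""
    return (expand(pixel >> 11, 5),
            expand((pixel >> 5) & 0x3F, 6),
            expand(pixel & 0x1F, 5),
            255)

def decode_rgb5a3(pixel):
    """Decode a 16-bit RGB5A3 value to an (R, G, B, A) tuple (layout ASSUMED, see above)."""
    if pixel & 0x8000:  # flag set: opaque RGB555
        return (expand((pixel >> 10) & 0x1F, 5),
                expand((pixel >> 5) & 0x1F, 5),
                expand(pixel & 0x1F, 5),
                255)
    # flag clear: 3-bit alpha, then 4-bit red, green and blue
    return (expand((pixel >> 8) & 0xF, 4),
            expand((pixel >> 4) & 0xF, 4),
            expand(pixel & 0xF, 4),
            expand((pixel >> 12) & 0x7, 3))
```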

---- Palette Structures

Offset Size Format Description
0x00 4 pointer Data Offset (typically 0)
0x04 4 unsigned Format (see "Palette formats" below)
0x08 4 unsigned Unknown
0x0C 2 unsigned Color Count
0x0E 2 unsigned Unknown

Palette formats:

Format Stride Description
0x0 0x2 IA8 (Intensity 8-bit, Alpha 8-bit)
0x1 0x2 RGB565 (Red 5-bit, Green 6-bit, Blue 5-bit)
0x2 0x2 RGB5A3 (Flag 1-bit; Flag True: Red 5-bit, Green 5-bit, Blue 5-bit; Flag False: Red 4-bit, Green 4-bit, Blue 4-bit, Alpha 3-bit)

NOTE: the two RGB5A3 flag cases may be backwards.

---- Unknown(1) Structures

Offset Size Format Description
0x00 4 pointer Unknown Offset? (typically 0)
0x04 4 unsigned Unknown (flags?)
0x08 4 unsigned Unknown (flags?)
0x0C 1 unsigned Unknown
0x0D 1 unsigned Unknown
0x0E 1 unsigned Unknown
0x0F 1 unsigned Unknown
0x10 16 padding?

---- Color Structures

Offset Size Format Description
0x00 1*4 unsigned RGBA Diffuse
0x04 1*4 unsigned RGBA Ambient
0x08 1*4 unsigned RGBA Specular
0x0C 4 float Unknown (typically 1.0)
0x10 4 float Unknown (Shininess?)

---- Mesh Structures

Offset Size Format Description
0x00 4 pointer Unknown Offset? (typically 0)
0x04 4 pointer Next Mesh Struct Offset
0x08 4 pointer Mesh-Attributes Struct Array Offset (parse until a CP_ID of 0xFF)
0x0C 2 unsigned Unknown Flags
0x0E 2 unsigned Display-List size *32
0x10 4 pointer Display-List Data Offset
0x14 4 pointer Influence Matrix Array Offset (parse the array until 0x00000000)
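
The Mesh structure and its zero-terminated Influence Matrix Array might be read as in the sketch below (plain Python 3, hypothetical names). The display-list size field is multiplied by 32 to get a byte count, following the "*32" in the table.

```python
import struct
from collections import namedtuple

DATA_BLOCK = 0x20

Mesh = namedtuple("Mesh", "next attributes flags display_list_size display_list influences")

def _ptr(value):
    return value + DATA_BLOCK if value else None

def read_mesh(data, offset):
    """Unpack the 0x18-byte Mesh structure; pointers become absolute offsets."""
    _unk0, nxt, attrs, flags, dl_blocks, dl, infl = struct.unpack_from(">3I2H2I", data, offset)
    return Mesh(_ptr(nxt), _ptr(attrs), flags, dl_blocks * 32, _ptr(dl), _ptr(infl))

def read_pointer_array(data, offset):
    """Collect pointers from an array that ends with a 0x00000000 entry."""
    pointers = []
    while True:
        (value,) = struct.unpack_from(">I", data, offset)
        if value == 0:
            return pointers
        pointers.append(value + DATA_BLOCK)
        offset += 4
```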

---- Mesh-Attribute Structures

Data: (HexEdit-styled)
    00 00 00 09  00 00 00 03  00 00 00 01  00 00 00 03
    0B 00 00 06  00 00 00 00

Offset Size Format Description
0x00 4 unsigned CP_ID (see "CP_ID values" below)
0x04 4 unsigned CP_Type (see "CP_Type values" below)
0x08 4 unsigned Component Count (see "Component Count values" below)
0x0C 4 unsigned Data Type/Format (see "Data Type/Format values" below)
0x10 1 unsigned Divisor (Floating Point Exponent for int data types)
0x11 1 unsigned Unknown
0x12 2 unsigned Stride
0x14 4 pointer Data Offset

CP_ID values:

Enum Description
0x0 Vert/Normal Influence Matrix ID
0x1 UV[0] Influence Matrix ID
0x2 UV[1] Influence Matrix ID
0x3 UV[2] Influence Matrix ID
0x4 UV[3] Influence Matrix ID
0x5 UV[4] Influence Matrix ID
0x6 UV[5] Influence Matrix ID
0x7 UV[6] Influence Matrix ID
0x8 UV[7] Influence Matrix ID
0x9 Vert ID/Value
0xA Normal ID/Value
0xB Color[0] ID/Value
0xC Color[1] ID/Value
0xD UV[0] ID/Value
0xE UV[1] ID/Value
0xF UV[2] ID/Value
0x10 UV[3] ID/Value
0x11 UV[4] ID/Value
0x12 UV[5] ID/Value
0x13 UV[6] ID/Value
0x14 UV[7] ID/Value
0x15 Vert Influence Matrix Array Offset
0x16 Normal Influence Matrix Array Offset
0x17 UV Influence Matrix Array Offset
0x18 Light Influence Matrix Array Offset
0x19 NBT ID/Value
0xFF NULL

CP_Type values:

Enum Description
0x0 None
0x1 Direct (value instead of index)
0x2 Index 8-bit
0x3 Index 16-bit

Component Count values:

Positions:

Enum Description
0x0 XY Position
0x1 XYZ Position

Normals:

Enum Description
0x0 Normal
0x1 Normal Bi-normal Tangent
0x2 Normal or Bi-normal or Tangent

Colors:

Enum Description
0x0 RGB Color
0x1 RGBA Color

UVs:

Enum Description
0x0 S Coord
0x1 ST Coord

Data Type/Format values:

Enum Description
0x0 unsigned 8-bit
0x1 signed 8-bit
0x2 unsigned 16-bit
0x3 signed 16-bit
0x4 float

Colors:

Enum Description
0x0 RGB565 (Red 5-bit, Green 6-bit, Blue 5-bit)
0x1 RGB8(88) (Red 8-bit, Green 8-bit, Blue 8-bit)
0x2 RGBX8(888) (Red 8-bit, Green 8-bit, Blue 8-bit, Discarded 8-bit)
0x3 RGBA4(444) (Red 4-bit, Green 4-bit, Blue 4-bit, Alpha 4-bit)
0x4 RGBA6(666) (Red 6-bit, Green 6-bit, Blue 6-bit, Alpha 6-bit)
0x5 RGBA8(888) (Red 8-bit, Green 8-bit, Blue 8-bit, Alpha 8-bit)
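
The sample bytes at the top of this section can be decoded with the sketch below (plain Python 3, hypothetical names). The array is read in 0x18-byte steps until a CP_ID of 0xFF. The `scale_component` helper reflects one reading of the exponent field, namely that an integer component is divided by 2 to the power of that value; this interpretation is an assumption.

```python
import struct
from collections import namedtuple

DATA_BLOCK = 0x20
ATTR_SIZE = 0x18  # size of one Mesh-Attribute structure

Attribute = namedtuple("Attribute", "cp_id cp_type components data_type divisor stride data")

def read_attributes(data, offset):
    """Read Mesh-Attribute structures until the terminating CP_ID of 0xFF."""
    attrs = []
    while True:
        cp_id, cp_type, comps, dtype, divisor, _unk, stride, ptr = struct.unpack_from(
            ">4I2BHI", data, offset
        )
        if cp_id == 0xFF:
            return attrs
        attrs.append(Attribute(cp_id, cp_type, comps, dtype, divisor, stride, ptr + DATA_BLOCK))
        offset += ATTR_SIZE

def scale_component(raw, divisor):
    """ASSUMED meaning of the exponent field: value = raw / 2**divisor."""
    return raw / float(1 << divisor)
```

For the sample data this gives CP_ID 0x9 (Vert), CP_Type 0x3 (Index 16-bit), component count 0x1 (XYZ), data type 0x3 (signed 16-bit), an exponent of 0x0B and a stride of 6, which is consistent with three 16-bit values per vertex.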

---- Weight Structure Arrays

Data: (HexEdit-styled)
    00 00 F3 70  3F 80 00 00  00 00 00 00  00 00 00 00

Offset Size Format Description
0x00 4 pointer Bone Struct Offset
0x04 4 float Weight

The Bone Struct Offset is used to dereference the already existing bone struct to get its inverse-bind matrix.
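
A sketch of reading one weight array in plain Python 3 (hypothetical function name). It assumes the array ends at the first entry whose Bone Struct Offset is 0, which is what the zero bytes at the end of the sample data suggest but is not confirmed.

```python
import struct

DATA_BLOCK = 0x20

def read_weights(data, offset):
    """Return [(absolute bone offset, weight), ...] for one weight array."""
    weights = []
    while True:
        bone_ptr, weight = struct.unpack_from(">If", data, offset)
        if bone_ptr == 0:  # ASSUMED terminator: an entry with a null bone pointer
            return weights
        weights.append((bone_ptr + DATA_BLOCK, weight))
        offset += 8
```

Applied to the sample bytes above, this yields a single influence: the bone at Data Block offset 0xF370 with a weight of 1.0.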

-- more structures coming soon --


———————————————————————————————————————————————————————————————————

Found in Kirby Air-Ride --————————————————————————————————————————————————————————

---- Unknown(2) Structures (Root Structure)

Offset Size Format Description
0x00 4 pointer Unknown Offset
0x04 4 pointer Unknown(3) Struct Offset
0x08 4 pointer Unknown(4) Struct Offset
0x0C 4 pointer Unknown Matrix 3x4 Offset
0x10 4 pointer Unknown(5) Struct Offset
0x14 4 pointer Unknown Offset
0x18 4 pointer Unknown(6) Struct Offset
0x1C 20 padding?

---- Unknown(3) Structures

Offset Size Format Description
0x00 4 pointer Unknown Single Bone Struct Offset
0x04 4 unsigned Unknown
0x08 4 unsigned Unknown
0x0C 4 unsigned Unknown
0x10 4 pointer Unknown(7) Struct Offset (attributes?)
0x14 4 pointer Unknown(7) Struct Offset (attributes?)
0x18 4 pointer Unknown(7) Struct Offset (attributes?)
0x1C 4 pointer Unknown(7) Struct Offset (attributes?)
0x20 4 pointer Unknown(7) Struct Offset (attributes?)
0x24 4 pointer Unknown(7) Struct Offset (attributes?)
0x28 4 pointer Bone Struct Offset

-- more structures coming soon --

———————————————————————————————————————————————————————————————————

Resources

HexEdit Pro 4.0 - an advanced hex editor with template support.
Screenshot: http://lh4.ggpht.com/-wdKQc09lsdY/Ulc0kkamcpI/AAAAAAAAFYk/kOhcJzzvPKY/s1400/Screenshot%25202013-10-10%252019.11.44.png
Download (4Shared): http://www.4shared.com/file/VRaD3TtAce/HexEditPro4_0.html
Adware warning: Don't use the 4Priority Downloader! (Create an account instead.)


HAL_DAT template for HexEdit Pro 4.0
https://copy.com/AjtjUrwVlrKPwveF
This template is actively updated, but the current version still uses an outdated file entry method (most files won't work).
An updated template was in the works using the pathfinder method, but the pointer count must be under 384 (0x180) or HexEdit will freeze.

References

(old) http://smashboards.com/threads/melee-dat-format.292603/