Difference between revisions of "YAZ0 (File Format)"

From Custom Mario Kart
Revision as of 21:30, 6 April 2011

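Yaz0 is a compression method that replaces repeated data with back references to already decompressed output (an LZ77-style scheme that covers run length encoding, RLE, as a special case). In Mario Kart Wii, most SZS files are Yaz0 compressed U8 files.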

Data structure

Header

The header of a Yaz0 file is always 16 bytes long. All numeric values are stored in big endian byte order.

Offset  Type     Description
0x00    char[4]  always "Yaz0"
0x04    u32      size of the uncompressed data
0x08    char[8]  always zero (padding)
GNU C example
typedef struct yaz0_header_t
{
    char	magic[4];		// always "Yaz0"
    be32_t	uncompressed_size;	// total size of uncompressed data
    char	padding[8];		// always 0?
}
__attribute__ ((packed)) yaz0_header_t;
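
The header layout can also be read without a packed struct. The following is a minimal sketch, assuming the file has already been loaded into a byte buffer; the names read_be32 and yaz0_parse_header are hypothetical and not part of any existing tool. The padding bytes are not checked, since the table only states that they are expected to be zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define YAZ0_HEADER_SIZE 16

/* Read a big endian 32-bit value from 4 consecutive bytes. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}

/* Hypothetical helper: validate the magic and extract the uncompressed size.
 * Returns 0 on success, -1 if the buffer is too short or the magic is wrong. */
static int yaz0_parse_header(const uint8_t *buf, size_t len,
                             uint32_t *uncompressed_size)
{
    if (len < YAZ0_HEADER_SIZE || memcmp(buf, "Yaz0", 4) != 0)
        return -1;
    *uncompressed_size = read_be32(buf + 4);  /* offset 0x04 */
    return 0;
}
```

Reading the size byte by byte keeps the code independent of the host byte order, which is why no byte swapping macro is needed here.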

Data Groups

The complete compressed data is organized in data groups. Each data group consists of 1 group header byte and 8 data sequences:

Count  Size (bytes)  Description
1      1             the group header byte
8      1-3 each      8 data sequences

Each bit of the group header corresponds to one sequence:

  • The MSB (most significant bit, 0x80) corresponds to sequence 1
  • The LSB (least significant bit, 0x01) corresponds to sequence 8

A set bit (=1) in the group header means that the sequence is exactly 1 byte long. This byte must be copied to the output stream 1:1. A cleared bit (=0) means that the sequence is 2 or 3 bytes long and is interpreted as a back reference to already decompressed data that must be copied.
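
The bit order described above can be illustrated with a small sketch. The function name yaz0_group_flags is hypothetical; it only expands one group header byte into eight flags, testing the MSB first and shifting the next bit into that position.

```c
#include <stdint.h>

/* Hypothetical helper: expand a group header byte into 8 flags.
 * is_literal[i] is 1 if sequence i+1 is a single byte copied 1:1,
 * and 0 if it is a 2 or 3 byte back reference. */
static void yaz0_group_flags(uint8_t header, int is_literal[8])
{
    for (int i = 0; i < 8; i++) {
        is_literal[i] = (header & 0x80) != 0;  /* test the current MSB */
        header <<= 1;                          /* move the next bit into the MSB */
    }
}
```

For example, a header byte of 0xA0 (binary 1010 0000) marks sequences 1 and 3 as single bytes and the remaining six sequences as back references.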

Size  Byte 1  Byte 2  Byte 3  Comment
2     NR      RR      -       N=1..f, SIZE=N+2
3     0R      RR      NN      N=00..ff, SIZE=N+0x12

  • Each letter stands for one hexadecimal digit (4 bits) of the corresponding byte.
  • RRR is a value between 0x000 and 0xfff. Go back RRR+1 bytes in the output stream to find the start of the data to be copied.
  • SIZE is the number of bytes to be copied.
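
The two back reference layouts can be decoded as sketched below. The struct yaz0_backref_t and the function yaz0_decode_backref are hypothetical names introduced for illustration; the caller is assumed to guarantee that 2 bytes are readable, and 3 bytes if the upper nibble of the first byte is zero.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t distance;  /* RRR + 1: how far to go back in the output stream */
    size_t size;      /* SIZE: number of bytes to copy */
    size_t consumed;  /* number of source bytes used (2 or 3) */
} yaz0_backref_t;

/* Hypothetical helper: decode one back reference starting at p. */
static yaz0_backref_t yaz0_decode_backref(const uint8_t *p)
{
    yaz0_backref_t ref;
    const unsigned n = p[0] >> 4;                      /* upper nibble: N or 0 */

    ref.distance = (size_t)(((p[0] & 0x0f) << 8) | p[1]) + 1;
    if (n == 0) {
        ref.size = (size_t)p[2] + 0x12;                /* 3-byte form: NN + 0x12 */
        ref.consumed = 3;
    } else {
        ref.size = n + 2;                              /* 2-byte form: N + 2 */
        ref.consumed = 2;
    }
    return ref;
}
```

With these formulas the 2-byte form covers sizes 3 to 0x11 and the 3-byte form covers sizes 0x12 to 0x111, so the two ranges do not overlap.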

Data groups and their data sequences are decoded until the end of the destination data is reached.

Examples

Decompression

GNU C example
const u8 * src      = // pointer to start of source
const u8 * src_end  = // pointer to end of source (last byte +1)
u8 * dest           = // pointer to start of destination
u8 * dest_end       = // pointer to end of destination (last byte +1)

u8  code            = 0; // code ...
int code_len        = 0; // ... and code_len used to manage groups

while ( src < src_end && dest < dest_end ) 
{ 
    if (!code_len--)
    { 
        code = *src++;
        code_len = 7; 
    } 

    if ( code & 0x80 )
    {
        // copy 1 byte direct
        *dest++ = *src++;
    }
    else 
    { 
        // rle part 

        const u8  b1 = *src++;
        const u8  b2 = *src++;
        const u8 * copy_src = dest - ( ( ( b1 & 0x0f ) << 8 ) | b2 ) - 1;
        ASSERT( copy_src >= szs->data );

        int n = b1 >> 4; 
        if (!n) 
            n = *src++ + 0x12; 
        else
            n += 2;
        ASSERT( n >= 3 && n <= 0x111 );
        ASSERT( copy_src + n <= szs->data + szs->size );

        if ( dest + n > dest_end )
            return ERROR("Corrupted data!\n");

        while ( n-- > 0 )
            *dest++ = *copy_src++;
    }
    code <<= 1;
}
ASSERT( src <= src_end );
ASSERT( dest <= dest_end );

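This code example is taken from Wiimms SZS Tools: SVN repository, lib-szs.c line 159 (http://opensvn.wiimm.de/viewvc/wii/trunk/wiimms-szs-tools/src/lib-szs.c?annotate=2437#l159). The identifiers szs, ASSERT and ERROR belong to that source file and are not defined in the excerpt.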
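
Because the excerpt above depends on its surrounding source file, a self-contained variant may be easier to experiment with. The following is a sketch under stated assumptions, not the Wiimms SZS Tools implementation: the function name yaz0_decode is hypothetical, src is assumed to point to the data that follows the 16-byte header, and dest_len is assumed to be the uncompressed size taken from that header. Unlike the excerpt, it checks every source read against the end of the input.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode the data groups in src[0..src_len) into
 * dest[0..dest_len). Returns 0 on success, -1 on truncated or corrupted input. */
static int yaz0_decode(const uint8_t *src, size_t src_len,
                       uint8_t *dest, size_t dest_len)
{
    size_t si = 0, di = 0;   /* read and write positions */
    uint8_t code = 0;        /* current group header, shifted left per sequence */
    int bits_left = 0;       /* sequences remaining in the current group */

    while (di < dest_len) {
        if (bits_left == 0) {                 /* start of a new data group */
            if (si >= src_len) return -1;
            code = src[si++];
            bits_left = 8;
        }

        if (code & 0x80) {                    /* set bit: copy 1 byte directly */
            if (si >= src_len) return -1;
            dest[di++] = src[si++];
        } else {                              /* cleared bit: back reference */
            if (src_len - si < 2) return -1;
            const uint8_t b1 = src[si++];
            const uint8_t b2 = src[si++];
            const size_t distance = (size_t)(((b1 & 0x0f) << 8) | b2) + 1;
            size_t n = b1 >> 4;
            if (n == 0) {                     /* 3-byte form */
                if (si >= src_len) return -1;
                n = (size_t)src[si++] + 0x12;
            } else {                          /* 2-byte form */
                n += 2;
            }
            if (distance > di || n > dest_len - di) return -1;

            /* Copy byte by byte: the copied range may overlap the bytes
             * being written, which is how short runs get repeated. */
            for (; n > 0; n--, di++)
                dest[di] = dest[di - distance];
        }
        code <<= 1;
        bits_left--;
    }
    return 0;
}
```

As a small worked case, the 5-byte stream 0xA0 'a' 0x20 0x00 'b' with dest_len 6 decodes to "aaaaab": a literal 'a', a back reference with distance 1 and size 4, and a literal 'b'.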