unpack() returns the wrong data? The pack() format string may be incorrect

M66 2025-05-28

When processing binary data in PHP, pack() and unpack() are a powerful pair of functions: pack() converts values into a binary string, and unpack() parses the original values back out of one. A common problem, however, is that data packed with pack() comes back garbled, or raises an error, when read with unpack(). In most cases the cause is a mistake in the format string.

A common "garbled" scenario

Suppose you have an integer and a string and want to pack them into binary data for storage or transfer, then unpack them again later:

 $data = pack('N/A4', 12345678, 'test');

It is easy to assume that this code correctly packs the integer 12345678 and the string 'test'. But when you run it, unpack() does not return what you expect:

 $unpacked = unpack('Nnum/A4str', $data);
print_r($unpacked);

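Depending on the PHP version, the output is empty and accompanied by warnings (PHP 7), or the script stops with a ValueError before unpack() is even reached (PHP 8).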

The reason? The format string is written incorrectly.
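The failure can be observed directly. The following is a minimal sketch, assuming PHP 7.1 or later; the helper name tryPack() is hypothetical and exists only for illustration. It relies on the fact that an unknown format code makes pack() emit a warning and return false in PHP 7, and throw a ValueError in PHP 8:

```php
<?php
// Hypothetical helper: returns the packed string, or null if pack() rejects the format.
function tryPack(string $format, ...$args): ?string {
    try {
        // @ silences the PHP 7 warning; PHP 8 throws a ValueError instead.
        $result = @pack($format, ...$args);
    } catch (\ValueError $e) {
        return null;
    }
    return $result === false ? null : $result;
}

// '/' is not a pack() format code, so the call is rejected.
var_dump(tryPack('N/A4', 12345678, 'test'));        // NULL

// Without the slash: a 4-byte integer followed by a 4-byte string.
var_dump(strlen(tryPack('NA4', 12345678, 'test'))); // int(8)
```

Checking the return value (or catching the exception) at the pack() call makes the problem visible where it occurs, instead of surfacing later as unusable output from unpack().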

If the format string is wrong, the result is wrong

The format strings passed to pack() and unpack() must describe exactly the same layout: the same format codes, in the same order, with the same lengths.

Let’s carefully analyze the above error example:

 $data = pack('N/A4', 12345678, 'test');

This format string is not valid. N represents a 4-byte unsigned long (big-endian), but / is not a pack() format code; it is only the separator between fields in an unpack() format string. The correct form is:

 $data = pack('Na4', 12345678, 'test');

When unpacking, use the same codes in the same order, giving each field a name and separating the fields with /:

 $unpacked = unpack('Nnum/a4str', $data);

At this point the output is what we expect:

 Array
(
    [num] => 12345678
    [str] => test
)

Note the differences between these similar-looking format codes:

  • a4 means a 4-byte string padded with NUL bytes; unpack() returns all 4 bytes, including any trailing NULs;

  • A4 means a 4-byte string padded with spaces; unpack() strips the trailing padding;

  • n means a 2-byte unsigned short (big-endian);

  • v means a 2-byte unsigned short (little-endian);

  • N means a 4-byte unsigned long (big-endian);

  • V means a 4-byte unsigned long (little-endian);

If you use the wrong letter, the result may be completely wrong.
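The effect of each code is easiest to see in hexadecimal. The following sketch dumps a few packed values with bin2hex(); the sample values are arbitrary and chosen only for illustration:

```php
<?php
// Padding: 'a' pads with NUL bytes, 'A' pads with spaces (0x20).
echo bin2hex(pack('a4', 'ab')), "\n"; // 61620000
echo bin2hex(pack('A4', 'ab')), "\n"; // 61622020

// Byte order: the same value produces different byte sequences.
echo bin2hex(pack('n', 1)), "\n";     // 0001
echo bin2hex(pack('v', 1)), "\n";     // 0100
echo bin2hex(pack('N', 1)), "\n";     // 00000001
echo bin2hex(pack('V', 1)), "\n";     // 01000000

// Unpacking: 'a' keeps trailing NUL bytes, 'A' strips trailing padding.
$nulPadded   = unpack('a4s', "ab\0\0")['s'];
$spacePadded = unpack('A4s', 'ab  ')['s'];
echo strlen($nulPadded), ' ', strlen($spacePadded), "\n"; // 4 2

// Reading big-endian bytes with a little-endian code yields a different number, with no error.
$wrong = unpack('Vvalue', pack('N', 1));
echo $wrong['value'], "\n";           // 16777216
```

The last case is the dangerous one: nothing fails, the value is simply wrong.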

Byte Alignment: Don't let yourself guess the structure

If you are dealing with cross-platform data structures or binary protocols (such as a client communication protocol downloaded from m66.net), pay particular attention to the byte length and byte order of every field.

For example, define a packet structure as follows:

  • 2 byte version number (uint16)

  • 4 byte timestamp (uint32)

  • 8 byte user ID (uint64)

  • 20 byte username (string)

This can be written as:

 $data = pack('nNJa20', 1, time(), 123456789, 'hello');

The unpacking format must use the same codes in the same order:

 $unpacked = unpack('nversion/Ntime/Juid/a20username', $data);

Note that the 8-byte user ID needs J (64-bit unsigned, big-endian), which requires a 64-bit build of PHP; V would write only 4 bytes. Also take care not to confuse the byte order: N and J are big-endian, while V and P are their little-endian counterparts. If the server defines the fields as big-endian but you read them with a little-endian code, the decoded values will be wrong.
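A cheap safeguard is to compare the packed length with the size the structure is supposed to have. The sketch below does this for the layout listed above (uint16, uint32, uint64, 20-byte string, 34 bytes in total). It assumes a 64-bit build of PHP (required for J) and big-endian integer fields; the timestamp is a fixed sample value:

```php
<?php
// Assumptions: 64-bit PHP, all integer fields big-endian.
$format         = 'nNJa20';
$expectedLength = 2 + 4 + 8 + 20; // 34 bytes

$data = pack($format, 1, 1700000000, 123456789, 'hello');

// A length mismatch means the format string does not describe the intended layout.
if (strlen($data) !== $expectedLength) {
    throw new RuntimeException('Packed length does not match the structure definition');
}

$fields = unpack('nversion/Ntime/Juid/a20username', $data);

// 'a20' returns the NUL padding as well, so strip it before using the name.
$fields['username'] = rtrim($fields['username'], "\0");

print_r($fields);
```

Keep in mind that pack() never inserts padding between fields, whereas a C compiler may add alignment padding to an unpacked struct, so the length to compare against is the on-the-wire size, not necessarily sizeof() of the C structure.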

Suggestion: Define the structure format in one place and maintain it there

To avoid format string errors, it is recommended to encapsulate structure definitions into constants or clearly commented functions for easy maintenance and collaboration. For example:

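 // pack() takes bare format codes; unpack() takes the same codes with field names, separated by /.
define('USER_PACK_FORMAT', 'nNJa20');
define('USER_UNPACK_FORMAT', 'nversion/Ntime/Juid/a20username');

function encodeUser($version, $time, $uid, $username) {
    return pack(USER_PACK_FORMAT, $version, $time, $uid, $username);
}

function decodeUser($binary) {
    return unpack(USER_UNPACK_FORMAT, $binary);
}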

This avoids errors caused by retyping format strings in multiple places and makes it easier to keep the documentation in sync.
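One way to take this a step further is to keep a single field table and derive both format strings from it, so that the pack() and unpack() formats cannot drift apart. This is only a sketch: USER_FIELDS, userPackFormat() and userUnpackFormat() are hypothetical names, and the J code assumes a 64-bit build of PHP:

```php
<?php
// Hypothetical field table: field name => pack() format code.
const USER_FIELDS = [
    'version'  => 'n',   // uint16, big-endian
    'time'     => 'N',   // uint32, big-endian
    'uid'      => 'J',   // uint64, big-endian (64-bit PHP only)
    'username' => 'a20', // 20 bytes, NUL-padded
];

// pack() format: the bare codes, concatenated.
function userPackFormat(): string {
    return implode('', USER_FIELDS);
}

// unpack() format: each code followed by its field name, joined with '/'.
function userUnpackFormat(): string {
    $parts = [];
    foreach (USER_FIELDS as $name => $code) {
        $parts[] = $code . $name;
    }
    return implode('/', $parts);
}

$binary = pack(userPackFormat(), 1, 1700000000, 123456789, 'hello');
$user   = unpack(userUnpackFormat(), $binary);
$user['username'] = rtrim($user['username'], "\0"); // drop the NUL padding

print_r($user);
```

With this arrangement, adding or reordering a field is a change in one place, and the table doubles as documentation of the layout.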

Summary

pack() and unpack() are the standard way to process binary data in PHP, but they are very strict about format strings. Garbled output from unpack() is rarely a bug in the functions themselves; most of the time the formats do not match. Keep in mind:

  • Know the meaning and byte size of each format code;

  • Do not mix up big-endian and little-endian codes;

  • Unpack fields in the same order in which they were packed;

  • Give strings a fixed, explicitly stated length;

  • Keep the structure's format definition in one place for easy reuse.

Next time unpack() returns garbled data, don't rush to doubt the data; first check whether your format string is written correctly.