How to Resolve Chinese Character Encoding Issues When Using zip_read() to Read ZIP Files

M66 2025-06-12

When reading ZIP files with PHP's zip_read() function, Chinese file names often come out as garbled characters. The ZIP format's default file-name encoding is CP437, while Chinese file names are usually stored as GBK or UTF-8 bytes, so decoding them with the wrong character set produces garbled output.

This article will explain in detail how to resolve this issue and ensure that Chinese file names read with zip_read() are displayed correctly.

1. The Cause of the Garbled Characters

According to the ZIP format specification, the default encoding for file names is IBM PC code page 437 (CP437). Chinese file names, however, are usually written as GBK or UTF-8 bytes. When PHP reads these names without the appropriate encoding conversion, the result is garbled characters.

PHP's native zip_read() function does not perform this conversion automatically, so it has to be handled manually.
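To make the mismatch concrete, the following minimal sketch encodes a sample name as GBK and then decodes the same bytes as CP437. The sample name and variable names are illustrative assumptions, and the CP437 step relies on the iconv extension, whose CP437 support depends on the underlying iconv library.

```php
<?php
// Illustrative sample name; any Chinese file name behaves the same way.
$original = "中文.txt";

// Bytes as a GBK-based archiver would store them in the ZIP entry.
$gbkBytes = mb_convert_encoding($original, 'GBK', 'UTF-8');
echo bin2hex($gbkBytes), "\n"; // d6d0cec42e747874

// Decoding those bytes as CP437 yields unrelated symbols, not Chinese.
$garbled = iconv('CP437', 'UTF-8', $gbkBytes);
echo "Decoded as CP437: $garbled\n";

// Decoding them as GBK recovers the original name.
$restored = mb_convert_encoding($gbkBytes, 'UTF-8', 'GBK');
echo "Decoded as GBK:   $restored\n";
```

The bytes never change; only the character set used to interpret them does, which is why the fix is a conversion from the correct source encoding.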

2. Solution Overview

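  • Obtain the raw file name (nominally CP437 when the entry is not marked as UTF-8).

  • Convert the file name to UTF-8 from the encoding it was actually written in: CP437 according to the specification, GBK in most archives created on Chinese systems.

  • Output the converted file name.

If the file names in the ZIP file are marked as UTF-8 (general-purpose bit flag 11 of the entry), or are already valid UTF-8, they should be used as they are. Otherwise, we can decode them as GBK.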
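The decision described above can be sketched as a small helper. The function name `zipNameToUtf8` and its `$fallback` parameter are hypothetical, introduced only for illustration; the sketch assumes the mbstring extension is loaded and that any name which is not valid UTF-8 was written in the fallback encoding.

```php
<?php
/**
 * Hypothetical helper: return a ZIP entry name as UTF-8.
 * Assumption: names that are not valid UTF-8 are in $fallback (GBK here).
 */
function zipNameToUtf8(string $rawName, string $fallback = 'GBK'): string
{
    // Valid UTF-8 (including plain ASCII) is returned unchanged.
    if (mb_check_encoding($rawName, 'UTF-8')) {
        return $rawName;
    }
    // Otherwise reinterpret the bytes in the fallback encoding.
    return mb_convert_encoding($rawName, 'UTF-8', $fallback);
}

echo zipNameToUtf8("\xD6\xD0\xCE\xC4.txt"), "\n"; // GBK bytes for 中文.txt
```

A validity check is only a heuristic: a short GBK byte sequence can occasionally also be valid UTF-8, so when the source of an archive is known it is safer to pass its encoding explicitly.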

3. Code Example

<?php
$zipFile = 'path/to/your/zipfile.zip'; // Path to the ZIP file

$zip = zip_open($zipFile);
if (is_resource($zip)) { // zip_open() returns an error code on failure
    while ($zipEntry = zip_read($zip)) {
        // Get the file name (original encoding)
        $name = zip_entry_name($zipEntry);

        // Simple check for valid UTF-8 encoding
        $isUtf8 = mb_check_encoding($name, 'UTF-8');

        if (!$isUtf8) {
            // Not valid UTF-8: assume GBK, the usual case for Chinese names.
            // For genuine CP437 names, use iconv('CP437', 'UTF-8', $name),
            // since CP437 is not among mbstring's supported encodings.
            $name = mb_convert_encoding($name, 'UTF-8', 'GBK');
        }

        echo "File name: $name\n";
    }
    zip_close($zip); // Release the archive handle
} else {
    echo "Unable to open ZIP file.";
}
?>

In the code above, any file name that is not already valid UTF-8 is converted to UTF-8, which prevents the Chinese characters from becoming garbled.

4. Other Considerations

  • If your ZIP file contains file names encoded in GBK, you can directly use mb_convert_encoding($name, 'UTF-8', 'GBK').

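  • On current PHP versions, it is recommended to use the ZipArchive class. The zip_open() and zip_read() functions are deprecated as of PHP 8.0, while ZipArchive is more stable and returns names marked as UTF-8 correctly.

  • Close the archive when you have finished reading it (zip_close() or ZipArchive::close()) so that handles and memory are released.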

5. Using ZipArchive (Recommended)

<?php
$zip = new ZipArchive();
if ($zip->open('path/to/your/zipfile.zip') === TRUE) {
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        echo "File name: $name\n";
    }
    $zip->close(); // Release the archive handle
} else {
    echo "Unable to open ZIP file.";
}
?>

Using this method can help avoid many encoding issues.
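If names from an archive still come out garbled with ZipArchive, one option is to request the stored bytes and convert them yourself, since the underlying library may otherwise guess the encoding and transcode the name on its own. The sketch below is a hedged illustration, not a definitive implementation: the function name `listZipNamesUtf8` and its `$fallback` parameter are hypothetical, and it assumes the installed zip extension defines the `ZipArchive::FL_ENC_RAW` constant and that non-UTF-8 names are GBK.

```php
<?php
/**
 * Hypothetical helper: list the entry names of a ZIP file as UTF-8.
 * Assumptions: ZipArchive::FL_ENC_RAW is available, and names that are
 * not valid UTF-8 were written in $fallback (GBK by default).
 */
function listZipNamesUtf8(string $path, string $fallback = 'GBK'): array
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Unable to open ZIP file: $path");
    }

    $names = [];
    try {
        for ($i = 0; $i < $zip->numFiles; $i++) {
            // Request the stored bytes without any transcoding.
            $raw = $zip->getNameIndex($i, ZipArchive::FL_ENC_RAW);
            if ($raw === false) {
                continue; // Skip entries whose name cannot be read
            }
            $names[] = mb_check_encoding($raw, 'UTF-8')
                ? $raw
                : mb_convert_encoding($raw, 'UTF-8', $fallback);
        }
    } finally {
        $zip->close(); // Always release the archive handle
    }

    return $names;
}
```

Calling `listZipNamesUtf8('path/to/your/zipfile.zip')` returns an array of names. The try/finally block ensures the archive is closed even if a conversion throws.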