Example: parsing large XML files with xml_parse and fopen

M66 2025-05-13

When an XML file is very large (hundreds of MB or even several GB), loading the whole file into memory at once is impractical. Instead, PHP's xml_parse function can be combined with fopen and chunked reads so that the file is parsed while it is being read, which keeps memory usage low.

Here is a complete example of parsing a large XML file with xml_parse and fopen.

 <?php
// Set the XML file path (the m66.net domain is used here as an example)
$xmlFile = 'https://m66.net/data/large-file.xml';

// Create the XML parser
$parser = xml_parser_create();

// Define the start tag handler
function startElement($parser, $name, $attrs) {
    echo "Start Element: $name\n";
    if (!empty($attrs)) {
        foreach ($attrs as $key => $value) {
            echo " - attribute: $key = $value\n";
        }
    }
}

// Define the end tag handler
function endElement($parser, $name) {
    echo "End Element: $name\n";
}

// Define the character data handler
function characterData($parser, $data) {
    $data = trim($data);
    if ($data !== '') {
        echo "Character data: $data\n";
    }
}

// Register the handlers
xml_set_element_handler($parser, "startElement", "endElement");
xml_set_character_data_handler($parser, "characterData");

// Open the XML file as a stream
if (!($fp = fopen($xmlFile, "r"))) {
    die("Unable to open XML file: $xmlFile");
}

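// Read the file in 4096-byte chunks and feed each chunk to the parser
while (!feof($fp)) {
    $data = fread($fp, 4096);
    if ($data === false) {
        die("Failed to read from XML file: $xmlFile");
    }
    // The third argument tells the parser whether this is the final chunk
    if (!xml_parse($parser, $data, feof($fp))) {
        die(sprintf(
            "XML error: %s at line %d",
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
    }
}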

fclose($fp);
xml_parser_free($parser);
?>

Explanation:

  1. fopen + fread : fopen opens a local or remote file (remote URLs require allow_url_fopen to be enabled). fread reads up to 4096 bytes at a time, so memory usage stays small regardless of the file size.

  2. xml_parser_create : Creates an XML parser (a resource before PHP 8.0, an XMLParser object since).

  3. xml_set_element_handler : Registers the handlers for start and end tags.

  4. xml_set_character_data_handler : Registers the handler for character data.

  5. xml_parse : Parses one chunk of XML data. It can be called repeatedly, with the third argument set to true for the final chunk, which makes it suitable for streaming.
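To illustrate the point that xml_parse accepts data in several calls, here is a minimal sketch that feeds the same document to the parser once as a whole and once split at arbitrary positions, including in the middle of a tag. The helper name parseInChunks, the sample document, and the "start:"/"end:" event labels are assumptions made for this illustration. Case folding is switched off so that tag names are reported as written; by default the parser upper-cases them.

```php
<?php
// Hypothetical helper: feed an array of string chunks to one parser
// and record the element events it reports.
function parseInChunks(array $chunks): array
{
    $events = [];
    $parser = xml_parser_create();
    // Report tag names as written instead of upper-casing them (the default).
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        function ($parser, string $name, array $attrs) use (&$events) {
            $events[] = "start:$name";
        },
        function ($parser, string $name) use (&$events) {
            $events[] = "end:$name";
        }
    );

    $chunks = array_values($chunks);
    $last = count($chunks) - 1;
    foreach ($chunks as $i => $chunk) {
        // Only the last call passes true as the "final" flag.
        if (!xml_parse($parser, $chunk, $i === $last)) {
            throw new RuntimeException(sprintf(
                'XML error: %s at line %d',
                xml_error_string(xml_get_error_code($parser)),
                xml_get_current_line_number($parser)
            ));
        }
    }
    return $events;
}

// The same document, once whole and once cut in the middle of tags.
$whole = parseInChunks(['<root><item id="1"/></root>']);
$split = parseInChunks(['<roo', 't><item i', 'd="1"/></ro', 'ot>']);

var_dump($whole === $split);
```

Both calls should report the same sequence of events, since the parser keeps incomplete input buffered between calls. This is why the read size passed to fread does not need to line up with tag boundaries.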

Use cases:

  • Large XML files used for data exchange

  • Structured XML data captured by web crawlers

  • Batch analysis of XML-based logs or configuration data
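In use cases like these, the handlers usually need to collect values rather than echo them. The sketch below gathers the text of every title element from a stream. The function name collectTitles, the element name "title", and the sample feed are assumptions for illustration only. The character data handler can fire several times for a single text node (for example around an entity or at a chunk boundary), so the text is appended to a buffer and only stored when the end tag arrives.

```php
<?php
// Hypothetical sketch: collect the text content of every <title> element
// from an already opened stream, reading it in small chunks.
function collectTitles($fp, int $chunkSize = 4096): array
{
    $titles = [];
    $buffer = null; // null outside <title>, a string while inside one

    $parser = xml_parser_create();
    // Report tag names as written instead of upper-casing them (the default).
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        function ($parser, string $name, array $attrs) use (&$buffer) {
            if ($name === 'title') {
                $buffer = '';
            }
        },
        function ($parser, string $name) use (&$buffer, &$titles) {
            if ($name === 'title' && $buffer !== null) {
                $titles[] = trim($buffer);
                $buffer = null;
            }
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($parser, string $data) use (&$buffer) {
            // Append: one text node may arrive in several pieces.
            if ($buffer !== null) {
                $buffer .= $data;
            }
        }
    );

    // Feed chunks until the stream is exhausted, then signal the end.
    while (($chunk = fread($fp, $chunkSize)) !== false && $chunk !== '') {
        if (!xml_parse($parser, $chunk, false)) {
            throw new RuntimeException(
                'XML error: ' . xml_error_string(xml_get_error_code($parser))
            );
        }
    }
    if (!xml_parse($parser, '', true)) {
        throw new RuntimeException(
            'XML error: ' . xml_error_string(xml_get_error_code($parser))
        );
    }
    return $titles;
}

// Usage with an in-memory stream standing in for a large file.
$fp = fopen('php://memory', 'w+b');
fwrite($fp, '<feed><entry><title>First &amp; second</title></entry>'
    . '<entry><title>Third</title></entry></feed>');
rewind($fp);
$titles = collectTitles($fp, 16); // small chunk size to force splits
fclose($fp);

print_r($titles);
```

Only the current title is held in memory at any time, so the approach should scale to files far larger than the available memory as long as individual elements stay small.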