According to the XML specification, a well-formed XML document must have exactly one root element. In practice, however, you will sometimes encounter "non-standard" XML files that contain several top-level elements. Such files cause problems for PHP's xml_parse, because the parser expects well-formed XML and reports an error when it reaches a second root element.
So how can an XML file with multiple root elements be parsed correctly? This article works through the problem step by step.
Suppose we have an XML file, data.xml, with the following content:
<item>
<name>Item 1</name>
</item>
<item>
<name>Item 2</name>
</item>
In standard XML this is not well-formed, because it contains two top-level <item> elements.
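To see the failure concretely, the following minimal sketch feeds the same two-item content straight to the parser without any wrapper. The inline string is an assumed stand-in for data.xml. The exact error text depends on the XML library PHP was built against, so no sample output is shown.

```php
<?php
// Inline stand-in for data.xml (assumption: same content as the sample above).
$xmlContent = "<item>\n<name>Item 1</name>\n</item>\n<item>\n<name>Item 2</name>\n</item>\n";

$parser = xml_parser_create();

// xml_parse() returns 1 on success and 0 on failure.
$ok = xml_parse($parser, $xmlContent, true);
$errorCode = xml_get_error_code($parser);

if (!$ok) {
    printf(
        "Parse failed at line %d: %s\n",
        xml_get_current_line_number($parser),
        xml_error_string($errorCode)
    );
}
```

The parser accepts the first <item> and then stops at the second one, which is why the reported line number points past the first closing tag.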
A common solution is to wrap the content in a "virtual" root node inside the program before handing it to the parser.
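$xmlContent = file_get_contents('https://m66.net/data.xml');
if ($xmlContent === false) {
    exit("Unable to read data.xml\n");
}
// Wrap the content in a virtual root node so the document has a single root
$xmlContent = "<root>$xmlContent</root>";
$parser = xml_parser_create();
// $values receives the flattened element list; $index maps tag names to positions in $values
if (!xml_parse_into_struct($parser, $xmlContent, $values, $index)) {
    echo "XML error: " . xml_error_string(xml_get_error_code($parser)) . "\n";
}
xml_parser_free($parser);
print_r($values);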
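After this step the parser sees a single well-formed document, and xml_parse_into_struct no longer reports an error. This works as long as the original file does not begin with an XML declaration (<?xml ... ?>), because a declaration is only allowed at the very start of a document.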
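If the input might start with an XML declaration, it can be removed before wrapping. The sketch below is one way to do that. The helper name wrapInVirtualRoot and the inline sample string are hypothetical, introduced only for illustration. It also shows how the $index array can be used to pull the <name> values out of $values.

```php
<?php
// Hypothetical helper: wrap a multi-root fragment in a synthetic root element.
function wrapInVirtualRoot(string $xml, string $rootName = 'root'): string
{
    // Strip a leading XML declaration, which would be illegal
    // once it is no longer at the very start of the document.
    $body = preg_replace('/^\s*<\?xml\s[^>]*\?>/', '', $xml);
    return "<{$rootName}>{$body}</{$rootName}>";
}

// Assumed sample input: the two items from data.xml preceded by a declaration.
$fragment = "<?xml version=\"1.0\"?>\n"
    . "<item><name>Item 1</name></item>\n"
    . "<item><name>Item 2</name></item>";

$parser = xml_parser_create();
$ok = xml_parse_into_struct($parser, wrapInVirtualRoot($fragment), $values, $index);

// Tag names are upper-cased because case folding is enabled by default,
// so the <name> elements are listed under the key 'NAME'.
$names = [];
foreach ($index['NAME'] as $position) {
    $names[] = $values[$position]['value'];
}
print_r($names);
```

The regular expression only handles a declaration at the very start of the input. A byte order mark or a DOCTYPE would need separate treatment.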
If the XML file is very large, or you don't want to read everything into memory at once, you can read it line by line and pass each complete <item> block to xml_parse, using handler callbacks to process elements as they are encountered.
// Handlers receive tag names in upper case, because case folding is enabled by default
function startElement($parser, $name, $attrs) {
    echo "Start: $name\n";
}
function endElement($parser, $name) {
    echo "End: $name\n";
}
function characterData($parser, $data) {
    // Skip the whitespace-only text between tags
    if (trim($data) !== '') {
        echo "Data: " . trim($data) . "\n";
    }
}
$handle = fopen("https://m66.net/data.xml", "r");
if ($handle) {
    // Collect lines until an <item> is complete, then wrap it in a virtual root and parse it
    $chunk = '';
    while (($line = fgets($handle)) !== false) {
        $chunk .= $line;
        if (strpos($line, '</item>') !== false) {
            // A parser cannot be reused after a final xml_parse() call, so create one per segment
            $parser = xml_parser_create();
            xml_set_element_handler($parser, "startElement", "endElement");
            xml_set_character_data_handler($parser, "characterData");
            if (!xml_parse($parser, "<root>$chunk</root>", true)) {
                echo "XML error: " . xml_error_string(xml_get_error_code($parser)) . "\n";
            }
            xml_parser_free($parser);
            $chunk = '';
        }
    }
    fclose($handle);
}
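An alternative that does not depend on where the line breaks fall is to keep a single parser and feed it the virtual root in pieces: first the opening tag, then the raw file chunks with the final flag set to false, and finally the closing tag with the final flag set to true. The following is a minimal sketch of that idea, not a definitive implementation. The function name parseMultiRootStream, the callback parameter, and the in-memory stream standing in for data.xml are all assumptions made for illustration, and the input is assumed to contain no XML declaration.

```php
<?php
// Hypothetical helper: stream a multi-root file through ONE parser by feeding
// a synthetic opening tag, the raw chunks, and then the closing tag.
function parseMultiRootStream($handle, callable $onName, int $chunkSize = 8192): bool
{
    $parser = xml_parser_create();
    $insideName = false;
    $text = '';

    xml_set_element_handler(
        $parser,
        function ($p, $tag, $attrs) use (&$insideName, &$text) {
            if ($tag === 'NAME') { // tag names are upper-cased by default
                $insideName = true;
                $text = '';
            }
        },
        function ($p, $tag) use (&$insideName, &$text, $onName) {
            if ($tag === 'NAME') {
                $insideName = false;
                $onName(trim($text));
            }
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$insideName, &$text) {
            // Character data may arrive in several pieces, so accumulate it.
            if ($insideName) {
                $text .= $data;
            }
        }
    );

    // Open the virtual root without finishing the document.
    $ok = xml_parse($parser, '<root>', false) === 1;
    while ($ok && !feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false) {
            $ok = false;
        } elseif ($chunk !== '') {
            $ok = xml_parse($parser, $chunk, false) === 1;
        }
    }
    // Close the virtual root and mark the input as complete.
    return $ok && xml_parse($parser, '</root>', true) === 1;
}

// Usage with an in-memory stream standing in for data.xml.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "<item>\n<name>Item 1</name>\n</item>\n<item>\n<name>Item 2</name>\n</item>\n");
rewind($handle);

$names = [];
$streamOk = parseMultiRootStream($handle, function (string $name) use (&$names) {
    $names[] = $name;
}, 16); // a deliberately small chunk size, so tags are split across reads
fclose($handle);
print_r($names);
```

Because the parser keeps its state between calls, a tag or text node that is split across two chunks is still handled correctly, and memory use stays bounded by the chunk size rather than by the size of an <item>.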