When working with XML documents in PHP, xml_parse() is a low-level but powerful function. It follows an event-driven parsing model, so you register callback functions that respond to the start of an element, the end of an element, and character data.
For nested XML documents with complex structures, however, building a recursive (nested) data structure makes the data easier to process. This article shows how to parse nested XML data with xml_parse(), using a stack to track the current nesting depth. The following sample document is used throughout:
<catalog>
<book id="1">
<title>PHP Basics</title>
<author>John Doe</author>
</book>
<book id="2">
<title>Advanced PHP</title>
<author>Jane Smith</author>
</book>
</catalog>
We first need to create a parser and set up three callback functions:
startElement() : Called when the parser encounters a start tag;
endElement() : Called when the parser encounters an end tag;
characterData() : Called when the parser encounters text inside an element.
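To see the order in which these callbacks fire, here is a minimal sketch that only records events. The $events log and the inline closures are illustrative assumptions, not part of the main example; tag names appear in uppercase because the parser folds case by default.

```php
<?php
// Illustrative sketch: log each parser event in the order it occurs.
$events = [];

$parser = xml_parser_create();

xml_set_element_handler(
    $parser,
    // Fired for every start tag
    function ($parser, $name, $attrs) use (&$events) {
        $events[] = "start $name";
    },
    // Fired for every end tag
    function ($parser, $name) use (&$events) {
        $events[] = "end $name";
    }
);

xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$events) {
        // Ignore whitespace-only chunks between tags
        if (trim($data) !== '') {
            $events[] = "text $data";
        }
    }
);

xml_parse($parser, '<book id="1"><title>PHP Basics</title></book>', true);

print_r($events);
```

Start and end events arrive strictly nested, which is why a stack is enough to rebuild the tree.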
Here is a complete code example:
<?php
$xml = <<<XML
<catalog>
<book id="1">
<title>PHP Basics</title>
<author>John Doe</author>
</book>
<book id="2">
<title>Advanced PHP</title>
<author>Jane Smith</author>
</book>
</catalog>
XML;
$parser = xml_parser_create();
$dataStack = [];
$currentData = null;
// Start tag: create a node and make it the current element
function startElement($parser, $name, $attrs) {
    global $dataStack, $currentData;
    $element = [
        'tag' => $name,
        'attributes' => $attrs,
        'children' => [],
        'value' => ''
    ];
    // Remember the enclosing element so we can return to it later
    if ($currentData !== null) {
        array_push($dataStack, $currentData);
    }
    $currentData = $element;
}
// End tag: attach the finished node to its parent
function endElement($parser, $name) {
    global $dataStack, $currentData;
    // Text may arrive in several chunks, so trim only once the element is complete
    $currentData['value'] = trim($currentData['value']);
    if (!empty($dataStack)) {
        $parent = array_pop($dataStack);
        $parent['children'][] = $currentData;
        $currentData = $parent;
    }
}
// Text content
function characterData($parser, $data) {
    global $currentData;
    if (isset($currentData['value'])) {
        // Append the raw chunk; trimming here could drop spaces at chunk boundaries
        $currentData['value'] .= $data;
    }
}
xml_set_element_handler($parser, "startElement", "endElement");
xml_set_character_data_handler($parser, "characterData");
if (!xml_parse($parser, $xml, true)) {
    die(sprintf("XML error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)));
}
xml_parser_free($parser);
// Print the final structure
print_r($currentData);
?>
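After parsing, you will get a nested array similar to the following. Leaf entries are abbreviated here, and tag and attribute names appear in uppercase because the parser enables case folding by default: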
Array
(
[tag] => CATALOG
[attributes] => Array()
[children] => Array
(
[0] => Array
(
[tag] => BOOK
[attributes] => Array ( [ID] => 1 )
[children] => Array
(
[0] => Array ( [tag] => TITLE [value] => PHP Basics )
[1] => Array ( [tag] => AUTHOR [value] => John Doe )
)
)
[1] => Array
(
[tag] => BOOK
[attributes] => Array ( [ID] => 2 )
[children] => Array
(
[0] => Array ( [tag] => TITLE [value] => Advanced PHP )
[1] => Array ( [tag] => AUTHOR [value] => Jane Smith )
)
)
)
)
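Because every node has the same shape, the tree can be walked with a short recursive function. The sketch below assumes the array layout shown above; renderTree() is a hypothetical helper name, and the hand-written $tree stands in for the parser output (abbreviated to one book).

```php
<?php
// Hypothetical helper: recursively render a node and its children as indented text.
function renderTree(array $node, int $depth = 0): string
{
    $line = str_repeat('  ', $depth) . $node['tag'];
    foreach ($node['attributes'] as $key => $val) {
        $line .= " [$key=$val]";
    }
    if ($node['value'] !== '') {
        $line .= ': ' . $node['value'];
    }
    $out = $line . "\n";
    // Recurse into each child, one level deeper
    foreach ($node['children'] as $child) {
        $out .= renderTree($child, $depth + 1);
    }
    return $out;
}

// Hand-written sample in the same shape the parser produces
$tree = [
    'tag' => 'CATALOG', 'attributes' => [], 'value' => '',
    'children' => [
        [
            'tag' => 'BOOK', 'attributes' => ['ID' => '1'], 'value' => '',
            'children' => [
                ['tag' => 'TITLE', 'attributes' => [], 'children' => [], 'value' => 'PHP Basics'],
                ['tag' => 'AUTHOR', 'attributes' => [], 'children' => [], 'value' => 'John Doe'],
            ],
        ],
    ],
];

echo renderTree($tree);
```

The same pattern (handle the node, then recurse into children) works for searching, filtering, or transforming the tree.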
You can easily convert the above array structure to JSON for front-end consumption or API output:
echo json_encode($currentData, JSON_PRETTY_PRINT);
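If the empty attributes, children, and value fields make the JSON noisy, one option is to strip them recursively before encoding. This is only a sketch: compactNode() is a hypothetical helper, the sample $node is hand-written in the parser's output shape, and JSON_THROW_ON_ERROR assumes PHP 7.3 or later.

```php
<?php
// Hypothetical helper: keep only the non-empty fields of each node, recursively.
function compactNode(array $node): array
{
    $out = ['tag' => $node['tag']];
    if ($node['attributes'] !== []) {
        $out['attributes'] = $node['attributes'];
    }
    if ($node['value'] !== '') {
        $out['value'] = $node['value'];
    }
    if ($node['children'] !== []) {
        $out['children'] = array_map('compactNode', $node['children']);
    }
    return $out;
}

// Hand-written sample node in the same shape the parser produces
$node = [
    'tag' => 'BOOK', 'attributes' => ['ID' => '1'], 'value' => '',
    'children' => [
        ['tag' => 'TITLE', 'attributes' => [], 'children' => [], 'value' => 'PHP Basics'],
    ],
];

// JSON_THROW_ON_ERROR raises a JsonException instead of silently returning false
$json = json_encode(compactNode($node), JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR);
echo $json;
```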
If the XML document comes from the network, such as https://m66.net/data/books.xml , you can use file_get_contents() to get it: