Many developers run into the same problem when parsing XML with PHP's xml_parse() function: the "XML parsing failed" error. It is often confusing, especially when the XML looks well-formed at first glance. This article analyzes the common causes of the problem and provides several effective solutions.
xml_parse() is part of PHP's XML parser extension, which exposes an Expat-style, event-driven API (current PHP builds use libxml2 through an Expat compatibility layer by default, and can also be built against Expat itself). The input must be well-formed XML; otherwise xml_parse() reports failure through its return value and records an error code on the parser.
The syntax is as follows:
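xml_parse(XMLParser $parser, string $data, bool $is_final = false): int
It returns 1 on success and 0 on failure.
Parameter description:
$parser: The parser created by xml_parser_create() (an XMLParser object since PHP 8.0, a resource in earlier versions).
$data: The XML data to parse. A document can be passed in several chunks by calling xml_parse() repeatedly.
$is_final: Whether $data is the last chunk of the document.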
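As an illustration of the event-driven model described above, the following minimal sketch registers element handlers and collects the names of the opening tags. The helper name collectElementNames() is hypothetical, and the parser is simply left to be released when it goes out of scope.

```php
<?php
// Minimal sketch: collect element names as the parser reports them.
// collectElementNames() is a hypothetical helper, not part of PHP.
function collectElementNames(string $xml): array
{
    $names = [];
    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$names) {
            $names[] = $name; // called once for every opening tag
        },
        function ($parser, $name) {
            // called once for every closing tag; nothing to do here
        }
    );
    // The whole document is passed in one call, so $is_final is true.
    $ok = xml_parse($parser, $xml, true);
    // An empty array signals that the document was not well-formed.
    return $ok === 1 ? $names : [];
}

print_r(collectElementNames('<note><to>Tove</to><from>Jani</from></note>'));
```

Returning an empty array on failure keeps the sketch short; real code would more likely surface the error details instead.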
The most common cause is that the XML itself is not well-formed. For example:
A tag is not closed
Illegal characters are used (such as a bare & that is not escaped as &amp;)
The XML declaration is misplaced, for example preceded by whitespace
$xml = '<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don\'t forget me this weekend.</note>';
In the above example, the missing </body> tag will cause parsing to fail.
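For the unescaped-character case in the list above, one possible approach is to escape text before embedding it in the document. The sketch below uses htmlspecialchars() with the ENT_XML1 flag; buildNote() is a hypothetical helper introduced only for this illustration.

```php
<?php
// Sketch: escape text before embedding it in XML so that characters such as
// & and < do not break well-formedness. buildNote() is a hypothetical helper.
function buildNote(string $body): string
{
    $escaped = htmlspecialchars($body, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    return '<note><body>' . $escaped . '</body></note>';
}

// The raw text contains both & and <, which would be illegal if left bare.
$parser = xml_parser_create();
$ok = xml_parse($parser, buildNote('Fish & chips < 5'), true);
echo $ok === 1 ? "well-formed\n" : "parse error\n";
```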
If the XML declaration specifies UTF-8 but the content actually contains bytes that are not valid UTF-8, parsing will also fail.
<?xml version="1.0" encoding="UTF-8"?>
For example, an error occurs if the content is actually GBK-encoded while the declaration claims it is UTF-8.
If you read XML from a URL or a file and the read fails, or only part of the data is read, parsing will fail as well.
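A mismatch like this can be detected and repaired before parsing. The following is a minimal sketch that assumes the mbstring extension is available and that the real source encoding is known; GBK is used here purely as an example, and toUtf8() is a hypothetical helper.

```php
<?php
// Sketch, assuming the mbstring extension is loaded and the real source
// encoding is known (GBK is only an example). toUtf8() is a hypothetical helper.
function toUtf8(string $data, string $sourceEncoding = 'GBK'): string
{
    if (mb_check_encoding($data, 'UTF-8')) {
        return $data; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($data, 'UTF-8', $sourceEncoding);
}

// "\xD6\xD0\xCE\xC4" is a two-character GBK string that is not valid UTF-8.
$xml = '<?xml version="1.0" encoding="UTF-8"?><t>' . toUtf8("\xD6\xD0\xCE\xC4") . '</t>';
$parser = xml_parser_create();
echo xml_parse($parser, $xml, true) === 1 ? "well-formed\n" : "parse error\n";
```

A validity check cannot prove that data was meant to be UTF-8, since some byte sequences are valid in more than one encoding, so this works best when the source encoding is known in advance.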
$url = 'https://m66.net/data/feed.xml';
$xmlData = file_get_contents($url);
if ($xmlData === false) {
    die("Failed to read XML");
}
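When you read a large XML document in chunks, pass true as the third parameter, $is_final, on the last call to xml_parse(). Otherwise the parser never learns that the input has ended, so parsing is left unfinished and errors that can only be detected at the end of the document, such as an unclosed element, may go unreported.
The functions xml_get_error_code(), xml_error_string(), xml_get_current_line_number(), and xml_get_current_column_number() can help you locate the specific cause of a parsing error: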
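Chunked parsing along these lines can be sketched as follows. parseStream() is a hypothetical helper, the 8192-byte chunk size is an arbitrary choice, and the in-memory stream merely stands in for a large file.

```php
<?php
// Sketch: feed a stream to xml_parse() chunk by chunk. parseStream() is a
// hypothetical helper; the default chunk size is an arbitrary choice.
function parseStream($handle, int $chunkSize = 8192): bool
{
    $parser = xml_parser_create();
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false) {
            return false; // read error
        }
        // $is_final is true only once the stream has been exhausted.
        if (xml_parse($parser, $chunk, feof($handle)) !== 1) {
            return false; // the data seen so far is not well-formed
        }
    }
    return true;
}

// Usage with an in-memory stream standing in for a large file.
$handle = fopen('php://memory', 'r+');
fwrite($handle, '<items>' . str_repeat('<item>x</item>', 1000) . '</items>');
rewind($handle);
var_dump(parseStream($handle));
fclose($handle);
```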
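$parser = xml_parser_create();
// Pass the whole document in one call, so $is_final is true.
if (!xml_parse($parser, $xml, true)) {
    // Translate the numeric error code into a readable message.
    $errorCode = xml_get_error_code($parser);
    $errorMessage = xml_error_string($errorCode);
    // Position the parser had reached when it stopped.
    $line = xml_get_current_line_number($parser);
    $column = xml_get_current_column_number($parser);
    echo "XML parsing failed: $errorMessage at line $line, column $column";
}
// Needed before PHP 8.0; has no effect from PHP 8.0 onward.
xml_parser_free($parser);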
It is recommended to validate XML in advance with an editor that supports XML validation (such as VS Code, Sublime Text, or Notepad++) or with an online tool: https://m66.net/tools/xml-validator
Always make sure the encoding named in the XML declaration matches the actual encoding of the content. UTF-8 is recommended, and it should be selected explicitly when saving files.
If the XML source is a remote URL, it is recommended to check whether the request is successful first and then parse it:
$url = 'https://m66.net/api/xml';
$xml = @file_get_contents($url);
if ($xml === false) {
    die("Unable to obtain remote XML data");
}
As an alternative to the low-level xml_parse(), PHP provides more modern and easier-to-use XML parsing APIs, such as SimpleXML:
$xml = simplexml_load_string($xmlData);
if ($xml === false) {
    echo "SimpleXML parsing failed";
}
Or use DOMDocument:
$dom = new DOMDocument();
$success = $dom->loadXML($xmlData);
if (!$success) {
    echo "DOMDocument parsing failed";
}
Both approaches report problems through libxml, so detailed messages with line and column numbers can be collected with libxml_use_internal_errors() and libxml_get_errors(), and both make it easier to work with XML nodes.
Although the "XML parsing failed" error is common, its cause can usually be pinned down through systematic checks. Following the XML well-formedness rules, keeping the declared and actual encodings consistent, and validating documents with the right tools greatly improves the parsing success rate. Beyond that, switching to a more modern XML API can make development more efficient and stable.
If you often work with remote XML interfaces, consider encapsulating the checks above in functions and reusing them across the project to improve robustness.
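To get at those error details, libxml's internal error handling can be switched on before loading. The sketch below shows one way to do that with SimpleXML; describeXmlErrors() is a hypothetical helper name.

```php
<?php
// Sketch: collect libxml errors instead of letting them surface as warnings.
// describeXmlErrors() is a hypothetical helper.
function describeXmlErrors(string $xmlData): array
{
    // Buffer errors internally and remember the previous setting.
    $previous = libxml_use_internal_errors(true);
    $messages = [];
    if (simplexml_load_string($xmlData) === false) {
        foreach (libxml_get_errors() as $error) {
            $messages[] = sprintf(
                'line %d, column %d: %s',
                $error->line,
                $error->column,
                trim($error->message)
            );
        }
    }
    // Empty the buffer and restore the caller's setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $messages;
}

print_r(describeXmlErrors('<note><body>Reminder</note>'));
```

The same pattern applies to DOMDocument::loadXML(), since it reports errors through the same libxml mechanism.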
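One possible shape for such a wrapper is sketched below. The function names checkXml() and loadRemoteXml() are hypothetical, and throwing RuntimeException is just one design choice; returning a result array would work equally well.

```php
<?php
// Sketch of a reusable wrapper; checkXml() and loadRemoteXml() are
// hypothetical names, not PHP built-ins.

// Returns null when $xml is well-formed, otherwise a description of the
// first error the parser reported.
function checkXml(string $xml): ?string
{
    $parser = xml_parser_create();
    if (xml_parse($parser, $xml, true) === 1) {
        return null;
    }
    return sprintf(
        '%s at line %d, column %d',
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser),
        xml_get_current_column_number($parser)
    );
}

// Reads $source (a URL or a local path) and returns the data only if it
// could be read and is well-formed XML.
function loadRemoteXml(string $source): string
{
    $xml = @file_get_contents($source);
    if ($xml === false) {
        throw new RuntimeException("Unable to obtain XML data from $source");
    }
    $error = checkXml($xml);
    if ($error !== null) {
        throw new RuntimeException("XML parsing failed: $error");
    }
    return $xml;
}

// Usage with an inline document; a real call would pass a URL to loadRemoteXml().
var_dump(checkXml('<note><to>Tove</to></note>') === null);
```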