Combining xml_parse and curl to download and parse XML data

M66 2025-05-13

In day-to-day PHP development, we often need to fetch XML data from a remote server and parse it. This article shows how to download XML with cURL, parse it with the xml_parse family of functions, and convert the result into a usable array structure.

1. Preparation: Enable the required extensions

First, make sure the following extensions are enabled in your PHP environment:

  • cURL : used to remotely download data

  • XML Parser : used to parse XML documents

The XML Parser extension is enabled by default in standard PHP builds. cURL ships with most PHP distributions, but on some systems it has to be enabled in php.ini or installed as a separate package.
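Before going further, it can be worth confirming that both extensions are actually loaded. The following is a minimal sketch; the helper name missingExtensions is hypothetical and not part of PHP, while extension_loaded is the built-in check it relies on.

```php
<?php
// Hypothetical helper: return the names of required extensions that are not loaded.
function missingExtensions(array $required) {
    $missing = [];
    foreach ($required as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// 'curl' and 'xml' are the extension names used by this article's code.
$missing = missingExtensions(['curl', 'xml']);
if ($missing) {
    echo 'Missing PHP extensions: ' . implode(', ', $missing) . PHP_EOL;
}
```

Running php -m on the command line lists the loaded extensions as well.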

2. Use curl to obtain XML data

Let's first download the XML content from a remote address through cURL:

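function fetchXmlData($url) {
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds

    $data = curl_exec($ch);

    // curl_exec() returns false on failure; read the error before closing the handle
    if ($data === false) {
        echo 'Curl error: ' . curl_error($ch);
        curl_close($ch);
        return false;
    }

    curl_close($ch);
    return $data;
}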

// Example URL
$url = 'https://api.m66.net/data/sample.xml';
$xmlContent = fetchXmlData($url);

if ($xmlContent === false) {
    exit('Failed to fetch XML');
}
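A transfer can succeed at the cURL level and still return something that is not worth parsing, such as a 404 error page. The sketch below is one possible way to guard against that: it checks the HTTP status and whether the body starts like an XML document. The names isUsableXmlResponse and fetchXmlChecked are hypothetical, and the "starts with <" test is only a rough heuristic, not real validation.

```php
<?php
// Hypothetical helper: decide whether a response looks worth handing to the parser.
function isUsableXmlResponse($status, $body) {
    if ($status < 200 || $status >= 300 || !is_string($body)) {
        return false;
    }
    // Skip an optional UTF-8 byte order mark, then leading whitespace.
    if (strncmp($body, "\xEF\xBB\xBF", 3) === 0) {
        $body = (string) substr($body, 3);
    }
    return strncmp(ltrim($body), '<', 1) === 0;
}

// Hypothetical variant of the download step that also checks the HTTP status.
function fetchXmlChecked($url) {
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body as a string
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_CONNECTTIMEOUT => 5,    // seconds allowed for connecting
        CURLOPT_TIMEOUT        => 10,   // seconds allowed for the whole transfer
    ]);

    $body   = curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // curl_close() has no effect from PHP 8.0 onward, so it is only called on older versions.
    if (PHP_VERSION_ID < 80000) {
        curl_close($ch);
    }

    return isUsableXmlResponse($status, $body) ? $body : false;
}
```

Keeping the check in a separate function means it can be exercised without any network access.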

3. Use xml_parse to parse XML data

PHP's XML Parser extension (the xml_parse family of functions) is event-driven. Here we create a parser, let xml_parse_into_struct flatten the document into a list of open, complete, and close entries, and then rebuild that list into a nested array.

function parseXmlToArray($xml) {
    $parser = xml_parser_create();
    $values = [];
    $index = [];

    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);

    if (!xml_parse_into_struct($parser, $xml, $values, $index)) {
        echo "XML Parsing error: " . xml_error_string(xml_get_error_code($parser));
        xml_parser_free($parser);
        return false;
    }

    xml_parser_free($parser);
    return buildXmlArray($values);
}

function buildXmlArray($values) {
    $result = [];
    $stack = [];

    foreach ($values as $val) {
        switch ($val['type']) {
            case 'open':
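                $tag = $val['tag'];
                // Drop the reference left over from the previous 'open' element;
                // otherwise the assignment below would overwrite that element.
                unset($child);
                $child = [];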
                if (isset($val['attributes'])) {
                    $child['@attributes'] = $val['attributes'];
                }
                $child['@children'] = [];
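                $stack[] = [&$result];           // remember the parent level
                $result[$tag][] = &$child;       // attach the new element to its parent
                $result = &$child['@children'];  // descend into the new element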
                break;

            case 'complete':
                $tag = $val['tag'];
                $entry = isset($val['value']) ? $val['value'] : '';
                if (isset($val['attributes'])) {
                    $result[$tag][] = [
                        '@attributes' => $val['attributes'],
                        '@value' => $entry
                    ];
                } else {
                    $result[$tag][] = $entry;
                }
                break;

            case 'close':
                $result = &$stack[count($stack) - 1][0];
                array_pop($stack);
                break;
        }
    }

    return $result;
}
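To follow the switch in buildXmlArray, it helps to see what xml_parse_into_struct produces. The sketch below prints one line per entry, indented by nesting level. The function name describeStruct and the sample document are made up for illustration.

```php
<?php
// Hypothetical helper: summarise the flat list produced by xml_parse_into_struct().
function describeStruct($xml) {
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag names as written
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);   // skip whitespace-only text

    $values = [];
    $ok = xml_parse_into_struct($parser, $xml, $values);

    // xml_parser_free() has no effect from PHP 8.0 onward.
    if (PHP_VERSION_ID < 80000) {
        xml_parser_free($parser);
    }
    if (!$ok) {
        return false;
    }

    $lines = [];
    foreach ($values as $entry) {
        // Each entry carries at least 'tag', 'type' and 'level'.
        $indent  = str_repeat('  ', $entry['level'] - 1);
        $lines[] = $indent . $entry['type'] . ' ' . $entry['tag'];
    }
    return $lines;
}

$sample = '<list><item id="1">a</item><group><x>1</x></group></list>';
echo implode(PHP_EOL, describeStruct($sample)), PHP_EOL;
```

An element that contains child elements appears as an open and close pair, while an element with only text (or nothing) inside appears as a single complete entry. This is why buildXmlArray pushes onto its stack only for open entries.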

4. Complete example: From download to parsing

By combining the above two parts, we can implement the complete "download + parsing" process:

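$url = 'https://api.m66.net/data/sample.xml';
$xmlContent = fetchXmlData($url);

if ($xmlContent) {
    $parsedData = parseXmlToArray($xmlContent);
    // parseXmlToArray() returns false when the XML is not well-formed
    if ($parsedData !== false) {
        echo "<pre>";
        // Escape the dump so markup-like values are displayed rather than rendered
        echo htmlspecialchars(print_r($parsedData, true));
        echo "</pre>";
    }
}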

5. Summary

This article showed how to download XML data with PHP's cURL extension and how to parse it into a structured array with the xml_parse family of functions. These functions are lower-level than PHP's other XML APIs, but they depend only on the bundled XML Parser extension, and their handler-based style can process a document piece by piece without holding a full tree in memory. That makes them worth considering for performance-sensitive projects.
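For reference, the handler-based style of the same extension can be sketched as follows: the parser calls a function whenever an element starts, ends, or delivers text, and nothing is stored unless the handlers store it. The function name collectLeafText and the "path/to/element" key format are choices made for this illustration only.

```php
<?php
// Hypothetical example: collect the text of every leaf element using event handlers.
function collectLeafText($xml) {
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag names as written

    $path   = [];    // names of the elements that are currently open
    $text   = '';    // character data seen since the last start tag
    $isLeaf = false; // true while the current element has no child elements
    $found  = [];    // "path/to/element" => list of trimmed text values

    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$path, &$text, &$isLeaf) {
            $path[] = $name;
            $text   = '';
            $isLeaf = true;
        },
        function ($p, $name) use (&$path, &$text, &$isLeaf, &$found) {
            if ($isLeaf) {
                $found[implode('/', $path)][] = trim($text);
            }
            array_pop($path);
            $isLeaf = false; // the parent has at least one child, so it is not a leaf
        }
    );
    xml_set_character_data_handler($parser, function ($p, $data) use (&$text) {
        $text .= $data; // text may arrive in several chunks
    });

    $ok = xml_parse($parser, $xml, true); // true marks this as the final chunk

    // xml_parser_free() has no effect from PHP 8.0 onward.
    if (PHP_VERSION_ID < 80000) {
        xml_parser_free($parser);
    }
    return $ok === 1 ? $found : false;
}

print_r(collectLeafText('<root><item>a</item><item>b</item></root>'));
```

Because xml_parse accepts the document in chunks, the same handlers could be fed from a stream instead of one complete string.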

For more complex XML, for example documents that use namespaces or that you want to query with XPath, consider a higher-level API such as SimpleXML or DOMDocument.
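As a rough comparison, here is how a similar task might look with SimpleXML. This is only a sketch: the function name readItemsWithSimpleXml and the list/item document layout are assumptions made for the example, not the structure of the sample URL used earlier.

```php
<?php
// Hypothetical example: read <item id="...">text</item> children with SimpleXML.
function readItemsWithSimpleXml($xml) {
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc === false) {
        return false; // not well-formed XML
    }

    $items = [];
    foreach ($doc->item as $item) {
        $items[] = [
            'id'   => (string) $item['id'], // attribute access
            'text' => trim((string) $item), // element text
        ];
    }
    return $items;
}

print_r(readItemsWithSimpleXml('<list><item id="1">a</item><item id="2">b</item></list>'));
```

SimpleXML loads the whole document into memory, so it trades some of the low-level control of xml_parse for shorter code.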