During development, we sometimes need to retrieve the source code of a web page, and PHP provides several ways to do this. This article introduces three common approaches: the file_get_contents() function, the cURL extension, and the DOMDocument class.
The file_get_contents() function can be used to read the content of a remote web page, including the HTML source code. Here is an example of using this function:
$source = file_get_contents($url);
In this case, $url is the web page URL from which you want to retrieve the source code, and $source will contain the HTML source code of that page.
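Note that file_get_contents() only accepts URLs when the allow_url_fopen setting is enabled, and that it returns false (and raises a warning) when the request fails. The sketch below shows one way to handle that. The helper name fetchPageSource(), the timeout default, and the user agent string are illustrative assumptions, not part of PHP or of the example above.

```php
<?php
// Hypothetical helper: fetch a page with a timeout and return null on
// failure instead of letting a warning reach the output.
function fetchPageSource(string $url, int $timeoutSeconds = 10): ?string
{
    // Options under the 'http' key apply to http:// and https:// URLs.
    $context = stream_context_create([
        'http' => [
            'timeout'    => $timeoutSeconds,
            'user_agent' => 'ExampleFetcher/1.0', // placeholder value
        ],
    ]);

    // @ suppresses the warning raised on failure; the false return
    // value is checked explicitly instead.
    $source = @file_get_contents($url, false, $context);

    return $source === false ? null : $source;
}
```

A caller can then test the result against null before using it, for example `$source = fetchPageSource($url);` followed by an `if ($source === null)` branch that logs or retries.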
cURL is a library for transferring data over network protocols, available in PHP through the cURL extension, and it can be used to retrieve web page source code. A typical request follows these steps:
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$source = curl_exec($ch);
curl_close($ch);
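Here, $ch is the cURL session handle. curl_setopt() sets an option on that handle: with CURLOPT_RETURNTRANSFER set to true, curl_exec() returns the response as a string instead of printing it directly. curl_exec() executes the session and retrieves the source code, and curl_close() closes the cURL session.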
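The steps above leave out error handling, and curl_exec() returns false when the transfer fails. A minimal sketch of a more defensive version follows. The function name fetchWithCurl(), the timeout values, and the redirect limit are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: fetch a page with cURL and throw on failure.
function fetchWithCurl(string $url, int $timeoutSeconds = 10): string
{
    $ch = curl_init($url);
    if ($ch === false) {
        throw new RuntimeException('Could not initialise cURL');
    }

    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,            // return the body as a string
        CURLOPT_FOLLOWLOCATION => true,            // follow HTTP redirects
        CURLOPT_MAXREDIRS      => 5,               // assumed limit
        CURLOPT_CONNECTTIMEOUT => 5,               // seconds, assumed value
        CURLOPT_TIMEOUT        => $timeoutSeconds, // whole-transfer limit
    ]);

    $source = curl_exec($ch);
    if ($source === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    // For HTTP(S) requests this is the final response status code.
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($status >= 400) {
        throw new RuntimeException("Server responded with HTTP status $status");
    }

    // Since PHP 8.0 curl_close() has no effect and the handle is released
    // when $ch goes out of scope; on older versions, call curl_close($ch).
    return $source;
}
```

Throwing an exception rather than returning false is a design choice here: it keeps the return type a plain string and forces the caller to decide how to handle a failed request.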
DOMDocument is a built-in PHP class that allows easy parsing and manipulation of HTML and XML documents. You can use this class to retrieve web page source code. Here is an example of using DOMDocument to retrieve source code:
$dom = new DOMDocument();
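$dom->loadHTMLFile($url);
$source = $dom->saveHTML();
In this case, $dom is the DOMDocument object, loadHTMLFile() fetches and parses the page at $url, and saveHTML() returns the HTML content of the loaded document. Note that loadHTML() expects a string of HTML markup rather than a URL, so use it only when the source has already been retrieved. Because the document is parsed and then serialized again, the result may differ slightly from the raw source the server sent.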
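DOMDocument is most useful once the HTML is available as a string and needs to be inspected. The following sketch collects the href value of every link in a page. The function name extractLinks() is a hypothetical helper, and the code assumes the HTML has already been fetched by one of the methods above.

```php
<?php
// Hypothetical helper: return the href of every <a> element in $html.
function extractLinks(string $html): array
{
    // loadHTML() rejects an empty string on PHP 8, so bail out early.
    if ($html === '') {
        return [];
    }

    // Real-world markup often triggers libxml warnings; collect them
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return [];
    }

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if ($href !== '') {
            $links[] = $href; // skip anchors without an href
        }
    }

    return $links;
}
```

For example, passing the string returned by file_get_contents() or curl_exec() to extractLinks() yields a plain array of link targets, relative or absolute as they appear in the page.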
All three methods can effectively retrieve web page source code. Depending on your needs, you can choose the most appropriate method. If you just need a simple retrieval of web page content, file_get_contents() is the most straightforward option. If you need more flexible control, cURL is a better choice. If you need to parse and manipulate HTML source code, the DOMDocument class is the ideal tool.