Use md5_file() to detect whether the web page has been tampered with (applicable to cache files)

M66 2025-06-05

In web development, dynamically generated page content is often cached to improve access speed. Cache files can greatly reduce the time spent on database queries and page rendering, which improves the user experience. However, if a cache file is tampered with, the page may display incorrect content or even expose users to security risks. It is therefore important to be able to detect whether a cache file has been modified without authorization.

This article introduces a way to verify the integrity of cache files with PHP's built-in md5_file() function, so that developers can quickly determine whether a cached page has been tampered with.

What is md5_file()

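md5_file() is a built-in PHP function that calculates the MD5 hash of a specified file. The hash is a 128-bit value, returned by default as a 32-character hexadecimal string, and can be regarded as the "fingerprint" of the file. Any change to the file content will, in practice, produce a different MD5 value, which makes the function convenient for detecting modifications. Note that MD5 is not collision-resistant, so where deliberate forgery is a concern, a stronger algorithm such as hash_file('sha256', ...) is a safer choice.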

The function prototype is as follows:

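 md5_file(string $filename, bool $binary = false): string|false
  • $filename : The path of the file to hash.

  • $binary : Whether to return the digest in raw binary format (16 bytes). Defaults to false, in which case a 32-character hexadecimal string is returned.

  • Return value: The hash as a string on success, or false on failure (for example, when the file cannot be read).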
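As a quick illustration of the two return formats, the sketch below hashes a temporary file (created here only for the demonstration) and checks the length of each result. Converting the raw digest with bin2hex() should give the same hexadecimal string.

```php
<?php
// Create a throwaway file so the example is self-contained.
$tmpFile = tempnam(sys_get_temp_dir(), 'md5demo');
file_put_contents($tmpFile, 'hello');

$hex = md5_file($tmpFile);        // hexadecimal string
$raw = md5_file($tmpFile, true);  // raw binary digest

var_dump(strlen($hex));               // int(32)
var_dump(strlen($raw));               // int(16)
var_dump(bin2hex($raw) === $hex);     // bool(true)
var_dump($hex === md5('hello'));      // bool(true): same digest as hashing the content directly

unlink($tmpFile);
```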

Application scenarios

Suppose you cache a page's HTML in a file (such as cache/page1.html). When the cache file is generated, you calculate its MD5 value and save it as the reference value. On each user request, the MD5 of the cache file is recalculated and compared with that reference value:

  • If the values are the same, the cache file has not been modified and can be used safely.

  • If they differ, the cache file has been tampered with or unexpectedly modified, and it should be regenerated or an alert should be raised.

Sample code

The following example shows how to use md5_file() for integrity detection of cached files.

<?php
// Path of the cache file
$cacheFile = __DIR__ . '/cache/page1.html';

// Saved reference MD5 value (can be stored in a database or configuration file; a fixed string is assumed here)
$knownMd5 = 'e99a18c428cb38d5f260853678922e03'; // Example MD5

if (!file_exists($cacheFile)) {
    die('The cache file does not exist.');
}

// Calculate the MD5 of the current cache file; md5_file() returns false on failure
$currentMd5 = md5_file($cacheFile);
if ($currentMd5 === false) {
    die('The cache file could not be read.');
}

// Determine whether the file has been tampered with
if ($currentMd5 === $knownMd5) {
    echo "The cache file has not been tampered with; the content is safe.";
} else {
    echo "Warning: the cache file may have been tampered with! Please check it immediately.";
}
?>
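The sample above hardcodes the reference hash. One way to maintain it in practice is to record the hash at the moment the cache file is written. The following is a minimal sketch under that assumption; the helper names writeCacheWithBaseline() and isCacheIntact() and the sidecar ".md5" file are hypothetical choices for illustration, not part of PHP or of the sample code above.

```php
<?php
// Hypothetical helper: write the cache file and store its MD5 in a sidecar file.
function writeCacheWithBaseline(string $cacheFile, string $html): bool
{
    if (file_put_contents($cacheFile, $html, LOCK_EX) === false) {
        return false;
    }
    $hash = md5_file($cacheFile);
    if ($hash === false) {
        return false;
    }
    return file_put_contents($cacheFile . '.md5', $hash, LOCK_EX) !== false;
}

// Hypothetical helper: true only if the file still matches its stored hash.
function isCacheIntact(string $cacheFile): bool
{
    $baselineFile = $cacheFile . '.md5';
    if (!is_file($cacheFile) || !is_file($baselineFile)) {
        return false;
    }
    $expected = trim((string) file_get_contents($baselineFile));
    $actual   = md5_file($cacheFile);
    return $actual !== false && hash_equals($expected, $actual);
}

// Usage: a temporary path stands in for the real cache folder.
$cacheFile = sys_get_temp_dir() . '/page1_' . uniqid() . '.html';
writeCacheWithBaseline($cacheFile, '<h1>Hello</h1>');
$intactBefore = isCacheIntact($cacheFile);
var_dump($intactBefore);   // bool(true)

file_put_contents($cacheFile, '<h1>Hacked</h1>'); // simulate tampering
$intactAfter = isCacheIntact($cacheFile);
var_dump($intactAfter);    // bool(false)

// Clean up the demo files.
unlink($cacheFile);
unlink($cacheFile . '.md5');
```

A sidecar file keeps the sketch short, but anyone who can overwrite the cache file can probably overwrite the sidecar too. Storing the reference value in a database or in a location the web server cannot write to is the more robust option.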

Things to note

  1. Generate the initial reference value: When the cache file is first generated, calculate its MD5 value immediately and save it as the reference for later comparisons.

  2. Storage location of cache files: Cache files should be kept in a secure directory that cannot be modified directly from outside, to reduce the risk of tampering.

  3. Regular verification: You can set up a scheduled task (such as cron) to run the MD5 check periodically so that anomalies are detected promptly.

  4. Logging and alerts: Once tampering is detected, record it in the logs and send a notification so that you can respond quickly.
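Points 3 and 4 can be combined into a small command-line script run by a scheduler. The sketch below assumes a hypothetical baseline map of file path to expected MD5 (built in memory here; in practice it might be loaded from a database or configuration file). The function name findTamperedFiles() is illustrative, and error_log() is a stand-in for whatever alerting channel you actually use.

```php
<?php
// Hypothetical helper: return the paths whose current MD5 differs from the baseline.
// $baseline maps file path => expected MD5 (hexadecimal string).
function findTamperedFiles(array $baseline): array
{
    $tampered = [];
    foreach ($baseline as $path => $expectedMd5) {
        // A missing or unreadable file is reported as well.
        $actual = is_readable($path) ? md5_file($path) : false;
        if ($actual === false || !hash_equals((string) $expectedMd5, $actual)) {
            $tampered[] = $path;
        }
    }
    return $tampered;
}

// Demo data: two temporary files stand in for cached pages.
$a = tempnam(sys_get_temp_dir(), 'cache');
$b = tempnam(sys_get_temp_dir(), 'cache');
file_put_contents($a, 'page A');
file_put_contents($b, 'page B');
$baseline = [$a => md5_file($a), $b => md5_file($b)];

file_put_contents($b, 'page B (modified)'); // simulate tampering

$tampered = findTamperedFiles($baseline);
foreach ($tampered as $path) {
    // Replace error_log() with mail, a webhook, etc. as needed.
    error_log(date('c') . " cache integrity check failed: {$path}");
}

// Clean up the demo files.
unlink($a);
unlink($b);
```

Saved as, say, verify_cache.php (a hypothetical name), the script could be scheduled with a crontab entry such as "*/10 * * * * php /path/to/verify_cache.php" to run every ten minutes.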

Verification for URL resources

If the cache file references remote URLs (such as images or JS scripts), you need to make sure that these resources come from trusted domain names. In this article, the domain name in such URLs is replaced with m66.net to prevent an injected malicious domain from altering the content.

The following example demonstrates how to replace the URL domain name in the cached file content:
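One possible approach, sketched under assumptions: rewrite the host of every absolute or protocol-relative URL found in src and href attributes to m66.net before the cache file is written and its reference hash is computed. The function name replaceUrlDomain() and the regular expression are hypothetical and cover only simple cases; a real implementation might prefer an HTML parser such as DOMDocument.

```php
<?php
// Hypothetical helper: replace the host of http(s) and protocol-relative URLs
// in src/href attributes with the trusted domain. Relative paths are left alone.
function replaceUrlDomain(string $html, string $trustedDomain = 'm66.net'): string
{
    // Matches: attribute name, opening quote, optional scheme, "//", then the host part.
    $pattern = '~\b(src|href)=(["\'])(https?:)?//[^/"\'\s?#]+~i';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($trustedDomain): string {
            // $m[1] = attribute name, $m[2] = quote, $m[3] = scheme (absent for "//host" URLs)
            return $m[1] . '=' . $m[2] . ($m[3] ?? '') . '//' . $trustedDomain;
        },
        $html
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $html;
}

$html = '<script src="https://evil.example/js/app.js"></script>';
echo replaceUrlDomain($html), PHP_EOL;
// <script src="https://m66.net/js/app.js"></script>
```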