In PHP development, the md5_file() function is commonly used to obtain the MD5 hash of a file, which is useful for verifying file integrity or detecting file changes. However, when md5_file() is called repeatedly on the same set of files, the repeated disk I/O can noticeably slow the program down. Caching the results avoids this repeated work.
Each call to md5_file() reads the entire file from disk to calculate the MD5 value. If the same file is hashed multiple times, its content is read again each time, which wastes I/O and CPU time. The cost is most noticeable with large files or a large number of files.
The core idea of caching is this: the first time md5_file() is called for a given file, the result is computed and saved. On subsequent requests for the same file, the hash is read directly from the cache, avoiding a redundant calculation.
The cache can be stored in memory (such as in an array or static variable), in files, or other storage media, depending on the specific scenario.
Below is a simple in-memory example that caches md5_file() results:
<?php
class Md5FileCache {
    // Maps file path => MD5 hash for the lifetime of the current request
    private static $cache = [];

    /**
     * Get the MD5 value of a file with caching support
     *
     * @param string $filePath The file path
     * @return string|false Returns the MD5 string or false on failure
     */
    public static function getMd5(string $filePath) {
        // Check if the cache exists first
        if (isset(self::$cache[$filePath])) {
            return self::$cache[$filePath];
        }

        // Cache does not exist, calculate the MD5.
        // is_file() also rejects directories, which md5_file() cannot hash.
        if (!is_file($filePath)) {
            return false;
        }

        $md5 = md5_file($filePath);
        if ($md5 !== false) {
            self::$cache[$filePath] = $md5;
        }
        return $md5;
    }
}

// Example usage
$file = '/path/to/your/file.txt';
$md5_1 = Md5FileCache::getMd5($file);
$md5_2 = Md5FileCache::getMd5($file); // This fetches directly from cache, avoiding recalculation
echo "MD5: " . $md5_1 . PHP_EOL;
?>
A cached result becomes stale if the file is modified. To detect this, store the file's modification time (from filemtime()) alongside the hash and recompute the hash whenever the modification time changes:
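<?php
class Md5FileCache {
    // Maps file path => ['md5' => hash, 'mtime' => modification time]
    private static $cache = [];

    /**
     * Get the MD5 value of a file, recomputing it when the file's mtime changes
     *
     * @param string $filePath The file path
     * @return string|false Returns the MD5 string or false on failure
     */
    public static function getMd5(string $filePath) {
        if (!is_file($filePath)) {
            return false;
        }

        $mtime = filemtime($filePath);

        // Reuse the cached hash only if the file has not been modified since it was stored
        if (isset(self::$cache[$filePath]) && self::$cache[$filePath]['mtime'] === $mtime) {
            return self::$cache[$filePath]['md5'];
        }

        $md5 = md5_file($filePath);
        if ($md5 !== false) {
            self::$cache[$filePath] = [
                'md5' => $md5,
                'mtime' => $mtime,
            ];
        }
        return $md5;
    }
}
?>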
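Two caveats apply to a check based on filemtime() alone. PHP keeps the results of filemtime() and related functions in a per-request stat cache, so a change made during the same request may go unnoticed unless clearstatcache() is called. The timestamp also has one-second resolution, so two writes within the same second can share an mtime. The following is a minimal sketch of one way to reduce both risks, assuming that comparing the file size as well is acceptable for the use case. The class name StatAwareMd5Cache is hypothetical and chosen only for this illustration; a same-size rewrite within one second would still be missed.

```php
<?php
// Hypothetical variant of the mtime-validated cache: it also compares file size
// and clears PHP's stat cache for the path before reading file metadata.
final class StatAwareMd5Cache
{
    // Maps file path => ['md5' => hash, 'mtime' => int, 'size' => int]
    private static $cache = [];

    /**
     * @param string $filePath The file path
     * @return string|false The MD5 string, or false if the file cannot be read
     */
    public static function getMd5(string $filePath)
    {
        // Drop cached stat data for this path so recent changes are visible
        clearstatcache(true, $filePath);

        if (!is_file($filePath) || !is_readable($filePath)) {
            unset(self::$cache[$filePath]);
            return false;
        }

        $mtime = filemtime($filePath);
        $size  = filesize($filePath);

        // Reuse the stored hash only when both mtime and size are unchanged
        $entry = self::$cache[$filePath] ?? null;
        if ($entry !== null && $entry['mtime'] === $mtime && $entry['size'] === $size) {
            return $entry['md5'];
        }

        $md5 = md5_file($filePath);
        if ($md5 !== false) {
            self::$cache[$filePath] = ['md5' => $md5, 'mtime' => $mtime, 'size' => $size];
        }
        return $md5;
    }
}
```

The extra filesize() call is cheap compared with rehashing, since both values come from the same stat lookup.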
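A static array only lives for the duration of the current request or process. If the hashes need to survive across requests or program runs, the cache can be stored persistently, for example in a file or in a shared cache.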
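As one possible illustration of persistence, the sketch below keeps the cache in a JSON file and reloads it on construction. The class name PersistentMd5Cache, the save() method, and the JSON layout are assumptions made for this example, not part of any library. A shared in-memory store such as APCu would be another option where that extension is available.

```php
<?php
// Hypothetical sketch: persists md5_file() results in a JSON file between runs.
final class PersistentMd5Cache
{
    private $cacheFile;      // Path of the JSON file holding the cache
    private $entries = [];   // file path => ['md5' => ..., 'mtime' => ..., 'size' => ...]
    private $dirty = false;  // True when there are unsaved changes

    public function __construct(string $cacheFile)
    {
        $this->cacheFile = $cacheFile;
        if (is_file($cacheFile)) {
            // A missing, empty, or corrupt file simply yields an empty cache
            $decoded = json_decode((string) file_get_contents($cacheFile), true);
            if (is_array($decoded)) {
                $this->entries = $decoded;
            }
        }
    }

    /** @return string|false The MD5 string, or false if the file cannot be hashed */
    public function getMd5(string $filePath)
    {
        clearstatcache(true, $filePath);
        if (!is_file($filePath)) {
            return false;
        }
        $mtime = filemtime($filePath);
        $size  = filesize($filePath);

        // Reuse a stored hash only if it is well-formed and the file is unchanged
        $entry = $this->entries[$filePath] ?? null;
        if (is_array($entry) && isset($entry['md5'])
            && ($entry['mtime'] ?? null) === $mtime
            && ($entry['size'] ?? null) === $size) {
            return $entry['md5'];
        }

        $md5 = md5_file($filePath);
        if ($md5 !== false) {
            $this->entries[$filePath] = ['md5' => $md5, 'mtime' => $mtime, 'size' => $size];
            $this->dirty = true;
        }
        return $md5;
    }

    /** Write the cache back to disk if anything changed; returns false on failure */
    public function save(): bool
    {
        if (!$this->dirty) {
            return true;
        }
        $json = json_encode($this->entries);
        if ($json === false) {
            return false;
        }
        // LOCK_EX reduces the chance of interleaved writes from concurrent processes
        if (file_put_contents($this->cacheFile, $json, LOCK_EX) === false) {
            return false;
        }
        $this->dirty = false;
        return true;
    }
}
```

A caller would typically create one instance per run, call getMd5() for each file, and call save() once at the end so the cache file is written a single time.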