How to Improve the Execution Efficiency of md5_file() Function in Multiple Calls Using Caching Mechanism

M66 2025-06-23

In PHP development, md5_file() is commonly used to compute the MD5 hash of a file, typically to verify file integrity or detect changes. When it is called repeatedly on the same batch of files, however, every call re-reads and re-hashes the file, and the repeated disk I/O can noticeably slow the program down. Caching the results avoids this repeated work.

1. Performance Bottleneck of md5_file() Function

The md5_file() function reads the entire file from disk each time it is called in order to calculate the MD5 value. If the same file is hashed multiple times, its content is read and processed again on every call, which wastes I/O and CPU time. The cost is most noticeable with large files or a large number of files.
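
To get a rough feel for this cost, here is a minimal timing sketch. It assumes PHP 7.3 or later (for hrtime()). The temporary file, its 5 MiB size, and the repeat count are arbitrary choices for illustration, and the measured time will vary with the hardware and the operating system's file cache.

```php
<?php
// Illustrative timing sketch: the temp file and repeat count are assumptions.
$path = tempnam(sys_get_temp_dir(), 'md5demo');
file_put_contents($path, str_repeat('x', 5 * 1024 * 1024)); // 5 MiB of sample data

$repeats = 20;
$hash = false;

$start = hrtime(true); // monotonic clock, in nanoseconds
for ($i = 0; $i < $repeats; $i++) {
    // Every iteration reads and hashes the whole file again.
    $hash = md5_file($path);
}
$elapsedMs = (hrtime(true) - $start) / 1e6;

printf("%d calls to md5_file() took %.2f ms (hash: %s)\n", $repeats, $elapsedMs, $hash);

unlink($path); // clean up the sample file
```

Since the file does not change between iterations, every call after the first repeats work whose result is already known.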

2. Basic Concept of Using Caching Mechanism

The core idea of caching is: the first time md5_file() is called, the result is computed and saved. On subsequent requests for the same file, the hash value is directly retrieved from the cache, avoiding redundant calculations.

The cache can be stored in memory (such as in an array or static variable), in files, or other storage media, depending on the specific scenario.

3. Example Code

Below is a simple example based on memory caching that implements caching of the md5_file() results:

<?php
class Md5FileCache {
    /** @var array<string, string> Maps a file path to its MD5 hash */
    private static $cache = [];

    /**
     * Get the MD5 value of a file with caching support
     *
     * @param string $filePath The file path
     * @return string|false Returns the MD5 string or false on failure
     */
    public static function getMd5(string $filePath) {
        // Check if the cache exists first
        if (isset(self::$cache[$filePath])) {
            return self::$cache[$filePath];
        }

        // Cache does not exist, calculate the MD5
        if (!file_exists($filePath)) {
            return false;
        }

        $md5 = md5_file($filePath);
        if ($md5 !== false) {
            self::$cache[$filePath] = $md5;
        }

        return $md5;
    }
}

// Example usage
$file = '/path/to/your/file.txt';
$md5_1 = Md5FileCache::getMd5($file);
$md5_2 = Md5FileCache::getMd5($file); // This fetches directly from cache, avoiding recalculation

echo "MD5: " . $md5_1 . PHP_EOL;
?>

4. Further Optimization

1. Use File Modification Time to Validate the Cache

A cached hash becomes stale if the file is modified. To detect this, store the file's modification time (filemtime()) alongside the hash and recompute the hash whenever the modification time no longer matches.

<?php
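class Md5FileCache {
    /** @var array<string, array{md5: string, mtime: int}> Cached hash and mtime per file path */
    private static $cache = [];

    /**
     * Get the MD5 value of a file, recomputing it if the file has changed
     *
     * @param string $filePath The file path
     * @return string|false Returns the MD5 string or false on failure
     */
    public static function getMd5(string $filePath) {
        // PHP caches stat results within a request; clear them for this
        // path so that filemtime() reflects recent modifications
        clearstatcache(true, $filePath);

        if (!file_exists($filePath)) {
            return false;
        }

        $mtime = filemtime($filePath);

        // Reuse the cached hash only if the modification time is unchanged
        if (isset(self::$cache[$filePath]) && self::$cache[$filePath]['mtime'] === $mtime) {
            return self::$cache[$filePath]['md5'];
        }

        $md5 = md5_file($filePath);
        if ($md5 !== false) {
            self::$cache[$filePath] = [
                'md5' => $md5,
                'mtime' => $mtime,
            ];
        }

        return $md5;
    }
}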
?>

2. Persisting the Cache

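An in-memory cache lives only as long as the current request or process. If the hashes need to survive across requests or restarts, the cache can be written to persistent storage, such as a file or another storage backend.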
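
One possible way to do this is to keep the cache in a JSON file. The sketch below is only an illustration under stated assumptions: the class name PersistentMd5Cache, the JSON format, and the choice to validate entries by both modification time and file size are not part of the original example, and the code assumes PHP 7.4 or later (typed properties).

```php
<?php
// Hypothetical helper: the class name, the JSON storage format, and the
// mtime + size validation are assumptions made for illustration.
final class PersistentMd5Cache
{
    private string $cacheFile;
    /** @var array<string, array{md5: string, mtime: int, size: int}> */
    private array $entries = [];
    private bool $dirty = false;

    public function __construct(string $cacheFile)
    {
        $this->cacheFile = $cacheFile;

        // Load previously saved entries, ignoring a missing or invalid file.
        if (is_file($cacheFile)) {
            $decoded = json_decode((string) file_get_contents($cacheFile), true);
            if (is_array($decoded)) {
                $this->entries = $decoded;
            }
        }
    }

    /** @return string|false The MD5 string, or false on failure */
    public function getMd5(string $filePath)
    {
        clearstatcache(true, $filePath); // avoid stale stat results
        if (!is_file($filePath)) {
            return false;
        }

        $mtime = filemtime($filePath);
        $size = filesize($filePath);
        $entry = $this->entries[$filePath] ?? null;

        // Reuse the stored hash only if mtime and size are both unchanged.
        if (is_array($entry)
            && isset($entry['md5'])
            && ($entry['mtime'] ?? null) === $mtime
            && ($entry['size'] ?? null) === $size
        ) {
            return $entry['md5'];
        }

        $md5 = md5_file($filePath);
        if ($md5 !== false) {
            $this->entries[$filePath] = ['md5' => $md5, 'mtime' => $mtime, 'size' => $size];
            $this->dirty = true;
        }

        return $md5;
    }

    /** Write the cache back to disk if anything changed. */
    public function save(): void
    {
        if ($this->dirty) {
            file_put_contents($this->cacheFile, json_encode($this->entries), LOCK_EX);
            $this->dirty = false;
        }
    }
}
```

A caller would create the object with the path of a writable cache file, call getMd5() as needed, and call save() once at the end of the request. Writing once avoids rewriting the JSON file after every lookup. If several processes update the cache at the same time, a shared store may be a better fit than a single file.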