Current Location: Home> Latest Articles> Build a simple file integrity detection tool

Build a simple file integrity detection tool

M66 2025-06-06

In daily website development and operation and maintenance, file integrity is an important part of system security. Once a file is tampered with, it may mean that the system has been attacked or data breached. To cope with this situation, we can use PHP's built-in md5_file() function to build a simple and practical file integrity detection tool.

1. What is the md5_file function?

md5_file() is a function provided by PHP to directly calculate the MD5 hash value of a specified file. The basic syntax is as follows:

 md5_file(string $filename, bool $binary = false): string|false
  • $filename : The file path to be calculated.

  • $binary : Whether to return hash in original binary format (default is false , returning a 32-bit hexadecimal string).

Example:

 $hash = md5_file('example.txt');
echo $hash;

2. Tool design ideas

We designed the tool in two stages:

  1. Generate benchmark verification code (baseline) : Record the MD5 values ​​of all files to be monitored and save them to a JSON file.

  2. Periodic detection : Recalculate the MD5 value of the current file, compare it with the baseline file, and determine whether it has been modified, deleted or added.

3. Implementation steps

1. Scan the directory and generate the baseline file

 function generateBaseline($dir, $baselineFile = 'baseline.json') {
    $files = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($dir));
    $hashes = [];

    foreach ($files as $file) {
        if ($file->isFile()) {
            $path = str_replace('\\', '/', $file->getRealPath());
            $hashes[$path] = md5_file($path);
        }
    }

    file_put_contents($baselineFile, json_encode($hashes, JSON_PRETTY_PRINT));
    echo "The baseline file has been generated: {$baselineFile}\n";
}

// Usage example
generateBaseline(__DIR__ . '/files');

This function recursively scans all files in the files directory and writes the full path of each file and its MD5 value to baseline.json .

2. Detect file changes

 function checkIntegrity($dir, $baselineFile = 'baseline.json') {
    if (!file_exists($baselineFile)) {
        echo "Baseline file not found,Please be a baseline。\n";
        return;
    }

    $baseline = json_decode(file_get_contents($baselineFile), true);
    $current = [];

    $files = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($dir));
    foreach ($files as $file) {
        if ($file->isFile()) {
            $path = str_replace('\\', '/', $file->getRealPath());
            $current[$path] = md5_file($path);
        }
    }

    // Detect files that have been modified or deleted
    foreach ($baseline as $path => $hash) {
        if (!isset($current[$path])) {
            echo "[delete] {$path}\n";
        } elseif ($current[$path] !== $hash) {
            echo "[Revise] {$path}\n";
        }
    }

    // Detect new files
    foreach ($current as $path => $hash) {
        if (!isset($baseline[$path])) {
            echo "[New] {$path}\n";
        }
    }
}

// Usage example
checkIntegrity(__DIR__ . '/files');

This detection script compares the old and new hashes and outputs each file path that has been modified, added or deleted, so that developers can take timely measures.

4. Visualization or remote reporting (optional)

The test results can be displayed through email, log files or web pages. For example:

 file_put_contents('integrity_report.log', $report);
header('Location: https://m66.net/report-viewer');

The above code redirects the detection report to the log file and then redirects it to a report viewer page, such as https://m66.net/report-viewer .

5. Summary

Through the md5_file() function, we can build an efficient file integrity detection tool with very little code. Although it cannot replace professional security protection tools, it is sufficient for small and medium-sized websites to deal with common tampering risks. In actual deployment, it is recommended to run the script as a Cron task regularly, and further improve security in combination with notification mechanisms (such as email or DingTalk robots).