
Unified format of IP, time and other fields in the log file

M66 2025-05-18

When processing server log files, we often run into inconsistent formats: IP addresses may carry leading zeros, timestamps may follow different conventions, and request paths may include assorted query parameters. To normalize these fields, PHP's preg_replace_callback_array (available since PHP 7.0) is a handy tool: it binds a separate callback function to each of several regular expressions and applies them all in a single call, one pattern after another.

This article shows how to use preg_replace_callback_array to normalize the IP and time fields in a log and, optionally, to strip query parameters from request paths.
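Before turning to logs, the mechanism can be seen in a minimal sketch. The input string and both patterns here are toy assumptions chosen only to show how each regular expression is paired with its own callback; they are not part of the log example in this article.

```php
<?php
// Toy input, unrelated to the log format used in this article.
$text = 'id=7 name=alice';

$result = preg_replace_callback_array([
    // First pattern: double every integer.
    '/\d+/' => function (array $m): string {
        return (string) ((int) $m[0] * 2);
    },
    // Second pattern: upper-case the value that follows "name=".
    '/(?<=name=)\w+/' => function (array $m): string {
        return strtoupper($m[0]);
    },
], $text);

echo $result, PHP_EOL; // id=14 name=ALICE
```

Each array key is a pattern and each value is the callback that receives that pattern's matches, so the replacement logic for one field stays separate from the others.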

Sample log format

Suppose we have the following log content (simplified version):

127.000.000.001 - - [21/Apr/2025:15:32:01 +0000] "GET /index.php?id=123 HTTP/1.1" 200
192.168.1.10 - - [21-Apr-2025 15:32:01] "POST /submit.php HTTP/1.1" 404

We want to:

  • Normalize the IP address (remove leading zeros);

  • Convert every timestamp to the YYYY-MM-DD HH:MM:SS format;

  • Optional: strip parameters from the path, for example /index.php?id=123 → /index.php
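The first goal can be sketched in isolation. The helper name normalizeIpv4() is hypothetical, and the range check on each octet is an added assumption about what counts as a valid address, so treat this as one possible reading of "normalize" rather than the article's method.

```php
<?php
// normalizeIpv4() is a hypothetical helper used only for illustration.
function normalizeIpv4(string $ip): ?string
{
    // Require exactly four dot-separated groups of one to three digits.
    if (!preg_match('/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/', $ip, $m)) {
        return null;
    }
    // intval() reads the digits as decimal, so "000" becomes 0 and "010" becomes 10.
    $octets = array_map('intval', array_slice($m, 1, 4));
    // Assumption: values above 255 cannot be IPv4 octets, so reject them.
    foreach ($octets as $octet) {
        if ($octet > 255) {
            return null;
        }
    }
    return implode('.', $octets);
}

var_dump(normalizeIpv4('127.000.000.001')); // string(9) "127.0.0.1"
var_dump(normalizeIpv4('999.1.1.1'));       // NULL
```

A regular expression on its own will also match dotted version strings such as 1.2.3.4, so whether to add a check like this depends on what else appears in the log.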

Implementation using preg_replace_callback_array

<?php

$log = <<<LOG
127.000.000.001 - - [21/Apr/2025:15:32:01 +0000] "GET /index.php?id=123 HTTP/1.1" 200
192.168.1.10 - - [21-Apr-2025 15:32:01] "POST /submit.php HTTP/1.1" 404
LOG;

// Define the regular expressions and their corresponding callbacks
$patterns = [
    // IP address formatting: remove leading zeros
    '/\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b/' => function ($matches) {
        return implode('.', array_map('intval', array_slice($matches, 1, 4)));
    },

    // Apache Style Timestamp [21/Apr/2025:15:32:01 +0000]
    '/\[(\d{2})\/(\w{3})\/(\d{4}):(\d{2}):(\d{2}):(\d{2}) [+\-]\d{4}\]/' => function ($matches) {
        $monthMap = [
            'Jan' => '01', 'Feb' => '02', 'Mar' => '03', 'Apr' => '04',
            'May' => '05', 'Jun' => '06', 'Jul' => '07', 'Aug' => '08',
            'Sep' => '09', 'Oct' => '10', 'Nov' => '11', 'Dec' => '12'
        ];
        return sprintf('%s-%s-%s %s:%s:%s',
            $matches[3],                         // Year
            $monthMap[$matches[2]] ?? '01',     // Month
            $matches[1],                         // day
            $matches[4], $matches[5], $matches[6]
        );
    },

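    // Another timestamp format found in logs: [21-Apr-2025 15:32:01]
    // The brackets are part of the pattern so they are removed along with the old format.
    '/\[(\d{2})-(\w{3})-(\d{4}) (\d{2}):(\d{2}):(\d{2})\]/' => function ($matches) {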
        $monthMap = [
            'Jan' => '01', 'Feb' => '02', 'Mar' => '03', 'Apr' => '04',
            'May' => '05', 'Jun' => '06', 'Jul' => '07', 'Aug' => '08',
            'Sep' => '09', 'Oct' => '10', 'Nov' => '11', 'Dec' => '12'
        ];
        return sprintf('%s-%s-%s %s:%s:%s',
            $matches[3],
            $monthMap[$matches[2]] ?? '01',
            $matches[1],
            $matches[4], $matches[5], $matches[6]
        );
    },

    // Remove URL parameters (e.g. /index.php?id=123 → /index.php)
    '#(GET|POST|PUT|DELETE|HEAD) (/[\w\-\/\.]+)(\?[^\s"]*)?#' => function ($matches) {
        return $matches[1] . ' ' . $matches[2];
    }
];

// Apply all replacements
$formatted = preg_replace_callback_array($patterns, $log);

// Output the result (escaped and with <br> line breaks for display in a browser)
echo nl2br(htmlspecialchars($formatted));
?>

Output result

After running the above script, the log content will be formatted as:

127.0.0.1 - - 2025-04-21 15:32:01 "GET /index.php HTTP/1.1" 200
192.168.1.10 - - 2025-04-21 15:32:01 "POST /submit.php HTTP/1.1" 404
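The two timestamp callbacks in the script carry identical month tables. One alternative, sketched here as an assumption rather than as the article's approach, is to let PHP's DateTimeImmutable::createFromFormat parse each variant. The helper reformatTimestamp() is a hypothetical name, and the sample log is shortened to the timestamp and request fields.

```php
<?php
// reformatTimestamp() is a hypothetical helper introduced for this sketch.
// It returns null when the text does not parse with the given format.
function reformatTimestamp(string $raw, string $inputFormat): ?string
{
    $parsed = DateTimeImmutable::createFromFormat($inputFormat, $raw);
    return $parsed === false ? null : $parsed->format('Y-m-d H:i:s');
}

$log = <<<LOG
[21/Apr/2025:15:32:01 +0000] "GET /index.php HTTP/1.1" 200
[21-Apr-2025 15:32:01] "POST /submit.php HTTP/1.1" 404
LOG;

$formatted = preg_replace_callback_array([
    // Apache style: [21/Apr/2025:15:32:01 +0000]
    '/\[(\d{2}\/\w{3}\/\d{4}:\d{2}:\d{2}:\d{2} [+\-]\d{4})\]/' => function (array $m): string {
        // Fall back to the original text if parsing fails.
        return reformatTimestamp($m[1], 'd/M/Y:H:i:s O') ?? $m[0];
    },
    // Alternative style: [21-Apr-2025 15:32:01]
    '/\[(\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2})\]/' => function (array $m): string {
        return reformatTimestamp($m[1], 'd-M-Y H:i:s') ?? $m[0];
    },
], $log);

echo $formatted, PHP_EOL;
```

As in the original script, the UTC offset is dropped and the wall-clock time is kept as written. Converting every entry to one time zone would be a separate design decision.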

Summary

With preg_replace_callback_array, we can normalize several differently formatted log fields in a single call. Each pattern has its own independent callback, which keeps the logic clear and the code easy to maintain.
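One detail worth keeping in mind when adding more patterns: they are applied in array order, and each pattern works on the output of the previous one. The toy patterns below are assumptions chosen to make that visible. The sketch also uses the optional fourth parameter, which receives the total number of replacements.

```php
<?php
$count = 0;

$result = preg_replace_callback_array([
    // Runs first: turns "ab" into "bb" (1 replacement).
    '/a/' => function (array $m): string {
        return 'b';
    },
    // Runs second, on the output of the first pattern: "bb" becomes "cc" (2 replacements).
    '/b/' => function (array $m): string {
        return 'c';
    },
], 'ab', -1, $count);

echo $result, PHP_EOL; // cc
echo $count, PHP_EOL;  // 3
```

In a log formatter this means a later pattern should not accidentally match text that an earlier callback produced.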