How to Use the mb_eregi_replace Function and filter_var() Together for Input Sanitization and Replacement?

M66 2025-06-23

Handling user input in PHP means making sure the data is both safe and in the expected format. For multi-byte strings in particular, mb_eregi_replace() and filter_var() complement each other: the first rewrites unwanted text with a case-insensitive, encoding-aware regular expression, and the second validates or sanitizes values such as URLs. Combined with output escaping, they help reduce the risk of XSS attacks.

This article will demonstrate how to effectively sanitize and replace user-submitted data using these two functions, preventing harmful content while retaining valid data.


1. Function Overview

mb_eregi_replace()

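mb_eregi_replace() is the multi-byte counterpart of eregi_replace() (which was removed in PHP 7.0) and performs case-insensitive regular expression replacements. It interprets the pattern and the subject in the encoding set by mb_regex_encoding(), so with UTF-8 it correctly handles strings that contain Chinese or other multi-byte characters. Unlike the preg_* functions, its patterns are written without delimiters.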

mb_eregi_replace(string $pattern, string $replacement, string $string, ?string $options = null): string|false|null
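
As a minimal sketch of the behavior described above (assuming the mbstring extension is enabled and the script is saved as UTF-8), the following shows the case-insensitive matching and the multi-byte handling. The sample strings are made up for illustration.

```php
<?php
// Assumption: mbstring is available and all strings are UTF-8.
mb_regex_encoding('UTF-8');

// Case-insensitive: "SPAM", "Spam" and "spam" are all replaced.
$english = mb_eregi_replace('spam', '[removed]', 'SPAM, Spam and spam');

// Multi-byte text is matched per character, not per byte.
$chinese = mb_eregi_replace('垃圾', '**', '这是垃圾邮件');

echo $english, PHP_EOL; // [removed], [removed] and [removed]
echo $chinese, PHP_EOL; // 这是**邮件
```

Note that the pattern is passed without delimiters, and that every match in the string is replaced.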

filter_var()

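filter_var() is part of PHP's filter extension and is used to validate and sanitize variables. Validation filters such as FILTER_VALIDATE_EMAIL, FILTER_VALIDATE_URL, and FILTER_VALIDATE_IP return the value on success and false on failure, while sanitization filters strip or encode unwanted characters.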

filter_var(mixed $value, int $filter = FILTER_DEFAULT, array|int $options = 0): mixed
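
The difference between validating and sanitizing can be sketched as follows. The addresses are placeholders chosen for illustration, and the choice of FILTER_SANITIZE_FULL_SPECIAL_CHARS as the sanitizing filter is an assumption; the article itself only uses FILTER_VALIDATE_URL.

```php
<?php
// Validation filters return the value on success and false on failure.
$email    = filter_var('user@example.com', FILTER_VALIDATE_EMAIL); // placeholder address
$badEmail = filter_var('not-an-email', FILTER_VALIDATE_EMAIL);
$url      = filter_var('http://m66.net/spam', FILTER_VALIDATE_URL);
$noScheme = filter_var('m66.net/spam', FILTER_VALIDATE_URL);       // fails: a scheme is required
$ip       = filter_var('192.168.0.1', FILTER_VALIDATE_IP);

// A sanitization filter rewrites the value instead of rejecting it.
$escaped  = filter_var('<b>hi</b>', FILTER_SANITIZE_FULL_SPECIAL_CHARS);

var_dump($email, $badEmail, $url, $noScheme, $ip, $escaped);
```

Because FILTER_VALIDATE_URL rejects a URL without a scheme, bare domains found in user text need a prefix such as http:// before they are validated.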

2. Practical Use Case

Suppose we need to process a user's comment and perform the following steps:

  1. Replace harmful words (e.g., "garbage," "scammer," etc.);

  2. Validate and retain valid URL addresses;

  3. Ensure the result is clean, secure, and user-friendly.

We will implement the sanitization logic with these goals in mind.


3. Example Code

Here’s a complete example showing how to combine mb_eregi_replace() and filter_var():

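<?php
// Assumes the mbstring extension is enabled and the input is UTF-8
mb_regex_encoding('UTF-8');

// Original user input. Roughly: "You garbage user, go report it at m66.net/spam!
// Also take a look at the m66.net/scammer page."
$input = "你这个垃圾用户,快去m66.net/spam举报!还有m66.net/骗子页面也看看吧。";

// Bad words to filter ("garbage", "scammer"); multi-byte words are supported
$badWords = ['垃圾', '骗子'];

// Replace each sensitive word with one asterisk per character
foreach ($badWords as $word) {
    // mb_ereg patterns have no delimiters, so preg_quote() needs no second argument
    $pattern = preg_quote($word);
    $input = mb_eregi_replace($pattern, str_repeat('*', mb_strlen($word, 'UTF-8')), $input);
}

// Extract URLs, validate them, and replace them with safe links.
// "~" is the delimiter so "/" needs no escaping; the path is limited to ASCII
// characters so the match stops before the Chinese text that follows it.
$input = preg_replace_callback('~(https?://)?(m66\.net/[A-Za-z0-9_\-./%*]+)~i', function ($matches) {
    $url = 'http://' . $matches[2]; // Add http prefix for validation
    if (filter_var($url, FILTER_VALIDATE_URL)) {
        $safeUrl = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
        return '<a href="' . $safeUrl . '" target="_blank">' . $safeUrl . '</a>';
    }
    return ''; // Drop anything that fails validation
}, $input);

// Output result
echo $input;
?>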
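
The masking step depends on counting characters rather than bytes. The sketch below isolates that step in a hypothetical helper, maskWord() (the name and the fallback behavior are assumptions, not part of the original code), and contrasts strlen() with mb_strlen() for a UTF-8 string.

```php
<?php
// Hypothetical helper; assumes the mbstring extension and UTF-8 input.
function maskWord(string $text, string $word): string
{
    mb_regex_encoding('UTF-8');
    // One asterisk per character of the masked word
    $mask = str_repeat('*', mb_strlen($word, 'UTF-8'));
    $result = mb_eregi_replace(preg_quote($word), $mask, $text);
    // mb_eregi_replace() can return false or null on error; fall back to the input
    return is_string($result) ? $result : $text;
}

var_dump(strlen('垃圾'));             // int(6): bytes in UTF-8
var_dump(mb_strlen('垃圾', 'UTF-8')); // int(2): characters
echo maskWord('垃圾用户', '垃圾'), PHP_EOL; // **用户
```

Using strlen() here would produce six asterisks for a two-character word, which is why the example relies on mb_strlen().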


4. Example Output

For the sample input above, the script outputs:

你这个**用户,快去<a href="http://m66.net/spam" target="_blank">http://m66.net/spam</a>举报!还有<a href="http://m66.net/**" target="_blank">http://m66.net/**</a>页面也看看吧。
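
The example code escapes the generated links but echoes the text between them unchanged, so any markup in the comment would still reach the browser. One possible way to close that gap, sketched here under assumptions, is to split the text on URLs and escape everything else with htmlspecialchars(). The function name linkifyAndEscape() and the ASCII-only URL pattern are illustrative choices, not part of the original article.

```php
<?php
// Hypothetical helper: escape all text except the links we build ourselves.
function linkifyAndEscape(string $text): string
{
    // Assumption: only ASCII paths on m66.net are treated as links.
    $pattern = '~((?:https?://)?m66\.net/[A-Za-z0-9_\-./%*]+)~i';

    // With PREG_SPLIT_DELIM_CAPTURE, even indexes hold plain text and
    // odd indexes hold the captured URLs.
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    $out = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Plain text: escape it so markup is displayed, not executed
            $out .= htmlspecialchars($part, ENT_QUOTES, 'UTF-8');
            continue;
        }
        // URL: add a scheme if missing, then validate before linking
        $url = preg_match('~^https?://~i', $part) ? $part : 'http://' . $part;
        if (filter_var($url, FILTER_VALIDATE_URL)) {
            $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
            $out .= '<a href="' . $safe . '" target="_blank">' . $safe . '</a>';
        }
    }
    return $out;
}

echo linkifyAndEscape('<script>alert(1)</script> 见 m66.net/spam'), PHP_EOL;
```

With this approach the script tag is rendered as visible text, while the validated URL still becomes a clickable link.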