How to Use mb_eregi_replace and str_replace() for Pre-Processing Strings in Specific Scenarios

M66 2025-06-23

In PHP, strings often need pre-processing before use: replacing unwanted characters, removing special symbols, or normalizing text formats. This article shows how to combine mb_eregi_replace and str_replace() to handle that pre-processing efficiently and flexibly, particularly for text in a multi-byte encoding such as UTF-8.

1. Understanding mb_eregi_replace and str_replace()

mb_eregi_replace: A regex-based replacement function that is multi-byte aware and case-insensitive, suited to complex matching patterns. It is part of the mbstring extension and interprets the pattern in the encoding set by mb_regex_encoding().
str_replace: A simple, fast replacement function that matches literal substrings only. It has no regex support and is case-sensitive.

By combining these two functions, you can perform different levels of string cleaning based on specific needs.
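The difference between the two functions can be seen in a short sketch. The sample string and variable names here are illustrative assumptions, and the code assumes the mbstring extension is loaded.

```php
<?php
// Assumption: mbstring is available and this file is saved as UTF-8.
mb_regex_encoding('UTF-8');

$text = 'PHP php Php, a+b';

// mb_eregi_replace treats its first argument as a regex and ignores case.
$regexResult = mb_eregi_replace('php', 'X', $text);

// str_replace matches the literal substring only, so case must match exactly.
$literalResult = str_replace('php', 'X', $text);

// A regex metacharacter such as "+" is an ordinary character to str_replace.
$plusResult = str_replace('a+b', 'sum', $text);

echo $regexResult, "\n";   // X X X, a+b
echo $literalResult, "\n"; // PHP X Php, a+b
echo $plusResult, "\n";    // PHP php Php, sum
```

When the search text is fixed and already known, str_replace avoids having to escape regex metacharacters at all.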

2. Scenario Analysis

For complex rule matching (e.g., removing every character that is not Chinese, English, or numeric), use mb_eregi_replace.
For simple character replacement (e.g., converting full-width spaces to half-width spaces, or replacing specific characters), use str_replace().
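As a rough illustration of the two scenarios, the sketch below uses str_replace with parallel arrays for a fixed set of full-width characters and mb_eregi_replace for a rule (collapsing runs of whitespace). The character lists and sample strings are assumptions chosen for the example, not a complete normalization table.

```php
<?php
// Assumption: mbstring is available and this file is saved as UTF-8.
mb_regex_encoding('UTF-8');

// Simple, fixed substitutions: str_replace accepts parallel arrays.
$fullWidth = ["\u{3000}", '，', '！'];   // full-width space, comma, exclamation mark
$halfWidth = [' ', ',', '!'];
$normalized = str_replace($fullWidth, $halfWidth, "Hello，世界！\u{3000}PHP");

// Rule-based cleanup: collapse any run of whitespace into one space.
$collapsed = mb_eregi_replace('\s+', ' ', "a \t\n b");

echo $normalized, "\n"; // Hello,世界! PHP
echo $collapsed, "\n";  // a b
```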

3. Code Example

<?php
// Assumes the mbstring extension is enabled; match the regex encoding to the UTF-8 source
mb_regex_encoding('UTF-8');

// Original string, possibly containing various special characters and multi-byte characters
$input = "Hello，世界！ 　这是一个测试字符串。Visit http://m66.net/test for more info.";

// 1. Use mb_eregi_replace to remove all characters except Chinese, English, numbers, and whitespace
// The regex [^a-z0-9一-龥\s] keeps English letters, digits, Chinese characters, and whitespace
$cleaned = mb_eregi_replace('[^a-z0-9一-龥\s]', '', $input);

// 2. Use str_replace to convert full-width spaces (U+3000) to half-width spaces
$cleaned = str_replace("　", " ", $cleaned);

// 3. Example: replace the domain in a URL with m66.net (only the domain changes, the path is kept)
// For example: http://example.com/path becomes http://m66.net/path
// This runs on $input because step 1 strips the ":", "/" and "." that a URL needs
// "#" is the delimiter so the slashes in the pattern need no escaping
$rewritten = preg_replace('#https?://[^/\s]+#', 'http://m66.net', $input);

// Output the results
echo $cleaned, "\n";
echo $rewritten, "\n";
?>

4. Code Explanation

The regular expression passed to mb_eregi_replace removes every character that is not Chinese, English, numeric, or whitespace. Because matching is case-insensitive, a-z also covers uppercase letters. Note that this strips punctuation too, including the ":", "/", and "." inside URLs.
str_replace converts full-width spaces (U+3000) into half-width spaces so that mixed space widths do not cause confusion later.
preg_replace rewrites the scheme and host of any http or https URL to http://m66.net while leaving the path unchanged.
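Because the whitelist pattern also removes URL punctuation, the order of the steps matters. The following sketch, using an assumed ASCII sample string, shows what each operation does to a URL when applied to the same input.

```php
<?php
// Assumption: mbstring is available; the sample sentence is illustrative.
mb_regex_encoding('UTF-8');

$input = 'Visit https://example.com/path now!';

// The whitelist removes ":", "/", "." and "!", so the URL is no longer recognizable.
$stripped = mb_eregi_replace('[^a-z0-9\s]', '', $input);

// Rewriting the host on the untouched input keeps the path intact.
$rewritten = preg_replace('#https?://[^/\s]+#', 'http://m66.net', $input);

echo $stripped, "\n";  // Visit httpsexamplecompath now
echo $rewritten, "\n"; // Visit http://m66.net/path now!
```

If URLs must survive cleaning, one option is to rewrite or extract them before the whitelist step, or to add the URL punctuation to the whitelist.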

5. Conclusion

Combining mb_eregi_replace and str_replace allows for efficient, layered string cleaning: the regex step handles multi-byte text correctly, and the literal step covers fixed substitutions.
Regular expressions can be used for fine control over complex rules, while simple replacements can be handled by str_replace to avoid excessive regex complexity.
For domain name replacements in URLs, regular expressions offer a more precise solution, adaptable to various real-world business scenarios.

This method is particularly useful in projects requiring strict pre-processing of input text, such as user comment filtering, form input cleaning, and text content normalization.
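For projects like these, the steps could be wrapped in one helper. The function name preprocess_text and the final whitespace-collapsing step are assumptions added for illustration. The sketch does the literal replacement first so that the result does not depend on how \s treats the full-width space.

```php
<?php
// Assumption: mbstring is available and this file is saved as UTF-8.
mb_regex_encoding('UTF-8');

// Hypothetical helper: normalizes a piece of user-supplied text.
function preprocess_text(string $text): string
{
    // Literal swap first: full-width space (U+3000) to an ASCII space.
    $text = str_replace("\u{3000}", ' ', $text);

    // Whitelist: keep English letters, digits, Chinese characters, and whitespace.
    $text = mb_eregi_replace('[^a-z0-9一-龥\s]', '', $text);

    // Collapse runs of whitespace into one space, then trim both ends.
    $text = mb_eregi_replace('\s+', ' ', $text);

    return trim($text);
}

echo preprocess_text("  Hello，世界！\u{3000}\u{3000}PHP 8  "), "\n"; // Hello世界 PHP 8
```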