How to Use the mb_eregi_replace Function to Replace Various Chinese Punctuation Marks?

M66 2025-06-15

First, let's understand the basic syntax of mb_eregi_replace:

string mb_eregi_replace ( string $pattern , string $replacement , string $string [, string $option = "msri" ] )

mb_eregi_replace performs case-insensitive regular expression matching and replacement on multibyte strings (such as UTF-8 encoded Chinese). It is part of the mbstring extension and, unlike the old eregi_replace function (removed in PHP 7.0), it has not been deprecated and is still available in current PHP versions.
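To make the signature concrete, here is a minimal sketch of a single call. It assumes the mbstring extension is installed with multibyte regex support enabled; the sample string is made up for illustration.

```php
<?php
// Assumption: mbstring is available and built with regex support.
mb_regex_encoding('UTF-8');

// Matching is case-insensitive, so "Php" and "pHp" both match the pattern "php".
$result = mb_eregi_replace('php', 'PHP', '我喜欢 Php 和 pHp');

echo $result, "\n"; // 我喜欢 PHP 和 PHP
```

Note that the pattern is written without delimiters, unlike the patterns passed to the preg_* functions.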

Replacing Chinese Punctuation Marks

Common Chinese punctuation marks include:

Full-width comma (，)
Full-width period (。)
Enumeration comma (、)
Full-width semicolon (；)
Full-width question mark (？)
Full-width exclamation mark (！)
Full-width quotation marks (“”『』)

These punctuation marks may need to be replaced with English punctuation or removed in different application scenarios. For example, unifying punctuation is very useful in search engine preprocessing, content deduplication, or text normalization.

Sample Code

Suppose we want to replace all Chinese punctuation marks in a piece of Chinese text with their corresponding English punctuation. We can use mb_eregi_replace combined with multiple replacement steps to accomplish this.

<?php
mb_internal_encoding("UTF-8");
mb_regex_encoding("UTF-8");

$text = "你好，世界！这是一个测试文本，包括各种中文标点：比如逗号、句号。还有“引号”、问号？等等。";

// Replacement mapping array: Chinese punctuation => English punctuation
$replacements = [
    '，' => ',',
    '。' => '.',
    '、' => ',',
    '；' => ';',
    '：' => ':',
    '？' => '?',
    '！' => '!',
    '“' => '"',
    '”' => '"',
    '‘' => "'",
    '’' => "'",
    '（' => '(',
    '）' => ')',
    '【' => '[',
    '】' => ']',
    '《' => '<',
    '》' => '>'
];

foreach ($replacements as $chinese => $english) {
    // mb_ereg patterns take no delimiters, and these full-width keys contain
    // no regex metacharacters, so each key can be used as the pattern directly.
    $text = mb_eregi_replace($chinese, $english, $text);
}

echo $text;
?>

Output Result

你好,世界!这是一个测试文本,包括各种中文标点:比如逗号,句号.还有"引号",问号?等等.

This way, we have successfully replaced the Chinese punctuation marks in a Chinese text with English punctuation, making it easier for further processing or display.

Tips

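mb_eregi_replace is not deprecated, but it depends on mbstring's regex support, which can be disabled at build time. For new code, preg_replace with the /u modifier is generally the more portable choice. Case-insensitivity has no effect on punctuation, so mb_ereg_replace works equally well here.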
For processing large volumes of text data, using strtr instead of regex replacements might be more efficient.
If you want to remove punctuation instead of replacing it, simply set $english to an empty string.
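The tips above can be sketched as follows. This is only an illustration: the helper names normalize_punctuation and strip_punctuation are hypothetical, and the mapping is a subset of the array used in the sample code.

```php
<?php
// Hypothetical helper: replaces Chinese punctuation via strtr, without regex.
function normalize_punctuation(string $text): string
{
    $map = [
        '，' => ',', '。' => '.', '、' => ',', '；' => ';', '：' => ':',
        '？' => '?', '！' => '!', '“' => '"', '”' => '"',
    ];
    // With an array argument, strtr replaces every key in a single pass.
    return strtr($text, $map);
}

// Hypothetical helper: removes the same punctuation marks entirely.
function strip_punctuation(string $text): string
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    // preg_replace returns null on error (e.g. invalid UTF-8), so fall back to the input.
    return preg_replace('/[，。、；：？！“”]/u', '', $text) ?? $text;
}

echo normalize_punctuation("你好，世界！"), "\n"; // 你好,世界!
echo strip_punctuation("你好，世界！"), "\n";     // 你好世界
```

strtr operates on bytes, but every key here is a complete UTF-8 character, so it cannot match in the middle of another character.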

Online Testing Suggestions

If you want to debug this script online, you can use online PHP environments such as https://www.m66.net/php-runner to test and observe the actual effects.

By properly using mb_eregi_replace, you can easily standardize punctuation in Chinese text, laying a solid foundation for text data analysis.