String manipulation is a common task in PHP, and the language provides many functions for it. One of them is mb_stristr, which performs a case-insensitive search for a substring within a multibyte-encoded string. It is mainly used with strings encoded in UTF-8 or other multibyte character sets.
mb_stristr is the multibyte counterpart of stristr. Like stristr, it returns the portion of the string starting at the first occurrence of the specified substring, but it compares characters rather than bytes, so it handles multibyte character sets correctly.
mb_stristr(string $haystack, string $needle, bool $before_needle = false, ?string $encoding = null): string|false
$haystack: The target string to search within.
$needle: The substring to find.
$before_needle: If set to true, the function returns the part of $haystack before the first occurrence of $needle (excluding $needle). Defaults to false, which returns the part of $haystack from the first occurrence of $needle (inclusive) to the end.
$encoding: Specifies the character encoding. If omitted or null, the internal character encoding is used (UTF-8 by default in current PHP versions).
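To make the parameters above concrete, the following sketch calls mb_stristr with and without $before_needle and with an explicit $encoding. The sample strings and variable names are illustrative assumptions, not part of any real application.

```php
<?php
// Illustrative strings only; any UTF-8 text behaves the same way.
$haystack = "Hello, World! Welcome to PHP.";

// Default ($before_needle = false): from the first match to the end.
$fromMatch = mb_stristr($haystack, "WORLD");

// $before_needle = true: everything before the first match.
$beforeMatch = mb_stristr($haystack, "WORLD", true);

// Explicit $encoding; case-insensitive matching also applies to Cyrillic.
$cyrillic = mb_stristr("Привет, МИР!", "мир", false, "UTF-8");

// No match returns false, so always compare with !== false.
$missing = mb_stristr($haystack, "python");

var_dump($fromMatch);   // string "World! Welcome to PHP."
var_dump($beforeMatch); // string "Hello, "
var_dump($cyrillic);    // string "МИР!"
var_dump($missing);     // bool(false)
```

Passing the encoding explicitly keeps the call independent of whatever mb_internal_encoding() happens to be set to elsewhere in the program.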
Here is a simple example showing how to use the mb_stristr function to perform a case-insensitive substring search.
<?php
// Set the internal encoding to UTF-8
mb_internal_encoding("UTF-8");

// Target string
$haystack = "Hello, World! Welcome to PHP.";

// Substring to find (lowercase, to show case-insensitive matching)
$needle = "world";

// Use mb_stristr for a case-insensitive search
$result = mb_stristr($haystack, $needle);

if ($result !== false) {
    // Prints: Substring found: World! Welcome to PHP.
    echo "Substring found: " . $result;
} else {
    echo "Substring not found.";
}
?>
mb_internal_encoding("UTF-8") sets the internal character encoding to UTF-8 to ensure proper handling of multibyte strings.
We use mb_stristr($haystack, $needle) to search for $needle within $haystack. Because mb_stristr is case-insensitive, "world" and "World" are considered matches.
If a match is found, the function returns the substring from the match position to the end of the string; if no match is found, it returns false.
An important feature of mb_stristr is its ability to correctly handle multibyte character sets (such as UTF-8, SJIS, etc.). This makes it more reliable than stristr when dealing with non-ASCII strings.
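The difference from stristr can be seen with an accented character. The sketch below assumes the source file is saved as UTF-8 and a default locale setup; the sample strings are illustrative.

```php
<?php
// Illustrative strings; the file is assumed to be saved as UTF-8.
$haystack = "ÉCOLE primaire";
$needle   = "école";

// stristr works on bytes: the two-byte sequences for "É" and "é" differ,
// and byte-level case folding does not map one onto the other.
$byteResult = stristr($haystack, $needle);

// mb_stristr folds case per character in the given encoding.
$mbResult = mb_stristr($haystack, $needle, false, "UTF-8");

var_dump($byteResult); // bool(false) on a default setup
var_dump($mbResult);   // string "ÉCOLE primaire"
```

For ASCII-only input both functions return the same result, so the choice only matters once non-ASCII letters can appear in either string.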
For example, suppose we want to find Chinese characters:
<?php
// Target string containing Chinese characters ("I like programming.")
$haystack = "我喜欢编程。";

// Substring to find: "喜欢" (like/love)
$needle = "喜欢";

// Search and return the matching part
$result = mb_stristr($haystack, $needle);

if ($result !== false) {
    // Prints: Substring found: 喜欢编程。
    echo "Substring found: " . $result;
} else {
    echo "Substring not found.";
}
?>
In this example, mb_stristr correctly identifies the multibyte substring "喜欢" and returns the portion starting from it.
Character Encoding: Make sure that the character encodings of $haystack and $needle are consistent, especially when dealing with multibyte character sets. Mismatched encodings can cause incorrect or failed matches.
Performance Consideration: mb_stristr can be slower than stristr because it has to decode and case-fold multibyte characters, and the difference is most noticeable when searching large texts. If the data is known to be ASCII-only and speed matters, stristr is sufficient.
Does Not Return Position: Unlike functions such as mb_strpos, mb_stristr returns the substring starting from the match, not the position. If you need the position, use mb_strpos, or mb_strrpos to find the last occurrence; their case-insensitive counterparts are mb_stripos and mb_strripos.
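Two of the notes above can be sketched together: bringing the needle into the haystack's encoding before searching, and getting a character position instead of a substring. The ISO-8859-1 needle is a hypothetical input made up for this example, and the variable names are illustrative.

```php
<?php
$haystack = "Un CAFÉ, s'il vous plaît"; // UTF-8

// Hypothetical input: a needle that arrived as ISO-8859-1 bytes.
$latin1Needle = mb_convert_encoding("café", "ISO-8859-1", "UTF-8");

// Convert the needle to the haystack's encoding before searching.
$needle = mb_convert_encoding($latin1Needle, "UTF-8", "ISO-8859-1");

// Substring from the match onward.
$match = mb_stristr($haystack, $needle, false, "UTF-8");

// Character position of the same match (case-insensitive).
$position = mb_stripos($haystack, $needle, 0, "UTF-8");

// Character offsets differ from byte offsets once multibyte text precedes the match.
$charOffset = mb_stripos("世界, World", "world", 0, "UTF-8");
$byteOffset = stripos("世界, World", "world");

var_dump($match);      // string "CAFÉ, s'il vous plaît"
var_dump($position);   // int(3)
var_dump($charOffset); // int(4)
var_dump($byteOffset); // int(8)
```

The character offset from mb_stripos is the one to pass to other mb_* functions such as mb_substr; the byte offset from stripos belongs with substr.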
mb_stristr provides case-insensitive substring search for strings in multibyte character sets. Used with consistent encodings and a strict !== false check on the return value, it makes matching in UTF-8 or other multibyte strings straightforward.