Comprehensive Guide to PHP mb_strlen(): Accurately Get Multibyte String Length

M66 2025-08-07

Introduction to PHP mb_strlen(): Getting Multibyte String Length

In PHP development, byte-oriented string functions such as strlen() count bytes rather than characters, so they report the wrong length for multibyte text such as Chinese or Japanese. To address this, PHP offers the mb_strlen() function, which returns the number of characters in a string according to a given character encoding. This article explains how to use this function with practical examples.
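To make the difference concrete, here is a minimal sketch comparing the two functions on the same string. It assumes the source file is saved as UTF-8 and that the mbstring extension is enabled; the variable names are illustrative only.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is enabled.
$word = "你好"; // two Chinese characters, three bytes each in UTF-8

$byteCount = strlen($word);             // counts bytes
$charCount = mb_strlen($word, "UTF-8"); // counts characters

echo $byteCount, "\n"; // 6
echo $charCount, "\n"; // 2
```

The byte count is three times the character count here because each of these characters takes three bytes in UTF-8.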

Basic Usage of mb_strlen()

The mb_strlen() function is part of the mbstring extension, so make sure this extension is installed and enabled before use. You can check its status via php.ini settings or using the phpinfo() function.
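Besides php.ini and phpinfo(), the extension can also be checked from code. The following is a small sketch using PHP's built-in extension_loaded() function; the variable name and messages are illustrative.

```php
<?php
// extension_loaded() reports whether a named extension is available at runtime.
$hasMbstring = extension_loaded("mbstring");

if ($hasMbstring) {
    echo "mbstring is enabled.\n";
} else {
    echo "mbstring is not enabled; mb_strlen() will be unavailable.\n";
}
```

A check like this can be useful in a setup or diagnostics script, so a missing extension is reported clearly instead of surfacing later as an undefined function error.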

The basic syntax of the function is:

int mb_strlen ( string $str [, string $encoding = mb_internal_encoding() ] )

Parameter explanation:

  • $str: The multibyte string whose length you want to measure.
  • $encoding (optional): Specifies the character encoding of the string. If omitted, the internal encoding (the value returned by mb_internal_encoding()) is used.

Simple Example

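$str = "你好,世界!";
echo mb_strlen($str); // Outputs: 6

In this example, the string contains 4 Chinese characters and 2 punctuation marks, so mb_strlen() outputs 6. Because no encoding is passed, the result relies on the internal encoding matching the string's actual encoding (UTF-8 here).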

Example Specifying Character Encoding

$str = "こんにちは世界";
echo mb_strlen($str, "UTF-8"); // Outputs: 7

By specifying UTF-8 encoding explicitly, the function counts characters correctly regardless of the internal encoding setting. The string includes 5 hiragana and 2 kanji, totaling 7.
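To illustrate why the encoding argument matters, the sketch below measures the same text stored in two encodings. It assumes the file is saved as UTF-8 and uses mb_convert_encoding() to produce a Shift_JIS copy; the variable names are illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is enabled.
$utf8 = "日本語";                                     // 3 characters
$sjis = mb_convert_encoding($utf8, "SJIS", "UTF-8"); // same text as Shift_JIS bytes

$utf8Bytes = strlen($utf8);             // 9: three bytes per character
$sjisBytes = strlen($sjis);             // 6: two bytes per character
$utf8Chars = mb_strlen($utf8, "UTF-8"); // 3
$sjisChars = mb_strlen($sjis, "SJIS");  // 3

echo "$utf8Bytes $sjisBytes $utf8Chars $sjisChars\n"; // 9 6 3 3
```

The byte counts differ between encodings, while the character count stays the same as long as the encoding passed to mb_strlen() matches how the string is actually stored.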

Length Validation Example

$str = "This is a very long sentence.";
$max_length = 20;
if (mb_strlen($str) > $max_length) {
    echo "String is too long.";
} else {
    echo "String is within the limit.";
}

This example sets a maximum length limit and outputs an appropriate message depending on whether the string exceeds that limit.
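The same check can be wrapped in a small reusable function. The sketch below uses a hypothetical helper named isWithinLimit() (not part of PHP) and assumes input is UTF-8; it also shows how a byte-based check would wrongly reject a short multibyte string.

```php
<?php
// Hypothetical helper, not part of PHP: checks a character limit for UTF-8 text.
function isWithinLimit(string $text, int $maxLength): bool
{
    return mb_strlen($text, "UTF-8") <= $maxLength;
}

$name = "山田太郎さん"; // 6 characters, 18 bytes in UTF-8

$okByChars = isWithinLimit($name, 10); // true: 6 characters <= 10
$okByBytes = strlen($name) <= 10;      // false: 18 bytes > 10

var_dump($okByChars); // bool(true)
var_dump($okByBytes); // bool(false)
```

Passing the encoding explicitly keeps the helper independent of the server's internal encoding setting.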

Conclusion

The mb_strlen() function returns the length of a string in characters rather than bytes and supports a wide range of character encodings. It solves the problem where byte-oriented functions such as strlen() report incorrect lengths for multibyte text. Using it consistently, with an explicit encoding where possible, helps your program behave correctly in multilingual and multi-encoding environments.