In web development, it is common to handle strings that contain HTML tags, whether they come from user input, WYSIWYG editors, or imported data. PHP offers built-in ways to strip that markup and keep only the text. This article introduces two popular approaches: the strip_tags() function and regular expressions with preg_replace().
PHP provides the strip_tags() function, which removes HTML and PHP tags from a string. It is straightforward to use and well suited to basic HTML cleanup.
$string = "<p>This is a string with HTML tags.</p>";
$clean_string = strip_tags($string);
echo $clean_string;
Output:
This is a string with HTML tags.
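Note that strip_tags() removes the tags only: text from adjacent elements ends up joined together, and entities such as &amp; stay encoded. The following is a minimal sketch of one way to tidy the result. The helper name html_to_text() is hypothetical (it is not a PHP built-in), and the sketch assumes tags sit at word boundaries, since it inserts a space before every tag.

```php
<?php
// Hypothetical helper for illustration; not part of PHP itself.
function html_to_text(string $html): string
{
    // Put a space before every "<" so text from adjacent elements does not run together.
    // Assumption: tags sit at word boundaries (this would split "wor<b>d</b>" into "wor d").
    $spaced = str_replace('<', ' <', $html);

    // Remove the tags themselves.
    $text = strip_tags($spaced);

    // strip_tags() leaves entities such as &amp; untouched; decode them to characters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace into single spaces and trim both ends.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo html_to_text('<p>Fish &amp; chips</p><p>Tea</p>'); // Fish & chips Tea
```

Without the extra spacing step, the same input would come out of strip_tags() as "Fish &amp; chipsTea".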
The strip_tags() function takes two parameters: the first is the input string, and the second (optional) parameter specifies which tags should be preserved. The allowed tags are given as a string such as '<a><p>' or, since PHP 7.4, as an array of tag names. For example:
$string = "<p>Paragraph</p><a href='#'>Link</a>";
$clean_string = strip_tags($string, '<a>');
echo $clean_string;
In this example, only the <a> tag is preserved, while all other tags are removed, so the output is Paragraph<a href='#'>Link</a>.
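One detail worth seeing in practice: an allowed tag is kept exactly as written, attributes included. The short sketch below uses made-up sample markup to show that attributes such as style survive on a preserved tag, while tags that are not on the allow list are removed.

```php
<?php
// Sample markup invented for this illustration.
$html = '<p>Read the <a href="/docs" style="color:red">docs</a> <em>now</em>.</p>';

// Keep only <a>; <p> and <em> are stripped, but their text content stays.
$kept = strip_tags($html, '<a>');

echo $kept;
// Read the <a href="/docs" style="color:red">docs</a> now.
```

If the attributes matter (for example inline styles or event handlers in untrusted input), they need to be handled separately, because strip_tags() does not modify them.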
Another flexible method is to use regular expressions. By applying preg_replace(), you can eliminate all HTML tags from a string:
$string = "<div>This is a <div>string with HTML tags</div>.</div>";
$clean_string = preg_replace("/<.*?>/", "", $string);
echo $clean_string;
Output:
This is a string with HTML tags.
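The regular expression /<.*?>/ matches a <, then as few characters as possible up to the next >, and replaces each match with an empty string. It treats anything between angle brackets as a tag, including custom or non-standard ones, which gives you room to adapt the pattern to your own rules. It does not parse HTML, however: by default the dot does not match a newline, so a tag that spans several lines is left in place, and a > inside an attribute value ends the match early. For anything beyond simple, predictable markup, strip_tags() is usually the more reliable choice.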
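To make the behavior of the pattern concrete, here is a small sketch with two made-up inputs: a tag that spans two lines and a tag with a > inside a quoted attribute. It also tries the common variant /<[^>]*>/, which is offered here as an alternative and is not taken from the example above.

```php
<?php
// Illustrative inputs (invented for this sketch).
$multiline = "<a\n   href='#'>Link</a>";
$quoted    = '<img alt="a > b" src="x.png">Caption';

// Lazy dot without the "s" modifier: the opening tag spans two lines,
// so only the closing </a> is matched and removed.
$lazy = preg_replace('/<.*?>/', '', $multiline);

// A negated character class also matches newlines, so the whole opening tag goes too.
$negated = preg_replace('/<[^>]*>/', '', $multiline);

// Either pattern stops at the first ">", even inside a quoted attribute value.
$broken = preg_replace('/<[^>]*>/', '', $quoted);

var_dump($lazy === "<a\n   href='#'>Link");        // bool(true)
var_dump($negated === 'Link');                      // bool(true)
var_dump($broken === ' b" src="x.png">Caption');    // bool(true)
var_dump(strip_tags($multiline) === 'Link');        // bool(true)
```

The negated class fixes the multi-line case, but no single pattern of this kind copes with a > inside an attribute value, which is why the regex approach fits best when the markup is known to be simple.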
Both strip_tags() and regular expressions can remove HTML tags from strings in PHP. strip_tags() is the simpler default and lets you keep a chosen set of tags; a regular expression is an option when you need custom matching rules and the markup is simple and predictable. Choose the method that best suits your content structure and flexibility needs.