Data filtering is a routine task in day-to-day development. When handling user input, database query results, or data from external APIs, we often need to exclude certain items from a data set: blacklisted users, products that do not qualify, or records that have already been processed. PHP's native functions array_diff() and in_array() are well suited to this.
array_diff() compares array values and returns the values from the first array that do not appear in any of the other arrays. The keys of the first array are preserved. For example:
$allUsers = ['alice', 'bob', 'charlie', 'david'];
$blacklist = ['bob', 'david'];
$filteredUsers = array_diff($allUsers, $blacklist);
print_r($filteredUsers);
// Output: Array ( [0] => alice [2] => charlie )
In this example, bob and david are on the blacklist, and array_diff() removes them from the original data.
in_array() checks whether a value exists in an array. It is useful for one-off checks or as a condition in control flow.
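Because array_diff() preserves keys, the result above has gaps in its numbering. The following is a minimal sketch of reindexing it with array_values(); the variable name $reindexed is only illustrative.

```php
<?php
$allUsers  = ['alice', 'bob', 'charlie', 'david'];
$blacklist = ['bob', 'david'];

// array_diff() keeps the keys of the first array, so only keys 0 and 2 remain.
$filteredUsers = array_diff($allUsers, $blacklist);

// array_values() rebuilds the list with consecutive keys starting at 0.
$reindexed = array_values($filteredUsers);

var_dump(array_keys($filteredUsers) === [0, 2]);  // bool(true)
var_dump($reindexed === ['alice', 'charlie']);    // bool(true)
```

Reindexing matters mostly when the result is handed to code that expects a plain list, for example json_encode(), which encodes an array with non-sequential keys as an object.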
For example, if we need to conditionally exclude certain items when traversing the data, we can write it like this:
$exclusions = ['spam', 'banned'];
$itemType = 'spam';
if (!in_array($itemType, $exclusions)) {
    echo "Allow processing of this item";
} else {
    echo "This item has been excluded";
}
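One detail worth knowing here: in_array() compares loosely by default, and its optional third argument switches to strict, type-aware comparison. A small sketch, using a hypothetical $allowedIds list that is not part of the examples above:

```php
<?php
// Hypothetical list of integer IDs, used only for this illustration.
$allowedIds = [1, 2, 3];

// Loose comparison (the default): the string '1' equals the integer 1.
$loose = in_array('1', $allowedIds);

// Strict comparison: the types must match as well, so '1' is not found.
$strict = in_array('1', $allowedIds, true);

var_dump($loose);   // bool(true)
var_dump($strict);  // bool(false)
```

When the exclusion list and the checked values come from different sources (form input versus database rows, say), passing true as the third argument is usually the safer choice.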
Now let's look at a more practical example: applying several exclusion conditions at once by calling in_array() inside an array_filter() callback.
Suppose we have a set of articles, each with author, status, and tags fields. We need to:
Exclude articles by blacklisted authors
Exclude articles with draft status
Exclude articles tagged "sensitive"
We can do this:
$articles = [
    ['title' => 'article1', 'author' => 'tom', 'status' => 'published', 'tags' => ['php', 'web']],
    ['title' => 'article2', 'author' => 'jack', 'status' => 'draft', 'tags' => ['php', 'sensitive']],
    ['title' => 'article3', 'author' => 'lucy', 'status' => 'published', 'tags' => ['laravel']],
    ['title' => 'article4', 'author' => 'bob', 'status' => 'published', 'tags' => ['sensitive']],
];
$blacklistedAuthors = ['bob', 'jack'];
$excludedStatus = ['draft'];
$sensitiveTags = ['sensitive'];
$filtered = array_filter($articles, function ($article) use ($blacklistedAuthors, $excludedStatus, $sensitiveTags) {
    // Exclude blacklisted authors
    if (in_array($article['author'], $blacklistedAuthors)) {
        return false;
    }
    // Exclude specific statuses
    if (in_array($article['status'], $excludedStatus)) {
        return false;
    }
    // Exclude articles that carry a sensitive tag
    foreach ($article['tags'] as $tag) {
        if (in_array($tag, $sensitiveTags)) {
            return false;
        }
    }
    return true;
});
print_r($filtered);
The output will be:
Array
(
    [0] => Array
        (
            [title] => article1
            [author] => tom
            [status] => published
            [tags] => Array
                (
                    [0] => php
                    [1] => web
                )

        )

    [2] => Array
        (
            [title] => article3
            [author] => lucy
            [status] => published
            [tags] => Array
                (
                    [0] => laravel
                )

        )

)
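The tag loop in the example can also be written with array_intersect(), which returns the values present in both arrays. The sketch below is one possible variant of the same filter, not the only way to write it; it also passes true to in_array() for strict comparison, which is an addition to the original code.

```php
<?php
$articles = [
    ['title' => 'article1', 'author' => 'tom',  'status' => 'published', 'tags' => ['php', 'web']],
    ['title' => 'article2', 'author' => 'jack', 'status' => 'draft',     'tags' => ['php', 'sensitive']],
    ['title' => 'article3', 'author' => 'lucy', 'status' => 'published', 'tags' => ['laravel']],
    ['title' => 'article4', 'author' => 'bob',  'status' => 'published', 'tags' => ['sensitive']],
];
$blacklistedAuthors = ['bob', 'jack'];
$excludedStatus     = ['draft'];
$sensitiveTags      = ['sensitive'];

$filtered = array_filter(
    $articles,
    function (array $article) use ($blacklistedAuthors, $excludedStatus, $sensitiveTags): bool {
        return !in_array($article['author'], $blacklistedAuthors, true)
            && !in_array($article['status'], $excludedStatus, true)
            // An empty intersection means the article has no sensitive tag.
            && array_intersect($article['tags'], $sensitiveTags) === [];
    }
);

// array_column() collects the titles into a fresh, consecutively indexed list.
$titles = array_column($filtered, 'title');
print_r($titles);  // article1 and article3 remain
```

Note that array_filter() keeps the original keys (0 and 2 here), which is why the titles are collected into a new list before printing.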
When the data set is large, prefer a single array_diff() call that removes unwanted values in one pass over calling in_array() repeatedly inside a loop, since every in_array() call scans the exclusion list from the start.
Converting the exclusion list into a hash table (so that the values become array keys) can speed up lookups further.
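As a rough illustration of that advice, the sketch below removes already-processed records from a list of IDs in two ways and checks that both give the same result. The ID values and variable names are made up for the example.

```php
<?php
// Hypothetical record IDs, used only for this illustration.
$pendingIds   = [101, 102, 103, 104, 105];
$processedIds = [102, 105];

// Loop version: one in_array() scan of $processedIds per pending ID.
$viaLoop = [];
foreach ($pendingIds as $id) {
    if (!in_array($id, $processedIds, true)) {
        $viaLoop[] = $id;
    }
}

// Single-call version: array_diff() removes every processed ID at once;
// array_values() then resets the keys to 0, 1, 2.
$viaDiff = array_values(array_diff($pendingIds, $processedIds));

var_dump($viaLoop === $viaDiff);  // bool(true)
```

The single call is shorter and leaves the comparison work to the engine, though actual timings depend on the data and the PHP version, so it is worth measuring before relying on it.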
For example:
// Flip the list once so that the author names become array keys
$blacklistedAuthors = array_flip(['bob', 'jack']);
// Then, inside the array_filter() callback:
if (isset($blacklistedAuthors[$article['author']])) {
    return false;
}
isset() on a flipped array is a hash lookup, so it is usually faster than in_array(), which scans the list item by item; the gap widens as the exclusion list grows. Keep in mind that array_flip() only works when the values are strings or integers.
By combining array_diff() and in_array() sensibly, we can implement multi-condition exclusion logic quickly while keeping the code readable and efficient. In practice, choosing suitable data structures and organizing the checks carefully makes the code faster and more robust.
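Putting the hash-lookup idea into the full article filter might look like the following sketch. It reuses the sample data from the earlier example; flipping all three exclusion lists (not just the authors) is an assumption made here for consistency.

```php
<?php
$articles = [
    ['title' => 'article1', 'author' => 'tom',  'status' => 'published', 'tags' => ['php', 'web']],
    ['title' => 'article2', 'author' => 'jack', 'status' => 'draft',     'tags' => ['php', 'sensitive']],
    ['title' => 'article3', 'author' => 'lucy', 'status' => 'published', 'tags' => ['laravel']],
    ['title' => 'article4', 'author' => 'bob',  'status' => 'published', 'tags' => ['sensitive']],
];

// Build the lookup tables once, outside the callback.
$blacklistedAuthors = array_flip(['bob', 'jack']);  // ['bob' => 0, 'jack' => 1]
$excludedStatus     = array_flip(['draft']);
$sensitiveTags      = array_flip(['sensitive']);

$filtered = array_filter(
    $articles,
    function (array $article) use ($blacklistedAuthors, $excludedStatus, $sensitiveTags): bool {
        // isset() is true for any existing key whose value is not null,
        // so the flipped value 0 still counts as "present".
        if (isset($blacklistedAuthors[$article['author']])
            || isset($excludedStatus[$article['status']])) {
            return false;
        }
        foreach ($article['tags'] as $tag) {
            if (isset($sensitiveTags[$tag])) {
                return false;
            }
        }
        return true;
    }
);

$titles = array_column($filtered, 'title');
print_r($titles);  // article1 and article3 remain
```

With exclusion lists this small the difference is negligible; the approach pays off when the lists hold hundreds or thousands of entries and the filter runs over many rows.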