In PHP, array_diff() is a very commonly used function. It compares two or more arrays and returns the values that are present in the first array but not in any of the others. This works great for one-dimensional arrays, but when you try to use it with "nested arrays" (arrays whose elements are themselves arrays), you'll find that it's not as straightforward as you might expect.
More precisely, array_diff() compares the string representations of values: two elements are treated as equal if and only if (string) $elem1 === (string) $elem2. Here's a simple example with a one-dimensional array:
$a = ['apple', 'banana', 'cherry'];
$b = ['banana', 'dragonfruit'];
$result = array_diff($a, $b);
// Result: [0 => 'apple', 2 => 'cherry'] (keys from $a are preserved)
This is very straightforward: it matches elements by value only and ignores keys when comparing, although the surviving elements keep their original keys. However, if we switch to using nested arrays, the result will be different.
Let's look at the following example:
$a = [
    ['id' => 1, 'name' => 'Alice'],
    ['id' => 2, 'name' => 'Bob']
];

$b = [
    ['id' => 2, 'name' => 'Bob']
];

$result = array_diff($a, $b);
You might expect $result to contain just ['id' => 1, 'name' => 'Alice'], but in reality you get an empty array, along with an "Array to string conversion" warning (a notice before PHP 8). This is because array_diff() does not compare elements with ==; it casts each element to a string and compares the strings.
array_diff() doesn't recursively process arrays within arrays. Instead, it treats each sub-array as a single value and casts it to a string, and every array casts to the literal string "Array". As a result, all sub-arrays look identical to array_diff() no matter what they contain, so every element of $a appears to exist in $b.
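The PHP manual describes array_diff() as comparing the string representations of elements. The sketch below checks what that means for sub-arrays by running array_diff() on the same data and recording whatever PHP reports. The $warnings variable and the error-handler closure are illustrative additions, not part of the original example.

```php
<?php
// Sample data mirroring the nested example above.
$a = [
    ['id' => 1, 'name' => 'Alice'],
    ['id' => 2, 'name' => 'Bob'],
];
$b = [
    ['id' => 2, 'name' => 'Bob'],
];

// Collect warnings/notices instead of printing them (illustrative helper).
$warnings = [];
set_error_handler(function (int $errno, string $errstr) use (&$warnings): bool {
    $warnings[] = $errstr;
    return true; // mark the error as handled
});

$result = array_diff($a, $b);

restore_error_handler();

var_dump($result);                 // an empty array
var_dump(array_unique($warnings)); // contains "Array to string conversion"
```

Every sub-array is converted to the string "Array", so each element of $a finds a "match" in $b and nothing is left in the result.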
To correctly compare nested arrays, you can write a custom function to compare the contents of the arrays element by element:
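function array_diff_recursive($a, $b) {
    $diff = [];
    // Check every element of $a against every element of $b.
    foreach ($a as $itemA) {
        $found = false;
        foreach ($b as $itemB) {
            // == treats arrays as equal when they hold the same key/value pairs.
            if ($itemA == $itemB) {
                $found = true;
                break;
            }
        }
        if (!$found) {
            $diff[] = $itemA;
        }
    }
    return $diff;
}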
$a = [
    ['id' => 1, 'name' => 'Alice'],
    ['id' => 2, 'name' => 'Bob']
];
$b = [
    ['id' => 2, 'name' => 'Bob']
];
$result = array_diff_recursive($a, $b);
// Result: [['id' => 1, 'name' => 'Alice']]
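The function above compares elements with ==, so it helps to know how PHP's loose array comparison behaves. As a small illustration (the variables $x, $y, and $z are made up for this sketch), == considers two arrays equal when they have the same key/value pairs, while === also requires the same order and the same types:

```php
<?php
$x = ['id' => 2, 'name' => 'Bob'];
$y = ['name' => 'Bob', 'id' => 2];   // same pairs, different key order
$z = ['id' => '2', 'name' => 'Bob']; // id is a string here

var_dump($x == $y);  // bool(true)
var_dump($x === $y); // bool(false)
var_dump($x == $z);  // bool(true): 2 == '2' under loose comparison
var_dump($x === $z); // bool(false)
```

If key order or value types matter for your data, switching the comparison in the helper to === is one option to consider.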
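array_udiff() allows you to provide a custom comparison function, which is especially useful when comparing complex structures. The callback must behave like a sort comparator and return a negative integer, zero, or a positive integer:
function compare_nested_array($a, $b) {
    // array_udiff() sorts its inputs internally, so returning only 0 or 1
    // can give wrong results; the spaceship operator yields -1, 0, or 1.
    return $a <=> $b;
}

$result = array_udiff($a, $b, 'compare_nested_array');
// Result: [0 => ['id' => 1, 'name' => 'Alice']]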
This method allows you to customize the comparison logic, such as only caring about whether the id field matches, regardless of whether the name field is the same.
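As a minimal sketch of the id-only case, the comparator below looks at the id field and nothing else. The function name compare_by_id and the changed name in $b are assumptions made for illustration.

```php
<?php
$a = [
    ['id' => 1, 'name' => 'Alice'],
    ['id' => 2, 'name' => 'Bob'],
];
$b = [
    ['id' => 2, 'name' => 'Robert'], // same id, different name
];

// Hypothetical comparator: order rows by their id field only.
function compare_by_id(array $x, array $y): int {
    return $x['id'] <=> $y['id'];
}

$result = array_udiff($a, $b, 'compare_by_id');
print_r($result); // only the row with id 1 remains, under its original key 0
```

Because the name field is never inspected, the row with id 2 counts as present in $b even though the names differ.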
If you're working with large datasets, you can generate a hash for each sub-array and compare the hashes instead:
function get_hash($array) {
    return md5(json_encode($array));
}

$a_hashed = array_map('get_hash', $a);
$b_hashed = array_map('get_hash', $b);

// array_diff() preserves the keys of $a_hashed, so they still index into $a.
$diff_keys = array_diff($a_hashed, $b_hashed);

$result = [];
foreach ($diff_keys as $key => $hash) {
    $result[] = $a[$key];
}
This technique is particularly suitable for large datasets: array_diff() only has to compare short strings, rather than checking every element of $a against every element of $b as the nested loop does. Keep in mind that json_encode() is sensitive to key order, so two sub-arrays with the same pairs in a different order produce different hashes.
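One thing to watch with hashing is that json_encode() reflects key order. The sketch below is one possible way to make the hash order-insensitive by sorting keys recursively before encoding. The helpers normalize_keys() and get_normalized_hash() are hypothetical names introduced here, not part of the original approach.

```php
<?php
// Hypothetical helper: sort keys at every level so key order does not affect the hash.
function normalize_keys(array $array): array {
    foreach ($array as $key => $value) {
        if (is_array($value)) {
            $array[$key] = normalize_keys($value);
        }
    }
    ksort($array);
    return $array;
}

// Hypothetical variant of get_hash() that hashes the normalized form.
function get_normalized_hash(array $array): string {
    return md5(json_encode(normalize_keys($array)));
}

$a = [
    ['id' => 1, 'name' => 'Alice'],
    ['name' => 'Bob', 'id' => 2], // keys in a different order than in $b
];
$b = [
    ['id' => 2, 'name' => 'Bob'],
];

$a_hashed = array_map('get_normalized_hash', $a);
$b_hashed = array_map('get_normalized_hash', $b);

$result = [];
foreach (array_diff($a_hashed, $b_hashed) as $key => $hash) {
    $result[] = $a[$key];
}
print_r($result); // only the Alice row
```

Sorting adds some work per element, so whether it is worth it depends on whether your data can actually arrive with differing key orders.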
For example, suppose you're developing an API service that needs to compare two user permission lists:
$oldPermissions = [
    ['module' => 'user', 'access' => 'read'],
    ['module' => 'admin', 'access' => 'write'],
];

$newPermissions = [
    ['module' => 'user', 'access' => 'read'],
];

$removed = array_diff_recursive($oldPermissions, $newPermissions);
// $removed: [['module' => 'admin', 'access' => 'write']]
// You can send an email to the administrator about the permission changes
Or, when handling configuration version differences, you can use these methods to determine whether configuration items have been added or removed.
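A rough sketch of the configuration case: running the diff in both directions gives the removed and the added items. The helper nested_diff() and the sample configuration entries are made up for illustration.

```php
<?php
// Hypothetical helper: elements of $a that have no loosely equal (==) match in $b.
function nested_diff(array $a, array $b): array {
    $diff = [];
    foreach ($a as $itemA) {
        if (!in_array($itemA, $b)) { // in_array() compares with == by default
            $diff[] = $itemA;
        }
    }
    return $diff;
}

// Made-up configuration versions for illustration.
$oldConfig = [
    ['key' => 'cache', 'value' => 'redis'],
    ['key' => 'debug', 'value' => 'off'],
];
$newConfig = [
    ['key' => 'cache', 'value' => 'redis'],
    ['key' => 'queue', 'value' => 'sqs'],
];

$removed = nested_diff($oldConfig, $newConfig); // in old, not in new
$added   = nested_diff($newConfig, $oldConfig); // in new, not in old

print_r($removed); // the 'debug' entry
print_r($added);   // the 'queue' entry
```

An item whose value changed between versions shows up in both lists, once with the old value and once with the new one.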
While array_diff() works well for simple arrays, it cannot meaningfully compare nested structures, because every sub-array is reduced to the same string. Fortunately, PHP offers tools such as array_udiff(), and custom functions can cover more specific needs.
Writing good array comparison logic not only avoids data errors but also makes your system more stable. We hope you can make good use of these techniques in your own projects to make your code more robust!