In PHP, array_diff() is a practical function for comparing arrays: it returns the values from the first array that do not appear in any of the other arrays. This rarely causes trouble with strings and integers, but with arrays of floating-point numbers some "unexpected" behavior can occur because of the limited precision of floats.
Let's start with a simple example:
<?php
$a = [1.1, 2.2, 3.3];
$b = [2.2, 3.3];
$result = array_diff($a, $b);
print_r($result);
The output is:
Array
(
[0] => 1.1
)
This result is as expected. However, when a float is the result of arithmetic rather than a literal, precision limits can make array_diff() compare it in a way you may not expect.
Floating-point numbers cannot represent certain decimal fractions exactly, so small errors creep in. For example:
<?php
// array_diff() compares the string form of each value, and float-to-string
// conversion follows the precision setting. 17 digits expose the difference.
ini_set('precision', '17');
$a = [0.1 + 0.2]; // Stored as 0.30000000000000004
$b = [0.3]; // Stored as 0.29999999999999999 (to 17 digits)
$result = array_diff($a, $b);
print_r($result);
The output is:
Array
(
[0] => 0.30000000000000004
)
You might expect 0.1 + 0.2 == 0.3, but in binary floating point the sum is slightly larger than 0.3, so the equation does not hold.
array_diff() does not use == at all, though. According to the PHP manual, two elements are considered equal only if (string) $elem1 === (string) $elem2. Whether two floats count as equal therefore depends on how they are converted to strings, which is governed by the precision setting. With 17 significant digits, as above, the two values produce different strings, and array_diff() reports a difference between numbers that are logically "equal". With the default precision of 14, both become "0.3" and are treated as equal, which in turn can hide small differences that actually matter.
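According to the PHP manual, array_diff() treats two elements as equal when their string representations are identical, so for floats the outcome follows the precision ini setting. The following is a minimal sketch for checking this, assuming PHP 7.1 or later and a setup where ini_set() is allowed to change precision at runtime. The variable names are illustrative only.

```php
<?php
// Sketch: the same array_diff() call under two precision settings.
$a = [0.1 + 0.2];
$b = [0.3];

ini_set('precision', '14'); // PHP's default
$withDefault = array_diff($a, $b);

ini_set('precision', '17'); // enough digits to tell the two doubles apart
$withSeventeen = array_diff($a, $b);

var_dump(count($withDefault));   // int(0): both values stringify to "0.3"
var_dump(count($withSeventeen)); // int(1): the strings differ

// A numeric-string pair that == treats as equal but array_diff() does not.
var_dump("1e1" == "10");                      // bool(true)
var_dump(count(array_diff(["1e1"], ["10"]))); // int(1)
```

The last two lines show that the rule is string identity rather than loose comparison: "1e1" and "10" are numerically equal, yet array_diff() keeps "1e1" because the strings differ.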
In financial data, sensor data, or other scenarios that require precise calculations, this behavior of array_diff() can lead to:
Incorrect detection of whether a value exists
Logic errors in business decision branches
Failure to synchronize or compare data sets correctly
This is more than a coding bug; it can become a business security issue.
PHP provides array_udiff(), which accepts a user-defined comparison function, so you can compare floats with an explicit tolerance:
<?php
// Comparison callback for array_udiff(): returns 0 when the two values
// are within $epsilon of each other, otherwise -1 or 1.
function float_compare($a, $b) {
    $epsilon = 0.00001; // Accuracy tolerance
    if (abs($a - $b) < $epsilon) {
        return 0;
    }
    return ($a < $b) ? -1 : 1;
}
$a = [0.1 + 0.2];
$b = [0.3];
$result = array_udiff($a, $b, 'float_compare');
print_r($result);
The output is:
Array
(
)
This time, array_udiff() correctly identifies that the two are "equal", avoiding the problems caused by floating point error.
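The same idea can be wrapped in a reusable function. The sketch below is only an illustration: float_array_diff() is a hypothetical helper name, not a PHP built-in, and it assumes PHP 7.0 or later and an absolute tolerance that suits the scale of your data. It also assumes distinct values are separated by much more than the tolerance, since "equal within epsilon" is not transitive (a close to b and b close to c does not imply a close to c).

```php
<?php
// Hypothetical helper (not part of PHP): array_udiff() with an absolute tolerance.
function float_array_diff(array $a, array $b, float $epsilon = 0.00001): array
{
    return array_udiff($a, $b, function ($x, $y) use ($epsilon): int {
        if (abs($x - $y) < $epsilon) {
            return 0; // close enough to count as equal
        }
        return $x <=> $y; // otherwise order normally
    });
}

$measured = [0.1 + 0.2, 1.5, 2.7];
$expected = [0.3, 2.7];

// Keys from the first array are preserved, so 1.5 keeps key 1.
print_r(float_array_diff($measured, $expected));
```

Passing the tolerance as a parameter keeps the choice of epsilon visible at the call site instead of burying it inside the callback.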
When you cannot use array_udiff() or a custom comparison function, there is a roundabout alternative: round the floating-point numbers to a fixed number of decimal places before comparing them:
<?php
$a = array_map(function($v) {
    return round($v, 5);
}, [0.1 + 0.2]);
$b = array_map(function($v) {
    return round($v, 5);
}, [0.3]);
$result = array_diff($a, $b);
print_r($result);
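This method also avoids most precision problems, but it needs to be used with caution: two values that lie very close together but on opposite sides of a rounding boundary still round to different results and are reported as different.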
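The caution about rounding can be made concrete. The following sketch, assuming PHP 7.4 or later (for the arrow function) and the default precision setting, uses two made-up values that differ by only 0.000002 and compares them once by rounding and once with a tolerance.

```php
<?php
// Two values that differ by 0.000002, well inside a 0.00001 tolerance.
$a = [1.000004];
$b = [1.000006];

$round5 = fn(float $v): float => round($v, 5);

// Rounding sends them to different sides of a boundary: 1.0 vs 1.00001.
$viaRounding = array_diff(array_map($round5, $a), array_map($round5, $b));

// A tolerance-based comparison treats them as equal.
$viaTolerance = array_udiff($a, $b, function (float $x, float $y): int {
    return abs($x - $y) < 0.00001 ? 0 : $x <=> $y;
});

var_dump(count($viaRounding));  // int(1): reported as different
var_dump(count($viaTolerance)); // int(0): reported as equal
```

Rounding is therefore best suited to data that already has a fixed number of meaningful decimal places, where values near a boundary are not expected.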
When using array_diff() on arrays of floating-point numbers, pay close attention to how PHP handles floats, especially the effect of precision errors. Comparing floats directly can cause logic errors and even security risks. To keep comparisons reliable, use array_udiff() with a comparison function that applies an explicit tolerance, or normalize the data (for example by rounding) before comparing.
In systems that handle important data, no seemingly minor error should be ignored.