Is it safe to use array_diff() to process floating point arrays?

M66 2025-06-06

In PHP, array_diff() is a practical function for comparing arrays: it returns the elements of the first array that do not appear in any of the other arrays. This rarely causes trouble with strings and integers, but with floating-point arrays some "unexpected" behavior can occur because of the limited precision of floating-point numbers.

Basic usage of array_diff()

Let's start with a simple example:

<?php
$a = [1.1, 2.2, 3.3];
$b = [2.2, 3.3];

$result = array_diff($a, $b);
print_r($result);

The output is:

Array
(
    [0] => 1.1
)

This result matches expectations, because 2.2 and 3.3 are written as identical literals in both arrays. Once the values come from arithmetic, however, precision limits can make the comparison inside array_diff() behave unexpectedly.

Problems caused by floating point accuracy

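Binary floating point cannot represent most decimal fractions exactly, so arithmetic results carry small errors. What array_diff() makes of those errors depends on how PHP converts floats to strings, which is controlled by the precision ini setting. The example below sets it to -1 so that every significant digit is kept:

<?php
ini_set('precision', '-1'); // Keep every significant digit when floats become strings

$a = [0.1 + 0.2]; // The actual value is 0.30000000000000004
$b = [0.3];

$result = array_diff($a, $b);
print_r($result);

The output is:

Array
(
    [0] => 0.30000000000000004
)

You might expect 0.1 + 0.2 == 0.3, but in binary floating point the two sides differ in the last bit, so the equation is false. With full precision, array_diff() sees two different values and reports a difference that is logically not there. Under PHP's default precision of 14, both values are rendered as 0.3 and the same call returns an empty array, so the result changes with configuration rather than with the data.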
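PHP's precision ini setting controls how many significant digits are kept when a float is converted to a string. The sketch below converts the same sum under two settings. It assumes precision can be changed at runtime with ini_set(); the variable names are illustrative only.

```php
<?php
// Sketch: the same float rendered under two `precision` settings.
$sum = 0.1 + 0.2;

// As doubles, the sum and the literal 0.3 are different values.
var_dump($sum == 0.3); // bool(false)

ini_set('precision', '14'); // PHP's default
$short = (string) $sum;     // "0.3"

ini_set('precision', '-1'); // Shortest representation that round-trips
$full = (string) $sum;      // "0.30000000000000004"

echo $short, PHP_EOL, $full, PHP_EOL;
```

The value in memory never changes; only its string form does, and that string form is what a string-based comparison sees.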

Why does array_diff() fail?

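array_diff() does not use loose comparison (==). According to the PHP manual, two elements are considered equal only when (string) $elem1 === (string) $elem2, that is, when their string representations are identical. For floats, that string is produced according to the precision ini setting. As a result, two numbers that are logically "equal" can yield different strings when many digits are kept, and two genuinely different numbers can yield the same string when few digits are kept. In neither case do you control the tolerance.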
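The string-based rule can be modeled in a few lines. The function string_based_diff() below is a hypothetical helper written for illustration, not PHP's actual implementation; it assumes PHP 7.4 or later for the arrow function.

```php
<?php
// Hypothetical helper, for illustration only: a simplified model of how
// array_diff() decides equality, based on the documented string comparison.
function string_based_diff(array $first, array $second): array
{
    // Convert every value of the second array to its string form once.
    $exclude = array_map('strval', $second);

    // Keep the elements of $first whose string form is not in $exclude.
    // array_filter() preserves the original keys, as array_diff() does.
    return array_filter(
        $first,
        fn($value) => !in_array((string) $value, $exclude, true)
    );
}

$a = [1, 2.50, '3', 4.0];
$b = ['1', 2.5, 3];

print_r(string_based_diff($a, $b)); // Only the element at key 3 (4.0) remains
print_r(array_diff($a, $b));        // The built-in function gives the same result
```

Types play no role here: the integer 1 matches the string '1', and 2.50 matches 2.5, because only the string forms are compared.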

Safety hazard: data judgment errors

When processing financial data, sensor data, or anything else that requires exact calculations, this behavior of array_diff() can lead to:

Data wrongly identified as present or missing
Logic errors in business decision branches
Failure to synchronize or compare data differences correctly

This is not only a code bug; it may even become a business security issue.

Solution: Use custom comparison logic

PHP provides array_udiff(), which accepts a user-supplied comparison function. With it you can implement a safer, tolerance-based difference for floating-point values:

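<?php
// Treat two floats as equal when they differ by less than a small tolerance.
function float_compare($a, $b) {
    $epsilon = 0.00001; // Accuracy tolerance
    if (abs($a - $b) < $epsilon) {
        return 0; // Equal within tolerance
    }
    return ($a < $b) ? -1 : 1; // Otherwise order numerically
}

$a = [0.1 + 0.2];
$b = [0.3];

$result = array_udiff($a, $b, 'float_compare');
print_r($result);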

The output is:

Array
(
)

This time, array_udiff() treats the two values as "equal" because their difference is smaller than the tolerance, so the floating-point error no longer produces a false difference.
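The tolerance does not have to be hard-coded. As a minimal sketch, the hypothetical factory make_float_comparator() below returns a comparison callback for a caller-chosen epsilon; the function name and the sample data are assumptions made for illustration.

```php
<?php
// Hypothetical helper: builds a comparison callback with a chosen tolerance.
function make_float_comparator(float $epsilon): callable
{
    return function (float $a, float $b) use ($epsilon): int {
        if (abs($a - $b) < $epsilon) {
            return 0;     // Within tolerance: treat as equal
        }
        return $a <=> $b; // Otherwise order numerically (-1 or 1)
    };
}

$measured = [0.1 + 0.2, 1.5, 2.0000001];
$expected = [0.3, 2.0];

// Elements of $measured with no counterpart in $expected within 1e-5.
$missing = array_udiff($measured, $expected, make_float_comparator(1e-5));
print_r($missing);
```

Note that a tolerance-based comparison is not a strict ordering: a can match b and b can match c while a and c differ. It is therefore safest to choose an epsilon well below the spacing between values that must stay distinct.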

Tips: Another way to ensure consistent accuracy

When you cannot use array_udiff() or a custom function, there is also a roundabout approach: round the floating-point numbers to a fixed number of decimal places before comparing them:

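<?php
// Normalize both arrays to 5 decimal places before comparing.
$a = array_map(function($v) {
    return round($v, 5);
}, [0.1 + 0.2]);

$b = array_map(function($v) {
    return round($v, 5);
}, [0.3]);

$result = array_diff($a, $b);
print_r($result); // Empty array: both values round to 0.3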

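This method avoids most precision problems, but it needs to be used with caution: rounding is not the same as a tolerance check, because two values that are very close together can still fall on opposite sides of a rounding boundary and end up different.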
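One reason for caution can be shown with a short sketch. The two sample values below are chosen for illustration: they are closer together than the 0.00001 tolerance used earlier, yet they round to different results at 5 decimal places.

```php
<?php
// Two values only 0.000002 apart, on either side of a rounding boundary.
$x = 0.123454;
$y = 0.123456;

var_dump(abs($x - $y) < 0.00001);        // bool(true): within the tolerance
var_dump(round($x, 5) === round($y, 5)); // bool(false): 0.12345 vs 0.12346

// After rounding, array_diff() therefore reports a difference.
$diff = array_diff([round($x, 5)], [round($y, 5)]);
print_r($diff);
```

If such near-boundary pairs must be treated as equal, a tolerance-based callback with array_udiff() is the more predictable choice.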

Summary

When using array_diff() on floating-point arrays, pay close attention to how PHP handles floats, in particular precision errors and the way floats are converted to strings for comparison. Comparing floating-point numbers directly can lead to logic errors and even security risks. To keep results accurate, use array_udiff() with a comparison function that applies an explicit tolerance, or normalize the data to a uniform precision before comparing.

In businesses involving important data, any seemingly minor error should not be ignored.