In PHP, array_column is commonly used to extract a single column from a multi-dimensional array. Each call traverses the source array once, so calling it repeatedly for the same column, especially on large arrays, repeats work that has already been done. Without caching the results, these repeated traversals add unnecessary overhead. This article explores how to avoid repeated traversal when using array_column by caching the extracted columns, thereby improving performance.
Consider the following sample code:
$array = [
['id' => 1, 'name' => 'Alice', 'age' => 25],
['id' => 2, 'name' => 'Bob', 'age' => 30],
['id' => 3, 'name' => 'Charlie', 'age' => 35]
];
$names = array_column($array, 'name');
In this example, array_column iterates over $array once and extracts the name value from each sub-array. A single call is not particularly expensive, but every call performs a full traversal. If the same column is requested repeatedly, or array_column is called many times on a large array, those redundant traversals add up and hurt performance.
To avoid repeated traversal, we can store the result in a cache the first time a column is extracted. The next time the same column is needed, it is read directly from the cache without calling array_column again.
The easiest way to cache is to use an array to save the already extracted columns. For example:
// Initialize cache array
$cache = [];
function getColumnFromCache($array, $column, &$cache) {
// Check if the column is already in the cache
if (!isset($cache[$column])) {
// Not cached yet: extract the column with array_column and cache the result
$cache[$column] = array_column($array, $column);
}
return $cache[$column];
}
$array = [
['id' => 1, 'name' => 'Alice', 'age' => 25],
['id' => 2, 'name' => 'Bob', 'age' => 30],
['id' => 3, 'name' => 'Charlie', 'age' => 35]
];
// Get the 'name' column (the first call populates the cache)
$names = getColumnFromCache($array, 'name', $cache);
print_r($names);
// Get the 'name' column again; it is served from the cache, avoiding another traversal
$namesAgain = getColumnFromCache($array, 'name', $cache);
print_r($namesAgain);
In this example, getColumnFromCache first checks whether the requested column is already stored in the $cache array. If it is, the cached result is returned directly; otherwise, array_column is called to extract the column, and the result is stored in the cache before being returned.
If your application is large and the same columns are needed across multiple requests, an in-process array is not enough, because in the usual PHP request model it is discarded at the end of each request. In that case, consider a shared cache such as Redis or Memcached.
For example, column data can be cached in Redis like this:
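One limitation of the helper above is that the cache is keyed by column name only, so passing a different array with the same column name would return the earlier array's data. A minimal sketch of one way around this is to bind the cache to a single array. The ColumnCache class and its extractions counter are hypothetical names introduced here for illustration, and the constructor syntax assumes PHP 8.0 or later:

```php
<?php
// Hypothetical helper class for illustration; not part of PHP itself.
final class ColumnCache
{
    /** @var array<string, array> extracted columns, keyed by column name */
    private array $columns = [];

    /** @var int how many times array_column() actually ran */
    public int $extractions = 0;

    // Constructor property promotion (PHP 8.0+): the rows are stored once
    public function __construct(private array $rows) {}

    public function get(string $column): array
    {
        // Extract the column only on the first request for it
        if (!isset($this->columns[$column])) {
            $this->columns[$column] = array_column($this->rows, $column);
            $this->extractions++;
        }
        return $this->columns[$column];
    }
}

$users = new ColumnCache([
    ['id' => 1, 'name' => 'Alice', 'age' => 25],
    ['id' => 2, 'name' => 'Bob', 'age' => 30],
    ['id' => 3, 'name' => 'Charlie', 'age' => 35],
]);

$names      = $users->get('name'); // traverses the rows
$namesAgain = $users->get('name'); // served from the cache
$ages       = $users->get('age');  // a different column, so one more traversal
```

Because each ColumnCache instance owns its rows, two datasets can never share a cached column by accident, and the caller no longer has to pass a cache array by reference.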
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);
function getColumnFromRedis($array, $column, $redis) {
// Check whether the column is already cached in Redis
$cachedColumn = $redis->get($column);
if ($cachedColumn === false) {
// Not cached in Redis yet: extract the column with array_column
$cachedColumn = json_encode(array_column($array, $column));
// Save the result in Redis
$redis->set($column, $cachedColumn);
}
return json_decode($cachedColumn, true);
}
// Sample data
$array = [
['id' => 1, 'name' => 'Alice', 'age' => 25],
['id' => 2, 'name' => 'Bob', 'age' => 30],
['id' => 3, 'name' => 'Charlie', 'age' => 35]
];
// Get the 'name' column
$names = getColumnFromRedis($array, 'name', $redis);
print_r($names);
Caching column data in Redis this way avoids recomputing the same column on every request, which helps most in frequent-access scenarios. Note that the cached entry here is keyed by column name only and never expires, so it must be invalidated whenever the underlying data changes.
With a caching strategy in place, repeated traversals of the same data by array_column can be avoided, which improves performance. For small applications, a simple in-memory cache (such as an array) is sufficient; for large applications, a cache system such as Redis or Memcached lets the cached columns be shared across requests and improves scalability. The benefit of choosing the right caching scheme grows with the amount of data being processed.
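The Redis example stores each column under its bare name with no expiry, so unrelated datasets could collide and stale data would live forever. The following is a rough sketch of one possible refinement, using a namespaced key and a time-to-live. The function name getColumnCached, the "column:dataset:name" key layout, the $dataset label, and the 300-second default are all assumptions made for illustration. ArrayStore is a hypothetical in-memory stand-in that mimics only the get and setex methods of the phpredis client, so the sketch runs without a Redis server:

```php
<?php
// Hypothetical in-memory stand-in for the two phpredis methods used below.
// With phpredis, pass a connected Redis instance instead.
final class ArrayStore
{
    private array $data = [];

    public function get(string $key)
    {
        // Like phpredis, return false when the key does not exist
        return $this->data[$key] ?? false;
    }

    public function setex(string $key, int $ttl, string $value): bool
    {
        $this->data[$key] = $value; // this stand-in ignores the TTL
        return true;
    }
}

function getColumnCached(object $store, string $dataset, array $rows, string $column, int $ttl = 300): array
{
    // Namespaced key (assumed layout), e.g. "column:users:name"
    $key = "column:{$dataset}:{$column}";

    $cached = $store->get($key);
    if ($cached === false) {
        // Cache miss: extract the column and store it with an expiry
        $values = array_column($rows, $column);
        $store->setex($key, $ttl, json_encode($values));
        return $values;
    }

    // Cache hit: decode the stored JSON back into a PHP array
    return json_decode($cached, true);
}

$store = new ArrayStore();
$rows = [
    ['id' => 1, 'name' => 'Alice', 'age' => 25],
    ['id' => 2, 'name' => 'Bob', 'age' => 30],
    ['id' => 3, 'name' => 'Charlie', 'age' => 35],
];

$first  = getColumnCached($store, 'users', $rows, 'name'); // miss: extracts and stores
$second = getColumnCached($store, 'users', [], 'name');    // hit: the rows are not consulted
```

The second call passes an empty array on purpose to show that a hit never touches the source rows. With a real Redis server the entry would be evicted once the TTL elapses, which bounds how long stale data can be served.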