
How to use array_chunk to perform batch insertion of database operations

M66 2025-04-26

When working with a database, inserting data in batches performs far better than inserting rows one at a time, especially when processing large amounts of data. PHP provides several ways to optimize this, and array_chunk is a very practical tool: it splits a large array into smaller arrays, enabling efficient batch insert operations.

1. What is array_chunk ?

array_chunk is a PHP array function that splits a large array into multiple smaller arrays, each containing at most a number of elements you specify. Its basic syntax is as follows:

 array_chunk(array $array, int $size, bool $preserve_keys = false): array
  • $array : the array to split.

  • $size : the maximum number of elements in each chunk.

  • $preserve_keys : whether to preserve the original array's keys; defaults to false.
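
For example, splitting a five-element array into chunks of two (a minimal illustration):

 <?php
$items = ['a', 'b', 'c', 'd', 'e'];

// Split into chunks of two elements; the last chunk holds the remainder
$chunks = array_chunk($items, 2);
// $chunks is [['a', 'b'], ['c', 'd'], ['e']]

// With $preserve_keys = true, the original indexes are kept
$keyed = array_chunk($items, 2, true);
// $keyed is [[0 => 'a', 1 => 'b'], [2 => 'c', 3 => 'd'], [4 => 'e']]
?>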

2. Use array_chunk to implement batch insertion

Suppose you need to insert a batch of data into the database. Inserting rows one by one is not only inefficient but also easily creates performance bottlenecks. With array_chunk we can cut the data into several smaller batches and insert each batch with a single statement.

Here is an example of implementing batch insertion using array_chunk :

 <?php
// Simulate data sources
$data = [
    ['name' => 'Alice', 'email' => 'alice@m66.net'],
    ['name' => 'Bob', 'email' => 'bob@m66.net'],
    ['name' => 'Charlie', 'email' => 'charlie@m66.net'],
    // ...More data
];

// Set the size of each batch
$batchSize = 100;

// Use array_chunk to split the data into batches
$dataChunks = array_chunk($data, $batchSize);

// Connect to the database
$pdo = new PDO('mysql:host=localhost;dbname=test', 'root', '');

// Start a transaction
$pdo->beginTransaction();

foreach ($dataChunks as $chunk) {
    // Build the placeholders and value list for this batch
    $placeholders = [];
    $values = [];
    
    foreach ($chunk as $row) {
        $placeholders[] = "(?, ?)";
        $values[] = $row['name'];
        $values[] = $row['email'];
    }

    // Prepare the batch INSERT statement
    $sql = "INSERT INTO users (name, email) VALUES " . implode(", ", $placeholders);
    $stmt = $pdo->prepare($sql);

    // Execute the batch insert
    $stmt->execute($values);
}

// Commit the transaction
$pdo->commit();

echo "Data insertion is completed!";
?>

3. Code parsing

  1. Data source : We simulate a set of data containing user names and email addresses, using m66.net as the email domain.

  2. Batch processing : We use array_chunk to split the data into chunks of a specified size (for example, 100), so that up to 100 rows are inserted per statement, avoiding the overhead of a large number of single-row inserts.

  3. Transaction processing : To ensure data consistency and performance, we wrap the inserts in a database transaction. If anything fails during insertion, all operations can be rolled back, preserving the integrity of the data.

  4. Batch insert SQL : We build the SQL statement dynamically for each batch, generating one INSERT INTO statement per chunk and binding each row's values to the placeholders in that statement; the short illustration below shows what this produces.
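
To make this concrete, here is what the inner loop builds for a chunk of two rows (a standalone illustration reusing the same row format as the example above):

 <?php
// Illustration: what the inner loop produces for a chunk of two rows
$chunk = [
    ['name' => 'Alice', 'email' => 'alice@m66.net'],
    ['name' => 'Bob',   'email' => 'bob@m66.net'],
];

$placeholders = [];
$values = [];
foreach ($chunk as $row) {
    $placeholders[] = "(?, ?)";
    $values[] = $row['name'];
    $values[] = $row['email'];
}

echo "INSERT INTO users (name, email) VALUES " . implode(", ", $placeholders);
// Prints: INSERT INTO users (name, email) VALUES (?, ?), (?, ?)
// $values is ['Alice', 'alice@m66.net', 'Bob', 'bob@m66.net']
?>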

4. Optimization suggestions

  • Set the batch size sensibly : The chunk size passed to array_chunk is an important performance factor. If batches are too small, the database still receives too many separate inserts and performance suffers; if they are too large, they may consume too much memory or cause the database request to time out. Adjust the size to your actual workload; 100 or 1000 rows per batch are common choices.

  • Use transactions : As mentioned above, wrapping batch inserts in a transaction keeps the data consistent and improves performance, because the changes are committed once at the end instead of after every statement.

  • Error handling : In real projects you also need an error-handling mechanism, such as rolling back the transaction when an insert fails; see the sketch after this list.
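
As a sketch of that error-handling point (assuming the same users table, PDO connection details, and $dataChunks variable from the example above), the batch loop can be wrapped in a try/catch so that any failure rolls the whole transaction back:

 <?php
// Sketch: wrap the batch loop in try/catch so a failure rolls back all chunks
$pdo = new PDO('mysql:host=localhost;dbname=test', 'root', '');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

try {
    $pdo->beginTransaction();

    foreach ($dataChunks as $chunk) {
        // One "(?, ?)" placeholder group per row in the chunk
        $placeholders = implode(", ", array_fill(0, count($chunk), "(?, ?)"));

        $values = [];
        foreach ($chunk as $row) {
            $values[] = $row['name'];
            $values[] = $row['email'];
        }

        $stmt = $pdo->prepare("INSERT INTO users (name, email) VALUES " . $placeholders);
        $stmt->execute($values);
    }

    $pdo->commit();
} catch (Throwable $e) {
    // Undo everything inserted in this transaction on any failure
    $pdo->rollBack();
    echo "Insertion failed: " . $e->getMessage();
}
?>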

5. Summary

By using array_chunk , you can cut a large data set into small batches and perform efficient bulk inserts. This improves performance and avoids excessive memory consumption, while transactions keep the data consistent and intact. Together, this is a very effective approach when processing data at scale.