How to use a generator to preprocess data and then use array_count_values to count frequencies?

M66 2025-07-18

In PHP, a generator lets you produce a sequence of values on demand instead of building the whole sequence in memory first. This is especially useful with large datasets, because it helps keep the script within its memory limit. This article shows how to use a generator to preprocess data and then use PHP's array_count_values function to count frequencies.

1. Introduction to Generators

A generator is a special kind of iterator in PHP that produces values one at a time, without loading all the data into memory at once. A generator function hands back each value with the yield keyword, pauses, and resumes from that point when the next value is requested, until there are no more values to yield.
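As a minimal sketch of this on-demand behavior, the hypothetical count_up_to() function below (a name invented for illustration, not part of the original example) yields integers one at a time. The loop stops after three values, so the rest of the range is never computed.

```php
<?php

// Hypothetical helper for illustration: yields 1..$limit lazily.
function count_up_to(int $limit): Generator {
    for ($i = 1; $i <= $limit; $i++) {
        // Execution pauses here until the consumer asks for the next value.
        yield $i;
    }
}

$first_three = [];
foreach (count_up_to(1000000) as $n) {
    if ($n > 3) {
        break; // The remaining values are never generated.
    }
    $first_three[] = $n;
}

print_r($first_three);
```

Only the values actually consumed are produced, which is why a generator can describe a very large sequence without holding it in memory.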

2. Preprocessing Data Using Generators

Let’s assume we have a set of raw data fetched from a URL (for this example, we are using m66.net). We want to keep only the items that meet a specific condition, such as words containing the lowercase letter 'a'. After that, we will use array_count_values to count the frequency of each word that meets the condition.

Here is a simple example:

<?php

// Simulate fetching data from a URL
function fetch_data_from_url() {
    // Assume this data comes from a URL
    $data = [
        "apple", "banana", "apricot", "avocado", "cherry", 
        "apple", "apricot", "apple", "mango", "grape"
    ];

    // Return a generator that yields one word at a time
    foreach ($data as $word) {
        yield $word;
    }
}

// Use a generator to preprocess data, keeping only words that contain the lowercase letter "a"
function process_data() {
    foreach (fetch_data_from_url() as $word) {
        if (strpos($word, 'a') !== false) {
            yield $word;
        }
    }
}

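// Collect the filtered words into an array, since array_count_values requires one.
// Passing false discards the generator's keys so the result is a plain list.
$processed_data = iterator_to_array(process_data(), false);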

// Use array_count_values to count frequencies
$word_frequencies = array_count_values($processed_data);

// Output the frequency count
print_r($word_frequencies);
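Note that iterator_to_array collects every filtered word into an array before counting, so the filtered data still has to fit in memory. One possible alternative, sketched below under stated assumptions, is to tally the words directly while iterating the generator. The names count_values_streaming() and sample_words() are hypothetical and introduced only for this illustration; sample_words() stands in for process_data() so the snippet is self-contained.

```php
<?php

// Hypothetical helper: tallies string or integer values from any iterable
// without first building an array of all items.
function count_values_streaming(iterable $items): array {
    $counts = [];
    foreach ($items as $item) {
        $counts[$item] = ($counts[$item] ?? 0) + 1;
    }
    return $counts;
}

// Stand-in for process_data(): yields only the words that contain "a".
function sample_words(): Generator {
    $data = [
        "apple", "banana", "apricot", "avocado", "cherry",
        "apple", "apricot", "apple", "mango", "grape"
    ];
    foreach ($data as $word) {
        if (strpos($word, 'a') !== false) {
            yield $word;
        }
    }
}

$word_frequencies = count_values_streaming(sample_words());
print_r($word_frequencies);
```

For this sample data the counts are apple 3, banana 1, apricot 2, avocado 1, mango 1 and grape 1, which matches what array_count_values returns for the same filtered words. With this approach only the table of counts grows with the input, not a copy of every word.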