← Back to blog
· 5 min read

PHP Read CSV Files: From fgetcsv() Headaches to Modern Libraries

My 2015 Stack Overflow answer showed how to read CSV in PHP with fgetcsv(). In 2026, League\Csv and generators handle the edge cases that raw functions miss.

php csv stackoverflow data

PHP Read CSV Files: From fgetcsv() Headaches to Modern Libraries

In 2015, I answered a question on Stack Overflow in Portuguese about reading CSV files in PHP. It scored 9 upvotes — a bread-and-butter task that every PHP developer hits at some point.

The answer showed fgetcsv(), and it worked. But “worked” and “worked well in production with real-world CSV files” are very different things.

The 2015 Approach: fgetcsv() and Hope

The standard pattern was straightforward:

// 2015: Basic CSV reading
$handle = fopen('data.csv', 'r');
while (($row = fgetcsv($handle, 1000, ',')) !== false) {
    echo $row[0] . ' - ' . $row[1] . "\n";
}
fclose($handle);

Clean. Simple. And it broke the moment someone handed you a CSV file exported from Excel on a Brazilian Portuguese Windows machine.

The Encoding Nightmare

fgetcsv() reads bytes. It has no concept of character encoding. A CSV file saved as Windows-1252 (common in Latin America) would produce garbled text when you tried to display it as UTF-8:

// 2015: The encoding dance
$handle = fopen('data.csv', 'r');
while (($row = fgetcsv($handle, 0, ';')) !== false) {
    // Brazilian CSVs often use semicolons and Windows encoding
    $row = array_map(function($field) {
        return mb_convert_encoding($field, 'UTF-8', 'Windows-1252');
    }, $row);
    // Now process $row...
}

And then there was the UTF-8 BOM (Byte Order Mark). Excel loves prepending \xEF\xBB\xBF to UTF-8 files. Your first field in the first row would silently include three invisible bytes, breaking string comparisons:

// 2015: Stripping the BOM manually
$firstLine = fgets($handle);
$firstLine = preg_replace('/^\xEF\xBB\xBF/', '', $firstLine);
rewind($handle);

The Delimiter Guessing Game

fgetcsv() defaults to comma as a delimiter. But CSV isn’t really “comma-separated” — it’s “whatever-the-exporting-program-decided-separated.” Brazilian Excel uses semicolons. Some European systems use tabs. Some use pipes. You had to know in advance or detect it yourself:

// 2015: Sniff the delimiter
$firstLine = fgets($handle);
rewind($handle);
$delimiters = [',', ';', "\t", '|'];
$counts = [];
foreach ($delimiters as $d) {
    $counts[$d] = substr_count($firstLine, $d);
}
$delimiter = array_keys($counts, max($counts))[0];

Fragile? Absolutely. But we all wrote some version of this.

The 2026 Approach: Libraries That Handle Reality

League\Csv

The PHP ecosystem’s answer to CSV handling matured significantly. League\Csv is now the go-to:

// 2026: League\Csv handles the ugly parts
use League\Csv\Reader;
use League\Csv\CharsetConverter;

$reader = Reader::createFromPath('data.csv', 'r');
$reader->setHeaderOffset(0); // First row as headers

// Automatic charset conversion
$encoder = (new CharsetConverter())
    ->inputEncoding('Windows-1252')
    ->outputEncoding('UTF-8');
$reader->addFormatter($encoder);

foreach ($reader->getRecords() as $record) {
    // $record is an associative array keyed by headers
    echo $record['nome'] . ' - ' . $record['email'];
}

What League\Csv gives you:

  • BOM handling — detects and strips it automatically
  • Charset conversion — built-in converter, no manual mb_convert_encoding
  • Header mapping — access fields by name, not index
  • Stream filtering — process without loading the entire file
  • Validation — column count consistency checks

SplFileObject for Simpler Cases

If you don’t want a dependency, PHP’s SPL classes improved the ergonomics:

// 2026: SplFileObject — built into PHP
$file = new SplFileObject('data.csv');
$file->setFlags(SplFileObject::READ_CSV | SplFileObject::SKIP_EMPTY);
$file->setCsvControl(';'); // Set delimiter

foreach ($file as $row) {
    if ($row[0] !== null) {
        echo implode(' | ', $row) . "\n";
    }
}

PHP Generators for Large Files

For files that don’t fit in memory — and in production, they always grow larger than expected:

// 2026: Generator-based reading for memory efficiency
function readCsv(string $path, string $delimiter = ','): Generator {
    $handle = fopen($path, 'r');
    $headers = fgetcsv($handle, 0, $delimiter);

    while (($row = fgetcsv($handle, 0, $delimiter)) !== false) {
        yield array_combine($headers, $row);
    }

    fclose($handle);
}

// Process a 2GB file with constant memory usage
foreach (readCsv('huge-export.csv', ';') as $record) {
    processRecord($record);
}

Generators were available in PHP 5.5 (2013), but in 2015 most codebases hadn’t adopted them yet. By 2026, they’re a standard pattern for file processing.

The Bigger Picture: Is PHP Still the Right Tool?

For CSV processing specifically, the landscape shifted. If I’m doing data analysis or transformation in 2026, I’d honestly reach for:

  • Python pandaspd.read_csv() handles encoding, delimiters, dtypes, and missing values in one call
  • Node.js csv-parse — streaming parser that integrates well with modern JS pipelines
  • DuckDB — query CSV files directly with SQL, no loading step needed
# Python: One line does what took 20 lines in PHP
import pandas as pd
df = pd.read_csv('data.csv', encoding='latin-1', sep=';')

But PHP isn’t going anywhere for web applications that need to accept CSV uploads and process them on the server. League\Csv handles that use case well.

What Changed

Aspect2015 (fgetcsv)2026 (Modern)
EncodingManual mb_convert_encodingBuilt-in charset converter
BOMManual regex strippingAutomatic detection
HeadersAccess by numeric indexAccess by column name
MemoryLoad everything or manage handlesGenerators + streaming
DelimiterGuess or hardcodeAuto-detection in libraries
ValidationDIYColumn count checks, type casting

The Takeaway

fgetcsv() still works. I wouldn’t tell anyone to stop using it for simple scripts. But modern PHP libraries handle the edge cases that raw file functions don’t — encoding detection, BOM stripping, header mapping, memory-efficient streaming. The 2015 answer solved the question asked. The 2026 answer would include a warning: if your CSV comes from the real world, use a library that’s already seen every weird export format Excel can produce.

Related Posts