Merging CSV Files with Powershell

Powershell is really useful for documenting and managing your servers but it’s also a pretty good tool for working with data. I’ve been using it to merge csv files, with an identical structure, into a single file. Now this is pretty easy, if rather tedious, to do using SQL Server Import / Export functionality or with SSIS. Powershell makes this a snap!

In this example I have two csv files in a directory

csv files

These just contain some simple name and age data.

Content of my test csv files

This simple script will produce a third file, merging the contents of file1.csv and file2.csv.

?View Code POWERSHELL
# Author: Rhys Campbell
# 2009-08-22
# Merging identical csv files
 
# Directory containing csv files, include *.*
$directory = "C:\Users\Rhys\Desktop\csv\*.*";
# Get the csv files
$csvFiles = Get-ChildItem -Path $directory -Filter *.csv;
 
# Updated 01/03/2010. Thanks to comment from Chris.
# Resolves error Method invocation failed because [System.Management.Automation.PSObject] doesn't contain a method named 'op_Addition'.
#$content = $null;
$content = @();
 
# Process each file
foreach($csv in $csvFiles)
{
	$content += Import-Csv $csv;
}
 
# Write a datetime stamped csv file
$datetime = Get-Date -Format "yyyyMMddhhmmss";
$content | Export-Csv -Path "C:\Users\Rhys\Desktop\csv\merged_$datetime.csv" -NoTypeInformation;

If all goes as planned a new file will be created.

csv folder with new file

This file will contain data from both csv files.

The final merged csv file

This is obviously a fairly trivial example but it’s a massive timesaver when you have many such files to merge. Word of warning, if you’ve got very big files, you may want to change the script to use Add-Content, to flush each csv file to disk in the foreach loop, to avoid munching up all your RAM.


11 Comments

  1. Chris says:

    Hi, I got the following error running the script:

    Method invocation failed because [System.Management.Automation.PSObject] doesn’t contain a method named ‘op_Addition’.

    I found through another blog that “$content = null” needs to be changed to “content = @().”

    Thanks!

  2. Rhys says:

    Thanks Chris. I get the same error when trying this now. Perhaps a version issue. Anyway, I’ve updated the script.

    Cheers,

    Rhys

  3. John Nash says:

    Thanks for the tip, it worked very well.

  4. James says:

    how about this simple concatenate:

    cd ./dir
    cat *.csv > all.csv

  5. Rhys says:

    Hi James,

    Cool idea but that goes into an infinite loop filling up the csv file until you stop it. Not the behavior I’d expect.

    Cheers,

    Rhys

  6. Rhys says:

    Hi again James,

    Think we’ve found a bug here. Output from a ps session…

    csv file contain 3 lines each…

    PS C:\Users\campbellr\test> ls
    
    
        Directory: C:\Users\campbellr\test
    
    
    Mode                LastWriteTime     Length Name
    ----                -------------     ------ ----
    -a---        24/11/2011     17:16         61 1.csv
    -a---        24/11/2011     17:16         61 2.csv
    -a---        24/11/2011     17:17         61 3.csv
    -a---        24/11/2011     17:43         61 4.csv
    
    
    PS C:\Users\campbellr\test> cat *.csv
    col1,col1,col1,col1
    col1,col1,col1,col1
    col1,col1,col1,col1
    col2,col2,col2,col2
    col2,col2,col2,col2
    col2,col2,col2,col2
    col3,col3,col3,col3
    col3,col3,col3,col3
    col3,col3,col3,col3
    col4,col4,col4,col4
    col4,col4,col4,col4
    col4,col4,col4,col4
    PS C:\Users\campbellr\test> cat *.csv > all.csv
    PS C:\Users\campbellr\test> cat *.csv
    col1,col1,col1,col1
    col1,col1,col1,col1
    col1,col1,col1,col1
    col2,col2,col2,col2
    col2,col2,col2,col2
    col2,col2,col2,col2
    col3,col3,col3,col3
    col3,col3,col3,col3
    col3,col3,col3,col3
    col4,col4,col4,col4
    col4,col4,col4,col4
    col4,col4,col4,col4
    col1,col1,col1,col1
    col1,col1,col1,col1
    col1,col1,col1,col1
    col2,col2,col2,col2
    col2,col2,col2,col2
    col2,col2,col2,col2
    col3,col3,col3,col3
    col3,col3,col3,col3
    col3,col3,col3,col3
    col4,col4,col4,col4
    col4,col4,col4,col4
    col4,col4,col4,col4
    

    The first cat shows the file contents as expected. The line with > carried on filling up all.csv until you ctrl + c it. The next cat command outputs the contents of the files onto the screen ad infinitum.

  7. Steve says:

    Very nice script. It’s a real time saver.

    I did run into one issue, though. My source script puts data in the files, but sometimes there is no data. When the merge script runs it fails if a source file content is null.

  8. Rhys says:

    Hi Steve,

    Probably easy enough to test for this. Does the file contain headers or is it completely empty?

    Cheers,

    Rhys

  9. Chandru N N says:

    easiest way:)

    dir C:UsersRhysDesktopcsv*.csv | foreach { Import-Csv -Path $_ } |
    Export-Csv -Path c:Joined.csv -NoTypeInformation

  10. k3r63r05 says:

    $directorioInicial = “D:\RespaldosAS400\”;
    $archivoFinal = $directorioInicial + “ArchivoCompleto.csv”;

    $cArchivos = 1;
    foreach($csv in $csvFiles)
    {
    if ($cArchivos -eq 1)
    { add-content -path $archivoFinal -value ( get-content $csv ) }
    else
    { add-content -path $archivoFinal -value ( get-content $csv | select -Skip 1 ) }

    $cArchivos += 1;
    }

  11. Rhys says:

    Hi There.

    Good idea! Took a while because I don’t speak Spanish!

    Cheers,

    Rhys

Leave a Reply