SSIS: Loading files into MySQL

Getting data out of MySQL with SSIS is a snap. Putting data into MySQL has been a different matter. I’ve always done this in the past with a hodgepodge of ODBC, Linked Servers, OPENQUERY and the Script Task. All of these work well but they’re just not as convenient as loading files with the OLED Destination.

I recently attempted to use the ADO.NET Destination to load files into MySQL without luck. I tweeted about this and Todd McDermid (blog | twitter) kindly pointed me at a blog post about writing data to MySQL with SSIS. This works well but I am having difficulties with data conversions. The author of the blog post, Matt Mason, posted a follow up article with the various options of SSIS / MySQL interaction. All of these involve pushing data into MySQL but I though why not get it to pull data. Here’s an outline of this approach.

First create the following table in a MySQL database. This is a clone of the HumanResources.Employees table from the AdventureWorks sample database. We will be doing a simple extract-truncate-load of the table into MySQL from SQL Server.

# Create MySQL replica table of the HumanResources.Employee from the AdventureWorks database
CREATE TABLE `HumanResourcesEmployee`
(
	`EmployeeID` INT PRIMARY KEY NOT NULL,
	`NationalIDNumber` VARCHAR(15) NOT NULL,
	`ContactID` INT NOT NULL,
	`LoginID` VARCHAR(256) NOT NULL,
	`ManagerID` INT NULL,
	`Title` VARCHAR(50) NOT NULL,
	`BirthDate` DATETIME NOT NULL,
	`MaritalStatus` CHAR(1) NOT NULL,
	`Gender` CHAR(1) NOT NULL,
	`HireDate` DATETIME NOT NULL,
	`SalariedFlag` VARCHAR(5),
	`VacationHours` SMALLINT,
	`SickLeaveHours` SMALLINT,
	`CurrentFlag` VARCHAR(5),
	`rowguid` VARCHAR(40),
	`ModifiedDate` DATETIME
);

Next create a new Integration Services project in BIDS. Add a Data Flow Task to the project and call it "Extract Employees from SQL Server"

data flow task

This will export the contents of the HumanResources.Employees table and write it to a flat file. Edit the task and add an OLE Source and configure it as illustrated below. The OLEDB connection manager should point at your SQL Server instance hosting the AdventureWorks database.

oledb source employees

Next add a Flat File Destination and name it "Write Employees File" then connect the OLEDB Source to it. Configure the connection manager as illustrated below. The important things to note here are the fact I’m writing the file to E:\Employees.txt and it is pipe delimited. You may need to alter these according to your setup.

flat file connection manager

flat file connection manager delimiter

The dataflow should now look something like below.

final data flow

Go back to the Control Flow pane and add an Execute SQL Task to the canvas. Call this task "TRUNCATE TABLE HumanResourcesEmployee". Edit the task and configure it like below. You need to add a ADO.NET connection manager which references an ODBC connection to your MySQL database. You’ll need the MySQL ODBC Driver for this.

execute sql task employees

My ODBC connection looks like below.

mysql odbc dsn connection

Add another Execute SQL Task and call it "Load new Employees file into MySQL". Connect up the tasks in sequence.

data flow ssis mysql

Edit "Load new Employees file into MySQL", add the ADO.NET connection to MySQL and enter the below SQL. You will need to change the file path and delimiter character if you have changed them.

LOAD DATA LOCAL INFILE 'E:\\Employees.txt' # Column order MUST match the table 
INTO TABLE HumanResourcesEmployee
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\r\n' # Windows line terminator
IGNORE 1 LINES; # Ignore column header line

execute sql task employees mysql

I originally attempted to enclose this in a store procedure as this provides more flexibility but MySQL gave the following error.

LOAD DATA is not allowed in stored procedures

MySQL prevents you from using LOAD DATA INFILE inside stored procedures (my major beef with MySQL is what you can and cannot do inside stored procedures) so we have no choice but to enter it in the SQLStatement pane. Execute the package and if all goes well the Employees.txt file will be loaded into MySQL.

execute_ssis_mysql_load

employees loaded into mysql ssis

I’ve not yet used this in production so use with caution, but it’s simple to setup and fast. One thing to note for this load is that MySQL has replaced backslashes with hyphens in the LoginId column so it would be sensible to check all data conversions.


6 Comments

  1. Carlos says:

    Thank my friend. This solution work fine!

  2. Stefano says:

    I have pretty much the exact same setup except that my source is an oracle table instead of a SQL table. However when I run my package, it always fails on the LOAD DATA INFILE task with the following error.

    “Attempted to read or write protected memory. This is often an indication that other memory is corrupt.”. Possible failure reasons: Problems with the query, “ResultSet” property not set correctly, parameters not set correctly, or connection not established correctly.”

    What’s even more crazy is that if I stop debugging and just run the LOAD DATA INFILE task itself, it works just fine. I’m not sure why but it’s like I have to wait for it to fail, then run it again (the last step) and I get my data uploaded

  3. Rhys says:

    Hi Stefano,

    I can’t really comment on Oracle but isn’t there a now a component provided by MS for this?

    http://msdn.microsoft.com/en-us/library/ee470675(v=sql.100).aspx

    Rhys

  4. Stefano says:

    Thanks for the reply but it isn’t the Oracle data that’s the issue. I’m trying to bulk insert large amounts of data from Oracle to MySQL. I can pull the data from Oracle pretty fast but importing it into MySQL is what’s slowing me down. After researching for hours and trying various options, I also came to the conclusion that the same setup you did is also probably my best option except I keep getting that error.

    It’s as if when the data is inserted into my TXT file, the file itself is hung up on that step. When the next step tries to LOAD DATA INFILE, it can’t read the file because it’s hung up. When I stop debugging, the file is released and I’m able to run the last step. This is the case each and every time. It’s really weird and I don’t understand it.

  5. Rhys says:

    Hi Stefano,

    OK, I’m with you now.

    Have you been running the package just inside BIDS? Does the problem occur still if you deploy the package to sql or the file system?

    Another alternative would be to execute the LOAD DATA INFILE through a VB.Net script task.

    Rhys

  6. Stefano says:

    I finally figured out what I was doing wrong. Within my package, I’m also transforming the data afterwards in MySQL in several different ways. To do this, I have all my steps within a sequence container with the first step being the following

    sql_mode=’STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION,ANSI_QUOTES’

    My connection to MySQL is set to retain as well so that I don’t have to send this script in each MySQL step.

    As soon as I took out the LOAD DATA INFILE outside the sequence container and had it run by itself, everything worked fine. I appreciate your assistance on this, it really was mind boggling.

Leave a Reply