I’ve been using a lot of SSIS recently and whilst doing so I figured I should really add a few basics to the blog in case anyone either hasn’t used it before or simply hasn’t used it for a while and wants a few pointers.
Therefore this is the first of a few posts I’ll try to do covering some simple SSIS stuff.
In this first post I’m going to cover the most basic of things… creating a project, setting up a connection, and moving some data from one server to another.
So let’s get started.
Firstly, locate and open SQL Server Business Intelligence Development Studio
(BIDS) in SQL 2005 - 2008 R2, but in SQL Server 2012 has been rebranded as SQL Server Data Tools
This will open a Visual Studio Shell, as indicated in the top left of the screen (I’m using SQL Server Data Tools in Visual Studio 2010 for SQL Server 2012):
So now we’ll start.
Click on “New Project…” in the top left and you’ll see something similar to the following:
Select Integration Services Project
, give your project a name etc and press OK.
This will provide you with a screen similar (depending on version) to the following:
Okay, so now we’re ready to start.
First things first, we need to let SSIS know which 2 servers we’re using. For this we’re going to set up connections to each server.
In the bottom of your screen you should have a panel labelled “Connection Managers”:
As it suggests, right click in here and in our case (connecting to a SQL Server) we’re going to select “New OLE DB Connection”.
This will open up a new window which you can easily use to create and manage connections. Here I’ve created a connection to a database called “testDatabase” on one of my test servers called “Homer” (after Homer Simpson):
Doing the same for AdventureWorks2012 on another of my test servers (Elementary, after Springfield Elementary - noticed a pattern yet?) leaves me with 2 connections in my Connection Managers part of the window:
Now, for this example, all I want to do is move a subset of data from AdventureWorks2012 on Elementary to my testDatabase on Homer.
The subset in question is defined by the following query:
select SalesOrderID, p.Name, UnitPrice
from sales.SalesOrderDetail d
join Production.Product p
on d.ProductID = p.ProductID
where OrderQty = 1
Now, note that this could be done using a Linked Server etc, but this is simply to demonstrate the creation of a very basic SSIS package.
So, first things first, I want to make sure that the table exists on Homer and is empty (effectively freshly created each time). To do this I would normally run the following script:
if object_id('salesOrderSubset') is not null drop table salesOrderSubset
create table salesOrderSubset
salesOrderID int not null,
productID smallint not null,
lineCost money not null
But we don’t want to have to run this manually each time and therefore we can place this into the SSIS package.
To do this, simply drag an “Execute SQL Task” object from the Toolbox into the main package:
Double click the Task and fill in accordingly. I’ve changed the name to something meaningful, entered the SQL I wish to run (in SQLStatement), and specify the Connection which, in this case, is Homer. Also, note that we are not returning anything with this code, therefore leave “ResultSet” as None:
Having created this step we are now ready for our data flow. The hint’s in the name… drag a “Data Flow Task” to your window:
Now double click this and you’ll be taken to another tab in which we define the Data Flow.
From the Toolbox, drag an OLE DB Source and OLE DB Destination onto the window:
Double click the OLE DB Source task and fill in accordingly. In this case I’m taking data from Elementary, I’m using a SQL Command, and the command (from above) is placed in the resulting box:
Now, because we want the data going from the Source to the Destination we drag and attach the Blue (or Green - depending on version… basically just not the Red) arrow from the Source to the Destination to show direction:
Now we double click the Destination and fill it in accordingly… in this case our destination is Homer, the salesOrderSubset table, and it’s the whole table we’re loading.
Note there is a section which allows you to limit the rows per batch, which can be useful if you’re loading into a replicated table, over a slow line, or just want to lessen the immediate load on the servers during the execution. The default is empty and does not require a value.
Next, go to the mappings tab and drag input columns to output columns to ensure that the correct column is mapped to the correct column.
This won’t be immediately detected if there are issues surrounding data types of column names:
That’s pretty much it.
Click OK, then go back to your Control Flow tab and drag the success (Green) arrow from your SQL Task to the Data Flow Task to enforce the order in which you wish the objects to execute and you’re done:
Therefore all that’s left is to hit save (just in case), run the package and see it perform.
After saving (we all know how to do that, right?), in Solution Explorer simply right click your package and select “Execute Package”:
The result is more fun in BIDS as it highlights the entire task in Green for Success and Red for failure, but in Data Tools we simply get a little tick or cross. In my case the package succeeded, and we’re done:
Obviously if this were a real package then you would want to deploy it to your server, but this is just a quick demo to get you started building and running a manual package. I’ll cover deployments at a later time.