SSIS error 0×80004005

I recently ran across an issue while working with a package developed by a co-worker. It’s one I don’t fully understand and I hope someone will chime in with a better explanation than I have.

The package isn’t all that complicated. It takes start and end dates as variables and selects data based on the date range. Then it does some lookups and finally inserts into some tables. Both the source and destination databases are on the same SQL 2005 server. The package is being run as a SQL Agent job on a SQL 2008 server. Bother servers are running Windows 2008 R2, and the package was developed using Visual Studio 2008.

For the first few runs the package ran fine, going through small data sets of less than 500,000 rows. After that the SQL job began to return error code 0×80004005. The messages were “Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0×80004005. An OLD DB record is available. Source: “Microsoft SQL Server Native Client 10.0” Hresult: 0×80004005 Description: “Protocol error in TDS stream”.  This would be followed by a few more with different descriptions; “Communication link failure” and “TCP Provider: An existing connection was forcibly closed by the remote host”.

This is where I came in. I exported the package from SQL to a new BIDS project on the same server it runs from. I was able to run it successfully from BIDS, so I thought there was just some issue with running it as a SQL job. Then BIDS started giving me the same errors with a few new ones; “-1073450952, 0x, SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component “xxxxxx” returned error code 0xC0202009”. There’s a bit more but just pointing out there were errors prior to that one.

I did notice that the error always occurred at the same place, in the same data flow on the same OLE DB Source component. Just as a quick test I dropped the existing component and replaced it with another OLE DB Source, using the same data source and query to fetch the data. And voila! No more errors. I redeployed it to SQL and now it runs as a SQL job just fine.

At this point though I’m at a loss over just what happened. My only guess right now is that the original package was developed on a server running the full Visual Studio 2008 IDE. He only has SQL 2008 tools installed and I’m betting he never updated them. The server is running just the BIDS shell that comes with SQL 2008, but it is patched at SP 3. So possibly there was a patch on the server that affected BIDS. I haven’t had a chance to go over what was in each 2008 SP, I’ll do that tomorrow. And as I said, I’m just guessing.

So if anyone who reads this has any insight, please add a comment. I’d love to find out just what happened so I can be ready if it happens again.

Error using variable for worksheet name in SSIS Excel destination

p5rn7vb

This is another quick post about something many of you already know about, using the property page in SSIS.

One of the things I monitor as a DBA is the space used by our databases on our production servers. I collect those stats on a weekly basis and store them in a central repository that I can query later. I’ve been saving the results for a while, and I’ve had different methods of getting the data out; queries, reports, even Powershell. But since I’m learning SSIS for Microsoft’s 70-448 exam I thought I’d look for a SSIS solution this time.

Getting the data out of the database and into an Excel spreadsheet is simple enough; a data flow task with an OLE DB source, a data conversion transform, and an Excel destination.

image

And that worked, to a certain point. I wanted to be able to write each package run to a new sheet. So I created 2 package variables; SheetName and SheetTable. SheetName has a value of Default. I used the New Table in the Excel destination to generate code for a new table, named my table Default, and placed that as the value of the SheetTable variable. In a script task on the control flow I set SheetName to a string that includes the date and inside the SheetTable variable replaced “Default” with the SheetName value. Then I use an Execute SQL Task to create the new sheet in my Excel spreadsheet.

My problem came back in the Excel destination, wen I tried to set the table to use the SheetTable variable. Everytime I tried I got a warning that the table didn’t exist. I made sure it did, as a template. I had DelayValidation set to True in the Excel connection manager. I couldn’t get it to save.

That’s when I went to the property page for the Excel destination. There I could set the individual properties with no errors. I set the AccessMode to OpenRowset From Variable and the OpenRowsetVariable to my SheetName variable.

image

And that’s all it took. Saving the package and running it now gives me a sheet for every day I run the package.

image

How I track database size

One of my DBA tasks is to track the space used on our database servers. I want to know how big the database file sizes are, how big the tables are, and how much free space is available on each server. This is a common task for DBAs, and there’s probably a different method for each DBA. This is mine.

When I first started, I created a database on each production server to store all stored procedures, tables, and views that I use as a DBA. The tables are mostly repositories of server state at the time I run a query or procedure. In the case of database sizes I created the procedure GetDBFileStats to get the filesize, total space, free space, and space used for each file in each database: 

CREATE PROCEDURE [dbo].[GetDBFileStats]
AS

EXEC sp_msforeachdb 'USE [?]

CREATE TABLE #tmpspc (Fileid int, FileGroup int, TotalExtents int, UsedExtents int, Name sysname, FileName nchar(520))

INSERT #tmpspc EXEC ('
'DBCC SHOWFILESTATS'')

INSERT INTO PerfDB.dbo.dbStats
SELECT GETDATE() AS RunDate,
'
'?'',
s.name AS [Name],
s.physical_name AS [FileName],
s.size * CONVERT(float,8) AS [Size],
CAST(tspc.UsedExtents * CONVERT(float,64) AS float) AS [UsedSpace],
(s.size * CONVERT(float,8) - CAST(tspc.UsedExtents*convert(float,64) AS float)) AS FreeSpace,
s.file_id AS [ID]
FROM sys.filegroups AS g
INNER JOIN sys.master_files AS s ON (s.type = 0 AND s.database_id = db_id() AND (s.drop_lsn IS NULL)) AND (s.data_space_id=g.data_space_id)
LEFT OUTER JOIN #tmpspc tspc ON tspc.Fileid = s.file_id
ORDER BY [ID] ASC

DROP TABLE #tmpspc

INSERT INTO PerfDB.dbo.dbStats
SELECT GETDATE() AS RunDate,
'
'?'',
s.name AS [Name],
s.physical_name AS [FileName],
s.size * CONVERT(float,8) AS [Size],
CAST(FILEPROPERTY(s.name, '
'SpaceUsed'') AS float) * CONVERT(float,8) AS [UsedSpace],
(s.size * CONVERT(float,8) - CAST(FILEPROPERTY(s.name, '
'SpaceUsed'') AS float)* CONVERT(float,8)) AS FreeSpace,
s.file_id AS [ID]
FROM sys.master_files AS s
WHERE (s.type = 1 AND s.database_id = db_id())
ORDER BY [ID] ASC'


GO
This was as far as I went until a few months ago. At that time I started to think of using SSIS for some of my daily tasks. So I created this simple package that gets the results from my procedure and puts them into a central database repository. I also truncate the tables and change the default date to just the date; 
 image
Finally I created a report that got the data back. I send 2 parameters; a from date and a to date. Now when I run the report I can view what the database size was for the two dates and easily see how much the files have grown. I also created a sub report that, when I click on the database in the main report, opens to show the growth history of that database on a week to week basis.
Here’s the table I use to store the data and the procedure I use to populate the summary report. I also run it as an ad-hoc query;
CREATE TABLE [dbo].[dbStats](
[Servername] [sysname] NOT NULL,
[RunDate] [datetime] NOT NULL,
[DBName] [sysname] NOT NULL,
[Name] [sysname] NOT NULL,
[FileName] [nchar](520) NOT NULL,
[TotalSize] [float] NOT NULL,
[UsedSpace] [float] NOT NULL,
[FreeSpace] [float] NOT NULL,
[FileID] [int] NOT NULL
) ON [PRIMARY]

CREATE PROCEDURE [dbo].[prc_DBSizeCompare] (
@minDate DATETIME,
@maxDate DATETIME)

AS

SET NOCOUNT ON

;WITH DBCTE(ServerName, DBName, Name, MaxRunDate, MinRunDate) AS
(
SELECT ServerName, DBName, Name,
MAX(RunDate) AS MaxRunDate,
MIN(RunDate) AS MinRunDate
FROM dbo.dbStats
WHERE RunDate BETWEEN @minDate AND @maxDate
AND DBName 'tempdb'
GROUP BY ServerName, DBName, Name
)
SELECT db.Servername, db.DBName, db.Name, db.RunDate, db.TotalSize, db.UsedSpace, db.FreeSpace
FROM dbo.dbStats db
INNER JOIN DBCTE CTE ON db.Servername = CTE.ServerName
AND db.Name = CTE.Name
WHERE db.RunDate = CTE.MinRunDate OR db.RunDate = CTE.MaxRunDate
ORDER BY db.Servername, db.DBName, db.FileID, db.RunDate

GO

SSIS Expressions Cheatsheet

Maybe you’re new to SSIS, like I am, or maybe you just need a little refresher on expressions in SSIS. I just ran across the SSIS Expressions Cheatsheet. This is a page on the PragmaticWorks website. It’s also available as a downloadable .pdf file. For me, anyway, it’s a great one-page reference. I’ve got it printed and pinned to my cube wall.

I learned about this resource from Dougbert.com

My first data warehouse – Part 1

I’m going to lay a little ground work for future posts here. My plan to learn BI, and also to create a useful data warehouse from scratch, is to build one based on reporting server performance for my company’s eventual use. I want to be able to drill down to an individual report to view performance data, as well as summarize report performance by locations, by dates, by users,by date ranges, as well as other factors that I don’t envision right now.

Today I’m going to discus the download I found on the CodePlex site, thanks to Tyler Chessman from SQL Magazine. This package and the accompanying reports will be incorporated in my data warehouse project.

To use the downloaded code you’ll need to create a database. The examples use RSExecutionLog as the database so I kept the same name. The download includes a script to create the necessary tables and relations. The tables it creates are used to store data extracted from the ExecutionLogStorage table in your SQL 2008 report server database. Mine has the default name of ReportServer, created when I configured Report Server for the first time. This database stores the report catalog and also useful information pertaining to when reports are run.

There are two Visual Studio solutions in the download; one SSIS solution called RSExecutionLog with a single package called RSExecutionLog_Update.dtsx, the other is a reporting solution with three sample reports to use against the extracted report server data and two to view schema and table information. I’m going to concentrate on the first two. 

I had some original issues with the package and reports. They were written using SQL 2005; I’m using SQL 2008. After converting the reports (basically just opening the report solution in VS 2008) I still had issues with the package and it’s connections. There were places where the connection to the source or destination seemed to be hard-coded inside different components of the package, even after I changed the source and destination connections. I ended up building a new package based on the original.   

The package is pretty sweet. Not only does it populate tables for users and reports, it also creates one for the report parameters. When the parameters is stored in the report server catalog, it’s something like “PARAM1=123&PARAM2;=456”. The package shreds the report server string into individual values, so now I can easily search by parameters. I was thinking of doing the same thing, only storing the values as one record as opposed to one or more.

So I can use this package with very little customization. My next step is to pull data into my dw that is dependent on each report execution. This package should be fairly straight forward.

My next post will show my database design.     

What I learned today – Report Server Execution log package

My first attempt at a data warehouse is going to be collecting data from our report servers. In our environment we have two SQL 2008 instances that host the reports themselves, and they connect to four data centers for the report data. In the data centers we have a table that’s populated with parameters needed for each report; dates, locations, etc. The reports themselves only have two parameters; the datacenter where the data for the report is (we’re using dynamic connection strings) and a guid that identifies the other needed parameters in the data center.

My goal was to build my warehouse from the four data center report parameter tables and the Execution Log tables on the report servers. The report server logs information from each time a report is run; if it was successful, the parameters for the report, the user, and more that would be helpful in debugging a report performance. I wanted to be able to view summaries for each report; for instance how often each report was run for each location during a specified date range, average report durations, or the number of aborted reports.

I was going to build an SSIS package to gather the data for me. Then I read an article in the November 2009 issue of SQL Server Magazine by Tyler Chessman that defined about half of what I want to do. In his article SQL Server Reporting Services Questions Answered, Mr Chessman describes sample reports from Microsoft that you can find on CodePlex. The reports will be interesting enough and I’ll be able to use them, but the best part is Microsoft has already created a package to extract the data from the report server execution log!

This post is meant to be an overview. I’ll post a review of the CodePlex samples soon, and I’ll start laying out my data warehouse design.

What I learned today – auto generate create table DDL in SSIS

I’m pretty new to SSIS, only having played around with it to see how things work. This is probably well known to SSIS developers but I just ran across it accidentally. I’m learning SSIS and I’m currently working through Brian Knight’s book “Knight’s 24 – Hour Trainer Microsoft SQL Server 2008 Integration Service”(I’ll review the book when I’m finished). Lesson 15 is walking me through loading the contents of a flat file into a new table in a SQL database with the OLE DB Destination.

When I’ve done this before, I’ve always written a DDL statement to create the table. This time, I connected the Flat File Source, which has my sample data, to the OLE DB Destination.This time, when I clicked to create a new table, SSIS generated the CREATE TABLE script based on the definition of the flat file! The only thing I had to change was the table name, and even this would probably have been handled if I had renamed my destination first.

image image

I didn’t think the table would be created until the task was run. But after clicking OK, the table was listed in my test database. I hadn’t even finished defining the destination!