My first attempt at a data warehouse is going to be collecting data from our report servers. In our environment we have two SQL Server 2008 instances that host the reports themselves, and they connect to four data centers for the report data. In each data center we have a table that’s populated with the parameters needed for each report: dates, locations, etc. The reports themselves have only two parameters: the data center where the report’s data lives (we’re using dynamic connection strings) and a GUID that identifies the other needed parameters in that data center.
My goal was to build my warehouse from the four data center report parameter tables and the Execution Log tables on the report servers. The report server logs information each time a report is run: whether it was successful, the parameters for the report, the user, and other details that would be helpful in debugging report performance. I wanted to be able to view summaries for each report; for instance, how often each report was run for each location during a specified date range, average report durations, or the number of aborted reports.
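To give an idea of the kind of summary I’m after, here’s a rough sketch of a query against the ExecutionLog2 view in the ReportServer database (the date range is just an example; a per-location breakdown would need the parameter GUID joined back to the data center parameter tables):

```sql
-- Sketch: report execution summary for a date range.
-- Assumes the ExecutionLog2 view in the SQL 2008 ReportServer database.
SELECT
    ReportPath,
    COUNT(*)                                               AS Executions,
    AVG(DATEDIFF(ms, TimeStart, TimeEnd))                  AS AvgDurationMs,
    SUM(CASE WHEN Status <> 'rsSuccess' THEN 1 ELSE 0 END) AS FailedOrAborted
FROM dbo.ExecutionLog2
WHERE TimeStart >= '20091001'
  AND TimeStart <  '20091101'
GROUP BY ReportPath
ORDER BY Executions DESC;
```

In the warehouse these aggregates would come from fact rows loaded by the SSIS package rather than from the live log view, which by default only keeps 60 days of history.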
I was going to build an SSIS package to gather the data for me. Then I read an article in the November 2009 issue of SQL Server Magazine by Tyler Chessman that covered about half of what I want to do. In his article, SQL Server Reporting Services Questions Answered, Mr. Chessman describes sample reports from Microsoft that you can find on CodePlex. The reports will be interesting enough and I’ll be able to use them, but the best part is that Microsoft has already created a package to extract the data from the report server execution log!
This post is meant to be an overview. I’ll post a review of the CodePlex samples soon, and I’ll start laying out my data warehouse design.
I’m pretty new to SSIS, having only played around with it to see how things work. This is probably well known to SSIS developers, but I just ran across it accidentally. I’m learning SSIS and am currently working through Brian Knight’s book “Knight’s 24-Hour Trainer: Microsoft SQL Server 2008 Integration Services” (I’ll review the book when I’m finished). Lesson 15 walks me through loading the contents of a flat file into a new table in a SQL Server database with the OLE DB Destination.
When I’ve done this before, I’ve always written a DDL statement to create the table. This time, I connected the Flat File Source, which has my sample data, to the OLE DB Destination. When I clicked to create a new table, SSIS generated the CREATE TABLE script based on the definition of the flat file! The only thing I had to change was the table name, and even that would probably have been handled if I had renamed my destination first.
I didn’t think the table would be created until the task was run. But after clicking OK, the table was listed in my test database. I hadn’t even finished defining the destination!
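For what it’s worth, the script SSIS generates looks something like this (the column names here are illustrative, not from my actual file; SSIS derives the columns from the Flat File Connection Manager’s metadata, where string columns default to a 50-character width):

```sql
-- Sketch of the CREATE TABLE script SSIS generated from the flat file
-- metadata; clicking OK in the dialog runs this DDL immediately,
-- which is why the table appeared before the task ever ran.
CREATE TABLE [OLE DB Destination] (
    [CustomerID] varchar(50),
    [OrderDate] varchar(50),
    [Amount] varchar(50)
)
```

That immediate execution explains what surprised me: the dialog creates the table as soon as you accept the script, not when the package executes.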
SQL Saturday in Iowa has come and gone. It was a great conference, and it’s a shame that we can’t get a day like this somewhere in the Chicago-Milwaukee-Madison region. Hopefully soon.
But back to the SQL Saturday. Thanks to the East Iowa SQL User Group for putting this all together, bringing in the speakers, finding the venue, lining up the sponsors, and doing all the little things that make an event like this possible. There were only a few minor glitches with equipment (projectors mostly). Registration was probably the easiest I’ve ever seen for any event. Food and beverages were both tasty and plentiful all day long. And the sponsors were only there for the event, not to sell products.
But the best part of the event was the content. I can only comment on the sessions I attended, but I’m sure the rest were just as good. I have a list of the sessions offered in an earlier post.
I didn’t go to Alex Kuznetsov’s sessions since I’ve seen them. The first session I went to was Janis Griffin’s Tuna Helper. She went over a process that Confio uses to spot issues on servers using wait types and DMVs. Next up were the three BI sessions that Jessica Moss presented. The session on reports was basic, probably because that’s the part of the BI stack I’m most familiar with. And her session on Database Dimensions in SSAS had a glitch with an overheated projector and became more of an overview of SSAS itself, which was fine with me as I know very little about SSAS. The session on SSIS was great. I may finally be getting a handle on data warehousing dimensions and facts thanks to her.
Tim Ford also had an issue with a balky projector, so his session wasn’t as informative as it could have been. But I have the links to his blog, where he’ll post his scripts. They’ll come in handy. And finally, Louis Davidson gave his second session of the day, on database design patterns. This was my first chance to see Louis present live. He’s a great speaker.
It was just a coincidence that I posted earlier on cloud computing. I took a long time writing a simple summary of the TechNet and MSDN events I went to on the Azure services. I was (and still am) impressed by the potential of cloud services.
Over the last few days, details have been coming out about what happened with T-Mobile’s Sidekick. T-Mobile was using Microsoft’s cloud services to handle Sidekick data. At some point Microsoft performed a SAN upgrade and outsourced the job to Hitachi. What is completely baffling is that there were no backups taken prior to the upgrade (you can read a better summary at HipTop3.com).
How in the world could Microsoft and/or Hitachi have neglected the backup step? This should have been caught by the most junior member of the team running the upgrade. It’s going to be interesting to hear how they try to spin this; all parties are equally to blame.
And how does Microsoft try to sell their cloud now?
On October 1st I went to the TechNet and MSDN events on Azure. I had never really read much about what Azure was all about, so I wanted to learn more: what it was, and most importantly, what it could do for me.
Microsoft has built a few data centers for its Azure services. They just finished one here in Chicago, there’s one in Dublin, and I believe the third is in Quincy, Washington. It’s pretty cool how they set them up: Dell loads up semi trucks with the needed components, and the trucks are shipped to the data centers, where they are plugged into the network. I don’t think they use USB cables. Each data center can hold, I believe, 350,000 nodes.
Azure actually has three parts. The first is Windows Azure, which is exactly what it sounds like: compute running on a Windows Server platform.
The second part is SQL Azure; again, no surprises. There wasn’t much talk about SQL Azure other than to mention that it was available. However, the costs seemed fairly high to me: 1 GB for $9.99 per month, or 10 GB for $99.99 per month. You would probably be better off finding a different host; there are cheaper options available.
The third part of Azure is .NET Services, which would be your messaging services. Again, there are probably cheaper options; Microsoft wants $0.15 per 100,000 messages.
On top of all this there are the bandwidth costs: $0.10/GB inbound and $0.15/GB outbound.
West Monroe Partners also gave a demo at the end of the TechNet event. They built the website the City of Chicago used for the Taste of Chicago last summer. It was impressive, especially since it was built three months ago on a CTP build.
I’m attending the TechNet and MSDN events on Azure, Microsoft’s cloud computing platform. Some cool stuff on tap. I’ll post about the individual sessions later.