Phantom FileTable Constraints

I was writing a wrap-up post to my series on FileTables when a co-worker found something interesting. SQL will insert a phantom FileTable constraint when you use SQL Server Management Studio to script out a non-FileTable table that has a clustered index on a field named either path_locator or stream_id. This happens in SQL 2012 SP1 and SP2 as well as SQL 2014. I haven't tested in SQL 2012 RTM yet.

What my friend was trying to do was create a bridge table to link documents stored in a FileTable with other records in the database. He's working in a local SQL 2012 SP1 Developer Edition environment. He had already created a FileTable for the documents, and he thought he could store the stream_id in the bridge table, so he defined a primary key constraint on the stream_id field. After a bit he went to script out his table. When he ran the script to create a new table it failed, because he wasn't creating a FileTable.

I started to play with it, and it appears to happen only when you create either a clustered index or a primary key constraint on a field named either stream_id or path_locator. If you've worked with FileTables you'll know that SQL creates those columns as part of the table definition. It also creates a primary key constraint on path_locator and a default constraint on stream_id.

Now, before I show some code to recreate the issue, I want to stress that SQL doesn't actually create the FileTable constraint (it can't, because the table isn't a FileTable). It just inserts the clause into the script that SSMS generates. But if it isn't spotted and removed it will generate the same error, and the person running the script may not know why it's failing.

The environment where I'm running this is a Windows Server 2012 R2 virtual machine with a SQL 2014 Developer Edition instance installed. I've tested with FILESTREAM both enabled and disabled on the instance, and with and without a FILESTREAM filegroup for the database.
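
For reference, here's roughly how I toggled the instance-level setting between runs. This is only a sketch: the access level values are 0 (disabled), 1 (T-SQL access only), and 2 (T-SQL plus Windows streaming access), and the Windows-level piece still has to be enabled separately in SQL Server Configuration Manager.

EXEC sp_configure 'filestream access level';      -- show the current value
EXEC sp_configure 'filestream access level', 2;   -- or 0/1, depending on the test run
RECONFIGURE;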

CREATE DATABASE DemoDB
ON PRIMARY
(
NAME = 'DemoDB_data',
FILENAME = 'F:\SQL2014\DATA\DemoDB_data.mdf',
SIZE = 512000KB ,
MAXSIZE = UNLIMITED,
FILEGROWTH = 102400KB
)
LOG ON
(
NAME = 'DemoDB_log',
FILENAME = 'F:\SQL2014\DATA\DemoDB_log.ldf',
SIZE = 102400KB,
MAXSIZE = 2048GB,
FILEGROWTH = 25600KB
);
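
For the test runs that used a FILESTREAM filegroup, I added one to DemoDB along these lines. The filegroup name and container path below are placeholders picked to match the data file location above, not the exact ones from my environment.

-- Optional FILESTREAM filegroup and container for the database
ALTER DATABASE DemoDB ADD FILEGROUP DemoDB_fs CONTAINS FILESTREAM;

ALTER DATABASE DemoDB ADD FILE
(
NAME = 'DemoDB_filestream',
FILENAME = 'F:\SQL2014\DATA\DemoDB_filestream'
)
TO FILEGROUP DemoDB_fs;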

USE DemoDB;

-- Script to create first table with path_locator as PK
CREATE TABLE dbo.DemoTable1(
path_locator hierarchyid NOT NULL,
document_name varchar (50) NOT NULL,
record_link char(10) NOT NULL,
CONSTRAINT PK_DemoTable1 PRIMARY KEY CLUSTERED
(path_locator)
);

-- Script returned from SSMS, adds FILETABLE_PRIMARY_KEY_CONSTRAINT_NAME
CREATE TABLE [dbo].[DemoTable1](
[path_locator] [hierarchyid] NOT NULL,
[document_name] [varchar](50) NOT NULL,
[record_link] [char](10) NOT NULL,
CONSTRAINT [PK_DemoTable1] PRIMARY KEY CLUSTERED
(
[path_locator] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
WITH
(
FILETABLE_PRIMARY_KEY_CONSTRAINT_NAME=[PK_DemoTable1]
)

-- Script to create second table with no PK
CREATE TABLE dbo.DemoTable2(
path_locator hierarchyid NOT NULL,
document_name varchar (50) NOT NULL,
record_link char(10) NOT NULL
);

-- Script returned by SSMS, no added constraint
CREATE TABLE [dbo].[DemoTable2](
[path_locator] [hierarchyid] NOT NULL,
[document_name] [varchar](50) NOT NULL,
[record_link] [char](10) NOT NULL
) ON [PRIMARY]

--  Script to create 3rd table and add clustered index
CREATE TABLE dbo.DemoTable3(
path_locator hierarchyid NOT NULL,
document_name varchar (50) NOT NULL,
record_link char(10) NOT NULL
);

CREATE CLUSTERED INDEX clus_DemoTable3 ON dbo.DemoTable3
(path_locator);

-- Script returned from SSMS, adds FILETABLE_PRIMARY_KEY_CONSTRAINT_NAME
CREATE TABLE [dbo].[DemoTable3](
[path_locator] [hierarchyid] NOT NULL,
[document_name] [varchar](50) NOT NULL,
[record_link] [char](10) NOT NULL
) ON [PRIMARY]
WITH
(
FILETABLE_PRIMARY_KEY_CONSTRAINT_NAME=[clus_DemoTable3]
)

-- Script to add 4th table, change datatype to int, add as PK
CREATE TABLE dbo.DemoTable4(
path_locator int NOT NULL,
document_name varchar (50) NOT NULL,
record_link char(10) NOT NULL,
CONSTRAINT PK_DemoTable4 PRIMARY KEY CLUSTERED
(path_locator)
);

-- Script returned from SSMS, adds FILETABLE_PRIMARY_KEY_CONSTRAINT_NAME
CREATE TABLE [dbo].[DemoTable4](
[path_locator] [int] NOT NULL,
[document_name] [varchar](50) NOT NULL,
[record_link] [char](10) NOT NULL,
CONSTRAINT [PK_DemoTable4] PRIMARY KEY CLUSTERED
(
[path_locator] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
WITH
(
FILETABLE_PRIMARY_KEY_CONSTRAINT_NAME=[PK_DemoTable4]
)

-- Script to create fifth table with stream_id as PK
CREATE TABLE dbo.DemoTable5(
stream_id uniqueidentifier NOT NULL,
document_name varchar (50) NOT NULL,
record_link char(10) NOT NULL,
CONSTRAINT PK_DemoTable5 PRIMARY KEY CLUSTERED
(stream_id)
);
-- Script returned from SSMS, adds FILETABLE_STREAMID_UNIQUE_CONSTRAINT_NAME
CREATE TABLE [dbo].[DemoTable5](
[stream_id] [uniqueidentifier] NOT NULL,
[document_name] [varchar](50) NOT NULL,
[record_link] [char](10) NOT NULL,
CONSTRAINT [PK_DemoTable5] PRIMARY KEY CLUSTERED
(
[stream_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
WITH
(
FILETABLE_STREAMID_UNIQUE_CONSTRAINT_NAME=[PK_DemoTable5]
)

Working with FileTable Constraints

I’ve been playing a little more with FileTables lately, and I thought I’d share some of what I’ve seen when applying my own constraints. I’m testing with a SQL 2014 Developer Edition instance, but I can recreate everything here in 2012 editions too.

When you make a FileTable, SQL handles the creation of all the objects, including constraints. You can't drop or modify these, but you can add your own. Below I create a constraint on the FileTable dbo.LimitType to limit the type of files I can insert to just .txt files.
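
For context, the FileTable itself can be created with something like the following. This is just a sketch: it assumes FILESTREAM and non-transacted access are already enabled for the database, and the directory name is a placeholder.

-- SQL builds every column and system constraint of a FileTable for you
CREATE TABLE dbo.LimitType AS FILETABLE
WITH
(
FILETABLE_DIRECTORY = 'LimitType',
FILETABLE_COLLATE_FILENAME = database_default
);

With the table in place, the check constraint on file_type goes on like any other: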

ALTER TABLE dbo.LimitType  WITH CHECK ADD  CONSTRAINT CK_LimitType_FileName CHECK (file_type='txt');
GO

Now that I have my constraint let’s test it out. Inserting a blank .txt file works, as expected…

INSERT INTO dbo.LimitType(name, file_stream)
SELECT 'InsertedTextFile.txt', 0x

…but when we try to add a blank image file we’re blocked by our constraint…

INSERT INTO dbo.LimitType(name, file_stream)
SELECT 'InsertedImageFile.jpeg', 0x

Msg 547, Level 16, State 0, Line 5
The INSERT statement conflicted with the CHECK constraint "CK_LimitType_FileName". The conflict occurred in database "DemoDB", table "dbo.LimitType", column 'file_type'.
The statement has been terminated.

…and if you try to copy it into the shared folder you’ll get a different message.

[screenshot]

That's not a problem, it's what we wanted. However, there is a problem if you try to limit the size of the files you store in your FileTable. Creating a constraint on the cached_file_size column doesn't work. You can still insert files greater than the limit you set in your constraint.

To test I’m going to add a check constraint to another FileTable that is supposed to limit the file size to 1 MB.

ALTER TABLE dbo.LimitSize WITH CHECK ADD CONSTRAINT CK_LimitSize_cached_file_size CHECK (cached_file_size <= 1000);
GO

I've copied the .iso file that contains the install bits for SQL 2014 Developer Edition, along with a few small .txt files. This is what the shared folder looks like.

[screenshot]

Below, on the left, is what the .iso file properties should look like. On the right is what the properties look like from the FileTable copy.

[screenshots]

If I query the FileTable I see the cached_file_size as 0 there as well.
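
Seeing it only takes a quick query against my demo FileTable:

-- cached_file_size comes back as 0 for the copied .iso, despite its real size
SELECT name, file_type, cached_file_size
FROM dbo.LimitSize;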

There is already a Connect item for this, created when SQL 2012 came out. But it was closed and marked as won’t be fixed. I can verify that this is still happening in the RTM version of SQL 2014, which is where I’ve done my tests.

Copying duplicate files with SQL FileTables

I was going to write about security on FileTables in SQL, but I thought I'd write a quick post about creating duplicate files first. The topic came up in a recent general discussion on FileTable usage.

Sometimes you want to make a copy of a file in the same directory where you store the original. Usually you're editing a large file and you want a copy to revert back to in case you need to start over. In Windows it's pretty easy: right click the file in question, choose copy, right click in the directory and choose paste. You'll get a second copy, usually with "Copy" appended somewhere inside the file name.

But what if you want to do the same thing with a file stored inside a FileTable in a SQL database? If you have access to the shared directory of the table, the process is the same. Depending on your method you’ll get a screen similar to this one…

[screenshot]

…then Windows will create the second file with a similar name, like this.

[screenshot]

If you query the database you'll see both records.

However, what if you try to insert a file with the same name and directory as one that already exists? You can't do it. There is a unique constraint on the name and parent_path_locator fields. The parent_path_locator is a hierarchyid column that corresponds to the directory where the file is located. You'll need to either rename the file you're inserting or update the existing record.
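
If you want to see the system-defined constraints for yourself, a quick look at sys.indexes will list them; dbo.Documents is the FileTable I'm using in these examples.

-- Indexes and constraints SQL created on the FileTable,
-- including the unique one covering parent_path_locator and name
SELECT i.name, i.type_desc, i.is_unique, i.is_primary_key, i.is_unique_constraint
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID('dbo.Documents');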

Of course it's easy enough to work around. Here's one way of changing the name of the file you're inserting…

DECLARE @fileName VARCHAR(25);

SELECT @fileName = 'SF Sailboat.jpg';

-- Only add the copy if the original already exists in the root of the FileTable
IF EXISTS(SELECT 1 FROM dbo.Documents WHERE name = @fileName AND parent_path_locator IS NULL)
INSERT INTO dbo.Documents (name, file_stream)
SELECT 'Copy of ' + @fileName AS name, CAST(bulkcolumn AS VARBINARY(MAX))
FROM OPENROWSET(BULK 'C:\FileTable Demo\Photos\SF Sailboat.jpg', SINGLE_BLOB) AS file_stream;

…and here's a similar way you can rename the existing file using the MERGE command.

DECLARE @image1   VARBINARY(MAX);
DECLARE @fileName VARCHAR(25);

-- Load the file from disk into a variable
SELECT @image1 = CAST(bulkcolumn AS VARBINARY(MAX))
FROM OPENROWSET(BULK 'C:\FileTable Demo\Photos\SF Sailboat.jpg', SINGLE_BLOB) AS x;

SELECT @fileName = 'SF Sailboat.jpg';

-- WHEN MATCHED: rename the record that already has this name in the FileTable root
-- WHEN NOT MATCHED: insert the new file under the original name
MERGE dbo.Documents AS t
USING (SELECT * FROM dbo.Documents WHERE name = @fileName AND parent_path_locator IS NULL) AS s
ON s.stream_id = t.stream_id
WHEN MATCHED THEN
UPDATE SET t.name = 'Copy_' + t.name
WHEN NOT MATCHED THEN
INSERT (name, file_stream) VALUES(@fileName, @image1);

Now when we navigate to the file share we'll see all four files: the original, the one we copied using Windows, and the two we inserted with T-SQL.

[screenshot]

That’s easy enough. Just remember, if you insert using T-SQL you’ll need to handle the duplicates yourself.

Where’s my FileTable space?

Last year I wrote a number of posts about using a new feature of SQL 2012, FileTables. Very briefly, FileTables are just tables that store files by using FILESTREAM. If you're interested you can read my other posts to learn more.

Recently a friend approached me with a question. He had started playing with FileTables, using them to store pictures he had taken over the years. He had inserted a large number of photos, probably 200 – 300 MB worth. But when he queried sys.sysfiles to check on the space he wasn't seeing any increase; to him the size of the database was the same. Where was the space?

He was just looking in the wrong place. The view sys.sysfiles doesn't show the files in the FILESTREAM filegroup.

To help explain I set up another little example. I'm using a virtual server that's running SQL 2012 Developer Edition, and I've created a DemoDB database that has both the data files and the FILESTREAM filegroup on the E drive. Then I wrote a query that joins sys.sysfiles with sys.database_files to see the full picture (a sketch of it follows the screenshots). The space when the database had no files inserted looks like this:

[screenshots]
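
The query I ran was along these lines. This is a sketch rather than the exact script; the point is just that sys.database_files has a row for the FILESTREAM container while the older sys.sysfiles view does not.

-- Every file the database knows about, flagged by whether sys.sysfiles reports it.
-- The size column is reported in 8 KB pages for data and log files.
SELECT df.file_id,
       df.type_desc,
       df.name,
       df.physical_name,
       df.size,
       CASE WHEN sf.fileid IS NULL THEN 'not in sys.sysfiles' ELSE 'in sys.sysfiles' END AS sysfiles_status
FROM sys.database_files AS df
LEFT JOIN sys.sysfiles AS sf ON sf.fileid = df.file_id;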

Now to insert a large file. I chose the .iso file that contained the media for SQL 2014 CTP2, about 2.5 GB. After the insert I reran my script, and now the space looks like this:

[screenshots]

And to further prove the point, I can see the file in the FILESTREAM storage:

[screenshot]

So, you just need to know where to look. This might help you if you ever need to provision a FILESTREAM database or set up FileTables; you'll need to make sure there's enough space in your FILESTREAM directory.

And one word of warning. Never try to manipulate the files inside the filestream storage. You can cause some big problems in your database if you do. Resist the urge and access the files through client applications or FileTable shares. 

Last thing. I’ll be speaking about FileTables at #SQLSatMadison on March 29th. There’s still time to register!

I’m speaking at #SQLSatMadison

Have you registered for SQL Saturday Madison yet? If not, do it soon, you’re almost out of time. The date is March 29th and it’s coming up fast, less than 2 weeks to go!

There’s a lot to see at this event. Start with your choice of 3 full day pre-conference sessions on Friday…

…then choose from 40 awesome sessions from some of the top SQL speakers on Saturday. Don’t take my word for it, check them out here. Or download the Guidebook app to your iPhone or Android device. There’s lots to choose from, whether you’re looking at beginning level sessions or advanced. And I’ll be making my SQLSaturday speaking debut, An Introduction to FileTables. If you do come to my session be sure to come on up and introduce yourself.

And I haven’t even mentioned Cows of a Spot yet, where you sit and have lunch with other attendees that share similar SQL interests. Want to learn more about report design? There’s a table for you.

Also, if you are coming and you’ve been to one of the past SQL Saturdays in Madison, be aware that the location has changed. It’s now at American Family Insurance Training Center, 6000 American Parkway, Building A, Madison WI, 53783.

See you there!

Studying for SQL 70-462 Exam

At this point in my career I'm not sure I need to take any more Microsoft SQL certification exams. I've already taken them for SQL 7 and SQL 2005, plus the 70-448 Business Intelligence exam. There's not much more I can accomplish by re-taking them for SQL 2012, especially with the 2014 product launch coming soon.

However that doesn’t mean that I want to stop learning. On the contrary. Each new release has loads of new features that need to be looked at, tested, played with and understood. I’ve always found that studying for the exams was a good way to learn the material.

So to that end I’m going to start working towards taking the 70-462 exam, Administering Microsoft SQL Server 2012 Databases. Who knows, when I’m done maybe I’ll end up taking the test anyway.

As I did when I was studying for the 70-448 exam, I’ll put my notes into a series of blog posts. I’ll also place links to any resource I run across that helps. Please feel free to comment or discuss any points I bring up. We can all learn from each other.

One last thing. Don’t expect to see any test dumps on this site, or any links to them. The best way to learn is to know the material. Believe me, you’re not fooling anyone when you claim to be certified but you can’t answer simple questions when interviewing for a new position.

Resources I’ll be using

The website for the 70-462 exam. Here you’ll see what the exam objectives are, making it easy to concentrate on one area of the exam at a time.
http://www.microsoft.com/learning/en-us/exam-70-462.aspx 

The Microsoft Press Training Kit will be very useful. Not only does it cover the exam objectives, it has a testing engine you can install to test your progress and see which areas of the exam you need to concentrate on. Personally I like to buy my technical books from O'Reilly. If you buy the printed copy you can add the ebook for another $5, and if you buy the ebook you get it in epub, mobi, and pdf formats AND you get lifetime updates.
http://shop.oreilly.com/product/0790145345134.do

Microsoft Virtual Academy has live Jump Start events for different Microsoft products. There was one for Administering SQL back in October of 2013. Fortunately these are recorded, so you can still register and watch.
http://www.microsoftvirtualacademy.com/training-courses/administering-microsoft-sql-server-2012-jump-start#?fbid=K9-EWOQ1LfT

Pluralsight is a fantastic site for SQL training, especially now that SQLSkills is adding content. All that content is great and I highly recommend watching it. But a few months ago Pluralsight added Train Signal courses, and they have a 3 part series that focuses on the 70-462 exam. You can try Pluralsight for free for a few weeks to sample the content. If you like it and find it valuable the cost is very reasonable, starting at $29 per month.
http://www.pluralsight.com/training/

I may be adding to this list later. If you have any resource you think would help add it to the comments section.

Good luck!

What would you do?

I’m going to lay out a scenario for you. It involves dealing with a SQL database that has been marked as suspect, and it happened on a server I’m responsible for a few days ago. After I describe the conditions I’ll go over the steps that I took to resolve the issue. But I’m always looking to improve my processes, so I’d like to learn from this. I’d like to hear from you: what I did right, what I did wrong, what I should have done, and what you’ve done in similar situations.

The Scenario

Early one morning you get a series of emails from co-workers letting you know that one of the SQL servers you are responsible for is not allowing any applications to connect. This is a SQL 2005 instance with Service Pack 4 running on Windows Server 2008 R2 Enterprise Edition with SP1. You try to connect using Management Studio and fail. You remote into the server to see what the problem is. Checking the services you see all SQL services are running. You open Management Studio on the server and you see one of the databases marked as suspect.

Next you check the SQL logs. You see the problem began about 5 1/2 hours earlier. At that time the SQL Agent job that performs a full backup of this database was running. About 15 minutes into the backup there are a number of warnings that SQL has encountered occurrences of I/O requests taking a long time. However there was one such warning for each database on the server. This is very early in the morning, the only activity at the time should have been the full backup. Immediately after the warning was an entry that the database log file was corrupt. Checking the Windows event logs shows that, for a few seconds at least, the SAN drive where that log file resides came offline.

You have full and log backups from the previous day so you can restore the database if needed. The final log backup took place 3 1/2 hours before the corruption occurred. You know of some database activity that took place after the last backup and you know it can be reproduced if needed. But there still is a possibility of at least some data loss.

What I did

My first thought was to try to grab a final log backup so I could restore to just before the log corruption occurred. However I had forgotten that the full backup was running when disaster struck. Indeed, according to current server activity, that backup job was still running almost 6 hours later. The job should have completed in 45 minutes, and the backup file was about a third of the size it should have been.

Next I restarted the SQL service. I think I was hoping that whatever marked the log as corrupted would somehow be rolled back. Of course I was wrong; the database was still marked as suspect and the log was still marked as corrupted.

Now I needed to see if the database was also corrupted. I ran DBCC CHECKDB but it gave me an error. I didn’t realize I couldn’t run it against a database marked as suspect. So I put the database in emergency and single user mode and ran DBCC again. This time it ran, but unfortunately it also showed there were allocation and consistency errors. The good news was that the only table listed was a table that I knew only held staging data. The bad news was that there appeared to be other objects involved I couldn’t see.
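
For reference, the sequence looked roughly like this; the database name below is a placeholder, not the real one.

-- Emergency mode lets CHECKDB run against a suspect database
ALTER DATABASE SuspectDB SET EMERGENCY;
ALTER DATABASE SuspectDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

-- This run reported the allocation and consistency errors
DBCC CHECKDB (SuspectDB) WITH NO_INFOMSGS, ALL_ERRORMSGS;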

Now here’s where I gambled. I ran DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS. I knew that option is only recommended as a last resort. But I felt I was still protected somewhat. I wanted to keep data loss at a minimum, and I wasn’t worried about losing just the unneeded staging data. I still had the backup files that I could recover from if I was wrong, and at that point all I would have lost would have been time.
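
The repair step itself was essentially this, again with a placeholder database name.

-- Last-resort repair; the database has to be in single user mode
DBCC CHECKDB (SuspectDB, REPAIR_ALLOW_DATA_LOSS);

-- Once the repair finished I allowed connections again
ALTER DATABASE SuspectDB SET MULTI_USER;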

And I won. After running CHECKDB with REPAIR_ALLOW_DATA_LOSS I was able to bring the database back online. The corrupted log was rebuilt and the CHECKDB output only showed the single table as being affected. Restoring the previous backups to a different server and comparing record counts showed no missing records. Spot checks didn’t turn up any bad data. 

Lessons Learned

Before anything else I should mention that I informed my managers beforehand of each step I took, and the possible consequences of each of my actions. They were OK with my decisions, even though I could have had the database back up much quicker if I had just restored the backups. Also, this is not my first database disaster. On two occasions I was able to recover SQL instances from SAN failures by restoring from backups, including system databases. I’ve only had to deal with one other suspect database, way back in SQL 7 days. And that was also fixed by restoring from a backup, but then a full day of data was lost. I’ve had plenty of other moments of panic when servers rebooted without warning.

  • I’ve found a hole in the timing of my backup jobs. There was too large a gap between the final log backup and the full backup, and another big gap between the full backup and the first log backup. Database usage has changed since I created the schedule and I hadn’t been paying attention. That has changed, and my first act after all of this was to plug those gaps. I’m also going to set reminders to check the schedules every 6 months (a quick query for spotting these gaps is sketched after this list).
  • I’m not comfortable with REPAIR_ALLOW_DATA_LOSS and I don’t know if I’d use it again. True, nothing was lost and I did have the backup files to fall back on, but even so I feel I got lucky.
  • I need to find a better option to repair the log. A better solution may have been to just drop the damaged table and repair the log. I’m going to try to recreate a damaged table and log to experiment with.
  • I know Paul Randal is THE expert on DBCC. I need to go back to his blog on the various options available.
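
Here's the kind of quick gap check I have in mind for the backup schedules, a sketch that runs on SQL 2005 and later:

-- List recent backups per database so scheduling gaps stand out.
-- type: D = full, I = differential, L = log
SELECT database_name,
       type,
       backup_start_date,
       backup_finish_date
FROM msdb.dbo.backupset
WHERE backup_start_date >= DATEADD(DAY, -7, GETDATE())
ORDER BY database_name, backup_start_date;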

Now It’s Your Turn

I want to hear from you. Email me, leave comments, or contact me via Twitter, whatever works. I want to know what steps I've missed. I'd love to learn from your experiences.

Most of all, if this happens again, I want to be prepared.

Using PowerShell to check on SQL services

This is just a quick little post to give you another reason to learn PowerShell.

Recently we rolled out various updates to servers across our networks. Some were Windows patches, some were anti-virus updates, others were for other system-related software. But some of them had one thing in common: they required the servers to be rebooted. Normally that's not a problem. But in our case most SQL services are configured to start manually when the server does. This is to limit any possible issues with SQL when a server crashes and restarts unexpectedly.

So let's say that 10 of your servers were restarted and you need to check the status of every SQL service on those 10 servers, starting any that are stopped. Connecting to each one, one by one, is time consuming. Wouldn't it be nice if there was an easy way?

As you’ve probably guessed by now, there is. PowerShell. If ever there was a reason for a SQL DBA to learn PowerShell it’s for cases like this, administering multiple instances.

Back to my issue of SQL services. Open up a PowerShell console and run a single line to check the status of all SQL services:

Get-Service -computername SERVER1 | Where-Object{$_.DisplayName -like "*SQL*"} | format-table -property Status, Name, DisplayName

That's good, but not all that we want. Remember, we're looking at 10 instances, so we don't want to run this 10 times. What we do instead is enter a comma-separated list of servers for the computer name. And to see which service is on which server we can add the MachineName property to our table.

Get-Service -computername SERVER1, SERVER2, SERVER3 | Where-Object{$_.DisplayName -like "*SQL*"} | format-table -property MachineName, Status, Name, DisplayName

Now that you can see the status it’s just a matter of stopping, starting, or restarting the right service. In the examples below I’m working with the SSRS service on SERVER1.

Start-Service -inputobject $(Get-Service -computername SERVER1 -name ReportServer)

Stop-Service -inputobject $(Get-Service -computername SERVER1 -name ReportServer)

Restart-Service -inputobject $(Get-Service -computername SERVER1 -name ReportServer)

See how simple that is? Go out and pick up a little PowerShell. It will make your life as a DBA much easier.

Using Microsoft’s New Database Backup Tool – Azure Storage

In my last post I wrote about using Microsoft’s new SQL Backup to Azure tool to compress and encrypt database backups to local storage. In this post I’m going to talk about using it to backup to an Azure storage container.

Before I get to the Azure tests let me go over some of the basics of using the tool from my last post. 

Before Microsoft released the SQL Server Backup to Windows Azure Tool (download link is http://www.microsoft.com/en-us/download/details.aspx?id=40740) you needed a third-party tool to compress and/or encrypt your database backups. The upcoming SQL 2014 release will have both features built in, plus the ability to mirror a backup to an Azure storage container. But if you are still running older instances of SQL you can use the new tool to get the same functionality, going back to SQL 2005.

Microsoft SQL Server Backup to Windows Azure Tool runs as a Windows service under a new local account named SQLBackup2Azure. To use the tool you just need to configure three options. The first asks where your backups are stored. The second asks if you want to use local or Azure storage. If you choose Azure you'll need to enter your Azure account name, key, and container name.

[screenshot]

The third step allows you to select to compress and/or encrypt your backups, as well as enter your encryption password. When you finish you’ll get a screen where you can manage your backup rules.

Then just back up and restore your databases as you normally do, either with SSMS or with T-SQL commands.
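
In other words, nothing about the T-SQL changes. Something along these lines is all the tool needs to see, as long as the backup file lands in a folder a rule is monitoring; the path and database name here are from my test setup.

-- An ordinary backup; the tool picks it up because the folder is covered by a rule
BACKUP DATABASE AdventureWorks
TO DISK = 'C:\SQL2005\Backups\AdventureWorks_full.bak';

-- An ordinary restore; the tool reads the .bak metadata and supplies the real data
RESTORE DATABASE AdventureWorks
FROM DISK = 'C:\SQL2005\Backups\AdventureWorks_full.bak'
WITH REPLACE;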

The files you get depend on the options in your backup rule. The tool will create a .bak file that stores the metadata about your backup. This file is not encrypted, even if you choose that option. If you select to encrypt your backups you'll see a .enc file with the same name as the .bak file. If you just choose compression then a .gz file is created. And if you didn't choose compression or encryption for your rule then a .data file is created. Backing up with no rule at all will just create a regular .bak file.

I'm using the same two new virtual environments I set up before, running Windows 2008 R2 and SQL 2005 SP4 Standard Edition. My demo database will be AdventureWorks. The copy I'm using is about 180 MB with an 18 MB log file. I'm using my MSDN Azure subscription, where I created a blob storage container named "backups" for my tests.

For my first test I backed up AdventureWorks using a T-SQL command. Backing up to Azure with both the compression and encryption options took 2 minutes 11 seconds. The tool created a 3 KB .bak file containing just metadata in the local location I specified. In my Azure container it appended a string to my backup name and created the 3 KB .bak and 56 MB .enc files.

[screenshot]

Restoring the backup also took just over 2 minutes. I ran a RESTORE … FROM DISK command pointed at the .bak file on local storage; the restore read the metadata there and restored using the files stored in Azure.

Restoring the backup file on my second DR instance was easy. As I did when testing restores to local storage using the tool, I created the same rule as on my source server and copied the 3 KB .bak file created there. Running the same RESTORE command took 2 1/2 minutes.

I'm not entirely sold on using this tool to back up to Azure, for a few reasons. First and most importantly, it doesn't always work. At least half of the time backing up with the tool I get one of the following errors:

Msg 3202, Level 16, State 2, Line 1
Write on "C:\SQL2005\Backups\AdventureWorks_full.bak" failed: 1723(The RPC server is too busy to complete this operation.)
Msg 3013, Level 16, State 1, Line 1
BACKUP DATABASE is terminating abnormally.

Msg 3201, Level 16, State 1, Line 1
Cannot open backup device 'C:\SQL2005\Backups\AdventureWorks_full.bak'. Operating system error 31(A device attached to the system is not functioning.).
Msg 3013, Level 16, State 1, Line 1
BACKUP DATABASE is terminating abnormally.

What appears to be even worse is that the failed backups still show up in my Azure container, and the .enc file shows a size of 1 TB.

[screenshot]

I haven’t been able to troubleshoot either error yet so I don’t know if the error is on my side, with Azure, or with the backup tool. To rule out my virtual server setup I installed the tool on my laptop running Windows 7 and SQL 2008 with SP3; I get the same errors there.

Another drawback is that I haven’t been able to download the files from Azure to local storage and restore those copies. I’ve edited the metadata to point to the downloaded files and changed my rule to look at local storage but so far I haven’t had a successful restore. Again, this could be my error, I’m probably doing something wrong (the documentation that comes with the tool is pretty thin).

I'm going to keep testing, but until I can back up to Azure consistently without errors and I figure out how to restore from a downloaded file, I'm not trusting it with any critical backups. I also need to run some tests of backing up transaction logs.

Using Microsoft’s New Database Backup Tool – Local Backups

At the opening keynote at PASS Summit 2013 it was mentioned that SQL was adding backup encryption along with the ability to back up databases to an Azure storage container. In my last post I wrote about my early testing of those features, albeit in very simple scenarios. I was able to encrypt a database backup both to local and Azure storage and restore to the original and a secondary DR server.

But back to the keynote for a moment. Another new feature relating to database backups was announced. Microsoft was releasing a tool that could also encrypt backups and save them to Azure, but it would work with older SQL instances. In this post I begin to look at using this new tool to backup to local storage.

The SQL Server Backup to Windows Azure Tool (download link is http://www.microsoft.com/en-us/download/details.aspx?id=40740) requires the OS to be at Windows 2008 or higher for servers, or at least Windows 7 if you're running it on your own PC. It's only a 10 MB download, so obtaining and installing the tool took no time at all. By default it installs to C:\Program Files\Microsoft SQL Server Backup to Windows Azure Tool. I haven't yet identified any SQL objects it created.

The tool documentation claims it works against instances as old as 2005 and makes no mention of any specific edition. Using it you can compress and encrypt your database backups, even those from a SQL 2005 Standard instance. You have the choice of saving your backup files locally or to Azure storage, as long as you have an account there. I don’t see any mention of being able to use other cloud vendors.

As before, I set up two new virtual environments running Windows 2008 R2 for my testing. I wanted to test against the oldest version of SQL that I could so I installed SQL 2005 Standard Edition with SP4. I tracked down an old version of AdventureWorks to use as my demo database. The copy I’m using is about 180 MB with an 18 MB log file.

My first step after setting up my two servers was to create a benchmark. Backing up AdventureWorks without using the new tool took between 5 and 6 seconds in 3 different runs, each creating a 166 MB backup file. I ran restore 3 times as well, each finishing in about 4 1/2 seconds.

Now I’m ready to try using the new and wonderfully descriptively named Microsoft SQL Server Backup to Windows Azure Tool.

After installing you just run the tool’s Configuration. Basically it’s just a configuration wizard allowing you to add backup rules. You’re able to add more than one rule so you can customize your backup plans to suit your needs; maybe 1 rule to keep some backups locally and another rule to move others to Azure. You only need to run the tool when you want to create, modify, or delete a rule.

The first screen asks where to monitor for backups. You can choose the whole computer or a certain location. In my tests I chose a specific location and set it to look for any .bak file.

[screenshot]

The second screen asks where to save your backups, locally or to Azure. My initial backups go to local storage; later I'll test backing up to Azure, and at that point I'll need to enter my Azure information, similar to when I set up a certificate in my last post.

[screenshot]

The third and final screen allows you to select encryption and/or compression. If you choose encryption you’ll need to enter a secure key. Notice that it only encrypts using the AES-256 algorithm; you don’t have a choice to go weaker or stronger.

[screenshot]

And that’s it. After clicking finish you get a screen showing all the rules you set up.

[screenshot]

Now let's start the tests. First I back up to local storage using both the encryption and compression options. I back up using T-SQL with the only option being to write the backup to disk. The backups took a little longer than the benchmark, about 2 extra seconds on average.

The tool uses the .bak file to store metadata about the backup: where the backup is, whether it is compressed or encrypted, and so on. Depending on the options in your rule, the backup will create an additional file that contains the data for the backup. Choosing encryption will create a file with a .enc extension. Choosing just compression you'll get a .gz file. If you back up without either option you'll see a .data file. And if you back up without any rule then a regular backup .bak file is created.

Testing backing up locally with compression and encryption created a 2 KB .bak file and a .enc file of about 56 MB.

If you open the .bak file in a text editor you can see the values. This file is not encrypted even if you set that option. One of my backups produced the following (I've changed the strings; you will see much longer random characters in your backups).

{
  "Version": 1,
  "Azure": null,
  "Local": {
    "DataFileName": "C:\\SQL2005\\Backups\\AdventureWorks_full.bak.enc"
  },
  "FileSize": 169957888,
  "CompressedSize": 57870000,
  "Trailer": "Extremely long string here",
  "Compressed": true,
  "DataEncryptionDetails": {
    "Salt": "Salt string here",
    "Iv": "IV string here",
    "Mac": "Mac string here"
  },
  "TrailerEncryptionDetails": {
    "Salt": "salt string 2",
    "Iv": "IV string 2",
    "Mac": "Mac string 2"
  }
}

Next I wanted to test restoring the database. Again I just used a RESTORE T-SQL script, with overwriting the existing database being the only option. Restores usually took about 30 seconds, much longer than restoring the benchmark backups made without compression or encryption. You will need to keep the rule in place after you run your backup; if you delete it you won't be able to restore that backup.

Restoring the backup to another server was similar. I copied the backup files from my first instance and tried to restore, getting the following error:

Msg 3241, Level 16, State 0, Line 1
The media family on device 'C:\SQL2005\Backups\AdventureWorks_full.bak' is incorrectly formed. SQL Server cannot process this media family.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.

After I installed the new tool on my secondary instance and configured a rule identical to the one on the source server, the restores completed successfully. One thing I noticed about backing up locally: you can't write the backup to the same file using WITH APPEND. If the file exists you'll get an error, so you'll need to either move your local backups or create backups with unique names.
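
One way around that is to build a unique file name for each backup, something like this sketch (it works on SQL 2005, and the path is the one from my tests):

-- Build a date-stamped file name instead of reusing the same .bak
DECLARE @file NVARCHAR(260);
SET @file = N'C:\SQL2005\Backups\AdventureWorks_full_'
          + CONVERT(NVARCHAR(8), GETDATE(), 112)                     -- yyyymmdd
          + N'_'
          + REPLACE(CONVERT(NVARCHAR(8), GETDATE(), 108), ':', '')   -- hhmmss
          + N'.bak';

BACKUP DATABASE AdventureWorks TO DISK = @file;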

In my next post I’ll cover backing up to Azure with the new tool.