Do It Yourself PASS SQL Summit

It’s that time of year again, when people from all over the globe converge on Seattle for the five-day PASS Summit 2014. I’ve been fortunate enough to attend two past Summits, in Seattle in 2011 and last year in Charlotte, and both were amazing experiences. I got to meet some of the same SQL experts whose blogs and tweets I read on a regular basis. I made new friends and met DBAs from all over, learning from each of them and getting a new view on how others perform the same tasks I do. I had a chance to reconnect with friends I don’t get to see very often. And every night there were dinners and other fun events, a great way to unwind after the sessions.

But don’t think the Summit is just a week-long party; there are some serious learning opportunities too. The first two days are filled with pre-cons, optional half-day or full-day deep dives into various topics that aren’t part of the regular Summit sessions. I haven’t been to one yet, but I plan to at my next Summit. Then there are the keynotes, 90-minute talks to start the day, where you get a chance to learn Microsoft’s plans for the future of SQL and data. Finally there are the sessions themselves. Each day from Wednesday through Friday there are tons of talks, aligned by a common theme such as development or administration, and there are usually five daily sessions in each track you can mix and match from, depending on what your interests are.

Curses!

Unfortunately I’m not able to go to Seattle this year. I used to mope around during Summit week, cursing my bad luck and vowing to make it next year. I’ll still always vow to make the next Summit, but over the last few years my attitude has changed. Now, instead of feeling sorry for myself, I make my own SQL Summit. I use this time to concentrate on learning more about SQL, especially in areas where my knowledge is lacking (hello, extended events, I’m coming at you this week!). There’s a ton of free resources out there to use, and in this post I’m going to share some of the ones I’ve found.

There’s quite a few just for the Summit:

  • PASStv will be showing content from the Summit, including the keynote sessions. They’ll also show some live sessions and interviews throughout the conference. Last year they repeated the content in the evening, and you can also find it on YouTube; I’m guessing they’ll do the same this time. As of November 1st I don’t see a schedule yet, but watch the site for links. From the Summit home page go to Sessions >> More Learning and watch for the PASStv link.
  • If you can’t watch the keynotes you can read about them later. A number of attendees live blog during the sessions. Brent Ozar (Blog | Twitter) will be one; I don’t know who else, but I’m sure a quick web search will turn them up.
  • You can also follow along live via Twitter. Watch the #summit14 and #sqlpass hashtags, and there are probably others as well.
  • There are more links on the main PASS page. Go to Learning Center >> Session Recordings to see sessions from past Summits; there are complete lists from 2012 and earlier, plus a Best Of from last year. Go to Events >> 24 Hours of PASS to watch recordings of past events, some of which are the same sessions you’d see at the Summit. Go to Events >> PASS SQLRally for even more sessions.
  • Check out the list of SQL Saturday events. While there are none scheduled until November 22nd, browse the list for one near you in the future. The same speakers who present at the Summit also attend these events, and they often show the same content that they did at the Summit.
  • Finally, you can purchase a USB device that contains all of this year’s sessions. PASS will make it available to non-attendees after the Summit. I’ve bought these for the two Summits I attended and I plan to go through and watch them again this week.

If you want just SQL training there’s plenty of that available as well:

  • Microsoft Virtual Academy is a great resource for training on Microsoft products. It’s a mix of live and recorded events, and it’s not just about SQL. You can learn Azure, .NET, Office 365, and much, much more.
  • SQL Skills is a training and consulting company headed up by Kimberly Tripp and Paul Randal. A few years back they produced and released 40 hours’ worth of videos for the now-retired SQL Master certification. While some of the content is dated it’s still worth watching. You can find the content here.
  • And while you’re still on the SQL Skills site, be sure to sign up for the SQLSkills Insider Newsletter. You’ll get an email with Paul discussing something SQL related, along with links to short videos that come from the training sessions SQL Skills does for Pluralsight.
  • Speaking of Pluralsight, it’s worth checking out. This is the only paid site on my list, but I’m lucky enough to have an annual subscription. You can sign up for a free 10-day trial to view the content and see if it fits your needs. If you like it, you can sign up on a month-by-month basis.
  • Pragmatic Works has a series of free webcasts every Tuesday and Thursday. You can sign up to watch the webcasts live (they last about 60 minutes) or you can choose to watch past events. There’s a large list of sessions to choose from.

There’s more I could add but I think you get the picture. Simple web searches will turn up many more. Feel free to add yours in the comments, or contact me and I’ll add them.

And if you are lucky enough to go to Summit 2014, enjoy!


Working with FileTable in SQL – Wrapup

I’m in the process of reviewing my notes and demos for my upcoming session, Introduction to FileTables, at SQL Saturday Orlando. There are a few things I’ve been meaning to write about, so this post covers some of the topics I’ve skipped. Some came from questions I’ve been asked, others from just trying different things.

Multiple Filegroups and FileTables

You can have more than one FileTable inside a database. In my demo I’ve created a few different tables in the same database. If I explore by mapping to the shared path I can easily see each table.

[Screenshot: the FileTable directories visible under the shared path]

However if you browse to where the filestream data is stored you can’t see which is which. Not a big deal since you shouldn’t be accessing the files from here anyway.

[Screenshot: the underlying FILESTREAM data folders, where it isn’t obvious which table is which]

You can have multiple filestream filegroups in a database, but they can’t share space. Each one needs its own directory. For example, you can’t put two filestream filegroups at C:\MyFiles\Data, but you can create one at C:\MyFiles\Data1 and another at C:\MyFiles\Data2.
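If you want to set that up yourself, here’s a minimal sketch; the database, filegroup, and logical file names are my own placeholders, and C:\MyFiles has to exist already (SQL Server creates the Data1 and Data2 folders for you).

-- Two FILESTREAM filegroups, each with its own data container
ALTER DATABASE DemoDB ADD FILEGROUP FileStreamGroup1 CONTAINS FILESTREAM;
ALTER DATABASE DemoDB ADD FILE
(NAME = 'DemoDB_fs1', FILENAME = 'C:\MyFiles\Data1')
TO FILEGROUP FileStreamGroup1;

ALTER DATABASE DemoDB ADD FILEGROUP FileStreamGroup2 CONTAINS FILESTREAM;
ALTER DATABASE DemoDB ADD FILE
(NAME = 'DemoDB_fs2', FILENAME = 'C:\MyFiles\Data2')
TO FILEGROUP FileStreamGroup2;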

Databases usually have only one default filegroup. If you’ve set up filestreaming for your database then you’ll have two defaults, one for your tables and indexes and one for filestream data. Any FileTable you create will be created in the default filestream filegroup, unless you specify which filegroup to use. To do that you add the FILESTREAM_ON option…

CREATE TABLE MyTable AS FILETABLE FILESTREAM_ON Secondary;
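And if you’re not sure which filegroup an existing FileTable ended up in, here’s a quick check against the catalog views (a sketch; it assumes SQL 2012 or later).

-- Show the FILESTREAM filegroup behind each FileTable
SELECT t.name AS filetable_name, ds.name AS filestream_filegroup
FROM sys.tables AS t
JOIN sys.data_spaces AS ds ON ds.data_space_id = t.filestream_data_space_id
WHERE t.is_filetable = 1;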

Security

In Windows you can specify who has access to which directories, including subdirectories. For example, let’s say you have a directory called Documents, with Billing, Human Resources, and Orders subdirectories. You can easily set permissions on the HR directory so only the HR department can access the files, and the same for Billing and Orders. But even though you create directories in FileTables, you can’t set permissions the same way. The directory is a record in the table, and if you have access to the table you have access to the directory.

That’s not to say that you can’t still deny access. There are a few different methods of securing your FileTable files.

The easiest way would be to create a separate FileTable for each department, or however you want to split your files. Then grant access on each table the same way you’d grant access to any other table in the database.
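For example, with hypothetical HR and Billing FileTables and matching database roles, the statements are the same ones you’d use for any other table.

-- Members of HRRole can work with HR documents; BillingRole can't see them
GRANT SELECT, INSERT, UPDATE, DELETE ON dbo.HRDocuments TO HRRole;
DENY SELECT ON dbo.HRDocuments TO BillingRole;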

Another method would be to update the is_hidden column to 1 in the table for each directory and/or file you want to block access to. When someone accesses the FileTable shared directory they won’t see those objects. However, if they have set the option to see hidden objects in Windows Explorer, they’d still be able to see and access them. You can also update the is_offline column to 1, so they can see the objects but not open them, or update the is_readonly column to 1, so they can open the files but not modify them.
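Here’s a rough sketch of what those updates look like; the dbo.Documents table and the file and directory names are just placeholders.

-- Hide a directory from anyone browsing the FileTable share
UPDATE dbo.Documents
SET is_hidden = 1
WHERE is_directory = 1 AND name = 'Human Resources';

-- Leave a file visible but keep it from being opened
UPDATE dbo.Documents
SET is_offline = 1
WHERE name = 'Salaries.xlsx';

-- Let a file be opened but not modified
UPDATE dbo.Documents
SET is_readonly = 1
WHERE name = 'Org Chart.pdf';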

If you only want to store the files in a FileTable, you may want to look at the filestreaming options for your database. You can set FILESTREAM Non-Transacted Access to Off, so the only access is through the database, or set it to ReadOnly to allow access but no modifications.
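Both settings are easy to change with T-SQL as well; these statements assume the DemoDB database from my demos.

-- Shut off access through the file share entirely; T-SQL becomes the only way in
ALTER DATABASE DemoDB SET FILESTREAM (NON_TRANSACTED_ACCESS = OFF);

-- Or allow reading through the share but block any changes
ALTER DATABASE DemoDB SET FILESTREAM (NON_TRANSACTED_ACCESS = READ_ONLY);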

[Screenshot: the database FILESTREAM options in SSMS]

I may have another post on FileTable security in the future. In the meantime, I’d love to hear from you. Let me know if you’re using FileTables, what you like/dislike about them, any tips or tricks you’d like to share.

See you in Orlando in a few weeks!

Phantom FileTable Constraints

I was writing a wrap-up post to my series on FileTables when a co-worker found something interesting. SQL will insert a phantom FileTable constraint when you use SQL Server Management Studio to script out a non-FileTable table that has a clustered index on a field named either path_locator or stream_id. This happens in SQL 2012 SP1 and SP2 as well as SQL 2014. I haven’t tested in SQL 2012 RTM yet.

What my friend was trying to do was create a bridge table to link documents stored in a FileTable with other records in the database. He’s working in a local SQL 2012 SP1 Developer Edition environment. He had already created a FileTable for the documents, and he thought he could store the stream_id in the bridge table, so he defined a primary key constraint on the stream_id field. After a bit he went to create a script of his table. When he ran it to create a new table it failed, because he wasn’t creating a FileTable.

I started to play with it, and it appears to happen only when you create either a clustered index or a primary key constraint on a field that is named either stream_id or path_locator. If you’ve worked with FileTables you’ll know that SQL creates those columns as part of the table definition. It also creates a primary key constraint on path_locator and a unique constraint (plus a default) on stream_id.

Now, before I show some code to recreate the issue, I want to stress that SQL doesn’t actually create the FileTable constraint (it can’t, because it’s not working with a FileTable). It just inserts it into the script that SSMS generates. But if it isn’t spotted and removed, the script will fail with the same error, and the person running it may not know why.

The environment where I’m running this is a Windows Server 2012 R2 virtual machine that has a SQL 2014 Developer edition instance installed. I’ve tested with filestreaming on the instance both enabled and disabled, and with and without a filestream filegroup for the database.

CREATE DATABASE DemoDB
ON PRIMARY
(
NAME = 'DemoDB_data',
FILENAME = 'F:\SQL2014\DATA\DemoDB_data.mdf',
SIZE = 512000KB ,
MAXSIZE = UNLIMITED,
FILEGROWTH = 102400KB
)
LOG ON
(
NAME = 'DemoDB_log',
FILENAME = 'F:\SQL2014\DATA\DemoDB_log.ldf',
SIZE = 102400KB,
MAXSIZE = 2048GB,
FILEGROWTH = 25600KB
);

USE DemoDB;

-- Script to create first table with path_locator as PK
CREATE TABLE dbo.DemoTable1(
path_locator hierarchyid NOT NULL,
document_name varchar (50) NOT NULL,
record_link char(10) NOT NULL,
CONSTRAINT PK_DemoTable1 PRIMARY KEY CLUSTERED
(path_locator)
);

-- Script returned from SSMS, adds FILETABLE_PRIMARY_KEY_CONSTRAINT_NAME
CREATE TABLE [dbo].[DemoTable1](
[path_locator] [hierarchyid] NOT NULL,
[document_name] [varchar](50) NOT NULL,
[record_link] [char](10) NOT NULL,
CONSTRAINT [PK_DemoTable1] PRIMARY KEY CLUSTERED
(
[path_locator] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
WITH
(
FILETABLE_PRIMARY_KEY_CONSTRAINT_NAME=[PK_DemoTable1]
)

-- Script to create second table with no PK
CREATE TABLE dbo.DemoTable2(
path_locator hierarchyid NOT NULL,
document_name varchar (50) NOT NULL,
record_link char(10) NOT NULL
);

-- Script returned by SSMS, no added constraint
CREATE TABLE [dbo].[DemoTable2](
[path_locator] [hierarchyid] NOT NULL,
[document_name] [varchar](50) NOT NULL,
[record_link] [char](10) NOT NULL
) ON [PRIMARY]

--  Script to create 3rd table and add clustered index
CREATE TABLE dbo.DemoTable3(
path_locator hierarchyid NOT NULL,
document_name varchar (50) NOT NULL,
record_link char(10) NOT NULL
);

CREATE CLUSTERED INDEX clus_DemoTable3 ON dbo.DemoTable3
(path_locator);

-- Script returned from SSMS, adds FILETABLE_PRIMARY_KEY_CONSTRAINT_NAME
CREATE TABLE [dbo].[DemoTable3](
[path_locator] [hierarchyid] NOT NULL,
[document_name] [varchar](50) NOT NULL,
[record_link] [char](10) NOT NULL
) ON [PRIMARY]
WITH
(
FILETABLE_PRIMARY_KEY_CONSTRAINT_NAME=[clus_DemoTable3]
)

-- Script to add 4th table, change datatype to int, add as PK
CREATE TABLE dbo.DemoTable4(
path_locator int NOT NULL,
document_name varchar (50) NOT NULL,
record_link char(10) NOT NULL,
CONSTRAINT PK_DemoTable4 PRIMARY KEY CLUSTERED
(path_locator)
);

-- Script returned from SSMS, adds FILETABLE_PRIMARY_KEY_CONSTRAINT_NAME
CREATE TABLE [dbo].[DemoTable4](
[path_locator] [int] NOT NULL,
[document_name] [varchar](50) NOT NULL,
[record_link] [char](10) NOT NULL,
CONSTRAINT [PK_DemoTable4] PRIMARY KEY CLUSTERED
(
[path_locator] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
WITH
(
FILETABLE_PRIMARY_KEY_CONSTRAINT_NAME=[PK_DemoTable4]
)

-- Script to create fifth table with stream_id as PK
CREATE TABLE dbo.DemoTable5(
stream_id uniqueidentifier NOT NULL,
document_name varchar (50) NOT NULL,
record_link char(10) NOT NULL,
CONSTRAINT PK_DemoTable5 PRIMARY KEY CLUSTERED
(stream_id)
);
-- Script returned from SSMS, adds FILETABLE_STREAMID_UNIQUE_CONSTRAINT_NAME
CREATE TABLE [dbo].[DemoTable5](
[stream_id] [uniqueidentifier] NOT NULL,
[document_name] [varchar](50) NOT NULL,
[record_link] [char](10) NOT NULL,
CONSTRAINT [PK_DemoTable5] PRIMARY KEY CLUSTERED
(
[stream_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
WITH
(
FILETABLE_STREAMID_UNIQUE_CONSTRAINT_NAME=[PK_DemoTable5]
)

Working with FileTable Constraints

I’ve been playing a little more with FileTables lately, and I thought I’d share some of what I’ve seen when applying my own constraints. I’m testing with a SQL 2014 Developer Edition instance, but I can recreate everything here in 2012 editions too.

When you make a FileTable, SQL handles the creation of all the objects, including constraints. You can’t drop or modify these, but you can add your own. In the script below I create a constraint on a FileTable to limit the type of files I can insert to just .txt files.
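If you want to follow along, here’s a minimal sketch of creating that FileTable first; it assumes FILESTREAM is already enabled on the instance and that the database has a FILESTREAM filegroup and directory name configured.

CREATE TABLE dbo.LimitType AS FILETABLE
WITH (FILETABLE_DIRECTORY = 'LimitType');
GO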

ALTER TABLE dbo.LimitType  WITH CHECK ADD  CONSTRAINT CK_LimitType_FileName CHECK (file_type='txt');
GO

Now that I have my constraint let’s test it out. Inserting a blank .txt file works, as expected…

INSERT INTO dbo.LimitType(name, file_stream)
SELECT 'InsertedTextFile.txt', 0x

…but when we try to add a blank image file we’re blocked by our constraint…

INSERT INTO dbo.LimitType(name, file_stream)
SELECT 'InsertedImageFile.jpeg', 0x

Msg 547, Level 16, State 0, Line 5
The INSERT statement conflicted with the CHECK constraint “CK_LimitType_FileName”. The conflict occurred in database “DemoDB”, table “dbo.LimitType”, column ‘file_type’.
The statement has been terminated.

…and if you try to copy it into the shared folder you’ll get a different message.

[Screenshot: the error Windows shows when the blocked file is copied to the share]

That’s not a problem; it’s what we wanted. However, there is a problem if you try to limit the size of the files you store in your FileTable. Creating a constraint on the cached_file_size column doesn’t work: you can still insert files larger than the limit you set in your constraint.

To test I’m going to add a check constraint to another FileTable that is supposed to limit the file size to 1 MB.

ALTER TABLE dbo.LimitSize WITH CHECK ADD CONSTRAINT CK_LimitSize_cached_file_size CHECK (cached_file_size <= 1000);
GO

I’ve copied the .iso file that contains the install bits for SQL 2014 Developer Edition, along with a few small .txt files. This is what the shared folder looks like.

[Screenshot: the FileTable shared folder with the .iso and .txt files]

Below, on the left, is what the .iso file properties should look like. On the right is what the properties look like from the FileTable copy.

[Screenshots: the .iso file’s properties on disk (left) and from the FileTable copy (right)]

If I queried the tables I’d see the cached_file_size as 0 there as well.
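For example, a quick query straight from the table shows the same thing.

SELECT name, file_type, cached_file_size
FROM dbo.LimitSize
ORDER BY cached_file_size DESC;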

There is already a Connect item for this, created when SQL 2012 came out, but it was closed and marked as won’t fix. I can verify that this is still happening in the RTM version of SQL 2014, which is where I’ve done my tests.

Copying duplicate files with SQL FileTables

I was going to write about security on FileTables in SQL, but I thought I’d write a quick post about creating duplicate files first. This topic came up as part of a discussion I was having recently on general FileTable usage.

Sometimes you want to make a copy of a file in the same directory where you store the original. Usually you’re editing a large file and you want a copy to revert to in case you need to start over. In Windows it’s pretty easy: right-click the file in question, choose copy, right-click in the directory, and choose paste. You’ll get a second copy, usually with “Copy” appended somewhere inside the file name.

But what if you want to do the same thing with a file stored inside a FileTable in a SQL database? If you have access to the shared directory of the table, the process is the same. Depending on your method you’ll get a screen similar to this one…

[Screenshot: the Windows copy dialog]

…then Windows will create the second file with a similar name, like this.

[Screenshot: the original file and the Windows-created copy in the share]

If you query the database you’ll see both records.

However, what if you try to insert a file with the same name and directory as one that already exists? You can’t do it. There is a unique constraint on the name and parent_path_locator fields; parent_path_locator is a hierarchyid data type that corresponds to the directory where the file is located. You’ll need to either rename the file you’re inserting or update the existing record.
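If you’re curious about the constraints SQL creates, you can list every system-defined object behind a FileTable with the sys.filetable_system_defined_objects catalog view; here’s a sketch using the dbo.Documents FileTable from my demos.

SELECT o.name, o.type_desc
FROM sys.filetable_system_defined_objects AS fso
JOIN sys.objects AS o ON o.object_id = fso.object_id
WHERE fso.parent_object_id = OBJECT_ID('dbo.Documents');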

Of course it’s easy enough to do. Here’s one way of changing the name of the file you’re inserting…

DECLARE @fileName    VARCHAR(25)

SELECT @fileName = 'SF Sailboat.jpg'

IF EXISTS(SELECT 1 FROM dbo.Documents WHERE name = @fileName AND parent_path_locator IS NULL)
INSERT INTO dbo.Documents (name, file_stream)
SELECT 'Copy of ' + @fileName AS name, CAST(bulkcolumn AS VARBINARY(MAX))
FROM OPENROWSET(BULK 'C:\FileTable Demo\Photos\SF Sailboat.jpg', SINGLE_BLOB) AS file_stream

… and here’s a similar way you can rename the existing file using the MERGE command.

DECLARE @image1        VARBINARY(MAX)
DECLARE @fileName    VARCHAR(25)

SELECT @image1 = CAST(bulkcolumn AS VARBINARY(MAX))
FROM OPENROWSET(BULK 'C:\FileTable Demo\Photos\SF Sailboat.jpg', SINGLE_BLOB) AS x

SELECT @fileName = 'SF Sailboat.jpg'

MERGE dbo.Documents AS t
USING (SELECT @fileName AS name, @image1 AS file_stream) AS s
ON t.name = s.name AND t.parent_path_locator IS NULL
WHEN MATCHED THEN
-- a file with this name already exists in the root directory, so rename it
UPDATE SET t.name = 'Copy_' + t.name
WHEN NOT MATCHED THEN
-- no existing file with this name, so insert the new one
INSERT (name, file_stream) VALUES(s.name, s.file_stream);

Now when we navigate to the file share we’ll see all 4 copies – the original, the one we copied using Windows, and the 2 we inserted with T-SQL.

[Screenshot: all four copies in the FileTable share]

That’s easy enough. Just remember, if you insert using T-SQL you’ll need to handle the duplicates yourself.

Where’s my FileTable space?

Last year I wrote a number of posts about using a new feature of SQL 2012, FileTables. Very briefly, FileTables are just tables that store files by using FILESTREAM. If you’re interested you can read my other posts to learn more.

Recently a friend approached me with a question. He had started playing with FileTables, using them to store pictures he had taken over the years. He had inserted a large number of photos, probably 200-300 MB worth. When he queried sys.sysfiles to check on the space he wasn’t seeing any increase; to him the size of the database was the same. Where was the space?

He was just looking in the wrong place. The view sys.sysfiles doesn’t show the space used by the FILESTREAM filegroup’s files.

To help explain I set up another little example. I’m using a virtual server that’s running SQL 2012 Developer Edition, and I’ve created a DemoDB database that has both its data files and its FILESTREAM filegroup on the E drive. Then I wrote a query that joined sys.sysfiles with sys.database_files to see the full picture (a sketch of that query is below). The space when the database had no files inserted looked like this:

[Screenshots: what sys.sysfiles and sys.database_files report before any files are inserted]
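The query itself is nothing fancy. Here’s a minimal sketch of the kind of join I used; the exact column list doesn’t matter, the point is lining the two views up side by side (and the real check for FILESTREAM space is still the folder on disk).

-- size in both views is reported in 8 KB pages
SELECT df.file_id, df.type_desc, df.name, df.physical_name,
df.size / 128 AS size_mb_database_files,
sf.size / 128 AS size_mb_sysfiles
FROM sys.database_files AS df
LEFT JOIN sys.sysfiles AS sf ON sf.fileid = df.file_id
ORDER BY df.file_id;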

Now to insert a large file. I chose to insert the .iso file that contained the media for SQL 2014 CTP2, about 2.5 GB. After the insert I reran my script, and now the space looked like this:

[Screenshots: what sys.sysfiles and sys.database_files report after inserting the 2.5 GB file]

And to further prove the point, I can see the file in the filestream storage;

[Screenshot: the file inside the FILESTREAM data container]

So, you just need to know where to look. This might help you if you ever need to provision a FILESTREAM database or set up FileTables: make sure there’s enough space on the drive that holds your FILESTREAM directory.

And one word of warning. Never try to manipulate the files inside the filestream storage. You can cause some big problems in your database if you do. Resist the urge and access the files through client applications or FileTable shares. 

Last thing. I’ll be speaking about FileTables at #SQLSatMadison on March 29th. There’s still time to register!

I’m speaking at #SQLSatMadison

Have you registered for SQL Saturday Madison yet? If not, do it soon, you’re almost out of time. The date is March 29th and it’s coming up fast, less than 2 weeks to go!

There’s a lot to see at this event. Start with your choice of 3 full day pre-conference sessions on Friday…

…then choose from 40 awesome sessions from some of the top SQL speakers on Saturday. Don’t take my word for it, check them out here, or download the Guidebook app to your iPhone or Android device. There’s lots to choose from, whether you’re looking for beginner-level sessions or advanced ones. And I’ll be making my SQLSaturday speaking debut with An Introduction to FileTables. If you do come to my session be sure to come on up and introduce yourself.

And I haven’t even mentioned Cows of a Spot yet, where you sit and have lunch with other attendees who share similar SQL interests. Want to learn more about report design? There’s a table for you.

Also, if you are coming and you’ve been to one of the past SQL Saturdays in Madison, be aware that the location has changed. It’s now at American Family Insurance Training Center, 6000 American Parkway, Building A, Madison WI, 53783.

See you there!

Studying for SQL 70-462 Exam

At this point in my career I’m not sure I need to take any more Microsoft SQL certification exams. I’ve already taken them for SQL 7 and SQL 2005, plus the 70-448 exam on Business Intelligence. There’s not much more I can accomplish by re-taking them for SQL 2012, especially with the 2014 product launch coming soon.

However that doesn’t mean that I want to stop learning. On the contrary. Each new release has loads of new features that need to be looked at, tested, played with and understood. I’ve always found that studying for the exams was a good way to learn the material.

So to that end I’m going to start working towards taking the 70-462 exam, Administering Microsoft SQL Server 2012 Databases. Who knows, when I’m done maybe I’ll end up taking the test anyway.

As I did when I was studying for the 70-448 exam, I’ll put my notes into a series of blog posts. I’ll also place links to any resource I run across that helps. Please feel free to comment or discuss any points I bring up. We can all learn from each other.

One last thing. Don’t expect to see any test dumps on this site, or any links to them. The best way to learn is to know the material. Believe me, you’re not fooling anyone when you claim to be certified but you can’t answer simple questions when interviewing for a new position.

Resources I’ll be using

The website for the 70-462 exam. Here you’ll see what the exam objectives are, making it easy to concentrate on one area of the exam at a time.
http://www.microsoft.com/learning/en-us/exam-70-462.aspx 

The Microsoft Press Training Kit will be very useful. Not only does it cover the exam objectives, it has a testing engine you can install to test your progress and see which areas of the exam you need to concentrate on. Personally I like to buy my technical books from O’Reilly. If you buy the printed copy you can buy the ebook for another $5, and if you buy the ebook you get the book in epub, mobi, and pdf formats AND you get lifetime updates.
http://shop.oreilly.com/product/0790145345134.do

Microsoft Virtual Academy has live Jump Start events for different Microsoft products. There was one for Administering SQL back in October of 2013. Fortunately these are recorded, so you can still register and watch.
http://www.microsoftvirtualacademy.com/training-courses/administering-microsoft-sql-server-2012-jump-start#?fbid=K9-EWOQ1LfT

Pluralsight is a fantastic site for SQL training, especially now that SQLSkills is adding content. All that content is great and I highly recommend watching it. But a few months ago Pluralsight added Train Signal courses, and they have a 3 part series that focuses on the 70-462 exam. You can try Pluralsight for free for a few weeks to sample the content. If you like it and find it valuable the cost is very reasonable, starting at $29 per month.
http://www.pluralsight.com/training/

I may be adding to this list later. If you have any resource you think would help add it to the comments section.

Good luck!

What would you do?

I’m going to lay out a scenario for you. It involves dealing with a SQL database that has been marked as suspect, and it happened on a server I’m responsible for a few days ago. After I describe the conditions I’ll go over the steps that I took to resolve the issue. But I’m always looking to improve my processes, so I’d like to learn from this. I’d like to hear from you: what I did right, what I did wrong, what I should have done, and what you’ve done in similar situations.

The Scenario

Early one morning you get a series of emails from co-workers letting you know that one of the SQL servers you are responsible for is not allowing any applications to connect. This is a SQL 2005 SP4 instance running on Windows Server 2008 R2 Enterprise Edition with SP1. You try to connect using Management Studio and fail. You remote into the server to see what the problem is. Checking the services you see all SQL services are running. You open Management Studio on the server and you see one of the databases marked as suspect.

Next you check the SQL logs. You see the problem began about 5 1/2 hours earlier. At that time the SQL Agent job that performs a full backup of this database was running. About 15 minutes into the backup there were a number of warnings that SQL had encountered I/O requests taking a long time to complete, one such warning for each database on the server. This is very early in the morning, and the only activity at the time should have been the full backup. Immediately after the warnings was an entry that the database log file was corrupt. Checking the Windows event logs shows that, for a few seconds at least, the SAN drive where that log file resides went offline.

You have full and log backups from the previous day so you can restore the database if needed. The final log backup took place 3 1/2 hours before the corruption occurred. You know of some database activity that took place after the last backup and you know it can be reproduced if needed. But there still is a possibility of at least some data loss.

What I did

My first thought was to try and grab a final log backup so I could restore to just before the log corruption occurred. However I had forgotten that the full backup was running when disaster struck. Indeed, according to current server activity, that backup job was still running almost 6 hours later. This job should have completed in 45 minutes. The backup file was about 1/3 of what it should have been.

Next I restarted the SQL service. I think I was hoping that whatever marked the log as corrupted would somehow be rolled back. Of course I was wrong; the database was still marked as suspect and the log was still marked as corrupted.

Now I needed to see if the database was also corrupted. I ran DBCC CHECKDB but it gave me an error. I didn’t realize I couldn’t run it against a database marked as suspect. So I put the database in emergency and single user mode and ran DBCC again. This time it ran, but unfortunately it also showed there were allocation and consistency errors. The good news was that the only table listed was a table that I knew only held staging data. The bad news was that there appeared to be other objects involved I couldn’t see.

Now here’s where I gambled. I ran DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS. I knew that option is only recommended as a last resort. But I felt I was still protected somewhat. I wanted to keep data loss at a minimum, and I wasn’t worried about losing just the unneeded staging data. I still had the backup files that I could recover from if I was wrong, and at that point all I would have lost would have been time.
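For reference, here’s roughly that sequence of commands, sketched against a hypothetical SuspectDB database. REPAIR_ALLOW_DATA_LOSS really is a last resort, so make sure you have backups to fall back on before you run it.

ALTER DATABASE SuspectDB SET EMERGENCY;
ALTER DATABASE SuspectDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

DBCC CHECKDB ('SuspectDB', REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS, ALL_ERRORMSGS;

ALTER DATABASE SuspectDB SET MULTI_USER;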

And I won. After running CHECKDB with REPAIR_ALLOW_DATA_LOSS I was able to bring the database back online. The corrupted log was rebuilt and the CHECKDB output only showed the single table as being affected. Restoring the previous backups to a different server and comparing record counts showed no missing records. Spot checks didn’t turn up any bad data. 

Lessons Learned

Before anything else I should mention that I informed my managers beforehand of each step I took, and the possible consequences of each of my actions. They were OK with my decisions, even though I could have had the database back up much quicker if I had just restored the backups. Also, this is not my first database disaster. On two occasions I was able to recover SQL instances from SAN failures by restoring from backups, including system databases. I’ve only had to deal with one other suspect database, way back in SQL 7 days. And that was also fixed by restoring from a backup, but then a full day of data was lost. I’ve had plenty of other moments of panic when servers rebooted without warning.

  • I’ve found a hole in the timing of my backup jobs. There was too large a gap between the final log backup and the full backup. Indeed, there was also a big gap between the full backup and the first log backup. Database usage has changed since I created the schedule and I hadn’t been paying attention. That has changed, and my first act after all of this was to plug those gaps. I’m also going to set reminders to check the schedules every 6 months.
  • I’m not comfortable with REPAIR_ALLOW_DATA_LOSS and I don’t know if I’d use it again. True, nothing was lost and I did have the backup files to fall back on, but even so I feel I got lucky.
  • I need to find a better option to repair the log. A better solution may have been to just drop the damaged table and repair the log. I’m going to try to recreate a damaged table and log to experiment with.
  • I know Paul Randal is THE expert on DBCC. I need to go back to his blog on the various options available.

Now It’s Your Turn

I want to hear from you. Email me, leave comments, contact me via Twitter, whatever works. I want to know what steps I’ve missed. I’d love to learn from your experiences.

Most of all, if this happens again, I want to be prepared.

Using PowerShell to check on SQL services

This is just a quick little post to give you another reason to learn PowerShell.

Recently we ran a round of updates on servers across our networks. Some were Windows patches, some were anti-virus updates, others were for various system-related software. But some of them had one thing in common: they required the servers to be rebooted. Normally that’s not a problem, but in our case most SQL services are configured to start manually when the server does. This is to limit any possible issues with SQL when the server crashes and restarts unexpectedly.

So let’s say that 10 of your servers were restarted, and you need to check the status of every SQL service on those 10 servers and start any that are stopped. It’s time consuming to connect to each one, one by one. Wouldn’t it be nice if there was an easier way?

As you’ve probably guessed by now, there is. PowerShell. If ever there was a reason for a SQL DBA to learn PowerShell it’s for cases like this, administering multiple instances.

Back to my issue of SQL services. Open up a PowerShell console and run a single line to check the status of all SQL services on a server:

Get-Service -computername SERVER1 | Where-Object{$_.DisplayName -like "*SQL*"} | format-table -property Status, Name, DisplayName

That’s good, but not all that we want. Remember, we’re looking at 10 instances, so we don’t want to run this 10 times. What we do instead is enter a comma-separated list of servers for the computer name. And to see which service is on which server we can add the MachineName property to our table.

Get-Service -computername SERVER1, SERVER2, SERVER3 | Where-Object{$_.DisplayName -like "*SQL*"} | format-table -property MachineName, Status, Name, DisplayName

Now that you can see the status it’s just a matter of stopping, starting, or restarting the right service. In the examples below I’m working with the SSRS service on SERVER1.

Start-Service -inputobject $(Get-Service -computername SERVER1 -name ReportServer)

Stop-Service -inputobject $(Get-Service -computername SERVER1 -name ReportServer)

Restart-Service -inputobject $(Get-Service -computername SERVER1 -name ReportServer)

See how simple that is? Go out and pick up a little PowerShell. It will make your life as a DBA much easier.