Saturday, August 13, 2011

Book Review–Microsoft SQL Server 2008 R2 Administration Cookbook

 

Brief Review:

Microsoft SQL Server 2008 R2 Administration Cookbook is a collection of recipes that help database administrators with day-to-day tasks.  The recipes cover a broad spectrum of topics, including piecemeal restores, optimistic concurrency, Data-Tier Applications, Master Data Services, replication, multi-server management, Utility Control Point, and more.  This book is a great reference for accomplishing daily administrative tasks and challenges.

Detailed Review:

There are few books available that can keep your technical curiosity engaged while presenting the information in an easy-to-understand manner.  Microsoft SQL Server 2008 R2 Administration Cookbook is one such book.  This book delivers quality content in a straightforward manner that will benefit all database administrators.  My favorite thing about this book is that it delivers real-world solutions to real-world problems.  In today’s IT environment, the database administrator can be a seasoned IT professional with 10 to 20 years of database experience, or what some people call an accidental DBA.  An accidental DBA is someone who assumes the role of database administrator, either through necessity or circumstance, and usually has little to no real-world experience in the role.  This book caters to both the seasoned and the accidental database administrator. 

As the name implies, Microsoft SQL Server 2008 R2 Administration Cookbook focuses on the administrative side of SQL Server.  This book contains over 70 quality recipes that database administrators can use in their day-to-day duties.  The great thing about this book is the diversity of the recipes.  The author does a great job of covering SQL Server administration from many different angles, using a variety of emerging technologies.  The recipes include Resource Governor, multi-server administration, business intelligence, high availability, and more.  There is an enormous amount of information spanning a broad spectrum of topics.  Without sounding cliché, there is something for everybody in this book.

The author’s writing style is clear and concise.  You do not have to be a Microsoft Certified Master to understand the fundamental concepts presented in each recipe.  Another thing I like about this book is the code samples.  I cannot tell you how many times I have cracked open a book and had to analyze and reread a code sample to understand what it is doing.  This book does not have that problem.  The code samples are very easy to understand and follow.  By keeping the samples simple, the author delivers a better experience to the reader.

My favorite chapters in this book are Chapter 7 – Managing the Core Database Engine and Chapter 5 – Managing Core SQL Server 2008 R2 Technologies, because these chapters contain a lot of useful information that many database administrators do not know about, including Utility Control Point, SQL Azure, StreamInsight, and Master Data Services. 

The Verdict: 5/5

I recommend Microsoft SQL Server 2008 R2 Administration Cookbook to any IT professional currently working as a database administrator, or anyone who wants to get into database administration with SQL Server 2008 R2.  This book is a great administrative reference that you will want to keep close to your desk, regardless of experience level.

Tuesday, March 15, 2011

TSQL Challenge Quiz: Win an iPad And Bragging Rights!

First and foremost, sorry for the long delay between posts; I recently switched jobs and my laptop crashed.  I am just now getting back into a steady routine and will be posting more regularly in the coming weeks, so stay tuned.  Now that I have the formalities out of the way… how does a FREE iPad sound?

TSQLChallenges.com is currently running TSQL Quiz 2011.  TSQL Quiz 2011 poses a T-SQL question each day in March 2011.  Each question is put together by SQL Server experts and community leaders to address SQL Server problem areas and/or best practices.  TSQL Quiz is a great opportunity to interact with fellow database professionals, strengthen your SQL Server knowledge, and most importantly win an iPad, compliments of Red Gate Software!  Even if you feel like you are not online enough to compete, do not fret, because the questions remain open for 30 days.  You really do not have anything to lose.

If you are interested in participating in TSQL Quiz 2011, you can start by clicking http://beyondrelational.com/quiz/SQLServer/TSQL/2011/default.aspx.  All the information you need is provided on the site.

If you want to help TSQL Challenges by becoming a Quiz Master, you can click here, http://beyondrelational.com/quiz/nominations/0/new.aspx


Good luck and happy coding.

Thursday, August 19, 2010

SQL Server Parameter Sniffing

Today on the MSDN TSQL forums I was asked about a performance problem, and to me the problem seemed to be directly related to parameter sniffing.  The poster then stated that he is not using stored procedures, so it cannot be a parameter sniffing problem.  Truth be told, there are a lot of misconceptions surrounding parameter sniffing.  The best way to understand parameter sniffing is to understand why it happens. 

Parameter sniffing occurs when a parameterized query uses cached cardinality estimates to make query plan decisions.  The problem occurs when the first execution uses atypical parameter values.  For each subsequent execution, the optimizer assumes the cached estimates are good, even though they may be way off.  For example, say you have a stored procedure that returns all id values between 1 and 1000.  If the stored procedure is first executed with this large range of parameter values, the optimizer caches these atypical values, which indirectly causes it to misestimate cardinality for later executions.  The problem is that a typical execution may only return a few rows.  This “sniffing” can cause queries to scan a table as opposed to seek, because the optimizer is working from inaccurate cardinality estimates.  The easiest way to tell whether this problem is occurring in your environment is to look at the query plan XML.  Inside the query plan XML, you will see something similar to the code snippet below:

<ColumnReference Column="@1" ParameterCompiledValue="(1000)" ParameterRuntimeValue="(10)" />
<ColumnReference Column="@0" ParameterCompiledValue="(1)" ParameterRuntimeValue="(1)" />

In the snippet above, the query plan assumes that parameter @1 has a value of 1000 and parameter @0 has a value of 1, while the actual runtime values are 10 and 1, respectively. 
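If you would rather hunt for this pattern across the plan cache than open plans one at a time, something along these lines works as a starting point.  This is a rough sketch only; the string search is crude, and you will want to filter it further for your own workload.

-- Rough sketch: surface cached plans whose XML carries compiled parameter values
SELECT TOP (50)
    st.[text]      AS query_text,
    qp.query_plan  AS plan_xml,
    cp.usecounts
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE CAST(qp.query_plan AS NVARCHAR(MAX)) LIKE N'%ParameterCompiledValue%'
ORDER BY cp.usecounts DESC;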

There are three different methods to incorporate parameterization in SQL Server: auto/simple parameterization, stored procedures, and dynamic TSQL (executed with sp_executesql).  One of the most common misconceptions I have seen surrounding parameter sniffing is thinking that it is limited to stored procedures.  Now that we know more about parameter sniffing, let's have a look at an example.  I will be using the AdventureWorks database for my example.  In this example, I will select a few rows from the Sales.SalesOrderHeader table and then issue the same query, but return a lot more rows.

Code:

SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between 1 and 10
SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between 1 and 500

Query Plan:

[Query plan screenshot: the 10-row query uses an index seek with a key lookup, while the 500-row query uses a clustered index scan]

As you can see, the query plan changes based on the number of rows returned.  The reason is that, in this case, the optimizer hit a tipping point where the cost of the key lookups is greater than the cost of an index scan.  Let’s see what happens when a parameter sniffing problem occurs.

DBCC FREEPROCCACHE
GO
EXEC sp_executesql N'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End;',N'@Start INT,@End INT',@Start=1,@End=500
GO
EXEC sp_executesql N'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End;',N'@Start INT,@End INT',@Start=1,@End=10
GO

Query Plans:

[Query plan screenshot: both executions reuse the clustered index scan plan compiled for the 1–500 range]

The execution plans are identical for both queries, even though the number of rows greatly decreased.  This is a parameter sniffing problem.  It occurs because we executed and cached the atypical execution that returns CustomerID values between 1 and 500.  We can look at the compiled parameter values inside the execution plan, and at the estimated row counts, to validate this.

<ColumnReference Column="@End" ParameterCompiledValue="(500)" ParameterRuntimeValue="(10)" />
<ColumnReference Column="@Start" ParameterCompiledValue="(1)" ParameterRuntimeValue="(1)" />

What can you do to solve the parameter sniffing problem?  You have a few options.  You can use local variables (this makes the optimizer use the density information of the table to estimate cardinality), use OPTION (RECOMPILE), or use the OPTIMIZE FOR hint. 

--Declare local variables
EXEC sp_executesql N'declare @dynStart INT,@dynEnd INT; SET @dynStart=@Start; SET @dynEnd=@End;SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @dynStart and @dynEnd;',N'@Start INT,@End INT',@Start=1,@End=500
EXEC sp_executesql N'declare @dynStart INT,@dynEnd INT; SET @dynStart=@Start; SET @dynEnd=@End;SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @dynStart and @dynEnd;',N'@Start INT,@End INT',@Start=1,@End=10

--Solution Using option(recompile)
EXEC sp_executesql N'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End OPTION(RECOMPILE);',N'@Start INT,@End INT',@Start=1,@End=500
EXEC sp_executesql N'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End OPTION(RECOMPILE);',N'@Start INT,@End INT',@Start=1,@End=10

--Solution Using OPTIMIZE FOR HINT
EXEC sp_executesql N'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End OPTION(OPTIMIZE FOR (@Start=1,@End=10));',N'@Start INT,@End INT',@Start=1,@End=500
EXEC sp_executesql N'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End OPTION(OPTIMIZE FOR (@Start=1,@End=10));',N'@Start INT,@End INT',@Start=1,@End=10

--Solution Using OPTIMIZE FOR UNKNOWN (SQL Server 2008 and later)
EXEC sp_executesql N'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End OPTION(OPTIMIZE FOR (@Start UNKNOWN,@End UNKNOWN));',N'@Start INT,@End INT',@Start=1,@End=500
EXEC sp_executesql N'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End OPTION(OPTIMIZE FOR (@Start UNKNOWN,@End UNKNOWN));',N'@Start INT,@End INT',@Start=1,@End=10

Each of the above methods will help alleviate some of the problems associated with parameter sniffing, but that does not mean it will give you an optimal query plan.  You should test each method to see which makes the most sense for your environment.  If none of these options perform well, another option is to use control flow logic to execute different variations of the TSQL or stored procedure, allowing more control over which execution plan gets used.  The thing to remember is that you have to cater to your customers and their usage patterns to ultimately decide which solution is best for your environment.
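Here is a rough sketch of the control flow approach; the procedure names and the 50-row threshold are made up for illustration.  The idea is to route narrow and wide ranges to separate inner procedures, so each one caches a plan compiled for its own typical values.

CREATE PROCEDURE dbo.usp_GetOrders_NarrowRange @Start INT, @End INT
AS
    SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID BETWEEN @Start AND @End;
GO

CREATE PROCEDURE dbo.usp_GetOrders_WideRange @Start INT, @End INT
AS
    SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID BETWEEN @Start AND @End;
GO

CREATE PROCEDURE dbo.usp_GetOrders @Start INT, @End INT
AS
BEGIN
    -- Each inner procedure is compiled with the parameter values of its own first call,
    -- so the narrow and wide executions no longer share one cached plan.
    IF (@End - @Start) <= 50
        EXEC dbo.usp_GetOrders_NarrowRange @Start, @End;
    ELSE
        EXEC dbo.usp_GetOrders_WideRange @Start, @End;
END
GO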

Until next time happy coding.

Saturday, August 7, 2010

TSQL Challenge 35 Available

TSQLChallenges.com recently released Challenge 35, “Find the total number of 'Full Attendees' in each 24 HOP Session”.  For those not familiar with TSQL Challenges, it is a website that creates and evaluates SQL Server puzzles each and every week.  The goal of TSQL Challenges is to increase TSQL best practice awareness and to showcase solutions to common, and sometimes uncommon, TSQL problems using set-based programming logic.  Not only do you compete in challenges, but more importantly TSQL Challenges gives you the opportunity to interact with your peers.  Essentially it is a mechanism to give back to and learn from the SQL Server community.  If you haven’t had a chance to stop by and check out TSQL Challenges, I highly recommend you do so: TSQLChallenges.com.

So… what is the challenge?  The challenge, should you choose to accept it, is to count the number of attendees who fully watched each session at 24 Hours of PASS.  Note: this data is artificial and does not reflect real 24 Hours of PASS metrics.  If you love puzzles, TSQL, and PASS, this challenge is for you. 

Good luck and happy coding.

Saturday, July 17, 2010

Order By Does Not Always Guarantee Sort Order

A week or so ago, I saw an interesting question on the MSDN SQL Server forums and I thought it would make a great blog post.  The forum question asked about an ORDER BY clause that does not guarantee sort order.  The query was really two queries merged together via a UNION.  The OP noticed that when UNION ALL was used, the sort order was different than the same query using UNION, even though an ORDER BY clause was used.  If you are familiar with UNION and UNION ALL, you know that UNION has to remove duplicates (here via a distinct sort), while UNION ALL does not.  The query plans for the two queries are identical, other than a sort versus a distinct sort.

Here is a small scale repro of the problem.

SET NOCOUNT ON;
GO

IF OBJECT_ID('tempdb.dbo.#t') IS NOT NULL
DROP TABLE #t;
GO

CREATE TABLE #t(id INT, col CHAR(1),col2 BIT);

INSERT INTO #t VALUES (1,'a',1)
INSERT INTO #t VALUES (1,'a',0)
GO

SELECT id, col, col2
FROM #t 

UNION ALL

SELECT id,col,col2
FROM #t 
ORDER BY id, col
GO

SELECT id,col,col2
FROM #t 

UNION

SELECT id,col,col2
FROM #t 
ORDER BY id, col
GO

/*
id          col  col2
----------- ---- -----
1           a    1
1           a    0
1           a    1
1           a    0


id          col  col2
----------- ---- -----
1           a    0
1           a    1
*/

As you can see, the order of col2 is not the same between the two queries.  The root of this problem is that I am using columns that contain duplicates in the ORDER BY clause, and col2 is not included in the ORDER BY clause.  I can never guarantee the order of the “duplicate” rows, because I cannot guarantee how the optimizer will build and execute the query plan.  In this example, the UNION query sorts by all columns in the select list, which includes col2, while the UNION ALL query does not.  You can guarantee that the order will be id, col, but the col2 value order may vary between executions.  You will need to add col2 to the ORDER BY clause to guarantee the sort.

SELECT id, col, col2
FROM #t 

UNION ALL

SELECT id,col,col2
FROM #t 
ORDER BY id, col, col2
GO

SELECT id,col,col2
FROM #t 

UNION

SELECT id,col,col2
FROM #t 
ORDER BY id, col, col2
GO

/*
id          col  col2
----------- ---- -----
1           a    0
1           a    0
1           a    1
1           a    1


id          col  col2
----------- ---- -----
1           a    0
1           a    1
*/

I thought this was a good reminder that even with an ORDER BY clause specified, the order of the rows may not be what you expect.  You have to use an ORDER BY clause and make sure all the columns you want to sort by are listed in it.

Until next time happy coding.

Thursday, July 8, 2010

Breaking the Print character limit

I got some grief regarding my SQL Meme post about PRINT.  I specifically stressed that I believe PRINT needs a makeover because of its inability to handle max data types, http://jahaines.blogspot.com/2010/05/sql-meme-tagged-5-things-sql-server.html.  I know I am not the only person out there who feels this functionality is a bit antiquated.  In this post, I will provide a great alternative to PRINT.  I have been using this method for the past year or so to print really long dynamic SQL.  The concept is very simple.  Instead of printing the dynamic SQL to the messages tab, I will convert the dynamic SQL to XML.  XML is a great alternative because it keeps the formatting and can hold up to 2 GB of data.  The key component here is naming the column [processing-instruction(x)].  This column name emits an XML processing instruction, which allows the text to be rendered intact, along with any special characters.  It should be noted that whatever value you put in the parentheses will be used in the XML tags, in my case “x”.
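For reference, here is a minimal repro of the limitation itself: PRINT emits at most 8,000 characters of a VARCHAR(MAX) value to the messages tab, even though the variable holds the full string.

DECLARE @big VARCHAR(MAX);
SET @big = REPLICATE(CAST('a' AS VARCHAR(MAX)), 10000);

SELECT LEN(@big) AS StoredLength; -- 10000: the variable holds the whole string
PRINT @big;                       -- the messages tab shows only the first 8,000 characters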

Let’s have a look at how this works.

DECLARE @sql VARCHAR(MAX)
SET @sql =
    CAST(REPLICATE('a',5000) + CHAR(13) AS VARCHAR(MAX)) +
    CAST(REPLICATE('b',5000) + CHAR(13) AS VARCHAR(MAX)) +
    CAST(REPLICATE('c',5000) + CHAR(13) AS VARCHAR(MAX)) + 
    'd'
    
SELECT [processing-instruction(x)]=@sql FOR XML PATH(''),TYPE

[Screenshot: the query returns a single clickable XML value containing the full dynamic SQL text]

Pretty simple, right?!  There really is not much to this technique.  It is very simple and gets the job done.  If you find yourself getting aggravated with makeshift PRINT solutions, come on over to the dark side and get your XML on.

Until next time, happy coding.

Friday, June 11, 2010

Why Are Transactions Blocked All Of A Sudden?

Have you ever had a query that runs perfectly fine one day and then all of a sudden starts getting bombarded with blocking transactions?  Believe it or not, this is not that uncommon an occurrence, and more interestingly it can occur when no schema changes happen at all!  Unbeknownst to most, you are susceptible to an invisible data distribution tipping point that can go south at any point in time, if your application creates a specific type of workload.  Let’s dig deeper to find out what causes this problem.

I will start off by creating some sample data.

USE [tempdb]
GO
IF EXISTS(SELECT 1 FROM sys.tables WHERE NAME = 'TestData')
BEGIN
    DROP TABLE dbo.[TestData];
END
GO

CREATE TABLE dbo.TestData(
RowNum INT PRIMARY KEY CLUSTERED,
SomeBit INT,
SomeCode CHAR(2)
);
GO

INSERT INTO dbo.TestData
SELECT TOP 5000 
    ROW_NUMBER() OVER (ORDER BY t1.NAME) AS RowNumber,
    CASE WHEN ROW_NUMBER() OVER (ORDER BY t1.NAME) %5 = 0 THEN 1 ELSE 0 END AS SomeBit, 
    'A' AS SomeCode
FROM 
    Master.dbo.SysColumns t1,
    Master.dbo.SysColumns t2
GO

Nothing new here, just a table with some data.  Now I will begin a transaction and run a simple UPDATE statement.

BEGIN TRANSACTION

UPDATE dbo.TestData
SET SomeCode = 'B'
WHERE somebit = 0

Now run a simple select statement against the table in a new query window.

SELECT RowNum FROM dbo.TestData WHERE RowNum = 1000

/*
RowNum
-----------
1000
*/

The query returned a resultset, just as we thought it would.  What I wanted to show here is that we currently do not have a blocking problem, and users can still access the rows that do not have a SomeBit value of 0.  SQL Server will try to take the lowest or most granular lock possible when satisfying a query, so that other queries can still do what they need to.  Obviously there are limitations to this, and SQL Server reacts differently based on system responsiveness and pressures.  You can verify that you cannot access a row with a SomeBit value of 0 by changing the predicate to a number that is not divisible by 5.
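For example, RowNum 1001 is not divisible by 5, so its SomeBit value is 0 and it is exclusively locked by the open UPDATE; running the query below in a new window will block until that transaction commits or rolls back.

SELECT RowNum FROM dbo.TestData WHERE RowNum = 1001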

Okay… big deal… you probably already know this.  But let's suppose that your manager tells you to migrate data from an existing system into this table.  The flat file has a measly 3,000 rows in it, so its overall impact should really have no implications for our system, right?  Let’s find out.  Please note that this problem can also manifest itself simply by hitting a tipping point in data volume, meaning it does not take a huge influx of data to cause it; this is why the problem can appear seemingly out of nowhere.

I will load the data with the same insert statement to mimic our data migration.

INSERT INTO dbo.TestData
SELECT TOP 3000 
    ROW_NUMBER() OVER (ORDER BY t1.NAME) + 5000 AS RowNumber, -- offset past the existing 5,000 rows to avoid primary key violations
    CASE WHEN ROW_NUMBER() OVER (ORDER BY t1.NAME) %5 = 0 THEN 1 ELSE 0 END AS SomeBit, 
    'A' AS SomeCode
FROM 
    Master.dbo.SysColumns t1,
    Master.dbo.SysColumns t2
GO

Now let’s run our same update transaction again.

-- Roll back or commit the first transaction if it is still open, then:
BEGIN TRANSACTION

UPDATE dbo.TestData
SET SomeCode = 'B'
WHERE somebit = 0

Now run the query below in a new window.

SELECT RowNum FROM dbo.TestData WHERE RowNum = 1000

What you should see is that the query is now blocked.  Keep in mind that nothing changed on this server, except that new data was inserted into the table.  Any idea why this problem is now occurring?  If you haven’t figured it out yet, this problem is caused by lock escalation, http://msdn.microsoft.com/en-us/library/ms184286.aspx.  When SQL Server meets certain thresholds or memory pressure exists, SQL Server will escalate locks.  Lock escalation unfortunately goes from very granular locks to much less granular locks: SQL Server escalates straight to a table lock from rid/key or page locks.  What does this mean?  It means that SQL Server can save memory by acquiring one less granular lock, as opposed to a lot of granular locks.  You can look at the transaction locks for each of the UPDATE statements to verify that lock escalation is occurring.

Note: I filtered the output down to exclusive (X) locks, which also removes the intent locks from the resultset.

SELECT * 
FROM sys.[dm_tran_locks] 
WHERE [request_session_id] = xx -- the session id (SPID) of the window running the UPDATE
AND [request_mode] = 'X' -- exclusive locks only; this also excludes intent locks

 

Initial Update Query

[Screenshot: sys.dm_tran_locks shows many individual KEY exclusive locks for the first UPDATE]

Lock Escalated Update Query

[Screenshot: sys.dm_tran_locks shows a single OBJECT (table-level) exclusive lock after escalation]

If you run a lot of big DML transactions in your environment and still require concurrency, you may want to pay careful attention to lock escalation; otherwise, you may experience an abnormally large number of blocks.  While lock escalation is great in most cases, in others it is less than ideal.
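If you want to keep an eye on how often escalation is actually happening on a table, the index operational stats DMV exposes promotion counters.  Here is a minimal sketch against our test table in tempdb:

-- Sketch: how many times lock escalation was attempted and succeeded on dbo.TestData
SELECT 
    OBJECT_NAME(ios.[object_id], ios.database_id) AS table_name,
    ios.index_id,
    ios.index_lock_promotion_attempt_count,
    ios.index_lock_promotion_count
FROM sys.dm_db_index_operational_stats(DB_ID('tempdb'), OBJECT_ID('tempdb.dbo.TestData'), NULL, NULL) AS ios;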

Here are the thresholds as described by BOL, http://msdn.microsoft.com/en-us/library/ms184286.aspx

Lock Escalation Thresholds


Lock escalation is triggered when lock escalation is not disabled on the table by using the ALTER TABLE SET LOCK_ESCALATION option, and when either of the following conditions exists:

  • A single Transact-SQL statement acquires at least 5,000 locks on a single nonpartitioned table or index.

  • A single Transact-SQL statement acquires at least 5,000 locks on a single partition of a partitioned table and the ALTER TABLE SET LOCK_ESCALATION option is set to AUTO.

  • The number of locks in an instance of the Database Engine exceeds memory or configuration thresholds.

If locks cannot be escalated because of lock conflicts, the Database Engine periodically triggers lock escalation at every 1,250 new locks acquired.

Now that we have identified the problem, what can we do to fix it?  There are a number of options that can be used to solve this problem.  One solution is to use the NOLOCK hint in your query, or the read uncommitted isolation level.  This particular solution is not recommended for all OLTP environments and should only be implemented with careful consideration.  The NOLOCK hint and the read uncommitted isolation level can return inconsistent data.  If, and only if, that is okay should you consider this as a solution.  Another solution is to use the read committed snapshot isolation level or the snapshot isolation level.  Both of these solutions require tempdb overhead, but they do return transactionally consistent data.  You can read more about these isolation levels here, http://msdn.microsoft.com/en-us/library/ms189122.aspx.  The other approach is to restrict lock escalation.  You can disable lock escalation at the instance level (trace flags 1211 and 1224) or at the table level in SQL Server 2008, using the ALTER TABLE statement.  Obviously, removing lock escalation should be carefully thought out and tested.  For more information on these trace flags, please visit the storage engine’s blog http://blogs.msdn.com/b/sqlserverstorageengine/archive/2006/05/17/lock-escalation.aspx.
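Here is a rough sketch of those last options.  Test carefully before using any of them; the table-level syntax requires SQL Server 2008, and the database name below is a placeholder, since READ_COMMITTED_SNAPSHOT cannot be enabled on tempdb.

-- Table level (SQL Server 2008): stop escalation on this table entirely
ALTER TABLE dbo.TestData SET (LOCK_ESCALATION = DISABLE);

-- Instance level: trace flag 1224 disables escalation based on lock count
-- (escalation can still occur under memory pressure); 1211 disables it entirely
DBCC TRACEON (1224, -1);

-- Database level: row versioning so readers are not blocked by writers
-- (replace YourDatabase with a user database; requires no other active connections)
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;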

There you have it.  I have shown you how simply adding data to your table can literally turn a fast application into a blocking nightmare overnight.  Before you go off and start adding NOLOCK hints or changing your isolation level, please understand that you should only take these steps if you are actually experiencing this problem.  The reality is that an OLTP system should not be holding more than 5,000 locks (a lock escalation tipping point) in a single transaction, because transactions should be short and efficient.  If you are experiencing this problem, your workload is probably OLAP, you are missing indexes, or your queries are not typical OLTP transactions.  For example, you probably have large DML transactions and users trying to query the table concurrently.  

Until next time happy coding.