CPU Pressure Part 2

In my post ‘CPU Pressure Part 1’ (https://justinsetliffe.com/2015/05/28/cpu-pressure-part-1/) we looked at excessive query compilation and optimization. In this post we will dig further into CPU issues, looking specifically at unnecessary recompilation; all of this material can be found in Microsoft’s Troubleshooting Performance Problems in SQL Server 2008 (https://msdn.microsoft.com/en-us/library/dd672789(v=sql.100).aspx).

When batches or RPCs (remote procedure calls) are submitted to the SQL Server engine, SQL checks for a valid, correct query plan it can reuse. If that check fails, the batch may have to be recompiled to produce a different query plan. These recompilations are needed for correctness and to ensure an optimal query plan is used, but recompiling batches is CPU intensive and can cause problems on the system.

Since SQL Server 2005, only the statements in a stored procedure that need to be recompiled get recompiled. In SQL Server 2000 the entire procedure would be recompiled, which caused more CPU pressure and locking. Recompiles happen in the following situations:

  • Schema changes
  • Statistics changes
  • Deferred compilation
  • SET option changes
  • Temporary table changes
  • Stored procedure creation with the WITH RECOMPILE option, or use of the OPTION (RECOMPILE) query hint

How do you detect these recompiles? One way is to use Performance Monitor.

The SQL Statistics object provides counters to monitor compilation and the type of requests that are sent to an instance of SQL Server. You have to monitor the number of query compilations and recompilations in conjunction with the number of batches received to find out whether the compilations are contributing to high CPU use. Ideally, the ratio of SQL Recompilations/sec to Batch Requests/sec should be very low, unless users are submitting ad hoc queries.

These are the key data counters:

  • SQL Server: SQL Statistics: Batch Requests/sec
  • SQL Server: SQL Statistics: SQL Compilations/sec
  • SQL Server: SQL Statistics: SQL Recompilations/sec

For more information, see SQL Statistics Object (http://msdn.microsoft.com/en-us/library/ms190911.aspx) in SQL Server 2008 Books Online.
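
If you would rather pull these numbers with T-SQL than open Performance Monitor, the same counters are exposed through the sys.dm_os_performance_counters DMV. A minimal sketch (the values are cumulative since instance startup, so take two samples and diff them if you want true per-second rates):

-- sample the cumulative SQL Statistics counters
select counter_name, cntr_value
from sys.dm_os_performance_counters
where object_name like '%SQL Statistics%'
and counter_name in ('Batch Requests/sec',
                     'SQL Compilations/sec',
                     'SQL Re-Compilations/sec')

A quick sanity check is the ratio of the recompilation counter to the batch counter; if it is more than a few percent, recompiles deserve a closer look.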

Another way is to use SQL Server Profiler Trace.

SP:Recompile / SQL:StmtRecompile

The SP:Recompile and SQL:StmtRecompile event classes indicate which stored procedures and statements have been recompiled. When you compile a stored procedure, one event is generated for the stored procedure and one for each statement that is compiled. However, when a stored procedure recompiles, only the statement that caused the recompilation is recompiled. Some of the more important data columns for the SP:Recompile event class are listed below; the EventSubClass data column in particular is important for determining the reason for the recompilation. SP:Recompile is triggered once for the procedure or trigger that is recompiled and is not fired for an ad hoc batch that is likely to be recompiled. In SQL Server 2008 and SQL Server 2005, it is more useful to monitor SQL:StmtRecompile, because this event class fires when any type of batch, ad hoc query, stored procedure, or trigger is recompiled.

The key data columns to look at in these events are as follows.

  • EventClass
  • EventSubClass
  • ObjectID (represents stored procedure that contains this statement)
  • SPID
  • StartTime
  • SqlHandle
  • TextData

For more information, see SQL:StmtRecompile Event Class (http://technet.microsoft.com/en-us/library/ms179294.aspx) in SQL Server 2008 Books Online.

If you have a trace file saved, you can use the following query to see all the recompilation events that were captured in the trace.

SELECT
    spid,
    StartTime,
    TextData,
    EventSubClass,
    ObjectID,
    DatabaseID,
    SqlHandle
FROM
    fn_trace_gettable ('e:\recompiletrace.trc', 1)
WHERE
    EventClass in (37, 75, 166)

EventClass 37 = SP:Recompile, 75 = CursorRecompile, 166 = SQL:StmtRecompile

You can also group the results from this query by the SqlHandle and ObjectID columns, or by various other columns, to see whether most of the recompilations are attributable to one stored procedure or to some other reason (such as a SET option that has changed).
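
For example, a grouped version of the trace query might look like the sketch below. I group on the integer columns here (SqlHandle comes back as a binary column, so cast it first if you want to group on it):

-- count recompile events per object per reason
SELECT
    DatabaseID,
    ObjectID,
    EventSubClass,
    count(*) as recompile_count
FROM
    fn_trace_gettable ('e:\recompiletrace.trc', 1)
WHERE
    EventClass in (37, 75, 166)
GROUP BY DatabaseID, ObjectID, EventSubClass
ORDER BY recompile_count DESC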

Showplan XML For Query Compile

The Showplan XML For Query Compile event class occurs when SQL Server compiles or recompiles a Transact-SQL statement. This event has information about the statement that is being compiled or recompiled and includes the query plan and the object ID of the procedure in question. Capturing this event has significant performance overhead, because it is captured for each compilation or recompilation. If you see a high value for the SQL Compilations/sec counter in Performance Monitor, you should monitor this event. With this information, you can see which statements are frequently recompiled. You can use this information to change the parameters of those statements. This should reduce the number of recompilations.

DMVs

When you use the sys.dm_exec_query_optimizer_info DMV, you can get a good idea of the time SQL Server spends optimizing. If you take two snapshots of this DMV, you can get a good feel for the time that is spent optimizing in the given time period.

SELECT * FROM sys.dm_exec_query_optimizer_info;

Counter          Occurrence   Value
optimizations    121          1
elapsed time     121          0.026198347107438

The elapsed time value is the time elapsed due to optimizations, which should be close to the CPU time spent, since optimization is mostly a CPU-bound activity. (Note that the value column reports an average per occurrence, so multiply it by occurrence for a running total.)
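
Here is a minimal sketch of the two-snapshot approach. Because value is an average per occurrence, occurrence * value recovers a running total you can diff across the window:

-- first snapshot
SELECT counter, occurrence, value
INTO #optimizer_snapshot
FROM sys.dm_exec_query_optimizer_info;

-- the measurement window (one minute here)
WAITFOR DELAY '00:01:00';

-- diff the second snapshot against the first
SELECT
    s.counter,
    o.occurrence - s.occurrence as occurrences_in_window,
    (o.occurrence * o.value) - (s.occurrence * s.value) as total_value_in_window
FROM #optimizer_snapshot s
JOIN sys.dm_exec_query_optimizer_info o ON o.counter = s.counter
WHERE s.counter in ('optimizations', 'elapsed time');

DROP TABLE #optimizer_snapshot;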

Another DMV that is useful for capturing this information is sys.dm_exec_query_stats.

The data columns to look at are as follows:

  • sql_handle
  • total_worker_time
  • plan_generation_num
  • statement_start_offset

For more information, see sys.dm_exec_query_stats (http://msdn.microsoft.com/en-us/library/ms189741.aspx) in SQL Server 2008 Books Online.

In particular, plan_generation_num indicates the number of times the query has recompiled. The following sample query gives you the top 25 stored procedures that have been recompiled.


SELECT TOP 25
    sql_text.text,
    sql_handle,
    plan_generation_num,
    execution_count,
    dbid,
    objectid
FROM
    sys.dm_exec_query_stats a
    CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS sql_text
WHERE
    plan_generation_num > 1
ORDER BY plan_generation_num DESC;

If you detect excessive compilation and recompilation, consider the following options:

  • If a changed SET option caused the recompilation, use SQL Profiler to determine which SET option changed. Avoid changing it mid-batch in the future; it is better to set options at the connection level.
  • Recompile thresholds for temporary tables are lower than for normal tables. If statistics changes on temp tables are causing recompilations, use table variables as opposed to temp tables.
  • Use the KEEP PLAN query hint. This sets the recompile threshold of temporary tables to be the same as for regular tables. The EventSubClass column displays “Statistics Changed” for an operation on a temporary table.
  • To avoid recompilations that are due to changes in statistics, specify the KEEPFIXED PLAN query hint. This option makes sure recompilations can only happen to ensure correctness (for example, when the underlying table structure has changed and the plan no longer applies) and not in response to changes in statistics. For example, a recompilation can still occur if the schema of a table that is referenced by a statement changes, or if a table is marked with the sp_recompile stored procedure.
  • Turning off automatic updates of statistics for indexes and statistics defined on a table or indexed view will prevent recompilations caused by statistics changes on those objects. Turning off the auto-stats feature this way is not usually a good idea, though, because the query optimizer is no longer sensitive to data changes in those objects and suboptimal query plans might result. Use this only as a last resort.
  • Batches should use fully qualified object names (dbo.Table4) to avoid recompiles and ambiguity.
  • To avoid recompilations due to deferred compilation, do not interleave DML and DDL, and do not create DDL from conditional constructs such as IF statements.
  • Run index tuning tools (I happen to love Brent Ozar Unlimited’s free suite of tools) to see if new or different indexes can help execution times.
  • Check whether the stored procedure was created with the WITH RECOMPILE option or uses the RECOMPILE query hint.
  • If a procedure was created with the WITH RECOMPILE option, in SQL Server 2008 or SQL Server 2005 you may be able to take advantage of a statement-level RECOMPILE hint if a particular statement within that procedure needs to be recompiled. Using this hint at the statement level avoids the necessity of recompiling the whole procedure each time it executes, while still allowing the individual statement to be recompiled (see the sketch after this list). For more query hint information check out SQL 2014 Books Online (https://msdn.microsoft.com/en-us/library/ms181714.aspx).
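
To illustrate that last point, here is a minimal sketch; the procedure and table names are hypothetical. Only the statement carrying the hint is recompiled on each execution, while the rest of the procedure keeps its cached plan:

CREATE PROCEDURE dbo.usp_GetRows
    @FilterValue int
AS
BEGIN
    -- this statement keeps its cached plan
    SELECT count(*) FROM dbo.Table4;

    -- only this statement is recompiled on each execution,
    -- so it can be optimized for the current @FilterValue
    SELECT col1
    FROM dbo.Table4
    WHERE col1 = @FilterValue
    OPTION (RECOMPILE);
END;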

 

Find Database Causing Deadlock

I’m a big fan of using SQL gurus’ queries and products. Why am I going to dig through SQL DMVs if someone a lot smarter than I am has already done it for me? My favorites are scripts from Brent Ozar’s people and Steve Stedman’s Database Health Monitor. While running Brent Ozar’s sp_Blitz I saw an entry for deadlocks, quite a large number of them. My first thought was ‘OK, now how am I going to track this down?’ For now I just want to find the problem database; once I have that I can dig in and see what issues the database has.

First I need to look into Extended Events, the event-handling system for SQL Server that was introduced in SQL 2008.

Here is my query:

select XEventData.XEvent.value('(data/value)[1]', 'varchar(max)') as DeadlockGraph
FROM
    (select CAST(target_data as xml) as TargetData
     from sys.dm_xe_session_targets st
     join sys.dm_xe_sessions s on s.address = st.event_session_address
     where name = 'system_health') AS Data
CROSS APPLY TargetData.nodes ('//RingBufferTarget/event') AS XEventData (XEvent)
where XEventData.XEvent.value('@name', 'varchar(4000)') = 'xml_deadlock_report'

That gives me the following:

UPDATE [dbo].[DeadLockTest2] set [col1] = @1
UPDATE dbo.DeadLockTest2 SET col1 = 1
UPDATE dbo.DeadLockTest2 SET col1 = 1
UPDATE [dbo].[DeadLockTest] set [col1] = @1
UPDATE dbo.DeadLockTest SET col1 = 1
BEGIN TRAN
UPDATE dbo.DeadLockTest2 SET col1 = 1
UPDATE dbo.DeadLockTest SET col1 = 1

For this example I didn’t use my production database, but instead loaded some deadlocks into AdventureWorks2012 for some test data.

If, for example, I had hundreds of databases, this wouldn’t tell me which database the table lives in, so I need one more query to fully track down where the deadlock came from.

We will use this:

create table #t ( DBName sysname not null )
go
exec sp_msforeachdb 'use [?]; if OBJECT_ID(''dbo.mytable'') is not null insert into #t (DBName) select ''?'''
go
select * from #t
go
drop table #t

Obviously we have to change ''dbo.mytable'' to ''dbo.DeadLockTest2''.

Here are the results:

DBName
AdventureWorks2012

Now we can further look under the hood into that database and see what may be causing problems.
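
As an alternative to hunting the table name across every database, the deadlock graph itself records the database of each participating process, so you can often pull it straight out of the XML. Here is a sketch building on the earlier query; note the node path below assumes the newer xml_deadlock_report shape and may need adjusting on SQL 2008, where the XML layout differs slightly:

-- pull the current database id out of each deadlock's process list
select DB_NAME(XEventData.XEvent.value(
        '(data/value/deadlock/process-list/process/@currentdb)[1]', 'int'))
        as deadlock_database
FROM
    (select CAST(target_data as xml) as TargetData
     from sys.dm_xe_session_targets st
     join sys.dm_xe_sessions s on s.address = st.event_session_address
     where name = 'system_health') AS Data
CROSS APPLY TargetData.nodes ('//RingBufferTarget/event') AS XEventData (XEvent)
where XEventData.XEvent.value('@name', 'varchar(4000)') = 'xml_deadlock_report'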

CPU Pressure Part 1

When looking into performance issues, one area to examine is CPU pressure. CPU pressure basically means the hardware cannot keep up with the load. Once the offending loads are identified, there are ways to see if those queries can be tuned for better performance before rushing out to buy new hardware.

For this blog, we will be looking at excessive query compilation and optimization, all of which can be found in Microsoft’s Troubleshooting Performance Problems in SQL Server 2008 (https://msdn.microsoft.com/en-us/library/dd672789(v=sql.100).aspx).

Optimizing and compiling queries is a CPU-intensive operation, and the more complex a query, the higher the cost to optimize. To keep this cost as low as possible, SQL Server caches and reuses query plans. For each new query, SQL Server searches the plan cache (also called the procedure cache) for a previously compiled plan it can use. If no plan is found, SQL has to create one before the query is run.
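
You can watch this reuse happening in the plan cache with sys.dm_exec_cached_plans; a quick sketch is below. Plans with high usecounts are being reused; a cache full of single-use ad hoc plans means SQL Server is compiling over and over:

-- most-reused plans currently in the cache
SELECT TOP 20
    cp.usecounts,
    cp.cacheobjtype,
    cp.objtype,
    t.text
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) as t
ORDER BY cp.usecounts DESC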

During compilation, SQL Server 2008 computes a “signature” of the query, which is exposed in the query_hash column of both sys.dm_exec_requests and sys.dm_exec_query_stats, and as the QueryHash attribute in Showplan/Statistics XML. Entries with the same query_hash value have a good probability of being the same query text, had it been written in a parameterized form (SQL Server prefers parameters over literal values for this very reason). Queries that vary only in their literal values should have the same query_hash value. For example, the first two queries below share the same query hash, while the third has a different query hash, because it is performing a different operation.

SELECT * FROM sys.objects WHERE object_id = 100

SELECT * FROM sys.objects WHERE object_id = 101

SELECT * FROM sys.objects WHERE [name] = 'sysobjects'

When the query hash is computed, white space is ignored, as are differences between an explicit column list and an asterisk (*) in the SELECT list, and between fully qualified names and just the table name. All of the following should produce the same query_hash value.

USE AdventureWorks
GO
SET showplan_xml ON
GO
-- Assume this is run by a user whose default schema is Sales
SELECT * FROM SalesOrderHeader h

SELECT * FROM Sales.SalesOrderHeader h

SELECT SalesOrderID, RevisionNumber, OrderDate, DueDate, ShipDate, Status,
       OnlineOrderFlag, SalesOrderNumber, PurchaseOrderNumber, AccountNumber,
       CustomerID, ContactID, SalesPersonID, TerritoryID, BillToAddressID,
       ShipToAddressID, ShipMethodID, CreditCardID, CreditCardApprovalCode,
       CurrencyRateID, SubTotal, TaxAmt, Freight, TotalDue, Comment,
       rowguid, ModifiedDate
FROM Sales.SalesOrderHeader h
GO
SET showplan_xml OFF
GO

Note that the database portion of the fully qualified name is ignored when the query_hash value is generated. This allows resource usage to be aggregated across all queries in systems that replicate the same schema and queries against many databases on the same instance. An easy way to detect applications that submit lots of ad hoc queries is by grouping on the sys.dm_exec_query_stats.query_hash column as follows.

SELECT
    q.query_hash,
    q.number_of_entries,
    t.text as sample_query,
    p.query_plan as sample_plan
FROM
    (SELECT TOP 20
         query_hash,
         count(*) as number_of_entries,
         min(sql_handle) as sample_sql_handle,
         min(plan_handle) as sample_plan_handle
     FROM sys.dm_exec_query_stats
     GROUP BY query_hash
     HAVING count(*) > 1
     ORDER BY count(*) desc) as q
    cross apply sys.dm_exec_sql_text(q.sample_sql_handle) as t
    cross apply sys.dm_exec_query_plan(q.sample_plan_handle) as p
GO

Queries that have a number_of_entries value in the hundreds or thousands are perfect candidates for parameterization. If you look at the CompileTime and CompileCPU attributes under the QueryPlan tag of the sample XML query plan and multiply those values by the number_of_entries value for that query, you can estimate how much compile time and CPU you could eliminate by parameterizing the query (meaning the query is compiled once, then cached and reused for subsequent executions). Fixing these queries has a cascading benefit: CPU usage drops, and there is more memory to cache other plans, thus more memory for the buffer cache.

SQL Server also produces a query_plan_hash value that represents the “signature” of the query plan’s access path (that is, which join algorithm is used, the join order, index selection, and so forth). If an application relies on getting different query plans for different parameter values, you do not want to parameterize the query.

The values query_hash and query_plan_hash can be combined to determine if a set of ad-hoc queries with the same query_hash value resulted in query plans with the same or different query_plan_hash values, or access path. A small modification is done to our earlier query:

SELECT
    q.query_hash,
    q.number_of_entries,
    q.distinct_plans,
    t.text as sample_query,
    p.query_plan as sample_plan
FROM
    (SELECT TOP 20
         query_hash,
         count(*) as number_of_entries,
         count(distinct query_plan_hash) as distinct_plans,
         min(sql_handle) as sample_sql_handle,
         min(plan_handle) as sample_plan_handle
     FROM sys.dm_exec_query_stats
     GROUP BY query_hash
     HAVING count(*) > 1
     ORDER BY count(*) desc) as q
    cross apply sys.dm_exec_sql_text(q.sample_sql_handle) as t
    cross apply sys.dm_exec_query_plan(q.sample_plan_handle) as p
GO

Note that this new query returns a count of the number of distinct query plans (query_plan_hash values) for a given query_hash value. Rows that return a large number for number_of_entries and a distinct_plans count of 1 are good candidates for parameterization. Even if the number of distinct plans is more than one, you can use sys.dm_exec_query_plan to retrieve the different query plans and examine them to see whether the difference is important and necessary for achieving optimal performance.

Once queries that can be parameterized have been identified, the best place to parameterize them is at the client application, though the specifics will of course vary from application to application.
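
If you cannot change the client right away, the same effect is available in T-SQL through sp_executesql. A minimal sketch using the sys.objects example from earlier; both calls below reuse a single cached plan, no matter which literal the user supplies:

-- one plan, reused for any @id value
EXEC sp_executesql
    N'SELECT * FROM sys.objects WHERE object_id = @id',
    N'@id int',
    @id = 100;

EXEC sp_executesql
    N'SELECT * FROM sys.objects WHERE object_id = @id',
    N'@id int',
    @id = 101;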

Next post, we will look into unnecessary recompilation.

Available Memory Quick Check

You just got a call from an application owner telling you his application isn’t running as fast as normal, and can you check the SQL instance his database is sitting on to see what’s going on? You first want to check what’s going on in terms of memory being used (at least I do), but you just want something simple that returns a quick message saying whether the instance is under any memory pressure. The question is which of the numerous memory-based DMVs to use. For me, if I want a quick check of the memory being used, I look at sys.dm_os_sys_memory. This DMV was introduced in SQL 2008 and returns memory information from the operating system.

Here is the query:

select
    total_physical_memory_kb / 1024 as phys_mem_mb,
    available_physical_memory_kb / 1024 as avail_phys_mem_mb,
    system_cache_kb / 1024 as sys_cache_mb,
    (kernel_paged_pool_kb + kernel_nonpaged_pool_kb) / 1024 as kernel_pool_mb,
    total_page_file_kb / 1024 as total_page_file_mb,
    available_page_file_kb / 1024 as available_page_file_mb,
    system_memory_state_desc
from sys.dm_os_sys_memory

Here is the result:

[Screenshot: result set from sys.dm_os_sys_memory]

What is this telling me?

phys_mem_mb – Total physical memory available to the operating system (MB)

avail_phys_mem_mb – Physical memory currently available (MB)

sys_cache_mb – Total system cache memory (MB)

kernel_pool_mb – Paged and nonpaged kernel pools combined (MB)

total_page_file_mb – Size of the commit limit reported by the operating system (MB)

available_page_file_mb – Amount of page file not currently in use (MB)

system_memory_state_desc – Description of the memory state. This value comes from an API function described here – https://msdn.microsoft.com/en-us/library/aa366541(VS.85).aspx

The entire list of columns available in the sys.dm_os_sys_memory DMV can be found here – https://msdn.microsoft.com/en-us/library/bb510493.aspx

Now, thanks to that last column, I know my memory is OK, and I can look into something else (which we will get into in a later post).
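
If all you want is that quick yes/no answer, the check can be collapsed to the state column alone. A small sketch, assuming you treat any of the ‘low’ states as a red flag:

-- quick yes/no memory pressure check
IF EXISTS (SELECT 1 FROM sys.dm_os_sys_memory
           WHERE system_memory_state_desc like '%low%')
    SELECT 'Possible memory pressure - dig deeper' as quick_check;
ELSE
    SELECT 'Memory looks OK' as quick_check;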

One of the things I’ve done at my company is create an SSRS report based on this query and have it emailed to me every morning. This way I can check on my instances’ memory right from my email. Also, if I do get any calls regarding performance issues, I can simply click a bookmark in my browser and bring up the report right away. This saves me from having to log into a jump server, and then into my SQL instance, to run this query.

Implicit Conversions – Number Three

For the last two weeks we have looked at why SQL Implicit Conversions (ICs) are bad and how to identify a query containing them.  This week we will look at a few ways to correct a query that contains an IC.

Here is the query we will be working with:

SELECT [BusinessEntityID]
      ,[NationalIDNumber]
      ,[LoginID]
      ,[OrganizationNode]
      ,[OrganizationLevel]
  FROM [HumanResources].[Employee]
  WHERE NationalIDNumber = 879342154
GO

The first thing to notice is that the value compared to NationalIDNumber is not surrounded with single quotes, which means SQL will treat it as a numeric value. If we look at the table HumanResources.Employee we see that the column NationalIDNumber is an nvarchar data type. Using Microsoft’s chart on data type conversions (https://msdn.microsoft.com/en-us/library/ms191530.aspx) we can see that this will cause an implicit conversion.

[Screenshot: Microsoft’s data type conversion chart]

The first way we can fix this is to surround the value in the WHERE clause with single quotes to tell SQL that this is a string value and not numeric. Make sure to turn on Include Actual Execution Plan.

SELECT [BusinessEntityID]
      ,[NationalIDNumber]
      ,[LoginID]
      ,[OrganizationNode]
      ,[OrganizationLevel]
  FROM [HumanResources].[Employee]
  WHERE NationalIDNumber = '879342154'
GO

When we run this query a number of things become apparent. First, the yellow warning sign has gone away.

[Screenshot: execution plan with the warning gone]

Next, in the Execution Plan window, one part of the query has gone from an Index Scan to an Index Seek. This means SQL is now using the index rather than scanning the whole table to find the data (less IO).

[Screenshot: the plan now shows an Index Seek]

We can also look at the IO differences between a full table scan and an Index Seek. The screenshots below show that an Index Seek is indeed much, much more efficient in terms of IO (a sketch to reproduce the comparison follows them).

[Screenshots: STATISTICS IO output for the scan versus the seek]
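
To reproduce that comparison yourself rather than read it off screenshots, SET STATISTICS IO prints the logical reads for each form of the query in the Messages tab. A quick sketch:

SET STATISTICS IO ON;

-- index scan: the implicit conversion is applied to every row
SELECT BusinessEntityID FROM HumanResources.Employee
WHERE NationalIDNumber = 879342154;

-- index seek: the literal already matches the column type
SELECT BusinessEntityID FROM HumanResources.Employee
WHERE NationalIDNumber = '879342154';

SET STATISTICS IO OFF;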

There is also another way to remove the IC from the execution plan. When SQL runs a query it gives that query a plan signature and puts that signature into memory. When subsequent queries run, SQL checks these signatures against what is being run; if they match, SQL can reuse the existing plan and not have to generate a new one. So why is this important? Take the above query as our example and pretend that the column NationalIDNumber holds not just numbers but also string values. When a query comes through that uses a string value, such as WHERE NationalIDNumber = 'TextExample', SQL will consider this different from our original query (WHERE NationalIDNumber = 879342154) and will generate a new plan. How can we avoid this? Use parameters. Parameterized queries match, and reuse, existing cached plans far better. Let’s change our query to use a parameter by first declaring the parameter and then setting it to a specific value.

DECLARE @ID nvarchar(15)
SET @ID = '879342154'

-- Now we run the query
SELECT [BusinessEntityID]
      ,[NationalIDNumber]
      ,[LoginID]
      ,[OrganizationNode]
      ,[OrganizationLevel]
  FROM [HumanResources].[Employee]
  WHERE NationalIDNumber = @ID

So now we have identified what implicit conversions are, why they are bad, how to find them, and finally a few ways to improve your query to remove them. I hope this helps.

Implicit Conversions – Part Deux

Last week I wrote about the dangers of Implicit Conversions (ICs) in SQL query code (https://justinsetliffe.com/2015/04/10/implicit-conversions-2/).  Just to summarize, there are numerous reasons why they are bad.

This week let’s find out how to identify these ICs; it’s pretty easy to do.

The first way applies when you have a query you want to run in SSMS. For this example we will run a query against Microsoft’s AdventureWorks database. This is a play/test database that can be downloaded from Microsoft (http://msftdbprodsamples.codeplex.com/). Here is the query we will run:

SELECT [BusinessEntityID]
      ,[NationalIDNumber]
      ,[LoginID]
      ,[OrganizationNode]
      ,[OrganizationLevel]
      ,[JobTitle]
      ,[BirthDate]
      ,[MaritalStatus]
      ,[Gender]
      ,[HireDate]
      ,[SalariedFlag]
      ,[VacationHours]
      ,[SickLeaveHours]
      ,[CurrentFlag]
      ,[rowguid]
      ,[ModifiedDate]
  FROM [HumanResources].[Employee]
  WHERE NationalIDNumber = 879342154
GO

Make sure to select ‘Include Actual Execution Plan’ from the Query menu in SSMS, and execute the query. Once your query runs you’ll notice a new tab called ‘Execution plan’.

[Screenshot: the Execution plan tab in SSMS]

If you click on that tab you’ll notice a big ‘!’ inside of a yellow triangle. That is telling you something is wrong with your query. Inside the execution plan window for this query you’ll also notice an Index Scan, which means the index for that table is not being used (if there ever was an index).

[Screenshot: execution plan showing the warning and the Index Scan]

From my first article we saw that ICs can cause a table scan, so that operator is a great place to see what is going on. Mousing over it shows that there is indeed an implicit conversion happening.

[Screenshot: operator tooltip showing the implicit conversion warning]

Now, is there an easier way to identify ICs that are happening on your database? Yes, there is a query that Jonathan Kehayias (of SQLskills fame) wrote. Just one FYI for this query: if you run it against a large database it can take a while. Here is a link to his query – http://www.sqlskills.com/blogs/jonathan/finding-implicit-column-conversions-in-the-plan-cache/ (a rough sketch of the core idea follows).
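
The core idea behind his script, reduced to a rough sketch of my own (his version parses the plan XML properly and filters far more carefully; this crude text search just shows the concept):

-- crude search of cached plans for implicit conversions
SELECT TOP 20
    t.text,
    qs.execution_count,
    p.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) as p
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) as t
WHERE CAST(p.query_plan as nvarchar(max)) like '%CONVERT_IMPLICIT%'
ORDER BY qs.execution_count DESC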

So far we have identified why ICs are bad in a query and how to find them, so what can we do next week? I think finding ways to fix them is now called for. After all, you want your queries to be as efficient (I love that word, by the way) as possible; it means more free time for you and fewer calls from your clients with complaints.

Implicit Conversions

So the gauntlet has been thrown down by Sir Brent Ozar and his merry band of Supra-Genius Ultimates. Make a blog he says. Pick a topic he says. Well, as a famous philosopher once said, “Go do it right now. Seriously, you have nothing better to do. You’re reading my blog, for crying out loud.” OK, it was that Brent guy again.

He’s right, so here we go. What to write about for the first blog…..??? I know, something near and dear to my pain: Implicit Conversions and SQL. Insert screaming at random intervals. So, what exactly are implicit conversions and why are they bad?

1) Per Microsoft – Implicit conversions are not visible to the user. SQL Server automatically converts the data from one data type to another. For example, when a smallint is compared to an int, the smallint is implicitly converted to int before the comparison proceeds.

2) Can result in increased CPU usage for query (bad).

3) Can cause the query to do a table scan instead of a seek (bad SQL, very bad; see the sketch after this list).

4) Can lead to the DBA getting an increase in the number of “My app is running slow” calls from their clients.
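
Here is a minimal demonstration of points 1 and 3, using a hypothetical throwaway table. Comparing an nvarchar column to an int literal forces the conversion onto the column side, which turns a seek into a scan:

-- hypothetical demo table
CREATE TABLE dbo.ConvDemo (
    id int identity primary key,
    AccountNumber nvarchar(15)
);
CREATE INDEX IX_ConvDemo_AcctNum ON dbo.ConvDemo (AccountNumber);

-- index scan: every AccountNumber must be implicitly converted to int
SELECT id FROM dbo.ConvDemo WHERE AccountNumber = 12345;

-- index seek: the literal already matches the column type
SELECT id FROM dbo.ConvDemo WHERE AccountNumber = N'12345';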

Microsoft even has a URL with a nice little picture of all the data types and which ones can create implicit conversions when used together – https://msdn.microsoft.com/en-us/library/ms191530.aspx.

Next week, how to identify said implicit conversions……or maybe something else. Only the Shadow knows….I may know also.

UPDATE – Jonathan Kehayias of SQLskills has an updated mapping of data types showing which data type pairings cause either scans or seeks. Please use this going forward, thank you.

https://www.sqlskills.com/blogs/jonathan/implicit-conversions-that-cause-index-scans/
