Category Archives: SQL Server

CPU Pressure Part 1

When investigating performance issues, one area to examine is CPU pressure.  CPU pressure basically means the hardware cannot keep up with the load.  Once the offending workloads are identified, there are ways to see whether those queries can be tuned for better performance before rushing out to buy new hardware.

In this post we will be looking at excessive query compilation and optimization, a topic covered in Microsoft’s Troubleshooting Performance Problems in SQL Server 2008 (https://msdn.microsoft.com/en-us/library/dd672789(v=sql.100).aspx).

Optimizing and compiling queries is a CPU-intensive operation, and the more complex the query, the higher the cost to optimize it. To keep this cost as low as possible, SQL Server caches and reuses query plans. For each new query, SQL Server searches the plan cache (also called the procedure cache) for a previously compiled plan it can reuse. If no such plan exists, SQL Server has to compile one before the query can run.
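
If you want to peek at what is sitting in the plan cache right now, and how often each plan has been reused, a quick way is to join sys.dm_exec_cached_plans to sys.dm_exec_sql_text. This is just a sketch (it can return a lot of rows on a busy instance), but it gives you a feel for how much plan reuse you are actually getting:

-- Sketch: top cached plans by reuse count (usecounts), with their query text
SELECT TOP 20
    cp.usecounts,      -- how many times this cached plan has been reused
    cp.cacheobjtype,
    cp.objtype,
    t.text as query_text
FROM sys.dm_exec_cached_plans as cp
cross apply sys.dm_exec_sql_text(cp.plan_handle) as t
ORDER BY cp.usecounts desc
GO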

During compilation, SQL Server 2008 computes a “signature” of the query, which is exposed in the query_hash column of both sys.dm_exec_requests and sys.dm_exec_query_stats, and as the QueryHash attribute in Showplan/Statistics XML. Entries with the same query_hash value have a good probability of being the same query text, if it had been written in a parameterized form (SQL Server prefers parameters over literal values for this very reason). Queries that vary only in their literal values should share the same query_hash value. For example, the first two queries below share the same query hash, while the third has a different query hash because it is performing a different operation.

SELECT * FROM sys.objects WHERE object_id = 100

SELECT * FROM sys.objects WHERE object_id = 101

SELECT * FROM sys.objects WHERE [name] = 'sysobjects'

White space is ignored when the query hash is computed, as are differences between an explicit column list and an asterisk (*) in the SELECT list, and between fully qualified table names and bare table names. All of the following should produce the same query_hash value.

USE AdventureWorks
GO
SET showplan_xml ON
GO
-- Assume this is run by a user whose default schema is Sales
SELECT * FROM SalesOrderHeader h
SELECT * FROM Sales.SalesOrderHeader h

SELECT SalesOrderID,
       RevisionNumber,
       OrderDate,
       DueDate,
       ShipDate,
       Status,
       OnlineOrderFlag,
       SalesOrderNumber,
       PurchaseOrderNumber,
       AccountNumber,
       CustomerID,
       ContactID,
       SalesPersonID,
       TerritoryID,
       BillToAddressID,
       ShipToAddressID,
       ShipMethodID,
       CreditCardID,
       CreditCardApprovalCode,
       CurrencyRateID,
       SubTotal,
       TaxAmt,
       Freight,
       TotalDue,
       Comment,
       rowguid,
       ModifiedDate
FROM Sales.SalesOrderHeader h

GO

SET showplan_xml OFF

GO

Note that the database portion of the fully qualified name is ignored when the query_hash value is generated. This allows resource usage to be aggregated across all queries in systems that replicate the same schema and queries against many databases on the same instance. An easy way to detect applications that submit lots of ad hoc queries is by grouping on the sys.dm_exec_query_stats.query_hash column as follows.

SELECT
    q.query_hash,
    q.number_of_entries,
    t.text as sample_query,
    p.query_plan as sample_plan
FROM
(
    SELECT TOP 20 query_hash,
        count(*) as number_of_entries,
        min(sql_handle) as sample_sql_handle,
        min(plan_handle) as sample_plan_handle
    FROM sys.dm_exec_query_stats
    GROUP BY query_hash
    HAVING count(*) > 1
    ORDER BY count(*) desc
) as q
cross apply sys.dm_exec_sql_text(q.sample_sql_handle) as t
cross apply sys.dm_exec_query_plan(q.sample_plan_handle) as p
GO

Queries that have a number_of_entries value in the hundreds or thousands are perfect candidates for parameterization. If you look at the CompileTime and CompileCPU attributes under the <QueryPlan> tag of the sample XML query plan and multiply those values by the number_of_entries value for that query, you can estimate how much compile time and CPU you could eliminate by parameterizing the query (which means the query is compiled once, cached, and reused for subsequent executions). Fixing these queries has a cascading benefit: a reduction in CPU usage, more memory available to cache other plans, and thus more memory for the buffer cache.
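
If you would rather get a rough number than eyeball each plan, something like the sketch below can work: it shreds each sample plan for the CompileTime and CompileCPU attributes (reported in milliseconds) and multiplies them by the number of cached entries per query_hash. Be warned that shredding every plan in the cache is itself expensive, so treat this as an illustration of the arithmetic rather than something to run constantly in production:

-- Sketch: rough estimate of the compile cost that parameterization could reclaim
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP 20
    qs.query_hash,
    count(*) as number_of_entries,
    max(c.compile_time_ms) as sample_compile_time_ms,
    max(c.compile_cpu_ms) as sample_compile_cpu_ms,
    count(*) * max(c.compile_cpu_ms) as est_total_compile_cpu_ms
FROM sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_query_plan(qs.plan_handle) as p
cross apply (SELECT
                 p.query_plan.value('(//QueryPlan/@CompileTime)[1]', 'int') as compile_time_ms,
                 p.query_plan.value('(//QueryPlan/@CompileCPU)[1]', 'int')  as compile_cpu_ms) as c
GROUP BY qs.query_hash
HAVING count(*) > 1
ORDER BY count(*) desc
GO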

SQL Server also produces a query_plan_hash value that represents the “signature” of the query plan’s access path (that is, which join algorithms are used, the join order, index selection, and so forth). If an application relies on getting different query plans for different parameter values, you do not want to parameterize the query.

The query_hash and query_plan_hash values can be combined to determine whether a set of ad hoc queries with the same query_hash value resulted in query plans with the same or different query_plan_hash values, that is, the same or different access paths. A small modification to our earlier query does this:

SELECT
    q.query_hash,
    q.number_of_entries,
    q.distinct_plans,
    t.text as sample_query,
    p.query_plan as sample_plan
FROM
(
    SELECT TOP 20 query_hash,
        count(*) as number_of_entries,
        count(distinct query_plan_hash) as distinct_plans,
        min(sql_handle) as sample_sql_handle,
        min(plan_handle) as sample_plan_handle
    FROM sys.dm_exec_query_stats
    GROUP BY query_hash
    HAVING count(*) > 1
    ORDER BY count(*) desc
) as q
cross apply sys.dm_exec_sql_text(q.sample_sql_handle) as t
cross apply sys.dm_exec_query_plan(q.sample_plan_handle) as p
GO

Note that this new query returns a count of the number of distinct query plans (query_plan_hash values) for a given query_hash value. Rows that return a large number for number_of_entries and a distinct_plans count of 1 are good candidates for parameterization. Even if the number of distinct plans is more than one, you can use sys.dm_exec_query_plan to retrieve the different query plans and examine them to see whether the difference is important and necessary for achieving optimal performance.
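
If one of those rows does show more than one distinct plan, you can pull every cached variant for that query_hash and compare the plans side by side. Here is a minimal sketch; the hash below is just a placeholder you would replace with a query_hash value from the previous result set:

-- Sketch: list every cached plan variant for a single query_hash
DECLARE @hash binary(8)
SET @hash = 0x0000000000000000  -- placeholder, replace with a real query_hash
SELECT
    qs.query_plan_hash,
    qs.execution_count,
    t.text as query_text,
    p.query_plan as query_plan
FROM sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) as t
cross apply sys.dm_exec_query_plan(qs.plan_handle) as p
WHERE qs.query_hash = @hash
ORDER BY qs.query_plan_hash
GO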

Once queries that can be parameterized have been identified, the best place to parameterize them is in the client application, which of course varies from application to application.
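
The parameterization itself ultimately belongs in the client code (for example, parameterized commands in ADO.NET), but if you want to see the effect from T-SQL, sp_executesql behaves the same way: one parameterized statement text that is compiled once and then reused. A minimal sketch using the earlier sys.objects example:

-- Both executions reuse the same cached plan, because the literal has been
-- replaced by a parameter in the statement text
EXEC sp_executesql
    N'SELECT * FROM sys.objects WHERE object_id = @id',
    N'@id int',
    @id = 100

EXEC sp_executesql
    N'SELECT * FROM sys.objects WHERE object_id = @id',
    N'@id int',
    @id = 101
GO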

In the next post, we will look into unnecessary recompilation.

Available Memory Quick Check

You just got a call from an application owner telling you his application isn’t running as fast as normal, and asking if you can check the SQL instance his database is sitting on to see what’s going on.  The first thing to check (at least for me) is memory usage, but you want something simple that returns a quick answer as to whether the instance is under any memory pressure.  The question is, which of the numerous memory-based DMVs do you use?  For a quick check of the memory being used, I look at sys.dm_os_sys_memory.  This DMV was introduced in SQL Server 2008 and returns memory information from the operating system.

Here is the syntax:

select
    total_physical_memory_kb / 1024 as phys_mem_mb,
    available_physical_memory_kb / 1024 as avail_phys_mem_mb,
    system_cache_kb / 1024 as sys_cache_mb,
    (kernel_paged_pool_kb + kernel_nonpaged_pool_kb) / 1024 as kernel_pool_mb,
    total_page_file_kb / 1024 as total_page_file_mb,
    available_page_file_kb / 1024 as available_page_file_mb,
    system_memory_state_desc
from sys.dm_os_sys_memory

Here is the result:

[Screenshot: sys.dm_os_sys_memory output showing the memory values and the system_memory_state_desc message]

What is this telling me?

phys_mem_mb – Total physical memory available to the operating system (in MB)

avail_phys_mem_mb – Physical memory currently available (in MB)

sys_cache_mb – Total amount of system cache memory (in MB)

kernel_pool_mb – Total amount of the paged and nonpaged kernel pools combined (in MB)

total_page_file_mb – Size of the commit limit reported by the operating system (in MB)

available_page_file_mb – Total amount of page file that is not being used (in MB)

system_memory_state_desc – Description of the memory state.  This value comes from an API function described here – https://msdn.microsoft.com/en-us/library/aa366541(VS.85).aspx

The entire list of columns available in the sys.dm_os_sys_memory DMV can be found here – https://msdn.microsoft.com/en-us/library/bb510493.aspx

Now, thanks to that last column, I know my memory is OK and I can look into something else (which we will go into in a later post).
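
If you only want to hear about it when something looks off, a small variation returns rows only when the description suggests memory is low. The exact wording of system_memory_state_desc varies ('Available physical memory is high', 'Available physical memory is low', and so on), so the LIKE filter below is an assumption you may want to adjust:

-- Sketch: return a row only when the OS-reported memory state mentions "low"
select
    available_physical_memory_kb / 1024 as avail_phys_mem_mb,
    available_page_file_kb / 1024 as available_page_file_mb,
    system_memory_state_desc
from sys.dm_os_sys_memory
where system_memory_state_desc like '%low%'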

One of the things I’ve done at my company is create an SSRS report based on this query and have it emailed to me every morning.  This way, each morning I can check on my instances’ memory right from my email.  Also, if I do get any calls regarding performance issues I can simply click a bookmark in my browser and bring up the report right away.  This saves me having to log into a jump server, and then into my SQL instance, to run the query.

IMPLICIT CONVERSIONS – NUMBER THREE

For the last two weeks we have looked at why SQL Implicit Conversions (ICs) are bad and how to identify a query containing them.  This week we will look at a few ways to correct a query that contains an IC.

Here is the query we will be working with:

SELECT [BusinessEntityID]
      ,[NationalIDNumber]
      ,[LoginID]
      ,[OrganizationNode]
      ,[OrganizationLevel]
  FROM [HumanResources].[Employee]
  WHERE NationalIDNumber = 879342154
GO

The first thing to notice is that the value being compared to NationalIDNumber is not surrounded by single quotes, which means SQL Server will treat it as a numeric value.  If we look at the HumanResources.Employee table, we see that the NationalIDNumber column is an nvarchar data type.  Using Microsoft’s chart on data type conversions (https://msdn.microsoft.com/en-us/library/ms191530.aspx), we can see that this combination will cause an implicit conversion.

[Screenshot: Microsoft’s data type conversion chart]
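
If you are not sure what data type a column actually is, you can check it directly instead of relying on memory. A quick sketch against INFORMATION_SCHEMA (sys.columns works just as well):

-- Confirm the data type of the column we are filtering on
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'HumanResources'
  AND TABLE_NAME = 'Employee'
  AND COLUMN_NAME = 'NationalIDNumber'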

The first way we can fix this is to surround the value in the WHERE clause with single quotes, telling SQL Server that it is a string value and not a numeric one.  Make sure to turn on Include Actual Execution Plan.

SELECT [BusinessEntityID]
      ,[NationalIDNumber]
      ,[LoginID]
      ,[OrganizationNode]
      ,[OrganizationLevel]
  FROM [HumanResources].[Employee]
  WHERE NationalIDNumber = '879342154'
GO

When we run this query a number of things become apparent.  First, the yellow warning sign has gone away.

[Screenshot: execution plan with the warning icon gone]

Next, in the Execution Plan window, the operator has changed from an Index Scan to an Index Seek.  This means SQL Server is now seeking directly to the rows it needs in the index instead of scanning everything to find the data (less IO).

[Screenshot: Index Seek operator in the execution plan]

We can also compare the IO of the full scan against the Index Seek; the comparison shows that the Index Seek is indeed much, much more efficient in terms of IO.

[Screenshots: IO statistics for the scan versus the Index Seek]
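
If you want to reproduce that IO comparison yourself instead of reading it off the screenshots, SET STATISTICS IO prints the logical reads for each version in the Messages tab. A quick sketch:

SET STATISTICS IO ON
GO
-- Implicit conversion version: expect a scan and a lot of logical reads
SELECT BusinessEntityID, NationalIDNumber
  FROM HumanResources.Employee
 WHERE NationalIDNumber = 879342154
-- Corrected version: expect an Index Seek and far fewer logical reads
SELECT BusinessEntityID, NationalIDNumber
  FROM HumanResources.Employee
 WHERE NationalIDNumber = '879342154'
GO
SET STATISTICS IO OFF
GO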

There is also another way to remove the IC from the execution plan.  When SQL Server compiles a query it gives the plan a signature and stores that signature in the plan cache.  When subsequent queries run, SQL Server checks their signatures against what is already cached; if they match, it can reuse the existing plan rather than generate a new one.  So why is this important?  Take the query above as our example and pretend that the NationalIDNumber column holds not just numbers but also string values.  When a query comes through that uses a string value, such as WHERE NationalIDNumber = 'TextExample', SQL Server will consider it different from our original query (WHERE NationalIDNumber = 879342154) and will generate a new plan.  How can we avoid this?  Use parameters.  Parameterized queries are far better at matching, and therefore reusing, existing cached plans.  Let’s change our query to use a parameter by first declaring it and then setting it to a specific value.

DECLARE @ID nvarchar(15)
SET @ID = '879342154'
-- Now we run the query
SELECT [BusinessEntityID]
      ,[NationalIDNumber]
      ,[LoginID]
      ,[OrganizationNode]
      ,[OrganizationLevel]
  FROM [HumanResources].[Employee]
  WHERE NationalIDNumber = @ID

So now we have identified what implicit conversions are, why they are bad, how to identify them and finally a few ways to improve your query to remove them.  I hope this helps.

Implicit Conversions – Part Deux

Last week I wrote about the dangers of Implicit Conversions (ICs) in SQL query code (https://justinsetliffe.com/2015/04/10/implicit-conversions-2/).  Just to summarize, there are numerous reasons why they are bad.

This week let’s find out how to identify these ICs; it’s pretty easy to do.

The first way applies when you have a query you want to run in SSMS.  For this example we will run a query against Microsoft’s AdventureWorks database, a play/test database that can be downloaded from Microsoft (http://msftdbprodsamples.codeplex.com/).  Here is the query we will run:

SELECT [BusinessEntityID]
      ,[NationalIDNumber]
      ,[LoginID]
      ,[OrganizationNode]
      ,[OrganizationLevel]
      ,[JobTitle]
      ,[BirthDate]
      ,[MaritalStatus]
      ,[Gender]
      ,[HireDate]
      ,[SalariedFlag]
      ,[VacationHours]
      ,[SickLeaveHours]
      ,[CurrentFlag]
      ,[rowguid]
      ,[ModifiedDate]
  FROM [HumanResources].[Employee]
  WHERE NationalIDNumber = 879342154
GO

Make sure to select ‘Include Actual Execution Plan’ from the Query menu in SSMS, and execute the query.  Once your query runs you’ll notice a new tab called ‘Execution plan’.

[Screenshot: the Execution plan tab in SSMS]

If you click on that tab you’ll notice a big ‘!’ inside of a yellow triangle.  That is telling you something is wrong with your query.  Inside the execution plan window for this query you’ll also notice an Index Scan, which means SQL Server is reading the entire index rather than seeking to just the rows it needs.

[Screenshot: Index Scan operator with the warning icon]

In my first article we saw that ICs can cause a table scan, so that operator is a great place to see what is going on.  Hovering over it shows that there is indeed an implicit conversion happening.

[Screenshot: operator tooltip showing the CONVERT_IMPLICIT warning]

Now, is there an easier way to identify ICs happening on your database?  Yes, there is a query that Jonathan Kehayias (of SQLSkills fame) wrote.  One FYI for this query: if you run it against a large database it can take a while.  Here is a link to his query – http://www.sqlskills.com/blogs/jonathan/finding-implicit-column-conversions-in-the-plan-cache/.
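
His query is the one to actually use; just to show the general idea, here is a stripped-down sketch that hunts for CONVERT_IMPLICIT in cached plans (like his, it can be slow against a large plan cache, so be careful where you run it):

-- Simplified sketch only; Jonathan's query linked above is far more thorough
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP 20
    t.text as sample_query,
    qp.query_plan as sample_plan
FROM sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) as t
cross apply sys.dm_exec_query_plan(qs.plan_handle) as qp
WHERE qp.query_plan.exist('//ScalarOperator[contains(@ScalarString, "CONVERT_IMPLICIT")]') = 1
GO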

So far we have identified why ICs are bad in a query and how to find them, so what can we do next week?  I think finding ways to fix them is now called for.  After all, you want your queries to be as efficient (I love that word, by the way) as possible; it means more free time for you and fewer calls from your clients with complaints.

Implicit Conversions

So the gauntlet has been thrown down by Sir Brent Ozar and his merry band of Supra-Genius Ultimates. Make a blog, he says. Pick a topic, he says. Well, as a famous philosopher once said, “Go do it right now. Seriously, you have nothing better to do. You’re reading my blog, for crying out loud.” Ok, it was that Brent guy again.

He’s right, so here we go.  What to write about for the first blog…..??? I know, something near and dear to my pain. Implicit Conversions and SQL. Insert screaming at random intervals. So, what exactly are implicit conversions and why are they bad?

1) Per Microsoft – Implicit conversions are not visible to the user. SQL Server automatically converts the data from one data type to another. For example, when a smallint is compared to an int, the smallint is implicitly converted to int before the comparison proceeds (see the short example after this list).

2) Can result in increased CPU usage for query (bad).

3) Can cause the query to do a table scan instead of a seek (bad SQL, very bad).

4) Can lead to the DBA getting an increase in the number of “My app is running slow” calls from their clients.
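
To make point 1 concrete, here is a tiny (and harmless) version of Microsoft's smallint/int example. The conversion happens on the variable, which is the benign case; the trouble described in points 2 through 4 starts when the conversion lands on an indexed column instead, which is what the later posts in this series dig into:

-- Point 1 in action: @s (smallint) is implicitly converted to int before the
-- comparison. Harmless here, because the conversion is on the variable and
-- not on an indexed column.
DECLARE @s smallint = 5
DECLARE @i int = 5
IF @s = @i
    PRINT 'The smallint was implicitly converted to int for the comparison'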

Microsoft even has a URL with a nice little picture of all the data types and which ones can create implicit conversions when used together – https://msdn.microsoft.com/en-us/library/ms191530.aspx.

Next week, how to identify said implicit conversions……or maybe something else. Only the Shadow knows….I may know also.

UPDATE – Jonathan Kehayias of SQL Skills has an updated mapping of data types showing which combinations cause either scans or seeks.  Please use this going forward, thank you.

https://www.sqlskills.com/blogs/jonathan/implicit-conversions-that-cause-index-scans/