SQL and SQL only: July 2009

Monday, July 27, 2009

Check Currently Running Statements

When the performance dips, the most frequently asked question is which SP/ which query is consuming the most of the CPU/memory/ IO. Till SQL 2k we either need to start a Profiler or need to check sysprocesses and run DBCC inputbuffer. More often than not it would not catch the actual running queries. So, it would be really useful if we have a query which lists down the currently executing queries and other useful details from sysprocesses.

SELECT   spid,
         TEXT                       AS [query],
         Db_name(sysprocesses.dbid) AS dbname,
         cpu,
         memusage,
         status,
         loginame,
         hostname,
         lastwaittype,
         last_batch,
         cmd,
         program_name
FROM     sys.sysprocesses
         OUTER APPLY Fn_get_sql(sql_handle)
WHERE    spid > 50
ORDER BY cpu DESC

DMV dm_exec_sql_text provides an option to list blocking queries.But it lists the currently active queries and not the ones in sleeping state. sysprocesses when used with fn_get_sql(sql_handle) provides details about all the current connections.

SELECT sqltext.TEXT,
       req.session_id,
       req.status,
       req.command,
       req.cpu_time,
       req.total_elapsed_time,
       req.blocking_session_id,
       Db_name(req.database_id) AS database_name,
       req.wait_type,
       req.wait_time,
       req.last_wait_type,
       req.reads,
       req.writes,
       req.logical_reads,
       sqltext_blocking.TEXT    AS blocking_stmt
FROM   sys.dm_exec_requests req
       LEFT OUTER JOIN sys.sysprocesses blocking
         ON blocking.spid = req.blocking_session_id
       OUTER APPLY sys.Dm_exec_sql_text(req.sql_handle) AS sqltext
       OUTER APPLY sys.Dm_exec_sql_text(blocking.sql_handle) AS sqltext_blocking
WHERE  req.session_id > 50

DMVs are cool arent they..Though fn_get_sql was introduced in SQL 2k SP3 it can be used effectively only with OUTER APPLY operator. fn_get_sql is a table valued function which means it cant take a column from a table in a query as its parameter. This was fixed in 2k5 with the introduction of OUTER APPLY operator.

Saturday, July 25, 2009

Check Database Fragmentation

Just thought it would be useful if we have a query to check the overall fragmentation percentage on our database. Just like DBCC showcontig gives fragmentation info at table level,it would be of some help to

check the same at database level. So here we go..

SELECT Sum(Round ((avg_page_space_used_in_percent / 100)

* page_count,0)) AS [Total Pages Used],

Sum(page_count) AS [Available Pages],

(Sum(Round((avg_page_space_used_in_percent / 100) * page_count, 0)) /

Sum(page_count)) * 100 AS [Percentage Used ],

Max(CASE WHEN index_type_desc IN ('HEAP')

AND record_count > 1000 THEN avg_fragmentation_in_percent ELSE 0 END)

AS [Maximum Extent Fragmentation],

Sum(CASE WHEN index_type_desc IN ('HEAP')

AND record_count > 1000 AND avg_fragmentation_in_percent > 10

THEN 1 ELSE 0 END) AS [No_of_Extent_Fragmented_Tables],

Sum(CASE WHEN index_type_desc NOT IN ('HEAP')

THEN (avg_fragmentation_in_percent * page_count) ELSE 0 END) /

Sum(CASE WHEN index_type_desc NOT IN ('HEAP') THEN page_count

ELSE 0 END) AS [Logical Fragmentation in %]

FROM sys.Dm_db_index_physical_stats(Db_id(),NULL,NULL,NULL,'DETAILED')

The script gives the following details:

Page Usage: It is the sum of used sections of each page in each table/index in entire the database. It is expressed as pages.

Available Pages: It is the sum of total number of pages available in the database.

Percentage Used: Ratio of Used Pages and Available Pages. Percentage Used is exactly the same number as Percentage Density seen in the result of DBCC Showcontig statement. But, this one is at the Database level but ShowContig is at table level. Percentage Used should be some where around 90 - 100% which means most of the pages are full and they are getting fully used.But the percentage also heavily depends on the fill factor selected when the indexes are designed.So the expected output for your database should be either close to Fill factors selected for huge tables or in excess of 90%. ( Default setting is 0% when fillfactors are not selected. 0% actually means 100% which implies that all your data pages can be 100% full and index pages can have one index entry space free ).Sudden variations in Percentage_Usage can cause a performance dip and hence its important for a DBA to monitor on a daily basis.

Maximum Extent Fragmentation: Maximum Extent Fragmentation indicates the highest extent fragmentation percentage obtained among all the tables with more than 1000 rows atleast. Extent Fragmentation is calculated by finding the ratio of out of order Extents and total number of extents in the table. Extent Fragmentation expressed here is not a cumlative figure for the entire database but highest experienced among the bigger tables of the database. If the highest is indicated as a value more than 10% then one can execute the script provided below to find the table with the high extent fragmentation and act accordingly. Ideal number expected is 0%. Less than 10% is acceptable.

No_of_Extent_Fragmented_Tables: No_of_Extent_Fragmented_Tables indicates Total number of tables that experience high extent fragmentation. Again tables with atleast 1000 rows and a fragmentation % greater than 10% are taken into consideration. If there more number of tables ( say even around 3-5 ) one needs to immediately find out the tables and fix the same.

Logical Fragmentation in %: Logical Fragmentation is the ratio of out of order pages and total number of pages. Best is 0%. Less than 10% is acceptable. These figures can have variations because updates/inserts/deletes. When such variations are detected its better to check the same figures at table level. The script provides a percentage of out of order pages at the database level. Table Reorg or Rebuilding indexes are couple of options for fixing fragmentation. Below is the script to check fragmentation at each object level.

SELECT Object_name(object_id) AS objectname,

Sum(Round((avg_page_space_used_in_percent / 100 ) * page_count,0))

AS used_pages,

Sum(page_count) AS available_pages ,

(Sum(Round((avg_page_space_used_in_percent / 100 ) * page_count ,0)) /

Sum(page_count) ) * 100 AS percentage_used,

Sum( CASE WHEN index_type_desc IN ('HEAP')

THEN (avg_fragmentation_in_percent * fragment_count) ELSE 0 END)/

CASE WHEN Sum ( CASE WHEN index_type_desc IN ('HEAP')

THEN fragment_count ELSE 0 END ) = 0 THEN 1

ELSE Sum (CASE WHEN index_type_desc IN ('HEAP')

THEN fragment_count ELSE 0 END ) END AS extent_fragmentation ,

Sum( CASE WHEN index_type_desc NOT IN ('HEAP')

THEN ( avg_fragmentation_in_percent * page_count) ELSE 0 END) /

CASE WHEN Sum ( CASE WHEN index_type_desc NOT IN ('HEAP')

THEN page_count ELSE 0 END ) = 0 THEN 1 ELSE

Sum ( CASE WHEN index_type_desc NOT IN ('HEAP')

THEN page_count ELSE 0 END ) END AS logical_fragmentation

FROM

sys.Dm_db_index_physical_stats(Db_id(), NULL, NULL, NULL ,

'DETAILED') WHERE avg_page_space_used_in_percent !=0

GROUP BY Object_name(object_id)

ORDER BY Object_name(object_id)

To avoid divide by zero error empty tables are not taken into account. PS: Noticed that db_id() at times throws errors. In that case find your database id from sysdatabases and hardcode on your query.

Forgive me for the formatting of the query. Really tried my best but couldnt help it.Any tips on the same will be appreciated.

Friday, July 24, 2009

Impact of Indexes on locking

Indexes are mostly used as a technique to make the queries run faster.But they do have a major impact on the way locks are handled, which I will be covering in this post.

To illustrate the same,as a first step, I am trying to Update a table using a column which doesnt have any index.

Consider the following query:

BEGIN TRAN

UPDATE jobs
SET job_desc = 'index_update'
WHERE job_id = 10

Note that I have removed the default Clustered Index present on jobs table(job_id column) in pubs( default sample database provided by Microsoft ) database.The spid for this connection is 53.A look at query plan indicates that query took a table scan as there was no index on it.

Querying Lockinfo table to check locking schemes for this connection(spid 53)shows that the query has taken a Row-Level Exclusive Lock ( Write lock )on the Row ID 1:115:9.The query has acquired Shared Exclusive locks at Page and table levels.

The transaction is not committed and I am opening another connection to update the same table, but a different row. So I attempt to exeute the following query.

BEGIN TRAN

UPDATE jobs
SET job_desc = 'index_update_2'
WHERE job_id = 8

ConnectionID (SPID) is 54. You will notice that the query doesn't complete at all and has to wait till the query in the other window ( spID 53) commits or rollsback.

The Lock Status for the connection 54 indicates that 54 has successfully acquired locks at page and table levels and for one row( 1:115:7 - which has to update). But spid 54 has to wait to obtain a Row Level Update lock on Row 1:115:9 (which is held by sp53).This is in spite of the fact that it has to update a completely row different from the one getting updated at other connection.

The main reason for this case is the 'table scan' taken by first and second query.As the optimizer opts for a scan( as there is no index on job_id), it has read every row in the table to find the row that meets the required criteria. As the Second query does the scan, it has cross the row exclusively held by other connection(spid 53), though it doesn't satisfy the condition and hence has to wait.

Let us create an index and see how it works.I create the following index.

CREATE UNIQUE CLUSTERED INDEX [PK__jobs__117F9D94] ON [dbo].[jobs] (
[job_id])
ON [PRIMARY]

I execute the same query.

BEGIN TRAN

UPDATE jobs
SET job_desc = 'index_update'
WHERE job_id = 10

The query plan this time indicates an index/update as it uses an index.The lockinfo table details indicate a exclusive lock on Key which is the index key and not on Row ID.
Without Committing the row, the other query is also executed using a different connection(spid 54). But this time, the query completes right away without waiting for the first query(spid 53) to commit or rollback.

The lock info also indicates that spid 54 has successfully obtained locks at page,table and an exclusive lock at the relevant key. spid 54 doesn't apply any lock for row/key that didnt meet its criterion.

The reason why this query works is because of the clustered index on job_id.The optimizer for scanning thro uses the index instead of the table, and hence obtains a lock only on the index key. So, when the other query is executed it doesn't have to wait for other rowlocks to release as the query optimizer directly accesses the row it intends to update using the clustered index. This makes a major difference in improving query performance and reducing deadlock occurrence.

One of the intentions in writing this post was to clarify a myth which said 'Never create indexes on the table which is updated as it slows down updates'. Yes Indexes do slowdown the Updates, but only if it happens to update indexed column. But if you plan to write a Update query, make sure your 'where' clause is properly indexed or else this would potentially run your application into high lock waits and deadlocks.