Monday, November 29, 2010
Lock escalation : SQL Server 2008
Lock escalation occurs when SQL Server decides to upgrade locks at a lower level of the locking hierarchy to a single table level lock. In other words, when a query obtains a large number of row level or page level locks, SQL Server decides that, instead of creating and granting a large number of row/page level locks, it is more efficient to grant a single table level lock. To be precise, SQL Server upgrades the row/page level locks to a table level lock. This process is termed lock escalation.
Lock escalation is good, as it reduces the overhead of maintaining a large number of lower level locks. A lock structure occupies roughly 100 bytes of memory, so too many locks can cause memory pressure. Similarly, applying, granting, and releasing locks for each row or page is a resource consuming process, and lock escalation reduces that cost. However, lock escalation also reduces concurrency: if a query causes lock escalation, it obtains a full table level lock, and any other query attempting to access the table has to wait until the first query releases the lock.
How does SQL Server decide when to escalate locks?
* When a query acquires more than 5000 locks on a single index or heap.
* When the memory used by locks exceeds 40% of the static memory or non-AWE allotted memory.
So let us quickly see lock escalation in action. Consider the following query:
SET ROWCOUNT 4990
GO
BEGIN TRAN
UPDATE orders
SET order_description = order_description + ' '
--Rollback
The orders table has a clustered index, so row level locks will be taken on the index keys. SET ROWCOUNT ensures that only 4990 rows are updated by the query. I am leaving the transaction open (without committing or rolling back) so that we can see the number of locks held by the query.
Fire the following query to check the locks held by the above script. The query lists the count of locks for each lock type and object. Note that the session id for the above script on my machine was 53, so I am filtering on that.
SELECT spid,
COUNT(*),
request_mode,
[resource_associated_entity_id],
sys.dm_tran_locks.resource_type AS object_type,
Db_name(sysprocesses.dbid) AS dbname
FROM sys.dm_tran_locks,
sys.sysprocesses
OUTER APPLY Fn_get_sql(sql_handle)
WHERE spid = 53
AND sys.dm_tran_locks.request_session_id = 53
AND sys.dm_tran_locks.resource_type IN ( 'page', 'key', 'object' )
AND Db_name(sysprocesses.dbid) = 'dbadb'
GROUP BY spid,
[resource_associated_entity_id],
request_mode,
sys.dm_tran_locks.resource_type,
Db_name(sysprocesses.dbid)
As one may notice, there are 4990 key (row level) locks. Let us roll back the transaction and modify the script to acquire 5000 or more locks.
SET ROWCOUNT 5000
GO
BEGIN TRAN
UPDATE orders
SET order_description = order_description + ' '
--Rollback
Now let us fire the same query on sys.dm_tran_locks. We obtain a single exclusive lock on the table/object. There are no key or row level locks, as SQL Server, per its rule, has escalated the row level locks to a table level lock.
SQL Server 2005 had only server wide settings to disable lock escalation: trace flag 1211 disables lock escalation for the entire server, while trace flag 1224 disables escalation based on the number of locks but still allows it under memory pressure. Ideally, we would like this as an object/table level setting, and that is what SQL Server 2008 provides. SQL Server 2008 allows one to disable lock escalation at the table or partition level.
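If you are still on SQL Server 2005, a minimal sketch of the server wide approach would be the following (the -1 switch applies the trace flag globally):
DBCC TRACEON (1211, -1)  -- disables lock escalation server wide
GO
DBCC TRACEOFF (1211, -1) -- revert to the default behaviour
GO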
Consider the following command in SQL 2k8
ALTER TABLE orders SET ( LOCK_ESCALATION = DISABLE )
GO
The ALTER TABLE command's LOCK_ESCALATION option accepts three values.
* Disable -> Disables lock escalation (with a few exceptions; refer to Books Online for details).
* Table -> Allows SQL Server to escalate to the table level. This is the default setting.
* Auto -> Escalation is to the partition level if the table is partitioned; otherwise escalation is to the table level.
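To verify the current setting for a table, one can query sys.tables; a quick sketch, assuming the orders table used in the demo:
SELECT name,
       lock_escalation_desc
FROM   sys.tables
WHERE  name = 'orders'
GO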
Let us roll back the open transaction created earlier and run the ALTER TABLE command posted above to disable lock escalation. Now let us run the same script to update 5000 records again and see whether lock escalation actually occurs.
As you may now notice, for the same 5000 rows, there is no lock escalation occurring this time, as we have disabled it using the ALTER TABLE command. The picture shows 5000 key/row locks, which is not possible with the default lock escalation setting.
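Once you are done experimenting, it is probably a good idea to put the table back to its default behaviour; a minimal sketch:
ALTER TABLE orders SET ( LOCK_ESCALATION = TABLE )
GO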
The intention behind this post was to introduce lock escalation, show how it works, and explain the new option provided in SQL Server 2008 to change the lock escalation setting. In upcoming posts, we will dive deeper into the topic and understand when and under what circumstances we can play with the lock escalation setting.
Tuesday, November 16, 2010
Backup log Truncate_Only in SQL Server 2008
The BACKUP LOG <db_name> WITH TRUNCATE_ONLY command, used for clearing the log file, is no longer available in SQL Server 2008. This post explains the option available in SQL Server 2008 for truncating the log.
Step 1: Change the recovery model to Simple
USE [master]
GO
ALTER DATABASE [dbadb]
SET recovery simple WITH no_wait
GO
Step 2: Issue a checkpoint
One can issue a checkpoint using the following command.
CHECKPOINT
GO
The checkpoint process writes all the dirty pages in memory to disk. Under the simple recovery model, a checkpoint also clears the inactive portion of the transaction log.
Step 3: Shrink the log file
USE dbadb
GO
DBCC shrinkfile(2, 2, truncateonly)
Shrinking the log file with the truncateonly option releases the unused space at the end of the log file. The first parameter of shrinkfile takes the file id within the database. Usually the file id of the log file is 2. You may verify this by firing a query on sysfiles, as shown below.
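For instance, a quick sketch of such a check (run in the context of the database whose log you are shrinking):
USE dbadb
GO
SELECT fileid,
       name,
       filename
FROM   sysfiles
GO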
Step 4: Change the recovery model back to full/bulk logged
Change the recovery model back to the one originally used by the database (full or bulk logged).
USE [master]
GO
ALTER DATABASE [dbadb]
SET recovery FULL WITH no_wait
GO
After these steps the log file size should have reduced.
The intention behind this post is not to encourage truncating log files. Use the method explained only when you are running short of disk space because of log file growth. Note that, just like truncating log files, changing the recovery model also breaks the log chain. After clearing the log using the above method, you need to run a full or differential backup to keep your log chain intact for any recovery.
Just a quick demo to show that the log chain breaks if you change the recovery model.
The database whose log file we will be clearing is dbadb. The log file size is 643 MB, as shown below.
After executing the scripts mentioned above, the log file size is 2 MB as shown below.
The log chain breaks after changing the recovery model. When the log chain breaks, subsequent transaction log backups start failing, as shown below.
Transaction log backups will be successful only after the execution of a full or differential backup.
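For example, a minimal sketch of restarting the chain (the backup paths below are placeholders, adjust them to your environment):
BACKUP DATABASE [dbadb]
TO DISK = N'D:\Backup\dbadb_full.bak' -- placeholder path
GO
BACKUP LOG [dbadb]
TO DISK = N'D:\Backup\dbadb_log.trn'  -- placeholder path
GO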
PS: Pardon me for a SQL 2k8 post, when the whole world is going crazy about
SQL Denali :)
Tuesday, November 9, 2010
Finding CPU Pressure - dm_os_schedulers DMV
The last post dealt with checking CPU pressure using the wait stats DMV. But to get a complete picture of CPU pressure, the script provided in the previous post alone wouldn't suffice. We need the help of an additional DMV, sys.dm_os_schedulers.
Why do we need the help of sys.dm_os_schedulers?
As already mentioned, the wait stats DMV captures waiting time for a group of wait types. It works in the following way: it checks whether a process is waiting on any of the defined wait types. If so, it tracks the resource wait time experienced by the process. After the resource wait is over, it tracks the signal wait/CPU wait. Once both waits are completed, they are reflected in the DMV.
There are scenarios where a process doesn't experience any resource wait and only has a CPU wait/signal wait. In such a case, the waiting time misses the wait stats, and the wait stats don't reflect the CPU pressure experienced at all. In other words, if a process doesn't wait for any resource, gets directly into the CPU queue, and spends a long time in the runnable state, then sys.dm_os_wait_stats doesn't reflect it. The same scenario is explained in detail by MVP Linchi Shea over here.
In such a scenario, one can use sys.dm_os_schedulers to detect CPU pressure.
Consider the following query.
SELECT scheduler_id,
cpu_id,
current_tasks_count,
runnable_tasks_count,
current_workers_count,
active_workers_count,
work_queue_count
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255;
A simple and straightforward query on the sys.dm_os_schedulers DMV. It returns one row per CPU/scheduler. scheduler_id < 255 filters out the schedulers used by system processes. A detailed explanation of the DMV can be obtained from here. The most important column is runnable_tasks_count, which indicates the number of tasks waiting in the runnable queue. Once a task moves from the runnable state to the running state, the value of the column decreases by one. So the DMV always indicates the current state of the server and its runnable queues. Unlike the wait stats DMV, dm_os_schedulers doesn't hold historical data since the last restart, so it needs to be polled a lot more frequently, perhaps every 5 minutes if necessary. Any consistently non-zero value in runnable_tasks_count is an indication of CPU pressure and requires close monitoring.
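One way to poll it is to snapshot the resultset into a history table on a schedule, for example from a SQL Agent job. The table name below is just an example; a rough sketch:
-- one-time setup (example table name)
CREATE TABLE dbo.scheduler_snapshot
(
    capture_time          DATETIME,
    scheduler_id          INT,
    cpu_id                INT,
    current_tasks_count   INT,
    runnable_tasks_count  INT,
    current_workers_count INT,
    active_workers_count  INT,
    work_queue_count      BIGINT
)
GO
-- collect a snapshot every few minutes
INSERT INTO dbo.scheduler_snapshot
SELECT GETDATE(),
       scheduler_id,
       cpu_id,
       current_tasks_count,
       runnable_tasks_count,
       current_workers_count,
       active_workers_count,
       work_queue_count
FROM   sys.dm_os_schedulers
WHERE  scheduler_id < 255
GO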
To sum it up, CPU monitoring for SQL Server would include three parts.
* Directly checking CPU using the sys.dm_os_ring_buffers DMV, as explained here.
* Checking the wait stats DMV, as explained here.
* Checking sys.dm_os_schedulers, as explained in this post.
Sunday, October 31, 2010
Finding CPU Pressure using wait_stats DMV
CPU pressure forms a vital part of performance monitoring, and sys.dm_os_wait_stats is a good place to check. Some background info before we look at the DMV.
The waiting time for a SQL Server process is divided into two parts: resource waits and signal waits (CPU waits). When a process/query is submitted to SQL Server, SQL Server checks whether all the resources required for the process are available. Resources include things like IO, network, memory, locks, latches, etc. For example, if a process is waiting for another process to release its lock, it is a resource wait because of locks. If a process wants to write something to disk but has to wait for some other IO activity to complete, it is a resource wait because of IO. All of this waiting time gets categorized as resource waits.
After a process grabs all the resources it requires, it gets scheduled to run on the processor/CPU, so it enters the CPU queue. If many processes are already executing, it has to wait for the CPU. The time spent waiting for the CPU is termed signal wait.
The sys.dm_os_wait_stats DMV provides the cumulative sum of the waiting time experienced by all processes at each wait type. dm_os_wait_stats is the SQL Server 2005/2008 equivalent of DBCC SQLPERF(waitstats). A more detailed explanation of the DMV is provided here. dm_os_wait_stats maintains these stats since the last restart of the server or since the last execution of DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);
sys.dm_os_wait_stats provides about 485 wait types/wait events and waiting time spent at each wait type. Two interesting columns of this DMV are
* wait_time_ms
* signal_wait_time_ms
wait_time_ms gives the total time spent at each wait type (resource wait + CPU wait). signal_wait_time_ms specifies the time spent waiting for the CPU. The resource wait time can be found as wait_time_ms - signal_wait_time_ms.
SELECT SUM(signal_wait_time_ms) * 100 / SUM(wait_time_ms) AS
cpu_pressure_percentage,
SUM(signal_wait_time_ms) AS cpu_wait,
SUM(wait_time_ms - signal_wait_time_ms) AS resource_wait,
SUM(wait_time_ms) AS total_wait_time
FROM sys.dm_os_wait_stats
Consider the query above. cpu_pressure_percentage gives the percentage of time spent waiting for the CPU, by taking the ratio of signal wait time to total wait time. Microsoft recommends this value to be less than 25% for a normal system. Any higher value would indicate a system likely suffering from CPU stress.
The most common causes of CPU stress/high signal waits are:
* Poor and ineffective query plans / missing indexes / table scans
* Too many plan recompilations
* A slow CPU that requires an upgrade
As a production DBA, it is a good idea to collect the resultset of the dm_os_wait_stats DMV once every few hours, or two or three times a day, and analyze the results to notice any surge in wait time. This is one proactive approach to detecting potential major problems, as sketched below.
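As a starting point for that analysis, one could look at the top wait types by total wait time, filtering out a few waits that are usually benign (the exclusion list below is only a partial example):
SELECT TOP 10 wait_type,
       wait_time_ms,
       signal_wait_time_ms,
       waiting_tasks_count
FROM   sys.dm_os_wait_stats
WHERE  wait_type NOT IN ( 'LAZYWRITER_SLEEP', 'SQLTRACE_BUFFER_FLUSH',
                          'LOGMGR_QUEUE', 'CHECKPOINT_QUEUE',
                          'REQUEST_FOR_DEADLOCK_SEARCH', 'BROKER_TASK_STOP',
                          'SLEEP_TASK', 'WAITFOR' )
ORDER  BY wait_time_ms DESC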
Saturday, October 23, 2010
Backup path
Sometimes, after taking a backup, one can forget the path where the backup was taken.
The simple script below provides the list of backups taken on the server, the backup type (log - L, full - D, ...), the backup start date and time, the LSN numbers, and the path of the backup. Simple and handy.
SELECT database_name,
TYPE,
backup_start_date,
first_lsn,
last_lsn,
backupmediafamily.physical_device_name
FROM msdb.dbo.backupset,
msdb.dbo.backupmediafamily
WHERE backupset.media_set_id = backupmediafamily.media_set_id
ORDER BY backup_start_date DESC
Tuesday, October 19, 2010
Currently executing query - wait status
Recently, a developer came to my desk with a request:
'I started the execution of a script from a program. The program has been running for a long time. I don't know whether it is executing or hanging. Can you check from the database whether the script is running or hanging, and if it is hanging, at which statement? If possible, please tell me why it is hanging.'
Earlier in my blog, I posted the following script to check the currently running query.
SELECT spid,
TEXT AS [query],
Db_name(sysprocesses.dbid) AS dbname,
cpu,
memusage,
status,
loginame,
hostname,
lastwaittype,
last_batch,
cmd,
program_name
FROM sys.sysprocesses
OUTER APPLY Fn_get_sql(sql_handle)
WHERE spid > 50
ORDER BY cpu DESC
The above query gives the stored procedure's name or the starting statements of the script/batch, not the exact statement within the stored procedure or batch that is currently executing. To find that, one should use sys.dm_exec_requests.
SELECT session_id,
request_id,
Db_name(database_id),
start_time,
status,
command,
Substring(txt.TEXT, ( statement_start_offset / 2 ) + 1,
( ( CASE statement_end_offset
WHEN -1 THEN Datalength(txt.TEXT)
ELSE statement_end_offset
END
- statement_start_offset ) / 2 ) + 1) AS statement_text,
wait_type,
wait_time,
blocking_session_id,
percent_complete,
cpu_time,
reads,
writes,
logical_reads,
row_count
FROM sys.dm_exec_requests
CROSS APPLY sys.Dm_exec_sql_text([sql_handle]) AS txt
WHERE session_id <> @@SPID
AND session_id > 50
The query above gives the currently executing statement within a batch. sys.dm_exec_requests provides the columns statement_start_offset and statement_end_offset, which specify the starting and ending positions, in bytes, of the currently executing statement within the batch. As the 'text' column returned by dm_exec_sql_text is of the nvarchar datatype, statement_start_offset/2 is required to get to the starting position of the statement. statement_end_offset returns -1 to indicate the end of the batch. So, the statement_text column of the query answers the first part of the developer's request.
Now, for the next part, i.e. finding the status of the query (running/hanging/sleeping etc.).
A user request can take the following major states.
> Sleeping : The query completed execution and has nothing to do; the connection is still open.
> Running : Currently running/executing.
> Runnable : Can run but is waiting for a processor, i.e. waiting in the processor queue.
> Suspended : Waiting for some resource or event. The wait may be because of locks, IO completion, a latch, memory, etc.
Sleeping requests are not shown in sys.dm_exec_requests. Requests in the runnable/suspended states are the ones we need to watch out for. Runnable/suspended states are not abnormal, as long as a request does not stay in the same state for a long time.
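A quick way to keep an eye on them is to filter the same DMV for just those two states; a small sketch:
SELECT session_id,
       status,
       wait_type,
       wait_time,
       blocking_session_id
FROM   sys.dm_exec_requests
WHERE  status IN ( 'runnable', 'suspended' )
       AND session_id > 50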
Now for the third part: 'Why is it hanging?'
The wait_type column in the resultset returns the reason why a request is waiting. To know the meaning of each wait type, refer here.
A few popular wait types are given below:
ASYNC_IO_COMPLETION - Waiting for IO.
CXPACKET - Wait because of parallelism.
LOGBUFFER - Waiting for the log buffer to release some space in memory.
LCK_M% - Waiting for a lock to be released. Refer to blocking_session_id to find the blocking process.
Other interesting columns returned by dm_exec_requests are wait_time (time spent waiting, in ms), percent_complete (indicating the progress of the request), cpu_time, reads, writes, logical_reads and row_count. percent_complete is populated mostly for operations such as backups and restores, not for DML operations or regular queries.
So, a quick execution of the script and a check of the wait state details can clearly show the current progress/status of the request.
Monday, September 27, 2010
SQL Server never stores the same column twice
Consider the following table.
CREATE TABLE dbo.tbl
(
col1_pky INT PRIMARY KEY,
col2_nc_index INT
)
CREATE INDEX nc_col2_col1
ON dbo.tbl(col2_nc_index, col1_pky)
I have created a table dbo.tbl with the column col1_pky as the primary key. A composite non-clustered index is created on the col2_nc_index, col1_pky columns. By definition, every non-clustered index contains the clustered index key. So, going by the definition, the non-clustered index should contain the following:
* col2_nc_index, col1_pky - index definition
* col1_pky - clustered index key
col1_pky would thus be expected to appear twice, as it is part of the non-clustered index definition and is also the clustered index key. But SQL Server avoids this by not storing the same column twice.
/* Data Generation Script */
DECLARE @col1 INT,
@col2 INT
SET @col1 = 1
SET @col2 = 10000
WHILE @col1 < 10000
BEGIN
INSERT INTO tbl
SELECT @col1,
@col2
SET @col1 = @col1 + 1
SET @col2 = @col2 - 1
END
As usual, let us use DBCC IND / DBCC PAGE to check this. To check the root page:
DBCC ind ( 'dbadb', 'tbl', -1)
The root page is found to be 18684.
Let us take a look at the contents of root page using DBCC PAGE
DBCC traceon(3604)
GO
DBCC page(dbadb, 1, 18684, 3)
GO
Note that col1_pky appears only once and does not appear twice. To confirm this, let us check the contents of a leaf level page.
DBCC page(dbadb, 1, 19044, 3)
GO
Again, col1_pky is present only once. So, SQL Server stores a column only once in an index, even when it is part of both the index definition and the clustered index key.