Monday, August 31, 2015

At SQL Server Geeks 2015 - Pre Conference Session - Benjamin Nevarez

I attended the SQL Server Geeks Annual Summit 2015 last Wednesday (26/8) in Bangalore, India. Due to a few urgent responsibilities at work, I could attend only the Pre Conference event on Wednesday and had to skip the events on Thursday, Friday and Saturday.

The Pre Conference event I attended was "SQL Server Optimization and Tuning" by Benjamin Nevarez :) No prizes for guessing why this session. Of course, for Mr. Optimizer Benjamin Nevarez!!! I have been a reader of the SQL Server Internals series of books since SQL Server 2005, I am currently reading his SQL Server 2014 query tuning book, and I was obviously thrilled to meet him and learn more from him. And yes, it did happen!!!




So, now getting to the actual session. The session was scheduled to start at 9 AM, and Benjamin sir and most of the attendees were in at 8:45 itself. The session was off to a flying start, with Benjamin sir explaining the importance of the optimizer right from the early days of relational databases. After the gentle start, Benjamin sir wasted no time in getting into the in-depth details of the optimizer. The scripts and examples showing things like transformation rules, experiments playing with transformation rules (turning a few of them off and asking "what happens now") and the usage of undocumented trace flags were mind-blowing. It gave the attendees a deep insight and a little tour inside the query optimizer itself. Genuine level 500 stuff which only Benjamin sir and probably just a handful of others in the SQL community are capable of doing.
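To give a flavour of those demos, here is a minimal sketch of that kind of experiment, assuming the undocumented DBCC commands covered in Benjamin's books (the rule name JNtoNL, join-to-nested-loops, is just one example; these commands are unsupported and should be tried only on a test instance):

DBCC TRACEON(3604);       -- route DBCC output to the client
DBCC SHOWONRULES;         -- list the transformation rules currently enabled
DBCC RULEOFF('JNtoNL');   -- turn off the join-to-nested-loops rule
-- run a test query with OPTION (RECOMPILE) and see "what happens now"
DBCC RULEON('JNtoNL');    -- turn the rule back on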

In the post-lunch session, we were scheduled to look at an array of topics like indexes, statistics, plan reuse and parameter sniffing, the cardinality estimator, Hekaton and columnstore indexes. That's quite a lot for half a day (4 hours), and hence Benjamin sir didn't go into as much detail after lunch. I would have wished for a few more 500 level demos, but due to time constraints, and also to reach the entire spectrum of the audience, I guess it was designed to be a 300 level session. Nevertheless, it helped me clarify those small doubts one has even on the topics one knows best.

Overall, the pre conference session was a great learning experience, not just on the technical front but also on delivering technical presentations. The way Benjamin sir was prepared for the sessions, his time management, the scripts, the demos and the organization of the presentation slides were simply perfect. And needless to say, meeting a legend of the SQL community like Mr. Benjamin Nevarez is absolutely priceless. So, overall an event which I would remember for long. Special thanks to Amit Bansal and his team for making this happen.

Other arrangements like food and beverages, organizers available to help, registration and directions to the hall were neatly handled. Overall a well managed event, and kudos to the entire SQL Server Geeks team for their efforts!!! A few pictures taken by the event photographers are shared below.






 

Tuesday, August 25, 2015

Query Cost and Query Execution Time - The Difference

This post is my 100th post :) I was inactive in blogging for a few years in between, but am now back in full flow. Hope the 200th doesn't take this long!!!

When analysing query plans, one would have noticed the query cost percentage (which appears at the top of each query in estimated and actual query plans). The query cost percentage indicates the resources consumed by each query relative to the batch of queries that was executed. Note that this number comes from the optimizer's cost model, so it is an estimate even in an actual execution plan.



The point to note is that the highest query cost doesn't mean the query took the longest time to run; it just means that the query was estimated to consume the highest amount of resources (CPU, I/O or memory). The query with the highest cost may not always be the longest running query. Refer to the following example: run the script below with the option "Include Actual Execution Plan".

SET NOCOUNT ON

Declare @dt Datetime

-- First query: waits 10 seconds, then reads a single row
SET @dt = getdate()
WAITFOR DELAY '00:00:10'
select * from sys.databases where database_id = 5
Print 'time spent in First Query - '
Print Datediff(ms,@dt,getdate())
Print char(10)

/*************** End of First Query ******************/

-- Second query: a more complex join and sort, but no artificial delay
SET @dt = getdate()
select syscolumns.name,sysobjects.name,syscolumns.*
from syscolumns, sysobjects where syscolumns.id = sysobjects.id
and sysobjects.xtype = 'u'
order by sysobjects.crdate
Print 'time spent in Second Query - '
Print Datediff(ms,@dt,getdate())


We will use the query plans to compare the query cost of the first and second queries. The variable @dt in the script tracks the time taken to run each query.

The first query waits 10 seconds and finishes in a little over 10 seconds, while the second query finishes in about 140 milliseconds. Refer to the text results below.

time spent in First Query -
10110

time spent in Second Query -
140


Comparing the query plans, the first query consumes just 12% of the total cost of the batch, while the second query, being a little more complex, consumes 88% of the total batch cost, in spite of the first query taking far longer to complete.
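When actual run time matters more than estimated cost, the execution statistics DMVs are a better guide than plan cost percentages. Below is a minimal sketch, assuming a SQL Server 2008 or later instance and that the plans are still in cache:

SELECT TOP (10)
    qs.total_elapsed_time / qs.execution_count AS avg_elapsed_microsec,
    qs.total_worker_time  / qs.execution_count AS avg_cpu_microsec,
    st.text AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_elapsed_time / qs.execution_count DESC;

Queries that rank high by elapsed time but low by CPU (like the WAITFOR example above) spend their time waiting, not consuming resources.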


 

Friday, August 14, 2015

Hekaton Part 4: Simple Performance comparison - In Memory vs Disk table

Continuing the Hekaton series, this post will attempt a very basic performance comparison of an In Memory table and a disk based table.

Table Creation and Data Loading

CREATE TABLE dbo.Disk_tbl
(
[ID] Int identity(1,1) Not null PRIMARY KEY CLUSTERED,
[Data] char(32) COLLATE Latin1_General_100_BIN2 null,
[dt] datetime not null
);
GO
 
-- Seed 10,000 rows...
Insert into Disk_tbl(data,dt) Select 'xyz',getdate()
GO 10000
 
-- ...then double the row count 10 times, to roughly 10.24 million rows
Insert into Disk_tbl(data,dt)
Select 'xyz',getdate() from Disk_tbl
GO 10

A similar script for the Hekaton table is provided below. Note that the clustered primary key is replaced with a hash index, as clustered indexes are not supported on In Memory tables.


CREATE TABLE dbo.HK_tbl
(
[ID] Int identity(1,1) Not null PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
[Data] char(32) COLLATE Latin1_General_100_BIN2 null,
[dt] datetime not null
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO
 
-- Seed 10,000 rows, then double the row count 10 times (~10.24 million rows)
Insert into HK_tbl(data,dt) Select 'xyz',getdate()
GO 10000
 
Insert into HK_tbl(data,dt)
Select 'xyz',getdate() from HK_tbl
GO 10
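A side note on the script above: the hash index was created with BUCKET_COUNT = 100000 while the table ends up with roughly 10 million rows, so the buckets will carry long chains. Bucket health can be checked with a query like the sketch below, using the sys.dm_db_xtp_hash_index_stats DMV available from SQL Server 2014:

SELECT OBJECT_NAME(hs.object_id) AS table_name,
       hs.total_bucket_count,
       hs.empty_bucket_count,
       hs.avg_chain_length,
       hs.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS hs
WHERE hs.object_id = OBJECT_ID('dbo.HK_tbl');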
 

Performance comparison: 

Please note that the buffer cache needs to be cleared to have a genuine comparison between the disk based table and the In Memory table, so that the disk based table starts cold; "DBCC DROPCLEANBUFFERS" helps us with that. The query picks just one row out of the ~10 million rows by filtering on the primary key column.

DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
SET STATISTICS IO ON
Declare @dt Datetime
SET @dt = getdate()
Select * from HK_tbl where id = 100234
Print datediff(ms,@dt,getdate())
SET @dt = getdate()
Select * from disk_tbl where id = 100234
Print datediff(ms,@dt,getdate())


Performance comparison by time:

The result of the script is provided below:

DBCC execution completed. If DBCC printed error messages, contact your system administrator.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

(1 row(s) affected)
0

(1 row(s) affected)
Table 'Disk_tbl'. Scan count 0, logical reads 3, physical reads 3, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
16



In Memory tables" take 0 ms compared to 16 ms on disk based tables.

Performance comparison by IO:

SET STATISTICS IO doesn't report anything for In Memory tables, as they do not use the buffer pool and involve no page-based I/O for reads. The disk based table shows 3 physical reads.
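Since STATISTICS IO is silent for memory-optimized tables, their footprint can instead be inspected through the XTP DMVs. A minimal sketch using sys.dm_db_xtp_table_memory_stats (available from SQL Server 2014):

SELECT OBJECT_NAME(ms.object_id) AS table_name,
       ms.memory_used_by_table_kb,
       ms.memory_used_by_indexes_kb
FROM sys.dm_db_xtp_table_memory_stats AS ms
WHERE ms.object_id = OBJECT_ID('dbo.HK_tbl');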

Performance comparison by Cost:





The query plan indicates that the query on the disk based table had 99% of the query cost, while the "In Memory" table query took only 1% of the total estimated cost.



On all counts, "In Memory" tables do outperform disk based tables here. However, the scenario considered is a very simple one, and as this series progresses we will analyse more scenarios in greater depth.


Thursday, August 13, 2015

Hekaton Part 3: Table Creation and Parameters

Part of a series on "Hekaton". Refer to the previous posts in the series for the background.


Having created a "Hekaton" database in the previous post, let us create an "In Memory" table.
 
CREATE TABLE dbo.HK_tbl
(
[ID] Int Not null PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
[Data] char(32) COLLATE Latin1_General_100_BIN2 null,
[dt] datetime not null
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
 
This looks like a typical table creation script. The points to note are:
 
1) Definition of the primary key - A primary key is required for a durable memory-optimized table, and as no ALTER TABLE is allowed on memory-optimized tables in SQL Server 2014, the constraint needs to be defined inline. ALTER TABLE is supported from SQL Server 2016, so the restriction applies only to SQL Server 2014.
 
2) The keywords "NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000)" - specify the hash index to be used. More on that in upcoming posts.
 
3)  "MEMORY_OPTIMIZED = ON" clause - indicates that the table will be a in memory table 
 
4)  "DURABILITY = SCHEMA_AND_DATA" - Indicates that in Memory table will be written to the disk like traditional disk based table. "DURABILTY= SCHEMA_ONLY" would indicate that table would be only in Memory and data would be lost after restart.
 
5) Collation - the column "[Data]" has been defined with the collation "Latin1_General_100_BIN2". In SQL Server 2014, char and varchar columns used in index keys of memory-optimized tables must use a BIN2 collation such as this one. However, the restriction is removed in SQL Server 2016.
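As mentioned in point 4, here is a minimal sketch of a non-durable variant (the table name HK_staging_tbl is purely illustrative; after a restart the table comes back empty):

CREATE TABLE dbo.HK_staging_tbl
(
[ID] Int Not null PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
[Data] char(32) COLLATE Latin1_General_100_BIN2 null
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);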
 
The next post will perform a quick performance comparison of disk based and "In Memory" tables.