Monday, July 25, 2016

Speaking at SQL Server Geeks Annual Summit 2016

I am very happy and proud to share the news that I will be speaking at the SQL Server Geeks Annual Summit 2016, to be held in Bangalore from August 11 to 13. The SQL Server Geeks Annual Summit is the biggest technical conference in Asia, with 60+ speakers from across the globe speaking on SQL Server and Microsoft Data Platform technologies.

Last year, I travelled more than 6,000 km within a day just to attend one day of the conference (read here about my last year's experience). And yes, it was worth the effort. Going from an attendee struggling to attend the event last year to being a speaker this year is personally a big leap for me. It is indeed a great honor to speak at an event whose other speakers are some of the biggest stalwarts of the SQL world, like Grant Fritchey, Sunil Agarwal, Bob Ward, Andreas Wolter and many more. I can't thank Amit Bansal ji and my good friend Manohar Punna enough for the opportunity.

I will be giving four sessions in total - one regular session (75 minutes) and three open/chalk talk sessions (30 minutes each).

  • Regular Session on "The New Cardinality Estimator"
  • Chalk talk and open talk sessions on "CPU Performance Troubleshooting", "In-Memory OLTP Versioning" and "Performance Monitoring and Troubleshooting"

Refer here for the session details.

As an attendee, I am super excited about the following sessions.

1) Bob Ward's sessions on Query Store and R technologies
2) Ajay Jagannathan on Extended events, Tempdb
3) Joe Yong on Azure
4) Sunil Agarwal's sessions on In Memory and Column store indexes
5) Amit Bansal's session on Semaphore
6) Satya Jayanty's session on BI for DBAs

There are many more wonderful sessions (in fact, 127 in total!). So I am pretty sure it is going to be three days of top-notch, deep-dive technical content that you can't find anywhere else. If you are anywhere in India and in some way connected to SQL Server, August 10 - 13 is the place to be. I can assure you that you won't be disappointed. Don't miss it!

Sunday, June 12, 2016

SQL Community and Speaking Assignments

Dear All,

It has been a while since I became less active on this blog. The reason is that I have moved to another medium of SQL Server involvement - speaking at and organizing SQL Server events.

It has been a year since I got associated with the SQL Server User Group, Singapore. Since then, I have been managing the user group along with three others (Dharmendra, Khilit and Sarbjit Singh). In case you are wondering what the SQL Server User Group is, it is a community for SQL Server enthusiasts, where we arrange knowledge-sharing sessions on a monthly basis. Every month, usually on the third Thursday, we have at least two excellent presentations by an MVP (Microsoft Most Valuable Professional), an expert from the Microsoft Singapore team or another industry expert. So, if you are in some way related to the Data Platform and keen on learning and networking, that is the place to be. Join our Facebook page here for the latest updates on the user group.

I have done close to 10 public speaking assignments in less than a year. Yes, I know for some of the SQL community folks that is just a month's target :) But I didn't expect myself to be speaking at this rate. No complaints, though, as I enjoy it to the core. Looking back at the list of things I did:

  • July 2015 at SG SQL User Group on  "Hekaton - In memory tables - Overview"  - details provided here

  •  Nov 2015 at SG SQL User Group on "Hekaton - In Memory tables - Transaction Logging and Checkpoint processes" - details provided here

  •  Jan 2016 at SG SQL User Group on "SQL Server 2016 on Always Encrypted"  - details provided here

  •  March 2016 at SQL Saturday Nepal 2016 on "Cardinality Estimator" -  details provided here

  •  Apr 2016 at SG SQL User Group on "Optimizer's choice of Indexes"  - details provided here

  •  May 2016 at SG SQL User Group on "SQL Server 2016 - Real Time Operational Analytics - "Real" Game changer" - details provided here

  •  May 2016 at Data Immersion Event in Microsoft Singapore on Columnstore Indexes and In Memory OLTP.
  •  May 2016 at SQL Saturday Brisbane on "SQL Server 2016 - Real Time Operational Analytics - "Real" Game changer" - details provided  here

  •  May 2016 - Moderated Niko Neugebauer's session in 24 Hours of PASS - details provided here

Besides these, there were a couple at work too. Speaking on technology is fun. Picking a topic on which one has good knowledge, designing the scope, working on the subject for at least two months, and then preparing the presentation and talk is a challenging but fun-filled exercise. It certainly grooms one's technical, presentation and communication skills. What goes on behind the scenes of a technical presentation is an interesting post altogether. Will do one on that soon.
I also have a few big upcoming presentations. Will be blogging about them as well. Hope to balance blogging and speaking for the rest of the year :)



Tuesday, April 12, 2016

Cardinality Estimator - Part 2- Out of Date Statistics

Continuing our series on the SQL Server 2014 Cardinality Estimator, let us look at the first difference between the two CE models: the way estimates are made when statistics are out of date (stale). A couple of prerequisites for this demo:

  1. Download SQL Server 2012 Adventureworks database from here
  2. Restore the downloaded database onto a SQL Server 2014 instance and switch the compatibility level to 120 (SQL Server 2014)

Let us check the row count of the table [Production].[TransactionHistory]

EXEC sp_spaceused '[Production].[TransactionHistory]'

The table contains 113,443 rows. Let's insert 10,000 rows (a little less than 10%) so that the insert doesn't trigger an automatic statistics update and the statistics remain out of date. Note that roughly 20% of the table needs to change for statistics to be updated automatically.
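Before and after the insert, one can verify how stale the statistics are by checking the modification counter via sys.dm_db_stats_properties (a sketch; the statistic names on your copy of the database may differ):

```sql
-- Rows changed since each statistic on the table was last updated
SELECT s.name AS stats_name,
       sp.last_updated,
       sp.rows,                    -- rows in the table when the statistic was built
       sp.modification_counter     -- rows modified since then
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('Production.TransactionHistory');
```

After the 10,000-row insert below, the modification counter should sit just under the auto-update threshold.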

The plan is to insert rows, run queries against both the Cardinality Estimator models and compare the estimates.

Step 1: Insert 10,000 rows.

Insert into [Production].[TransactionHistory]
       ([ProductID], [ReferenceOrderID], [ReferenceOrderLineID], [TransactionDate],
        [TransactionType], [Quantity], [ActualCost], [ModifiedDate])
SELECT TOP 10000
       [ProductID], [ReferenceOrderID], [ReferenceOrderLineID], [TransactionDate],
       [TransactionType], [Quantity], [ActualCost], [ModifiedDate]
  FROM [Production].[TransactionHistory]
  WHERE [ModifiedDate] BETWEEN '20070101' AND '20080101'

Step 2: Run query against both the models

SELECT *
  FROM [Production].[TransactionHistory]
  WHERE ProductID = 714
  OPTION (QUERYTRACEON 9481)

SELECT *
  FROM [Production].[TransactionHistory]
  WHERE ProductID = 714

Note that the hint "OPTION (QUERYTRACEON 9481)" forces the query to use the old CE (model 70) in SQL Server 2014. Observe the estimates for both.

CE 70:

CE 120:

They are the same. Let us observe the histogram below.

Notice that the histogram contains the value 714 as a RANGE_HI_KEY and estimates 645 rows. The estimate in the query plan is slightly higher: since SQL Server knows that approximately 10% of the rows have changed, it makes an adjustment (645 + ~10% of 645) under both Cardinality Estimator models.
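To reproduce the histogram yourself, one can dump the statistic behind the index on ProductID (assuming the index name IX_TransactionHistory_ProductID, as shipped with AdventureWorks; adjust if your statistic is named differently):

```sql
-- Show only the histogram steps of the statistic on ProductID;
-- look for the step whose RANGE_HI_KEY is 714
DBCC SHOW_STATISTICS ('Production.TransactionHistory', 'IX_TransactionHistory_ProductID')
    WITH HISTOGRAM;
```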

Step 3:

Now let us observe the estimates for another value.

SELECT *
  FROM [Production].[TransactionHistory]
  WHERE ProductID = 746
  OPTION (QUERYTRACEON 9481)

SELECT *
  FROM [Production].[TransactionHistory]
  WHERE ProductID = 746

Let us compare the estimates now

CE 70:

CE 120:

Now we see a difference in the estimates. Let us check the histogram for the value.

The value doesn't exist as a RANGE_HI_KEY, so AVG_RANGE_ROWS is used for the estimate. CE 70 uses exactly the value from the histogram, but CE 120 (the new CE) adjusts the histogram estimate even when that estimate is an approximate guess via AVG_RANGE_ROWS. So CE 70 estimates 206 rows, as shown above, while CE 120 estimates slightly higher, at 224.15.

So, the takeaway from this post: Cardinality Estimator 120 adjusts the estimate obtained from the histogram when statistics are out of date, even when the value has to be guessed via AVG_RANGE_ROWS.
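As a rough cross-check (an inference from the numbers in this demo, not documented behavior), the CE 120 figure is consistent with scaling the histogram estimate by the ratio of the current row count to the row count when the statistics were last built:

```sql
-- 113,443 rows at the last statistics update + 10,000 inserted = 123,443 current rows
SELECT 206.0 * 123443 / 113443 AS adjusted_estimate;  -- ~224.16, close to the 224.15 in the plan
```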

Sunday, April 3, 2016

SQL Server 2014 - Cardinality Estimator - Part 1

It has been a while since I blogged. I moved to a few other mediums, like speaking at user group and SQL Saturday events, over the last few months, and that kept me away from blogging. Now back to blogging again. Hope to do a better balancing act :)

Starting from where I left off, which was Statistics and Histograms, I now move on to the Cardinality Estimator, one of the major changes in SQL Server 2014.

What is Cardinality Estimator?

The Cardinality Estimator is responsible for providing the estimate of the number of rows returned by each operator. If you have observed a query plan before, the estimated row count is provided by the cardinality estimator. What happens behind the scenes is:

  1. The query optimizer asks the cardinality estimator (CE) for the estimated rows for each operation (reading a table, a filter, a join, etc.)
  2. The cardinality estimator checks statistics and histograms to find the number of rows returned
  3. Based on the inputs given by the CE, the query optimizer picks the right operator and subsequently the right plan

Obviously, the CE plays a key role in getting the right plan. Incorrect estimates lead to poor plans and spiral into poor performance.

The CE had remained largely unchanged since SQL Server 7.0; SQL Server 2014 brought changes to the way estimates are made and to the algorithm used in making them.

Run the following query with the actual plan enabled (Query -> Include Actual Execution Plan) on "Adventureworks2012".

SELECT *
FROM [Production].[TransactionHistory]
WHERE ProductID < 100

Right-click the SELECT operator and click Properties as shown below.

Observe the property "CardinalityEstimationModelVersion". A value of 70 indicates the query is using the old CE.

The query uses the old CE because the compatibility level of the "Adventureworks2012" database is 110 (SQL Server 2012). Switch the compatibility level to 120 (SQL Server 2014) or higher,
rerun the query, and you will notice "CardinalityEstimationModelVersion" is now 120, which indicates the new cardinality estimator is in use.
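The compatibility level switch can be done from SSMS (database Properties -> Options) or with a one-line command (assuming the database name used in this post):

```sql
-- Enable the new cardinality estimator (model 120) for this database
ALTER DATABASE [AdventureWorks2012] SET COMPATIBILITY_LEVEL = 120;
```

Individual queries can still opt back into the old model with OPTION (QUERYTRACEON 9481).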

In a series of posts (expect around 10), I will be covering the various changes in the new cardinality estimator.

Wednesday, January 13, 2016

SQL Server Statistics - Histograms

Most of us are aware of statistics in SQL Server and their purpose. But still, just to get started: statistics store information about the distribution of data in a column (or columns), which helps the query optimizer pick the right query plans. Effectively, they help answer questions like "how many rows have the value 25 in column Age?" or "how many have a value greater than 15?".

This post will go a little deeper into how statistics are actually stored in the database. Statistics are stored as "histograms". (One may take a look at the histogram used by a statistic by expanding a commonly queried table in your database, expanding Statistics -> Properties -> clicking the "Details" tab.)

The interesting part of the image above is the histogram steps. One would notice that the histogram steps contain entries for selected values in the column. Basically, what the histogram steps convey is the number of rows that exist for a particular value or a range of values.
For example, for a column containing marks, the histogram attempts to say:
0 - 10: there are 15 rows
10 - 30: there are 45 rows
50 - 80: there are 60 rows
80 - 100: there are 20 rows, and so on.
Observe the RANGE_HI_KEY column. It indicates the highest value in the range. Each row represents the range starting just after the previous row's RANGE_HI_KEY value up to the current row's RANGE_HI_KEY.
  1. RANGE_ROWS indicates the total number of rows in the range (values between the previous hi key and the current hi key, excluding both boundary values)
  2. DISTINCT_RANGE_ROWS indicates how many distinct values exist within the range
  3. AVG_RANGE_ROWS indicates the average number of rows returned per distinct value in the range
  4. EQ_ROWS indicates the number of rows equal to the RANGE_HI_KEY value alone
The most important part: the number of steps in the histogram is capped at 200, meaning the histogram will have a maximum of 200 entries/rows irrespective of the number of rows in the table. Even if there are 1 million rows, at most 200 entries will describe the column's data distribution.
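Everything in the Details tab (header, density vector and histogram steps) can also be retrieved in T-SQL (an AdventureWorks example; substitute your own table and statistic name):

```sql
-- Returns three result sets: statistics header, density vector and histogram
DBCC SHOW_STATISTICS ('Person.Person', 'IX_Person_LastName_FirstName_MiddleName');
```

Counting the rows of the third result set confirms the 200-step cap.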
So, why this post on statistics? :) Armed with this knowledge, we can dive into deeper concepts, like the cardinality estimator and how it changed in SQL Server 2014, in upcoming posts.

Wednesday, November 18, 2015

In-Memory OLTP Table Checkpoint Processes Performance Comparison

Checkpoint and other internal processes have undergone major changes for In-Memory OLTP tables and transactions. The following article of mine compares the performance of the checkpoint process on disk-based (traditional) tables against in-memory tables. I understand that I have not yet written about the checkpoint process for in-memory tables here. Will do so soon. Anyway, sharing the article below. Read and comment if interested.

Friday, November 6, 2015

Hekaton Part 11 - Finding the row count of an in-memory table

Finding the row count of an "in-memory" Hekaton table can be tricky, as the regular commands and DMVs don't give the desired result:

  • sp_spaceused - doesn't work and returns a blank result
  • sys.dm_db_partition_stats - doesn't track in memory tables.

So the option we are left with is:

SELECT OBJECT_SCHEMA_NAME(Ind.object_id) AS [Schema Name],

      OBJECT_NAME(Ind.object_id) As [Table Name],

      ISNULL(sp.[rows], 0) as [Total Records]
      FROM sys.dm_db_xtp_index_stats AS Ind
      CROSS APPLY sys.dm_db_stats_properties(Ind.object_id, Ind.index_id) AS sp
      WHERE Ind.index_id = 2
However, please note that the above query gives correct row counts if and only if the statistics on the table are up to date. For in-memory tables, statistics are not updated automatically and hence have to be updated manually before one attempts to find the row count. So, please refer to the script below, which updates statistics for in-memory tables alone.
DECLARE @id int, @rowcnt int
DECLARE @in_mem_tbl_name varchar(500)
SET @id = 1
CREATE TABLE #in_memory_tbl(id int identity(1,1), in_mem_tbl_name varchar(500))
INSERT INTO #in_memory_tbl(in_mem_tbl_name)
SELECT QUOTENAME(OBJECT_SCHEMA_NAME(object_id)) + '.' + QUOTENAME(OBJECT_NAME(object_id))
FROM sys.dm_db_xtp_table_memory_stats
WHERE object_id > 0
SET @rowcnt = @@ROWCOUNT
WHILE @id <= @rowcnt
BEGIN
    SELECT @in_mem_tbl_name = in_mem_tbl_name
    FROM #in_memory_tbl
    WHERE id = @id

    -- Update statistics for the current in-memory table
    -- (FULLSCAN, NORECOMPUTE is required for memory-optimized tables in SQL Server 2014)
    EXEC ('UPDATE STATISTICS ' + @in_mem_tbl_name + ' WITH FULLSCAN, NORECOMPUTE')

    SET @id = @id + 1
END
DROP TABLE #in_memory_tbl

Please ensure you run the statistics-update script above before attempting to find the row count of in-memory tables.