Thursday, May 11, 2017

Find the number of rows for a value, without querying the table

Dear all,

Being a DBA, developers approach me with questions like can you run this query for me in production?

Select *

WHERE UnitPrice = 32.6

Lets, say the table contains few billion rows and UnitPrice is not an indexed column, then query would take ages to provide the result. Not just that, it causes a massive I/O on a busy production database.  

For the above situation, I would usually ask the developer, "Would you need a accurate value or is it ok if it is approximate?". If approximate numbers are fairly sufficient then, one can answer these type of questions without actually querying the table. How?

Simple - STATISTICS is the word :)

To find the statistic that will be useful for the query, please use the following script and provide the table name and column name. The script provides the statistic name we should be checking.

Declare @table_name varchar(100)
Declare @Column_name varchar(100)

SET @table_name = 'FactInternetSales'
SET @Column_name = 'UnitPrice'

SELECT OBJECT_NAME(s.object_id) AS object_name, 
    COL_NAME(sc.object_id, sc.column_id) AS column_name, AS statistics_name 
FROM sys.stats AS
JOIN sys.stats_columns AS sc 
    ON s.stats_id = sc.stats_id 
AND s.object_id = sc.object_id 
OBJECT_NAME(s.object_id) like @table_name
COL_NAME(sc.object_id, sc.column_id) like @Column_name

After finding the statistic name, just go to the table, expand "Statistics" and double click on the stat that you are interested in. The click on the details section, find the value interested in on "Range_HI_Key" Column and observe the "EQ_Rows" section to get the approximate number of rows. 

Most of you would know that statistics can be used to find query estimates but few (I guess) would use it operationally for these kind of requests. One can also use them to estimate for queries of this type too

Select count(*), 
from [FactInternetSales]
Group by ProductKey

Select count(*) from
WHERE UnitPrice > 40  -- ( Possible but can get little tricky at times, more of it on upcoming posts :) )

Few quick pointers:

The above method relies on the following prerequisites

* Either Auto create Stats should be turned on or the column should be the leading column of any index for the statistic to be present. Auto Create Stats are on by default

* For the stat to be reasonably accurate, one needs to have Auto Update Statistics on ( which is also "on" by default )

* Automatically created statistics are named like '_WA%'. For a column to have a auto created stat, the column should have been queried at least once since database creation. Indexed column would not need auto created statistics

* To figure out when was the statistic last updated, use the following query 

sp.stats_id, name, filter_definition, last_updated, rows,
rows_sampled, steps, unfiltered_rows, modification_counter  
FROM sys.stats AS stat  
CROSS APPLY sys.dm_db_stats_properties(stat.object_id, stat.stats_id) AS sp 
WHERE stat.object_id = object_id('people_Data')
and name like '_WA_Sys_0000000D_1273C1CD';

If the value you are looking for doesn't appear as a "Range_HI_Key" on the histogram then refer to the next biggest value on the "Range_Hi_Key" and observe "Avg_Range_Rows" for the estimates. Please refer to this post for interpreting histograms.

There is always more to statistics. Will cover them in upcoming posts.

Sunday, May 7, 2017

SQL Azure Databases - Active Secondary - How to configure? - Part 1

Dear all,

I have taken the azure dive at last :)
In case if you are new to azure, azure is the Microsoft cloud platform service, offering lots of services for handling data platform tasks. There are tons and tons of material on it and perhaps you can get a good introduction on the following links - here and here

This post is about SQL Azure Database - a service (a PAAS service ) which allows one to host just a database in cloud and a feature of it called "active secondary". "Geo Replication" - Active Secondary is a feature in SQL Azure in premium service tier ( similar to our enterprise edition but certainly not as costly :) ), which synchronizes data asynchronously to another server (either in the same geographical location or different ) and allows one to run read only queries to it.

In my opinion, the below specifications are perhaps the most common requirement of most modern day mid size applications.

  •  A database with good performance to store and retrieve data
  • A simple disaster recovery / high availability solution
  • Readable secondary if possible

All these with a really affordable cost ( will discuss that part soon ) , just in few clicks is what makes SQL Azure tick. This post will run thro the details on how to configure it.

Pre requisite:

1) Grab a Microsoft Azure subscription :)

 Demo :

Now for the step by step  "how to do" screenshots :)

1) Login to and Click on SQL Databases -> Click on Create SQL Database

2) Provide the server name. Though one pays only for the database, one just needs to create a server to host the database. Server will be maintained by azure and no charges are applicable for the server while using  “sql azure database” service.


Provide unique names for

·         Database name
·         Resource group ( more about it later )
·         Server name
 3) Click on "Configure required settings" and provide Server Details to be provided as shown below
4) Set the service level to "Premium"
One needs to pick Premium to have readable secondary
5) Click on the create database to complete primary setup. Creation takes few minutes

6) Configure GEO replication - secondary

Once the notification bar indicates that the primary has been created, click on sql databases to find the primary database created.
Pick “Geo Replication” as shown below

7) Configure GEO Replication

The screen below allows one to pick the geographical location of the secondary server. If one needs lesser latencies, closer locations to primary are preferred


8) Once location is picked, one is prompted to fill in the rest of the details of secondary server
 9) Create a secondary server similar to primary, by providing details like servername and admin login name
10) Create the secondary server

Upon clicking ok, Primary and secondary starts to synchronize as shown below

11) Synchronization completion
The completed tick marks will indicate synchronization completion as shown below
Synchronization can take few minutes or longer depending upon data size. Once complete, click on SQL Databases on Azure Portal to verify as shown below
With that the setup of active secondary on sql azure database is complete. Pretty simple as it takes a few clicks and needs almost zero pre requisites. A similar setup on "on Premises" would take days to months in most places.
 That's a lengthy first post on SQL Azure - Active Secondary.  On subsequent posts, I will cover how to connect to primary and secondary, failover , monitoring and how applications can benefit via readable secondary and of course the cost involved as well.


Thursday, May 4, 2017

Data Channel Interview - 07 - Steve Knutson on Managing Sharepoint Databases

7th Interview of #DataChannel is out :)

Very happy to have Mr.Steve Knutson, office servers and SharePoint MVP, share his knowledge on "Managing SharePoint Databases" on #DataChannel. Steve, demystifies quite a few items for SQL Server DBAs managing SharePoint databases like "what are the do's and don’ts for SharePoint databases", "Why are SharePoint DBs huge", "How to Size SharePoint Databases", "What are RBS / File Streams / Blobs on SharePoint Databases" and many more. 

Steve is to the point on his answers and in my opinion, the interview is very useful for database administrators managing SharePoint Databases. Happy Watching!!!


Sunday, April 30, 2017

Data Channel Interview - 06 - Manohar Punna on SQL Azure Migrations

Here we go for the 6th interview on #DataChannel. Manohar Punna, MVP and Data Platform Geeks President discusses "Migrations to SQL Azure" on #DataChannel. Manohar, in this interview provides valuable inputs on migrating to SQL Azure, challenges involved and methods to monitor and size SQL Azure databases. Happy watching !!!

Thursday, April 13, 2017

SQL Saturday Christchurch 2017

Dear All,

 Last week, I had travelled to the beautiful city of Christchurch, for speaking in SQL Saturday South Island. This is by far the farthest I have travelled for a SQL Saturday, nearly 8500KMs from Singapore. I am so glad, I made it as it turned out to be a excellent event organized by my dear friend / brother Hamish Watson, Martin Catherall and rest of the SQL Saturday South Island team.

My talk was on "Cardinality estimator - A ride from SQL 7.0 to SQL Server 2016", going thru the various changes on CE 120 and CE 130. Personally, I enjoyed delivering this one as the audience was fairly well informed and it gave me the license to get lot more detailed on the subject. No one left the room since I started the talk and it was a very interactive session, which gave me a feel that the talk was well received too.

After my session, I managed to interview couple of wonderful MVPs, Mr. Manohar Punna and Mr. Steve Knutson for #DataChannel. Both the interviews were on my editing table and will be out in few weeks’ time :)

The Oz/NZ SQL family is so warm, welcoming and take so much of care, that it is always a pleasure being part of their events. I always have a ball of a time in company of Hamish, Martin Catherall, Manu, Warwick, Martin Cairney, Rob Douglas, Patrick Flynn, Reza and Leila (though they had to leave a little early this time around). Thanks for making me feel part of the family as always.
One of the main reasons for me deciding to take part in this event was beacuse of Mr.Hamish Watson, who invited me way back in Aug 2016 itself. While he calls me "SQL Brother", he took care of me like his real brother, taking me around the beautiful city of Christchurch, taking me to his home, showing me his farm as well. I got a chance to meet his wonderful family, his wife and 4 lovely kids and I really wish them all the happiness in this world. Hamish also cooked delightful vegetarian dinner for the entire family, making the overall stay at his home memorable. I really don't think I could play such a good host as him and I had lots to nice things to learn from him and his family.

Finally, I also did the tourist thing of sightseeing and shopping in Christchurch. After travelling around a few countries and a number of cities, I can certainly say, Christchurch is one of the most beautiful cities I have ever seen. The mountains, the ocean, the beach and the country side are so scenic. Sadly, multiple earthquakes in last few years have caused loss of several lives and properties but nothing stops the New Zealanders from being happy and keeping the city neat, clean and beautiful.


 I managed to visit a place called Gondula which is a location on Mount cavendish, at about thousand feet above sea level, from where one gets an aerial view to the city & harbor at one side and calm endless pacific ocean at the other side. Needless to say, the views were simply stunning and I was awestruck to see how nature has blessed this place with so much of beauty. Gondula is one of the two stunning locations I have ever been too (the other being the great oceans road / 12 apostles in Australia).


 Overall, a memorable trip both professionally and personally, thanks again to Hamish Watson for giving me an opportunity to be part of this excellent event!!! 

if interested, please check out more photos here



Wednesday, April 12, 2017

Data Channel Interview - 05 - Edwin Sarimento on Availablity Groups & High Availability

Dear all,

On the 5th interview of #DataChannel, Mr.Edwin Sarimento, MVP, MCM and a popular high availability expert discusses SQL Server Always on Availability Groups and High Availability options in SQL Server. In this interview, Edwin gives a totally new dimension to common perceptions on high availability.

Sharing a few top lines from Edwin in the interview!!!

  1. Keeping it simple is the most difficult thing to do!!!
  2. Don't do something just because its the latest. Always ensure it solves the business problem
  3. Deprecated doesn't mean not supported
Watch the full interview below and enjoy!!!

Monday, March 27, 2017

Data Channel - Interview 04 - Martin Catherall on In Memory OLTP

Welcome to the 4th Interview of Data Channel.
We have Mr.Martin Catherall, 3 time Data Platform MVP from Australia / New Zealand discussing "In Memory OLTP" tables. Interview was shot at his wonderful home in Melbourne. I would like to thank Martin for sharing his knowledge for the SQL Community. Check out the interview below.


Sunday, January 22, 2017

MVP 2016!!!

Dear All,

Very happy to share the news that I have been awarded Microsoft Most Valuable Professional award for my contributions to SQL Server Community in 2016!!!. This is an honor I have dreamt of for long needless to say that the award motivates me to contribute more for the SQL Server Community.

In case if you don't know what MVP award is all about, you may read about it here and here

No big achievement is possible without a team. This one is no different. My family, Singapore Microsoft Data Platform Team, MVPs who nominated me, Microsoft DX Team, my friends at Singapore SQL Server User Group, colleagues at work and several experts from whom I have learnt the tricks of the is a long list of people who have helped me get here!!! I have just collected the award on behalf of everyone :)

I received the award few days ago shipped from Microsoft. Sharing the pictures of the same below. Thanks to Microsoft again for giving me something I would cherish forever!!!!


Wednesday, January 18, 2017

Data Channel - Interview 03 - Hamish Watson on DevOps and Azure

Dear All,

After a short break, we are back with the 3rd interview on #DataChannel. Mr.Hamish Watson, Community Lead from Christchurch, New Zealand discusses "DevOps Practices and Azure" with me.  Had a great time as always talking technology with the ever awesome Hamish. Interview recording provided below

Interview transcript to be updated soon.