Thursday, May 11, 2017

Find the number of rows for a value, without querying the table

Dear all,

As a DBA, I am often approached by developers with questions like, "Can you run this query for me in production?"

SELECT *
FROM [dbo].[FactInternetSales]
WHERE UnitPrice = 32.6

Let's say the table contains a few billion rows and UnitPrice is not an indexed column; the query would take ages to return a result. Not only that, it would cause massive I/O on a busy production database.

In such situations, I usually ask the developer, "Do you need an accurate value, or is an approximate one OK?" If approximate numbers are sufficient, one can answer these types of questions without actually querying the table. How?

Simple - STATISTICS is the word :)

To find the statistic that will be useful for the query, use the following script, providing the table name and column name. The script returns the name of the statistic we should check.

DECLARE @table_name varchar(100)
DECLARE @Column_name varchar(100)

SET @table_name = 'FactInternetSales'
SET @Column_name = 'UnitPrice'

SELECT OBJECT_NAME(s.object_id) AS object_name,
    COL_NAME(sc.object_id, sc.column_id) AS column_name,
    s.name AS statistics_name
FROM sys.stats AS s
JOIN sys.stats_columns AS sc
    ON s.stats_id = sc.stats_id
    AND s.object_id = sc.object_id
WHERE OBJECT_NAME(s.object_id) LIKE @table_name
  AND COL_NAME(sc.object_id, sc.column_id) LIKE @Column_name
ORDER BY s.name;

After finding the statistic name, go to the table in SSMS, expand "Statistics" and double-click the statistic you are interested in. Then click on the "Details" section, find the value you are interested in under the "RANGE_HI_KEY" column, and read the "EQ_ROWS" column to get the approximate number of rows.
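If you prefer a query over the SSMS UI, the same histogram can be read with DBCC SHOW_STATISTICS. A minimal sketch (the statistic name below is a placeholder; use the name returned by the script above):

DBCC SHOW_STATISTICS ('dbo.FactInternetSales', '_WA_Sys_00000007_0425A276')
WITH HISTOGRAM;
-- Locate the row where RANGE_HI_KEY = 32.6 and read EQ_ROWS
-- for the approximate number of matching rows.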



Most of you would know that statistics can be used for query estimates, but few (I guess) use them operationally for these kinds of requests. They can also be used to estimate results for queries like these:

SELECT ProductKey, COUNT(*)
FROM [FactInternetSales]
GROUP BY ProductKey

SELECT COUNT(*)
FROM [FactInternetSales]
WHERE UnitPrice > 40  -- possible, but can get a little tricky at times; more on it in upcoming posts :)

A few quick pointers:

The above method relies on the following prerequisites:

* Either "Auto Create Statistics" should be turned on, or the column should be the leading column of an index, for the statistic to exist. Auto Create Statistics is on by default.

* For the statistic to be reasonably accurate, "Auto Update Statistics" needs to be on (also on by default).

* Automatically created statistics are named like '_WA%'. For a column to have an auto-created statistic, the column must have been queried at least once since database creation. Indexed columns do not need auto-created statistics.

* To figure out when the statistic was last updated, use the following query:

SELECT sp.stats_id, stat.name, stat.filter_definition, sp.last_updated, sp.rows,
       sp.rows_sampled, sp.steps, sp.unfiltered_rows, sp.modification_counter
FROM sys.stats AS stat
CROSS APPLY sys.dm_db_stats_properties(stat.object_id, stat.stats_id) AS sp
WHERE stat.object_id = OBJECT_ID('people_Data')
  AND stat.name LIKE '_WA_Sys_0000000D_1273C1CD';
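To verify the first two prerequisites, the database-level settings can be checked from sys.databases:

SELECT name, is_auto_create_stats_on, is_auto_update_stats_on
FROM sys.databases
WHERE name = DB_NAME();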

If the value you are looking for does not appear as a "RANGE_HI_KEY" in the histogram, refer to the next larger "RANGE_HI_KEY" value and use "AVG_RANGE_ROWS" for the estimate. Please refer to this post http://strictlysql.blogspot.sg/2016/01/sql-server-statistics-histograms.html for interpreting histograms.

There is always more to statistics. Will cover them in upcoming posts.


Sunday, May 7, 2017

SQL Azure Databases - Active Secondary - How to configure? - Part 1

Dear all,

I have taken the Azure dive at last :)
In case you are new to Azure: Azure is Microsoft's cloud platform, offering lots of services for handling data platform tasks. There are tons and tons of material on it, and you can get a good introduction from the following links - here and here

This post is about SQL Azure Database - a PaaS service that allows one to host just a database in the cloud - and a feature of it called "active secondary". Geo Replication - Active Secondary is a feature of SQL Azure's Premium service tier (similar to our Enterprise edition, but certainly not as costly :) ), which synchronizes data asynchronously to another server (either in the same geographical location or a different one) and allows one to run read-only queries against it.

In my opinion, the specifications below are perhaps the most common requirements of modern-day mid-size applications:

  • A database with good performance to store and retrieve data
  • A simple disaster recovery / high availability solution
  • A readable secondary, if possible

All of this at a really affordable cost (will discuss that part soon), in just a few clicks, is what makes SQL Azure tick. This post will run through the details of how to configure it.

Prerequisite:

1) Grab a Microsoft Azure subscription :)

 Demo :

Now for the step-by-step "how to" screenshots :)

1) Log in to portal.azure.com, click on "SQL Databases", then click "Create SQL Database"




2) Provide the server name. Though one pays only for the database, a server must be created to host it. The server is maintained by Azure, and no charges apply for the server when using the "SQL Azure Database" service.



 


Provide unique names for:

·         Database name
·         Resource group (more about it later)
·         Server name
 
 3) Click on "Configure required settings" and provide the server details as shown below
 
 
4) Set the service tier to "Premium"

One needs to pick Premium to have a readable secondary
5) Click on "Create database" to complete the primary setup. Creation takes a few minutes

 
 
6) Configure Geo Replication - secondary

Once the notification bar indicates that the primary has been created, click on "SQL Databases" to find the newly created primary database.
Pick "Geo Replication" as shown below


7) Configure Geo Replication

The screen below allows one to pick the geographical location of the secondary server. For lower latencies, locations closer to the primary are preferred

 

8) Once a location is picked, one is prompted to fill in the rest of the secondary server's details
 
 
 9) Create a secondary server similar to the primary, providing details such as the server name and admin login name
 
 
10) Create the secondary server


Upon clicking OK, the primary and secondary start to synchronize as shown below

 
11) Synchronization completion
The tick marks indicate synchronization completion as shown below

Synchronization can take a few minutes or longer depending on the data size. Once complete, click on "SQL Databases" in the Azure Portal to verify, as shown below
 
 
With that, the setup of an active secondary on a SQL Azure database is complete. Pretty simple, as it takes only a few clicks and has almost zero prerequisites. A similar on-premises setup would take days to months in most places.
That's a lengthy first post on SQL Azure - Active Secondary. In subsequent posts, I will cover how to connect to the primary and secondary, failover, monitoring, how applications can benefit from the readable secondary, and of course the cost involved.
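For those who prefer scripting over the portal, the same steps can be sketched with the Azure CLI. The resource group, server, database and admin names below are all placeholders, and exact parameters may vary with your CLI version:

# Create the secondary server in another region (names are placeholders)
az sql server create --name myserver-secondary --resource-group myRG \
    --location southeastasia --admin-user myadmin --admin-password '<password>'

# Start geo-replication: create a readable secondary of the primary database
az sql db replica create --name myDB --server myserver-primary \
    --resource-group myRG --partner-server myserver-secondary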

 

Thursday, May 4, 2017

Data Channel Interview - 07 - Steve Knutson on Managing Sharepoint Databases


7th Interview of #DataChannel is out :)

Very happy to have Mr. Steve Knutson, Office Servers and SharePoint MVP, share his knowledge on "Managing SharePoint Databases" on #DataChannel. Steve demystifies quite a few items for SQL Server DBAs managing SharePoint databases, like "What are the dos and don'ts for SharePoint databases?", "Why are SharePoint DBs huge?", "How to size SharePoint databases", "What are RBS / FILESTREAM / blobs in SharePoint databases?" and many more.

Steve is to the point in his answers and, in my opinion, the interview is very useful for database administrators managing SharePoint databases. Happy watching!!!