Saturday, December 31, 2016

Data Channel - Interview 02 - Hamid Fard on SQL Server Memory Capacity Planning

Dear All,

Welcome to the second interview of Data Channel. I am very happy and proud to have Mr Hamid Fard, MVP and MCM from Malaysia, on Data Channel. Hamid discusses SQL Server Memory Capacity Planning, covering topics like SQL Server memory sizing, AWE, plan cache, In-Memory OLTP capacity planning and Buffer Pool Extension.

Interview Video provided below:



Excerpt from the interview provided below.

What is Capacity Planning? What does one need to know?

While managing SQL Server, DBAs need to estimate and provision sufficient memory for the database instance. There have been many cases where a database server was set up and released to production, and upon analysis it was found that the memory available for the server was insufficient, resulting in poor performance. So DBAs should plan and carefully estimate the right amount of memory required for SQL Server.

There is a common complaint that SQL Server uses all the memory in the server. How does SQL Server use the memory?

SQL Server is memory hungry, and one shouldn't be scared if it uses all the memory! SQL Server loads data into memory for all its queries before the results are returned to the client. If we have a lot of memory, more data can be kept in memory, and that improves query performance.

Does SQL Server ever release the memory to Windows?

Yes. The lazy writer process scans the buffer pool when SQL Server or the Windows operating system is under memory pressure, and releases unused pages in the buffer pool. The least recently used pages are released first. If a page is dirty ( not yet written to disk ), it is written to disk before being removed from memory.

Is there any guideline for setting the maximum memory?

One needs to come up with a formula based on the Windows Server version & edition and the SQL Server version & edition. The formula I will be explaining helps in arriving at the maximum memory setting.

The maximum memory setting allocates memory only for the buffer pool. One needs to estimate the memory required for SQL MGR, the memory used by workers and threads, the columnstore cache and the plan cache, and subtract the total from the available physical memory to derive the maximum memory setting for SQL Server.
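To illustrate, a minimal sketch of applying such an estimate ( the 64 GB total and the resulting 52 GB here are made-up numbers purely for illustration ):

-- Hypothetical example: a 64 GB server where, after reserving memory for
-- Windows, SQL Server DLLs, MTL, worker threads and the plan cache,
-- roughly 52 GB is left for the buffer pool.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 53248;  -- 52 GB, illustrative only
RECONFIGURE;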

Can you please share the formula used? Have you documented it anywhere? It would be useful for many.

Yes, I have blogged about it. Please refer to my blog post below, where I have written about it in detail: http://fard-solutions.com/sql-server-memory-capacity-planning/

Is there a recommended method to set the maximum memory?

Please refer to my blog post, provided below: http://fard-solutions.com/sql-server-memory-capacity-planning/

The blog post clearly explains how to size the memory for the following items:

1) Reserved memory for Windows
2) SQL Server DLLs
3) MTL - SQL Server Memory To Leave area
4) Plan Cache
5) Buffer Pool


Is there a need to size differently for OLTP and OLAP systems? Why ? 

OLAP systems involve mostly SELECT queries, and hence one needs a bigger buffer pool to cache more data pages; otherwise, OLAP systems will incur a lot of disk I/O.
OLTP systems normally comprise insert statements, random I/O and smaller transactions, and don't have huge select statements. As the types of workloads are different, a slightly different formula applies to each. Some OLTP systems also have huge reports causing massive select statements, but the formula can serve as a general guideline.

 
Buffer Pool Extension ( BPE ) was introduced in SQL Server 2014. How do you think it helps in "Memory Capacity Planning"? 

Buffer Pool Extension is good for companies running SQL Server 2014 on old hardware, where one is unable to upgrade the memory. Buffer Pool Extension is normally used with SSDs, but SSD response times are in microseconds, not nanoseconds as with regular memory. So BPE doesn't give a drastic performance gain.

My suggestion is to use it as a last resort; ideally one should look to upgrade the RAM. Even if one decides to use BPE, the suggestion is to use it with faster devices like PCI RAM disks.
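For reference, a minimal sketch of enabling BPE ( the file path and size are assumptions for illustration; they should come from your own capacity plan ):

-- Illustrative only: place the BPE file on the fastest disk available
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION ON
    (FILENAME = 'F:\SSDCACHE\BPE.BPE', SIZE = 64 GB);

-- Verify the configuration:
SELECT * FROM sys.dm_os_buffer_pool_extension_configuration;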


Another major technology to be released in SQL Server 2014 is "In Memory" OLTP. How does one size the memory if the database uses "In Memory OLTP" tables?

"In Memory OLTP" is one of the good features to have released in 2014 and improved further in SQL Server 2016. However, one needs to carefully size them to achieve the performance gain.

Have there been scenarios where "In Memory OLTP" actually resulted in negative performance?

Yes. If one doesn't size the memory carefully, it can even result in negative performance. Refer to the formula provided below.

One needs to calculate 4 different areas

1) Table & Row Data - The table data size is the sum of the timestamp size, the in-memory index pointer size and the actual data size ( row size )

   Begin & End Timestamps - 24 bytes

   In-memory index pointers - 8 + 8 + 8 + 8 ( 8 bytes per index, and there are 4 indexes )

   Data Size - 200 bytes in this example

So the per-row table data size in this example is 24 + 32 + 200 = 256 bytes.

2) Hash Indexes - The size of a hash index depends upon the bucket count. It is as follows:

( bucket count rounded up to the nearest power of 2 ) *
( index pointer size ) * ( number of hash indexes )

     In this case, the nearest power of 2 above 5,000,000 is 2^23. So, 2^23 * 8 * 3 bytes ( roughly 192 MB for three hash indexes )

3) Memory for Non-Clustered Indexes - Derived using the indexed ( key ) column size and the number of unique values in the column:

 ( index pointer size + sum( key column data sizes ) ) * unique rows

4) Memory for Row Versioning - In-Memory OLTP uses row versioning, which means that for every operation the older row is retained and the changes are applied to a new copy. So the formula is:

        longest transaction duration ( seconds ) * peak rows updated / deleted per second * row size

So, in the above example, a table that has a maximum of 30% of its rows changing at a time would require 3.6 GB of memory, even though the table size is only 1.2 GB.
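As a rough sketch, the arithmetic above can be scripted as follows. The figures come from the example ( 5,000,000 rows, 256-byte rows, three hash indexes ); the row versioning component is omitted, as it depends on workload:

-- Table & row data and hash index sizes for the example above
DECLARE @rows       bigint = 5000000;
DECLARE @row_size   int    = 24 + (4 * 8) + 200;  -- timestamps + index pointers + data = 256 bytes
DECLARE @hash_bytes bigint = POWER(CAST(2 AS bigint), 23) * 8 * 3;  -- 2^23 buckets * 8-byte pointer * 3 hash indexes
SELECT (@rows * @row_size) / 1048576.0 AS table_data_mb,  -- ~1,220 MB, the ~1.2 GB quoted above
       @hash_bytes / 1048576.0         AS hash_index_mb;  -- ~192 MB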

Ideally, In-Memory OLTP tables are best suited for insert-mostly tables with fewer changes ( deletes / updates ).
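For context, the bucket count from the formula above is declared when the table is created. A minimal sketch using SQL Server 2016 syntax, with made-up table and column names:

CREATE TABLE dbo.OrderEvents
(
    OrderID   bigint       NOT NULL,
    EventTime datetime2    NOT NULL,
    Payload   varchar(200) NOT NULL,
    CONSTRAINT PK_OrderEvents PRIMARY KEY NONCLUSTERED HASH (OrderID)
        WITH (BUCKET_COUNT = 5000000)  -- rounded up internally to 2^23 buckets
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);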


The SQL Server 2016 SP1 announcement brought quite a few changes to SQL Server feature availability across editions. How do you see the changes related to memory?

I was shocked ( pleasantly ) to see the changes. The big change was making columnstore and the "In Memory" feature available on Standard and lower editions. The memory used by columnstore is capped at 32 GB and sits outside the buffer pool. Hence one needs to calculate the memory for columnstore and then subtract it from physical memory to arrive at the maximum memory setting for the buffer pool.

"In Memory" OLTP tables have a maximum of 32 GB of memory per database which is also outside buffer pool. One needs to take that into account as well while sizing
the memory.
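A minimal sketch for watching the per-database consumption against that quota, via the In-Memory OLTP memory DMV:

SELECT OBJECT_NAME(object_id)          AS table_name,
       memory_allocated_for_table_kb,
       memory_used_by_table_kb,
       memory_allocated_for_indexes_kb
FROM sys.dm_db_xtp_table_memory_stats
ORDER BY memory_allocated_for_table_kb DESC;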

What are the warning signs if one fails to configure memory correctly or undersizes it?

Internal memory pressure is experienced if memory is sized insufficiently. It can be observed via:

1) Stolen Pages
2) Plan cache issues
3) Not enough buffer pool memory to keep all the data
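One way to spot these symptoms is via the memory counters SQL Server exposes; a quick sketch:

SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Stolen Server Memory (KB)',  -- stolen pages
                       'Page life expectancy',       -- buffer pool churn
                       'Memory Grants Pending');     -- queries waiting for memory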

Another thing one may want to check is unnecessary or unused non-clustered indexes. As indexes are updated, they are loaded into memory as well. So unused indexes not only add to disk usage but also waste buffer pool space.
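A common starting point for finding such indexes is sys.dm_db_index_usage_stats; a rough sketch ( note the stats reset on instance restart, so judge over a long enough window ):

-- Nonclustered indexes in the current database with no recorded reads
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       s.user_updates
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats s
       ON s.object_id = i.object_id
      AND s.index_id = i.index_id
      AND s.database_id = DB_ID()
WHERE i.type_desc = 'NONCLUSTERED'
  AND ISNULL(s.user_seeks + s.user_scans + s.user_lookups, 0) = 0;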



Edit from Nagaraj: Some of the content is debatable, as indicated in the comments. Viewer discretion is advised.








Monday, December 26, 2016

Query Store- Part 5: Checking forced plans and Query Store Failures

Most of you would know that one can force query plans using Query Store. Is there a need to check which plans are forced? Why?

Fundamentally, there are two important reasons for checking forced plans:

1) To check if the query is yielding the desired performance. The forced plan obviously needs to perform better than the unforced plan in most cases.

2) To look for query store failures while forcing plans.

This post will deal with point 2, query store failures while forcing plans.

What are query store failures?

Query store failures are events where query store is unable to force the plan it is supposed to. 

For example, let's say you have forced a plan that uses the index "NCIX_dt_present" for a particular query, and then you drop the index. Query Store then does the following:

1) Ignores the plan to be forced and picks the next best query plan of the optimizer's choice

2) Records the failure to force the plan in the catalog view "sys.query_store_plan", in the column "force_failure_count"

For example,

Consider the following query

Select session_id,cpu_time,dt,logical_reads 
from [dbo].[Process_monitor]
WHERE dt between '20150715' and '20150715 00:15'

Refer to the screenshot from query store



Plan 99 uses "NCIX_dt_present" ( a non clustered index ) for the query

To ensure that the query always uses the index "NCIX_dt_present", let's force plan 99 on query 92 as shown below.
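The same can be done in T-SQL with the built-in procedure, using the query and plan ids from the screenshot:

EXEC sp_query_store_force_plan @query_id = 92, @plan_id = 99;
-- sp_query_store_unforce_plan reverses it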



Run the select query a few times and verify that the plan is being forced. Now let's drop the index:

DROP INDEX NCIX_dt_present ON [dbo].[Process_monitor]
GO

Re-run the select query a few times:

Select session_id,cpu_time,dt,logical_reads 
from [dbo].[Process_monitor]
WHERE dt between '20150715' and '20150715 00:15'

The list of Query Store failures can be found with the query below:

SELECT
qt.query_sql_text,q.query_id,
CAST(query_plan AS XML) AS 'Execution Plan',
qp.force_failure_count,qp.last_force_failure_reason_desc,
rs.first_execution_time, rs.last_execution_time,
rs.count_executions,
rs.avg_duration,
rs.avg_rowcount,
rs.last_logical_io_reads,
rs.avg_cpu_time,
rs.avg_query_max_used_memory
FROM sys.query_store_plan qp
INNER JOIN sys.query_store_query q
ON qp.query_id = q.query_id
INNER JOIN sys.query_store_query_text qt
ON q.query_text_id = qt.query_text_id
INNER JOIN sys.query_store_runtime_stats rs
ON qp.plan_id = rs.plan_id
WHERE qp.is_forced_plan = 1
AND qp.force_failure_count > 0
Order by rs.last_execution_time



qp.force_failure_count indicates the number of times Query Store failed to force the plan. qp.last_force_failure_reason_desc gives the reason why the plan couldn't be forced.

It is strongly recommended to track query store failures and forced plans to avoid unexpected performance results.

Thursday, December 22, 2016

R and SQL Server 2016

Yesterday, I delivered another session for the Singapore SQL PASS Chapter. The session, titled "R and SQL Server 2016", gave an overview of "R" integration with SQL Server 2016 and also demonstrated how database administrators can apply "R" in their day-to-day operations. Video provided below. Thanks to the Engineers.SG team for recording it.





Sunday, December 11, 2016

My Interview at SQL Server Geeks 2016

One of the biggest highlights of this year for me was speaking at the SQL Server Geeks Annual Summit 2016. At the summit, I gave an interview to the SQL Server Geeks team on my summit experience. Video provided below.




Thursday, December 8, 2016

Query Store - Part 4 - Find the plans and queries forced by Query Store

As most of you know, one of the most fundamental uses of Query Store is to force query plans. Unlike query hints, Query Store doesn't demand a code change, nor is it as complex to implement as plan guides. One can force a plan without making a query / code change, and it is pretty simple to do via the user-friendly GUI. To figure out how to force a plan, please refer here

But one of the important things to do after forcing plans using Query Store is to track the plans that are being forced. One can do that using the following query:

SELECT
qt.query_sql_text,q.query_id,
CAST(query_plan AS XML) AS 'Execution Plan',
rs.first_execution_time, rs.last_execution_time,
rs.count_executions,
rs.avg_duration,
rs.avg_rowcount,
rs.last_logical_io_reads,
rs.avg_cpu_time,
rs.avg_query_max_used_memory, qp.force_failure_count
FROM sys.query_store_plan qp
INNER JOIN sys.query_store_query q
ON qp.query_id = q.query_id
INNER JOIN sys.query_store_query_text qt
ON q.query_text_id = qt.query_text_id
INNER JOIN sys.query_store_runtime_stats rs
ON qp.plan_id = rs.plan_id
WHERE qp.is_forced_plan = 1
Order by rs.last_execution_time

The reasons why one needs to track forced plans are interesting. That will be covered in the next post :)

Saturday, December 3, 2016

Data Channel - Interview 01 - Andreas Wolter on Columnstore

Dear all,

Welcome to the first ever interview of Data Channel. I am honored to have Mr. Andreas Wolter, a German MVP, MCM and MCSM, for the first interview. A detailed profile of Andreas is provided below.

Andreas Wolter has over 16 years of experience with SQL Server and earned the Master certification on both SQL Server 2008 (MCM) and SQL Server 2012 (MCSM), as one of only 2 experts worldwide. Besides that, he has been awarded the MVP for SQL Server since 2014.

He is the founder of Sarpedon Quality Lab, originally from Germany, specializing in the development and optimization of SQL Server database and data warehouse architectures with a focus on performance, scalability and security.


Website: http://andreas-wolter.com/
Facebook: https://www.facebook.com/andreas.wolter.01
LinkedIn: https://www.linkedin.com/in/andreaswolter

The topic of discussion is the Columnstore Index.

Interview video recording provided below:




Excerpts from the Interview provided below:


What is a Columnstore Index, and how is it different from a traditional index?

A columnstore index is optimized for access to large volumes of data, in the range of millions of rows. It is optimized for single-column-access queries, which are common in data warehouse style applications. The columnstore index design, combined with the underlying compression, helps fetch large volumes of data at a much faster pace for OLAP-style queries. The columnstore index has improved a lot in SQL Server 2016, making it much more flexible.


What are the standard gains for applications using columnstore index?


First, the application workload needs to be suitable for a columnstore index. The major gains would be:

Compression: If the application is a typical data warehouse application with a "star schema" design, with 20 million or more rows, one is likely to get around 50% data compression by using a columnstore index. The effectiveness of compression also depends on the data types of the columns.

Speed: On the speed front, queries can be two or three times, or sometimes even 100 times, faster, depending upon how the query has been designed and how it was performing previously.

How does SQL Server make such an improvement possible? What is the architecture of columnstore that makes it possible?

It is a fundamental change in the data storage that makes it possible. Columnstore data is heavily compressed, resulting in a massive reduction of I/O. Even today, in the era of SSDs, disk I/O is the slowest part of the system. Compression provides about half of the improvement in columnstore. And of course, there is more to it too.

How does one pick the tables that would benefit from columnstore? How does one design the application to leverage the benefits of columnstore?

Ideally, data warehouse style applications would benefit. Tables running into several million rows, and wide tables with many columns ( for example, 50 columns ), would benefit from a columnstore index. The table needs to be both wide and long to benefit from a columnstore index. The type of queries commonly executed by the application is also important. Columnstore helps queries fetching a large number of rows, and wouldn't make a difference for queries reading 1 or 2 rows. Queries which involve aggregate functions with a GROUP BY clause, reading lots of rows, benefit the most.

How are columnstore indexes stored on disk? Are they stored differently compared to traditional indexes?

Totally different. The only thing in common is the 8KB pages for storing data. Unlike traditional index pages, which contain data of multiple fields and multiple data types, columnstore stores only one column's data on a single page ( one data type ). These pages are organized as segments, and each segment contains data from just one single column. This is what makes heavy compression possible.
The second part is data retrieval: if a query selects only 5 columns, columnstore reads only the segments that contain those 5 columns. A traditional index would be forced to read all the columns, as each page contains all the columns of the table or index.
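For completeness, a minimal sketch of creating one and peeking at its segments ( the fact table name is made up for illustration ):

CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
ON dbo.FactSales;

-- Each row here is one column's segment, reflecting the storage described above
SELECT column_id, segment_id, row_count, on_disk_size
FROM sys.column_store_segments;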

The columnstore index is one of those technologies which has improved a lot since 2012. Can you walk us through the lifecycle of improvements it has had up to 2016?

Columnstore was actually first introduced in SQL Server 2008, in the Parallel Data Warehouse edition of SQL Server.
In SQL Server 2012, once a nonclustered columnstore index was put on a table, the table became write-protected and supported only read-only queries. Due to this limitation, very few people adopted it.

In SQL Server 2014, the clustered columnstore index was introduced. A clustered columnstore index by design contains all columns of the table, and it was updatable. However, the nonclustered columnstore index was still read-only.

In SQL Server 2016, both clustered and nonclustered columnstore indexes are updatable. However, one can't mix the two: one can have either a clustered columnstore index or a nonclustered columnstore index on a table, but not both.

Andreas has prepared an Index Cheat Sheet to showcase the various index combinations. Please check it out over here


We understand that a clustered columnstore index contains all the columns of the table. Do I still need traditional indexes?

Yes, good question. You will still need them if you have queries which pick smaller sets of data. Traditional nonclustered indexes are required for single-row or few-row lookups. Fetching a small number of rows, or specific rows, from compressed segments would be harder, and traditional indexes help those requests.

What is Real Time Operational Analytics?

Let's split this into 2 parts. Analytics is what we have been discussing all along with columnstore. Real-time operational data refers to data that is constantly changed via inserts / updates / deletes.
Real-time operational analytics refers to the same table being used for two kinds of workload: regular operations like inserts / updates / deletes, and also data warehouse style analytic queries. This has been made possible in SQL Server 2016 via the "Delta Store" in columnstore indexes. The "Delta Store" is an area of the columnstore which contains the hot, or recently changed, data in an uncompressed format. This combination of static data in compressed segments and hot data in the "Delta Store" makes real-time operational analytics possible in SQL Server 2016.
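A minimal sketch of enabling this on an OLTP table ( the table and column names are illustrative ): an updatable nonclustered columnstore index over the columns the analytic queries aggregate.

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders
ON dbo.Orders (OrderDate, CustomerID, Quantity, Amount);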

The other buzzword is In-Memory OLTP. Can it be combined with columnstore? How would applications benefit from these two technologies?

The other much-improved technology of SQL Server 2016 is "In Memory". Yes, in SQL Server 2016 one can combine the columnstore technology with an in-memory table, which gives faster access to real-time data. "Real Time Operational Analytics" with "In Memory OLTP" makes access to operational data lock-free and latch-free, and also lets the analytic workload benefit from the columnstore technology.
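A sketch of combining the two in SQL Server 2016, with illustrative names: a memory-optimized table carrying an inline clustered columnstore index.

CREATE TABLE dbo.Trades
(
    TradeID bigint  IDENTITY NOT NULL PRIMARY KEY NONCLUSTERED,
    Symbol  char(6) NOT NULL,
    Qty     int     NOT NULL,
    Price   money   NOT NULL,
    INDEX CCI_Trades CLUSTERED COLUMNSTORE  -- analytics over the hot rows
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);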

What are the warning signs in using columnstore? What are the "Don'ts" while using columnstore?

The queries / workload should suit the design of the columnstore index.
1) Applications where most queries use "SELECT *" or many columns may not benefit from a columnstore index
2) Database schema design also matters. It shouldn't be a too-complex design like a snowflake / galaxy schema; data warehouse applications using a star schema design are the most suitable
3) The functions or queries used also matter - for example, functions like MIN / MAX are suitable, while operations like DISTINCT are unsuitable for columnstore

Tuesday, November 29, 2016

SQL Server 2016 SP1 – Rocking RELEASE!!!

Last week, I did a presentation titled "SQL Server 2016 SP1 – Rocking RELEASE!!!" for the Singapore SQL User Group. The presentation was about the things I covered over here, plus a few more cool features released with SQL Server 2016 SP1. The best part was that it was recorded :) This is my first presentation that has been recorded and uploaded to a public platform. Obviously excited to share it with all. Special thanks to Paul Lorret and the engineers.sg team for recording it.
The presentation was certainly not my best, as I had just 2 days to prepare, but it is still worth sharing, and I had fun doing it :) Link provided below.



Slides and scripts can be downloaded from here

Saturday, November 26, 2016

Data Channel - Launch

Dear All,

I am starting a YouTube channel for SQL Server and Microsoft Data Platform technologies called "Data Channel". The channel will contain interviews with SQL Server and Data Platform experts like MVPs, MCMs, community leads and Microsoft professionals too. Interviews are generally 15 to 20 minutes long, giving us a quick understanding of the subject discussed. The intention behind starting this channel is that, over a period of time, it will develop into a huge volume of useful technical information for our community. I have shot a quick promo for the same. Take a look at it below.






If you like this idea, please subscribe to the "Data Channel" over here.  Please share your comments and follow this space for more updates on the channel. 

Regards,
Nagaraj



Sunday, November 20, 2016

SQL Server 2016 - SQL Service Doesn't Start after GDR Security update

Recently, after a GDR security fix (https://support.microsoft.com/en-us/kb/3194716) on SQL Server 2016 RTM, the SQL Service refused to start. Though the patch completed successfully as per "C:\Program Files\Microsoft SQL Server\130\Setup Bootstrap\Log", the Event Viewer, under "System", gave the following error:

"The SQL Server (I2016) service terminated with the following service-specific error: %%945"
The error didn't make much sense.

The Event Viewer, under the "Application" section, gave a much more meaningful error:

FileMgr::StartLogFiles: Operating system error 2(The system cannot find the file specified.) occurred while creating or opening file 'D:\MSSQL2016Root\MSSQL13.I20161708\MSSQL\Binn\mssqlsystemresource.ldf'. Diagnose and correct the operating system error, and retry the operation.





From the error it was fairly clear that "mssqlsystemresource.mdf" and "mssqlsystemresource.ldf" were missing. A Windows search for those two files showed that they had mysteriously been moved to "C:\Program Files\Microsoft SQL Server\130\LocalDB\Binn".

Even though there is no LocalDB installed, the security fix had likely moved "mssqlsystemresource.mdf" and "mssqlsystemresource.ldf" to a different folder, causing SQL Server to fail while starting.

The fix was simple: copy the "mssqlsystemresource.mdf" and "mssqlsystemresource.ldf" files from "C:\Program Files\Microsoft SQL Server\130\LocalDB\Binn" back to their original location, which was "D:\MSSQL2016Root\MSSQL13.I20161708\MSSQL\Binn\". After the file copy, the SQL Service started as usual.

The above issue occurred on a Windows 10 laptop with SQL Server 2016 Developer edition. Please note that the GDR was applied automatically by Windows Update. Yes, one would not schedule auto-updates of SQL security patches on production machines, but I suspect the issue would have occurred even if the GDR had been installed manually.

Hope this post helps someone trying to troubleshoot a SQL Service failure after applying a security fix.

Thursday, November 17, 2016

SQL Server 2016 SP1 Release - Thoughts and Comments!!!

SQL Server 2016 SP1 was released last night, and most of the features that were previously part of Enterprise edition have been made available in Standard edition now!!!! By far the biggest change any service pack has brought about in any version of SQL Server!!!

The list of items which moved from Enterprise to Standard is provided below. Read about the announcement over here

  • Change Data Capture
  • Database Snapshot
  • Columnstore Index
  • In-Memory OLTP
  • Row Level Security
  • Dynamic Data Masking
  • Always Encrypted
  • Fine Grained Auditing
  • Data Compression
  • Multiple FILESTREAM Containers
  • Partitioning
  • Polybase
That's quite a change, and I am super excited looking at the list of items. Quick comments on a few of those items.

Columnstore Index:

A game-changer technology, especially for data warehouse applications, now moves to Standard. But the concern is that it is a memory-driven feature known to consume a significant amount of memory. The memory limit of 128 GB still holds on Standard edition. To add to that, SQL Server 2016 SP1 sets a hard limit for columnstore of 25% of the maximum memory set for SQL Server. For example, if 48 GB is allocated to SQL Server, columnstore can use a maximum of 12 GB. This implies that, theoretically, a columnstore index can take at most 32 GB of RAM ( 25% of the 128 GB cap ). However, the memory used by columnstore is not counted within the buffer pool's quota. So, if 48 GB of memory is set for SQL Server, SQL Server's total memory usage can go up to 60 GB ( 12 GB for columnstore + 48 GB of buffer pool ).

Though the extended memory provided for columnstore does sound useful, it may not be that effective, as a columnstore index is at its best only when it operates on large volumes of data with a sizable amount of memory.
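To see this in practice, the columnstore object pool shows up as its own memory clerk, outside the buffer pool; a quick sketch:

SELECT type, name, pages_kb
FROM sys.dm_os_memory_clerks
WHERE type = 'CACHESTORE_COLUMNSTOREOBJECTPOOL';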

In-Memory OLTP:

In my opinion, one of the best moves of SP1. The reason is that even though "In Memory" is ideally suited to systems with a few hundred GBs of RAM, it can prove handy for some mid-size systems too. The 25%-of-max-memory hard limit also applies to "In Memory" tables ( similar to columnstore ). For example, let's say we have 64 GB of RAM for SQL Server; then 16 GB would be available for "In Memory" tables.

1) Though the memory available may sound small, it can still be useful to place the most important and most accessed tables in memory to take advantage of "In Memory OLTP". Please note that the most important tables of an application may not be the largest tables by size.

2) "In memory OLTP" can help applications get rid of locking and latch contentions almost completely.


3) Remember, "In Memory" is not just about placing tables in memory. SQL Server reads and writes to them differently, enhancing the overall performance of the application.


So, this is an enhancement which I will certainly be pushing my application teams to take advantage of.

Security & Encryption Related Features: Row Level Security / Dynamic Data Masking / Always Encrypted / Fine Grained Auditing

It is an excellent move to place security-related features in Standard and other editions. My personal take has always been that security-related features should be available in all editions of SQL Server. The reason is that an application may not have high performance requirements demanding Enterprise edition but may still need key security features. For example, a small shopping cart application which just needs SQL Express, but deals with sensitive data ( credit card numbers, account balances etc. ), requires features like "Always Encrypted" or "Row Level Security" to be developed securely.

Partitioning:

Another welcome change, as Standard edition does deal with terabytes of data, and partitioning makes life a lot easier.

Change Data Capture / Database Snapshot:

These are commonly used features in development or staging environments, and making them available across all editions helps environments which use different editions for development and production.

What remains in Enterprise:

High availability features like "Always On Availability Groups" ( barring Basic AG, which has been available in Standard since SQL Server 2016 RTM ) and "Online Index Rebuild" remain in Enterprise. Capacity limitations of the lesser editions ( max 128 GB memory, 24-core processor limit on Standard etc. ) also remain. This makes absolute sense, as these features are primarily required by applications with extreme performance requirements, and those should ideally be on Enterprise edition ( without capacity limits ).

Also notice that the Enterprise-only features are more "infrastructure" related than "development" related. This is also a move in the right direction, as one needs to have the same editions in the development and production environments to make application development effective.

Overall, indeed one of the best announcements in the entire history of SQL Server. No doubt it is going to make several applications move to SQL Server 2016 soon, and it makes my job of convincing application teams easier :)