Friday, January 13, 2012

File Group Backups - Intro



What is File Group backup?

Backing up a portion of a database, say a File Group is termed as Filegroup backup.
If you are wondering what are filegroups, then in short a Database's data files can be made of multiple files or groups of files. For more info on File Groups read here

When File Group backups are useful ?
Assume you have very large database with few hundred GBs or a few Terabytes. The database is divided into multiple filegroups, with recently loaded data in one file group and older data in other filegroups. For example, you have a database which maintains a shop's order/transaction details. Assume that the database is designed to have each year's transaction at one filegroup. Then, instead of backing up the entire database, it would save lot of disk space, if one backup's up the current year's file group alone.

How to take file group backups ?

The Screenshot shows how to take file group backup. Fairly straight forward.


What are the advantages of file group backups ?
1) Saves lot of space as you backup only a portion of the backup.
2) Can bring the database online partially and at a faster pace. You can restore only your highest priority filegroup first, bring it online while filegroups havent been restored.
3) If a table or particular filegroup is corrupted then One can restore the filegroup seperately.

Requirements
Any backup strategy is said to work only when one can successfully recover the database. With filegroup backups, there is one basic principle. Each filegroups that is online should be consistent with the rest of the filegroups in the database. Also, the primary filegroup should be restored first for the database to be partially online.

To explain bit more, assume you have a database 'DB' with filegroups FG1,FG2,FG3. All are read write file groups.FG1 is the primary file group.You can bring the database online partially with either of these

* File group backups of FG1 alone
* File group backups of FG1 + FG2
* File group backups of FG1 + FG3
* File group backup of FG1 + FG2 + FG3 ( this becomes completely online )

However one should note that FG2/FG3 backup set should have the same Restoration point as FG1. Restoration point is the time upto which backups where taken for a file/filegroup.

Assume one has taken full backup of a database at 1 PM and transaction log backups at 2PM and 3PM. After that there were no backups taken for the database. Then the restoration point is termed to be 3PM. In other words, the time upto which you are restoring a backup is termed as restoration point.

So in our case, one CANT bring the database partially ( excluding FG1 alone ) online with

* FG1 file group backup taken on 10th Jan 9 PM
* FG2 file group backup taken on 9th Jan 9 PM
* FG3 file group backup taken on 8th Jan 9 PM

Attempts to restore with these 3 backups alone will fail as FG3 contains transactions upto 8th Jan night, FG2 upto 9th night and FG1 upto 10th night.

What CAN work is

* FG1 file group backup taken on 10th Jan 9 PM
* FG2 file group backup taken on 9th Jan 9 PM
* FG3 file group backup taken on 8th Jan 9 PM
* Additional T-Log backups from 8th Jan 9 PM to 10th Jan 9 PM.

T-Log backups from 8th Jan 9 PM to 10th Jan 9 PM contain all the transactions till
10th Jan 9 PM and upon restoration we can bring FG1,FG2,FG3 to the same restoration point.

In short the two most important principles for filegroup backups are

1) Primary Filegroup should be restored first.
2) All the Filegroups should have the same restoration point.

The table below shows the recovery models and Modes at which filegroup backups are useful.



Recover modelRead onlyStrategy
FullNoFull FG backups + Differential + T-log backups
SimpleNoDoesn't work
FullYesFull FG backups for Read write + T-Log backups
SimpleYesFull FG backups

On the upcoming posts, I will be explaining various backup strategies and restoration scenarios in detail.