Monday, June 28, 2010

Rebuild vs ReOrg

Rebuild or Regorg, difference between Rebuild or Reorg is a fairly common topic on blogs and forums. Here is my take on it.As I am not good on writing tr and td html tags lemme list down the comparison as pointers instead of a tabular column.

Before starting the comparsions, just a small intro on fragmentation. I have written in detail on it over here. But still, a quick intro.
Database Fragmentation is of 2 types. Internal and external fragmentation.

>Internal Fragmentation refers to the fragmentation that happens with in a page.
>External fragmentation is divided to 2 types. Logical and extent fragmentation.
>Logical fragmentation refers to out of order pages within a extent.
>Extent fragmentation refers to out of order extents.

What Reorg does? :

Takes the leaf pages of each index, fixes the internal fragmentation by ensuring that each leaf page is filled as per the fillfactor specified initially while creating the index. Reorg also swaps and rearranges the out of order pages to fix logical fragmentation.

picture of an Sample Extent before Reorg:



picture of Sample Extent after Reorg:



What Rebuild does? :
Rebuild index recreates the index from scratch which means both the leaf and non leaf pages's internal fragmentation is removed.As it rebuilds the index completely it fixes logical fragmentation. Rebuild fixes extent fragmentation most of the
times, but there are a few exceptions.

Let us see the important differences/similarities.

New Page allocation difference - Reorg and Rebuild:
Reorg while fixing internal and logical fragmentation, uses only the existing pages that are allocated to the table.It never allocates new pages and moves around the existing pages to get the internal and logical fragmentation correct.Perhaps thats the reason its called 'Reorg'.

On the other hand Rebuild can allocate new pages, to fix logical and extent fragmentation. Allocating new pages means that Rebuild can sometimes increase the space usage in the data file.

Leaf level pages
As mentioned earlier, Reorg defragments only leaf level pages of a index and Rebuild defrags both leaf and non leaf levels of an index.

Extent fragmentation
Extents are a group of 8 pages, when not present physically in their logical order, is said to be extent fragmented.For ex: page 1-8 of table 1 are present in extent-1.if Pages-9 to 16 of the table, present in extent-2 is not physically stored after extent-1 then the table is said to extent fragmented.

Picture of Sample Extent Fragmented table:


Picture of Extents storage order after Rebuild



Locking - online/offline differences
Reorg is a online process. ie The table remains online during Reorg and updates and Inserts can be done to the table. Reorg takes Indent Exclusive lock at the table level. At page level, it takes exclusive lock only at the page at which is currently working on (swapping/ fixing internal frag ) and leaves the rest of the pages of the table. Once it finishes work on a single page, then it releases the exclusive lock and pulls a exclusive lock the next page it has to work on. If the system is too busy, Reorg sometimes skips the pages.

Rebuild is a offline process and consumes a exclusive lock on the entire table and stops any access to it until the operation finishes.However, Rebuild can be done online using the option ONLINE = ON,if and only if you have a Enterprise edition.

Stats differences
Rebuild of clustered index updates all the statistics assocaited with the table ( both auto and user created ) whereas Reorg doesn't do the same. The STATS_DATE function can be used to check the date at which statistics were last updated.

Log file growths
Both cause the log files to grow. Rebuild can be a minimally logged operation where as REOrg is a fully logged operation.

Stoping / Pausing REOrg
REOrg can be stopped during execution and can continue its execution from it where left.But Rebuild acts like a single transaction and always starts from scratch.

Lob differences
Rebuild of clustered index on a table with LOB column ( varchar(max),varbinary(max),image,xml,text ) fails when performed ONLINE.REOrg of Clustered index compacts LOB objects on the table. REOrg of a non clustered index compacts LOB columns that are part of the included non key columns of the NC Index.

Replacements
Reorg replaces DBCC Indexfrag and Rebuild replaces DBCC DBReindex. The DBCC commands still work on SQL Server 2005, but its recommended to switch to REorg/Rebuild.

Just a note about fragmentation. Fragmentation can hurt your servers performance only when your queries scan the tables instead of Seeking.If your indexes are well selected and accurate, then the queries should use seek operator to fetch the rows and as Seek directly jumps to the row it wants, it doesnt matter how much the table is fragemented or disordered.So, a occasional Reorg or rebuilt should be enough to keep your databases and tables healthy. In general, one perfoms Re0rg when fragmention is around 10 -30 % and rebuild when its greater than 30 %.

3 comments:

Anonymous said...

Easy to understand.Thank u

Nagaraj Venkatesan said...

Welcome!!!

Anonymous said...

Hi Nagaraj,

Thanks for your post. I have a question on Statistics.
1. What is Statistics ??
2. Do we have two different statistics - index statistics and Table statistics (or both are same).
3. What is the syntax used to find the latest updated date ??
4. What is the syntax used to update both index statistics and table statistics (wherein both are different).

Please help my understanding.