Saturday, June 25, 2011

ALTER TABLE - Adding Column - Column Order Impact

Most of us would have altered a table structre by adding a column to the table. When an ALTER table script is used to add the column to the table, the column is placed on the last of the table. For inserting a column, in the middle of the table one needs to use the SQL Server Management Studio ( SSMS ).ie., Right click on the table and pick design table and then proceed to add a column. This post will deal with the significance of column order while adding a column to the table.

Consider the table 'sample'. The table contains about 800 rows with a size of 6 MB. Relatively small by a normal database standards.

Let us add column in the middle using SSMS as shown below. Once we click on save button to save the column addition, the operation completes immediateley.

Let me add a few more rows into the table. Now the table contains over 200K rows and the size of the table is 1.6 GB.

Now let me add a column to the middle of the table using SSMS as done proviously. Now the operation takes much much longer. Just in case if you face timeout error refer here.

Now the operation takes few hours to complete and the new column will be inserted between two columns as shown below.

Let us add one more column at the last of the table ( not in between the columns as doen previously). Refer to picture below.

ALTER TABLE sample ADD col3_int CHAR(5) 

Now the operation takes in 0 seconds to complete. The points to note are provided below.

* When we add a column to the middle of the table using SSMS, when the number of rows are higher, it consumes a longer execution time and resource.

However, when the number of rows are lesser it doesnt consume much of time and CPU/IO resource. This aspect is to be handled carefully where a DBA can fall for the trap if overlooked.

Assume, DBA is planning to perform a small column addition on the production server. DBA has already tested in staging and it was over in few seconds. DBA assumes that the operation is going to take a few seconds in production and plans accordingly. If the number of rows are higher in production, the DBA can be taken for ride and it can be different ball game all together.

So lesson to be learnt is Dont underestimate any table modification prepations and make sure to check size/# of rows on production before deployment.

* When the column was added to the end of the table ( without caring about position of the column ) using T-SQL script ( ALTER TABLE coomand ),it completed immediately without consuming much of time and resource, though the number of rows were very high.

Lesson learnt is insertion of a column to the middle ( or rather between two other columns ) of the table should not be done, unless there is a strong reason to do so. If there is no strong reason, then always add the column to the end of the table using ALTER TABLE script as they take much much lesser time to execute.

* Use scripts instead of SSMS GUI especially while performing table strucutre modifications or DDL operations.

We will take a much closer look in the next post exploring why such a behavior is observed.


LAKSHMI said...

Raj, NagaRaj
Howa are you?

warm welcome.....

Nagaraj Venkatesan said...

Thank u

Anonymous said...

How does column order even matter in an RDBMS?
Always use a script and leave column ordering to the front end - there is no meaning in having a column in the middle or the end or the beginning.

Nagaraj Venkatesan said...

Internllay ordering doesnt matter. But when Programmers use code like 'Insert into table select * from table' and u cant change the program then it becomes ineveitable.