How Online Indexing Works
The default behavior of either method of rebuilding an index is that SQL Server takes an exclusive lock on the index, so it is completely unavailable while the index is being rebuilt. If the index is clustered, the entire table is unavailable; if the index is non-clustered, there is a shared lock on the table meaning no modifications can be made but other processes can SELECT from the table (But obviously they cannot take advantage of the index being rebuilt). Now this is pretty miserable in large databases since queries wont be able to take advantage of indexes resulting in our arch nemisis: table scans.
The online build works by maintaining two copies of the index simultaneously, the original (source) and the new one (target). The target is used only for writing any changes made while the rebuild is going on. All reading is done from source as well. SQL Server row-level versioning is used so anyone retrieving information from the index will be able to read consistent data.
Here are the steps involved in rebuilding a non-clustered index
- A shared lock is taken on the index, which prevents any data modification queries and an Intent-Shared lock is taken on the table
- The index is created with the same structures as the original and marked as write-only
- The shared lock is released on the index, leaving only the Intent-Shared lock on the table.
- A versioned scan is started on the original index, which means modifications made during the scan will be ignored. The scanned data is copied to the target
- All subsequent modifications will write to both the source and the target. Reads will use only the source
- The scan of the source and copy to the target continues while normal operations are performed.
- The scan completes
- A Schema-Modification-Lock (most strict lock) is taken to make the source completely unavailable
- The source is dropped, metadata is updated, and the target is made to be read-write
- The Schema-Modification-Lock is released.
For a build of a new clustered index or a rebuild of a clustered index with a schema change there are a few more differences. First, an intermediate mapping index is used to translate between the source and target physical structures. Additionally, all existing non-clustered indexes are rebuilt one at a time after a new base table ahs been built. Creating a clustered index on a heap with two non-clustered indexes involves the following steps:
- Create a new write-only clustered Index
- Create a new non-clustered index based on the new clustered index
- Create another new non-clustered index based on the new clustered index
- Drop the heap and the two original non-clustered indexes