Welcome!

Microsoft Cloud Authors: David H Deans, Liz McMillan, Pat Romanski, Janakiram MSV, Jnan Dash

Related Topics: Microsoft Cloud

Microsoft Cloud: Article

SQL Server Index Fragmentation In-Depth.

Fragmentation is a common term that describes numerous effects that can occur because of data modifications

There is no way to avoid index fragmentation in any SQL Server environment. It does not depend on your SQL Server version or I/O subsystem you have, or your hardware. In this article, we will drill down into SQL Server index fragmentation issue. We will figure out why index fragmentation is a problem and how it affect on overall performance, discuss how to detect and avoid it.

Index Fragmentation
Fragmentation is a common term that describes numerous effects that can occur because of data modifications. Chances are, you already know that SQL Server stores data on 8KB data pages. Eight contiguous pages form extent. A data page both in clustered or non-clustered indexes contains pointers to the next and previous pages. The following picture demonstrates that there are no fragmentation.

Let's insert a new row into the index and see what happens. SQL Server inserts a new row on the data page in case there is enough free space on that page, otherwise the following happens:

  1. SQL Server allocates a new data page or even a new extent.
  2. A part of data from the existing (old) data page transfers to a newly allocated data page.
  3. In order to keep the logical sorting order in the index, pointers on both pages are updated.

As a consequence, we have two types of index fragmentation:

Logical fragmentation (also called external fragmentation or extent fragmentation) - the logical order of the pages does not correspond their physical order. As a result, SQL Server increases the number of physical (random) reads from the hard drive, making the read-ahead mechanism less efficient. This directly impacts to the query execution time, because random reading from the hard drive is far less efficient comparing to sequential reading.

Internal fragmentation - the data pages in the index contain free space. This lead to an increase in the number of logical reads during the query execution, because the index utilizes more data pages to store data.

Detecting fragmentation
Before you decide which defragmentation approach to use, it is required to analyze the index to find out the degree of fragmentation. You can use the sys.dm_db_index_physical_stats data management function to analyze fragmentation. The following columns in the resultset are most important:

avg_page_space_used_in_percent shows the average percentage of the data storage space used on the page. This value allows you to see the internal index fragmentation.

avg_fragmentation_in_percent provides you with information about external index fragmentation. For tables with clustered indexes, it indicates the percent of out-of-order pages when the next physical page allocated in the index is different from the page referenced by the next-page pointer of the current page. For heap tables, it indicates the percent of out-of-order extents, when extents are not residing continuously in data files.

fragment_count indicates how many continuous data fragments the index has. Every fragment constitutes the group of extents adjacent to each other. Adjacent data increases the chances that SQL Server will use sequential I/O and Read-Ahead while accessing the data.

Avoiding Index Fragmentation
To avoid index fragmentation, try to adhere the following rules:

  1. Choose a cluster key that complements the table's insert pattern
  2. Do not insert records with random key values
  3. Do not update records to make them longer
  4. Do not update index key columns
  5. Be aware of features that can cause page splits
  6. Implement index fill factors

Utilizing Index Fill Factor
Set SQL Server to leave free space on index leaf pages. The main idea is to allow records to expand, records to be inserted without filling out the page and having to cause page split.

Therefore, you need to figure out how much space you want to leave. Amount of space to use is 100% minus fill factor value (e.g., fill factor of 70 means 30% free space).

SQL Server only uses the fill factor when an index is created, rebuild, or reorganized. The index fill factor is not used during regular inserts, updates and deletes. In fact, that does not make any sense, because the whole point is to allow inserts and updated to happen and to add more records without filling up the page.

It is possible to set the instance fill factor using sp_configure, but not recommended. The reason is when you set the fill factor for the entire instance, there are probably some indexes that do not need fill factor. If you find that you've got fragmentation problems on non-leaf level of the index (rare), you can use the PAD_INDEX option. It takes fill factor that has been specified and puts it up into the non-leaf level.

Setting a Fill Factor
Probably the easiest way to set the fill factor is to use the FILLFACTOR option,  when you create or rebuild an index. You can also use the Object Explorer to set the fill factor. Note that you can not set fill factor when you reorganizing an index. Both REBUILD and REORGANIZE use the fill factor stored in index metadata, otherwise they use the instance-wide (default) fill factor. Unless the FILLFACTOR option is specified for a REBUILD.

An obvious question arises: "What fill factor do I use?" Well, actually, there is no any magic number. You just need to pick an initial fill factor and implement it. Put it into production and then monitor how quickly fragmentation occurs. Then choose to do one or both of the following: increase/decrease the fill factor or change the frequency of index maintenance.

Removing Index Fragmentation
Let's figure out what is the difference between ALTER INDEX ... REBUILD and ALTER INDEX ... REORGANIZE. The following table demonstrates the difference:

There is no correct answer in regards: "What to use REBUILD or REORGANIZE?" One of Microsoft Books Online provides the following guidance:

0 to 5-10% - do nothing

5-10% to 30% - do REORGANIZE

30% to 100% - do REBUILD

Eventually, you have several basic options to remove index fragmentation:

  1. Choose to rebuild index
  2. Choose to reorganize index
  3. Forced to reorganize by HA/DR features
  4. Do nothing
  5. Use CREATE INDEX...WITH (DROP_EXISTING=ON) - does the same as ALTER INDEX...REBUILD but you need to specify the entire CREATE INDEX statement.

Resources
Below is a list of resources you may find useful:

More Stories By Jordan Sanders

Jordan Sanders is a Software Marketing Manager at Devart Company. He helps DBAs, software developers (C#, .NET, Delphi) from all around the globe to increase their productivity by using new tools, practices and new approaches to database development and management. He has experience in MySQL, SQL Server, Oracle databases consulting and also in Delphi development. He is always trying to share his knowledge and ideas with the community of his interest.

@ThingsExpo Stories
Your homes and cars can be automated and self-serviced. Why can't your storage? From simply asking questions to analyze and troubleshoot your infrastructure, to provisioning storage with snapshots, recovery and replication, your wildest sci-fi dream has come true. In his session at @DevOpsSummit at 20th Cloud Expo, Dan Florea, Director of Product Management at Tintri, will provide a ChatOps demo where you can talk to your storage and manage it from anywhere, through Slack and similar services ...
SYS-CON Events announced today that Ocean9will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Ocean9 provides cloud services for Backup, Disaster Recovery (DRaaS) and instant Innovation, and redefines enterprise infrastructure with its cloud native subscription offerings for mission critical SAP workloads.
The taxi industry never saw Uber coming. Startups are a threat to incumbents like never before, and a major enabler for startups is that they are instantly “cloud ready.” If innovation moves at the pace of IT, then your company is in trouble. Why? Because your data center will not keep up with frenetic pace AWS, Microsoft and Google are rolling out new capabilities In his session at 20th Cloud Expo, Don Browning, VP of Cloud Architecture at Turner, will posit that disruption is inevitable for c...
SYS-CON Events announced today that SoftLayer, an IBM Company, has been named “Gold Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York, New York. SoftLayer, an IBM Company, provides cloud infrastructure as a service from a growing number of data centers and network points of presence around the world. SoftLayer’s customers range from Web startups to global enterprises.
SYS-CON Events announced today that Conference Guru has been named “Media Sponsor” of SYS-CON's 20th International Cloud Expo, which will take place on June 6–8, 2017, at the Javits Center in New York City, NY. A valuable conference experience generates new contacts, sales leads, potential strategic partners and potential investors; helps gather competitive intelligence and even provides inspiration for new products and services. Conference Guru works with conference organizers to pass great dea...
SYS-CON Events announced today that Technologic Systems Inc., an embedded systems solutions company, will exhibit at SYS-CON's @ThingsExpo, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Technologic Systems is an embedded systems company with headquarters in Fountain Hills, Arizona. They have been in business for 32 years, helping more than 8,000 OEM customers and building over a hundred COTS products that have never been discontinued. Technologic Systems’ pr...
SYS-CON Events announced today that CA Technologies has been named “Platinum Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY, and the 21st International Cloud Expo®, which will take place October 31-November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. CA Technologies helps customers succeed in a future where every business – from apparel to energy – is being rewritten by software. From ...
With major technology companies and startups seriously embracing Cloud strategies, now is the perfect time to attend @CloudExpo | @ThingsExpo, June 6-8, 2017, at the Javits Center in New York City, NY and October 31 - November 2, 2017, Santa Clara Convention Center, CA. Learn what is going on, contribute to the discussions, and ensure that your enterprise is on the right path to Digital Transformation.
SYS-CON Events announced today that Telecom Reseller has been named “Media Sponsor” of SYS-CON's 20th International Cloud Expo, which will take place on June 6–8, 2017, at the Javits Center in New York City, NY. Telecom Reseller reports on Unified Communications, UCaaS, BPaaS for enterprise and SMBs. They report extensively on both customer premises based solutions such as IP-PBX as well as cloud based and hosted platforms.
SYS-CON Events announced today that Loom Systems will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Founded in 2015, Loom Systems delivers an advanced AI solution to predict and prevent problems in the digital business. Loom stands alone in the industry as an AI analysis platform requiring no prior math knowledge from operators, leveraging the existing staff to succeed in the digital era. With offices in S...
SYS-CON Events announced today that Interoute, owner-operator of one of Europe's largest networks and a global cloud services platform, has been named “Bronze Sponsor” of SYS-CON's 20th Cloud Expo, which will take place on June 6-8, 2017 at the Javits Center in New York, New York. Interoute is the owner-operator of one of Europe's largest networks and a global cloud services platform which encompasses 12 data centers, 14 virtual data centers and 31 colocation centers, with connections to 195 add...
SYS-CON Events announced today that T-Mobile will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. As America's Un-carrier, T-Mobile US, Inc., is redefining the way consumers and businesses buy wireless services through leading product and service innovation. The Company's advanced nationwide 4G LTE network delivers outstanding wireless experiences to 67.4 million customers who are unwilling to compromise on ...
In his session at @ThingsExpo, Eric Lachapelle, CEO of the Professional Evaluation and Certification Board (PECB), will provide an overview of various initiatives to certifiy the security of connected devices and future trends in ensuring public trust of IoT. Eric Lachapelle is the Chief Executive Officer of the Professional Evaluation and Certification Board (PECB), an international certification body. His role is to help companies and individuals to achieve professional, accredited and worldw...
SYS-CON Events announced today that Infranics will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Since 2000, Infranics has developed SysMaster Suite, which is required for the stable and efficient management of ICT infrastructure. The ICT management solution developed and provided by Infranics continues to add intelligence to the ICT infrastructure through the IMC (Infra Management Cycle) based on mathemat...
SYS-CON Events announced today that SD Times | BZ Media has been named “Media Sponsor” of SYS-CON's 20th International Cloud Expo, which will take place on June 6–8, 2017, at the Javits Center in New York City, NY. BZ Media LLC is a high-tech media company that produces technical conferences and expositions, and publishes a magazine, newsletters and websites in the software development, SharePoint, mobile development and commercial UAV markets.
SYS-CON Events announced today that Cloudistics, an on-premises cloud computing company, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Cloudistics delivers a complete public cloud experience with composable on-premises infrastructures to medium and large enterprises. Its software-defined technology natively converges network, storage, compute, virtualization, and management into a ...
Now that the world has connected “things,” we need to build these devices as truly intelligent in order to create instantaneous and precise results. This means you have to do as much of the processing at the point of entry as you can: at the edge. The killer use cases for IoT are becoming manifest through AI engines on edge devices. An autonomous car has this dual edge/cloud analytics model, producing precise, real-time results. In his session at @ThingsExpo, John Crupi, Vice President and Eng...
SYS-CON Events announced today that HTBase will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. HTBase (Gartner 2016 Cool Vendor) delivers a Composable IT infrastructure solution architected for agility and increased efficiency. It turns compute, storage, and fabric into fluid pools of resources that are easily composed and re-composed to meet each application’s needs. With HTBase, companies can quickly prov...
There are 66 million network cameras capturing terabytes of data. How did factories in Japan improve physical security at the facilities and improve employee productivity? Edge Computing reduces possible kilobytes of data collected per second to only a few kilobytes of data transmitted to the public cloud every day. Data is aggregated and analyzed close to sensors so only intelligent results need to be transmitted to the cloud. Non-essential data is recycled to optimize storage.
"I think that everyone recognizes that for IoT to really realize its full potential and value that it is about creating ecosystems and marketplaces and that no single vendor is able to support what is required," explained Esmeralda Swartz, VP, Marketing Enterprise and Cloud at Ericsson, in this SYS-CON.tv interview at @ThingsExpo, held June 7-9, 2016, at the Javits Center in New York City, NY.