Welcome!

Microsoft Cloud Authors: Yeshim Deniz, Janakiram MSV, John Katrick, David H Deans, Andreas Grabner

Related Topics: Microsoft Cloud

Microsoft Cloud: Article

SQL Server Index Fragmentation In-Depth.

Fragmentation is a common term that describes numerous effects that can occur because of data modifications

There is no way to avoid index fragmentation in any SQL Server environment. It does not depend on your SQL Server version or I/O subsystem you have, or your hardware. In this article, we will drill down into SQL Server index fragmentation issue. We will figure out why index fragmentation is a problem and how it affect on overall performance, discuss how to detect and avoid it.

Index Fragmentation
Fragmentation is a common term that describes numerous effects that can occur because of data modifications. Chances are, you already know that SQL Server stores data on 8KB data pages. Eight contiguous pages form extent. A data page both in clustered or non-clustered indexes contains pointers to the next and previous pages. The following picture demonstrates that there are no fragmentation.

Let's insert a new row into the index and see what happens. SQL Server inserts a new row on the data page in case there is enough free space on that page, otherwise the following happens:

  1. SQL Server allocates a new data page or even a new extent.
  2. A part of data from the existing (old) data page transfers to a newly allocated data page.
  3. In order to keep the logical sorting order in the index, pointers on both pages are updated.

As a consequence, we have two types of index fragmentation:

Logical fragmentation (also called external fragmentation or extent fragmentation) - the logical order of the pages does not correspond their physical order. As a result, SQL Server increases the number of physical (random) reads from the hard drive, making the read-ahead mechanism less efficient. This directly impacts to the query execution time, because random reading from the hard drive is far less efficient comparing to sequential reading.

Internal fragmentation - the data pages in the index contain free space. This lead to an increase in the number of logical reads during the query execution, because the index utilizes more data pages to store data.

Detecting fragmentation
Before you decide which defragmentation approach to use, it is required to analyze the index to find out the degree of fragmentation. You can use the sys.dm_db_index_physical_stats data management function to analyze fragmentation. The following columns in the resultset are most important:

avg_page_space_used_in_percent shows the average percentage of the data storage space used on the page. This value allows you to see the internal index fragmentation.

avg_fragmentation_in_percent provides you with information about external index fragmentation. For tables with clustered indexes, it indicates the percent of out-of-order pages when the next physical page allocated in the index is different from the page referenced by the next-page pointer of the current page. For heap tables, it indicates the percent of out-of-order extents, when extents are not residing continuously in data files.

fragment_count indicates how many continuous data fragments the index has. Every fragment constitutes the group of extents adjacent to each other. Adjacent data increases the chances that SQL Server will use sequential I/O and Read-Ahead while accessing the data.

Avoiding Index Fragmentation
To avoid index fragmentation, try to adhere the following rules:

  1. Choose a cluster key that complements the table's insert pattern
  2. Do not insert records with random key values
  3. Do not update records to make them longer
  4. Do not update index key columns
  5. Be aware of features that can cause page splits
  6. Implement index fill factors

Utilizing Index Fill Factor
Set SQL Server to leave free space on index leaf pages. The main idea is to allow records to expand, records to be inserted without filling out the page and having to cause page split.

Therefore, you need to figure out how much space you want to leave. Amount of space to use is 100% minus fill factor value (e.g., fill factor of 70 means 30% free space).

SQL Server only uses the fill factor when an index is created, rebuild, or reorganized. The index fill factor is not used during regular inserts, updates and deletes. In fact, that does not make any sense, because the whole point is to allow inserts and updated to happen and to add more records without filling up the page.

It is possible to set the instance fill factor using sp_configure, but not recommended. The reason is when you set the fill factor for the entire instance, there are probably some indexes that do not need fill factor. If you find that you've got fragmentation problems on non-leaf level of the index (rare), you can use the PAD_INDEX option. It takes fill factor that has been specified and puts it up into the non-leaf level.

Setting a Fill Factor
Probably the easiest way to set the fill factor is to use the FILLFACTOR option,  when you create or rebuild an index. You can also use the Object Explorer to set the fill factor. Note that you can not set fill factor when you reorganizing an index. Both REBUILD and REORGANIZE use the fill factor stored in index metadata, otherwise they use the instance-wide (default) fill factor. Unless the FILLFACTOR option is specified for a REBUILD.

An obvious question arises: "What fill factor do I use?" Well, actually, there is no any magic number. You just need to pick an initial fill factor and implement it. Put it into production and then monitor how quickly fragmentation occurs. Then choose to do one or both of the following: increase/decrease the fill factor or change the frequency of index maintenance.

Removing Index Fragmentation
Let's figure out what is the difference between ALTER INDEX ... REBUILD and ALTER INDEX ... REORGANIZE. The following table demonstrates the difference:

There is no correct answer in regards: "What to use REBUILD or REORGANIZE?" One of Microsoft Books Online provides the following guidance:

0 to 5-10% - do nothing

5-10% to 30% - do REORGANIZE

30% to 100% - do REBUILD

Eventually, you have several basic options to remove index fragmentation:

  1. Choose to rebuild index
  2. Choose to reorganize index
  3. Forced to reorganize by HA/DR features
  4. Do nothing
  5. Use CREATE INDEX...WITH (DROP_EXISTING=ON) - does the same as ALTER INDEX...REBUILD but you need to specify the entire CREATE INDEX statement.

Resources
Below is a list of resources you may find useful:

More Stories By Jordan Sanders

Jordan Sanders is a Software Marketing Manager at Devart Company. He helps DBAs, software developers (C#, .NET, Delphi) from all around the globe to increase their productivity by using new tools, practices and new approaches to database development and management. He has experience in MySQL, SQL Server, Oracle databases consulting and also in Delphi development. He is always trying to share his knowledge and ideas with the community of his interest.

@ThingsExpo Stories
DX World EXPO, LLC, a Lighthouse Point, Florida-based startup trade show producer and the creator of "DXWorldEXPO® - Digital Transformation Conference & Expo" has announced its executive management team. The team is headed by Levent Selamoglu, who has been named CEO. "Now is the time for a truly global DX event, to bring together the leading minds from the technology world in a conversation about Digital Transformation," he said in making the announcement.
"Space Monkey by Vivent Smart Home is a product that is a distributed cloud-based edge storage network. Vivent Smart Home, our parent company, is a smart home provider that places a lot of hard drives across homes in North America," explained JT Olds, Director of Engineering, and Brandon Crowfeather, Product Manager, at Vivint Smart Home, in this SYS-CON.tv interview at @ThingsExpo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.
SYS-CON Events announced today that Conference Guru has been named “Media Sponsor” of the 22nd International Cloud Expo, which will take place on June 5-7, 2018, at the Javits Center in New York, NY. A valuable conference experience generates new contacts, sales leads, potential strategic partners and potential investors; helps gather competitive intelligence and even provides inspiration for new products and services. Conference Guru works with conference organizers to pass great deals to gre...
The Internet of Things will challenge the status quo of how IT and development organizations operate. Or will it? Certainly the fog layer of IoT requires special insights about data ontology, security and transactional integrity. But the developmental challenges are the same: People, Process and Platform. In his session at @ThingsExpo, Craig Sproule, CEO of Metavine, demonstrated how to move beyond today's coding paradigm and shared the must-have mindsets for removing complexity from the develop...
In his Opening Keynote at 21st Cloud Expo, John Considine, General Manager of IBM Cloud Infrastructure, led attendees through the exciting evolution of the cloud. He looked at this major disruption from the perspective of technology, business models, and what this means for enterprises of all sizes. John Considine is General Manager of Cloud Infrastructure Services at IBM. In that role he is responsible for leading IBM’s public cloud infrastructure including strategy, development, and offering m...
"Evatronix provides design services to companies that need to integrate the IoT technology in their products but they don't necessarily have the expertise, knowledge and design team to do so," explained Adam Morawiec, VP of Business Development at Evatronix, in this SYS-CON.tv interview at @ThingsExpo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.
To get the most out of their data, successful companies are not focusing on queries and data lakes, they are actively integrating analytics into their operations with a data-first application development approach. Real-time adjustments to improve revenues, reduce costs, or mitigate risk rely on applications that minimize latency on a variety of data sources. In his session at @BigDataExpo, Jack Norris, Senior Vice President, Data and Applications at MapR Technologies, reviewed best practices to ...
Widespread fragmentation is stalling the growth of the IIoT and making it difficult for partners to work together. The number of software platforms, apps, hardware and connectivity standards is creating paralysis among businesses that are afraid of being locked into a solution. EdgeX Foundry is unifying the community around a common IoT edge framework and an ecosystem of interoperable components.
Large industrial manufacturing organizations are adopting the agile principles of cloud software companies. The industrial manufacturing development process has not scaled over time. Now that design CAD teams are geographically distributed, centralizing their work is key. With large multi-gigabyte projects, outdated tools have stifled industrial team agility, time-to-market milestones, and impacted P&L stakeholders.
"Akvelon is a software development company and we also provide consultancy services to folks who are looking to scale or accelerate their engineering roadmaps," explained Jeremiah Mothersell, Marketing Manager at Akvelon, in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.
"IBM is really all in on blockchain. We take a look at sort of the history of blockchain ledger technologies. It started out with bitcoin, Ethereum, and IBM evaluated these particular blockchain technologies and found they were anonymous and permissionless and that many companies were looking for permissioned blockchain," stated René Bostic, Technical VP of the IBM Cloud Unit in North America, in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Conventi...
In his session at 21st Cloud Expo, Carl J. Levine, Senior Technical Evangelist for NS1, will objectively discuss how DNS is used to solve Digital Transformation challenges in large SaaS applications, CDNs, AdTech platforms, and other demanding use cases. Carl J. Levine is the Senior Technical Evangelist for NS1. A veteran of the Internet Infrastructure space, he has over a decade of experience with startups, networking protocols and Internet infrastructure, combined with the unique ability to it...
22nd International Cloud Expo, taking place June 5-7, 2018, at the Javits Center in New York City, NY, and co-located with the 1st DXWorld Expo will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud ...
"Cloud Academy is an enterprise training platform for the cloud, specifically public clouds. We offer guided learning experiences on AWS, Azure, Google Cloud and all the surrounding methodologies and technologies that you need to know and your teams need to know in order to leverage the full benefits of the cloud," explained Alex Brower, VP of Marketing at Cloud Academy, in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clar...
Gemini is Yahoo’s native and search advertising platform. To ensure the quality of a complex distributed system that spans multiple products and components and across various desktop websites and mobile app and web experiences – both Yahoo owned and operated and third-party syndication (supply), with complex interaction with more than a billion users and numerous advertisers globally (demand) – it becomes imperative to automate a set of end-to-end tests 24x7 to detect bugs and regression. In th...
"MobiDev is a software development company and we do complex, custom software development for everybody from entrepreneurs to large enterprises," explained Alan Winters, U.S. Head of Business Development at MobiDev, in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.
Coca-Cola’s Google powered digital signage system lays the groundwork for a more valuable connection between Coke and its customers. Digital signs pair software with high-resolution displays so that a message can be changed instantly based on what the operator wants to communicate or sell. In their Day 3 Keynote at 21st Cloud Expo, Greg Chambers, Global Group Director, Digital Innovation, Coca-Cola, and Vidya Nagarajan, a Senior Product Manager at Google, discussed how from store operations and ...
"There's plenty of bandwidth out there but it's never in the right place. So what Cedexis does is uses data to work out the best pathways to get data from the origin to the person who wants to get it," explained Simon Jones, Evangelist and Head of Marketing at Cedexis, in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.
SYS-CON Events announced today that CrowdReviews.com has been named “Media Sponsor” of SYS-CON's 22nd International Cloud Expo, which will take place on June 5–7, 2018, at the Javits Center in New York City, NY. CrowdReviews.com is a transparent online platform for determining which products and services are the best based on the opinion of the crowd. The crowd consists of Internet users that have experienced products and services first-hand and have an interest in letting other potential buye...
SYS-CON Events announced today that Telecom Reseller has been named “Media Sponsor” of SYS-CON's 22nd International Cloud Expo, which will take place on June 5-7, 2018, at the Javits Center in New York, NY. Telecom Reseller reports on Unified Communications, UCaaS, BPaaS for enterprise and SMBs. They report extensively on both customer premises based solutions such as IP-PBX as well as cloud based and hosted platforms.