|
YOUR FEEDBACK
|
TOP MICROSOFT .NET LINKS Tools Effective Database Change Management
Versioning, developing concurrently, deploying and upgrading
Nov. 8, 2006 12:00 PM
Have you ever been on a project where software development worked beautifully but developing and maintaining the database always caused unexpected problems and bugs? Do your changes constantly get overwritten by other developers, or is only one person at a time allowed to make changes? Do you find, after two or three major releases, that it's impossible to create upgrade scripts for existing production databases? After experiencing these frustrations and more, I decided to address them.
Source Control for Databases Some common but suboptimal approaches for database version management, often applied because of a lack of better alternatives, include:
When we compare the above approaches to the way we version regular source code (e.g., for a .NET assembly), we notice a big difference. The source code for a .NET assembly is broken down into many small files, each containing only one class at a time (by using partial classes, even one class could be spread across multiple files). Therefore, when developers create new features or fix bugs, only a small number of files will actually change. Subsequently, source control will pick up only these few changed files and check them in. A revision log combined with a diff tool will allow anybody to pinpoint changes in code within seconds. We can accomplish the same for database development by increasing the granularity in source control. Rather than checking in one big chunk of data (e.g., a one-file build script), we need to break down the repository representation of the database into many small pieces. Clearly we will need tool support for this. Ideally, such a tool supports round-trip engineering, which means we can build a database right from a local checkout folder of the repository (working copy), make changes to the database using tools such as SQL Management Studio, and script out the database to the local checkout folder again in order to check in only a few modified pieces. Figure 1 shows the file-based representation of a database as it was created by the DbGhost tool from Innovartis. Each entity in a database (such as a table, stored procedure, or login, etc.) is scripted to an individual file. Furthermore, in this screenshot you can see the overlay icons from TortoiseSVN (the UI for source control tool Subversion). At a glance, you can detect which parts of a database were changed (the red exclamation mark) or which parts are new (the blue plus). Such a fine-grained, file-based approach provides a number of great benefits:
As a side note I'd like to mention that many tools provide their own proprietary storage type and format, which is not necessarily text-based like the one shown above. Rather, they use a proprietary format, such as snapshot files or a SQL database as source control repository. While this approach might be better, e.g., with regards to performance, I prefer a purely file-based approach because it tightly integrates with the same source control system used for the related software project.
Concurrent Development What would concurrent database development look like when using a fine-grained database representation? First of all, there are two major approaches, each with its own pros and cons:
The interesting part is to see what happens in the case of merge conflicts. First of all, the chances that two developers will change the same database object are already much lower when using a fine-grained representation. Second, a good source control tool will be able to merge most of the changes automatically (e.g., both developers modified a different column). But just as when working with regular source code, an automatic merge can (and will) fail periodically, producing a merge conflict that needs to be resolved manually. Figure 2 shows how the process to resolve this merge conflict works. For the most part, such an approach will take the guesswork out of merging conflicting changes in database development and enable concurrent development by multiple developers, effectively eliminating the typical bottleneck encountered in linear development.
UPGRADING PRODUCTION SYSTEMS
Some changes to the database schema are simply additions, such as new columns or tables. However, other changes are the result of complex refactorings, data normalizations or denormalizations. The key to any database upgrade strategy is to transform your customers' live production data in such a way that no data gets lost and that the transformed data adheres to all constraints and is coherent with what your application expects. In general, there are two strategies to perform such an upgrade:
Again, the key to success is tool support, this time using a database synchronization tool. Such a tool takes a source and a target database, compares all database objects between the two (schema, data, programmability, and security), and then upgrades the target database by creating, modifying and dropping objects in the target database. The result of this synchronization process is two exact database copies. During the synchronization, the DB sync tool will record a SQL script with all executed commands, which will allow replaying the upgrade process later. MICROSOFT .NET LATEST STORIES
SUBSCRIBE TO THE WORLD'S MOST POWERFUL NEWSLETTERS SUBSCRIBE TO OUR RSS FEEDS & GET YOUR SYS-CON NEWS LIVE!
|
SYS-CON FEATURED WHITEPAPERS MOST READ THIS WEEK BREAKING NEWS FROM THE WIRES
|
|||||||||||||||||||||||||||||||||||