| By Timothy Stall | Article Rating: |
|
| December 21, 2006 10:00 AM EST | Reads: |
7,991 |
After years of nearly every tradeshow, magazine, and technologist saying how wonderful unit tests are, developers know that they're supposed to write unit tests. But it's not always that easy. For example, while most developers can write tests for a trivial class library, it's much harder to write tests for the database.
There is much value in testing the database - it can contain complicated stored procedures, transactions, and functions that could easily cause errors. Preventing those errors can be worth the time it takes to write the test.
This article will explain some of the theories and difficulties behind testing the data layer, and then show how a new open source tool, the MassDataHandler, helps solve those problems, resulting in value-added unit tests. You can download it for free from Microsoft's new "CodePlex" Web site: www.codeplex.com/Wiki/View.aspx?ProjectName=MassDataHandler
Problems
The standard pattern for a deterministic test is to:
- Set the initial state.
- Run the system under test (SUT).
- Check the final state.
Setting the initial state requires first creating the schema, and then populating the database with test data. Each test - in order to run independent of the others - must clear the appropriate tables (to avoid interference) and then create its own data, mostly likely with a group of insert statements. However, generating these insert statements is messy. Just look at a simple example to insert a few columns in just three rows:
insert into Customer (CustomerName, Notes, LastUpdate)
values ('Bob',null, cast('03/04/2006' as varchar(30)) )
insert into Customer (CustomerName, Notes, LastUpdate)
values ('Sarah','smart employee', cast('06/03/2005' as varchar(30)) )
insert into Customer (CustomerName, Notes, LastUpdate)
values ('Simspon','show up for work on time', cast('09/17/1997' as varchar(30))
THE PROBLEMS INCLUDE:
- A huge amount of error-prone, textual overhead, like recopying the insert line for each row and calling extra functions just to create the right data type.
- Brittle insert statements that add a non-null column to the table and break everything in the process.
- Irrelevant non-null fields. (Say a table has 20 required fields that you don't need; you still need to insert data into all of them to create a single row.)
- Hard coded identity values that change with each test run.
- Visual separations in the value and the column it inserts into, making it harder to ensure the right value goes to the right column.
How the MassDataHandler Does It
The MassDataHandler framework takes a user-friendly XML snippet that contains the relevant data to insert. Because these are just XML snippets, you can store them either as literal strings in code or as physical files. Using knowledge of the database schema to do all the grunt work, it translates that snippet into SQL inserts and populates the database. Figure 1 shows a sample XML fragment that inserts data into several tables. There are three types of root elements: a variables section, an import element that dynamically includes other scripts, and a table element to insert data into the named table. The table elements have child row elements, where each node inserts a row into the database. The database column is the XML attribute; the XML value is what gets inserted into the database.
While the online tutorials go into more detail, the MassDataHandler scripts have several benefits:
- Variables: You can use variables in attribute values to refactor out common text strings. A variable is referenced with an MSBuild-like syntax of $(…).
- Identity Inserts: The MassDataHandler saves the identity value of every row it inserts, and lets you reference them later with the '@' operator. So say you've inserted into the customer and product tables, and then need both their primary key (identity) values to insert into the child table order. When you specify CustomerId="Customer.@N", where N is a row (1, 2, 3, etc....), the framework is smart enough to check the customer table, lookup the nth row's value, and then automatically insert that.
- Non-Null Columns: You only need to specify the columns that you care about. The MassDataHandler uses knowledge of the database schema to determine any required columns that weren't explicitly mentioned, and then inserts minimal default data into those columns so that you can still insert a row without worrying about them. For example, if the LastUpdate column is required, yet you omit it from your XML script because it's irrelevant to your test, the framework will still populate it with a default date. Likewise, if someone adds a new required column, it won't break your tests because the framework will just automatically populate it.
- Multiple Tables: You can insert data into multiple tables with the <Table> element (by default it clears all the tables listed, helping your tests to have a fresh baseline each time).
- Default Rows: You can specify a default row using the <Default> element that sets the values for certain columns. You can then override the default with a <Row> element that specifies specific values for specific columns. The next section explains this more.
The framework uses several steps to determine what column values to insert into the database table. For each <Row> element, the framework first starts with the minimal inserts for that table (i.e. the non-null, required columns). This ensures that all required columns always get populated. It then overrides that with any <Default> element, overrides that again with the specific <Row> element, and lastly evaluates the variables and identity lookups. Figure 2 illustrates this with a 5-column table. For simplicity, these columns are all varchar (string). For example, Column 2 is required, and therefore has a default value of "A." However, we override that with the <Default> element for a new value of "B." Then the given <Row> has the final say. For Column 2 it overrides everything and explicitly sets the value to "C," whereas for Column 3 it doesn't specify anything and therefore keeps the original value of "B" from the <Default>.
Round-Tripping
One of the ways that the MassDataHandler helps simplify database unit tests is by allowing you to round-trip data between SQL, XML snippets, and ADO.NET DataTables (see Figure 3). The intent is that the MassDataHandler primarily works with XML Snippets, supplying methods to convert the snippet to live SQL data. However, it also supplies methods to create an XML Snippet from a SQL select (thus round-tripping between XML and SQL Data). Furthermore, the framework lets you create a snippet from a DataTable, as well as create a DataTable from a snippet. Therefore, using the snippet as an intermediate step, you can quickly convert data between any of the three formats. This is especially useful when you need to create snippets from existing data or check a large result set to verify that your test worked correctly.
How to Install
Installing the MassDataHandler is quick. You can download it for free from Microsoft's open source Web site, CodePlex. The root directory has a batch file appropriately named "ClickHereToStart.bat". This file triggers a WinForm to collect environmental inputs (database connection info and your MSTest and MSBuild directories) (see Figure 4). With this info, the batch then kicks off an MSBuild script to compile the source code, install a test database, run the tutorial (which is really just an MSTest suite showing how to use the framework), and lastly opening up the solution. The MSBuild script takes less than 30 seconds to run; if it passes, then the framework successfully ran on your machine.
The Readme file lists the key environmental assumptions, namely that the MassDataHandler framework was created for .Net 2.0 and SQL Server 2005, that it requires MSTest, and that you've somehow created a test database that has the sufficient schema (excluding foreign keys).
The downloadable package includes an MSTest suite that shows the MassDataHandler in action. Because it can be easiest to learn from actually seeing working code, this is intended as a tutorial. At the time of this writing, it contained about a dozen tests, starting with the basic concepts, and gradually working up to the advanced ones. Users can probably find a test that is similar to what they need, and just adapt the code appropriately.
Using GUI
The MassDataHandler is only as powerful as the XML snippets that you can write. Therefore it includes a visual editing tool, both to test the scripts (see Figure 5) and to create snippets from existing data (see Figure 6), which is extremely useful as your data gets more complicated. The GUI is straightforward to begin with, and it gets compiled (from the MSBuild script running) to the "MassDataHandler.Gui\bin\Debug" directory.
Frequently Asked Questions
Q: Is it cheating to have the database created as a separate step?
A: No. Although ideally in unit tests, each test runs perfectly in isolation, without any prerequisites, the very nature of database tests is not ideal. It could take several minutes to completely recreate a database schema. It's worth having your local update script (or test run script) first upgrade a test database with the latest schema, if it means your unit tests perform more quickly. However, it's also possible to require each test to individually check for the latest database schema and only trigger an upgrade if necessary - it just may not be worth your time to implement.
Q: Why not just use mocking?
A: By its very definition, mocking "mocks out the database," preventing it from being tested, which is the exact opposite of what we want to do. Mocking is great for testing the business layer by decoupling it from the database. But our goal is to test the database.
Q: But won't your tests be brittle?
A: A little, yes. But, we're trying to solve a complicated problem, and the extra brittleness goes with the territory. However, by using the MassDataHandler's flexible XML scripts, we bring the brittleness of our database tests down to a manageable level.
Q: You can't really call this a unit test.
A: It's semantics. Technically a unit test runs only a single unit of code, and the very nature of database tests requires integration (interconnected tables, stored procs, views, functions). However, by making all the other pieces so robust, you can effectively run these database tests as if they were unit tests. They're run from within MSTest (alongside all your other unit tests), they're mostly atomic, they perform quickly, and they safeguard your advanced SQL logic. You can call them any testing buzzword you choose, but the point is they add value to your project.
Conclusion
Testing the database is difficult, yet beneficial. The MassDataHandler is a free tool that makes it easier to create and maintain database tests. With the importance of the data layer in today's enterprise applications, it's a good investment worth checking out.
Published December 21, 2006 Reads 7,991
Copyright © 2006 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Timothy Stall
Tim Stall is a software developer at Paylocity, an independent provider of payroll and human resource solutions. He can be contacted at tims@paylocity.com.
- Kindle 2 vs Nook
- Wave on Ulitzer: Confessions of a Google Wave Fanboy
- Confessions of a Ulitzer Addict
- Cloud Computing Best Practices
- IBM Hardware Chief, Intel VC Exec Arrested in Insider Trading Scam
- Tactical Cloud Computing Panel at 1st Annual GovIT Expo
- Ulitzer.com Named Exclusive "New Media" Sponsor of Cloud Computing Conference & Expo
- Infrastructure-as-a-Service Will Mature in 2010: Microsoft's David Chou
- Windows 7 – Microsoft’s First Step to the Cloud
- Cloud Computing & Federal IT - What Does the Future Hold?
- Jill Tummler Singer, Deputy CIO of CIA, Keynotes at GovIT Expo
- Cloud Expo and the End of Tech Recession
- Kindle 2 vs Nook
- The Difference Between Web Hosting and Cloud Computing
- Ajax in RichFaces 3.3, JSF 2 and RichFaces 4
- Wave on Ulitzer: Confessions of a Google Wave Fanboy
- Confessions of a Ulitzer Addict
- Cloud Computing Best Practices
- IBM Hardware Chief, Intel VC Exec Arrested in Insider Trading Scam
- Tactical Cloud Computing Panel at 1st Annual GovIT Expo
- Ulitzer.com Named Exclusive "New Media" Sponsor of Cloud Computing Conference & Expo
- Eval JavaScript in a Global Context
- Infrastructure-as-a-Service Will Mature in 2010: Microsoft's David Chou
- Windows 7 – Microsoft’s First Step to the Cloud
- Google Maps and ASP.NET
- Crystal Reports XI & How It Has Changed
- Converting VB6 to VB.NET, Part I
- Creating Controls for.NET Compact Framework in Visual Studio 2005
- Where Are RIA Technologies Headed in 2008?
- How to Write High-Performance C# Code
- AJAX World RIA Conference & Expo Kicks Off in New York City
- Implementing Tab Navigation with ASP.NET 2.0
- i-Technology Photo Exclusive: Bill Gates & Steve Jobs In "Nerds"
- .NET Archives: Getting Reacquainted with the Father of C#
- i-Technology Viewpoint: "SOA Sucks"
- Programmatically Posting Data to ASP .NET Web Applications



































