Welcome!

.NET Authors: Liz McMillan, Yakov Werde, Matthew Pollicove , Kevin Benedict

Related Topics: .NET

.NET: Article

Indexed LINQ

Optimizing the performance of LINQ queries using in-memory indexes

For constant nodes - which could be an int, string, or some other data structure, we go ahead and simply grab the value, get its hash, and return that. For other nodes, we convert the expression to a LambdaExpression, compile it, invoke the resulting method, and get our hash from the result. Either way, so long as the result isn't null, we'll get an int lookup value we can use for our index lookup.

Do the Query
Once we have a property we know we have an index and a value to look up in the index, we can go ahead and process the query using the index:

hashRight = GetHashRight(rightSide);
Dictionary<int, List<T>> myIndex =
       sourceCollection.GetIndexByProperty(property);
if (myIndex.ContainsKey(hashRight.Value))
{
    IEnumerable<T> sourceEnum = myIndex[hashRight.Value].AsEnumerable<T>();
    IEnumerable<T> result = sourceEnum.Where<T>(expr.Compile());
    foreach (T item in result)
       yield return item;
}
noIndex = false; //we found an index

The process of doing an indexed lookup is a matter of calling Where again, this time, on a much smaller set - the items within a given page of the index - rather than the entire enumeration. The results of the where are then iterated through, using the yield return statement on each item as you would with any other iteration routine.
In case no index is found, you simply do the following:

if (noIndex) //no index? just do it the normal slow way then...
{
    IEnumerable<T> sourceEnum = sourceCollection.AsEnumerable<T>();
    IEnumerable<T> result = sourceEnum.Where<T>(expr.Compile());
    foreach (T resultItem in result)
       yield return resultItem;
}

This does nothing more than return results just like the default where implementation does - looking at each element of the collection and checking to see if it passed the test in the expression passed to the Where method.

Results
As anyone who has ever done work with databases knows, indexes can provide dramatic performance improvements for large sets of data. Over a large set of a million books, our lookup time would have been around 1,200ms for the unfortunate book at the end of the collection. Using an index, however, brings that time down to less than 10ms - a 100x performance improvement.

But it's not without a cost. The code to maintain an index in cases where the values of the objects change more frequently is somewhat more complex, requiring you to implement a change notification in the class you'll be indexing. And just like database indexes, adding indexes slows down your add, remove, and update operations - making the developer who has domain knowledge of how the objects will be used the only person who can possibly know the best indexing strategy.

The source code that ties all of this together is available at www.codeplex.com/i4o, an open source project designed to make indexed collections available to all developers on the .NET platform.

More Stories By Aaron Erickson

Aaron Erickson is a principal consultant for Magenic. He is a ruthless advocate for concentrating on creating the most business value in the least amount of time when his clients entrust him to deliver a technical solution. Aaron has been delivering solutions on the Microsoft platform for over 14 years, and currently leads open source development efforts related to LINQ, including Indexes of Objects (i4o) and LINQ to Expressions (MetaLinq).

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.