Welcome!

.NET Authors: Liz McMillan, Mark O'Neill, Peter Silva, Yakov Werde, Matthew Pollicove

Related Topics: .NET

.NET: Article

Indexed LINQ

Optimizing the performance of LINQ queries using in-memory indexes

A Smarter Where Implementation
At this point, we have an indexed collection that could, in theory, be used by a query processor to enable faster searches based on lookups rather than searching sequentially (a k a associative search). Which brings us to our next problem - how do we write our own implementation of Where? The answer is a new C# 3.0 feature called Extension Methods. While explaining how extension methods work is a topic worthy of its own article, for purposes of this exercise, we can say that extension methods let us extend classes we don't own to do things we want them to do. Extending our own class to have its own custom where implementation, as it turns out, isn't terribly difficult. To do so, you create a static class, with a static Where method that contains a special signature:

   public static class IndexableCollectionExtention
   {
     //extend the where when we are working with indexable collections!
     public static IEnumerable<T> Where<T>
     (
       this IndexableCollection<T> sourceCollection,
       Expression<Func<T, bool>> expr
       )

Extension methods extend the class you specify in the first parameter, the one shown here with the modifier on the parameter. The second parameter is the actual parameter that a Where method is always passed - namely, the expression we're going to evaluate to determine whether we do an index lookup or simply pass the call through to the default implementation of where.

Analyzing the Expression Tree
Expressions are a representation of code as data - something very familiar to those from the LISP world. The Where method takes an Expression as its parameter, allowing the implementer of where to determine how a search is going to take place. In LINQ-to-SQL this is where the expression is converted to SQL code that is, in turn, passed to the database. However, in our case, rather than do a conversion, we're simply going to examine the tree to determine if we can use an index, and if so, figure out what to look up.

Two conditions must be true for us to be able to use an index.
• The where clause must be testing for equality
• The property on the left-hand side of the comparison must be a field we happen to have an index for.

Is Where Testing for Equality?
Our test to determine whether the programmer is doing an equality operation is not as complex as it might seem. A Where always gets a certain type of expression passed to it called a LambdaExpression. LambdaExpressions are simply delegates assigned to a data structure. In our case the LambdaExpression tends to be a specialized version that always returns a Boolean value by virtue of being part of a where clause.

Lambda expressions have two main properties - a set of Parameters that reflect the parameters that go into the lambda, and a Body that represents what the lambda actually does (see Figure 1). Of particular interest to us is the Body expression. Specifically, we want to make sure that:
a.  The expression is, in fact, a BinaryExpression
b.  And that the BinaryExpression represents an "equals" operation

The following code does both for us:

if (expr.Body.NodeType == ExpressionType.Equal)
{
   //Equality is a binary expression
     BinaryExpression binExp = (BinaryExpression)expr.Body;

Examine the Left Side
Once we know we're dealing with a binary "equals" expression in the body of the lambda, we can determine whether we're dealing with a field that is indexable. We pass the left side of our binary expression to the following method:

    private static bool HasIndexablePropertyOnLeft<T>(
Expression leftSide,
IndexableCollection<T> sourceCollection)
    {
       if (leftSide.NodeType == ExpressionType.MemberAccess)
          return sourceCollection.PropertyHasIndex(
((MemberExpression)leftSide).Member.Name
);
    else
       return false;
}

This routine performs two important tests. The first is to see if it's the kind of expression we can use on the left side - something called a MemberExpression. MemberExpressions are expressions that relate to any property lookup that might occur in code. We need the expression to be a member expression because that's the only type that will let us determine which index, if any, to use.

The second test, which only applies if it is, in fact, a MemberExpression, is to make sure that the property that the MemberExpression tries to access is, in fact, indexed by the collection we're working with. To perform this test, we use the PropertyHasIndex method we wrote earlier, passing the name of the property the user passed into the query. If this works, we know we have a field for which an indexed lookup is possible.

Evaluate the Right Side
Now that we know what index to use, we need to figure out the value that we'll be looking up in the index. So we need to figure out what value the right side of the BinaryExpression (a k a the right side of the equals) evaluates to. Unlike the left side, we're less picky about what might be on the right side of the expression. The right side could be a constant number, a string, or any other valid C# statement that evaluates to something. The following routine does the evaluation:

private static int? GetHashRight(Expression rightSide)
{
   //rightside is where we get our hash...
   switch (rightSide.NodeType)
   {
     //shortcut constants, dont eval
     case ExpressionType.Constant:
       ConstantExpression constExp
         = (ConstantExpression)rightSide;
       return (constExp.Value.GetHashCode());
     default:
       LambdaExpression evalRight = Ex.Lambda(rightSide, null);
       //Compile it, invoke it, and get the resulting hash
       return evalRight.Compile().
DynamicInvoke(null).
GetHashCode();
   }
}


More Stories By Aaron Erickson

Aaron Erickson is a principal consultant for Magenic. He is a ruthless advocate for concentrating on creating the most business value in the least amount of time when his clients entrust him to deliver a technical solution. Aaron has been delivering solutions on the Microsoft platform for over 14 years, and currently leads open source development efforts related to LINQ, including Indexes of Objects (i4o) and LINQ to Expressions (MetaLinq).

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.