| By Aaron Erickson | Article Rating: |
|
| August 1, 2007 06:30 PM EDT | Reads: |
13,550 |
Language Integrated Query (LINQ) is a very powerful new technology coming to us with Visual Studio 2008. There is a great deal of innovation going on in the LINQ space, including innovative projects like LINQ-to-Flickr and LINQ-to-Amazon (among many others), in addition to the great things Microsoft is providing in LINQ-to-SQL and LINQ-to-XML.
This article dives deeper into the more mundane details about how to optimally query data you already have in memory via LINQ-to-Objects. LINQ lets you do queries on any object (typically collections or arrays) that support IEnumerable<T>. For example, if you have an online bookstore, you might decide to cache your products in memory on the Web server to improve performance. In such a case, you might have a List<Books> bookCache that you query this way when someone goes to look up a book:
var searchResult = from b in bookCache
where b.ISBN="978-1590596326"
select b;
Now, imagine your cache of books for your Web site contains, say, a million books, and you need to make this scale in a serious way. In such a case, you might be interested in optimizing how LINQ works.
How LINQ Queries in Memory Objects
When you use LINQ syntax to do a query, what really happens is that the syntax is converted to a series of method calls, such as where, join, and others related to the query "operators" that LINQ supports. Of interest to us is, for purposes of a query, how Microsoft implements "Where" in LINQ to Objects. Given that Where has to work for any IEnumerable<T>, there's very little that the default implementation can do to optimize the query. As a result, the default search over IEnumerable<T> in LINQ is a sequential search. In other words, if you are searching for the book at the bottom of a million-book pile, your search may not be terribly fast.
Indexing to the Rescue
Nearly any searching algorithm is better than sequential search. Of course, to use any of those algorithms, we have to know a little bit more about what we are searching. If your items are in order, you might use a binary search, which would help quite a bit. However, for the best possible performance, you want your search to use an index - just like every relational database worth mentioning does when it searches through a large collection of rows.
Implementation of a Generic Index
As it turns out, we need three elements to be able to do this in a generic manner:
• An ability to easily specify which fields in a class we are going to have an indexed collection of that we are going to index.
• A mechanism that allows for building and maintaining these indexes automatically.
• Modification of the implementation of Where so that it takes advantage of the index, rather than doing a sequential search.
The [Indexable()] Attribute
If we're going to have an easy-to-use indexing mechanism, we need a simple way to answer the "what" question - as in what is it we are going to build indexes on. Building indexes on every property in a class is almost certainly going to be too much - causing performance lags when you build the collection, not to mention wasting memory if we build indexes on fields that aren't searched on. To be more specific about what we are going to index, we need to add some meta-information to our class. That's exactly what Attributes are for.
Thankfully, for a simple indexing implementation, we need nothing more than to mark certain properties as indexable. The implementation of the attribute is very simple:
public class IndexableAttribute : Attribute { }
Once in place, we can do the following with any class we intend to use in an indexed collection:
class Book
{
private string _isbn;
private string _authorFirstName;
private string _authorLastName;
private int _id;
[Indexable()]
public string ISBN { get { return _isbn; } set { _isbn = value; } }
[Indexable()]
public int Id { get { return _id; } set { _id = value; } }
public string AuthorFirstName {
get { return _authorFirstName; }
set { _authorFirstName = value; }
}
[Indexable()]
public string AuthorLastName {
get { return _authorLastName; }
set { _authorLastName = value; }
}
}
Implementation of IndexableCollection<T>
The next step is to have a collection that, while based on the collection functionality that .NET gives us, lets us intercept certain operations on it that are of interest to maintaining an index. Namely, we want, at the very least, to know when something is added or removed from the collection.
To start, we'll derive from Collection<T>, from System.Collections.ObjectModel, as follows:
public class IndexableCollection<T> : Collection<T>
{
}
Published August 1, 2007 Reads 13,550
Copyright © 2007 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Aaron Erickson
Aaron Erickson is a principal consultant for Magenic. He is a ruthless advocate for concentrating on creating the most business value in the least amount of time when his clients entrust him to deliver a technical solution. Aaron has been delivering solutions on the Microsoft platform for over 14 years, and currently leads open source development efforts related to LINQ, including Indexes of Objects (i4o) and LINQ to Expressions (MetaLinq).
- Kindle 2 vs Nook
- Wave on Ulitzer: Confessions of a Google Wave Fanboy
- Confessions of a Ulitzer Addict
- IBM Hardware Chief, Intel VC Exec Arrested in Insider Trading Scam
- Cloud Computing Best Practices
- Tactical Cloud Computing Panel at 1st Annual GovIT Expo
- Ulitzer.com Named Exclusive "New Media" Sponsor of Cloud Computing Conference & Expo
- Infrastructure-as-a-Service Will Mature in 2010: Microsoft's David Chou
- Windows 7 – Microsoft’s First Step to the Cloud
- Cloud Computing & Federal IT - What Does the Future Hold?
- Jill Tummler Singer, Deputy CIO of CIA, Keynotes at GovIT Expo
- Cloud Expo and the End of Tech Recession
- Kindle 2 vs Nook
- The Difference Between Web Hosting and Cloud Computing
- Ajax in RichFaces 3.3, JSF 2 and RichFaces 4
- Wave on Ulitzer: Confessions of a Google Wave Fanboy
- Confessions of a Ulitzer Addict
- IBM Hardware Chief, Intel VC Exec Arrested in Insider Trading Scam
- Cloud Computing Best Practices
- Tactical Cloud Computing Panel at 1st Annual GovIT Expo
- Ulitzer.com Named Exclusive "New Media" Sponsor of Cloud Computing Conference & Expo
- Eval JavaScript in a Global Context
- Infrastructure-as-a-Service Will Mature in 2010: Microsoft's David Chou
- Windows 7 – Microsoft’s First Step to the Cloud
- Google Maps and ASP.NET
- Crystal Reports XI & How It Has Changed
- Converting VB6 to VB.NET, Part I
- Creating Controls for.NET Compact Framework in Visual Studio 2005
- Where Are RIA Technologies Headed in 2008?
- How to Write High-Performance C# Code
- AJAX World RIA Conference & Expo Kicks Off in New York City
- Implementing Tab Navigation with ASP.NET 2.0
- i-Technology Photo Exclusive: Bill Gates & Steve Jobs In "Nerds"
- .NET Archives: Getting Reacquainted with the Father of C#
- i-Technology Viewpoint: "SOA Sucks"
- Programmatically Posting Data to ASP .NET Web Applications


































