Optimizing Management Queries


The story of Microsoft's management APIs is a fractured tale at best. Baseline Win32 APIs have been wrapped, reorganized, and wrapped again to form a suite of oddly packaged technologies, to say the least. This leads to a situation in which there are no ways to perform some essential management tasks from code, and five ways to perform others.

In situations where more than one approach is possible in order to perform the same function, developers often have a hard time determining which is the optimal approach. In this article, I will examine the alternatives from a performance-centric point of view and will show you how to get the very best response times from your management queries and related operations.

WMI
Windows Management Instrumentation (WMI) is arguably the most important API that Microsoft provides for programmatic management of its tools and platform. It is, in fact, Microsoft's implementation of the Web-Based Enterprise Management (WBEM) standard. This standard attempts to organize common operating system components (such as files, processes, and pieces of hardware) into a common data structure that is intended to resemble a database engine.

Within this hypothetical database engine, the components corresponding to various systems are arranged into different namespaces. The namespace that you will most often be working with when managing Microsoft platforms and systems is the root\cimv2 namespace. This contains important tables such as:

  • win32_service: For managing Windows Services
  • win32_process: For managing Windows processes
  • win32_operatingsystem: For managing the Windows OS
  • win32_logicaldisk: For managing logical disk drives
  • win32_printer: For managing physical printers

    These tables can be queried using a SQL-like language known as the Windows Management Instrumentation Query Language (WQL). Unfortunately, unlike SQL, WQL lacks many features, such as joins and sorting by column.

    As with SQL, however, it is important for the performance of your queries that you limit your WHERE clauses to indexed columns. For example, consider the following common WQL query:

    select * from win32_useraccount

    Optimizing WQL Queries
    If the computer where this query is executed is a stand-alone workstation, this query will return a list of all local accounts. On the other hand, if the computer is a part of a domain, this query will return a list of all user accounts in that domain. In the case of large domains, this might cause the query to take a very long time to return.

    Suppose that we are only interested in pulling up a list of local user accounts. We might first try to restrict the query as follows:

    select * from win32_useraccount where localaccount = true

    However, this query will have surprising results in a large domain. Instead of streaming back results over a long period, it will take a long time to respond at all, and will then return only the list of local accounts. Of course, the list of local accounts is what we wanted, but the long response time is not.

    The reason for the long response time is that the localaccount column is not one of the columns that are indexed in the internal WMI database. Whenever you query WMI, you should be careful to restrict your WHERE clauses to only those columns that are indexed. Otherwise, WMI will still have to retrieve all of the rows in the affected tables in order to filter them. This is known as doing a "table scan," and is guaranteed to adversely affect your performance.
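As a minimal sketch of restricting the WHERE clause to a key column, the code below filters on the account's domain instead of on localaccount. It rests on assumptions that the text above does not state: that the account class is win32_useraccount, that Domain is one of its key properties, and that local accounts report the machine name as their domain. The class and method names (LocalAccountQuery, Build) are hypothetical.

```csharp
using System;
using System.Management;

public static class LocalAccountQuery
{
    // Builds a WQL query that filters on Domain rather than LocalAccount.
    // Assumption: Domain is a key property of win32_useraccount.
    public static string Build(string machineName)
    {
        return "select * from win32_useraccount where domain = '"
            + machineName + "'";
    }

    public static void Main()
    {
        // Local accounts are assumed to use the machine name as their domain.
        string wql = Build(Environment.MachineName);

        using (ManagementObjectSearcher searcher = new ManagementObjectSearcher(wql))
        using (ManagementObjectCollection accounts = searcher.Get())
        {
            foreach (ManagementObject account in accounts)
            {
                Console.WriteLine("Account Name = " + account["Name"]);
            }
        }
    }
}
```

Timing this query against the localaccount version on a domain-joined machine is the simplest way to check whether the assumption holds in your environment.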

    Another important thing to understand about WMI queries and performance is that every row in a WMI result set will typically constitute a live handle to a different COM object, held via .NET's COM interoperability layer. This means that if you execute a query that produces several thousands of rows in its result, you will effectively have instantiated and held thousands of COM objects.

    This knowledge should impact the way in which you write .NET code in two important ways. First, you should always make sure to release any ManagementObject references that your code may generate as soon as you are done with them. For example, if you intend to restart a given Windows Service, you should set to null the object handle pointing to the ManagementObject for that Windows Service as soon as you are done invoking the method that will cause it to restart.
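One way to sketch the "release it as soon as you are done" advice is shown below. Note the swap: instead of setting the reference to null, this version wraps the ManagementObject in a using block so that Dispose is called deterministically. The class and method names (ServiceRestarter, PathFor, Restart) are hypothetical, and the example assumes the service can be addressed by its Name key.

```csharp
using System;
using System.Management;

public static class ServiceRestarter
{
    // Builds the WMI object path for one Windows Service, keyed by Name.
    public static string PathFor(string serviceName)
    {
        return "Win32_Service.Name='" + serviceName + "'";
    }

    // Stops and starts the service, then releases the underlying object.
    public static void Restart(string serviceName)
    {
        using (ManagementObject service = new ManagementObject(PathFor(serviceName)))
        {
            // Both methods return a status code without waiting for the
            // service to change state; a real tool would poll the State
            // property between the two calls.
            object stopResult = service.InvokeMethod("StopService", null);
            object startResult = service.InvokeMethod("StartService", null);
            Console.WriteLine("Stop = " + stopResult + ", Start = " + startResult);
        }   // Dispose() runs here, even if an exception was thrown.
    }
}
```

A call such as ServiceRestarter.Restart("Spooler") would exercise it; the service name is only an illustration.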

    Thankfully, the classes in .NET's System.Management namespace do a lot of the legwork for you on this point. Typically, WQL queries issued to WMI using .NET will leverage the ManagementObjectSearcher class, as shown in the following code.

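    // Requires a reference to System.Management.dll and "using System.Management;"
    ManagementObjectSearcher searcher =
        new ManagementObjectSearcher("select * from win32_service");
    foreach (ManagementObject service in searcher.Get()) {
        // Index the loop variable itself; Name is the service's short name.
        Console.WriteLine("Service Name = " + service["Name"]);
    }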

    The ManagementObjectSearcher class has a Get method that returns an instance of the ManagementObjectCollection class. The ManagementObjectCollection class implements the IDisposable interface in such a way that iterating through it once with a foreach statement will automatically free all of the objects in it for garbage collection. This prevents the COM objects being referenced from being held for any longer than is absolutely necessary.

    For many WMI cases, this is all the care you must exercise to guarantee decent performance. However, in other cases, you may want to consider switching to an alternate administration technology altogether.

    ADSI
    ADSI stands for Active Directory Service Interfaces, and is the primary way in which .NET applications should interact with Microsoft's Active Directory technology. This is important to understand, because some portions of the Active Directory can also be accessed via WMI. However, doing so will come at a considerable price in terms of performance and reliability.

    For example, a .NET application can connect to the "root\directory\ldap" namespace and execute queries against most of the standard Active Directory objects, such as domains, users, printers, and groups. However, the same restriction mentioned previously would still apply: every row queried will require the instantiation of its own COM object. Clearly, in cases where an Active Directory contains thousands of entries, this is not the most efficient choice of technologies.

    Fortunately, there are two alternative technologies that you can use for talking to Active Directory. The first of these is .NET's own DirectoryServices namespace. The second is the Active Directory Services Data Services Objects (ADSDSO), which is a collection of COM objects that serves as an OLEDB data provider.

    So, which of these technologies should you use when? The short answer is that you should use ADSDSO whenever you will be querying large volumes of Active Directory objects. You should use the managed DirectoryServices namespace only for invoking methods and changing properties on single instances of Active Directory objects.
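For the single-object case, a minimal sketch using the managed DirectoryServices namespace might look like the following. The distinguished name, the description text, and the class and method names (SingleObjectUpdate, SetDescription) are hypothetical, and the code assumes the caller has rights to modify the object.

```csharp
using System;
using System.DirectoryServices;

public static class SingleObjectUpdate
{
    // Changes one property on one directory object and saves it.
    public static void SetDescription(string adsPath, string description)
    {
        using (DirectoryEntry entry = new DirectoryEntry(adsPath))
        {
            entry.Properties["description"].Value = description;
            entry.CommitChanges();   // writes the change back to the directory
        }
    }

    public static void Main()
    {
        // Hypothetical user; substitute a distinguished name from your domain.
        SetDescription(
            "LDAP://CN=Jane Doe,OU=Executives,DC=mydomain,DC=com",
            "Updated from managed code");
    }
}
```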

    This raises an interesting point, of course, which is that one would typically not be advised to access an OLEDB data source from .NET. Instead, the recommendation would usually be to use ADO.NET for all of your data access needs from .NET. Unfortunately, you will not be able to use ADO.NET to connect to the Active Directory, because there is not yet a managed provider for the Active Directory.

    You might think, then, that you could use the OLEDB provider for ADO.NET to wrap ADSDSO objects in a way that would allow you to access them from .NET. This would work for you, up to a point. Unfortunately, the first time you try to retrieve more than 1,000 records, you will find that your retrieval stops on the 1,000th row.

    The only way to overcome this limitation in ADSDSO is to explicitly set the Page Size property on your Recordsets. Amusingly enough, it doesn't matter what you set it to, at least insofar as any value will allow for retrieval of more than 1,000 rows at a time. Unfortunately, this property is not available under ADO.NET, owing to that technology's orientation toward disconnected Recordsets, instead of maintained database connections.

    Instead, you have to use .NET's COM interoperability layer in order to access the actual ADO objects from managed code. In order to do this, follow these steps:
    1.  From the "Project" menu in Visual Studio .NET, choose "Add Reference..."
    2.  In the "Add Reference" dialog that now presents itself, choose the tab labeled "COM," as shown in Figure 1.
    3.  Scroll down to the Microsoft ActiveX Data Objects 2.x Library of your choice, select it, and click OK to close the dialog.
    4.  Add the appropriate lines to the top of your source code to include the ADODB namespace. For example, under C# you would add the line:

    using ADODB;

    Listing 1 illustrates how you could actually go about programming against the ADO object directly from C#. In the case of this example, we would essentially be asking to retrieve every single object in or below the "Executives" organizational unit in the "mydomain.com" domain.

    In the third line of Listing 1 we specify that we want to use the ADSDSO data provider. Immediately below this, we open our connection to the Active Directory. Make sure to substitute your own username/password combination here if you use this code listing.

    Optimizing LDAP Queries
    The query shown in Listing 1 would perform extremely poorly. There are three main reasons for this. To begin with, we have completely neglected to provide an object filter clause. Also, we have not told the query processor which properties we would like to retrieve. Finally, we are using a "subtree" query style, which is usually a poor choice.

    Object filters restrict the kinds of objects that a given LDAP query will return. A good example of an object filter would be:

    (ObjectCategory=person)

    The effect of this filter on the query shown below would be to restrict the objects returned to only those that represent people. Computers, printers, and shares (for example) would not be returned, potentially removing a great deal of data that might not be of interest to us anyhow.

    If we were to add such an object filter to the query shown in Listing 1, it would look like:

    <LDAP://OU=Executives,DC=mydomain,DC=com>;
    (ObjectCategory=person);;subtree

    The next thing that we could add to our query in order to improve its performance would be a list of the specific properties in which we are interested. In the absence of such a list, the LDAP query engine will return every property for every object. Considering that most of these classes have at least a hundred properties each, with many properties containing several values each, this obviously constitutes an excessive amount of data transfer for most situations.

    In the following code, we have restricted the query to return only the "distinguished name" for every person in the chosen organizational unit.

    <LDAP://OU=Executives,DC=mydomain,DC=com>;
    (ObjectCategory=person);
    distinguishedName;subtree

    The last, and perhaps most important, thing that we can do to improve the performance of this kind of LDAP query is to choose an appropriate type for it. In the query we have been discussing, we have chosen to perform a "subtree" kind of retrieval. What this means is that the LDAP query engine will use a depth-first search through the entire Active Directory below the "Executives" organizational unit, and return all objects in the directory that match our criteria.

    This may or may not be what we really want. If it is what we want, then no further optimization is possible. However, if we know ahead of time that all of the objects in which we are interested just happen to be at the very first level immediately underneath the named organizational unit, then we can greatly improve performance by switching from a "subtree" to a "onelevel" search.

    <LDAP://OU=Executives,DC=mydomain,DC=com>;
    (ObjectCategory=person);
    distinguishedName;onelevel
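The three optimizations, together with the Page Size setting discussed earlier, can be sketched in one place as follows. This is not Listing 1; it is an illustration under stated assumptions: the ADODB interop reference has been added as described in the steps above, the provider is registered under the name "ADsDSOObject", the query uses the angle-bracket form of the ADSI LDAP dialect, and empty credentials mean the current Windows user. The class and method names (PagedDirectoryQuery, Build) and the page size of 500 are hypothetical.

```csharp
using System;
using ADODB;   // COM interop reference to the ActiveX Data Objects library

public static class PagedDirectoryQuery
{
    // Joins the four parts of the query: <base>;filter;attributes;scope
    public static string Build(string baseDn, string filter,
                               string attributes, string scope)
    {
        return "<LDAP://" + baseDn + ">;" + filter + ";" + attributes + ";" + scope;
    }

    public static void Main()
    {
        Connection connection = new Connection();
        connection.Provider = "ADsDSOObject";   // assumed provider name
        // Empty user name and password: assumed to use the current logon.
        connection.Open("Active Directory Provider", "", "", -1);

        Command command = new Command();
        command.ActiveConnection = connection;
        command.CommandText = Build(
            "OU=Executives,DC=mydomain,DC=com",   // search base
            "(ObjectCategory=person)",            // object filter
            "distinguishedName",                  // only the property we need
            "onelevel");                          // no descent into child units

        // Any page size lifts the 1,000-row cap; 500 is an arbitrary choice.
        command.Properties["Page Size"].Value = 500;

        object recordsAffected;
        object parameters = Type.Missing;
        Recordset results = command.Execute(
            out recordsAffected, ref parameters, (int)CommandTypeEnum.adCmdText);

        while (!results.EOF)
        {
            Console.WriteLine(results.Fields["distinguishedName"].Value);
            results.MoveNext();
        }

        results.Close();
        connection.Close();
    }
}
```

Keeping the query assembly in its own method makes it easy to swap "onelevel" back to "subtree" and compare response times for the two scopes.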

    Conclusion
    If you have the ability to compare the response times of the query styles shown above using reasonably large amounts of data, you will find that the performance differences are by no means minor. I hope you will find the information presented in this article helpful in developing your own management applications.
