Welcome!

.NET Authors: Bruce Armstrong, Pat Romanski, Liz McMillan, Yeshim Deniz, Dmitry Sotnikov

Related Topics: .NET

.NET: Article

SqlClient Connection Pooling Exposed - Reflection allows .NET developers to peer into the internals of SqlClient

SqlClient Connection Pooling Exposed - Reflection allows .NET developers to peer into the internals of SqlClient

ADO.NET's SqlClient data provider for SQL Server supports connection pooling, which is enabled by default. It works well: when a connection is closed it is returned to the pool and reused later when another connection using the same connection string and context is created. The result is improved performance. I could delve into the vagaries of how we can test this and what connection string parameters exist to control the connection pooling process, but that isn't what this article is about. This article answers two frequently asked questions about SqlClient connection pooling: How can I set the connection pool timeout, and how can I flush the connection pool?

The short answer to both of these questions is that you can't. Of course, that would make for a very short article and isn't very interesting. The much longer answer to this question is that you can, but we're going to have to go on a journey of discovery to achieve our goal. Like most developers, the one thing that guarantees I'll find an answer to a problem is being told that it can't be done.

In this particular case, the .NET developer's best friend, reflection, is the answer to the problem. Reflection knows all; nothing hides from reflection; and it is through reflection (with a little help from Jay Freeman's Anakrino and Lutz Roeder's Reflector) that we can find out how connection pooling works and how we can set the timeout and flush the pool.

Before we jump in, let's take a moment to consider the Connection Lifetime parameter, which can be set in the connection string. I am sure that anyone who has looked at the documentation of SqlConnection and noticed the Connection Lifetime parameter has thought, "Surely, you can use this to set the connection timeout?"

It's a red herring. Unlike most features of the .NET Framework, this one is badly named. It doesn't set the connection timeout at all. Instead, it determines whether a connection gets pooled or not. It specifies the number of seconds that the connection will be open. When the connection is finally closed, the number of seconds that it was open is compared to this setting, and if it is greater than the Connection Lifetime, then the connection is not pooled. If it is less, then the connection is pooled. The benefit is that if you use SQL Server clustering you can use load balancing more effectively. For our purposes, however, it is of no benefit.

How SqlClient Connection Pooling Works
To effect our solution, we need to know how SqlClient's connection pooling actually works. Because Microsoft has documented only the public interface for SqlClient, there is no documentation on how the internals work. Fortunately, that doesn't stop us as, armed with Anakrino and/or Reflector, we can poke around in System.Data.SqlClient until we find our answers. Figure 1 shows the class hierarchy of the relevant internal SqlClient classes.

 

SqlConnection represents the public face of the connection. However, it is simply a wrapper of convenience. It doesn't actually represent the connection itself. Instead, the actual connection is an SqlInternalConnection. When SqlConnection.Open() is called, a new SqlInternalConnection object is created and assigned to a private field, called _internalConnection of type SqlInternalConnection. When SqlConnection.Close() is called, _internalConnection is assigned null.

SqlInternalConnection has a public property called Pool, which is of type ConnectionPool. Pool is assigned a ConnectionPool object that has a public property called Control, which is of type DefaultPool Control. Control, which is assigned an SqlConnectionPoolControl, is a read-only property that simply reads a private field called _ctrl.

All of DefaultPoolControl's constructors (which are in turn called by SqlConnectionPoolControl's constructor) execute the following line:

DefaultPoolControl._random = new Random(5101977);

_random, a random number generator, is a private static field of type Random. There is the question of why the seed value is 5101977. This value looks suspiciously like a person's birthday - May 10, 1977. Do you know of anyone on the development team with this birthday?

In addition to this, all of DefaultPoolControl's constructors call a private method, called SetupMembers(), in which one of the lines is:

_cleanupWait = (DefaultPoolControl._random.Next(12) + 12) * 10000;

This rather important line initializes a "wait" time and stores it in a private field called _cleanupWait (for which there is a public property called CleanupWait). We will return to this line in a moment. Let's go back to the ConnectionPool that I mentioned. The ConnectionPool constructor executes the following line:

_cleanupTimer = CreateCleanupTimer();

CreateCleanupTimer is a private method that executes the following code:

_cleanupWait = _ctrl.CleanupTimeout;

return new Timer(new TimerCallback(CleanupCallback), null, _cleanupWait, _cleanupWait);

That's all the digging we need to do. Now let's see what it all means. We can see from CreateCleanupTimer that a new Timer object is created. The Timer object calls a method called CleanupCallback. It does this after an initial wait of _cleanupWait milliseconds and again thereafter for each successive interval of _cleanupWait milliseconds. CleanupCallback is the method that ultimately releases the pooled connections (by closing connections until there are "Min Pool Size" connections left in the pool).

All that remains to be covered is to understand why it takes between 4 minutes and 7 minutes, 40 seconds for a connection to time out. Remember the line that initializes _cleanupWait? This line is dependent upon its call to _random.Next(12). This call will return a number between 0 and 11. If it returns 0, then the line is equivalent to:

_cleanupWait = (0 + 12) * 10000;

If it returns 11, then the line is equivalent to:

_cleanupWait = (11 + 12) * 10000;

So _cleanupWait is a value between 120,000 milliseconds and 230,000 milliseconds. The Timer waits for an initial time of _cleanupWait milliseconds. Hence, the range is anywhere from 120,000 to 230,000. 120,000 milliseconds equals 2 minutes, and 230,000 milliseconds equals 3 minutes and 50 seconds.

This would be the end of the story if this is what actually happens - but it isn't.

The CleanupCallback method called by the Timer is rather curious. The ConnectionPool class has not one but two stacks of internal "pooled" connections: it has an "old" stack and a "new" stack. When CleanupCallback is invoked it closes all of the internal connections in the old stack. Then it moves all of the internal connections from the new stack to the old stack. The first time CleanupCallback is invoked, the old stack will be empty, so no connections are closed. All of the internal "pooled" connections are then moved from the new stack to the old stack. The next time CleanupCallback is called, the connections will be in the old stack and will be closed. The consequence of this is that it takes two calls to CleanupCallback to close connections.

As a consequence of this our calculations have to be doubled. Hence the range is anywhere from 240,000 to 460,000 milliseconds (4 minutes to 7 minutes, 40 seconds).

Of course, there is the little matter of the random number generator. If you've looked at random number generators before, you will probably be aware that they're not very random. Neither is the one in .NET. Try this exercise. Create a new Windows Forms application and add a private field:

private Random _random = new Random(5101977);

Now add a button with the following code:

int cleanupWait = (_random.Next(12) + 12) * 10000;
MessageBox.Show(cleanupWait.ToString());

Click on the button a few times and you will get 190000, 220000, 170000, 170000, and 120000. If I can predict what the next value is on your computer, then it's not very random. If we look at the first value, 190000, we can determine that the first connection pool always has a timeout value of 6 minutes, 20 seconds, i.e., (190000 * 2) / 1000.

Getting the Connection Timeout
Now that we know how it works, we can start to get some benefit. Let's start with a quick trick to get the ball rolling. Say we want to know what the connection timeout value is for a given connection. If we have an SqlConnection called connection we might write this:

connection._internalConnection.Pool.Control.CleanupTimeout

As I've said, the first call to CleanupCallback doesn't close the connection, so we would need to double CleanupTimeout to get the true value. Unfortunately, we can't write the above line because the SqlInternalConnection, ConnectionPool, DefaultPoolControl, and SqlConnectionPoolControl classes are all internal, and the SqlConnection._internalCon nection field is private.

Enter reflection. Reflection is the Sauron of the .NET world - it knows all and sees all - and none can hide from it. We could start digging straight into each field and property, but all of our solutions stem from getting at the ConnectionPool object, so we will start by writing a function to get the ConnectionPool. Listing 1 shows the code for GetConnectionPool.

It starts by using reflection to get the private _internalConnection FieldInfo. Although we can use this to get the value of the SqlInternalConnection object, we cannot cast it to an SqlInternalConnection. This is because SqlInternalConnection is an internal class, so we cannot refer to it directly in our code.

Instead, we have to assign it to a variable of type Object. This is unfortunate because it makes the rest of our code less straightforward than it could have been. Having retrieved the _internalConnection we can then get the value of its public Pool property. We have to use reflection again for this because we cannot refer to the SqlInternalConnection class.

Listing 2 shows GetConnection CleanupTimeout, which gets the cleanup timeout value (and which we already know returns 190000 the first time it is called).

Setting the Connection Timeout
Now that we know how to get the connection timeout, you might have thought that setting it would be a case of calling PropertyInfo.SetValue() instead of PropertyInfo.GetValue(). Although this will set the _cleanupWait property, it won't actually change the timeout. Remember that the timeout is invoked by a Timer object. We will have to change the Timer's timeout value to have any effect. What we want to write is this:

connection._internalConnection.Pool._cleanupTimer.Change(cleanupTimeout, cleanupTimeout);

Listing 3 shows the completed method. Remember that when you call SetConnectionCleanupTimeout, the actual timeout value is twice as big as the cleanupTimeout value.

Flushing the Connection Pool
Our remaining goal is to be able to flush the connection pool. I have found this particularly useful in remoting servers. The problem is that I want to perform some activity on the database that requires exclusive access and I have ensured that all users are logged out and cannot log back in. However, I want to keep the remoting server up and running because it is on a separate machine. Unfortunately, it still has active connections to the database because they are still in the connection pool. I could simply wait for them to time out but I'm too impatient to wait up to 7 minutes, 40 seconds - so I invoke a method on the remoting server to flush the connection pool.

All we need to do to flush the connection pool is to call the CleanupCallback method of the ConnectionPool. Of course, to do this we need to get the connection pool. We can get it from an open connection, but that doesn't really help us because an open connection will not be closed when the pool is flushed. Conversely, we cannot get the connection pool from a closed connection (either before it has ever been opened or after it has been closed) because the SqlConnection's _internalConnection is null when the connection is closed. Instead we have to get a reference to the pool itself after a connection has been opened. Our application code to get the connection pool would look something like Listing 4, in which pool is a private field of type Object. To flush the connection pool we would write:

ReleaseObjectPool(pool);

Now that we know how it is supposed to work, we can implement it. What we want to write is:

connection._internalConnection.Pool.CleanupCallback(null);
connection._internalConnection.Pool.CleanupCallback(null);

We have to call CleanupCallback twice because the first call only moves connections from the new stack to the old stack. Listing 5 shows the completed ReleaseObjectPool method.

While this approach works well and is my preferred solution, it isn't version-safe because it goes hacking into the internals of SqlClient. For this reason it is worth pointing out an alternative strategy to flushing the connection pool. The connection pool belongs to the application domain. In most Windows Forms applications there is a single application domain, which is created by default. When your application closes down, the application domain is destroyed and, in turn, the connection pool is destroyed. You could, however, create your own application domain within your application and ensure that all connections (and almost the entire application) exist within your own application domain. To flush the connection pool you would need only to unload the application domain. Although this is technically sound and uses publicly documented features that will not change from version to version, it is a very poor solution. Effectively, it turns your application into a remoting application within a client/server application, and the unloading of the application domain destroys not only your connection pool but all of your global state data. In my opinion this solution is far worse than the problem itself.

OracleClient Connection Pooling
And what of Microsoft's OracleClient provider? With the exception of a few minor details it is the same mechanism and uses the same basic components, yielding the same results. Obviously, there are some class name differences such as prefixing classes with "Oracle" instead of "Sql", and calling connection pools "object pools," resulting in internal class names like "DBObjectPool" and "DBObjectPool Control" instead of "Connection Pool" and "DefaultPoolControl". However, the essential mechanisms that create the Timer, initialize a Random object, and call the CleanupCallback method are exactly the same and use the same values.

Conclusion
Perhaps the most important nugget to grasp from this exercise is something that we already knew - that there is no such thing as "can't" when you have reflection. That said, I hope that Microsoft adds these features to a future version of SqlClient so that such messing around with the internals becomes unnecessary.

More Stories By Guy Smith-Ferrier

Guy Smith-Ferrier is the Chief Courseware Architect for Courseware Online, specialists in developer training course materials for training companies and individuals alike. He is the author of C#/.NET courseware and much of the official Borland courseware including courses on COM and ADO. After years of trying he has still to master the piano better than a deaf one-armed monkey. He lives in hope. He can be contacted at gsmithferrier@coursewareonline.com.

Comments (2) View Comments

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


Most Recent Comments
Guy Smith-Ferrier 12/14/03 03:05:19 PM EST

Angel,

Thanks! That's very kind. It is a subject which I found very interesting and as you can see I'm not keen on being told that I can't do something.

> I would like your permission to point customers with pooling questions to it for reference.

Absolutely. Please go ahead.

> [Unloading the AppDomain] I would be interested in hearing more about why you feel that this is not a good solution

In terms of achieving the result (of being able to flush the pool) it gets 10 out of 10. No problem there. For me, the problem lies in the upheaval to the application code itself. If you have created a set of business objects and have encapsulated access to the database through the business objects then the upheaval will be partially limited to modifying those business objects. Obviously the line instantiating the new objects will change from, say,

CustomersBusinessObject customers = new CustomersBusinessObject();

to:-

CustomersBusinessObject customers = (CustomersBusinessObject) appDomain.CreateInstanceFromAndUnwrap("MainApplication.dll", "MainApplication.CustomersBusinessObject");

but if you have an object factory of some kind then you have already encapsulated the code for object instantiation so this might not be an issue.

The big issue with this approach is that if developers haven't written their applications to expect to communicate with objects across domain boundaries then they will be in for a bit of a learning curve. So if a developer has written a straight client/server Windows Forms application and attempts to use the approach of unloading the application domain then they will find differences in behaviour that they weren't expecting. For example, if the business object has a method called Update to which you pass a DataSet then in a regular one AppDomain application there is only one copy of the DataSet and when the DataAdapter's Update method is called the DataRows in the DataSet will have their RowState updated to reflect the change to the database. If the business object came from a different AppDomain then there would be two copies of the DataSet. The copy of the DataSet in the business object's AppDomain would be updated correctly but these changes wouldn't be seen in the host application's copy of the DataSet. Thus after calling the business object's Update method the DataSet in the host application would still think that there are changes to be made to the database. Furthermore, it wouldn't have picked up any changes made to primary keys, foreign keys and other updates.

If the developer wrote the application anticipating that the business objects would belong to a different AppDomain (as would be the case in a remoting application) then there would be no additional barriers to this solution.

Does this make my reasoning clearer or muddier ?

Guy

Angel Saenz-Badillos[MS] 12/11/03 12:33:48 PM EST

Guy,
This has to be the most interesting managed pooling articles I have seen. I would like your permission to point customers with pooling questions to it for reference.

I like the fact that you not only show the solution but warn people that modifying internal state is not guaranteed to work across releases.

I have been working with Appdomains unloads, these are very very important for applications that host the URT like the new version of Sql Server, and of course IIS has been using this ever since we shipped v1.0.

It is definitelly a drastic solution but it works well as a stopgap measure and it would be a version friendly solution for this problem. I would be interested in hearing more about why you feel that this is not a good solution and would like to hear from people using this feature in their applications.

Thanks,
Angel Saenz-Badillos [MS] Managed Providers. This post is "AS IS" and confers no rights.