Needle in an App Stack

Optimizing .NET Web application performance

Why? The first issue is certainly the non-deterministic nature of the Web. Not all transactions serviced by the same Web application on the same Web infrastructure under the same load will have the same response time. This could be due to the network segment that a particular transaction traverses through the Web "cloud" from the browser to the data center. The complexity can be compounded by:

  • Technologies like load balancing, virtualization, and SOA, which are intended to make the application infrastructure more flexible and cost-effective but add a heavy dose of complexity.
  • Reliance on third-party vendors like ISVs, Web hosting, caching services, and Software-as-a-Service (SaaS) providers that take key infrastructure or application components out of the reach and direct control of operations personnel.

The second issue is that Web application response time, measured from, say, the click of a mouse button to the complete rendering of the resultant page by the browser, is not just a matter of "up" or "down." In the case of phone service, if you have a dial tone, the service is "up," and if you don't have a dial tone, the service is "down." For Web applications serving today's impatient users, a response time of more than, say, seconds might as well be considered "down," since the end user is likely to move to a competitor's site.

The key takeaway is that even if all the individual infrastructure components are absolutely reliable - for example, there are servers that claim five-nines reliability - there are still serious unknowns related to the design of the application that affect the probability of a transaction completing within the limited patience threshold of today's Web users. Web application performance management can only be accomplished by taking a comprehensive system management approach - end-to-end from browser through database - instead of the traditional silo-oriented approach (e.g., Web, server, network, middleware) that was developed to serve the application management needs of minicomputer or client/server computing environments.

As a result, instead of measuring server load or network bandwidth utilization, we have to measure Web transaction response time specifically from the end user's perspective. Nor is it sufficient to simply monitor the average response time. A complete understanding of the range of performance requires an understanding of the distribution of the response times for individual transactions. The only viable approach is to measure response time directly from the perspective of the real user, and of all real users for that matter.
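To illustrate why the average alone is not sufficient, here is a minimal sketch that summarizes a set of response times by both mean and percentiles. The function name, the choice of percentiles, and the sample values are assumptions made for illustration; they do not come from any particular monitoring product.

```python
import statistics


def summarize_response_times(samples_ms):
    """Return the mean alongside selected percentiles of the samples."""
    ordered = sorted(samples_ms)
    # With n=100, quantiles() returns the 99 cut points between percentiles,
    # so index 49 is the 50th percentile and index 94 is the 95th.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(ordered),
        "p50": cuts[49],
        "p95": cuts[94],
        "max": ordered[-1],
    }


# Hypothetical measurements in milliseconds: nine fast transactions
# and one very slow one.
samples = [200, 210, 220, 230, 240, 250, 260, 270, 280, 9000]
summary = summarize_response_times(samples)
print(summary)
```

With this sample, the mean (1116 ms) is more than four times the median (245 ms). Neither number on its own reveals that one user waited nine seconds, which is why the distribution matters.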

As discussed above, response time is not the only factor that affects performance as experienced by end users. A high probability of error drives customers to competing e-commerce merchants, or diminishes the productivity of users of Web-enabled business applications. Whether a customer encounters a given error depends on the path the user took navigating through the application or infrastructure. Our shirt buyer might encounter a 404 Page Not Found error because the link to the image of the pink shirt that he or she is searching for is broken, while another buyer searching for the same shirt in blue doesn't encounter any problem. Another user might encounter a 500 Internal Server Error because the virtual server he or she is being directed to isn't available at that moment.

As can be concluded from these descriptions, the existence and magnitude of Web application performance issues are unique to each individual user. So the only way to reliably monitor Web application performance is directly from the real user's perspective at the browser. The implication is that the legacy network and server monitoring tools, while useful in resolving individual silo performance issues, are no longer sufficient to proactively discover Web application performance issues or pinpoint the cause of such issues. But once the cause is identified, the domain expert can use the tried-and-true silo tools to resolve the issue.

Houston, We Have a Problem. Now What?
Monitoring, though necessary, is not sufficient. Even when end-user monitoring detects every problem, it still faces the challenge of pinpointing the root cause. Since Web applications and Web Services are complex distributed systems, the source of a problem could lie in any of several tiers: the network, the Web server tier, the .NET application server tier, or the database server tier.

The goal of diagnostics is to take the symptoms detected by monitoring (like slow end-user response time), and identify their probable cause. Unlike monitoring, which simply measures performance and detects problems, diagnostics accelerate the troubleshooting process by highlighting what needs to be fixed.
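As a rough sketch of what "identifying the probable cause" can mean in practice, the following takes the time one slow transaction spent in each tier and reports the tier that dominates. The `TierTiming` class, the `probable_cause` function, and the timing values are hypothetical. How per-tier timings are actually collected is outside the scope of this sketch.

```python
from dataclasses import dataclass


@dataclass
class TierTiming:
    tier: str          # e.g. "network", "web server", "database"
    elapsed_ms: float  # time the transaction spent in this tier


def probable_cause(timings):
    """Return (tier, share) for the tier that consumed the most time."""
    total = sum(t.elapsed_ms for t in timings)
    if total <= 0:
        raise ValueError("transaction has no recorded time")
    slowest = max(timings, key=lambda t: t.elapsed_ms)
    return slowest.tier, slowest.elapsed_ms / total


# Hypothetical trace of a single slow transaction across the four tiers.
trace = [
    TierTiming("network", 300),
    TierTiming("web server", 150),
    TierTiming(".NET application server", 550),
    TierTiming("database", 4000),
]
tier, share = probable_cause(trace)
print(f"{tier}: {share:.0%} of response time")
```

The point of the sketch is the shape of the data: timings are attached to one transaction rather than to one server, so the slow tier can be named for that specific user's request.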

Historically, IT organizations have relied on two kinds of diagnostics: system level and application level.

  • System-level diagnostics attempt to identify the cause of application slowdowns and errors by searching for hardware or OS-level resource constraints and errors. If the application is slow, one might use Windows perfmon to check the CPU and memory utilization of the underlying servers.
  • Application-level diagnostics work in a similar fashion, but search instead for application-level issues such as excessive queue lengths or too many open connections.

What both of these approaches have in common is that they take a silo view and focus on aggregate statistics. The system-level and application-level metrics they examine are limited to individual servers and machines, and disconnected from individual users or transactions.

Ideally, diagnostics should take an integrated browser-to-database end-to-end view that can trace an individual problem transaction across every tier and easily illuminate the exact source - system-level or application-level - even if that source is buried deep within the .NET infrastructure. A logical approach to diagnosing Web application performance issues is shown in Figure 1.

  1. The first step is to detect the problem from the end user's perspective. Once a certain response time (e.g., Response Time > four seconds) or error rate (e.g., %Error > 20%) threshold has been exceeded, an alert should be generated that notifies IT staff of potential problems via e-mail, pager, or by publishing events to existing tools. These alerts should include the nature of the problem, the application affected, and a link to view information on the specific transactions that triggered the alarm.
  2. Next, it's important to assess the impact of the problem. Not all problems are created equal - a slowdown that affects 100% of end users is clearly more serious than one that affects only 1%.
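The two steps above can be sketched as follows. The threshold values are the examples given in step 1. The `Transaction` record, the `detect_and_assess` function, and the rule that a single slow transaction is enough to raise an alert are assumptions made for illustration; a real deployment would choose its own alerting rule.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    user: str
    response_s: float  # response time as seen by the end user, in seconds
    error: bool        # True if the user saw an error (e.g. 404 or 500)


RESPONSE_THRESHOLD_S = 4.0   # example from step 1: Response Time > four seconds
ERROR_RATE_THRESHOLD = 0.20  # example from step 1: %Error > 20%


def detect_and_assess(transactions):
    """Step 1: check thresholds. Step 2: estimate the share of users affected."""
    if not transactions:
        return None
    slow = [t for t in transactions if t.response_s > RESPONSE_THRESHOLD_S]
    failed = [t for t in transactions if t.error]
    error_rate = len(failed) / len(transactions)
    # Assumption: any slow transaction, or an excessive error rate, alerts.
    alert = bool(slow) or error_rate > ERROR_RATE_THRESHOLD
    users = {t.user for t in transactions}
    affected = {t.user for t in slow} | {t.user for t in failed}
    return {
        "alert": alert,
        "error_rate": error_rate,
        "users_affected": len(affected) / len(users),
    }


# Hypothetical transactions from four users.
observed = [
    Transaction("alice", 1.2, False),
    Transaction("bob", 5.5, False),
    Transaction("carol", 0.9, True),
    Transaction("dave", 1.1, False),
    Transaction("alice", 0.8, False),
]
result = detect_and_assess(observed)
print(result)
```

Reporting the share of users affected alongside the alert gives IT staff a way to rank problems, so that a slowdown hitting half of all users is handled before one hitting a single user.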

More Stories By Hon Wong

Hon has served as CEO of Symphoniq Corporation since its inception. Prior to joining Symphoniq, Hon co-founded NetIQ, where he served on the board of directors until 2003. Hon has also co-founded and served on the board of several other companies, including Centrify, Ecosystems (acquired by Compuware), Digital Market (acquired by Oracle) and a number of other technology companies. Hon is also a General Partner of Wongfratris Investment Company, a venture investment firm. Hon holds dual BS degrees in electrical engineering and industrial engineering from Northwestern University and an MBA from the Wharton School at the University of Pennsylvania.
