Challenges of Monitoring, Tracing and Profiling your Applications running in “The Cloud”

Cloud Computing presents unique opportunities for companies to reduce costs, outsource non-core functions and scale costs to match demand. However, the Cloud also adds a new level of complexity that makes ensuring application performance a unique challenge, particularly given the many different usage and deployment scenarios available. Perhaps the most popular scenario at present uses the Cloud for tasks that need more computational power than is available in a local environment, e.g. running large-scale load tests or transforming large amounts of input data. Another scenario that is becoming more attractive these days is to actually run applications in the Cloud.

The big question surrounding this second deployment scenario is whether to use a public or a private cloud. The use of public cloud services raises many questions:

  • Is my data really safe with the hosting service provider?
  • How reliable is that service?
  • How can I troubleshoot when something goes wrong?

No matter whether you deploy your application in a private or a public cloud, Cloud Computing requires a platform that can manage the dynamics of the application within this mostly virtual, opaque environment. One of the biggest challenges presented by the dynamic nature of the Cloud is troubleshooting performance issues. There are currently no good approaches for quickly identifying the root cause of application performance issues in the Cloud. Existing tools and solutions are limited in the way they capture information. Solving issues in Cloud environments today involves inefficient manual effort from the most valuable resources of the application development team: the architects and engineers.

Looking at a Cloud Computing Platform

Cloud Computing Platforms - whether privately or publicly hosted - provide the ability to add resources dynamically as needed. This makes it possible, for example, to handle peak load on a hosted application so that end-user response times stay within SLAs (Service Level Agreements). Cloud Computing, however, is not only about adding more virtual servers or resources to your virtual infrastructure. Cloud Computing Platforms also offer services to the hosted applications, providing the foundation on which to build scalable applications. These services include data storage, messaging and caching, among others.

Cloud Services: Let’s take Data Storage as an example

Applications hosted in the Cloud can use Service Interfaces to access application-specific data. This data is stored “in the Cloud” and can be accessed by any component and any instance of the hosted application. The Data Services take care of concerns such as reliable access, concurrency and backup.
Instead of using interfaces like JDBC, the application uses a data storage interface, such as that of an In-Memory Data Grid, to query objects from the data store and to add or manipulate data. Accessing the data via this interface enables the application to scale depending on the required bandwidth, the number of concurrent users or the number of concurrent HTTP requests. With increasing load on the application, the Cloud Computing Platform can deploy additional virtual machines to handle the additional transactions. The newly deployed application instances work seamlessly against the same Data Service interfaces.
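To make the idea of several application instances working against the same Data Service a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `DataService`, `AppInstance` and their methods are hypothetical names, not the API of any real In-Memory Data Grid, and the "service" is just a thread-safe dictionary in one process.

```python
import threading
from typing import Any, Callable, Dict, List


class DataService:
    """Hypothetical stand-in for a cloud data storage service (not a real API)."""

    def __init__(self) -> None:
        self._objects: Dict[str, Any] = {}
        self._lock = threading.Lock()  # serializes concurrent access

    def write(self, key: str, value: Any) -> None:
        with self._lock:
            self._objects[key] = value

    def read(self, key: str) -> Any:
        with self._lock:
            return self._objects.get(key)

    def query(self, predicate: Callable[[Any], bool]) -> List[Any]:
        # Query by object content instead of issuing SQL through JDBC.
        with self._lock:
            return [v for v in self._objects.values() if predicate(v)]


class AppInstance:
    """One virtual instance of the hosted application (hypothetical)."""

    def __init__(self, name: str, data: DataService) -> None:
        self.name = name
        self.data = data

    def place_order(self, order_id: str, amount: float) -> None:
        order = {"id": order_id, "amount": amount, "handled_by": self.name}
        self.data.write(order_id, order)

    def large_orders(self, minimum: float) -> List[dict]:
        return self.data.query(lambda o: o["amount"] >= minimum)


shared = DataService()
instance_a = AppInstance("instance-a", shared)
instance_b = AppInstance("instance-b", shared)  # e.g. deployed later under load

instance_a.place_order("o-1", 250.0)
instance_b.place_order("o-2", 40.0)

# An order written by one instance is visible to a query from the other.
found = instance_b.large_orders(100.0)
```

The point of the sketch is only that the instances hold no data of their own: because all state lives behind the shared interface, the platform can add or remove instances without the application having to care.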

Integration with services that run “outside the Cloud”

Most applications also need access to resources other than those provided by the Cloud Services. These could be external services available on the internet, like a payment, search or mapping service accessed via Web Service or RESTful interfaces. They could also be other applications that you run yourself, most likely on-premise, e.g. your in-house CRM. For this to work, the Cloud environment must allow outbound connections from any virtual instance.

The BIG PICTURE

The following illustration shows what an application architecture hosted in a virtual cloud environment could look like:

Figure: Running Applications in a Cloud Environment

On one side you have the end users that work with the application. Depending on the load and on the response times, the requests could be handled by one, two or many more virtual instances that host the application. The application makes use of Cloud Services like Data Storage Services to persist and share data between the virtual instances. External or in-house services might be called via remoting technologies.

The Cloud is a complex environment that can dynamically change. Each request that is executed by an end-user can take different routes through the system and can affect other parts of the overall environment.

What is happening in my Cloud?

A pressing topic in Cloud Computing is monitoring, tracing and profiling. Meeting SLAs (Service Level Agreements) for the end user can be done rather easily: when application response times slow down, additional application instances are automatically deployed to handle the additional load and to distribute it across more instances. The Cloud Platform takes care of it.
But is that the correct approach? Adding new virtual instances to handle additional load is fine. But what if your application actually has a performance problem? Adding new virtual instances of course solves the problem in the short run. But basically it is like taking more Advil when you have a toothache: it doesn’t address the root cause of the problem, which might be a cavity or a broken tooth.
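A bit of back-of-the-envelope arithmetic shows why scaling out only treats the symptom. The following sketch assumes single-core instances, a purely CPU-bound request and a target utilization of 70%; the function name and all numbers are made up for illustration and do not come from any real platform.

```python
import math


def instances_needed(requests_per_second: float,
                     cpu_ms_per_request: float,
                     target_utilization: float = 0.7) -> int:
    """Instances required to keep each single-core instance under the target.

    Simplifying assumption: cost per request is pure CPU time and load is
    spread evenly across instances.
    """
    cpu_seconds_per_second = requests_per_second * cpu_ms_per_request / 1000.0
    return max(1, math.ceil(cpu_seconds_per_second / target_utilization))


# Same load, but a code change made every request four times as expensive.
healthy = instances_needed(requests_per_second=100, cpu_ms_per_request=50)
regressed = instances_needed(requests_per_second=100, cpu_ms_per_request=200)

print(healthy, regressed)  # 8 29
```

With the load unchanged, the platform would quietly go from 8 to 29 instances and response times would look fine again. The performance problem is still there; it just shows up on the bill instead of in the SLA report.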

Root-Cause Analysis in the Cloud

To understand why the current deployment is not able to handle the current load, it is necessary to look beyond end-user response times and performance counters like CPU, memory, I/O and network utilization.
Monitoring the services running in the Cloud gives additional insight into where the time is spent and can also uncover problems in the application itself by identifying “improper” use of service interfaces. As with the architectural guidelines for “non-cloud” applications, it’s essential to be careful with the resources you have. In a traditional application you want to limit the number of roundtrips over remoting boundaries or to the database. You also want to make sure that your SQL statements are well written and return only the data that you need.
The same rules apply to an application that runs in the Cloud and accesses the Cloud Service Interfaces. The challenge until now has been to monitor the activities of the application within the Cloud.
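One simple way to spot “improper” use of a service interface is to count the roundtrips a single piece of work causes. The sketch below is a minimal illustration under stated assumptions: `CountingDataService` and its `read` / `read_many` methods are hypothetical names wrapped around a plain dictionary, not the interface of any real Cloud Service or monitoring product.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List


class CountingDataService:
    """Hypothetical wrapper that counts roundtrips per operation type."""

    def __init__(self, store: Dict[str, Any]) -> None:
        self._store = store
        self.roundtrips: Counter = Counter()

    def read(self, key: str) -> Any:
        self.roundtrips["read"] += 1  # one roundtrip per object
        return self._store.get(key)

    def read_many(self, keys: Iterable[str]) -> List[Any]:
        self.roundtrips["read_many"] += 1  # one roundtrip for the whole batch
        return [self._store.get(k) for k in keys]


store = {f"item-{i}": {"price": i} for i in range(50)}
keys = list(store)

# Chatty access: one call to the service for every single object.
chatty = CountingDataService(store)
chatty_total = sum(chatty.read(k)["price"] for k in keys)

# Batched access: the same result with a single call.
batched = CountingDataService(store)
batched_total = sum(item["price"] for item in batched.read_many(keys))

print(sum(chatty.roundtrips.values()), sum(batched.roundtrips.values()))  # 50 1
```

Both variants compute the same total, but the first one crosses the service boundary 50 times. In a local test that difference is invisible; against a remote Data Service every extra roundtrip adds network latency, which is exactly the kind of pattern that CPU and memory counters alone will not reveal.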

A big limitation is that it is not easy to remote-debug your code or to install a profiler on the virtual machines in order to really understand how the deployed application components communicate with other components or services.

The question that needs to be answered is:
How can we get insight into the dynamics of a deployed Cloud application?

Instead of answering this question here, I want to point you to the following blog article: Proof of Concept: dynaTrace provides Cloud Service Monitoring and Root Cause Analysis for GigaSpaces

That article explains how the questions raised in this post could be answered for an application running in a GigaSpaces Cloud environment with the use of dynaTrace.


More Stories By Andreas Grabner

Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi
