Challenges of Monitoring, Tracing and Profiling your Applications Running in “The Cloud”

Cloud Computing presents unique opportunities for companies to reduce costs, outsource non-core functions and scale costs to match demand. However, the Cloud also introduces a new level of complexity that makes ensuring application performance a unique challenge, particularly given the many different usage and deployment scenarios available. Perhaps the most popular scenario at present uses the Cloud to perform tasks for which the local environment lacks the computational power, e.g. running large-scale load tests or transforming large amounts of input data. Another scenario that is becoming more attractive these days is to actually run applications in the Cloud.

The big question that circles around this second deployment scenario is whether to use a public or private cloud. The use of public cloud services raises many questions:

  • Is my data really safe with the hosting service provider?
  • How reliable is that service?
  • How can I troubleshoot if something goes wrong?

Whether you deploy your application in a private or a public cloud, Cloud computing requires a platform that can manage the dynamics of the application within this mostly virtual, opaque environment. One of the biggest challenges presented by the dynamic nature of the Cloud is troubleshooting performance issues. There are currently no good approaches for quickly identifying the root cause of application performance issues in the Cloud. Existing tools and solutions are limited in the way they capture information. Solving issues in Cloud environments today involves inefficient manual effort from the most valuable resources of the application development team: the architects and engineers.

Looking at a Cloud Computing Platform

Cloud Computing Platforms - whether privately or publicly hosted - provide the ability to add resources dynamically as needed. This makes it possible, for example, to handle peak load on a hosted application so that end-user response times stay within SLAs (Service Level Agreements). Cloud Computing, however, is not only about adding more virtual servers or resources to your virtual infrastructure. Cloud Computing Platforms also offer Services to the hosted applications, providing the foundation on which to build scalable applications. These services include data storage, messaging, caching …
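
To make the scale-out idea above a bit more concrete, here is a minimal sketch in Python of how a platform might decide on the number of instances based on observed response times. This is not the API of any particular Cloud platform: the function name `desired_instances`, the parameters and the thresholds (scale out above the SLA, scale in below half of it) are assumptions chosen purely for illustration.

```python
from statistics import mean


def desired_instances(current, response_times, sla_seconds, max_instances=10):
    """Return the instance count a hypothetical platform would target.

    Adds one instance when the average response time exceeds the SLA,
    removes one when it is below half the SLA, and otherwise keeps the
    current count. All thresholds are illustrative assumptions.
    """
    if not response_times:
        # No measurements in this interval: leave the deployment alone.
        return current
    avg = mean(response_times)
    if avg > sla_seconds:
        return min(current + 1, max_instances)
    if avg < sla_seconds / 2 and current > 1:
        return current - 1
    return current
```

A real platform would likely look at percentiles rather than averages and add cool-down periods, but the basic feedback loop is the same: measure response times, compare against the SLA, adjust the number of instances.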

Cloud Services: Let’s take Data Storage as an example

Applications hosted in the Cloud can use Service Interfaces to access application-specific data. This data is stored “in the Cloud” and can be accessed by any component and any instance of the hosted application. The Data Services ensure reliable access, concurrency, backup, …
Instead of using interfaces like JDBC, the application uses a data storage interface, such as that of an In-Memory Data Grid, to query objects from the data store and to add or manipulate data. Accessing the data via this interface enables the application to scale with the required bandwidth, the number of concurrent users or the number of concurrent HTTP requests. With increasing load on the application, the Cloud Computing Platform can deploy additional virtual machines to handle the additional transactions. The newly deployed application instances work seamlessly against the same Data Service interfaces.
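
The following Python sketch illustrates the idea of several application instances working against one shared data service interface. It is an in-process stand-in only: `DataService`, `AppInstance` and their methods are hypothetical names invented for this example and do not correspond to the API of any real data grid product.

```python
import threading


class DataService:
    """In-process stand-in for a Cloud data storage service (hypothetical)."""

    def __init__(self):
        self._objects = {}
        self._lock = threading.Lock()  # guards concurrent access

    def write(self, key, value):
        with self._lock:
            self._objects[key] = value

    def read(self, key):
        with self._lock:
            return self._objects.get(key)

    def query(self, predicate):
        # Return all stored objects for which the predicate is true.
        with self._lock:
            return [v for v in self._objects.values() if predicate(v)]


class AppInstance:
    """One virtual instance of the hosted application."""

    def __init__(self, name, data_service):
        self.name = name
        self.data = data_service

    def place_order(self, order_id, amount):
        self.data.write(
            order_id, {"id": order_id, "amount": amount, "handled_by": self.name}
        )

    def large_orders(self, threshold):
        return self.data.query(lambda order: order["amount"] >= threshold)


store = DataService()
instance_a = AppInstance("a", store)
instance_b = AppInstance("b", store)  # e.g. deployed later under load

instance_a.place_order("o1", 250)
instance_b.place_order("o2", 40)
```

Because both instances talk to the same service, an order written by instance "a" is visible to a query issued from instance "b". Adding a third instance requires no change to the data access code.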

Integration with services that run “outside the Cloud”

Applications often need access to resources other than those provided by the Cloud Services. These could be external services available on the internet, such as a payment, search or mapping service accessed via Web Service or RESTful interfaces. They could also be data from other applications that you run, most likely on-premise, e.g. your in-house CRM. For that to work, the Cloud environment must allow outbound connections from any virtual instance.

The BIG PICTURE

The following illustration shows what an application architecture hosted in a virtual cloud environment could look like:

[Figure: Running Applications in a Cloud Environment]

On one side you have the end users who work with the application. Depending on the load and on the response times, the requests could be handled by one, two or many more virtual instances that host the application. The application makes use of Cloud Services like Data Storage Services to persist and share data between the virtual instances. External or in-house services might be called via remoting technologies.

The Cloud is a complex environment that can dynamically change. Each request that is executed by an end-user can take different routes through the system and can affect other parts of the overall environment.

What is happening in my Cloud?

A pressing topic in Cloud Computing is monitoring, tracing and profiling. Meeting end-user SLAs (Service Level Agreements) can be done rather easily: when application response times slow down, additional application instances are automatically deployed to handle the additional load and to distribute it across more instances. The Cloud Platform takes care of it.
But is that the correct approach? Adding new virtual instances to handle additional load is fine. But what if your application actually has a performance problem? Adding new virtual instances does of course relieve the problem in the short run. But it is like taking more Advil for a toothache: it doesn’t address the root cause of the problem, which might be a cavity or a broken tooth.
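
A bit of back-of-the-envelope arithmetic shows why scaling out only hides such a problem. The Python sketch below uses a deliberately simple capacity model, and all numbers in it (200 requests per second, 10 workers per instance, 50 ms versus 250 ms per request) are made-up assumptions for illustration, not measurements.

```python
def instances_needed(requests_per_second, ms_per_request, workers_per_instance):
    """Instances required to keep up with the load (simple capacity model).

    Assumes each worker handles one request at a time, so one instance
    offers workers_per_instance * 1000 milliseconds of processing time
    per second. Integer arithmetic avoids floating-point rounding.
    """
    demand_ms = requests_per_second * ms_per_request
    capacity_ms = workers_per_instance * 1000
    return (demand_ms + capacity_ms - 1) // capacity_ms  # ceiling division


healthy = instances_needed(200, 50, 10)    # 50 ms per request
degraded = instances_needed(200, 250, 10)  # a regression makes requests 5x slower
```

Under these assumptions the healthy application needs one instance and the degraded one needs five for exactly the same load. The platform will happily deploy them, the SLA will be met, and the inefficiency shows up only as a higher bill.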

Root-Cause Analysis in the Cloud

In order to understand why the current deployment is not able to handle the current load it is necessary to look beyond end user response times and performance counters like CPU, Memory, I/O and Network Utilization.
Monitoring the services running on the cloud gives additional insight into where the time is spent and can also uncover problems in the application itself by identifying “improper” use of service interfaces. As with other architectural guidelines for “non-cloud” applications – it’s essential to be careful with the resources you have. In a traditional application you want to make sure to limit the number of roundtrips over remoting boundaries or to the database. You want to make sure that your SQL statements are well written and only return the data that you need.
The same rules apply for an application that runs in the Cloud and that accesses the Cloud Service Interfaces. The challenge until now was to monitor the activities of the application within the Cloud.
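
To illustrate what “improper” use of a service interface can look like, here is a minimal Python sketch that wraps a data store and records how often it is called and how much time is spent in it. `CountingStore`, `get` and `get_many` are hypothetical names for this example; a real Cloud data service would have its own interface, and each call would be a network roundtrip rather than a dictionary lookup.

```python
import time
from collections import Counter


class CountingStore:
    """Wraps a dict-backed store and records every call made to it."""

    def __init__(self, data):
        self._data = dict(data)
        self.calls = Counter()  # number of calls per operation
        self.seconds = 0.0      # total time spent inside the store

    def _timed(self, name, fn):
        start = time.perf_counter()
        try:
            return fn()
        finally:
            self.seconds += time.perf_counter() - start
            self.calls[name] += 1

    def get(self, key):
        return self._timed("get", lambda: self._data.get(key))

    def get_many(self, keys):
        return self._timed(
            "get_many", lambda: {k: self._data.get(k) for k in keys}
        )


def load_one_by_one(store, keys):
    # Chatty access pattern: one roundtrip per object.
    return {k: store.get(k) for k in keys}


def load_in_batch(store, keys):
    # Batched access pattern: one roundtrip for all objects.
    return store.get_many(keys)


keys = [f"user:{i}" for i in range(100)]
data = {k: {"name": k} for k in keys}

chatty = CountingStore(data)
batched = CountingStore(data)
result_chatty = load_one_by_one(chatty, keys)
result_batched = load_in_batch(batched, keys)
```

Both functions return the same data, but the first makes 100 calls to the store where the second makes one. CPU and memory counters would not reveal this difference; counting calls per service interface does.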

A big limitation is that it is not easy to remotely debug your code or to install a profiler on the virtual machines to really understand how the deployed application components communicate with other components or services.

The question that needs to be answered is:
How can we get insight into the dynamics of a deployed Cloud Application?

Instead of answering this question here, I refer you to the following blog article: Proof of Concept: dynaTrace provides Cloud Service Monitoring and Root Cause Analysis for GigaSpaces

That article explains how the questions raised in this post could be answered for an application running in a GigaSpaces Cloud Environment with the use of dynaTrace.

More Stories By Andreas Grabner

Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi
