Welcome!

Microsoft Cloud Authors: Pat Romanski, Srinivasan Sundara Rajan, Glenn Rossman, Janakiram MSV, Steven Mandel

Related Topics: SYS-CON ITALIA, Containers Expo Blog

SYS-CON ITALIA: Article

Cloud Computing: Making Analytics in the Cloud a Reality

There will soon be a myriad of announcements of DBMS offerings in the cloud

There will soon be a myriad of announcements of DBMS offerings in the cloud. Many of these will NOT be marriages made in heaven. However, the most innovative new DBMS software combined with new cloud computing services are here today and truly take advantage of the cloud architecture in order to change the economics and the responsiveness of business analytics.

My belief is that cloud computing will change the economics of business intelligence (BI) and enable a variety of new analytic data management projects and business possibilities. It does so by making the hardware, networking, security, and software needed to create data marts and data warehouses available on demand with a pay-as-you-go approach to usage and licensing.

A computing cloud, such as the Amazon Elastic Compute Cloud, is composed of thousands of commodity servers running multiple virtual machine instances (VMs) of the applications hosted in the cloud. As customer demand for those applications changes, new servers are added to the cloud or idled and new VMs are instantiated or terminated.

Cloud computing infrastructure differs dramatically from the infrastructure underlying most in-house data warehouses and data marts. There are no high-end servers with dozens of CPU cores, SANs, replicated systems, or proprietary data warehousing appliances available in the cloud. Therefore, a new DBMS software architecture is required to enable large volumes of data to be analyzed quickly and reliably on the cloud's commodity hardware. Recent DBMS innovations make this a reality today, and the best cloud DBMS architectures will include:

  1. Shared-nothing, massively parallel processing (MPP) architecture. In order to drive down the cost of creating a utility computing environment, the best cloud service providers use huge grids of identical (or similar) computing elements. Each node in the grid is typically a compute engine with its own attached storage. For a cloud database to successfully "scale out" in such an environment, it is essential that the database have a shared-nothing architecture utilizing the resources (CPU, memory, and disk) found in server nodes added to the cluster. Most databases popularly used in BI today have shared-everything or shared-storage architectures, which will limit their ability to scale in the cloud.

  2. Automatic high availability. Within a cloud-based analytic database cluster, node failures, node changes, and connection disruptions can occur. Given the vast number of processing elements within a cloud, these failures can be made transparent to the end user if the database has the proper built-in failover capabilities. The best cloud databases will replicate data automatically across the nodes in the cloud cluster, be able to continue running in the event of 1 or more node failures ("k-safety"), and be capable of restoring data on recovered nodes automatically -- without DBA assistance. Ideally, the replicated data will be made "active" in different sort orders for querying to increase performance.

  3. Ultra-high performance. One of the game-changing advantages of the cloud is the ability to get an analytic application up quickly (without waiting for hardware procurement). However, there can be some performance penalty due to Internet connectivity speeds and the virtualized cloud environment. If the analytic performance is disappointing, the advantage is lost. Fortunately, the latest shared-nothing columnar databases are designed specifically for analytic workloads, and they have demonstrated dramatic performance improvements over traditional, row-oriented databases (as verified by industry experts, such as Gartner and Forrester, and by customer benchmarks). This software performance improvement, coupled with the hardware economies of scale provided by the cloud environment, results in a new economic model and competitive advantage for cloud analytics.

  4. Aggressive compression. Since cloud costs are typically driven by charges for processor and disk storage utilization, aggressive data compression will result in very large cost savings. Row-oriented databases can achieve compression factors of about 30% to 50%; however, the addition of necessary indexes and materialized views often swells databases to 2 to 5 times the size of the source data. But since the data in a column tends to be more similar and repetitive than attributes within rows, column databases often achieve much higher levels of compression. They also don't require indexes. The result is normally a 4x to 20x reduction in the amount of storage needed by columnar databases and a commensurate reduction in storage costs.

  5. Standards-based connectivity. While there are a number of special-purpose file systems that have been developed for the cloud environment that can provide high performance, they lack the standard connectivity needed to support general-purpose business analytics. The broad base of analytic users will use existing commercial ETL and reporting software that depend on SQL, JDBC, ODBC, and other DBMS connectivity standards to load and query cloud databases. Therefore, it's imperative for cloud databases to support these connection standards to enable widespread use of analytic applications.
In summary, cloud databases with the architectural characteristics described above will be able to not just run in the cloud, but thrive there by:

  • "Scaling out," as the cloud itself does
  • Running fast without high-end or custom hardware
  • Providing high availability in a fluid computing environment
  • Minimizing data storage, transfer, and CPU utilization (to keep cloud computing fees low)

More Stories By Jerry Held

Jerry Held is Executive Chairman of Vertica and CEO of the Held Consulting Group, a firm that provides strategic consulting to CEOs and senior executives of technology firms ranging from startups to very large organizations and private equity firms. Prior to his current position, Held was a senior executive at both Oracle Corp. and Tandem Computers.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@ThingsExpo Stories
Why do your mobile transformations need to happen today? Mobile is the strategy that enterprise transformation centers on to drive customer engagement. In his general session at @ThingsExpo, Roger Woods, Director, Mobile Product & Strategy – Adobe Marketing Cloud, covered key IoT and mobile trends that are forcing mobile transformation, key components of a solid mobile strategy and explored how brands are effectively driving mobile change throughout the enterprise.
SYS-CON Events announced today that ReadyTalk, a leading provider of online conferencing and webinar services, has been named Vendor Presentation Sponsor at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. ReadyTalk delivers audio and web conferencing services that inspire collaboration and enable the Future of Work for today’s increasingly digital and mobile workforce. By combining intuitive, innovative tec...
If you’re responsible for an application that depends on the data or functionality of various IoT endpoints – either sensors or devices – your brand reputation depends on the security, reliability, and compliance of its many integrated parts. If your application fails to deliver the expected business results, your customers and partners won't care if that failure stems from the code you developed or from a component that you integrated. What can you do to ensure that the endpoints work as expect...
WebRTC adoption has generated a wave of creative uses of communications and collaboration through websites, sales apps, customer care and business applications. As WebRTC has become more mainstream it has evolved to use cases beyond the original peer-to-peer case, which has led to a repeating requirement for interoperability with existing infrastructures. In his session at @ThingsExpo, Graham Holt, Executive Vice President of Daitan Group, will cover implementation examples that have enabled ea...
There is growing need for data-driven applications and the need for digital platforms to build these apps. In his session at 19th Cloud Expo, Muddu Sudhakar, VP and GM of Security & IoT at Splunk, will cover different PaaS solutions and Big Data platforms that are available to build applications. In addition, AI and machine learning are creating new requirements that developers need in the building of next-gen apps. The next-generation digital platforms have some of the past platform needs a...
So, you bought into the current machine learning craze and went on to collect millions/billions of records from this promising new data source. Now, what do you do with them? Too often, the abundance of data quickly turns into an abundance of problems. How do you extract that "magic essence" from your data without falling into the common pitfalls? In her session at @ThingsExpo, Natalia Ponomareva, Software Engineer at Google, provided tips on how to be successful in large scale machine learning...
In his session at @ThingsExpo, Kausik Sridharabalan, founder and CTO of Pulzze Systems, Inc., will focus on key challenges in building an Internet of Things solution infrastructure. He will shed light on efficient ways of defining interactions within IoT solutions, leading to cost and time reduction. He will also introduce ways to handle data and how one can develop IoT solutions that are lean, flexible and configurable, thus making IoT infrastructure agile and scalable.
SYS-CON Events announced today that Numerex Corp, a leading provider of managed enterprise solutions enabling the Internet of Things (IoT), will exhibit at the 19th International Cloud Expo | @ThingsExpo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Numerex Corp. (NASDAQ:NMRX) is a leading provider of managed enterprise solutions enabling the Internet of Things (IoT). The Company's solutions produce new revenue streams or create operating...
In his general session at 18th Cloud Expo, Lee Atchison, Principal Cloud Architect and Advocate at New Relic, discussed cloud as a ‘better data center’ and how it adds new capacity (faster) and improves application availability (redundancy). The cloud is a ‘Dynamic Tool for Dynamic Apps’ and resource allocation is an integral part of your application architecture, so use only the resources you need and allocate /de-allocate resources on the fly.
Fifty billion connected devices and still no winning protocols standards. HTTP, WebSockets, MQTT, and CoAP seem to be leading in the IoT protocol race at the moment but many more protocols are getting introduced on a regular basis. Each protocol has its pros and cons depending on the nature of the communications. Does there really need to be only one protocol to rule them all? Of course not. In his session at @ThingsExpo, Chris Matthieu, co-founder and CTO of Octoblu, walk you through how Oct...
IoT is fundamentally transforming the auto industry, turning the vehicle into a hub for connected services, including safety, infotainment and usage-based insurance. Auto manufacturers – and businesses across all verticals – have built an entire ecosystem around the Connected Car, creating new customer touch points and revenue streams. In his session at @ThingsExpo, Macario Namie, Head of IoT Strategy at Cisco Jasper, will share real-world examples of how IoT transforms the car from a static p...
Web Real-Time Communication APIs have quickly revolutionized what browsers are capable of. In addition to video and audio streams, we can now bi-directionally send arbitrary data over WebRTC's PeerConnection Data Channels. With the advent of Progressive Web Apps and new hardware APIs such as WebBluetooh and WebUSB, we can finally enable users to stitch together the Internet of Things directly from their browsers while communicating privately and securely in a decentralized way.
"My role is working with customers, helping them go through this digital transformation. I spend a lot of time talking to banks, big industries, manufacturers working through how they are integrating and transforming their IT platforms and moving them forward," explained William Morrish, General Manager Product Sales at Interoute, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
The Internet of Things can drive efficiency for airlines and airports. In their session at @ThingsExpo, Shyam Varan Nath, Principal Architect with GE, and Sudip Majumder, senior director of development at Oracle, will discuss the technical details of the connected airline baggage and related social media solutions. These IoT applications will enhance travelers' journey experience and drive efficiency for the airlines and the airports. The session will include a working demo and a technical d...
Developing software for the Internet of Things (IoT) comes with its own set of challenges. Security, privacy, and unified standards are a few key issues. In addition, each IoT product is comprised of (at least) three separate application components: the software embedded in the device, the back-end service, and the mobile application for the end user’s controls. Each component is developed by a different team, using different technologies and practices, and deployed to a different stack/target –...
Identity is in everything and customers are looking to their providers to ensure the security of their identities, transactions and data. With the increased reliance on cloud-based services, service providers must build security and trust into their offerings, adding value to customers and improving the user experience. Making identity, security and privacy easy for customers provides a unique advantage over the competition.
SYS-CON Events announced today that Commvault, a global leader in enterprise data protection and information management, has been named “Bronze Sponsor” of SYS-CON's 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Commvault is a leading provider of data protection and information management solutions, helping companies worldwide activate their data to drive more value and business insight and to transform moder...
SYS-CON Events has announced today that Roger Strukhoff has been named conference chair of Cloud Expo and @ThingsExpo 2016 Silicon Valley. The 19th Cloud Expo and 6th @ThingsExpo will take place on November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. "The Internet of Things brings trillions of dollars of opportunity to developers and enterprise IT, no matter how you measure it," stated Roger Strukhoff. "More importantly, it leverages the power of devices and the Interne...
Personalization has long been the holy grail of marketing. Simply stated, communicate the most relevant offer to the right person and you will increase sales. To achieve this, you must understand the individual. Consequently, digital marketers developed many ways to gather and leverage customer information to deliver targeted experiences. In his session at @ThingsExpo, Lou Casal, Founder and Principal Consultant at Practicala, discussed how the Internet of Things (IoT) has accelerated our abil...
Digital innovation is the next big wave of business transformation based on digital technologies of which IoT and Big Data are key components, For example: Business boundary innovation is a challenge to excavate third-party business value using IoT and BigData, like Nest Business structure innovation may propose re-building business structure from scratch, as Uber does in the taxicab industry The social model innovation is also a big challenge to the new social architecture with the design fr...