The Evolution of Big Data and Modern Business Intelligence
“Every industry out there is being disrupted and is recognizing the value of this data and how to analyze it to get ahead of others in the competition,” said Anurag Tandon, Vice President of Product Management at Zoomdata. Tandon recently presented a webinar for DM Radio called All the Difference: Rapid Discovery on Modern Data. In it, he traced how the Big Data landscape evolved to the present, and in an accompanying interview he discussed how Zoomdata fits into the evolution of modern Business Intelligence (BI).
Development of the Big Data Landscape
Tandon described the transition from the 1990s, when on-premise, transaction-based relational database management systems (RDBMS) handled tens of millions of transactions on structured data, through the 2000s, when hundreds of millions of interactions with structured and unstructured data were processed with NoSQL systems and advanced search. He then moved into the present, when on-premise, Cloud, and hybrid Big Data stores hold hundreds of billions of structured, unstructured, and streaming transactions or interactions.
Tandon started his career in BI in the late 90s, working with RDBMS for large enterprise companies. “It was all about transactions,” he said. Companies were tracking human-initiated events: a sale, a service call, a hotel stay, a bill payment, etc. Retail and financial organizations were the first to use data to their advantage because they had the most transactions. “And they were looking at tens of millions of rows and maybe hundreds of millions of rows in the extreme cases.”
OLTP vs. OLAP
Online Transaction Processing (OLTP) is the process of capturing point of sale (POS) transactions and communicating them to the backend systems. Online Analytical Processing (OLAP) aggregates those transactions to form a holistic picture for analysis. “These were still very, very structured sets of data in tabular form, relational schemas in Oracle, SQL Server, and other tools,” he said. The challenge back then was to find a way to separate the workloads of OLTP and OLAP, when in some cases, the same databases were doing both kinds of processing. Analytical processing grew out of transactional processing, and in the 1990s, very few companies understood the importance of Analytics.
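To make the OLTP/OLAP distinction concrete, here is a minimal Python sketch using SQLite purely for illustration; the table and values are invented, and in practice the two workloads ran on separate systems, which was exactly the separation challenge Tandon describes. The row-at-a-time inserts are the OLTP side; the GROUP BY aggregate is the OLAP side.

```python
import sqlite3

# One in-memory database stands in for both workloads here; real deployments
# worked to separate OLTP and OLAP onto different systems.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales (
        sale_id  INTEGER PRIMARY KEY,
        store_id INTEGER,
        amount   REAL,
        sold_at  TEXT
    )
""")

# OLTP: capture individual point-of-sale transactions as they occur.
conn.executemany(
    "INSERT INTO sales (store_id, amount, sold_at) VALUES (?, ?, ?)",
    [(1, 19.99, "1999-03-01"), (1, 5.49, "1999-03-01"), (2, 42.00, "1999-03-02")],
)

# OLAP: aggregate those transactions into a holistic picture for analysis.
for store_id, revenue, n in conn.execute(
    "SELECT store_id, SUM(amount), COUNT(*) FROM sales GROUP BY store_id"
):
    print(f"store {store_id}: {n} sales, ${revenue:.2f} revenue")
```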
The Internet and New Types of Data
Starting in the 2000s, the internet became an accepted avenue for business, and websites, web apps, and mobile apps were providing a growing pool of data for companies to capture, he said.
“Not just transactional information but also interactions. So as organizations were better understanding the value of their data, they were collecting more and more.”
This proliferation of data forced businesses to find new ways to store and analyze information. Where retail and banks were leading the way with transactional data in the 1990s, organizations like Facebook and Google took the lead in recognizing the value of interaction streams.
IT functions matured and became the providers of this information to business users in the form of canned, printed reports or spreadsheets. More advanced companies used web tools for reporting. “There was a handful of users who were using these types of desktop tools, but they were mostly in IT.”
Laptops and PCs Empower Business Users
In the late 2000s, laptops and desktop PCs gained increased processing power and memory, and desktop applications became more prevalent. “As IT was getting bogged down with a lot of things that they were doing for the organization, business users were starting to say, ‘No, we’ve got this,’” he said. This led to what Tandon called a “desktop/web hybrid,” where users could share information from applications like Tableau or QlikView via the web, “because we’re really just talking about millions of rows of data and gigabytes of data, which my laptop can handle.” In the 2010s, there was a shift back to the web with Cloud, with unlimited distributed scale in processing terms, he said.
Timing is Critical
The latency of data became more important as daily updates replaced weekly or monthly batch processing. By 2010, companies had started benefiting from real-time data. “Monthly and weekly was good enough in the 90s, and in the 2000s, daily was the norm. Now we’re looking at the freshest information possible,” he said. Of the companies surveyed in The State of Data Analytics and Visualization Adoption, just over half used Embedded Analytics, and more than 15 percent were working with streaming data less than one hour old. In addition, two-thirds are now using non-RDBMS systems, suggesting that the traditional relational database is on its way out, he said. “The relational database systems are there, very much so, but they are more and more used for legacy type of applications.”
A Data Explosion
There is “an explosion” in observational data collection now, where machines are recording, sending, storing, and analyzing data, and communicating with each other, without human involvement, he said. He added that with the advent of Click Stream Analytics and IoT data coming from millions of devices, “You start to see structured data at massive volumes with hundreds of millions, billions, or tens of billions of rows of data.”
In the last twenty years, Business Intelligence (BI) tools have shifted “from desktop to web, back to desktop, and back to web,” Tandon said, with different tools fitting a variety of different uses, because transactional and interaction analysis haven’t really gone away. “We’re just talking about newer and newer use cases emerging with newer sets of data, newer ways of recording them and newer ways of storing them.”
While this data offers the potential for more and more insight, the tools relied upon in the past are starting to break down under the scale of data that organizations are collecting, especially observational data, he said. “When the data gets big, when the data is streaming or when the data is unstructured, they are failing.” Users now want their Analytics in context, “Because ultimately, the cycle that we need to follow is data to analysis, to insights, to decisions, to action, and results.” Embedded Analytics and analysis are now possible, shortening the time it takes to move through that cycle from analysis to action.
Traditional BI: Not for Modern Data
Traditional BI, Tandon said, focused on power users in IT rather than on business users:
“There was a long lead time between the time data was available to the time the data was ready for analysis, and then actually analyzed. Tools running on monolithic servers have had a difficult time with user scalability, you couldn’t break them apart, you couldn’t scale them, and you couldn’t distribute them in terms of the workload,” he said. “And because these tools are designed for relational data, there are performance problems with streaming, unstructured, and semi-structured data sources.”
Zoomdata: Developed for Modern BI
Companies wanted tools that would scale as needed, that could handle massive volumes of varying types of data from different sources and locations, and that would provide real-time insight to users across the enterprise in an easy to access format. In short, Tandon said, they wanted tools that could meet “the challenge of doing Analytics on modern data stores.” Zoomdata set out to meet that challenge and to solve the problems that keep companies from moving forward with their Analytics.
According to the company’s website, Zoomdata provides an embeddable, extensible experience where users can search streaming data from a variety of sources anywhere, on-premise or in the Cloud, with no desktop client required. Appealing, easy-to-use dashboards and interactive visualizations provide an asynchronous user experience, with no modeling needed.
Zoomdata delivers a microservice-based flexible deployment architecture querying billions of rows of data “at the speed of thought,” with no data movement.
“The data movement part is really important because that is at the core of what Zoomdata does. We don’t want to replicate the distributed scale that you have in your big data lake or big data store of choice,” Tandon said.
Zoomdata doesn’t store the data. “You store data in your backend system or whatever you happen to be using. But we provide that visual interactive analysis user experience,” he added.
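To make the “no data movement” idea concrete, here is a minimal, hypothetical sketch of query pushdown; the function and names are invented for illustration and are not Zoomdata’s actual API. Instead of pulling raw rows into the BI tier, a chart request is translated into an aggregate query that the backend data store executes itself, so only the small summarized result crosses the wire.

```python
# Hypothetical sketch of query pushdown; these names are invented for
# illustration and are not Zoomdata's actual API.

def chart_to_query(measure: str, dimension: str, table: str) -> str:
    """Translate a visualization request into an aggregate query for the
    backend to execute, so only the summarized result is returned."""
    return (
        f"SELECT {dimension}, SUM({measure}) AS total "
        f"FROM {table} GROUP BY {dimension}"
    )

# A bar chart of revenue by region becomes one pushed-down aggregate query.
print(chart_to_query(measure="amount", dimension="region", table="sales"))
# SELECT region, SUM(amount) AS total FROM sales GROUP BY region
```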
Tandon elaborated on what makes Zoomdata uniquely qualified to meet the challenge of modern data. He said there are five key areas where Zoomdata excels:
- Zoomdata is designed to be fast, “To provide ‘speed of thought’ Analytics on any size data sets, where traditional BI tools can’t analyze, or they take way, way too long.”
- Built from the ground up for modern data sources, Zoomdata takes a highly optimized custom approach to every different type of data store.
- Zoomdata provides real-time streaming data that can be compared to historical data. This includes “a capability we call the ‘Data DVR’ which lets you pause, rewind, and replay that data through a visual UI.”
- Zoomdata excels at availability and extensibility. Because the company is only five years old, its user interface was developed in the modern era to meet the needs of a modern company. “You can embed it in existing applications and extend it. So, if there’s a particular visualization you need, you can integrate with any third-party visualization library.”
- “Using Zoomdata Fusion, you can actually blend data on the fly from multiple sources and they can be completely heterogeneous.” Data from Hadoop, from search databases like Elasticsearch, or data from a relational database like MySQL or Oracle can be blended on the fly, “without needing to move the data, and without needing to build a data warehouse or data mart.”
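As a rough illustration of the kind of on-the-fly blend Tandon describes in that last point, here is a minimal Python sketch using pandas; the tables and columns are invented, and in a real federated setup each aggregate would come back from its own backend (say, Elasticsearch for clickstream and MySQL for orders), with only the small summaries blended.

```python
import pandas as pd

# Hypothetical per-source aggregates; in a federated blend each of these
# would be returned by its own backend rather than stored centrally.
clicks = pd.DataFrame({"product_id": [101, 102, 103],
                       "clicks": [5400, 1200, 860]})    # e.g., from Elasticsearch
orders = pd.DataFrame({"product_id": [101, 102],
                       "revenue": [25000.0, 4300.0]})   # e.g., from MySQL

# Blend on the shared key, without moving raw data into a warehouse or mart.
blended = clicks.merge(orders, on="product_id", how="left")
blended["revenue"] = blended["revenue"].fillna(0.0)
print(blended)
```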
Tandon added that Zoomdata is now offering master classes in Business Analytics. “We recorded a bunch of small, bite-sized videos by people who are very accomplished and knowledgeable about the space, and you’re welcome to come and watch them on our website.” To get ahead of the competition at a time when modern BI is essential, the tools of the past are no longer sufficient, said Tandon. “We’ve built this architecture that can excel where others will fail.”
Check out the full webinar and the corresponding podcast entitled Analytics Drives the Information Economy on DM Radio, hosted by Eric Kavanagh of The Bloor Group. The other guests for the podcast were Wayne Eckerson of Eckerson Group, Ian Fyfe formerly of Zoomdata, John Morrell of Datameer, Adeel Najmi of One Network, and Ramin Sayar of Sumo Logic.
Photo Credit: kentoh/Shutterstock.com