HANA and Hadoop

Takeaways from the Gartner Business Intelligence and Analytics Summit

Last week, MemSQL had the opportunity to participate in the Gartner Business Intelligence and Analytics Summit in Las Vegas. It was a fun chance to talk to hundreds of analytics users about their current challenges and future plans.

As an in-memory database company, we fielded questions on both sides of the analytics spectrum. Some attendees were curious about how we compared with SAP HANA, an in-memory offering at the high-end of the solution spectrum. Others wanted to know how we integrated with Hadoop, the scale-out approach to storing and batch processing large data sets.

Over the span of a few days and many conversations, the gap between these offerings became clear. So did the market's appetite for a solution that bridges it.

Hardly Accessible, Not Affordable

While HANA does offer a set of in-memory analytical capabilities, primarily optimized for the emerging SAP S/4HANA suite, it remains at such upper echelons of the enterprise IT pyramid that it is rarely accessible across an organization. Part of this stems from the length and complexity of HANA implementations and deployments. Its top-of-the-line price and mandated hardware configurations also mean that in-memory capabilities via HANA are simply not affordable for a broader set of needs within a company.

Hanging with Hadoop

On the other side of the spectrum lies Hadoop, a foundational big data engine, though one that often serves as little more than a large repository of log and event data. Part of Hadoop's rise has been the Hadoop Distributed File System (HDFS), which allows for cheap and deep storage on commodity hardware. MapReduce, the processing framework atop HDFS, powered the first wave of big data, but as the world moves toward real-time, batch processing remains helpful yet rarely sufficient for a modern enterprise.

In-Memory Speeds and Distributed Scale

Between these ends of the spectrum lies an opportunity: deliver in-memory capabilities on a distributed architecture of commodity hardware, accessible to all.

The computing theme of this century is piles of smaller servers or cloud instances, directed by clever new software, relentlessly overtaking use-cases that were previously the domain of big iron. Hadoop proved that “big data” doesn’t mean “big iron.” The trend now continues with in-memory.

Moving To Converged Transactions and Analytics

At the heart of the in-memory shift is the convergence of transactions and analytics into a single system, something Gartner refers to as hybrid transaction/analytical processing (HTAP).

In-memory capabilities make HTAP possible. But data growth means systems must scale. Easily adding servers or cloud instances to a distributed solution lets companies meet capacity increases and store their highest-value, most active data in memory.
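The convergence described above can be sketched in plain SQL. The point of HTAP is that the same table serves both the transactional write path and the analytical read path, with no ETL hop into a separate warehouse. The table and column names below are hypothetical, chosen only for illustration:

```sql
-- Illustrative sketch: one table, two workloads (names are hypothetical).
CREATE TABLE orders (
    order_id    BIGINT PRIMARY KEY,
    customer_id BIGINT,
    amount      DECIMAL(10, 2),
    created_at  TIMESTAMP
);

-- Transactional write path: individual orders land as they occur.
INSERT INTO orders VALUES (1001, 42, 19.99, CURRENT_TIMESTAMP);

-- Analytical read path: aggregate the same live table,
-- so dashboards reflect transactions the moment they commit.
SELECT customer_id, SUM(amount) AS total_spend
FROM orders
GROUP BY customer_id
ORDER BY total_spend DESC;
```

In a traditional architecture, the second query would run against a warehouse loaded by nightly batch jobs; in an HTAP system it runs against the operational data directly.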

But an all-memory, all-the-time solution might not be right for everyone. That is where combining all-memory and disk-based stores within a single system fits. A tiered architecture provides infrastructure consolidation and low-cost expansion for high-value, less active data.

Finally, ecosystem integration makes data pipelines simple, whether that includes loading directly from HDFS or Amazon S3, running a high-performance connector to Apache Spark, or just building upon a foundational programming language like SQL.
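As a sketch of what such pipeline simplicity looks like in practice, a bulk load can be a single SQL statement. The example below uses generic MySQL-style `LOAD DATA` syntax; the exact statement, file path, and table name are assumptions for illustration, and real products (including loading directly from HDFS or S3) each have their own variants:

```sql
-- Illustrative MySQL-style bulk load (path and table name are hypothetical).
LOAD DATA INFILE '/data/events/clickstream.csv'
INTO TABLE events
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
```

Because the load target is an ordinary SQL table, the data is immediately queryable by any SQL client once the statement completes.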

SQL-based solutions can provide immediate utility across large parts of enterprise organizations. The familiarity and ubiquity of the programming language means access to real-time data via SQL becomes a fast path to real-time dashboards, real-time applications, and an immediate impact.

Related Links:

Learn how HTAP remedies the four drawbacks of traditional systems here.

Want to learn more about in-memory databases and opportunities with HTAP? Take a look at the recent Gartner report here.

If you’re interested in test driving an in-memory database that offers the full benefits of HTAP, give MemSQL a try for free, or give us a ring at (855) 463-6775.