Modern Database Characteristics

Many legacy database systems are not equipped for modern applications. Near-ubiquitous connectivity drives high-velocity, high-volume data workloads – think smartphones, connected devices, sensors – and brings a unique set of data management requirements. As the number of connected applications grows, businesses turn to in-memory solutions built to ingest and serve data simultaneously.

To support such workloads successfully, database systems must have the following characteristics:

Ingest and Process Data in Real Time
Historically, the lag time between ingesting data and understanding that data has been hours to days. Now, companies require data access and exploration in real time to meet consumer expectations.

Subsecond Response Times
As organizations supply access to fresh data, demand for access rises from hundreds to thousands of analysts. Serving this workload requires memory-optimized systems that process transactions and analytics concurrently.

Anomaly Detection as Events Occur
Reaction time to an irregular event often correlates with a business’s financial health. The ability to detect an anomaly as it happens helps companies avoid massive losses and capitalize on opportunities.
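
As a rough illustration of detecting an anomaly as each event arrives, the sketch below keeps a running mean and variance (Welford's method) and flags any value that sits more than a few standard deviations from what has been seen so far. The class name, the threshold of 3, and the warm-up of 5 events are assumptions chosen for the example, not part of any product described here.

```python
import math


class StreamingAnomalyDetector:
    """Flags values that deviate sharply from the running mean of prior events."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # z-score above which a value is flagged (assumed)
        self.warmup = warmup        # events to observe before flagging anything (assumed)
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations from the mean

    def observe(self, value):
        """Score the value against prior events, then fold it into the statistics."""
        is_anomaly = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                is_anomaly = True
        # Welford's incremental update: constant work per event, no stored history.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return is_anomaly


detector = StreamingAnomalyDetector()
events = [100, 102, 98, 101, 99, 100, 500, 101]
flags = [detector.observe(v) for v in events]
```

Here only the seventh event (500) is flagged. Because each event is scored with constant work at the moment it arrives, the check can run inline with ingestion instead of in a later batch job.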

Generate Reports Over Changing Datasets
Today, companies expect analytics to run on changing datasets, where results are accurate to the last transaction. This real-time query capability has become a base requirement for modern workloads.
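
To make "accurate to the last transaction" concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for a memory-optimized system. The `orders` table and the helper functions are hypothetical; the point is only that a report run immediately after a commit already reflects that commit.

```python
import sqlite3

# In-memory SQLite is used purely as an illustrative stand-in.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)


def record_order(region, amount):
    """Commit a single transaction so it is visible to the next report."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders (region, amount) VALUES (?, ?)", (region, amount)
        )


def revenue_by_region():
    """Run the report against everything committed so far."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
    return dict(rows.fetchall())


record_order("east", 10.0)
record_order("west", 5.0)
first_report = revenue_by_region()

record_order("east", 2.5)            # one more transaction arrives
second_report = revenue_by_region()  # the report already includes it
```

The second report differs from the first only by the order that arrived in between, with no batch load or refresh step separating the write from the read.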

Real-Time Use Cases

Today, companies are using in-memory solutions to meet these requirements. Here are a few examples:

Pinterest: Real-Time Analytics
Pinterest built a real-time data pipeline to ingest data into MemSQL using Spark Streaming. In this workflow, every Repin is filtered and enriched by adding geolocation and Repin category information. Enriched data is persisted to MemSQL and made available for query serving. This helps Pinterest build a better recommendation engine for showing Repins and enables their analysts to use a familiar SQL interface to explore real-time data and derive insights.
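
The filter, enrich, persist, and query steps described above can be sketched in a few lines. This is not Pinterest's code: it uses plain Python in place of Spark Streaming and an in-memory `sqlite3` database in place of MemSQL, and the event fields, lookup tables, and function names are all hypothetical.

```python
import sqlite3

# Hypothetical lookup tables; a real pipeline would call geolocation and
# categorization services instead.
IP_TO_COUNTRY = {"203.0.113.7": "US", "198.51.100.4": "JP"}
BOARD_TO_CATEGORY = {"weeknight-dinners": "food", "tiny-houses": "architecture"}


def is_valid(event):
    """Filter step: keep only Repin events whose location can be resolved."""
    return event.get("type") == "repin" and event.get("ip") in IP_TO_COUNTRY


def enrich(event):
    """Enrich step: attach geolocation and category information."""
    return {
        "pin_id": event["pin_id"],
        "country": IP_TO_COUNTRY[event["ip"]],
        "category": BOARD_TO_CATEGORY.get(event["board"], "unknown"),
    }


def run_pipeline(events, conn):
    """Persist step: store enriched rows so they can be queried with SQL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS repins "
        "(pin_id INTEGER, country TEXT, category TEXT)"
    )
    rows = [enrich(e) for e in events if is_valid(e)]
    with conn:
        conn.executemany(
            "INSERT INTO repins VALUES (:pin_id, :country, :category)", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
events = [
    {"type": "repin", "pin_id": 1, "ip": "203.0.113.7", "board": "weeknight-dinners"},
    {"type": "click", "pin_id": 2, "ip": "203.0.113.7", "board": "tiny-houses"},
    {"type": "repin", "pin_id": 3, "ip": "198.51.100.4", "board": "tiny-houses"},
    {"type": "repin", "pin_id": 4, "ip": "192.0.2.1", "board": "tiny-houses"},
]
stored = run_pipeline(events, conn)

# Query-serving step: analysts explore the enriched data with ordinary SQL.
by_category = dict(
    conn.execute("SELECT category, COUNT(*) FROM repins GROUP BY category")
)
```

Of the four sample events, the click and the Repin with an unresolvable address are filtered out, so two enriched rows are stored and are immediately available to the SQL query.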

Tapjoy: Personalization
Tapjoy optimizes ad performance by taking advantage of the speed and scalability of in-memory computing. With the processing power to run 60,000 queries per second at a response time of less than ten milliseconds, Tapjoy is able to cross-reference user data and serve higher-performing ads to more than 500 million global users.

David Abercrombie, Data Analytics Engineer, details Tapjoy’s database architecture and application in this recorded session from the 2015 In-Memory Computing Summit:

Novus: Portfolio Management
Novus supports more than 100 of the world’s top investment firms, helping users understand investment strengths and risks by providing a “moneyball-like” view of investment data. By taking advantage of a memory-optimized database system, Novus can deliver instant answers to hundreds of analysts querying their dataset.

Noah Zucker, Vice President, shares how Novus built a scalable portfolio investment platform in this video:

Concluding Thoughts

As more data comes online, organizations will rush to build systems that can rapidly ingest data while simultaneously making it accessible for analysis. To help you get there successfully, we teamed up with O’Reilly Media to publish an ebook on Building Real-Time Data Pipelines through In-Memory Architectures. Download it for free here:

O’Reilly Ebook: Building Real-Time Data Pipelines