Apache Spark Resources

There’s no doubt about it. Apache Spark is well on its way to becoming a ubiquitous technology. Over the past year, we’ve created resources to help our users understand the real-world use cases for Spark and to showcase how our technologies complement one another. Now, we’ve organized and consolidated those materials into this post.


Pinterest Measures Real-Time User Engagement with Spark
Demo of a real-time data pipeline that processes and analyzes re-pins across the United States.

The State of In-Memory and Apache Spark
Interview with MemSQL CEO and Co-Founder, Eric Frenkiel, describing how a memory-optimized solution made the difference for Pinterest.

Keynote: Close Encounters with the Third Kind of Database
Strata+Hadoop World keynote presentation from MemSQL CEO, Eric Frenkiel, outlining the business case for Spark and MemSQL.

Strata+Hadoop Session: Bringing OLAP Fully Online
How to analyze changing datasets in MemSQL and Spark with a demo from Pinterest.

Blog Posts and Articles

Real-Time Analytics at Pinterest
More about real-time analytics at Pinterest from the official Pinterest engineering blog.

How Pinterest Measures Real-Time User Engagement with Spark
Learn how Pinterest is using Spark Streaming and MemSQL to find patterns in high-value user engagement data in this blog post.

Apache Kafka + Spark + Database = Real-Time Trinity
Read this article from The New Stack to learn how to build real-time data pipelines with Kafka, Spark, and MemSQL.

Harnessing the Enterprise Capabilities of Spark
Learn how to combine MemSQL and Spark for applications like stream processing and advanced analytics to increase business efficiency and revenue in this data-informed article.

Using Apache Spark in the Enterprise
In this insideBIGDATA article, Eric Frenkiel champions the use of Spark to achieve the promise of real-time analytics.

Extending MemSQL Analytics with Spark
Learn how the MemSQL Spark Connector integrates operational data with advanced analytics in this Databricks blog post.

Operationalizing Spark with MemSQL
This blog post highlights the shared characteristics of MemSQL and Spark and provides real-world use cases for using the two technologies in concert.


From Spark to Ignition: Fueling Your Business on Real-Time Analytics
View slides from our Strata+Hadoop World session on the enterprise application of Spark for real-time analytics.


At Spark Summit, we launched a simulation built on “the real-time trinity” of Kafka, Spark, and MemSQL. We call it MemCity, and we use it to model energy consumption across 1.4 million homes in a futuristic city, isolating usage trends by volume, hour of the day, and location. Read more about this application of Spark technology on the MemSQL blog.
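To make the idea of isolating usage by hour of the day and location concrete, here is a minimal sketch in plain Python. It is not MemCity's code and does not use Kafka, Spark, or MemSQL; the sample readings, the field layout, and the `usage_by_hour_and_district` helper are all made up for illustration. It only shows the shape of the rollup such a pipeline would compute at scale.

```python
from collections import defaultdict

# Hypothetical sample readings: (home_id, district, hour_of_day, kWh).
# Values and field names are invented for illustration only.
readings = [
    ("home-1", "north", 8, 1.2),
    ("home-2", "north", 8, 0.8),
    ("home-3", "south", 8, 2.0),
    ("home-1", "north", 19, 3.0),
    ("home-3", "south", 19, 2.5),
]


def usage_by_hour_and_district(rows):
    """Sum kWh for each (hour_of_day, district) pair."""
    totals = defaultdict(float)
    for _home_id, district, hour, kwh in rows:
        totals[(hour, district)] += kwh
    return dict(totals)


totals = usage_by_hour_and_district(readings)

# Print one line per (hour, district) group, ordered by hour then district.
for (hour, district), kwh in sorted(totals.items()):
    print(f"{hour:02d}:00 {district:<5} {kwh:.1f} kWh")
```

In a streaming setup, the same grouping would typically run continuously over incoming events rather than over a fixed list.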

ESRI with MemSQL Geospatial Intelligence
Working with MemSQL and Spark, Esri, the leading provider of geographic information systems, analyzed data from millions of taxi rides in New York to identify trends that can inform urban planning. Learn more about the project here: NYC taxi data can drive smarter urban planning.



MemSQL Spark Connector
The MemSQL Spark Connector provides everything you need to start using Spark and MemSQL together. Download now on GitHub.


To get started building your own real-time applications with Spark and MemSQL, download MemSQL Community Edition today.

Get The MemSQL Spark Connector Guide

The 79-page guide covers how to design, build, and deploy Spark applications using the MemSQL Spark Connector. Inside, you will find code samples to help you get started and performance recommendations for your production-ready Apache Spark and MemSQL implementations.
Download Here