Apache Spark

Video: Building the Ideal Stack for Real-Time Analytics

Building a real-time application starts with connecting the pieces of your data pipeline. To make fast and informed decisions, organizations need to rapidly ingest application data, transform it into a digestible format, store it, and make it easily accessible, all at sub-second speed. A typical real-time data pipeline is architected as follows: Application data is ingested through a distributed messaging system to capture and publish feeds. A transformation tier is called to distill...


Manage Case Study

How Manage Accelerated Data Freshness by 10x

Success in the mobile advertising industry is achieved by delivering contextual ads in the moment. The faster and more personalized a display ad, the better. Any delay in ad delivery means lost bids, revenue, and ultimately, customers. Manage, a technology company specializing in programmatic mobile marketing and advertising, helps drive mobile application adoption for companies like Uber, Wish, and Amazon. In a single day, Manage generates more than a terabyte of data and processes more than 30...


PowerStream

Using MemSQL and Spark for Machine Learning

At Spark Summit in San Francisco, we highlighted our PowerStream showcase application, which processes and analyzes data from over 2 million sensors on 200,000 wind turbines installed around the world. We sat down with one of our PowerStream engineers, John Bowler, to discuss his work on our integrated MemSQL and Apache Spark solutions. What is the relationship between MemSQL and Spark? At its core, MemSQL is a database engine, and Spark is a powerful option for writing code to transform data....


MemSQL Guide to Spark Summit 2016

Spark Summit 2016 kicks off this week with more than 90 sessions and five tracks to choose from in the heart of San Francisco. The three-day marathon of learning, which includes an entire day dedicated to Spark Training, attracts more than 2,500 engineers, business professionals, scientists, and analytics enthusiasts from across the country. https://spark-summit.org/2016 Hilton San Francisco, 333 O’Farrell St, San Francisco, CA 94102 USA. Throughout the show, speakers will address the different...


Spark Summit East 2016

Real-Time Solutions Take Center Stage at Spark Summit East 2016

We spent last week in New York at Spark Summit East talking with the visionaries and data architects using Apache Spark. PowerStream Demo: At the show, we introduced PowerStream, an Internet of Things (IoT) showcase application with visualizations and alerts based on data from 2 million sensors across global wind farms. PowerStream ingests that data and provides actionable insights in real time, giving users a glimpse of how the future of sustainability can be fully realized by adapting data to...


Streamliner Python

Introducing a Performance Boost for Spark SQL, Plus Python Support

This month’s MemSQL Ops release includes performance features for Streamliner, our integrated Apache Spark solution that simplifies creation of real-time data pipelines. Specific features in this release include the ability to run Spark SQL inside of the MemSQL database, in-browser Python programming, and NUMA-aware deployments for MemSQL. We sat down with Carl Sverre, MemSQL architect and technical lead for Ops development, to talk about the latest release. Q: What’s the coolest thing...


Top Spark Summit Questions

Top 5 Questions Answered at Spark Summit

The MemSQL team enjoyed sponsoring and attending Spark Summit last week, where we spoke with hundreds of developers, data scientists, and architects all getting a better handle on modern data processing technologies like Spark and MemSQL. After a couple of days on the expo floor, I noticed several common questions. Below are some of the most frequent questions and answers exchanged in the MemSQL booth. 1. When should I use MemSQL? MemSQL shines in use cases requiring analytics on a changing...


Enterprise Apache Spark

Harnessing the Enterprise Capabilities of Spark

As more developers and data scientists try Apache Spark, they ask questions about persistence, transactions and mutable data, and how to deploy statistical models in production. To address some of these questions, our CEO Eric Frenkiel recently wrote an article for Data Informed explaining key use cases that integrate MemSQL and Spark to drive concrete business value. The article explains how you can combine MemSQL and Spark for applications like stream processing, advanced analytics, and...


In-Memory and Apache Spark

Video: The State of In-Memory and Apache Spark

Strata+Hadoop World was full of activity for MemSQL. Our keynote explained why real-time is the next phase for big data. We showcased a live application with Pinterest, where they combine Spark and MemSQL to ingest and analyze real-time data. And we gave away dozens of prizes to Strata+Hadoop attendees who proved their latency-crushing skills in our Query Kong game. During the event, Mike Hendrickson of O’Reilly Media sat down with MemSQL CEO Eric Frenkiel to discuss: The state of in-memory...