real-time

Technical Deep Dive into MemSQL Streamliner

MemSQL Streamliner, an open source tool available on GitHub, is an integrated solution for building real-time data pipelines using Apache Spark. With Streamliner, you can stream data from real-time data sources (e.g. Apache Kafka), perform data transformations within Apache Spark, and ultimately load data into MemSQL for persistence and application serving. Streamliner is a great tool for developers and data scientists since little to no code is required – users can instantly build their...


Q/A with Dan McCaffrey, VP Analytics, Teespring

One of the best parts of being a startup is getting to work with other startups – companies experiencing rapid growth get to share ideas and technology, and fortify the startup ecosystem. Yesterday we announced a new customer who exemplifies this: Teespring! Teespring is an on-demand ecommerce company that gives budding entrepreneurs the opportunity to build and grow their apparel businesses via its platform. Teespring uses MemSQL to power its sales analytics platform, which analyzes data in...


Geospatial Real-Time Apps

Locate This! The Battle for App-specific Maps

In early August, a consortium of the largest German automakers including Audi, BMW, and Daimler (Mercedes) purchased Nokia’s Here mapping unit, the largest competitor to Google Maps, for $3 billion. It is no longer easy to get lost. Quite the opposite, we expect and rely on maps for our most common Internet tasks from basic directions to on-demand transportation, discovering a new restaurant or finding a new friend. And the battle is on between the biggest public and private companies in the...


Real-Time Database

What We Talk About When We Talk About Real-Time

The phrase “real-time,” like love, means different things to different people. At its most basic, the term implies near simultaneity. However, the amount of time that constitutes the “real-time window” differs across industries, professions, and even organizations. Definitions vary, and the term is so often (ab)used by marketers and analysts that some dismiss “real-time” as a meaningless buzzword. However, there is an important distinction between “real-time” and “what we have...


Driving Relevance with Real-Time and Historical Data

As technology weaves into our daily lives, our expectations of it continue to increase. Consider mobile devices and location information. Recently 451 Research released data showing that 47% of consumers would like to receive personalized information based on immediate location. Addressing this requires the ability to track real-time and historical data and to put both in context. Let’s examine that spectrum. Incoming High Value Content: With a focus on ‘immediate,’ the highest...


Lambda Architecture

Real-Time Stream Processing Architecture with Hadoop and MemSQL

With Hadoop Summit Europe underway today, we wanted to share some thoughts on how MemSQL fits into the Hadoop ecosystem. While MemSQL and Hadoop are both data stores, they fill different roles in the data processing and analytics stack. The Hadoop Distributed File System (HDFS) enables businesses to store large volumes of immutable data, but by design, it is used almost exclusively for batch processing. Moreover, newer execution frameworks that are faster and storage agnostic are...


high-speed-counters

Turn Up the Volume With High-Speed Counters

Scaling tends to make even simple things, like counting, seem difficult. In the past, businesses used specialized databases for particular tasks, including high-speed, high-throughput event counters. Due to the constraints of legacy systems, some people still assume that relational databases cannot handle high-throughput tasks at scale. However, due to advances like in-memory storage, high-throughput counting no longer requires a specialized, single-purpose database. Why do we even need...


Operationalize Spark

Operationalizing Spark with MemSQL

In Short: Combining the data processing prowess of Spark with a real-time database for transactions and analytics, where both are memory-optimized and distributed, leads to powerful new business use cases. Links to the MemSQL Spark Connector are at the end of this post. Data Appetite and Evolution: Our generation of, and appetite for, data continues unabated. This drives a critical need for tools to quickly process and transform data. Apache Spark, the new memory-optimized data processing framework, fills this...


MemSQL Spark Connector

Run Real-Time Applications with Spark and the MemSQL Spark Connector

Apache Spark is one of the most powerful distributed computing frameworks available today. Its combination of fast, in-memory computing with an architecture that’s easy to understand has made it popular for users working with huge amounts of data. While Spark shines at operating on large datasets, it still requires a solution for data persistence. HDFS is a common choice, but while it integrates well with Spark, its disk-based nature can impact performance in real-time applications (e.g....