Engineering

Introducing a Performance Boost for Spark SQL, Plus Python Support

This month’s MemSQL Ops release includes performance features for Streamliner, our integrated Apache Spark solution that simplifies creation of real-time data pipelines. Specific features in this release include the ability to run Spark SQL inside of the MemSQL database, in-browser Python programming, and NUMA-aware deployments for MemSQL. We sat down with Carl Sverre, MemSQL architect and technical lead for Ops development, to talk about the latest release. Q: What’s the coolest thing...


Using Oracle and MemSQL Together

Photo: Martin Taylor

We often hear “How can I use MemSQL together with my Oracle database?” As a relational database, MemSQL is similar to an Oracle database, and can serve as an alternative to Oracle in certain scenarios. Here is what sets MemSQL apart: MemSQL is a distributed system, designed to run on multiple machines with a massively parallel processing architecture. An Oracle database, on the other hand, resides on a single large machine or a smaller cluster of fixed size. MemSQL has...


Technical Deep Dive into MemSQL Streamliner

MemSQL Streamliner, an open source tool available on GitHub, is an integrated solution for building real-time data pipelines using Apache Spark. With Streamliner, you can stream data from real-time data sources (e.g. Apache Kafka), perform data transformations within Apache Spark, and ultimately load data into MemSQL for persistence and application serving. Streamliner is a great tool for developers and data scientists since little to no code is required – users can instantly build their...


Building an Infinitely Scalable Testing System

Quality needs to be architected like any other feature in enterprise software. At MemSQL, we build test systems so we can ship new releases as often as possible. In the software world, continuous testing allows you to make tiny changes along the way and keep innovating quickly. Such continuous testing is an essential task—and on top of that, we compete with large companies and their armies of manual testers. Instead of hiring hordes of testers, we decided to build infinitely scalable test...


Making Painless Schema Changes

The ability to change a table’s schema without downtime in production is a critical feature of any database system. In spite of this, many traditional relational databases have poor support for it. Quick and easy schema changes were a key advantage of early distributed NoSQL systems, but of course, those systems jettison relational capabilities. Though conventional wisdom may indicate otherwise, easy schema changes are possible with the relational model. At MemSQL we put careful thought...


How to Write Compilers in Modern C++ - Meetup with Drew Paroski

Visit our SoMa headquarters this Wednesday, August 19th for our third official meetup, from 6pm-8pm! This is an exclusive opportunity to learn the art of building compilers from Drew Paroski. Before joining MemSQL, Drew co-created the HipHop Virtual Machine (HHVM) and Hack programming language to support Facebook’s web scale across a growing user base in the billions. Read more about Drew here: http://blog.memsql.com/creator-of-hhvm-joins-memsql/. We will have a delicious Mexican feast...


How to Deploy MemSQL on the Mesosphere DCOS

The Mesosphere Datacenter Operating System (DCOS) is a distributed operating system designed to span all machines in a datacenter. It provides mechanisms for deploying applications across the entire system with a few simple commands. MemSQL is a great fit for deployment on DCOS because of its distributed, memory-optimized design. For example, users can scale computation and storage capacity by simply adding nodes. MemSQL deploys across commodity hardware and cloud, giving users the flexibility...


The Resurgence of Scala for Big Data

Big Data Scala by the Bay, Aug 16-18, is shaping up to be an engaging event, and will bring together top data engineers, data scientists, developers, and data managers who use the Scala language to build big data pipelines. At the MemSQL booth, we will showcase how enterprises can streamline this process by building their own real-time data pipelines using Apache Kafka, Apache Spark and operational databases. Many of our customers are moving to this real-time data pipeline: a simplified Lambda...


Download the New and Improved MemSQL Ops

The latest release of MemSQL Ops – version 4.0.34 – is now available for download! In this release, we are offering MemSQL users new features to accelerate productivity. Download MemSQL Ops to get up and running on MemSQL Community Edition or MemSQL Enterprise Edition today. MemSQL Ops downloads and upgrades are available for free to all MemSQL Community and Enterprise users. Here are some of the features in the new MemSQL Ops release:

Ops Superusers

The new MemSQL Ops comes with...


How to Make a Believable Benchmark

Albrecht Dürer, ‘Man drawing a Lute’, woodcut, 1525

A benchmark asks a specific question, makes a guess about the expected result, and confirms or denies it with experiment. If it compares anything, it compares like to like and discloses enough details so that others can plausibly repeat it. If your benchmark does not do all of these things, it is not a benchmark. Today’s question comes from one of our engineers, who was talking to a customer about new features in MemSQL 4. We...