Engineering

RBAC security

MemSQL 5.1 Enhances Security for Real-Time Enterprises

Enterprises seek real-time data and analytics solutions to stay current in competitive, fast-evolving markets. Companies dealing in private information, such as healthcare organizations, financial institutions, and the public sector, have historically been limited in their pursuit of real-time results, given stringent security requirements. Today, we announce the availability of MemSQL 5.1. This release adds Role-Based Access Control (RBAC) to the already powerful MemSQL 5, unlocking the gateway...


real-time monitoring

Monitoring A/B Experiments In Real Time

This post originally appeared on the Pinterest Engineering Blog by Bryant Xiao. As a data-driven company, we rely heavily on A/B experiments to make decisions on new products and features. How efficiently we run these experiments strongly affects how fast we can iterate. By providing experimenters with real-time metrics, we increase our chances of running experiments successfully and moving faster. We have daily workflows to compute hundreds of metrics for each experiment. While these daily metrics...


Should You Use a Rowstore or a Columnstore?

This is a repost of an article by Ankur Goyal, VP of Engineering, published on Medium. The terms rowstore and columnstore have become household names for database users. The general consensus is that rowstores are superior for online transaction processing (OLTP) workloads and columnstores are superior for online analytical processing (OLAP) workloads. This is close but not quite right; we’ll dig into why in this article and provide a more fundamental way to reason about when...


dbBench

dbbench: Bringing Active Benchmarking to Databases

In my last blog post, I investigated a Linux performance issue affecting a specific customer workload. In this post, I will introduce the tool I created to drive that investigation. Recently, a customer was running a test where data was loaded into MemSQL via LOAD DATA. The customer’s third-party benchmarking tool found that MemSQL took twice as long to load the same amount of data as a competing database; however, the numbers reported by this tool did not make sense. Local tests had shown...


Investigating Linux Performance

Investigating Linux Performance with Off-CPU Flame Graphs

The Setup: As a performance engineer at MemSQL, one of my primary responsibilities is to ensure that customer Proofs of Concept (POCs) run smoothly. I was recently asked to assist with a big POC, where I was surprised to encounter an uncommon Linux performance issue. I was running a synthetic workload of 16 threads (one for each CPU core). Each one simultaneously executed a very simple query (select count(*) from t where i > 5) against a columnstore table. In theory, this ought to be a CPU...


Streamliner Python

Introducing a Performance Boost for Spark SQL, Plus Python Support

This month’s MemSQL Ops release includes performance features for Streamliner, our integrated Apache Spark solution that simplifies creation of real-time data pipelines. Specific features in this release include the ability to run Spark SQL inside of the MemSQL database, in-browser Python programming, and NUMA-aware deployments for MemSQL. We sat down with Carl Sverre, MemSQL architect and technical lead for Ops development, to talk about the latest release. Q: What’s the coolest thing...


Oracle and MemSQL Together

Using Oracle and MemSQL Together

Photo: Martin Taylor. We often hear “How can I use MemSQL together with my Oracle database?” As a relational database, MemSQL is similar to an Oracle database, and can serve as an alternative to Oracle in certain scenarios. Here is what sets MemSQL apart: MemSQL is a distributed system, designed to run on multiple machines with a massively parallel processing architecture. An Oracle database, on the other hand, resides on a single large machine or a smaller cluster of fixed size. MemSQL has...


Technical Deep Dive into MemSQL Streamliner

MemSQL Streamliner, an open source tool available on GitHub, is an integrated solution for building real-time data pipelines using Apache Spark. With Streamliner, you can stream data from real-time data sources (e.g. Apache Kafka), perform data transformations within Apache Spark, and ultimately load data into MemSQL for persistence and application serving. Streamliner is a great tool for developers and data scientists since little to no code is required – users can instantly build their...


Building an Infinitely Scalable Testing System

Quality needs to be architected like any other feature in enterprise software. At MemSQL, we build test systems so we can ship new releases as often as possible. In the software world, continuous testing allows you to make tiny changes along the way and keep innovating quickly. Such continuous testing is an essential task—and on top of that, we compete with large companies and their armies of manual testers. Instead of hiring hordes of testers, we decided to build infinitely scalable test...


Making Painless Schema Changes

The ability to change a table’s schema without downtime in production is a critical feature of any database system. In spite of this, many traditional relational databases have poor support for it. Quick and easy schema changes were a key advantage of early distributed NoSQL systems, but of course, those systems jettison relational capabilities. Though conventional wisdom may indicate otherwise, easy schema changes are possible with the relational model. At MemSQL we put careful thought...