Engineering

Psyduck: The MemSQL Journey to Containers

One of the main themes at DockerCon 2017 was the challenge of migrating legacy applications to containers. At MemSQL, we’re early adopters. We are already into our third year of running Docker at scale in production for our distributed software testing regime, where the performance, isolation, and cost benefits of containers are very attractive. The Challenge: Before I take you through our journey to containers, let me start by outlining some of the general challenges of testing a distributed...


The Curious Case of Thread Group Identifiers

At MemSQL, we are out to build awesome software and we’re always trying to solve hard problems. A few days ago, I uncovered a cool Linux mystery with some colleagues and fixed it. We thought sharing that experience might benefit others. The scene of the crime: While developing an internal tool to get stack traces, we decided to use the SYS_tgkill Linux system call to send signals to specific threads. The tgkill syscall sends a signal to a specific thread based on its “thread group”...
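
To make the tgkill behavior described in this excerpt concrete, here is a minimal Python sketch. It is an illustration only, not the internal tool from the post: the helper names `tgkill` and `thread_exists` are hypothetical, the syscall is reached through `ctypes`, and the syscall-number table covers only 64-bit x86 and ARM Linux. It sends signal 0, which makes the kernel validate the target thread without delivering anything.

```python
import ctypes
import errno
import os
import platform
import threading

# Syscall numbers differ per architecture. Only these two are listed here
# (an assumption of this sketch); other platforms raise NotImplementedError.
SYS_TGKILL = {"x86_64": 234, "aarch64": 131}

libc = ctypes.CDLL(None, use_errno=True)


def tgkill(tgid, tid, sig):
    """Send `sig` to thread `tid`, but only if it belongs to thread group `tgid`."""
    number = SYS_TGKILL.get(platform.machine())
    if number is None:
        raise NotImplementedError("tgkill number not listed for this architecture")
    args = [ctypes.c_long(v) for v in (number, tgid, tid, sig)]
    if libc.syscall(*args) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def thread_exists(tgid, tid):
    """Probe with signal 0: the kernel checks the target but delivers nothing."""
    try:
        tgkill(tgid, tid, 0)
        return True
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            return False
        raise


if __name__ == "__main__":
    # The process ID is the thread group ID; the native ID is the kernel thread ID.
    print(thread_exists(os.getpid(), threading.get_native_id()))
```

Passing both IDs is the point of the call: if the thread ID has been recycled by a thread in another process, the thread group check fails with ESRCH instead of signaling the wrong target.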


Arrays - A Hidden Gem in MemSQL

Released this March, MemSQL 6 Beta 1 introduced MemSQL Procedural SQL (MPSQL). MPSQL supports the creation of User-Defined Functions (UDFs), Stored Procedures (SPs), Table-Valued Functions (TVFs), and User-Defined Aggregate Functions (UDAFs). A Hidden Gem: Array Types. There’s a hidden gem in MemSQL 6 Beta 1 that we didn’t document at first: array types! These make programming much more convenient. Since we compile your extensions to machine code, the performance is fantastic. And you...


ArcGIS, Spark & MemSQL Integration

This is a guest post by Mansour Raad of Esri. We were fortunate to catch up with him at Strata+Hadoop World San Jose. This post is replicated from Mansour’s Thunderhead Explorer blog. ArcGIS, Spark & MemSQL Integration: Just got back from the fantastic Strata + Hadoop 2017 conference, where the topics ranged from Big Data and Spark to lots of AI/ML, and not so much on Hadoop explicitly, at least not in the sessions that I attended. I think that is why the conference is renamed Strata + Data from...


Everything We’ve Known About Data Movement Has Been Wrong

Data movement remains a perennial obstacle in systems design. Many talented architects and engineers spend significant amounts of time working on data movement, often in the form of batch Extract, Transform, and Load (ETL). In general, batch ETL is the process everyone loves to hate, or put another way, I’ve never met an engineer happy with their batch ETL setup. In this post, we’ll look at the shift from batch to real time, the new topologies required to keep up with data flows, and the...


MemSQL Opens New Office in Second Tech Hub: Seattle, WA

Behind the scenes of the world’s leading companies in finance, retail, media, and energy, sits MemSQL – the operational data warehouse powering real-time data ingest and analytics. At MemSQL, hiring exceptional talent drives innovation in real-time technology and enables us to advance the state of the art in databases. We hire top engineers from prestigious universities such as MIT, Stanford, and Carnegie Mellon University, as well as companies like Facebook, Microsoft, Oracle and...


MemSQL Pipelines

MemSQL Pipelines: Real-Time Data Ingestion with Exactly-Once Semantics

Today we launched MemSQL 5.5 featuring MemSQL Pipelines, a new way to achieve maximum performance for real-time data ingestion at scale. This implementation enables exactly-once semantics when streaming from message brokers such as Apache Kafka. An end-to-end real-time analytics data platform requires real-time analytical queries and real-time ingestion. However, it is rare to find a data platform that satisfies both of these requirements. With the launch of MemSQL Pipelines as a native feature...
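
One common way to get the exactly-once behavior mentioned in this excerpt is to commit ingested rows and the consumer offset in a single transaction, so a redelivered message can be recognized and skipped. The sketch below illustrates that general technique with SQLite from the Python standard library. It is not MemSQL Pipelines' implementation, and the table names, the `demo-topic` source, and the helpers `make_store` and `ingest_batch` are all hypothetical.

```python
import sqlite3

SOURCE = "demo-topic"  # hypothetical stream name


def make_store():
    """Create an in-memory store with a data table and an offset table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE events (payload TEXT)")
    db.execute("CREATE TABLE progress (source TEXT PRIMARY KEY, next_offset INTEGER)")
    db.execute("INSERT INTO progress VALUES (?, 0)", (SOURCE,))
    db.commit()
    return db


def ingest_batch(db, batch):
    """Ingest (offset, payload) pairs sorted by offset; return how many were new.

    The batch may contain messages that were already ingested, as happens
    when a broker redelivers after a failure.
    """
    with db:  # rows and offset commit together, or roll back together
        (next_offset,) = db.execute(
            "SELECT next_offset FROM progress WHERE source = ?", (SOURCE,)
        ).fetchone()
        fresh = [(off, data) for off, data in batch if off >= next_offset]
        if fresh:
            db.executemany(
                "INSERT INTO events VALUES (?)", [(data,) for _, data in fresh]
            )
            db.execute(
                "UPDATE progress SET next_offset = ? WHERE source = ?",
                (fresh[-1][0] + 1, SOURCE),
            )
    return len(fresh)
```

Replaying a batch that overlaps an earlier one, for example offsets 1 and 2 after offsets 0 and 1 were committed, inserts only the message at offset 2.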


BPF Linux Performance

What is BPF and why is it taking over Linux Performance Analysis?

Performance analysis often gets bottlenecked by lack of visibility. At MemSQL, we architected our database to easily observe its inner workings. Observability allows our engineers to easily identify components that need to be faster. Faster components mean our database’s performance skyrockets. These tools also enable support engineers to react quickly and precisely to customer needs. In the spirit of using the best available tools to which we have access, the performance team is currently...


MemSQL Performance Benchmark

New Performance Benchmark for Live Dashboards and Fast Updates

Newest Upsert Benchmark showcases critical use case for internet billing with telcos, ISPs, and CDNs. MemSQL achieves 7.9 million upserts per second, 6x faster than Cassandra. Benchmark details and scripts now available on GitHub. The business need for fast updates and live dashboards: Businesses want insights from their data and they want it sooner rather than later. For fast-changing data, companies must rapidly glean insights in order to make the right decisions. Industry applications like IoT...
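
For readers unfamiliar with the term, an upsert inserts a row if its key is new and updates the existing row otherwise. The sketch below shows the idea for a billing-style usage counter. It uses SQLite from the Python standard library purely as a stand-in: it is neither MemSQL syntax nor the benchmark's code, and the `traffic` table and helper names are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory table with one usage counter per customer."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE traffic (customer TEXT PRIMARY KEY, bytes_used INTEGER NOT NULL)"
    )
    return db


def record_usage(db, customer, nbytes):
    """Upsert: insert the first reading, or add to the existing counter."""
    with db:
        db.execute(
            "INSERT INTO traffic (customer, bytes_used) VALUES (?, ?) "
            "ON CONFLICT(customer) DO UPDATE "
            "SET bytes_used = bytes_used + excluded.bytes_used",
            (customer, nbytes),
        )


def usage(db, customer):
    """Return the running total for a customer, or 0 if none was recorded."""
    row = db.execute(
        "SELECT bytes_used FROM traffic WHERE customer = ?", (customer,)
    ).fetchone()
    return row[0] if row else 0
```

Because the insert and the update are one statement, a dashboard query reading the table sees either the old total or the new one, never a missing row.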


Data Ingest and Concurrent Analytics

Massive Data Ingest and Concurrent Analytics with MemSQL

The amount of data created in the past two years surpasses all of the data previously produced in human history. Even more shocking is that for all of that data produced, only 0.5% is being analyzed and used. In order to capitalize on data that exists today, businesses need the right tools to ingest and analyze data. At MemSQL, our mission is to do exactly that. We help enterprises operate in today’s real-time world by unlocking value from data instantaneously. The first step in achieving this...