Engineering

How Careful Engineering Led to Processing Over a Trillion Rows Per Second

On March 13, we published a blog post demonstrating the performance of MemSQL in the context of ad hoc analytical queries. Specifically, we showed that the query SELECT stock_symbol, count(*) as c FROM trade GROUP BY stock_symbol ORDER BY c desc LIMIT 10; can process 1,280,625,752,550 rows per second on a MemSQL cluster...


Architecting Data in the AWS Ecosystem

Amazon Web Services (AWS) is a juggernaut of a platform, and many of today’s database solutions can run on AWS, including MemSQL. Recently, we held a meetup at our office in San Francisco during AWS Summit 2018 to discuss how customers can benefit from using MemSQL within the AWS ecosystem. In the talk, “Architecting Data in the AWS Ecosystem,” Seth Luersen from MemSQL took a look at the overall data landscape related to the purpose-built databases from Amazon Web Services. He illustrated...


Operationalizing MemSQL

In a recent webcast, we shared tips and tricks for operationalizing MemSQL: configuring MemSQL Ops and MemSQL root user passwords, memory settings and health metrics, and how to take backups and add nodes securely to your cluster. Here are the topics we covered: permissioning a new cluster by adding a super-user to MemSQL Ops and via GRANT statements on the cluster itself; best practices for configuring memory limits; best practices for basic cluster monitoring, and how...


Shattering the Trillion-Rows-Per-Second Barrier With MemSQL

Last week at the Strata Data Conference in San Jose, I had the privilege of demonstrating MemSQL processing over a trillion rows per second on the latest Intel Skylake servers. It’s well known that having an interactive response time of under a quarter of a second gives people incredible satisfaction. When you deliver response time that drops down to about a quarter of a second, results seem to be instantaneous to users. But with large data sets and concurrency needs, giving all customers...


Full-Text Search in MemSQL

Today, we are sharing that MemSQL now has Full-Text Search, a highly requested feature, built into the product. Thanks to customer feedback, we are delighted to make it available for all companies building real-time applications. What is Full-Text Search? You might be thinking, “MemSQL is pretty fast at searching things and they already support large strings, so why do they need to add anything?” So let’s start with a description of Full-Text Search (FTS). Full-Text Search is different...


Recapping An Evening with MemSQL Engineering

Recently, we hosted a special meetup for the community at our headquarters in San Francisco and shared some great talks. The slides and video presentations for each talk are available below. Drew Paroski, MemSQL VP of Engineering, and Adam Prout, MemSQL Chief Architect, delivered a fun talk about taking a methodical approach to making a decision, dug into interesting tradeoffs, and gave tips about what to look for under the hood and how to evaluate the tech behind the database,...


An Engineering Approach to Database Evaluations

Whether you’re the CTO of the Rebel Alliance or the Galactic Empire, it could be very difficult to decide on your next database technology with the distraction of both sides constantly at war. In 2018, you’ll need to make a database choice for an existing or new application. Here are the things you need to keep in mind as you shop for your next database. 8 Criteria To Keep In Mind While Looking For Your Next Database 1) Pick the right language: SQL The history of SQL, or Structured...


Scaling Distributed Joins

Most users of SQL databases have a good understanding of the join algorithms single-box databases employ. They understand the trade-offs and uses for nested loop joins, merge joins, and hash joins. Distributed join algorithms, on the other hand, tend not to be as well understood. Distributed databases need to make a different set of tradeoffs to account for table data that is spread around a cluster of machines instead of stored on a single machine, like in a traditional database. Because these...


JSON Streaming And The Future Of Data Ingest

As businesses continue to become technology focused, data is more prevalent than ever. In response, companies have adopted a handful of common formats to help manage this explosive growth in data. Data Formats Today For a long time, XML has been the giant in terms of data interchange formats. Recently, JSON has become popular, catching a wave of interest due to its lightweight streaming support, and general ease of use. JSON is a common format for web applications, logging, and geographical...


Running Stored Procedures on Distributed Systems with MemSQL 6

Today we’re announcing the general availability of MemSQL 6. This is a big milestone for the product, which comes with new features to help customers get even more value out of MemSQL. The latest release includes breakthrough query performance, enhanced online operations, and extensibility. In this blog, we’ll take a deeper look at the new Extensibility features. Why did you add Extensibility to MemSQL 6? The Extensibility feature was built based on market demand, and enables people to move...