In my last blog post, I investigated a Linux performance issue affecting a specific customer workload. In this post, I will introduce the tool I created to drive that investigation.

active database benchmarkingRecently, a customer was running a test where data was loaded into MemSQL via LOAD DATA. The customer’s third-party benchmarking tool found that MemSQL took twice as long to load the same amount of data as a competing database; however, the numbers reported by this tool did not make sense. Local tests had shown MemSQL could process ~200 MB/s of data on a single machine1, but the third-party tool reported MemSQL was only processing ~60 MB/s of data. Looking at raw TCP socket data on both sides of the connection, I realized that the benchmarking tool was inadvertently throttling itself when running against MemSQL. When I used the standard mysql client to run the LOAD DATA command directly, performance doubled as MemSQL saturated the network card and matched competitor performance at 128MB/s2.

The problem here was that the customer was passively running their tool and only looking at the final result. I follow a methodology known as active benchmarking, a technique for validating the results of a benchmark. The original description summarizes the active benchmarking process quite well:

While the benchmark is running, analyze the performance of all components involved using other tools, to identify the true limiter of the benchmark.

Every active benchmarking session is a fully-fledged performance investigation that takes more time and effort. But it turns up the actual bottleneck3 and reveals the maximum possible performance.


When I first joined MemSQL, I found myself frequently writing one-off scripts to simulate complicated workloads for these active benchmarking investigations. Having previously used fio to test filesystems, I wanted a similar tool for defining a custom workload and running it efficiently. I found the existing tools to be fairly restrictive:

  • mysqlslap is only really designed to execute one type of query at a time, and sysbench only supports a very small set of workloads as first class citizens. I wanted to test more complicated workloads.
  • Both mysqlslap and sysbench cap the number of concurrent connections to the configured number of threads if the database cannot keep up with the workload. I want to test and observe queueing in the database (not in my benchmarking tool) as queries pile up.
  • Both mysqlslap and sysbench are designed first and foremost be stand-alone tests: they want to create their own managed tables with their own data.
  • HammerDB is a graphical tool that does not appear to have easy ways to configure otherwise. I wanted to use simple text files for defining workloads (à la fio) so I could share them via email or check them into a git repository.

In addition, I had concerns about the performance overhead of running these thread-based workload drivers that generate queries or data4. I commonly run the workload generator on the database server, where it is unacceptable for the generator to compete with the database for resources. I needed a tool to be as lightweight as possible.

Ultimately, these concerns motivated me to write a new tool, dbbench. dbbench is a fast, lightweight database workload generator that executes a workload defined a flexible configuration file. dbbench can behave like the other stand-alone testing tools, but it really shines as a tool for active benchmarking complicated workloads. The following configuration file describes a common analytics workload: analytics queries on a table with streaming writes that are periodically deleted from a table:

Given this configuration file, dbbench will run forever and log some simple statistics about each job. While it is running, I can very easily observe interesting performance behaviors such as how the performance of read queries changes dramatically after the table is cleaned up. In the snippet from the logs below, the average latency of the “read nearest values” job drops from 225ms to 56ms after the table is cleaned:

I recently used a very similar configuration to test a customer workload and noticed that write query throughput dropped precipitously after the table cleanup was executed. It turned out that the background log merger thread was misconfigured to be aggressive on the MemSQL instance. When the table cleanup executed, the background thread triggered and used up so many disk IOPs that the transaction log was unable to keep up with incoming writes:

dbbench was instrumental in this investigation because it provided a simple way to describe the workload and a reliable way to execute it. By monitoring and questioning the data throughout the execution of the benchmark, I was able to observe and fix a complicated performance anomaly.

dbbench version 0.1 is publicly available under an Apache license. I’m actively improving and stabilizing dbbench. MemSQL is already using it for many customer Proof of Concepts and performance testing. Try out or contribute to dbbench on Github today.

  1. During the investigation, I found an easy way to improve this to 1GB/s
  2. We were bottlenecked on the 1 Gigabit Ethernet card
  3. Including in the benchmarking tool itself
  4. For example, the new sysbench 0.5 Lua interface has a noticable CPU overhead