At Strata+Hadoop World, James Burkhart, technical lead on real-time data infrastructure at Uber, shared how Uber supports millions of analytical queries daily across real-time data with Apollo, Uber’s internal analytics querying language.
James covers architectural decisions and lessons learned from building an exactly-once ingest pipeline that captures raw events across in-memory row storage and on-disk columnar storage. He also details how Uber uses a custom metalanguage and query layer by...
by Mason Hooten
The tracking and targeting of our online lives is no secret. Once we browse to a pair of shoes on a website, we are reminded about them in a retargeting campaign. Lesser-known efforts happen behind the scenes to accumulate data and scan through it in real time, delivering the perfect personalized campaign. Specificity and speed are converging to deliver nano-marketing.
If you are a business leader, you’ll want to stay versed in these latest approaches. If not, as a consumer, you’ll likely...
by Gary Orenstein
Data is changing. You knew that. But the dialog over the past 10 years around big data and Hadoop is rapidly moving to data and real-time.
We have tackled how to capture big data at scale. We can thank the Hadoop Distributed File System for that, as well as cloud object stores like AWS S3.
But we have not yet tackled the instant-results part of big data. For that we need more. But first, some history.
Turning Point for the Traditional Data Warehouse
Internet-scale workloads that emerged in the...
How can big data and machine learning be used for good?
In our keynote at Strata+Hadoop World, MemSQL CEO Eric Frenkiel shared how we are working with Thorn to provide a new approach to machine learning and real-time image recognition to combat child exploitation.
Thorn partners with technology companies and government organizations to combat predatory behavior, rescue victims, and protect vulnerable children.
Thorn has to sift through a massive amount of images daily. Images...
by Kevin White
This post originally appeared on the Myntra Engineering Blog.
Learn how Myntra gained real-time insights on rapidly growing data using their new processing and reporting framework.
I got an opportunity to work extensively with big data and analytics at Myntra. Data Driven Intelligence is one of the core values at Myntra, so crunching and processing data and reporting meaningful insights for the company are of utmost importance.
Every day, millions of users visit Myntra on our App...
by Lesia Myroshnichenko
3x Spend Increase
“Between 2016 and 2019, spending on real-time analytics will grow three times faster than spending on non-real-time analytics.”
Every organization uses some form of analytics to monitor and improve its business. The growth of data has increased the impact of analytics, making it a critical ingredient for delivering a successful digital business strategy.
Companies are using more real-time analytics, because of the pressure to increase the speed and accuracy of...
by Mike Boyarski
Adoption of in-memory technology solutions is happening faster than ever. This stems from a three-pronged demand. First, a greater number of users, analysts, and businesses need access to data. Second, the number of transactions is increasing globally, so companies need faster ingest and analytics engines. Finally, performance inconsistencies are the nail in the coffin for companies competing in the on-demand economy; these enterprises need the responsiveness in-memory technology...
by Emily Friedman
In some industries, a hesitance remains in recognizing the commodification forces of real-time solutions. These industries often rely on orthodox tenets as barriers to marketplace entry, such as regulatory compliance, traditional value propositions, brand recognition, and market penetration. The term “ripe for disruption” often characterizes these industries and their respective leaders.
Arguably, an illustrative industry in the midst of responding to commodification, adapting to real-time...
by Seth Luersen
Data movement remains a perennial obstacle in systems design. Many talented architects and engineers spend significant amounts of time working on data movement, often in the form of batch Extract, Transform, and Load (ETL). In general, batch ETL is the process everyone loves to hate. Put another way, I’ve never met an engineer happy with their batch ETL setup.
In this post, we’ll look at the shift from batch to real time, the new topologies required to keep up with data flows, and the...
Smart gas and electric meters produce huge volumes of data. A small MemSQL cluster of 5 nodes easily handles massive quantities of data like the workloads from leading gas and electric utility enterprises.
In one particular use case, over 200,000 meter readings per second load into the MemSQL database while users simultaneously process queries against that data. Millions of meters send between 10 and 30 sensor readings every hour, leading to billions of rows of data. Just an initial part of a...
by Dale Deloy