At Strata+Hadoop World, James Burkhart, technical lead on real-time data infrastructure at Uber, shared how Uber supports millions of analytical queries a day on real-time data with Apollo, Uber’s internal analytics query language.
James covers architectural decisions and lessons learned from building an exactly-once ingest pipeline that captures raw events across in-memory row storage and on-disk columnar storage. He also details how Uber uses a custom metalanguage and query layer by...
by Mason Hooten
Building a real-time application starts with connecting the pieces of your data pipeline.
To make fast and informed decisions, organizations need to rapidly ingest application data, transform it into a digestible format, store it, and make it easily accessible, all at sub-second speed.
A typical real-time data pipeline is architected as follows:
Application data is ingested through a distributed messaging system that captures and publishes feeds.
A transformation tier is called to distill...