The phrase “real-time,” like love, means different things to different people.
At its most basic, the term implies near simultaneity. However, the amount of time that constitutes the “real-time window” differs across industries, professions, and even organizations. Definitions vary, and the term is so often (ab)used by marketers and analysts that some dismiss “real-time” as a meaningless buzzword.
However, there is an important distinction between “real-time” and “what we have...
by Conor Doherty
As technology weaves into our daily lives, our expectations of it continue to increase. Consider mobile devices and location information. Recently, 451 Research released data showing that 47% of consumers would like to receive personalized information based on their immediate location.
Source: 451 Research
Addressing this requires the ability to track real-time and historical data and to put both in context. Let’s examine that spectrum.
Incoming High Value Content
With a focus on ‘immediate,’ the highest...
by Gary Orenstein
With Hadoop Summit Europe underway today, we wanted to share some thoughts on how MemSQL fits into the Hadoop ecosystem.
While MemSQL and Hadoop are both data stores, they fill different roles in the data processing and analytics stack. The Hadoop Distributed File System (HDFS) enables businesses to store large volumes of immutable data, but by design, it is used almost exclusively for batch processing. Moreover, newer execution frameworks that are faster and storage agnostic are...
by Lesia Myroshnichenko
Scaling tends to make even simple things, like counting, seem difficult. In the past, businesses used specialized databases for particular tasks, including high-speed, high-throughput event counters. Due to the constraints of legacy systems, some people still assume that relational databases cannot handle high-throughput tasks at scale. However, due to advances like in-memory storage, high-throughput counting no longer requires a specialized, single-purpose database.
Why do we even need...
by Nikita Shamgunov
Combining the data processing prowess of Spark with a real-time database for transactions and analytics, where both are memory-optimized and distributed, leads to powerful new business use cases. Links to the MemSQL Spark Connector are at the end of this post.
Data Appetite and Evolution
Our generation of, and appetite for, data continues unabated. This drives a critical need for tools to quickly process and transform data. Apache Spark, the new memory-optimized data processing framework, fills this...
Apache Spark is one of the most powerful distributed computing frameworks available today. Its combination of fast, in-memory computing with an architecture that’s easy to understand has made it popular for users working with huge amounts of data.
While Spark shines at operating on large datasets, it still requires a solution for data persistence. HDFS is a common choice, but while it integrates well with Spark, its disk-based nature can impact performance in real-time applications (e.g....
by Wayne Song
Sometimes we become so accustomed to certain things, we stop asking questions and accept the status quo. That can last a long time.
Such is the case with batch processing in the enterprise. A survey titled “Enterprise Big Data, Business Intelligence, and Analytics Trends” by the Enterprise Strategy Group made it abundantly clear that there remains an ongoing complacency about the time it takes to complete batch processing. This is in stark contrast to the recognition that...
In our connected world, some businesses adapted to the Internet, and some businesses blossomed with the Internet. Shutterstock embodies defining characteristics of the latter.
Over more than ten years, Shutterstock has grown into one of the largest and most vibrant two-sided marketplaces for creative professionals to license and contribute content, including images, videos, and music.
Because Shutterstock is an Internet-based company, IT infrastructure is a critical component of its success, with...
While Hadoop is great for storing large volumes of data, it’s too slow for building real-time applications. However, our recent collaboration with Cisco provides a solution for Hadoop users who want a better way of processing real-time data. Using Cisco’s Application Centric Infrastructure, including APIC and Nexus switch technology, we’ve been able to demonstrate exceptional throughput on concurrent MemSQL and Hadoop 2.0 workloads.
Here’s How It Works
Cisco’s new networking technology...
“The trickiest part of speeding up a program is not doing it, but deciding whether it’s worth doing at all,” says Carlos Bueno, a former Facebook engineer and award-winning author of the Mature Optimization Handbook. He’s now a senior engineer at MemSQL, helping to build in-memory database solutions that create significant value by leveraging Big Data analytics.
Carl Wright, a former CSO/CTO of the United States Marine Corps, will join Carlos as they discuss how...
by Mark Horton