With Hadoop and HDFS and related big data technologies, we’ve pretty much licked the scale problem of handling petabyte upon petabyte of data. Next up: solving the speed problem.
Right now, running interactive queries across data sets spread among a thousand nodes is no mean feat. As a rule of thumb, you can run fast queries on old data (in a data warehouse), but running fast, interactive queries on massive distributed data sets is still the problem, according to speakers at a Structure:Data 2013 session today that homed in on what problems real-time data analytics, when possible, could attack.
“Interactive analytics is a complex problem. You have on one end a business user asking ‘what if we did things a little different?’” said Silvius Rus, director of big data platforms for Quantcast. They may have 10 ideas on how to change something and nine are bad, but one is good. They need to be able to iterate on queries and get the answer back in minutes, not a day, he added.
On day one of the show, Paul Maritz, the head of the new EMC-VMware Pivotal Initiative, talked about how companies need to have faster, more nimble feedback loops from their massive data stores. Telephone companies know they have dropped calls but they don’t know whose call they dropped, and it can take days or even weeks to find out.
That’s the sort of problem that this new world of fast, big data analytics can solve. In that world, the phone company “at the very least, could text you an apology,” he noted.
Panel moderator Michael Driscoll, CEO of Metamarkets, really wanted to hear about what this “big data utopia,” where the system could ingest, transform and spit out answers to questions in real time, could mean. The applications that could start coming out within months could be impressive, according to Ashok Srivastava, chief data scientist for Verizon. There are the obvious things like real-time or near-real-time response to customer problems (see the dropped call issue above) and requests, but he also foresees breakthroughs in cybersecurity. He cited earlier talks at Structure:Data about how systems can increasingly understand the movement of people around the globe and the movement of concepts through society.
And he thinks real-time big data capabilities will play out in citizen science research. “Imagine taking your cell phone pictures and combining them with multiple millions of other cell phone pictures. That’s something that can be used by scientists,” he noted.
In health and medicine, the ability to query the most up-to-date personal health data along with historical data might enable real-time predictions about a person’s health or the status of machine health, he added.