The things that make big data what it is – high velocity, variety, and volume – make it a challenge to defend. And it presents a tempting target for potential attackers.
But big data technologies are also being used to strengthen cybersecurity, since many of the same tools and approaches can be used to collect log and incident data, process it quickly, and spot suspicious activity.
More Data, More Brains
"Modern cybersecurity solutions are mostly driven by big data," said Bogdan Botezatu, senior threat analyst at Bitdefender.
To start with, all the major anti-virus and endpoint protection vendors, as well as network security and firewall providers, train their systems on the massive volumes of malware and known attack paths that they have collected.
With millions of samples, security vendors can train their systems to recognize known attacks but also identify patterns that allow them to spot attacks that have never been seen before.
All the major security vendors have either already added advanced threat detection, behavior analytics, and machine learning to their systems or are racing to catch up with competitors that have.
"Machine learning algorithms are trained multiple times a day on huge sets of malicious files," said Botezatu. "Quality assurance runs on known good files to minimize false positives."
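The quality-assurance step Botezatu describes can be pictured as measuring how often a detector flags files that are known to be clean. The snippet below is a minimal sketch of that idea, not Bitdefender's process: `false_positive_rate`, `toy_classifier`, and the sample files are all hypothetical, and the "classifier" is a trivial stand-in for a trained model.

```python
def false_positive_rate(classify, known_good_files):
    """Fraction of known-good files that the classifier wrongly flags as malicious."""
    if not known_good_files:
        raise ValueError("need at least one known-good file")
    flagged = sum(1 for f in known_good_files if classify(f))
    return flagged / len(known_good_files)


def toy_classifier(file_bytes):
    # Hypothetical stand-in for a trained model: flags any file containing "eval(".
    return b"eval(" in file_bytes


# Hypothetical corpus of files that are all known to be benign.
known_good = [
    b"print('hello')",
    b"x = eval('1+1')",   # benign, but trips the toy rule
    b"import os",
    b"def f(): pass",
]

rate = false_positive_rate(toy_classifier, known_good)
print(rate)  # 0.25
```

A vendor would presumably gate each retrained model on a threshold for this number before shipping it, though the article does not say what that threshold is.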
Vendors aren't the only ones collecting virtual mountains of information.
Internally, data center operators are collecting data feeds from both their on-premises and cloud infrastructure to look for suspicious files, behaviors, and communications.
"Event-correlation technologies piece an attack’s disparate components together to stop it cold," Botezatu said. "File reputation systems consider how many running instances of an application exist in the customer pool to understand how likely that app is to be malicious."
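The file reputation idea can be sketched in a few lines: count how many endpoints in the customer pool run each file, and treat files seen on very few machines as candidates for closer inspection. This is an illustration under stated assumptions, not any vendor's algorithm: the function names, the `(endpoint_id, file_hash)` input shape, and the 5 percent threshold are hypothetical, and the assumption that rarity alone signals risk is a simplification, since real systems weigh many other signals.

```python
from collections import defaultdict


def prevalence_by_hash(sightings):
    """Map each file hash to the fraction of known endpoints running it.

    `sightings` is an iterable of (endpoint_id, file_hash) pairs (assumed shape).
    """
    endpoints_by_hash = defaultdict(set)
    all_endpoints = set()
    for endpoint_id, file_hash in sightings:
        endpoints_by_hash[file_hash].add(endpoint_id)
        all_endpoints.add(endpoint_id)
    total = len(all_endpoints)
    return {h: len(eps) / total for h, eps in endpoints_by_hash.items()}


def rare_files(sightings, threshold=0.05):
    """Hashes seen on fewer than `threshold` of endpoints (hypothetical cutoff)."""
    return sorted(h for h, p in prevalence_by_hash(sightings).items() if p < threshold)


# Hypothetical pool: 100 hosts run "aaa"; one of them also runs "bbb".
sightings = [(f"host-{i}", "aaa") for i in range(100)] + [("host-7", "bbb")]
print(rare_files(sightings))  # ['bbb']
```

Counting distinct endpoints rather than raw sightings keeps one noisy machine from inflating a file's apparent popularity.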
None of this would be possible without the ability to store and analyze massive amounts of information – and do it in real time.
"Big data powers the cybersecurity world," he said. "There are few verticals as privileged as ours in terms of knowledge about how to secure big data."
This is vital, since security incidents are getting faster and larger in scope.
According to a report released in April by cybersecurity vendor Gemalto, 2.6 billion records were breached last year, the first time this number exceeded 2 billion and an increase of 88 percent over the previous year.
That averages out to more than 7 million records a day.
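The daily figure follows directly from the annual total. The short check below reproduces it; the prior-year estimate is derived here from the 88 percent increase and is not a number quoted in the report.

```python
records_breached = 2_600_000_000   # annual total cited above
increase = 0.88                    # year-over-year growth cited above

per_day = records_breached / 365
print(round(per_day))              # 7123288

# Implied prior-year total (derived, not a reported figure): roughly 1.38 billion.
previous_year = records_breached / (1 + increase)
print(f"{previous_year / 1e9:.2f} billion")
```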
Even more worrisome, in most breaches the time it takes for systems to be compromised is measured in minutes, and exfiltration occurs within hours, according to the latest Verizon data breach investigations report.
That brings us to the next frontier in cybersecurity where big data is poised to make an impact: incident response.
As more data is collected not just about attacks, but about how data centers react to them, the security industry is starting to create automated playbooks that allow organizations to respond to attacks instantly and intelligently.
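In its simplest form, an automated playbook is a mapping from an alert type to an ordered list of response steps, with every step logged so the responses themselves can be analyzed later. The sketch below illustrates that shape only. The playbook contents, step names, alert fields, and stub actions are all hypothetical, and a real deployment would call into actual firewall, endpoint, and ticketing systems.

```python
# Hypothetical playbooks: alert type -> ordered response steps.
PLAYBOOKS = {
    "ransomware": ["isolate_host", "snapshot_disk", "notify_soc"],
    "credential_stuffing": ["block_source_ip", "force_password_reset", "notify_soc"],
}
DEFAULT_PLAYBOOK = ["notify_soc"]  # fallback for alert types with no playbook


def run_playbook(alert, actions, history):
    """Run each step for the alert's type and record the outcome in `history`."""
    steps = PLAYBOOKS.get(alert["type"], DEFAULT_PLAYBOOK)
    for step in steps:
        ok = actions[step](alert)
        history.append(
            {"alert_id": alert["id"], "type": alert["type"], "step": step, "ok": ok}
        )
    return steps


def make_stub(name):
    """Build a placeholder action; a real one would call a security tool."""
    def action(alert):
        print(f"{name}: {alert['host']}")
        return True
    return action


# One stub per step name that appears in any playbook.
ACTIONS = {
    step: make_stub(step)
    for steps in list(PLAYBOOKS.values()) + [DEFAULT_PLAYBOOK]
    for step in steps
}

history = []
steps_run = run_playbook(
    {"id": 1, "type": "ransomware", "host": "db-01"}, ACTIONS, history
)
```

The `history` list is the part that connects to the analytics angle: once enough responses are recorded, they can be compared to see which sequences of steps work best.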
Companies that lack the scale to generate that much response data on their own either have to wait until they've collected enough to make analytics useful or share their playbooks with their peers.
Keep an eye out for vendors to emerge in this space who will not only help data centers put together incident-response playbooks and automate them, but also collect them in a central location, where they can do analytics on the responses, figure out the best strategies, and then add that knowledge to their recommendation engines.