Machine learning: Cybersecurity dream-come-true or pipe dream?

IT experts agree that machine learning has demonstrated enormous value in enhancing search engine capabilities or in spotting patterns in everything from finance to medicine. But the debate continues about its value in cybersecurity.

Want to know which of your workers aren’t working much, which ones are planning to leave soon and/or might also be planning to steal some proprietary data on the way out the door?

Machine learning can help you spot it much more quickly than your legacy HR department, according to Ariel Silverstone, a consulting chief security and privacy officer, who said he’s seen machine learning do that with “a week’s worth of baseline data.”

He is not the only one convinced of its value. Zev J. Eigen, global director of data analytics for the law firm Littler Mendelson, said while machine learning’s application to cybersecurity is relatively new, it has the potential to “revolutionize” it.

And former Symantec CTO Amit Mital (now manager at KNRL Labs), at a panel discussion sponsored by Fortune magazine this past July, called artificial intelligence, of which machine learning is a component, one of the “few beacons of hope in this mess” – the mess being cybersecurity, which he contended is “basically broken.”

But not all experts are convinced of machine learning's revolutionary power.

Among the skeptics is Simon Crosby, CTO of Bromium, whose recent post in Dark Reading was headlined, “Machine Learning is cybersecurity’s latest pipe dream.” He argued that, “there is no silver bullet in security, and there’s no evidence at all that these tools help.”

That skepticism is pretty much in line with the conclusion of research firm Gartner, which ranked Machine Learning among the top five technologies at the “peak of inflated expectations” in its 2015 Hype Cycle.

But a number of experts, while they all agree there is no silver bullet in cybersecurity, say there is a lot of room between that and a pipe dream.

“Machine learning is not a silver bullet,” said Stephan Jou, CTO of Interset, “but in an industry where we see huge losses on a weekly basis even after companies have deployed millions of dollars of security technology, it is incredibly short-sighted to underestimate it.”

Gary King, director of the Institute for Quantitative Social Science at Harvard University, agreed. Machine learning is “not remotely a pipe dream,” he said.

“That doesn't mean it can do anything without any thought. There are things you can't do well with it now, and there are many more things that some people will do badly even if others could use them to great effect,” he said, adding that skilled humans need to be in charge of machine learning, directing an intelligence effort.

But those humans, “should also receive as much help as would be useful. As it happens, [machine learning] can help a great deal,” he said.

And Silverstone, while he also agrees that, “nothing can prevent anything from being hacked,” said he thinks Gartner’s Hype Cycle conclusion is dead wrong. Machine learning, rather than being overhyped, “is severely, significantly under-hyped,” he said.

“When you have enough data and you understand why the data show certain trends, you can improve prediction to much better than 90% – perhaps to more than 99%.”

That, he said, means it is possible not just to ask the machine, “Will I be attacked next week?” but, “Will I be attacked next Tuesday from China at 3 p.m.?” and even “What and when is the likely next hack, from where and by whom?”

“That is possible today with very high accuracy,” he said, “and I believe that much more complicated algorithms are not only possible but are actually being used. We can do stuff today that two years ago I wouldn’t have believed.”

Crosby agrees that machine learning is a “powerful tool,” acknowledging its effectiveness in cases like Google search and the recommendation engines of companies like Amazon and Netflix.

But he also noted that Google’s attempt to identify flu epidemics, “turned out to be woefully inaccurate.”

He argued that while machine learning is very good at finding similarities between things, “it’s not so good at finding anomalies. In fact, any discussion of anomalous behavior presumes that it is possible to describe normal behavior,” which he said is very difficult.

“This gives malicious actors plenty of opportunity to ‘hide in plain sight’ and even an opportunity to train the system that malicious activity is normal,” he said.

But Eigen said that while those difficulties exist, that doesn’t mean machine learning has no added value. Every system is “game-able,” he said, “but the question we should be asking in addition is what is the extent to which this problem is worse without [machine learning]?”

Jou is even more emphatic. “Machine learning has proven that it can define what is normal and then define anomalies,” he said.

He agreed that it will not replace humans, but said it takes what humans do – recognize patterns and then anomalies to those patterns – and automates it. “Machine learning is really just taking data sets, finding the patterns and defining what is normal versus what is ‘weird’,” he said.
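The idea of learning what is normal and then flagging what is “weird” can be illustrated with a minimal sketch. It is not Interset’s method or any vendor’s product; it assumes a simple statistical baseline (mean and standard deviation), and the function name, threshold and traffic figures are all hypothetical.

```python
from statistics import mean, stdev

def find_anomalies(baseline, observations, threshold=3.0):
    """Return observations more than `threshold` standard deviations
    away from the mean of the baseline data."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: anything different is unusual.
        return [x for x in observations if x != mu]
    return [x for x in observations if abs(x - mu) / sigma > threshold]

# Hypothetical data: megabytes one user downloaded on seven baseline days.
baseline_mb = [120, 135, 110, 128, 140, 115, 125]
# Hypothetical new observations to score against that baseline.
new_days_mb = [130, 122, 900]

print(find_anomalies(baseline_mb, new_days_mb))  # [900]
```

Production systems would likely rely on far richer features and models, but the pattern is the same: describe normal behavior from data, then measure how far new behavior departs from it.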

He said the technique that attackers use to try to fool a system into thinking that their activity is normal, called “model poisoning,” can be countered by, “using multiple models per data source.

“That means an adversary using this approach would have to have full knowledge of all the models being used to detect dangerous behaviors, and would need to simultaneously poison all models and data sources,” he said.
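A rough sketch can show why multiple models raise the bar for an attacker. It is an illustration only, not a description of how Interset or any other vendor implements the defense, and the two toy models, their factors and the traffic numbers are assumptions. Here an attacker has slipped a few large values into the training history, which drags the mean-based model’s notion of normal upward, while a median-based model trained on the same data still objects.

```python
from statistics import mean, median

def mean_model(history, value, factor=3.0):
    """Flag a value that is far above the historical mean."""
    return value > factor * mean(history)

def median_model(history, value, factor=3.0):
    """Flag a value that is far above the historical median,
    which a handful of extreme points barely moves."""
    return value > factor * median(history)

MODELS = {"mean": mean_model, "median": median_model}

def flagged_by(history, value, models=MODELS):
    """Return the names of the models that consider `value` anomalous."""
    return [name for name, model in models.items() if model(history, value)]

# Hypothetical daily transfer volumes (MB) for one data source.
clean_history = [100, 110, 95, 105, 100, 98, 102]
# The same history after an attacker injects three inflated days.
poisoned_history = clean_history + [2000, 2500, 3000]

print(flagged_by(clean_history, 2400))     # ['mean', 'median']
print(flagged_by(poisoned_history, 2400))  # ['median']
```

In this toy case the poisoning fools one model but not the other, so the activity is still flagged. Defeating the ensemble would require distorting every model at once.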

Both Jou and Silverstone said machine learning has demonstrated within organizations that it can predict which employees are likely to leave, and/or turn malicious and steal data. “We regularly catch bad actors in this way,” Jou said.

Silverstone said he knows from direct experience that, “with about a week’s worth of baseline data, we can tell which worker on the network is sloughing off, likely to leave or likely to be malicious. Also we can predict what level of bandwidth I will need at what time of the day, and which ports and even which sites people will go to.”

He said part of the strength of machine learning is that it can recognize context, as in: “Does an actor have the right to perform a specific action, and where and when etc.? That can mean a simple prediction that results in needing only 6GB firewalls instead of 60GB. And the possibilities go far beyond this,” he said.

He contended that anyone who argues that machine learning can’t learn and spot the differences between the normal and the anomalous, “is talking about older [machine learning]. There is nothing better than a machine to note similarities or dissimilarities,” he said, adding that, “anybody who wants to put a human being up against a machine in finding data anomalies is welcome to try.”

Eigen noted that the roots of machine learning go back decades. Indeed, they go back at least as far as the famed British computer scientist Alan Turing, who led the team that built the machine that cracked the Nazi “Enigma” code in World War II, dramatized in the recent movie “The Imitation Game.” In a 1950 paper, Turing raised the question, “Can machines think?”

The reason machine learning is such a hot “new” topic now, Eigen said, is because, “we now have better data storage and higher-quality data that we can process more rapidly.”

Jou said the perception that machine learning is overhyped could be because its use in cybersecurity is relatively new, but he believes that, “once it starts to demonstrate the same success it has had in these other fields it will revolutionize cybersecurity.”

None of this means, machine learning’s advocates say, that the technology has matured to the point where it is used effectively throughout the public and private sectors.

“Day to day, I’m not seeing many enterprise organizations innovating with it,” Silverstone said, “but it is happening in some research facilities, universities and very much in the financial sector.”

Jou said machine learning is harder to adapt to cybersecurity because, “security people are not in the habit of sharing data. We don’t sit around saying, ‘I just got breached, here are my firewall traffic logs. Show me yours.’ Also, many companies are just realizing that they have become big data companies,” he said.

But machine learning will, “get there, and get there quickly, because so much has already been learned about how to apply machine learning successfully in other areas,” he said.

Silverstone said the possibilities are extraordinary. “If the concept of machine learning is applied to digital currency, it could mean you can steal all the money you want, but you wouldn’t be able to spend it,” he said.

This story, "Machine learning: Cybersecurity dream-come-true or pipe dream?" was originally published by CSO.