BSidesLV Preview: Using Machine Learning for Security Analytics

Tuesday, July 02, 2013

Anthony M. Freed


Security BSides Las Vegas is now less than one month away – they have the momentum needed for another fantastic learning opportunity, and we still have time to highlight a few more of the conference’s upcoming sessions.

Previously we covered a session about a Windows web server tool called OMENS, reviewed Fun with WebSockets Using Socket Puppet, and looked at a session on open source penetration testing and forensics.

We also took a look at Vulnerabilities in Application Whitelisting, a session on how security professionals can more clearly articulate their ideas in a talk titled Never Mind Your Diet, Cut the Crap From Your Vocabulary, and another on Baking Assurance into Software.

This week we will review a session being presented by Alex Pinto (@alexcpsec) titled Using Machine Learning to support Information Security.

Pinto has over 13 years of experience in information security solutions architecture, strategy and monitoring, and holds the CISSP-ISSAP, CISA, CISM, CREST CCT APP and PMP certifications as well as having been a PCI QSA for more than five years.

For the past year Pinto has been on sabbatical researching and exploring the possible applications for Machine Learning and Predictive Analytics for Information Security Data, specifically to address the challenges inherent in trying to make sense of SIEM solutions as a whole.

Though a newcomer to BSides, Pinto has experience presenting at marketing-heavy conferences such as Gartner Security Summit, PCI London and sponsored talks at RSA Europe.

“I have never presented at a ‘real con’ before, where people will actually ask intelligent questions,” Pinto quipped, “but I think I can hold my own.”

Pinto notes that Big Data, Data Science, Machine Learning and Analytics are some of the latest buzzwords that have inundated our industry, and he feels “we are being sold a unicorn-laden, silver-bullet panacea by heavy handed marketing folks,” that has evoked “an expected pushback from the most enlightened members of our community.”

However, he also believes there might just be enough technical meat in these movements to better utilize the data we have been collecting in order to automate some of our security decision making.

His session is intended to address recent insights into applying Machine Learning techniques to data analytics, and try to make the point that “if all of this technology can be used to show us ‘better’ ads in social media and track our behavior online, it can also be used to defend our networks as well.”

The amount of security log data that is being accumulated today, be it for compliance or for incident response reasons, is bigger than ever, Pinto said. Given a recent push due to regulations such as PCI and HIPAA, even small and medium companies have a lot of data sitting unused in log management solutions.

“There is a surplus of data and a shortage of professionals that are capable of analyzing this data and making sense of it,” Pinto said. “This is one of the main criticisms about compliance and ‘check-box security’ practices, and it is a legitimate criticism because no one is safer for just accumulating this data.”

But this is not exclusively a ‘tool problem,’ Pinto says from experience: he used to coordinate analytics teams in a previous position and understands the challenges involved very well.

“I have seen really talented and experienced people be able to configure one of these systems to really perform well,” he said, “but it usually takes a number of months or years and a couple of these SOC supermen to make this happen.”

Pinto advocates the idea of using Machine Learning techniques to mine for valuable security information in order to allow organizations to make informed decisions based on the vast amounts of information they already have available.

“This practice may not outperform a well trained analyst, but it’s certainly better than no action at all,” Pinto asserted. “It can greatly enhance the analyst’s productivity and effectiveness by letting him focus on the small percentage of data that is highly likely to be meaningful.”

Pinto’s session was designed to introduce the concepts and motivations behind using machine learning as an enhancement for early detection systems.

“There are way too few security professionals who are looking at the potential of these tools, when they seem to be such a perfect fit to help us process the vast amount of data and forensic evidence that we have at hand,” Pinto stated.

He hopes that his work will clear up some of the confusion around what Big Data, Machine Learning and related technologies are, and what their potential applications to Information Security can be.

“It is very easy to be turned off by the hype, and everyone is too busy to do fact-checking on every new subject that comes forward,” Pinto said. “I have gotten very passionate about this subject and my intention is to share my findings with the greater community.”

The application of Machine Learning and Big Data analytics to security intelligence does have some challenges as well, Pinto noted, particularly because these powerful tools are very easy to misuse if you do not understand the math or the concepts behind them.

“Most of the public debate has focused on gathering more data, but that doesn’t help if we can’t handle it, and adding more hay will just make the needle harder to find,” Pinto said. “Machine learning actually performs better when you feed it more data, as opposed to current monitoring technology, and that is why it must be discussed.”

Pinto believes that given the current level of media hype, it is too easy to regard the market movement into Big Data as “silly” and not fully understand that it can lead to greater analytics capabilities for all the data we have stored in our log managers.

“Don’t misunderstand me, we are definitely not there yet, no matter what the vendors say,” Pinto acknowledged, “but the path is starting to become clearer.”

There is a lot of research going on in all of these areas, mostly academic for now, Pinto said, and it should be a hot topic in Information Security in 2 to 4 years. He will be unveiling his latest research project, called MLSec, shortly before the BSidesLV conference.

“We as a community should be ready for the changes and opportunities this will present.”

Cross Posted from Tripwire's State of Security
