Machine learning is becoming more critical to cybersecurity every day. As I’ve written before, it’s a powerful weapon against the large-scale automation favored by today’s cyber threat actors, but the dynamic between cyber attackers and defenders is evolving.
Nowadays, machine learning is mostly used by cyber security software to ingest massive quantities of data and identify cyber threats, but that will soon change as increasingly sophisticated cybercriminals tap into their own machine learning tools to counter it. The early stages of this malicious machine learning will likely take the form of bad guys directly targeting the good guys’ algorithms to sabotage, mislead, and reverse-engineer them.
We’re on the precipice of the age of adversarial machine learning, where dueling algorithms will determine an organization’s cyber security, as well as the safety of its employees and customers. Here’s what that time will look like.
An organization’s cyber threat detection ability is often only as good as its machine learning models, which makes these models a logical target for cyber attackers. In terms of adversarial machine learning, this could mean cybersecurity vendors get hacked themselves by cyber threat actors looking to gain access to the algorithms and data that train their models. With this information, the bad guys can craft their campaigns to evade detection, or they can build identical models against which to test their cyber attacks.
Cyber threat actors can also target these algorithms externally. Based on a model’s outputs, such as how often it detects specific cyber threats, cyber threat actors can extrapolate its signatures via trial and error and learn how to game them to avoid detection. Public models are the most susceptible to this type of manipulation because cyber threat actors would have access to the same models that defenders use. This knowledge of how their campaigns and cyber attacks are detected would let them completely disguise themselves, causing a host of problems for cybersecurity teams.
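This trial-and-error extrapolation is often called model extraction. A minimal sketch of the idea, assuming scikit-learn and using a synthetic dataset as a stand-in for real detection data (the `defender` and `surrogate` names are illustrative, not from any real product):

```python
# Hypothetical sketch: extracting a surrogate of a black-box detection model.
# Assumes the attacker can submit samples and observe the model's verdicts.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Stand-in for a vendor's detection model (a black box to the attacker).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
defender = RandomForestClassifier(random_state=0).fit(X[:1000], y[:1000])

# The attacker probes the black box with their own samples and records verdicts.
probes = X[1000:]
observed_verdicts = defender.predict(probes)

# The attacker trains a surrogate on the observed input/verdict pairs.
surrogate = DecisionTreeClassifier(random_state=0).fit(probes, observed_verdicts)

# The surrogate approximates the defender, so attacks can be tested offline
# until they evade the surrogate -- and, likely, the real model too.
agreement = (surrogate.predict(probes) == observed_verdicts).mean()
```

Once the surrogate agrees with the black box often enough, the attacker never needs to touch the defender’s infrastructure again while tuning an evasive campaign.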
Because machine learning models don’t process and understand things on a human level, this could mean blurring a logo on a phishing page just enough so that it’s still recognizable to humans but utterly confusing to signature-based cyber security models. It may also mean cyber threat actors obfuscating the parts of their code they know cybersecurity models look for when detecting their cyber threats.
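To make the code-obfuscation case concrete, here is a toy illustration. The signature string and skimmer snippets below are invented for the example; real skimmers and real detection signatures are far more complex:

```python
# Hypothetical sketch: a trivial substring "signature" defeated by obfuscation.
# The signature and scripts are made up for illustration only.
SIGNATURE = "document.forms[0].card_number"  # what the detector "looks for"

original_skimmer = "var c = document.forms[0].card_number.value;"

# The attacker splits the string the detector keys on; the script's behavior
# in a browser is unchanged, but the signature no longer matches.
obfuscated_skimmer = (
    "var f = document['for'+'ms'][0]; var c = f['card_'+'number'].value;"
)

def naive_detector(script: str) -> bool:
    """A caricature of signature-based detection: exact substring match."""
    return SIGNATURE in script

detected_original = naive_detector(original_skimmer)      # caught
detected_obfuscated = naive_detector(obfuscated_skimmer)  # evades the signature
```

A human analyst reading the obfuscated script still recognizes a card skimmer instantly; the brittle pattern-matcher does not. The same asymmetry applies to blurred logos and image-based models.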
The cybersecurity community has largely ignored adversarial machine learning, but it’s almost certainly something we’ll be contending with sooner rather than later. I’ve covered just a couple of examples here; there are many ways that machine learning models will be maliciously targeted.
The adversarial machine learning war isn’t raging yet, but my team is already taking measures to counter it. The two most effective ways to combat adversarial machine learning:
Blending: It’s far easier to game one model than several. Every data scientist has a “go-to” algorithm for training their models, but it’s essential not only to use other algorithms but to combine them. We use blended models (also known as stacked models) to detect cyber threats, where the base models marry two or more different perspectives.
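A minimal sketch of a blended (stacked) detector, assuming scikit-learn and synthetic data; the choice of base models here is illustrative, not a description of any production system:

```python
# Sketch of blending/stacking: two base models with different "perspectives"
# feed a meta-model that learns how to combine their predictions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("forest", RandomForestClassifier(random_state=0)),  # tree-based view
        ("svm", SVC(probability=True, random_state=0)),      # margin-based view
    ],
    final_estimator=LogisticRegression(),  # learns to weigh the two views
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
```

An attacker who reverse-engineers the decision boundary of one base model still has to evade the other, and the meta-model’s weighting of the two, simultaneously.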
Co-training: Co-training is a semi-supervised machine learning method where two or more supervised models work together to classify unlabeled examples — examples that haven’t been classified by humans. By learning from several real examples (different Magecart scripts, for instance), the models can learn to find cyber threats in the wild, even as those threats evolve. If the models disagree on how to classify an example, the disagreement escalates to our active learning system, which includes a review by human analysts. This way, the analyst can recognize when a model is no longer detecting cyber threats at an acceptable rate, which may indicate that a cyber threat actor is privy to its algorithms.
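A hedged sketch of the co-training loop described above, assuming scikit-learn; the feature split, model choices, and escalation logic are simplified stand-ins for a real pipeline:

```python
# Sketch of co-training: two models trained on different feature "views"
# label new examples together; disagreements are escalated to analysts.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1200, n_features=20, random_state=1)
X_labeled, y_labeled = X[:200], y[:200]  # examples classified by humans
X_unlabeled = X[200:]                    # examples found in the wild

# Each model sees a different slice of the features (a different "view").
view_a, view_b = slice(0, 10), slice(10, 20)
model_a = LogisticRegression(max_iter=1000).fit(X_labeled[:, view_a], y_labeled)
model_b = GaussianNB().fit(X_labeled[:, view_b], y_labeled)

pred_a = model_a.predict(X_unlabeled[:, view_a])
pred_b = model_b.predict(X_unlabeled[:, view_b])

# Agreements become pseudo-labels for further training;
# disagreements go to the active learning queue for human review.
agree = pred_a == pred_b
pseudo_labeled = X_unlabeled[agree]
escalated_to_analyst = X_unlabeled[~agree]
```

A rising share of escalations is itself a signal: either the threats have shifted, or someone has learned how one of the models makes its decisions.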
Placing an expert in the loop also helps when the model is unsure how to categorize a particular instance. Having an analyst or data scientist available when a model “asks for help” is crucial as cyber threats change, especially when they change with the intent to fool the model. Left alone, the model may make an incorrect assumption about whether an item is a cyber threat or benign. Developing a feedback mechanism that gives your model the ability to identify and surface questionable items is critical to its success.
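One common way a model “asks for help” is a confidence threshold: low-confidence predictions are queued for an analyst instead of being auto-classified. A minimal sketch, assuming scikit-learn (the threshold value here is arbitrary, not a recommendation):

```python
# Sketch of uncertainty-based escalation: items the model is unsure about
# are surfaced for human review rather than auto-classified.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=2)
model = RandomForestClassifier(random_state=0).fit(X[:800], y[:800])

incoming = X[800:]                       # new, unlabeled items
proba = model.predict_proba(incoming)
confidence = proba.max(axis=1)           # probability of the predicted class

THRESHOLD = 0.75  # tunable: below this, the model "asks for help"
needs_review = incoming[confidence < THRESHOLD]      # analyst queue
auto_classified = incoming[confidence >= THRESHOLD]  # handled automatically
```

The analyst’s verdicts on the reviewed items feed back in as new labeled training data, closing the loop.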
As data scientists in the age of adversarial machine learning, it’s our duty to make sure the bad guys won’t be the only ones with these kinds of tricks up their sleeve. Cyber threats change all the time, so detection models must change accordingly. To keep up with adversaries, especially as they employ machine learning, it’s critical to use models that can learn incrementally.
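Incremental learning is supported directly in common libraries. A minimal sketch using scikit-learn’s `partial_fit`, with synthetic batches standing in for newly labeled threat data:

```python
# Sketch of incremental learning: the model is updated batch by batch
# as new labeled threats arrive, instead of being retrained from scratch.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=3000, n_features=20, random_state=3)
model = SGDClassifier(random_state=0)

# The first partial_fit call must declare all classes the model will see.
classes = np.unique(y)
model.partial_fit(X[:1000], y[:1000], classes=classes)

# As adversaries evolve, each new batch of labeled examples updates the
# model in place, keeping it current without a full retrain.
for start in range(1000, 3000, 500):
    batch_X, batch_y = X[start:start + 500], y[start:start + 500]
    model.partial_fit(batch_X, batch_y)

accuracy = model.score(X[:1000], y[:1000])
```

The design trade-off: incremental learners adapt quickly to new threat variants, but they also need the human-in-the-loop safeguards above so that a poisoned or mislabeled batch can’t quietly drag the model off course.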
When adversaries change their cyber threats to beat your models, your models must transform to counter the new cyber threats. It sounds like a chess match, but the stakes are much higher — it’s going to determine the future of cyber security.