
Nicole Ibarra · June 11, 2020
Protecting ML Deployments From Adversarial Samples
Organizations around the world have become increasingly reliant on machine learning systems. Many use machine learning to automate the detection of cyber threats such as phishing sites, malware, and malicious emails.
Since machine learning is a highly favored method for fraud detection, it has become a target for cybercriminals. Attackers use many techniques to try to evade ML classifiers, and one of the most widely used is adversarial machine learning: feeding a model carefully crafted malicious input designed to fool it into making analysis mistakes. When the model misclassifies that input, the attacker has successfully bypassed the system.
Fraudsters have essentially designed their own machine learning models to identify how threats are classified by anti-fraud systems. By creating workarounds, they can automatically and reliably evade those classifiers. Their models learn how to trigger misclassifications, keeping malicious sites active.
Machine learning in these domains typically relies on a classifier that flags content as malicious based on a fixed set of features. In phishing detection, for example, an ML model is fed a suspicious URL and assesses whether it points to a phishing site. Because the feature set is fixed, adversarial machine learning systems can gradually learn how to get around the classifier and its security controls.
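As a rough illustration of that setup, the sketch below turns a URL into a small, hand-picked feature vector and scores it with a scikit-learn model. The feature set, sample URLs, and labels are invented for the example and are not a prescribed detection recipe.

```python
# A minimal sketch of a fixed-feature URL classifier (illustrative only).
from urllib.parse import urlsplit
from sklearn.linear_model import LogisticRegression

def extract_features(url):
    """Turn a URL into a fixed-length feature vector."""
    parts = urlsplit(url)
    return [
        len(url),                                       # overall length
        url.count("."),                                 # number of dots
        url.count("-"),                                 # hyphens often appear in lookalike domains
        float(any(c.isdigit() for c in parts.netloc)),  # digits in the hostname
        float("@" in url),                              # '@' can hide the real destination
    ]

# Tiny labelled sample (1 = phishing, 0 = benign) just to make the sketch runnable.
urls = [
    ("http://paypa1-login.example-secure.com/verify", 1),
    ("http://accounts.example.com/signin", 0),
    ("http://secure-update.bank-example.net@198.51.100.7/", 1),
    ("https://www.example.org/about", 0),
]
X = [extract_features(u) for u, _ in urls]
y = [label for _, label in urls]

model = LogisticRegression().fit(X, y)
print(model.predict([extract_features("http://login.example-verify.info/reset")]))
```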
These are some of the methods attackers use to bypass machine learning security controls (a rough sketch of how such URL variants might be generated follows the list):
- Subtle modification of characters in a URL
- Redirection of the URL
- Use of special characters in the site hostname
- Encoding of the hostname
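The sketch below shows how an attacker might combine a couple of these perturbations to probe a classifier. The score_url function is only a placeholder for whatever model a defender actually runs, and the homoglyph table, probabilities, and threshold are assumptions made for the example.

```python
# A minimal sketch of probing a URL classifier with small perturbations.
import random
import urllib.parse

HOMOGLYPHS = {"o": "0", "l": "1", "a": "@", "e": "3"}  # common look-alike swaps

def swap_characters(url):
    """Subtly replace one character in the URL with a look-alike."""
    chars = list(url)
    candidates = [i for i, c in enumerate(chars) if c in HOMOGLYPHS]
    if candidates:
        i = random.choice(candidates)
        chars[i] = HOMOGLYPHS[chars[i]]
    return "".join(chars)

def encode_hostname(url):
    """Percent-encode parts of the hostname to obscure known tokens."""
    parts = urllib.parse.urlsplit(url)
    encoded_host = "".join(
        f"%{ord(c):02x}" if c.isalpha() and random.random() < 0.3 else c
        for c in parts.netloc
    )
    return urllib.parse.urlunsplit(parts._replace(netloc=encoded_host))

def score_url(url):
    """Placeholder for the deployed phishing classifier (assumption)."""
    # A real model would return a learned probability; this trivial keyword
    # heuristic just lets the sketch run end to end.
    return 0.9 if "login" in url else 0.2

def probe(url, attempts=20, threshold=0.5):
    """Generate perturbed variants until one slips under the threshold."""
    for _ in range(attempts):
        variant = random.choice([swap_characters, encode_hostname])(url)
        if score_url(variant) < threshold:
            return variant
    return None

print(probe("http://login-example.com/verify"))
```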
Machine learning models can usually recognize these variations because they are trained on examples of exactly these techniques. However, attackers keep improving their capabilities by leveraging adversarial machine learning to automatically generate URLs that bypass an existing system. Attackers often use generative adversarial networks (GANs) to add slight changes to URLs and slip past the phishing detector. The generated URLs share traits with legitimate ones, allowing them to trick the ML classifier.
Here are a few ways to protect security related machine learning deployments:
- Find potential vulnerabilities in the learning process before deployment. Establish a deep understanding of the machine learning technique used in your deployment and eliminate vulnerabilities wherever possible.
- Reduce the potential impact of adversarial attacks by adding extra security layers. For instance, you can train several strong classifiers for the same task and call one of them at random for each request (see the first sketch after this list). Since the classifiers have different architectures, the attacker would have to design an adversarial example that bypasses all of them, which is far less likely.
- Create intelligent countermeasures to improve the security of the machine learning model. One strategy is adversarial training: explicitly re-training an ML model on a training set that includes adversarial samples (see the second sketch after this list).
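As a rough sketch of the randomized-ensemble idea, the snippet below trains three scikit-learn classifiers with different architectures on the same synthetic data and routes each request to one of them at random. The model choices and data are illustrative assumptions, not a prescribed stack.

```python
# A minimal sketch of randomly routing requests across differently-built classifiers.
import random
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

# Stand-in training data; in practice these would be URL or email features.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Train several classifiers with different inductive biases on the same task.
ensemble = [
    LogisticRegression(max_iter=1000).fit(X, y),
    RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X, y),
]

def classify(features):
    """Route each incoming request to a randomly chosen classifier."""
    model = random.choice(ensemble)
    return model.predict([features])[0]

print(classify(X[0]))
```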
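And here is a minimal sketch of adversarial training under similarly simplified assumptions: perturbed copies of the training data are crafted with a crude gradient-sign step (easy here because the example model is linear), the ones the current model gets wrong are added back with their correct labels, and the model is re-fit. The attack, data, and model are all stand-ins for whatever a real deployment would use.

```python
# A minimal sketch of adversarial training on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Craft adversarial-style samples by nudging each point against the model's
# decision boundary (a gradient-sign step for a linear model).
step = 0.5
direction = np.sign(model.coef_[0])
X_adv = X - step * direction * np.where(y[:, None] == 1, 1, -1)

# Keep only the perturbed points the current model now gets wrong.
wrong = model.predict(X_adv) != y
X_aug = np.vstack([X, X_adv[wrong]])
y_aug = np.concatenate([y, y[wrong]])

# Re-train on the augmented set so the model learns from the adversarial samples.
hardened = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
print(f"misclassified before: {wrong.sum()}, "
      f"after: {(hardened.predict(X_adv) != y).sum()}")
```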
Machine learning is a reliable, highly effective method for detecting cyber threats, but attackers are constantly finding ways around security measures. Identifying vulnerabilities and adding security controls around machine learning models will help deter them.