A Deep Learning Approach to Fast, Format-Agnostic Detection of Malicious Web Content
Next Gen Web
Sophos AI is working to make the web a safer place, using deep neural networks to detect malicious URLs, warn users about phishing sites, and block malware delivery at the source. These deep learning models add further layers of security to Sophos’s synchronized security architecture, stopping threats before they ever reach your network.
Detecting Malicious URLs and Stopping the Attack Early
Any good attack chain usually involves tricking users at some point, whether it’s getting them to run a malicious file because […]
Garbage In, Garbage Out: How Purportedly Great Machine Learning Models can be Screwed Up by Bad Data
As processing power and deep learning techniques have improved, deep learning has become a powerful tool to detect and classify […]
Getting Insight Out Of and Back Into Deep Neural Networks
Deep learning has emerged as a powerful tool for classifying malicious software artifacts; however, the generic black-box nature of these […]
eXpose: A Character-Level Convolutional Neural Network with Embeddings For Detecting Malicious URLs, File Paths and Registry Keys
For years, security machine learning research has promised to obviate the need for signature-based detection by automatically learning to detect indicators of attack. Unfortunately, this vision hasn’t come to fruition: in fact, developing and maintaining today’s security machine learning systems can require engineering resources comparable to those of signature-based detection systems, due in part to the need to develop and continuously tune the “features” these machine learning systems look at as attacks evolve. Deep learning, a subfield of machine learning, promises to change this by operating on raw input signals and automating the process of feature design and extraction. In this paper we propose the eXpose neural network, which uses a deep learning approach we have developed to take generic, raw short character strings as input (a common case for security inputs, which include artifacts like potentially malicious URLs, file paths, named pipes, named mutexes, and registry keys), and learns to simultaneously extract features and classify using character-level embeddings and a convolutional neural network. In addition to completely automating the feature design and extraction process, eXpose outperforms baselines based on manual feature extraction on all of the intrusion detection problems we tested it on, yielding a 5%-10% detection rate gain at a 0.1% false positive rate compared to these baselines.
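To make the idea of character-level embeddings feeding a convolutional network more concrete, here is a minimal sketch of a forward pass in plain Python. It is an illustration only, not the eXpose implementation. The layer sizes, the ASCII-only vocabulary, the use of max pooling, the single sigmoid output, and every name in it (`encode`, `conv_max_pool`, `score`, `MAX_LEN`, and so on) are assumptions chosen for brevity. The weights are random and untrained, so the score carries no meaning about maliciousness.

```python
import math
import random

# All sizes below are illustrative assumptions, not values from the paper.
MAX_LEN = 32            # fixed input length: strings are truncated or padded
EMBED_DIM = 4           # width of each character embedding
NUM_FILTERS = 3         # filters per kernel width
KERNEL_WIDTHS = (2, 3)  # window sizes, loosely analogous to character n-grams
VOCAB_SIZE = 128        # ASCII code points; id 0 doubles as padding/unknown

rng = random.Random(0)  # fixed seed so the untrained weights are reproducible


def rand_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


EMBEDDINGS = rand_matrix(VOCAB_SIZE, EMBED_DIM)
# One filter bank per width; each filter spans `width` embedding vectors.
FILTERS = {w: rand_matrix(NUM_FILTERS, w * EMBED_DIM) for w in KERNEL_WIDTHS}
OUTPUT_WEIGHTS = [rng.uniform(-0.5, 0.5)
                  for _ in range(NUM_FILTERS * len(KERNEL_WIDTHS))]


def encode(text):
    """Map a string to exactly MAX_LEN integer ids (non-ASCII -> 0, pad with 0)."""
    ids = [ord(c) if ord(c) < VOCAB_SIZE else 0 for c in text[:MAX_LEN]]
    return ids + [0] * (MAX_LEN - len(ids))


def conv_max_pool(embedded, width):
    """Apply each filter to every window of `width` characters, then ReLU and max-pool."""
    pooled = []
    for filt in FILTERS[width]:
        best = 0.0  # starting at 0 makes the max equivalent to ReLU followed by max
        for start in range(len(embedded) - width + 1):
            window = [x for vec in embedded[start:start + width] for x in vec]
            best = max(best, sum(a * b for a, b in zip(filt, window)))
        pooled.append(best)
    return pooled


def score(text):
    """Return a value in (0, 1); meaningless here because nothing has been trained."""
    embedded = [EMBEDDINGS[i] for i in encode(text)]
    features = []
    for width in KERNEL_WIDTHS:
        features.extend(conv_max_pool(embedded, width))
    logit = sum(w * f for w, f in zip(OUTPUT_WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    print(score("http://example.com/login.php?id=1"))
```

The point of the sketch is the data flow: the raw string is the only input, and the embedding table and filters (which training would adjust) take the place of hand-written feature extractors. In practice a model like this would be built and trained with a deep learning framework rather than written by hand.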