Author: Richard Harang
CatBERT: Context-Aware Tiny BERT for Detecting Targeted Social Engineering Emails
Targeted phishing emails are a major cyber threat on the Internet today and are insufficiently addressed by current defenses. In this paper, we leverage industrial-scale datasets from the Sophos cloud email security service, which defends tens of millions of customer mailboxes, to propose a novel Transformer-based architecture for detecting targeted phishing emails. Our model uses both natural language and email header inputs and is more computationally efficient than competing Transformer approaches. We also show that it is less prone to adversarial attacks that deliberately replace keywords with typos or synonyms.
A machine learning approach to inferring the maliciousness of unknown IP addresses, autonomous systems, and ISPs
The machine learning-based detection technologies we build at Sophos AI rely on many information sources, including binary programs, system […]
How SophosAI Stops BEC gift card scams
Gift cards are a favorite way for scammers to squeeze money out of their victims. Unlike wire or bank transfers, […]
Sophos-ReversingLabs (SOREL) 20 Million sample malware dataset
The Sophos AI team is excited to announce the release of SOREL-20M (Sophos-ReversingLabs – 20 million) – a production-scale dataset […]
ML Expectation vs. Reality, Part 2: Doing the Actual Analysis
So, you’ve followed the advice in Part 1 of this series. Now you’ve got a nice big data set and you’re pretty sure that […]
How much malware is out there, anyway?
When we move machine learning models from the lab to the real world, tracking and evaluating model performance becomes […]
Debugging Deep Learning Models
Attention conservation notice: This is a slightly expanded version of a Twitter thread I posted back in January 2020. If […]