Publications
Our documented research findings
Malware Data Science: Attack Detection and Attribution
De-anonymizing programmers via code stylometry
Crafting adversarial input sequences for recurrent neural networks
Git blame who?: Stylistic authorship attribution of small, incomplete source code fragments
MEADE: Towards a Malicious Email Attachment Detection Engine
Towards Principled Uncertainty Estimation for Deep Neural Networks
SeqDroid: Obfuscated Android Malware Detection Using Stacked Convolutional and Recurrent Neural Networks
A Deep Learning Approach to Fast, Format-Agnostic Detection of Malicious Web Content
A Simple and Agile Cloud Infrastructure to Support Cybersecurity Oriented Machine Learning Workflows
Generating up-to-date, well-labeled datasets for machine learning (ML) security models is a unique engineering challenge, as large data volumes, the complexity of labeling, and constant concept drift make it difficult to generate effective training datasets. Here we describe a simple, resilient cloud infrastructure for generating ML training and testing datasets that has increased the speed at which our team can research a multitude of security ML models and keep them in production.