AI DevOps: Behind the Scenes of a Global AV’s Machine Learning Infrastructure

Presented by Alex Long at DEFCON 26 AI Village

Synopsis

Thus far, the security community has treated machine learning as a research problem. The painful oversight has been assuming that laboratory results would translate easily to the real world and, as a result, not devoting sufficient focus to bridging that gap. Researchers enjoy the luxury of neat, bite-sized datasets to experiment on, but the harsh reality of millions of potentially malicious files streaming in daily soon hits would-be ML practitioners in the face like a tsunami-sized splash of ice water. And while in research there’s no such thing as “too much” data, dataset sizes confront real-world cybersecurity professionals with tough questions: “How will we store these files efficiently without hampering our ability to use them for day-to-day operations?” or “How do we satisfy competing use cases, such as the need to analyze specific files and the need to run analyses across the entire dataset?” Or maybe most importantly: “Will my boss have a heart attack when he sees my AWS bill?”

In this talk, we give an overview of how our system works and how it answers the difficult questions of real-world ML such as the ones listed above. The talk provides a rare look into the guts of a large-scale machine learning production system. As a result, it will give audience members the tools and understanding to confidently tackle such problems themselves and, ultimately, a bedrock of immediately practical knowledge for deploying large-scale, on-demand deep learning in the cloud.