as machine learning & AI systems grow in popularity, we are right to be concerned about their security.
after reading papers & talking to peers, i realized that someone should probably explain the field in layman's terms. here's my attempt! (1/14)
most big systems attract malicious actors.
ex: corrupt politicians in the government, "mean girls" in middle school social dynamics
there are always malicious people trying to deceive ML systems. ex: uploading an inappropriate video to YouTube Kids (2/14)
let's say a malicious person intentionally creates some input (inappropriate video) to fool an ML system (YouTube's filter for appropriateness). we call such an input an *adversarial example.* (Goodfellow et al. 2017, Gilmer et al. 2018) (3/14)
since the story of machine learning security started with imperceptible perturbations, the research community jumped on this train! many researchers started constraining the definition of an adversarial example to this small, imperceptibly changed input. uh oh... (5/14)
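for the technically curious: here's a minimal PyTorch sketch of the classic "fast gradient sign method" (Goodfellow et al.), which crafts exactly this kind of small, imperceptibly changed input by nudging every pixel a tiny bit in the direction that increases the model's loss. the `classifier`, `x`, and `y` below are placeholder names i'm assuming, not code from any of the papers above, so treat this as a sketch rather than a recipe.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, images, labels, eps=8 / 255):
    """Craft "imperceptibly changed" inputs: move each pixel by at most
    eps in the direction that increases the model's loss."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # one signed gradient step, then keep pixels inside the valid [0, 1] range
    adversarial = images + eps * images.grad.sign()
    return adversarial.clamp(0, 1).detach()

# hypothetical usage: `classifier` is any trained image classifier,
# `x` is a batch of images scaled to [0, 1], `y` holds the true labels.
# x_adv = fgsm_example(classifier, x, y)
# classifier(x_adv) often gets the label wrong even though
# (x_adv - x).abs().max() <= 8/255, i.e. the change is invisible to us.
```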
naturally, much of the initial literature on adversarial examples focused on this narrow definition. but malicious actors don't care about a specific kind of adversarial example. in practice, many ML systems are broken by simple things like image rotations or stickers. (6/14)
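to make that concrete, here's a rough sketch (PyTorch + torchvision, placeholder names again, and assuming a recent torchvision where rotate accepts image tensors) of the kind of sanity check that catches the "simple things": just rotate the inputs and see how much accuracy you lose. no clever optimization needed.

```python
import torch
import torchvision.transforms.functional as TF

def rotation_robustness(model, images, labels, angles=(-30, -15, 0, 15, 30)):
    """How often does the classifier keep the right answer when the
    image is simply rotated? No gradients or optimization involved."""
    accuracy_per_angle = {}
    with torch.no_grad():
        for angle in angles:
            rotated = TF.rotate(images, angle)          # plain geometric rotation
            predictions = model(rotated).argmax(dim=1)  # predicted classes
            accuracy_per_angle[angle] = (predictions == labels).float().mean().item()
    return accuracy_per_angle  # ex: {0: 0.94, 15: 0.71, 30: 0.43}
```

the numbers in that last comment are made up; the point is just that accuracy can fall off a cliff for transformations a human wouldn't even register as an "attack."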
after a multi-year battle between adversarial attacks and defenses, we saw a "showdown": Athalye et al. 2018 took the best defenses from the previous top AI conference and found adversarial examples that broke them all. oh snap! what do we do now? (7/14)
one line of thinking: are humans, the best machines we have right now, vulnerable to adversarial examples? yes! it can be as simple as knocking down stop signs so drivers run through intersections, or as complicated as adding specific perturbations to images (Elsayed et al. 2018). (8/14)
what does it mean for an ML system to be "good enough"? Ilyas et al. 2019 argue that adversarial examples aren't "bugs." they say inputs (ex: a cat image) have robust features (ears, eyes, etc.) and non-robust features (random other pixels, patterns we can't see). (9/14)
if someone messes with a robust feature (ex: mess up a cat's face), we humans will be confused, because we care about robust features way more! current ML systems care about non-robust features a lot, so "imperceptible perturbations" make successful adversarial examples. (10/14)
another line of thinking: how can we measure the vulnerability of an ML system? is it possible to prove that ML systems are robust to certain types of adversarial examples (ex: "imperceptibly changed" examples)? researchers are beginning to work on this. (11/14)
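one rough but concrete way to put a number on "how vulnerable," reusing the fgsm sketch from earlier (placeholder names, my own assumption of how you'd wire it up): sweep the perturbation budget eps and report how much accuracy survives the attack.

```python
import torch

def empirical_robustness_curve(model, attack, images, labels,
                               epsilons=(0.0, 2 / 255, 4 / 255, 8 / 255, 16 / 255)):
    """For each perturbation budget eps, report the accuracy that
    survives the attack (eps = 0 is ordinary clean accuracy)."""
    curve = {}
    for eps in epsilons:
        adversarial = images if eps == 0 else attack(model, images, labels, eps)
        with torch.no_grad():
            predictions = model(adversarial).argmax(dim=1)
        curve[eps] = (predictions == labels).float().mean().item()
    return curve

# hypothetical usage, plugging in the earlier sketch:
# print(empirical_robustness_curve(classifier, fgsm_example, x, y))
```

note that this is only an empirical measurement: a failed attack tells you that this particular attack failed, which is exactly why the certification work above tries to prove that no adversarial example exists within the budget at all.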
as ML systems (ex: facial recognition) have just recently begun to spread, we're going to see types of adversarial examples we didn't anticipate. we *need* to all be on the same page about what adversarial examples are so we don't miss them. (12/14)
there's no doubt that the growth of ML use will bring up new types of problems. every new technology does this (ex: smartphones, the Internet, social media). for many understandable reasons (which i won't get into), people are afraid of the consequences of ML systems. (13/14)
humans are great at finding problems to solve. i don't think "staying away from problems" is a good reason to stay away from developing AI.
yes, there is high risk involved. but high rewards come from high risk. i choose to be an optimist. i hope you do, too. (14/14)
some great blog posts (in my opinion):
* Unsolved research problems vs. real-world threat models: medium.com/@catherio/unsolved-research-problems-vs-real-world-threat-models-e270e256bc9e (@catherineols)
* Is attacking machine learning easier than defending it?: cleverhans.io (@goodfellow_ian and @NicolasPapernot)