7 Tweets 18 reads Sep 06, 2023
Outliers do not fit in with the rest of the data.
But how can an ML model identify them?
Let me introduce one-class classification.
1/7
Standard classification distinguishes between two or more classes, and the training set contains examples from every class.
One-class classification, on the other hand, has only the target class.
The training data contains examples from that one class and nothing else.
2/7
How can only one class be useful?
If we can identify what belongs to the class, we can also identify what doesn't belong to the class.
Consider the example below.
3/7
In a factory, we have a running machine that works perfectly 99.99% of the time.
It is cheap and easy to get data from this working machine. But sampling a faulty machine would be expensive and hard.
To get faulty examples, you would have to damage the machine in several different ways.
4/7
One-class classification can be used in this case.
You only use data from the working machine to train the model.
If something goes wrong, the new data will look very different from what the model learned, and the model can raise a red flag.
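To make the idea concrete, here is a minimal sketch in plain Python. It is not a specific algorithm from this thread: it uses a simple per-feature z-score envelope as a stand-in for a one-class model. The class name `GaussianEnvelope`, the two sensor features (temperature and vibration), and the threshold of 4.0 are all made up for illustration. In practice you would more likely reach for a library model such as scikit-learn's `OneClassSVM` or `IsolationForest`.

```python
import random
import statistics


class GaussianEnvelope:
    """Toy one-class model: learns per-feature mean and std from normal data only."""

    def __init__(self, z_threshold=4.0):
        # Hypothetical cutoff: how many standard deviations count as "too far"
        self.z_threshold = z_threshold
        self.means = []
        self.stds = []

    def fit(self, rows):
        # Transpose rows into one column per feature
        columns = list(zip(*rows))
        self.means = [statistics.fmean(col) for col in columns]
        self.stds = [statistics.stdev(col) for col in columns]
        return self

    def is_anomaly(self, row):
        # Flag the reading if any single feature is far from what was seen in training
        z_scores = [abs(x - m) / s for x, m, s in zip(row, self.means, self.stds)]
        return max(z_scores) > self.z_threshold


# Simulated readings from the working machine only (assumed features:
# temperature in degrees C, vibration in mm/s). No faulty examples are used.
rng = random.Random(0)
healthy = [(rng.gauss(70.0, 2.0), rng.gauss(1.0, 0.1)) for _ in range(1000)]

model = GaussianEnvelope(z_threshold=4.0).fit(healthy)

typical_reading = (70.5, 1.02)   # close to the training data
faulty_reading = (95.0, 3.0)     # far outside anything seen in training

print(model.is_anomaly(typical_reading))
print(model.is_anomaly(faulty_reading))
```

The model never sees a broken machine. It only learns what "working" looks like, and the far-off reading gets flagged while the typical one does not.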
5/7
One-class classification is especially useful in 'catastrophe detection' or anomaly detection:
- Motor failure detection
- Nuclear plant monitoring
- Airplane gearbox monitoring
6/7
That's it for today.
I hope you've found this thread helpful.
Like/Retweet the first tweet below for support and follow @levikul09 for more Data Science threads.
Thanks 😉
7/7
