Wanna build ML models that generate business value? 📈
You need to think beyond "abstract" evaluation metrics (like mean-squared error or accuracy).
Add business metrics to the combo 🚀 💼
This is how you do it ↓
Example: Imagine you work at Tesla, building the next generation of self-driving cars 🚗
You wanna build a better version of the autopilot system, which decides in real-time what the car should do next.
You have historical data with labels you can use to train your ML model. In this case, a classifier with 4 possible outputs (minimal training sketch after the list):
⬆️ - go straight
⬅️ - turn left
➡️ - turn right
✋ - stop
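Here's a minimal sketch of what that training step could look like (a toy example with scikit-learn; the features, labels, and model choice are all invented for illustration, nothing like the real autopilot stack):

```python
# Toy sketch: train a 4-way action classifier on made-up sensor features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

ACTIONS = ["straight", "left", "right", "stop"]  # the 4 possible outputs

# Fake "historical data": 10k samples, 16 sensor features, labels 0..3
rng = np.random.default_rng(42)
X = rng.normal(size=(10_000, 16))
y = rng.integers(0, 4, size=10_000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"accuracy = {accuracy:.4%}")  # the "abstract" metric
```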
And you manage to build a model with 99.99% accuracy.
Is this accuracy *good*, or not?
To answer this, you need to translate this abstract accuracy metric into something meaningful for the business.
For example, what is the likelihood of a car crash?
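Quick back-of-envelope to see why the raw accuracy number is not enough on its own (the 10-decisions-per-second rate is an assumption, made up for illustration):

```python
# Translate accuracy into wrong decisions per hour.
# ASSUMPTION: the autopilot makes ~10 decisions per second (made-up rate).
accuracy = 0.9999
decisions_per_hour = 10 * 60 * 60  # 36,000 decisions/hour
errors_per_hour = (1 - accuracy) * decisions_per_hour
print(f"{errors_per_hour:.1f} wrong decisions per hour")  # -> 3.6
```

Not every wrong decision causes a crash, which is exactly why you need to estimate the crash likelihood directly.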
To greenlight your new autopilot system, the team needs to ensure that the likelihood of a car crash is
→ lower than the current system's (baseline 1)
→ lower than the probability of a crash when a human drives the car (baseline 2)
To compute the likelihood of a car crash, you need to
→ immerse the ML agent into a traffic simulation engine.
→ let it drive as much as possible, and
→ record every crash event.
This way you get your crash likelihood.
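A minimal sketch of that loop, assuming a hypothetical `TrafficSim` engine with a `run_episode` method (both invented here; real simulators have their own APIs):

```python
# Monte Carlo estimate of crash likelihood inside a traffic simulator.
# ASSUMPTION: `TrafficSim`, `run_episode`, and `outcome.crashed` are
# hypothetical placeholders, not a real simulator API.

def estimate_crash_rate(model, sim, n_episodes: int = 100_000) -> float:
    """Let the agent drive n_episodes times and record every crash event."""
    crashes = 0
    for _ in range(n_episodes):
        outcome = sim.run_episode(policy=model)  # drive one full episode
        if outcome.crashed:
            crashes += 1
    return crashes / n_episodes  # crash likelihood per episode

# Usage (once you have a simulator): estimate_crash_rate(model, TrafficSim())
```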
You compare this metric with the 2 baselines and decide if the model is "good" when
(your_system_crash < baseline_1) AND (your_system_crash < baseline_2)
If either one of these inequalities does not hold, the model is NOT good enough, and you need to work on it further.
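The greenlight check itself, as code (the crash rates below are placeholder numbers, not real figures):

```python
# Greenlight rule: the new system must beat BOTH baselines.
crash_rate_new     = 1.2e-6  # your system, from the simulation above
crash_rate_current = 1.5e-6  # baseline 1: current autopilot system
crash_rate_human   = 2.0e-6  # baseline 2: human drivers

is_good = (crash_rate_new < crash_rate_current) and (crash_rate_new < crash_rate_human)
print("ship it 🚀" if is_good else "back to training 🔁")
```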
To sum up,
→ Real-world ML models are ultimately evaluated in terms of business metrics.
→ An ML model is "good" when its implied business metric beats the status quo (aka current baseline).
Wanna become a freelance data scientist?
Join my e-mail list and get my eBook "How to become a freelance data scientist", for FREE ↓
freelance-data-science.carrd.co
Every week I share real-world Data Science/Machine Learning content.
Follow me @paulabartabajo_ so you do not miss what's coming next.
Wanna help?
Like/Retweet the first tweet below to spread the wisdom ↓↓↓