I recently said that statistics helps us understand our data in a way that machine learning does not.
Use of the word "understand" proved extremely controversial.
Here's a (short) thread on what it means for statistics to help us understand our data:
A lot of people interpreted "understand" as meaning that statistics is highly likely to give you the true causal relationships in your data.
I think this focus on "causation" as the only form of useful understanding is wrongheaded.
Traditional statistical methods often give you a structured framework of associations between your variables.
I would argue that even if these relationships are not causal, this is still a useful way of learning about what's going on.
Much like a repair technician might think about how certain symptoms are associated with various mechanical problems, statistical associations can give you a reasonable framework for hypothesizing about what *might* be going on.
You can then follow up with more focused investigations to try to narrow down the possibilities.
Statistical models are often WRONG, but they do give you a USEFUL, data-validated framework for thinking about new scenarios.
That's not nothing.
Follow me if you don't want to miss out on more fun discussions about how statistics helps make our lives better.