Hey!
In machine learning, “regression to the mean” is an important concept for evaluating models. Simply put, regression to the mean is the idea that unusual events are often followed by more typical ones.
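To see the effect in isolation, here is a minimal simulation sketch in plain Python. It assumes a hypothetical setup that does not come from any real experiment: 1,000 models that all share the same true score of 85, with every measurement perturbed by random noise. The constants `TRUE_SCORE`, `NOISE_SD`, and `N_MODELS` are made up for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

TRUE_SCORE = 85.0   # hypothetical long-run score shared by every model
NOISE_SD = 3.0      # hypothetical measurement noise (standard deviation)
N_MODELS = 1000

# Two independent noisy measurements of each model.
first_round = [rng.gauss(TRUE_SCORE, NOISE_SD) for _ in range(N_MODELS)]
second_round = [rng.gauss(TRUE_SCORE, NOISE_SD) for _ in range(N_MODELS)]

# Select the 10% of models that looked best in the first round.
ranked = sorted(range(N_MODELS), key=lambda i: first_round[i], reverse=True)
top_idx = ranked[: N_MODELS // 10]

top_first_avg = statistics.mean([first_round[i] for i in top_idx])
top_second_avg = statistics.mean([second_round[i] for i in top_idx])

print(f"Top group, first round:  {top_first_avg:.1f}")
print(f"Top group, second round: {top_second_avg:.1f}")
```

The models that looked best in the first round were mostly lucky, so their second-round average falls back toward 85 even though nothing about them changed.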
Consider the following scenario. A machine learning engineer or data scientist trains a model on some data and measures a performance of 98% on that same data. That’s very good. However, when they deploy the model to production, they see a performance of 85% on the first day, then 83%, then 87%, then 84%, and so on. It’s clear that the model’s “mean performance” is likely closer to 85% than to the initial 98%. The performance ultimately regressed to the mean. This example also illustrates another machine learning concept called overfitting. Overfitting occurs when the model performs well on the training data but does not generalize to new data.
A powerful technique in machine learning for catching this problem before deployment is called holdout validation. It works by first setting aside some data and then training the model on the remaining data. After training, the model is evaluated on the held-out data to get a better idea of how it will actually perform in production. You can get an even better estimate by holding out a different portion each time, then retraining and re-evaluating the model on each one. This is known as k-fold cross-validation.
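The idea can be sketched in plain Python. Everything here is an assumption made for illustration: the synthetic data, the 20% label noise, and the toy 1-nearest-neighbour model, which was chosen because it memorizes its training data and therefore overfits badly. In practice you would more likely reach for scikit-learn's `train_test_split` and `cross_val_score`.

```python
import random

def make_noisy_data(n, rng, noise=0.2):
    """Points in [0, 1); the true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label
        data.append((x, label))
    return data

def predict_1nn(train, x):
    """1-nearest-neighbour: copy the label of the closest training point."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]

def accuracy(train, eval_set):
    """Fraction of eval_set that a 1-NN model built on `train` labels correctly."""
    correct = sum(predict_1nn(train, x) == y for x, y in eval_set)
    return correct / len(eval_set)

rng = random.Random(0)
data = make_noisy_data(500, rng)

# Holdout validation: set aside 100 points, "train" on the other 400.
holdout, train = data[:100], data[100:]
train_acc = accuracy(train, train)      # evaluated on data the model has memorized
holdout_acc = accuracy(train, holdout)  # evaluated on data the model has never seen

# k-fold cross-validation: hold out a different fifth each time and retrain.
k = 5
fold_size = len(data) // k
fold_scores = []
for i in range(k):
    fold = data[i * fold_size:(i + 1) * fold_size]
    rest = data[:i * fold_size] + data[(i + 1) * fold_size:]
    fold_scores.append(accuracy(rest, fold))

print(f"Training accuracy: {train_acc:.2f}")
print(f"Holdout accuracy:  {holdout_acc:.2f}")
print("Fold accuracies:  ", [round(score, 2) for score in fold_scores])
```

The gap between the training score and the held-out scores is the warning sign of overfitting, and the spread across folds gives a rough sense of how much a single holdout estimate can vary.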
How to Use This Mental Model in Daily Life
- Avoid the Primacy Effect: This is a cognitive bias in which people tend to give greater importance to the first piece of information they encounter. That first piece of information may have been abnormal. Consider getting second opinions, crowdsourcing, running more experiments, etc.
- Stay Level-Headed: If you get a major win, don’t let it go to your head and cloud your judgment. It could have been random chance or beginner’s luck. Additionally, if you have a major loss, it may have been an unfortunate happenstance. Try again, and don’t give up!
- Be Aware of the Hedonic Treadmill: The hedonic treadmill is the idea that people have a baseline level of happiness that they tend to return to. If you’re having a tough time right now, things will likely get better if they’ve been better in the past. However, if you’re having a great time right now, things may soon return to normal, so enjoy it while it lasts.
Announcements
- I released a free PDF! Fundamental ML Algorithms: From Theory to Scikit-learn Code is a document that contains key ML terminology, clear explanations of ML algorithms along with the math behind them, code examples, and more! Click on the hyperlink above if you’re interested!
My New Content
- YouTube Video – I Read the ML System Design Interview Book by Alex Xu – Honest Review
- YouTube Video – The Only 3 Functions You Need for ML: Scikit-Learn Tutorial
- YouTube Video – Get Out of Jupyter Notebooks! Beginner Machine Learning Project
- YouTube Video – Learn Scikit-Learn Now! Crash Tutorial with Code
- YouTube Video – Learn PyTorch Now! Crash Tutorial with Code
- YouTube Video – Can LSTM Predict Movie Reviews? (PyTorch+Word2Vec+LSTMs)
Michael Hammer
Read all my newsletters here: https://michaelphammer.com/newsletter/