Hey!
In machine learning (ML), there is a balance between applying your expertise and acting as if you know nothing. Expertise is a powerful tool when deciding what information to train your ML algorithm on, but it can also introduce consequential biases: it can steer you away from unusual features that would have been very useful. In machine learning, a “feature” is a measurable aspect of your data (e.g., age, gender, class).
For example, imagine an expert on the Titanic attempting to predict which passengers survived using an ML algorithm. Knowing that passengers were segregated by class, the expert might select “class” as a useful feature. However, that same expertise might blind them to the fact that the single greatest predictor is the passenger’s gender. Think of the common phrase: “women and children first”.
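If you want to check that yourself, here is a rough sketch in Python, assuming the Titanic sample dataset that ships with seaborn and a scikit-learn logistic regression (both choices are mine, just for illustration). It scores each candidate feature on its own:

```python
import seaborn as sns
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Titanic sample data bundled with seaborn (assumed to be available)
titanic = sns.load_dataset("titanic")
titanic["sex_num"] = (titanic["sex"] == "female").astype(int)  # encode gender as 0/1
y = titanic["survived"]

# How well does each feature predict survival on its own?
for feature in ["pclass", "sex_num"]:
    X = titanic[[feature]]
    score = cross_val_score(LogisticRegression(), X, y, cv=5).mean()
    print(f"{feature}: mean cross-validated accuracy {score:.2f}")
```

Comparing features head to head like this is exactly the kind of check that class-focused expertise might never think to run.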
Machine learning has several techniques that can help mitigate this:
- Avoid data snooping. Your expertise may tempt you to manipulate the features until the algorithm gives you the results you expect, which is a form of confirmation bias. Set aside a “test set” to evaluate your algorithm and avoid looking at it: once you’ve seen the test data, you may think you know what the algorithm should do, and if it doesn’t, you might tweak it until it does. (See the first sketch after this list.)
- Use algorithms that drop features without your intervention. Lasso regression and decision trees are two algorithms that can ignore features they determine not to be useful. (See the lasso sketch after this list.)
- Use backward elimination: train a model on all available features, remove the least useful feature, retrain, remove another feature, retrain, and so on. Continue that cycle until removing a feature no longer improves the model’s results. Letting the model’s performance choose the features keeps the expert’s biases out of the loop. (See the backward-elimination sketch after this list.)
- Use dimensionality reduction. These techniques build new features that summarize your data. A feature can be thought of as a dimension on a graph, and by reducing the number of dimensions, these techniques keep only the combinations that carry the most information. (See the PCA sketch after this list.)
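Here is a minimal sketch of the data-snooping point, using scikit-learn on stand-in data (swap in your own features and labels): split once, do all your exploring and tuning on the training portion, and only touch the test portion at the very end.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Stand-in data; substitute your own feature table X and labels y
X, y = make_classification(n_samples=500, n_features=10, n_informative=4, random_state=0)

# Hold out a test set up front and then leave it alone
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Explore, tweak features, and tune using the training split only
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Look at the test set once, at the very end, for an honest estimate
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```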
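For automatic feature dropping, here is a small lasso sketch, again on stand-in data; the penalty strength `alpha=1.0` is just an illustrative choice. The lasso penalty pushes the coefficients of unhelpful features to exactly zero, which effectively removes them for you.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Stand-in data: 10 features, but only 3 actually carry signal
X, y = make_regression(n_samples=300, n_features=10, n_informative=3, noise=5.0, random_state=0)

# The lasso penalty shrinks unhelpful coefficients all the way to zero
lasso = Lasso(alpha=1.0).fit(X, y)

dropped = np.where(lasso.coef_ == 0)[0]
print("coefficients:", lasso.coef_.round(1))
print("dropped feature indices:", dropped)
```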
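The backward-elimination loop can be written in a few lines. This sketch (stand-in data and an arbitrary logistic regression) drops whichever feature hurts cross-validated accuracy the least and stops once removing anything no longer improves the score.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Stand-in data; substitute your own features and labels
X, y = make_classification(n_samples=400, n_features=8, n_informative=3, random_state=0)

features = list(range(X.shape[1]))
model = LogisticRegression(max_iter=1000)
best = cross_val_score(model, X[:, features], y, cv=5).mean()

while len(features) > 1:
    # Score the model with each remaining feature removed in turn
    scores = {f: cross_val_score(model, X[:, [g for g in features if g != f]], y, cv=5).mean()
              for f in features}
    # The least useful feature is the one whose removal hurts the score the least
    least_useful, score = max(scores.items(), key=lambda kv: kv[1])
    if score <= best:  # removing any feature no longer improves the model: stop
        break
    best = score
    features.remove(least_useful)

print("kept feature indices:", features)
print("cross-validated accuracy:", round(best, 3))
```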
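And finally, a dimensionality-reduction sketch using PCA (principal component analysis), one common technique: it builds new component features that capture most of the variation in the original ones. The 95% threshold here is just an example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA

# Stand-in data with 20 original features
X, y = make_classification(n_samples=300, n_features=20, n_informative=4, random_state=0)

# Keep enough new components to explain 95% of the variance in the data
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)

print("original dimensions:", X.shape[1])
print("reduced dimensions:", X_reduced.shape[1])
print("variance explained by each component:", pca.explained_variance_ratio_.round(2))
```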
In life and in machine learning, we should be careful of our biases!
I hope you found this month’s newsletter thought-provoking.