Garbage In, Garbage Out

Hey!

In machine learning, people talk about “Garbage In, Garbage Out” when referring to the data used to train an ML model. This mental model can be read in two ways.

  1. If you use bad data (aka garbage) to train your model then you’ll get a model that performs poorly (aka a garbage model).
  2. If you train a good model using good data, but then try to make predictions using bad data (garbage), then you’ll get bad predictions (aka more garbage).

 

It doesn’t matter how powerful your ML algorithm is: if you use bad data, you will get bad results.
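Here is a small sketch of both readings, in case seeing it in code helps. Everything in it is made up for illustration: the "true" relationship y = 2x + 1, the helper names fit_line and mean_abs_error, and the choice to create garbage by putting the training labels in the wrong order. It is a toy, not a recipe for a real project.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_abs_error(model, xs, ys):
    """Average absolute gap between the model's predictions and the truth."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


# Assumed "true" relationship for this toy example: y = 2x + 1
xs = list(range(10))
clean_ys = [2 * x + 1 for x in xs]
garbage_ys = list(reversed(clean_ys))  # same labels, recorded in the wrong order

test_xs = [10, 11, 12]
test_ys = [2 * x + 1 for x in test_xs]

# Reading 1: garbage training data gives a garbage model.
good_model = fit_line(xs, clean_ys)
garbage_model = fit_line(xs, garbage_ys)
good_error = mean_abs_error(good_model, test_xs, test_ys)
garbage_error = mean_abs_error(garbage_model, test_xs, test_ys)

# Reading 2: a good model fed a garbage input still gives a garbage prediction.
slope, intercept = good_model
good_prediction = slope * 5 + intercept       # the real input was 5
garbage_prediction = slope * 500 + intercept  # someone typed 500 by mistake

print("error with clean training data:  ", good_error)
print("error with garbage training data:", garbage_error)
print("prediction from good input:   ", good_prediction)
print("prediction from garbage input:", garbage_prediction)
```

The algorithm is identical in every case. Only the data changes, and the results change with it.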

There are three components of this mental model:

  1. The Inputs (Data)
  2. The Tool (ML Algorithm)
  3. The Results (ML Model’s Output)

 

“Garbage In, Garbage Out” focuses on The Inputs and The Results. However, this doesn’t mean we should disregard The Tool. There is another concept in machine learning called “bias”. An ML model has high “bias” when it is too simple to capture the patterns in the data you give it, no matter how clean that data is (this is often called underfitting). In that case we should probably choose a different tool (ML algorithm) instead of trying to get better data.
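To make the “bias” idea concrete, here is a rough sketch where the data is perfectly clean but the tool is too simple. The curved data (y = x squared), the helper names, and the trick of swapping in a squared feature as the "better tool" are all assumptions chosen for illustration.

```python
def fit_line(features, ys):
    """Least-squares fit of y = slope * feature + intercept."""
    n = len(features)
    mean_f = sum(features) / n
    mean_y = sum(ys) / n
    cov = sum((f - mean_f) * (y - mean_y) for f, y in zip(features, ys))
    var = sum((f - mean_f) ** 2 for f in features)
    slope = cov / var
    intercept = mean_y - slope * mean_f
    return slope, intercept


def mean_abs_error(model, features, ys):
    """Average absolute gap between the model's predictions and the truth."""
    slope, intercept = model
    return sum(abs(slope * f + intercept - y) for f, y in zip(features, ys)) / len(ys)


# Perfectly clean data, but the relationship is curved: y = x squared
xs = list(range(-5, 6))
ys = [x ** 2 for x in xs]

# Tool 1: a straight line in x. It cannot bend, so it underfits (high bias).
linear_model = fit_line(xs, ys)
linear_error = mean_abs_error(linear_model, xs, ys)

# Tool 2: a straight line in x squared, which is able to follow the curve.
squared_xs = [x ** 2 for x in xs]
quadratic_model = fit_line(squared_xs, ys)
quadratic_error = mean_abs_error(quadratic_model, squared_xs, ys)

print("error with the straight-line tool:", linear_error)
print("error with the curved tool:       ", quadratic_error)
```

The data never changed between the two runs. Collecting more or cleaner data would not have rescued the straight line, but switching tools did.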

I think these ideas are useful to apply to our own lives as well. When we are getting bad results we can consider two things:

  1. The Tool: Is the tool we are using capable of getting the results we want?
  2. The Inputs: Are we giving good inputs to the tool?

 

Example:

  • Problem with the Tool: If you have been studying really hard but your grades aren’t improving, then maybe your study techniques need updating.
  • Problem with the Inputs: If you have been using good study techniques but your grades aren’t improving, then maybe you’re not studying the right material.

 

I hope you found this month’s newsletter thought-provoking!

Michael Hammer

Read all my Newsletters here: https://michaelphammer.com/newsletter/
