Machine Learning: Shattering Simplified

In machine learning, “shattering” refers to the ability of a hypothesis class to realize every possible labeling of a given set of points: a class shatters a set of training examples if, for each of the 2^n ways of assigning binary labels to those n examples, some hypothesis in the class classifies all of them correctly. In simpler terms, it captures the ability of a model family to fit any labeling of a given training set perfectly.
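As a concrete illustration, the class of one-dimensional “signed threshold” classifiers (predict +1 on one side of a cutoff, −1 on the other, with either orientation) shatters any two distinct points but no set of three. The sketch below checks this by brute force; the function names and the choice of hypothesis class are illustrative, not taken from any particular library:

```python
from itertools import product


def realizable_labelings(points):
    """All labelings of `points` achievable by 1-D signed thresholds.

    Illustrative hypothesis class: h(x) = s if x > t else -s, for any
    threshold t and sign s in {+1, -1}. For a finite point set only
    finitely many thresholds matter, so we enumerate one candidate
    below, between, and above the sorted points.
    """
    xs = sorted(points)
    thresholds = [xs[0] - 1.0]
    thresholds += [(a + b) / 2.0 for a, b in zip(xs, xs[1:])]
    thresholds += [xs[-1] + 1.0]
    labelings = set()
    for t in thresholds:
        for s in (+1, -1):
            labelings.add(tuple(s if x > t else -s for x in points))
    return labelings


def shatters(points):
    """True if the class realizes all 2^n labelings of `points`."""
    needed = set(product((-1, +1), repeat=len(points)))
    return needed <= realizable_labelings(points)


print(shatters([0.0]))            # True
print(shatters([0.0, 1.0]))       # True: all 4 labelings are realizable
print(shatters([0.0, 1.0, 2.0]))  # False: e.g. (+1, -1, +1) is impossible
```

Because some pair of points is shattered but no triple is, this class has VC dimension 2, which leads directly into the next idea.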

The term comes from the theory of the VC (Vapnik–Chervonenkis) dimension, which measures the capacity of a hypothesis class: the VC dimension is the size of the largest set of points the class can shatter. A hypothesis class with a high VC dimension can shatter large training sets, meaning it can fit them perfectly with its parameters. However, this also makes it more prone to overfitting, because that extra capacity lets it memorize the training data rather than learn patterns that generalize.
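To see why high capacity invites overfitting, here is a minimal sketch using a regression analogue of the same idea: a degree-9 polynomial has ten coefficients, enough to interpolate ten training points exactly, which typically drives training error to near zero while test error grows. The data, degrees, and random seed below are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (illustrative): a simple linear trend plus noise.
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2.0 * x_train + rng.normal(scale=0.3, size=x_train.shape)
x_test = np.linspace(0.0, 1.0, 100)
y_test = 2.0 * x_test  # noiseless ground truth for evaluation

for degree in (1, 9):
    # A degree-9 polynomial has 10 coefficients, enough to interpolate
    # all 10 training points exactly (a regression analogue of shattering).
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```

The low-degree fit tolerates the noise; the high-degree fit memorizes it, which is exactly the capacity trade-off that VC dimension quantifies.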

So, in short, “shattering” in machine learning refers to the ability of a hypothesis class to realize every possible labeling of a set of training points. A model with enough capacity can fit any training data, but that same capacity can lead to overfitting.