Improving generalization by preventing memorization in neural networks
I’m excited to share a student paper that was just accepted to ICML. Neural networks are capable of memorizing training labels, but when this happens they generalize poorly on test data. Where is the information about those memorized labels stored? It has to live somewhere in the network weights, and you can write down an information measure that captures the amount of memorization. Unfortunately, this term is very difficult to estimate or control because it involves high-dimensional quantities. Hrayr found an interesting way to control this information: use a separate neural network that estimates gradients without relying too much on label information.
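To make the idea concrete, here is a minimal sketch (in PyTorch, with hypothetical names and sizes such as `grad_predictor`) of one way to realize it: an auxiliary network predicts the gradient of the loss at the logits from the input alone, and the classifier is updated with that predicted gradient rather than the label-dependent one. This is only an illustration of the general idea under my own assumptions, not the exact procedure from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Classifier and an auxiliary "gradient predictor" that never sees labels.
classifier = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(), nn.Linear(256, 10))
grad_predictor = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(), nn.Linear(256, 10))

opt_clf = torch.optim.SGD(classifier.parameters(), lr=0.1)
opt_gp = torch.optim.Adam(grad_predictor.parameters(), lr=1e-3)

def train_step(x, y):
    logits = classifier(x)

    # True gradient of cross-entropy w.r.t. the logits: softmax(z) - onehot(y).
    with torch.no_grad():
        true_grad = F.softmax(logits, dim=1) - F.one_hot(y, num_classes=10).float()

    # The auxiliary network predicts that gradient from the input alone (no labels).
    pred_grad = grad_predictor(x)

    # Train the predictor to match the true gradient. This step sees labels,
    # but the classifier update below uses only the prediction.
    opt_gp.zero_grad()
    F.mse_loss(pred_grad, true_grad).backward()
    opt_gp.step()

    # Update the classifier by backpropagating the *predicted* gradient
    # through the logits, instead of the label-dependent gradient.
    opt_clf.zero_grad()
    logits.backward(pred_grad.detach())
    opt_clf.step()

# Example with a dummy batch of 28x28 images and random labels.
x = torch.randn(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))
train_step(x, y)
```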

One of the most fun parts of this project is that we can check which labels in a dataset seem to require the most memorization. In the figure, you can see that the human-provided labels (y) for this dataset are often wrong or confusing, while our classifier, trained without memorizing labels, often predicts a more sensible label (ŷ).
Posted by Greg Ver Steeg