Learning coefficient estimation: the details

post by Zach Furman (zfurman) · 2023-11-16

This is a link post for https://colab.research.google.com/github/zfurman56/intro-lc-estimation/blob/main/Intro_to_LC_estimation.ipynb

Contents

  What this is for
  What this isn't for
  TLDR

What this is for

The learning coefficient (LC), or RLCT, is a quantity from singular learning theory that can help to quantify the "complexity" of deep learning models, among other things.
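To pin down what this means (a restatement of standard singular learning theory results, not from the original post): $\lambda$ is the coefficient of $\log n$ in the asymptotic expansion of the Bayesian free energy,

$$F_n = n L_n(w_0) + \lambda \log n + O(\log \log n),$$

where $n$ is the number of samples, $L_n$ is the empirical negative log-likelihood, and $w_0$ is a loss-minimizing parameter. For regular models $\lambda = d/2$, half the parameter count, which is why the LC can be read as an effective measure of model complexity.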

This guide is primarily intended to help people interested in improving learning coefficient estimation get up to speed with how it works behind the scenes. If you just want to use the LC in your own project, you can use the library without knowing all the details, though this guide might still be helpful. It's highly recommended that you read this post first, if you haven't already.

We primarily cover the WBIC paper (Watanabe 2013), the foundation for current LC estimation techniques, but the presentation here is original: it aims for better intuition and differs substantially from the paper. We'll also briefly cover Lau et al. 2023.
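For orientation, the central result of that paper, restated in my own notation: if you sample weights from the posterior tempered to inverse temperature $\beta^* = 1/\log n$, the expected negative log-likelihood satisfies, up to lower-order terms,

$$\mathrm{WBIC} := \mathbb{E}^{\beta^*}_w[n L_n(w)] \approx n L_n(w_0) + \lambda \log n,$$

so rearranging yields the plug-in estimator $\hat\lambda = (\mathrm{WBIC} - n L_n(w_0)) / \log n$. Lau et al. 2023's contribution, roughly, is to localize this sampling around a particular trained parameter $w^*$, giving a local learning coefficient.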

Despite the lengthy discussion, what you end up doing in practice is quite simple, and the code is designed to highlight that. After some relatively quick setup, the actual LC calculation can be done comfortably in one or two lines of code.
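To make "one or two lines" concrete, here is a minimal from-scratch sketch of the underlying computation: SGLD sampling from a localized, tempered posterior, followed by the plug-in estimator above. Everything here (function names, hyperparameters, the single-chain full-batch setup) is an illustrative assumption, not the post's library or its API.

```python
# Minimal sketch of learning coefficient estimation via SGLD.
# Assumed names and hyperparameters, for illustration only.
import math
import torch
import torch.nn as nn


def estimate_lc(model, loss_fn, data, targets, n,
                num_steps=2000, step_size=1e-5, gamma=100.0):
    """Estimate the local learning coefficient at the model's current
    parameters w* by SGLD sampling from the localized posterior tempered
    to inverse temperature beta* = 1 / log(n)."""
    beta = 1.0 / math.log(n)
    init_params = [p.detach().clone() for p in model.parameters()]
    init_loss = loss_fn(model(data), targets).item()  # L_n(w*)

    loss_sum = 0.0
    for _ in range(num_steps):
        model.zero_grad()
        loss = loss_fn(model(data), targets)
        loss.backward()
        with torch.no_grad():
            for p, p0 in zip(model.parameters(), init_params):
                # SGLD step on n*beta*L_n(w) + (gamma/2)*||w - w*||^2:
                # half-step drift down the gradient, plus Gaussian noise.
                drift = n * beta * p.grad + gamma * (p - p0)
                p.add_(-0.5 * step_size * drift
                       + math.sqrt(step_size) * torch.randn_like(p))
        loss_sum += loss.item()

    # Plug-in estimator: lambda_hat = n * beta * (E[L_n(w)] - L_n(w*)).
    # (A careful implementation would discard burn-in and average chains.)
    return n * beta * (loss_sum / num_steps - init_loss)


# Toy usage: a small regression model on synthetic data.
n = 1000
data, targets = torch.randn(n, 10), torch.randn(n, 1)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
print(f"lambda_hat = {estimate_lc(model, nn.MSELoss(), data, targets, n):.3f}")
```

The "one or two lines" in practice are the final call and print; everything above them is the setup a library would hide.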

What this isn't for

TLDR 

In chart form: as we move from the idealized setting (top) to the realistic one (bottom), we encounter new problems, solutions, and directions for improvement. The guide covers the first two rows, likely the most conceptually difficult to think about, in the most detail, and skips directly from the second row to the fourth at the very end.

[Chart: problems, solutions, and directions for improvement in LC estimation, from idealized to realistic settings]

See the linked Colab notebook for the full guide.
