Linear infra-Bayesian Bandits

post by Vanessa Kosoy (vanessa-kosoy) · 2024-05-10T06:41:09.206Z · LW · GW · 5 comments

This is a link post for https://arxiv.org/abs/2405.05673

Linked is my MSc thesis, where I do regret analysis for an infra-Bayesian[1] generalization of stochastic linear bandits.
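
For context, here is the baseline setting the thesis generalizes, in standard textbook notation (this is not notation from the thesis): in a stochastic linear bandit, at each round t the learner picks an action from a set A ⊆ R^d and observes a noisy reward whose mean is linear in an unknown parameter θ*. Per footnote 1, the infra-Bayesian version replaces the single unknown environment with a hypothesis class of convex sets of environments, and the regret bounds are worst-case over that class.

```latex
% Standard stochastic linear bandit: at round t the learner picks
% a_t \in \mathcal{A} \subseteq \mathbb{R}^d and observes
% r_t = \langle \theta^\ast, a_t \rangle + \eta_t, with zero-mean noise \eta_t.
% The (pseudo-)regret after T rounds is
R_T \;=\; \sum_{t=1}^{T} \Bigl( \max_{a \in \mathcal{A}} \langle \theta^\ast, a \rangle
      \;-\; \langle \theta^\ast, a_t \rangle \Bigr).
```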

The main significance that I see in this work is:

In addition to the open questions in the "summary" section, there is also a natural open question of extending these results to non-crisp infradistributions. (I didn't mention it in the thesis because it requires too much additional context to motivate.)

  1. ^

    I use the word "imprecise" rather than "infra-Bayesian" in the title, because the proposed algorithm achieves a regret bound which is worst-case over the hypothesis class, so it's not "Bayesian" in any non-trivial sense.

  2. ^

    In particular, I suspect that there's a flavor of homogeneous ultradistributions for which the parameter becomes unnecessary. Specifically, an affine ultradistribution can be thought of as the result of "take an affine subspace of the affine space of signed distributions, intersect it with the space of actual (positive) distributions, then take downwards closure into contributions to make it into a homogeneous ultradistribution". But we can also consider the alternative "take an affine subspace of the affine space of signed distributions, take downwards closure into signed contributions and then intersect it with the space of actual (positive) contributions". The order matters!
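
    To make the two constructions explicit (the notation here is mine, not the thesis's): write Δ(X) for distributions on X, Δ^{≤1}(X) for contributions (measures of total mass at most 1), M^{±}(X) for signed measures, and let V ⊆ M^{±}(X) be the affine subspace. A sketch:

    ```latex
    % Construction A: intersect with distributions first, then take the
    % downward closure inside contributions:
    A(V) = \{\, m \in \Delta^{\le 1}(X) \;:\; \exists\, \mu \in V \cap \Delta(X),\ m \le \mu \,\}
    % Construction B: take the downward closure among signed contributions
    % first, then intersect with (positive) contributions:
    B(V) = \{\, m \in \mathcal{M}^{\pm}(X) \;:\; \exists\, \mu \in V,\ m \le \mu \,\} \cap \Delta^{\le 1}(X)
    ```

    Clearly A(V) ⊆ B(V); the question is whether the reverse inclusion can fail (though see the comment thread below, where this claim is retracted).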

5 comments

comment by davidad · 2024-05-16T07:37:30.259Z · LW(p) · GW(p)

Re footnote 2, and the claim that the order matters, do you have a concrete example of a homogeneous ultradistribution that is affine in one sense but not the other?

Replies from: vanessa-kosoy
comment by Vanessa Kosoy (vanessa-kosoy) · 2024-05-16T09:19:18.361Z · LW(p) · GW(p)

Sorry, that footnote is just flat wrong; the order actually doesn't matter here. Good catch!

There is a related thing which might work, namely taking the downwards closure of the affine subspace w.r.t. some cone which is somewhat larger than the cone of measures. For example, if your underlying space has a metric, you might consider the cone of signed measures which have non-negative integral against all positive functions whose logarithm is 1-Lipschitz.
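
In symbols, a minimal sketch of that cone (my formalization of the preceding sentence, with X the underlying metric space):

```latex
% Signed measures that integrate non-negatively against every positive
% function whose logarithm is 1-Lipschitz:
C = \Bigl\{\, m \in \mathcal{M}^{\pm}(X) \;:\; \int_X f \, dm \ge 0
    \ \text{for all } f : X \to \mathbb{R}_{>0} \text{ with } \mathrm{Lip}(\log f) \le 1 \,\Bigr\}
% The downward closure of the affine subspace is then taken w.r.t. the
% order induced by C:  m \preceq_C \mu  iff  \mu - m \in C.
```

Since every positive measure integrates non-negatively against positive functions, C contains the cone of measures, which is the sense in which it is "somewhat larger".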

comment by mesaoptimizer · 2024-05-11T10:40:18.856Z · LW(p) · GW(p)

Sort of off-topic, so feel free to move this comment elsewhere.

I'm quite surprised to see that you have just shipped an MSc thesis, because I didn't expect you to be doing an MSc (or anything in traditional academia). I didn't think you needed one, since I think you have enough career capital to continue working indefinitely on the things you want to work on and get paid well for it. I also assumed that you might find academia somewhat of a waste of your time compared to doing the stuff you want to do.

Perhaps you could help clarify what I'm missing?

Replies from: vanessa-kosoy, Davidmanheim
comment by Vanessa Kosoy (vanessa-kosoy) · 2024-05-11T11:00:52.856Z · LW(p) · GW(p)

My thesis consists of the same research I intended to do anyway, so the thesis itself is not a waste of time, at least.

The main reason I decided to do grad school is that I want to attract more researchers to work on the learning-theoretic agenda, and I don't want my candidate pool to be limited to the LW/EA sphere. Most qualified candidates would be people on an academic career track. These people care about prestige, and many of them would be reluctant to e.g. work in an unknown research institute headed by an unknown person without even a PhD. If I secure an actual faculty position, I will also be able to direct grad students to do LTA research.

Other benefits include:

  • Opportunity for networking inside academia (also useful for bringing in collaborators).
  • A safety net in case EA-adjacent funding for agent foundations collapses some time in the future.
  • Maybe getting some advice on navigating the peer review system better (important for building prestige in order to attract collaborators, and for increasing exposure to my research in general).

So far it's not obvious whether it's going to pay off, but I have already paid the vast majority of the cost anyway (i.e. the time I wouldn't have had to spend if I had just continued as an independent researcher).

comment by Davidmanheim · 2024-05-15T09:31:05.472Z · LW(p) · GW(p)

I'll note that I think this is a mistake that lots of people working in AI safety have made: ignoring the benefits of academic credentials and prestige because of the obvious costs and annoyance. It's not always better to work in academia, but it's worth really appreciating the costs of not doing so, in foregone opportunities and experience, as Vanessa highlighted. (Founder effects matter; Eliezer had good reasons not to pursue this path, but I think others followed that path instead of evaluating the question clearly for their own work.)

And in my experience, much of the good work coming out of AI safety has been sidelined because it fails the academic prestige test, and so it fails to engage with academics who could contribute or who have done closely related work. Other work avoids or fails the publication process because the authors don't have the right kind of guidance and experience to get their papers into the right conferences and journals; not only is it therefore often worse for not getting feedback from peer review, but it also fails to engage others in the research area.