We can do better than argmax

post by Jan_Kulveit · 2022-10-10T10:32:02.788Z · LW · GW · 4 comments

4 comments

Comments sorted by top scores.

comment by [deleted] · 2022-10-11T01:11:13.073Z · LW(p) · GW(p)

I am not quite sure this is correct.

Take the classic investment advice: hold "a mixture of stocks AND bonds". For the purposes of this comment, assume the classic advice is correct.

What that advice is saying is: "stocks are normally the best option, but bonds act as a hedge".

Meaning: conditional on stocks going down, bonds tend to hold more of their value.

So this isn't softmax. Softmax would say "invest in the best past-performing stock fund, and also spread some money around to the other funds". (Argmax would be "all in on Berkshire Hathaway", or whatever.)
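To make the contrast concrete, here is a minimal sketch of the two allocation rules described above. The fund names, past returns, and temperature are hypothetical illustrative values, not real investment data:

```python
import math

# Hypothetical past returns for three funds (illustrative numbers only).
past_returns = {"fund_a": 0.12, "fund_b": 0.10, "fund_c": 0.04}

def argmax_allocation(returns):
    """Argmax: put everything into the single best past performer."""
    best = max(returns, key=returns.get)
    return {name: (1.0 if name == best else 0.0) for name in returns}

def softmax_allocation(returns, temperature=0.05):
    """Softmax: weight each fund by exp(return / temperature), normalized."""
    weights = {name: math.exp(r / temperature) for name, r in returns.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

print(argmax_allocation(past_returns))   # all in on the best past performer
print(softmax_allocation(past_returns))  # mostly fund_a, some in the others
```

Note that softmax still allocates purely as a function of each fund's own (past) performance; it has no notion of correlations between assets, which is exactly the point of the comment above.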

"Conditional probability hedging" is something else. It's taking the most probable bad outcome from your top choice and reevaluating your actions, assuming the most probable bad choice is true.

For example, a robot picking something up might consider the most probable bad outcome to be that the item is dropped, and it might choose to take some action to mitigate that outcome if it happens. (If it has two arms, it could put the second one in the path along which the item is most likely to fall.)
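The robot example can be sketched as a tiny decision rule: pick the primary action, find its most probable bad outcome, and pair the action with a mitigation for that specific failure. The outcome probabilities and mitigation table are hypothetical placeholders:

```python
# Hypothetical outcome model for a grasp action: outcome -> probability.
grasp_outcomes = {"success": 0.90, "drop": 0.08, "knock_over": 0.02}

BAD_OUTCOMES = frozenset({"drop", "knock_over"})

def most_probable_bad_outcome(outcomes, bad=BAD_OUTCOMES):
    """Return the bad outcome with the highest probability."""
    candidates = {o: p for o, p in outcomes.items() if o in bad}
    return max(candidates, key=candidates.get)

# Hypothetical mitigations, keyed by the failure they guard against.
mitigations = {
    "drop": "position second arm in the item's likely fall path",
    "knock_over": "approach slowly from the side",
}

failure = most_probable_bad_outcome(grasp_outcomes)
plan = ["grasp item", mitigations[failure]]
```

The key difference from softmax: the hedge is chosen conditional on a specific failure mode of the top action, not by spreading effort across alternative top actions.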

comment by harsimony · 2022-10-10T23:36:31.009Z · LW(p) · GW(p)

I like this intuition and it would be interesting to formalize the optimal charitable portfolio in a more general sense.

I talked about a toy model of hits-based giving which has a similar property (the funder spends on projects in proportion to their expected value, rather than only on the best projects):

https://ea.greaterwrong.com/posts/eGhhcH6FB2Zw77dTG/a-model-of-hits-based-giving

Updated version here: https://harsimony.wordpress.com/2022/03/24/a-model-of-hits-based-giving/
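The property described above (spending proportional to expected value rather than funding only the top project) can be sketched in a few lines. The project names, expected values, and budget are hypothetical, and this is only an illustration of the allocation rule, not the linked model:

```python
# Hypothetical expected values for three charitable projects.
projects = {"project_a": 10.0, "project_b": 4.0, "project_c": 1.0}
budget = 100.0

def proportional_allocation(expected_values, budget):
    """Fund each project in proportion to its expected value."""
    total = sum(expected_values.values())
    return {name: budget * ev / total for name, ev in expected_values.items()}

alloc = proportional_allocation(projects, budget)
# project_a receives 10/15 of the budget, project_b 4/15, project_c 1/15.
```

Under argmax, project_a would receive the entire budget; the proportional rule keeps some funding flowing to lower-expected-value projects.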

comment by Donald Hobson (donald-hobson) · 2022-10-11T20:14:24.164Z · LW(p) · GW(p)

Suppose you have several different, similarly good plans to build a nuke. (Sure, not an EA goal, but it makes the rest of the analogy work.) You have a single critical mass of uranium. Trying to split it between two projects would guarantee that neither succeeds: neither would have the critical mass. Switching is costly. Sometimes, gains aren't sublinear. If you need a minimum scale to get stuff done, and are faced with two equally good projects and only resources for one, sometimes the best solution is to pick one project, put all your resources into it, and stick with it (unless it goes really, seriously wrong).
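The critical-mass argument is a simple threshold model. In the sketch below (with a hypothetical payoff function and made-up numbers), splitting resources below the threshold yields nothing, so concentrating dominates:

```python
# Hypothetical threshold model: a project pays off only if its resources
# reach a minimum scale (the "critical mass").
CRITICAL_MASS = 1.0

def payoff(resources, value=1.0):
    """All-or-nothing payoff: zero below the threshold."""
    return value if resources >= CRITICAL_MASS else 0.0

total = 1.0  # only enough resources for one project

# Splitting evenly: neither project reaches critical mass.
split = payoff(total / 2) + payoff(total / 2)

# Concentrating: one project succeeds, the other gets nothing.
concentrated = payoff(total) + payoff(0.0)
```

This is the superlinear regime the comment points at: near the threshold, the marginal unit of resources is worth far more when added to an almost-funded project than when spread around.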

comment by Dagon · 2022-10-11T15:15:56.563Z · LW(p) · GW(p)

Needs examples and worked calculations. I can't tell if it's obvious or not - I've only ever heard the softer/more complete version: "put your effort into the most important thing that needs more resources". Uncertainty could play a part here, but in most cases, simple declining returns to effort, plus the changing relative importance of "the most important thing" as it gets partly addressed, already cover the need to vary one's efforts.

Likewise, the difference between urgent and important leads to some diversity in effort.  In order to optimize for the long term, you have to GET TO the long term.