comment by Pattern · 2021-06-13T00:17:47.537Z
Do you know of a real-world example where the first intervention on the proxy raised the target value, but the second, more extreme one, did not (or vice versa)?
Here's a fictional story:
You decide to study more. Your grades go up. You like that, so you decide to study really really hard. You get burnt out. Your grades go down. (There's also an argument here that the metric - grades - isn't necessarily ideal, but that's a different thing.)*
*There might be a less extreme version involving 'you stay up late studying', and 'because you get less sleep it has less effect (memory stuff)'.
This isn't meant as an unsolvable problem - it's just that:
are both true.
Maybe this style of mechanism, or 'causal influence', is rare. But its (biological) nature, arguably, may characterize a domain (life). So in that area at least, it's worth taking note of.
I guess I'm saying, if you want to know if you have to be worried about Goodhart's Law, in general, I think it depends. Just spend time optimizing your metric, and spend time optimizing for your metric, and see what happens. If you want more specific feedback, I think you'll probably have to be more specific.