Posts

Comments

Comment by SemanticMerlin on Two sources of beyond-episode goals (Section 2.2.2 of “Scheming AIs”) · 2024-01-05T04:20:42.112Z · LW · GW

Very surprised to be the first comment on this; nice work. You’ve framed beyond-episode goals (BEGs) really well. One thing is bothering me, and I must be missing something: why is there a prima facie supposition that beyond-episode goals emerge at all? As you (rightly) note, the naive logic about SGD as a mechanism would seem to point strongly away from the plausibility of BEGs. This is well written, but I feel like “suppose some BEG emerges” is treated almost axiomatically. Don’t we need a stronger circumstantial/theoretical/evidentiary reason for thinking BEGs are, like, a thing that happens in SOTA deep learning paradigms?