
Not all biases are equal - a study of sycophancy and bias in fine-tuned LLMs 2024-11-11T23:11:15.233Z
[Linkpost] Hawkish nationalism vs international AI power and benefit sharing 2024-10-18T18:13:19.425Z


Comment by jakub_krys (kryjak) on Implications of the inference scaling paradigm for AI safety · 2025-01-15T01:20:13.161Z · LW · GW

I had a similar reflection yesterday regarding these inference-time techniques (post-training, unhobbling, whatever you want to call it) being in the very early days. Would it be too much of a stretch to draw a parallel here with how similar unhobbling methods led to an explosion of human capabilities over the past ~10,000 years? Human DNA today has undergone roughly the same number of 'gradient updates' (evolutionary cycles) as that of our predecessors from a few millennia ago. I see it as having an equivalent amount of training compute. Yet through an efficient use of tools, language, writing, coordination and similar, we have completely outdone what our ancestors were able to do.

There is a difference in that for us, these abilities arose naturally through evolution. We are now manually engineering them into AI systems. I would not be surprised to see a real capability explosion soon (much faster than what we are observing now) - not because of the continued scaling up of pre-training, but because of these post-training enhancements.

Comment by jakub_krys (kryjak) on [Linkpost] Hawkish nationalism vs international AI power and benefit sharing · 2024-10-20T18:24:37.851Z · LW · GW

Thanks for the comments, I'm looking forward to reading your article. Is 'The Gentle Path' a reference to 'The Narrow Path' or just a naming coincidence?

Comment by jakub_krys (kryjak) on It is time to start war gaming for AGI · 2024-10-17T09:31:08.012Z · LW · GW

I think something along these lines is organised by Intelligence Rising.