Comments
It's late where I am, so I'm going to read carefully and respond to comments tomorrow. But before I go to bed, I want to quickly respond to your claim that you found the post hostile, because I don't want to leave it hanging.
I wanted to express my disagreements/misunderstandings/whatever as clearly as I could but had no intention to express hostility. I bear no hostility towards anyone reading this, especially people who have worked hard thinking about important issues like AI alignment. Apologies to you and anyone else who found the post hostile.
Thanks for taking the time to explain this to me! I would like to read your links before responding to the meat of your comment, but I wanted to note something before going forward because there is a pattern I've noticed in both my verbal conversations on this subject and the comments so far.
I say something like 'lots of systems don't seem to converge on the same abstractions' and then someone else says 'yeah, I agree obviously' and then starts talking about another feature of the NAH while not taking this as evidence against the NAH.
But most posts on the NAH explicitly mention something like the claim that many systems will converge on similar abstractions [1]. I find this really confusing!
Going forward it might be useful to taboo the phrase 'the Natural Abstraction Hypothesis' (?) and just discuss what we think is true about the world.
Your comment that it's a claim about 'proving things about the distribution of environments' is helpful. To help me understand what people mean by the NAH, could you tell me what would (in your view) constitute strong evidence against the NAH? (If the fact that we can point to systems which haven't converged on using the same abstractions doesn't count.)
[1] Natural Abstractions: Key Claims, Theorems and Critiques: 'many cognitive systems learn similar abstractions'
Testing the Natural Abstraction Hypothesis: Project Intro: 'a wide variety of cognitive architectures will learn to use approximately the same high-level abstract objects/concepts to reason about the world'
The Natural Abstraction Hypothesis: Implications and Evidence: 'there exist abstractions (relatively low-dimensional summaries which capture information relevant for prediction) which are "natural" in the sense that we should expect a wide variety of cognitive systems to converge on using them.'
Hello! My name is Alfred. I recently took part in AI Safety Camp 2024 and have been thinking about the Agent-like structure problem. Hopefully I will have some posts to share on the subject soon.