Comments
Great article. I found the decision theory behind "if they think I think they think..." very interesting. I'm a bit confused about the knight of faith. In my mental model, people who look like the knight of faith aren't accepting that the situation is hopeless, but rather powering on through some combination of mentally minimizing barriers, pinning hopes on small odds, and wishful thinking.
For example, let's put it in the context of flipping 10 coins.
Rationalist - I'm expecting 5 heads
My model of the knight of faith - I'm expecting 10 heads because there's a slight chance and I really need it to be true.
Knight of faith as described - I'm expecting 20 heads and I'm going to base my decisions on 20 heads actually happening.
Maybe that's an obvious distinction, but in that case why bring up the knight of faith instead of just focusing on the power of wishful thinking in some situations?
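To put rough numbers on the coin example above, here is a minimal sketch assuming fair, independent coins (the function name `prob_heads` is just made up for illustration). It shows that 10 heads is unlikely but possible, while 20 heads from 10 flips has probability zero.

```python
from fractions import Fraction
from math import comb

def prob_heads(k, n=10):
    """Probability of exactly k heads in n fair, independent coin flips."""
    if k < 0 or k > n:
        # More heads than flips (or a negative count) cannot happen.
        return Fraction(0)
    return Fraction(comb(n, k), 2 ** n)

# Expected number of heads across all possible outcomes of 10 flips.
expected_heads = sum(k * prob_heads(k) for k in range(11))

print(expected_heads)  # 5
print(prob_heads(10))  # 1/1024
print(prob_heads(20))  # 0
```

So the "my model" version is betting on roughly a 0.1% outcome, while the "as described" version is betting on something that can't happen at all.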
Great post! I really enjoy your writing style. I agree with everything up to your last sentence on cooperative epistemics. It looks like a false equivalence between a community of perfect trust and a community based on mistrust. I'm thinking a community of "trust but verify" with a vague assumption of goodwill will capture all the benefits of mistrust without the risk of half rationalists or "half a forum of autists" going off the deep end and making a carrying error in their EV calculations that leads to overly negative results.
Corrupted Hardware leads me to think we need to aim high to end up at an optimum level of honesty.
Edit: Thanks Cole and Shankar.
It does seem like LLMs struggle with "trick" questions that are, ironically, close to well-known trick questions but with an easier answer. Simple Bench is doing much the same thing and models do seem to be improving over time. I guess the important question is whether this flaw will affect more sophisticated work.
On another note, I find your question 2 to be almost incomprehensible, and my first instinct would be to try to trap the bug by feeling for it with my hands.
Can you please send the new Fooming Shoggoth album to Spotify? I was really enjoying that music!
Edit: Ah, I see this question has been answered, but I'd like to note that I'm impressed by the AI music and I'm going to look into making some myself. Perhaps songs about cognitive biases could be a good way to learn them deeply enough that you can avoid them in non-theoretical situations.
It's tough to gauge which benchmarks or puzzles are important/worth getting nervous about. I can imagine a world where LLMs can still fail easy benchmarks (much easier than the one in this post) but still be superhuman in many other areas including strategic reasoning.
Another benchmark could be explaining your pun! ChatGPT couldn't help me, and Claude suggested red herring but without making the connection to the hair / herring rhyme. If it's something else, I can't work it out.
I'm also interested in what I see as the most important part of any diet: how you resist temptations. As noted in your Scott Alexander link, almost any diet works as long as you stick to it; the hard part is sticking to it. I'm assuming that even if the boring diet reduces hunger, you will still be tempted when offered a cookie or a bacon and egg roll.
It felt a bit strange reading through the evidence that willpower is not important and that CICO doesn't work when that's the exact approach I used to lose 30 kg and keep it off for over 10 years now. I did combine it with intermittent fasting and generally high protein intake, but to maintain my weight I rely on calorie minimalisation. CICO had a lot of advantages for me, but it's possible it just clicked with my personality. I think it's fairly likely that the boring diet clicks with your personality and that it wouldn't work for me (in the sense that I would quickly give up).
Thanks for clearing that up, I think I was confused because it's hard to imagine putting compassionate crime prevention strategies together with a strict death penalty for repeated shoplifting.
It would be far more moral and cost-effective to focus on prevention, through increased policing, economic opportunities or similar interventions.
Executions and lifelong prison sentences both suffer from leaving families separated, which leads to more crime and other negative externalities, many of which can only be speculated upon.
For example, American culture seems to be resistant to overreach from the government. I can imagine far more civil unrest from a heavy-handed execution policy than in a country such as Singapore.
I'm curious about the purpose of this post. I think I understand the concept of steelmanning, but I’m struggling to see the specific goal here.
The post doesn’t address countries with low crime rates that don’t use the death penalty, and just seems to double down on executing vast numbers of criminals rather than any of the other possible options to reduce crime. I'm also speculating here, but I imagine the impacts on social cohesion and the flow-on effects from ease of executions (political prisoners etc.) would make the cure worse than the disease.
Is excluding these concerns part of the steelmanning process? I think the post could have been a bit clearer on what is being steelmanned and what are arguments you are making.