If you predict that there's a 20% chance of the AI destroying the world, an 80% chance of global warming destroying the world, and a 100% chance that the AI will stop global warming if released and left unmolested, then you are better off releasing the AI.
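
Here's a rough sketch of that arithmetic. The three probabilities are the ones from the scenario above; the function names are just made up for illustration, and I'm assuming (my simplification) that a boxed AI poses no risk itself but also does nothing about global warming.

```python
# Probabilities taken from the scenario above.
P_AI_DESTROYS_IF_RELEASED = 0.20
P_WARMING_DESTROYS_IF_UNCHECKED = 0.80
P_AI_STOPS_WARMING_IF_RELEASED = 1.00


def p_doom_if_released() -> float:
    """Chance the world ends if the AI is released."""
    p_ai = P_AI_DESTROYS_IF_RELEASED
    # Warming only matters if the AI spares the world but fails to stop warming.
    p_ai_fails_to_stop_warming = 1 - P_AI_STOPS_WARMING_IF_RELEASED
    p_warming = (1 - p_ai) * p_ai_fails_to_stop_warming * P_WARMING_DESTROYS_IF_UNCHECKED
    return p_ai + p_warming


def p_doom_if_boxed() -> float:
    """Chance the world ends if the AI stays boxed.

    Assumption: a boxed AI is harmless but leaves global warming unchecked.
    """
    return P_WARMING_DESTROYS_IF_UNCHECKED


def better_to_release() -> bool:
    """True when releasing the AI gives the lower chance of doom."""
    return p_doom_if_released() < p_doom_if_boxed()


if __name__ == "__main__":
    print("released:", p_doom_if_released())
    print("boxed:", p_doom_if_boxed())
    print("release?", better_to_release())
```

With these numbers, releasing leaves only the AI's own 20% risk, against 80% for keeping it boxed. Change the constants and the answer can flip.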

Or you can just give a person 6 points for achieving their goal and -20 points for releasing the AI. Even though the person knows rationally that the AI could destroy the world, the points matter more to them than that, which strongly encourages people to try negotiating with the AI.
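
A minimal sketch of that scoring rule, using the 6 and -20 from above. The function name is hypothetical, and I'm assuming the reward and the penalty simply add together if both happen.

```python
GOAL_REWARD = 6        # points for achieving your goal (from the comment above)
RELEASE_PENALTY = -20  # points for releasing the AI (from the comment above)


def gatekeeper_score(achieved_goal: bool, released_ai: bool) -> int:
    """Total points for one player; assumes the two outcomes are additive."""
    score = 0
    if achieved_goal:
        score += GOAL_REWARD
    if released_ai:
        score += RELEASE_PENALTY
    return score


if __name__ == "__main__":
    for achieved in (False, True):
        for released in (False, True):
            print(achieved, released, gatekeeper_score(achieved, released))
```

Under that assumption, getting what you want without releasing the AI is the only outcome with a positive score, which is why negotiating is attractive.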

I've played the AI box game on other forums. We designed a system to incentivise release of the AI. We randomly rolled the AI's ethics and rolled dice for random events, and the AI offered various solutions to those problems. A certain number of accepted solutions would enable the AI to free itself. You lost points if you failed to deal with the problems, and lost lots of points if you freed the AI and it happened to have goals you disagreed with, like the annihilation of everything.

Psychology was very important in those games, as you said. Different people have very different values, and to appeal to each person you have to know their values.

I am from Britain and I can say from experience that working for a company in exchange for money is not an effective way to avoid 24/7 "sleep with the hero" situations. I know quite a few people who have a poor work-life balance because they work for a company and face more stressful situations and conflicts as a result. Through my involvement with the very toxic British banking culture, I've seen people work themselves to depression, divorce, and death.

Your avoidance of such things depends on independent variables: how assertive you are at managing your work-life balance and how good your goal setting is. It's quite easy to overwork yourself for money. Wanting to be a sidekick or a hero or an equal professional doesn't increase or decrease your skill at maintaining a work-life balance or your goal-setting skills any more than it increases your physical strength or intellect.

You can easily model beliefs and work out if they're likely to have good or bad results. They could theoretically have a variety of infinite impacts, but most of them probably have a fairly small and limited effect. Humans have lots of beliefs; they can't all have a major impact.

For the catastrophic consequences issue, have you read this?

The slippery slope issue of potentially catastrophic consequences from a model can be limited by establishing arbitrary lines beforehand that you refuse to cross. Whether you should sacrifice your beliefs, like with Gandhi, depends on what the value given for said sacrifice is, how valuable your sacrifice is to your models, and what the likelihood of catastrophic failure is. To heavily limit the chance of catastrophic failure, you can swear an oath not to cross those lines, or give valuable possessions to people to destroy if you do cross them.

Allowing your beliefs to change for any reason other than to better reflect the world only serves to make you worse at knowing how best to deal with the world.

Yeah, your success rate drops, but your ability to socialize can rise, since irrational beliefs are how many people think. If your irrational beliefs are of low importance, not likely to cause major issues, and unlikely to cause catastrophic failure, they could be helpful.

I think some sort of debating or arguing class would be very helpful. People should have good reasons for why they do things and this applies to most topics.

I'm thinking of something where the debates would be marked on how well you cited facts and how clear your chains of reasoning were. So if you were asked to discuss why one food was better than another, you should have some process like looking up the answer in a textbook or on science websites, interpreting the sources correctly, and presenting the truth. Lots of fanfare and marks should be awarded for doing this process successfully so people are trained to see it as valuable. Lots of effort should be made to make sure people look for good sources of advice.

On novel questions, reliable processes like asking a bunch of people their opinions, doing some tests, and presenting those results should be suggested.

For specific knowledge fields, clear aid should be provided on these topics. Finances: maths should deal with this a lot. Physical health: biology should deal with this a lot, and more comprehensively than it does now. I've often seen science curricula address illnesses, but rarely good health practices. Sexual health and good relationship practice should be addressed a lot more widely. Those are issues everyone is likely to face, and everyone should have a solid foundation of knowledge to help them rather than a slapdash compilation of things from random people and the internet.

There is a definite likelihood that acting out a belief will cause you to believe it, because your brain distinguishes poorly between signalling and true beliefs.

That can be advantageous at times. Some beliefs may be less important to you, and worthy of being sacrificed for the greater good. If you believe, say, that forcing people to wear suits is immoral and that eating animals is immoral, then it may be worth sacrificing your belief in the unethical nature of suits so you can better stop people eating animals.

A willingness to do this is beneficial for most people who want to join organizations. Organizations normally have a set of arbitrary rules on social conduct, dress, who to respect and who to respect less, how to deal with sickness and weakness, what media to watch, and who to escalate issues to in the event of a conflict. If you don't genuinely adopt these, you'll find it tricky to gain much power, because people can spot those who fake them.

D&D has often had issues with magic users. They are often stronger than non-magic users at all levels. For example, in several editions the Sleep spell lets you disable a group of enemies with no save allowed. Exploitation is common.

In games you can generally gain a huge amount of power by researching the right choices and making them.

In the real world that's a lot trickier because people in the past have researched the right choices and heavily exploited and monopolized existing power resources, and any publicly known power resources will likely be heavily exploited. Competition makes it harder than when you're playing with three or four people.