Posts

What is malevolence? On the nature, measurement, and distribution of dark traits 2024-10-23T08:41:33.197Z
David Althaus's Shortform 2024-06-28T12:22:26.329Z
Many therapy schools work with inner multiplicity (not just IFS) 2022-09-17T10:27:41.350Z
Incentivizing forecasting via social media 2020-12-16T12:15:01.446Z
Decision Theory and the Irrelevance of Impossible Outcomes 2017-01-28T10:16:48.534Z
[Link] How the Simulation Argument Dampens Future Fanaticism 2016-09-09T13:17:53.233Z
In Praise of Maximizing – With Some Caveats 2015-03-15T19:40:06.647Z
Meetup : First LW Meetup in Warsaw 2014-03-22T16:41:08.015Z
Literature-review on cognitive effects of modafinil (my bachelor thesis) 2014-01-08T19:23:47.830Z
Meetup : First Meetup in Cologne (Köln) 2013-10-14T20:13:50.185Z
[Link] Should Psychological Neuroscience Research Be Funded? 2013-04-18T12:13:12.902Z
Meetup : First meetup in Innsbruck 2012-11-21T22:31:18.709Z
Meetup : Munich Meetup, October 28th 2012-09-25T08:29:15.481Z
[LINK] Antidepressants: Bad Drugs... Or Bad Patients? 2012-01-04T21:31:33.387Z
Vasili Arkhipov Day 2011-10-27T06:27:20.018Z
Podcast Recommendations 2011-10-24T16:49:28.124Z
Get genotyped for free ( If your IQ is high enough) 2011-10-01T16:00:31.558Z
Meetup : Munich Meetup, Saturday September 10th, 2PM 2011-09-03T20:03:33.034Z
LINK: Ben Goertzel; Does Humanity Need an "AI-Nanny"? 2011-08-17T18:27:54.422Z
Munich Meetup, Saturday September 10th, 2PM 2011-08-12T14:04:14.871Z
First LW-Meetup in Germany 2011-07-10T08:13:46.039Z
Lesswrongers from the German-speaking world, unite! 2011-05-19T20:32:02.370Z

Comments

Comment by David Althaus (wallowinmaya) on What is malevolence? On the nature, measurement, and distribution of dark traits · 2024-10-27T08:42:14.542Z · LW · GW

Thanks, I mostly agree.

But even in colonialism, individual traits played a role. For example, compare King Leopold II's rule over the Congo Free State vs. other colonial regimes. 

While all colonialism was exploitative, under Leopold's personal rule the Congo saw extraordinarily brutal policies, e.g., his rubber quota system led soldiers to torture and cut off the hands of workers, including children, who failed to meet quotas. Under his rule, 1.5 to 15 million Congolese people died—the total population was only around 15 to 20 million. The brutality was so extreme that it caused international public outrage and pressure from other colonial powers, until eventually the Belgian government took control of the Congo Free State from Leopold.

Compare this to, say, British colonial administration during certain periods, which, while still morally reprehensible overall, saw considerably less barbaric policies under some administrators who showed basic compassion for indigenous people. For instance, Governor-General William Bentinck in India abolished practices like sati (widows burning themselves alive) and implemented other humanitarian reforms.

One can easily find other examples (e.g. sadistic slave owners vs. more compassionate slave owners). 

In conclusion, I totally agree that power imbalances enabled systemic exploitation regardless of individual temperament. But individual traits significantly affected how much suffering and death that exploitation created in practice.[1] 

  1. ^

    Also, slavery and colonialism were ultimately abolished (in the Western world). My guess is that those who advocated for these reforms were, on average, more compassionate and less malevolent than those who tried to preserve these practices. Of course, the reformers were also heavily influenced by great ideas like the Enlightenment / classical liberalism.

Comment by David Althaus (wallowinmaya) on What is malevolence? On the nature, measurement, and distribution of dark traits · 2024-10-24T09:20:46.981Z · LW · GW

Thanks, good point! I suppose it's a balancing act and depends on the specifics in question and the amount of shame we dole out. My hunch would be that a combination of empathy and shame ("carrot and stick") may be best.  

Comment by David Althaus (wallowinmaya) on What is malevolence? On the nature, measurement, and distribution of dark traits · 2024-10-24T09:18:30.596Z · LW · GW

I agree that the problem of "evil" is multifactorial, with individual personality traits being only one of several relevant factors; others, like "evil/fanatical ideologies" or misaligned incentives/organizations, are plausibly more important overall. Still, I think that ignoring the individual character dimension is perilous.

It seems to me that most people become much more evil when they aren't punished for it. [...] So if we teach AIs to be as "aligned" as the average person, and then AIs increase in power beyond our ability to punish them, we can expect to be treated as a much-less-powerful group in history - which is to say, not very well.

Makes sense. On average, power corrupts / people become more malevolent if no one holds them accountable—but again, there seem to be interindividual differences, with some people behaving much better than others even when they have enormous power (cf. this section).

Comment by David Althaus (wallowinmaya) on David Althaus's Shortform · 2024-07-01T09:46:56.141Z · LW · GW

Thanks. Sorry for not being clearer: I pasted a screenshot (I'm reading the book on Kindle and can't copy-paste) and asked Claude to transcribe the image into written text.

Again, this is not the first time this happened. Claude refused to help me translate a passage from the Quran (I wanted to check which of two translations was more accurate), refused to transcribe other parts of the above-mentioned Kindle book, and refused to provide me with details about what happened at Tuol Sleng prison. I was eventually able to persuade Claude in all of these cases, but I grew tired of wasting my time and found it frustrating to deal with Claude's obnoxious holier-than-thou attitude.

Comment by David Althaus (wallowinmaya) on David Althaus's Shortform · 2024-06-30T11:52:11.193Z · LW · GW

I downvoted Claude's response (i.e., clicked the thumbs-down symbol below the response) and selected "overactive refusal" as the reason. I didn't get in contact with Anthropic directly.

Comment by David Althaus (wallowinmaya) on David Althaus's Shortform · 2024-06-28T12:22:26.512Z · LW · GW

I had to cancel my Claude subscription (and signed up for ChatGPT) because Claude (3.5 Sonnet) constantly refuses to transcribe or engage with texts that discuss extremism or violence, even if it's clear that this is done in order to better understand and prevent extremist violence. 

An example of text Claude refuses to transcribe is below. For context, the text discusses the motivations and beliefs of Yigal Amir, who assassinated the Israeli Prime Minister in 1995.

God gave the land of Israel to the Jewish People," he explained, and he, Yigal Amir, was making certain that God's promises, which he believed in with all his heart and to which he had committed his life, were not to be denied. He could not fathom, he declared, how a Jewish state would dare renege on the Jewish birthright, and he could not passively stand by as this terrifying religious tragedy took place. In Amir's thinking, his action was not a personal matter or an act of passion but a solution, albeit an extreme one, to a religious and psychological trauma brought about by the actions of the Rabin government. Though aware of the seriousness of his action, Amir explained that his fervent faith encouraged and empowered him to commit this act of murder. He told his interrogators, "Without believing in God and an eternal world to come, I would never have had the power to do this." Rabin deserved to die because he was facilitating, in Amir's and other militants' view, the possible mass murder of Jews by consenting to the Oslo peace agreements. This made Rabin, according to halacha, or Jewish law, a rodef, someone about to kill an innocent person and whom a bystander may therefore execute without a trial. Rabin was also a moser, a Jew who willingly betrays his brethren, and guilty of treason for cooperating with Yasser Arafat and the Palestinian Authority in surrendering rights to the Holy Land. Jewish jurisprudence considers the actions of the rodef and moser among the most pernicious crimes; persons guilty of such acts are to be killed at the first opportunity.

This type of refusal has happened numerous times. Claude doesn't change its behavior when I provide arguments (unless I spend a lot of time on this). 

I haven't used ChatGPT as much but it so far has never refused.

I hope Anthropic changes Claude so I can continue using it again; I certainly don't like the idea of supporting OpenAI. 

Comment by David Althaus (wallowinmaya) on Making AIs less likely to be spiteful · 2023-09-29T12:19:41.868Z · LW · GW

Really great post! 

It’s unclear how much human psychology can inform our understanding of AI motivations and relevant interventions, but it does seem relevant that spitefulness correlates highly (Moshagen et al., 2018, Table 8, N = 1,261) with several other “dark traits”, especially psychopathy (r = .74), sadism (r = .59), and Machiavellianism (r = .59).

(Moshagen et al. (2018) therefore suggest that “[...] dark traits are specific manifestations of a general, basic dispositional behavioral tendency [...] to maximize one’s individual utility— disregarding, accepting, or malevolently provoking disutility for others—, accompanied by beliefs that serve as justifications.”)

Plausibly there are (for instance, evolutionary) reasons why these traits correlate so strongly with each other, and perhaps better understanding them could inform interventions to reduce spite and other dark traits (cf. Lukas' comment).

If this is correct, we might suspect that AIs that exhibit spiteful preferences/behavior will also tend to exhibit other dark traits (and vice versa!), which may be action-guiding. (For example, interventions that make AIs less likely to be psychopathic, sadistic, Machiavellian, etc. would also make them less spiteful, at least in expectation.)

Comment by David Althaus (wallowinmaya) on Please don't throw your mind away · 2023-02-16T15:46:28.792Z · LW · GW

Great post, thanks for writing! 

Most of this matches my experience pretty well. I think I had my best ideas (others seem to agree) during phases when I was unusually low on guilt- and obligation-driven EA/impact-focused motivation and was just playfully exploring ideas for fun and out of curiosity.

One problem with letting your research/ideas be guided by impact-focused thinking is that you basically train your mind to immediately ask yourself, after entertaining a certain idea for a few seconds, "well, is that actually impactful?". And basically all of the time, the answer is "well, probably not". This makes you disinclined to further explore the neighboring idea space.

However, even really useful ideas and research angles start out somewhat unpromising, full of hurdles and problems, and need a lot of refinement. If you allow yourself to just explore idea space for fun, you might overcome these problems and stumble on something truly promising. But if you had been in an "obsessing about maximizing impact" mindset, you would have given up too soon because, in this mindset, spending hours or even days without having any impact feels too terrible to keep going.

Comment by David Althaus (wallowinmaya) on On Blogging and Podcasting · 2023-01-13T11:03:40.229Z · LW · GW

Lol, thanks. :)

Comment by David Althaus (wallowinmaya) on On Blogging and Podcasting · 2023-01-12T16:58:50.582Z · LW · GW

Thanks for this post; I thought it was useful.

I needed a writing buddy to pick up the momentum to actually write it

I'd be interested in knowing more about how this worked in practice (no worries if you don't feel like elaborating/don't have the time!).

Comment by David Althaus (wallowinmaya) on Let’s think about slowing down AI · 2023-01-06T14:29:41.892Z · LW · GW

I think mostly I expect us to continue to overestimate the sanity and integrity of most of the world, then get fucked over like we got fucked over by OpenAI or FTX. I think there are ways to relating to the rest of the world that would be much better, but a naive update in the direction of "just trust other people more" would likely make things worse.

[...]
Again, I think the question you are raising is crucial, and I have giant warning flags about a bunch of the things that are going on (the foremost one is that it sure really is a time to reflect on your relation to the world when a very prominent member of your community just stole 8 billion dollars of innocent people's money and committed the largest fraud since Enron), [...]

I very much agree with the sentiment of the second paragraph. 

Regarding the first paragraph, my own take is that (many) EAs and rationalists might be wise to trust themselves and their allies less.[1]

The main update I'd make from the FTX fiasco (and other events I'll describe later) is that perhaps many/most EAs and rationalists aren't very good at character judgment. They probably trust other EAs and rationalists too readily because they are part of the same tribe and automatically assume that agreeing with noble ideas in the abstract translates to noble behavior in practice.

(To clarify, you personally seem to be good at character judgment, so this message is not directed at you. I base that mostly on the comments of yours I read about the SBF situation; big kudos for that, btw!)

It seems like a non-trivial fraction of people that joined the EA and rationalist community very early turned out to be of questionable character, and this wasn't noticed for years by large parts of the community. I have in mind people like Anissimov, Helm, Dill, SBF, Geoff Anders, arguably Vassar—these are just the known ones. Most of them were not just part of the movement, they were allowed to occupy highly influential positions. I don't know what the base rate for such people is in other movements—it's plausibly even higher—but as a whole our movements don't seem to be fantastic at spotting sketchy people quickly. (FWIW, my personal experiences with a sketchy, early EA (not on the above list) inspired this post.)

My own takeaway is that perhaps EAs and rationalists aren't that much better in terms of integrity than the outside world and—given that we probably have to coordinate with some people to get anything done—I'm now more willing to coordinate with "outsiders" than I was, say, eight years ago. 

 

  1. ^

    Though I would be hesitant to spread this message; the kinds of people who should trust themselves and their character judgment less are more likely the ones who will not take this message to heart, and vice versa.

Comment by David Althaus (wallowinmaya) on Many therapy schools work with inner multiplicity (not just IFS) · 2022-09-19T10:22:50.410Z · LW · GW

This is mentioned in the introduction. 

I'm biased, of course, but it seems fine to write a post like this. (Similarly, it's fine for CFAR staff members to write a post about CFAR techniques. In fact, I prefer if precisely these people write such posts because they have the relevant expertise.)

Would you like us to add a more prominent disclaimer somewhere? (We worried that this might look like advertising.)

Comment by David Althaus (wallowinmaya) on Many therapy schools work with inner multiplicity (not just IFS) · 2022-09-18T10:37:09.191Z · LW · GW

A quick look through https://www.goodtherapy.org/learn-about-therapy/types/compassion-focused-therapy gives an impression of yet another mix of CBT, DBT and ACT, nothing revolutionary or especially new, though maybe I missed something.

In my experience, ~nothing in this area is downright revolutionary. Most therapies are heavily influenced by previous concepts and techniques. (Personally, I'd still say that CFT brings something new to the table.)

I guess what matters is whether it works for you or not.

Is this assertion borne out by twin studies? Or is believing it a test for CFT suitability only?

To some extent. Most human traits have a genetic component, including (Big-Five) personality traits, depressive tendencies, anxiety disorders, conduct disorders, personality disorders, and so on. (e.g., Polderman et al., 2015). This is also true for (self-)destructive tendencies like malevolent personality traits (citing my own summary of some studies here because I'm lazy, sorry).

(Also agree with Kaj's warning about misinterpreting heritability.)

More generally speaking, I'd say this belief is borne out of an understanding of evolutionary psychology/history. Basically, all of our motivations and fears have an evolutionary basis. We fear death because the ancestors who didn't were eaten by lions. We fear being ostracized and care about being respected because in the Environment of Evolutionary Adaptedness our survival and reproductive success depended on our social status. Therefore, it's to be expected that most humans, at some point or another, worry about death or health problems or feel emotions like jealousy or envy. These worries and emotions don't have to be rooted in some trauma or early life experience—though they are usually exacerbated by them. In most cases, it's not realistic to eliminate such emotions entirely. This doesn't mean that one is an "abnormal" or "defective" person who experienced irreversible harm inflicted by another human sometime in one's development. (Just to be clear, as mentioned in the main text, no one believes that life experiences don't matter. Of course, they matter a great deal!)

But yeah, if you are skeptical of the above, it's a good reason to not seek a CFT therapist. 

Comment by David Althaus (wallowinmaya) on Many therapy schools work with inner multiplicity (not just IFS) · 2022-09-18T09:52:30.580Z · LW · GW

From studying and using all of the above my conclusion is that IFS offers the most tractable approach to this issue of competing 'parts'. And in many ways the most powerful. 

In our experience, different people respond to different therapies. I know several people for whom, say, CFT worked better than IFS. Glad to hear that IFS worked for you!

When you read about modern therapies, they all borrow from one another in a way that did not occur say 50 years ago where there were very entrenched schools of thought.

Yes, that's definitely the case. My sense is that many people overestimate how revolutionary various therapies are because their founders downplay how many concepts and techniques they took from other modalities. (Though this can be advantageous because the "hype" increases motivation and probably fuels various self-fulfilling prophecies.)

Comment by David Althaus (wallowinmaya) on A guide to Iterated Amplification & Debate · 2022-05-27T12:37:45.319Z · LW · GW

For what it's worth, I read/skimmed all of the listed IDA explanations and found this post to be the best explanation of IDA and Debate (and how they relate to each other). So thanks a lot for writing this! 

Comment by David Althaus (wallowinmaya) on Book summary: Unlocking the Emotional Brain · 2021-12-16T13:21:07.818Z · LW · GW

Thanks a lot for this post (and the whole sequence), Kaj! I found it very helpful already. 
 
Below is a question I first wanted to ask you via PM, but others might also benefit from an elaboration on this.

You describe the second step of the erasure sequence as follows (emphasis mine): 

>Activating, at the same time, the contradictory belief and having the experience of simultaneously believing in two different things which cannot both be true.

When I try this myself, I feel like I cannot actually experience two things simultaneously. There seems to be at least half a second or so between trying to hold the target schema in consciousness and focusing my attention on disconfirming knowledge or experiences. 

(Generally, I'd guess it's not actually possible to hold two distinct things in consciousness simultaneously, at least that's what I heard various meditation teachers (and perhaps also neuroscientists) claim; you might have even mentioned this in this sequence yourself, if I remember correctly. Relatedly, I heard the claim that multitasking actually involves rapid cycling of one's attention between various tasks, even though it feels from the inside like one is doing several things simultaneously.)

So should I try to minimize the duration between holding the target schema and disconfirming knowledge in consciousness (potentially aiming to literally feel as though I experience both things at once) or is it enough to just keep cycling back and forth between the two every few seconds? (If yes, what about, say, 30 seconds?) 

One issue I suspect I have is that there is a tradeoff between how vividly I can experience the target schema and how rapidly I'm cycling back to the disconfirming knowledge.

Or maybe I'm doing something wrong here? Admittedly, I haven't tried this for more than a minute or so before immediately proceeding to spending 5 minutes on formulating this question. :)

Comment by David Althaus (wallowinmaya) on Malicious non-state actors and AI safety · 2021-04-27T10:53:38.579Z · LW · GW

The post Reducing long-term risks from malevolent actors is somewhat related and might be of interest to you. 

Comment by David Althaus (wallowinmaya) on Tweet markets for impersonal truth tracking? · 2020-11-10T11:05:48.100Z · LW · GW

Cool post! Daniel Kokotajlo and I have been exploring somewhat similar ideas.

In a nutshell, our idea was that a major social media company (such as Twitter) could develop a feature that incentivizes forecasting in two ways. First, the feature would automatically suggest questions of interest to the user, e.g., questions thematically related to the user’s current tweet or currently trending issues. Second, users who make more accurate forecasts than the community will be rewarded with increased visibility. 

Our idea is different in two major ways: 

I.
First, you suggest betting directly on Tweets, whereas we envisioned that people would bet/forecast on questions that are related to Tweets.

This seems to have some advantages: a single question could relate to many thousands of Tweets, so rather than resolving thousands of Tweets, one would only have to resolve one question. Most Tweets are also very imprecise. In contrast, these questions (and their resolution criteria) could be formulated very precisely (partly because one could spend much more time refining them, given that they are much fewer in number). The drawback is that this might feel less "direct" and "fun" in some ways.

II.
Second, contrary to your idea, we had in mind that the questions would be resolved by employees and not voted on by the public. Our worry is that public voting would devolve into an easily manipulated popularity contest that might also lead to increased polarization and/or distrust of the whole platform. But it is true that users might not trust employees of Twitter—potentially for good reason!

Maybe one could combine these two ideas: the resolution of questions could be done by a committee or court that consists of employees and members of the public (and perhaps other people who enjoy a high level of trust, such as popular judges or scientists?). Members of this committee could even undergo a selection and training process, maybe somewhat similar to the selection and training process of US juries, which seem to be widely trusted to make reasonable decisions.
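
To make the incentive mechanism a bit more concrete, here is a minimal sketch in Python. It is a toy illustration rather than our actual proposal: all names, the Brier scoring rule, and the boost formula are my own assumptions. Users forecast on curated questions, their accuracy is compared with the community average once a question resolves, and users who beat the average get a visibility multiplier.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Question:
    """A precisely worded forecasting question suggested alongside related tweets."""
    text: str
    forecasts: dict = field(default_factory=dict)  # user_id -> P("yes")
    outcome: bool | None = None  # resolved by a trusted committee, not by public vote


def brier(prob: float, outcome: bool) -> float:
    """Squared error of a probabilistic forecast; lower is better."""
    return (prob - (1.0 if outcome else 0.0)) ** 2


def visibility_boost(question: Question, user_id: str) -> float:
    """Toy rule: boost a user's visibility only if they beat the community average."""
    if question.outcome is None or user_id not in question.forecasts:
        return 1.0
    community_prob = mean(question.forecasts.values())
    user_score = brier(question.forecasts[user_id], question.outcome)
    community_score = brier(community_prob, question.outcome)
    # Multiplier above 1.0 only when the user outperformed the aggregate forecast.
    return 1.0 + max(0.0, community_score - user_score)


# Example: a question attached to tweets about an upcoming election.
q = Question("Will candidate X win the election?")
q.forecasts = {"alice": 0.9, "bob": 0.4, "carol": 0.6}
q.outcome = True
for user in q.forecasts:
    print(user, round(visibility_boost(q, user), 3))
```

In this toy example only "alice" beats the community aggregate and gets a boost; the committee-based resolution described above would simply set `outcome`.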

 

Comment by David Althaus (wallowinmaya) on Melatonin: Much More Than You Wanted To Know · 2019-03-31T18:12:46.643Z · LW · GW

Regarding how melatonin might cause more vivid dreams: I found the theory put forward here quite plausible:

There are user reports that melatonin causes vivid dreams. Actually, all sleep aids appear to some users to produce more vivid dreams.

What is most likely happening is that the drug modifies the sleep cycle so the person emerges from REM sleep (when dreams are most vivid) to waking quickly – more quickly than when no drug is used. The user subjectively reports the drug as producing vivid dreams.

Comment by David Althaus (wallowinmaya) on Anti-tribalism and positive mental health as high-value cause areas · 2018-08-02T10:00:04.895Z · LW · GW

Great that you're thinking about this issue! A few sketchy thoughts below:

I) As you say, autistic people seem to be more resilient with regard to tribalism. And autistic tendencies and following rationality communities arguably correlate as well. So intuitively, it seems that something like higher rationality and awareness of biases could be useful for reducing tribalism. Or is there another way of making people "more autistic"?

Given this and other observations (e.g., autistic people seem to have lower mental health, on average), it seems a bit hasty to focus on increasing general mental health as the most effective intervention for reducing tribalism.

II) Given our high uncertainty of what causes tribalism and how to most effectively reduce it, it seems that more research in this area could be one of the most effective cause areas.

I see at least two avenues for such research:

A) More "historical" and correlational research. First, we might want to operationalize 'tribalism' or identify some proxies for it (any ideas?). Then we could do some historical studies and find potential correlates. It would be interesting to study to what extent increasing economic inequality, the advent of social media, and other forces have historically correlated with the extent of tribalism.

B) Potentially more promising would be experimental psychological research aimed at identifying causal factors and mediators of tribalism. For example, one could present subjects with various interventions and then see which interventions reduce (or increase!) tribalism. Potential interventions include i) changing people's mood (e.g., presenting them with happy videos), ii) increasing the engagement of controlled cognitive processes (system 2) (e.g. by priming them with the CRT), iii) decreasing the engagement of such processes (e.g. via cognitive load), iv) using de-biasing techniques, v) decreasing or increasing their sense of general security (e.g. by presenting them with threatening or scary images or scenarios). There are many more possible interventions.

C) Another method would be correlational psychological research. Roughly, one could give subjects a variety of personality tests and other psychological scales (e.g. Big Five, CRT, etc.) and examine what correlates with tribalistic tendencies.

D) Another idea would be to develop some sort of "tribalism scale" which could lay the groundwork for further psychological research.

Of course, first one should do a more thorough literature review on this topic. It seems likely that there already exists some good work in this area.

--------

Even more sketchy thoughts:

III) Could it be that some forms of higher mental health actually increase tribalism? Tribalism also goes along with a feeling of belonging to a "good" group/tribe that fights against the bad tribe. Although at times frustrating, this might contribute to a sense of certainty and "having a mission or purpose". Personally, I feel quite depressed and frustrated by not being able to wholeheartedly identify with any major political force because they currently all seem pretty irrational in many areas. Of course, higher mental health will probably reduce your need to belong to a group and thus might still reduce tribalism.

IV) Studies (there was another one which I can't find at the moment) seem to indicate that social media posts (e.g. on Twitter or Facebook) involving anger or outrage spread more easily than posts involving other emotions like sadness, joy, etc. So maybe altering the architecture of Facebook or Twitter would be particularly effective (e.g. tweaking the news feed algorithm such that posts with a lot of anger reactions get less traction). Of course, this is pretty unlikely to be implemented. It also has disadvantages in the case of justified outrage. Maybe encouraging people to create new social networking sites that somehow alleviate those problems would be useful, but that seems pretty far-fetched.

Comment by David Althaus (wallowinmaya) on A Step-by-step Guide to Finding a (Good!) Therapist · 2018-08-01T10:57:45.431Z · LW · GW

Can one also use the service Reflect if one is not located in the Bay Area? Or do you happen to know of similar services outside the Bay Area or the US? Thanks a lot in advance.

Comment by David Althaus (wallowinmaya) on LW 2.0 Open Beta Live · 2017-10-01T11:24:17.407Z · LW · GW

The open beta will end with a vote of users with over a thousand karma on whether we should switch the lesswrong.com URL to point to the new code and database

How will you alert these users? (I'm asking because I have over 1000 karma but I don't know where I should vote.)

Comment by David Althaus (wallowinmaya) on S-risks: Why they are the worst existential risks, and how to prevent them · 2017-06-21T18:41:50.852Z · LW · GW

One of the more crucial points, I think, is that positive utility is – for most humans – complex and its creation is conjunctive. Disutility, in contrast, is disjunctive. Consequently, the probability of creating the former is smaller than that of the latter – all else being equal (of course, all else is not equal).

In other words, the scenarios leading towards the creation of (large amounts of) positive human value are conjunctive: to create a highly positive future, we have to eliminate (or at least substantially reduce) physical pain and boredom and injustice and loneliness and inequality (at least certain forms of it) and death, etc. etc. etc. (You might argue that getting "FAI" and "CEV" right would accomplish all those things at once (true) but getting FAI and CEV right is, of course, a highly conjunctive task in itself.)

In contrast, disutility is much more easily created and essentially disjunctive. Many roads lead towards dystopia: sadistic programmers, or failing AI safety wholesale (or "only" value-loading or extrapolation, or stable self-modification), or a totalitarian regime taking over, etc. etc.
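
A toy calculation makes the conjunctive/disjunctive asymmetry vivid (the numbers are purely illustrative, not estimates):

```latex
% Utopia: k independent conditions must all hold.
P(\text{utopia}) = p^k, \qquad p = 0.5,\; k = 5 \;\Rightarrow\; P \approx 0.03
% Dystopia: any one of m independent failure modes suffices.
P(\text{dystopia}) = 1 - (1-q)^m, \qquad q = 0.1,\; m = 5 \;\Rightarrow\; P \approx 0.41
```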

It's also not a coincidence that even the most untalented writer with the most limited imagination can conjure up a convincing dystopian society. Envisioning a true utopia in concrete detail, on the other hand, is nigh impossible for most human minds.

Footnote 10 of the above-mentioned s-risk article makes a related point (emphasis mine):

"[...] human intuitions about what is valuable are often complex and fragile (Yudkowsky, 2011), taking up only a small area in the space of all possible values. In other words, the number of possible configurations of matter constituting anything we would value highly (under reflection) is arguably smaller than the number of possible configurations that constitute some sort of strong suffering or disvalue, making the incidental creation of the latter ceteris paribus more likely."

Consequently, UFAIs such as paperclippers are more likely to incidentally create large amounts of disutility than utility (factoring out acausal considerations), e.g. because creating simulations is instrumentally useful for them.

Generally, I like how you put it in your comment here:

In terms of utility, the landscape of possible human-built superintelligences might look like a big flat plain (paperclippers and other things that kill everyone without fuss), with a tall sharp peak (FAI) surrounded by a pit that's astronomically deeper (many almost-FAIs and other designs that sound natural to humans). The pit needs to be compared to the peak, not the plain. If the pit is more likely, I'd rather have the plain.

Yeah. In a nutshell, supporting generic x-risk reduction (which also reduces extinction risks) is in one's best interest if and only if one's own normative trade ratio of suffering vs. happiness is less suffering-focused than one's estimate of the ratio of expected future happiness to suffering (feel free to replace "happiness" with utility and "suffering" with disutility). If one is more pessimistic about the future, or if one needs large amounts of happiness to trade off small amounts of suffering, one should rather focus on s-risk reduction instead. Of course, this simplistic analysis leaves out issues like cooperation with others, neglectedness, tractability, moral uncertainty, acausal considerations, etc.
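
Spelled out slightly more formally (the symbols are mine and only restate the claim above): let H and S be one's estimates of expected future happiness and suffering, and let r be one's normative trade ratio, i.e., how many units of happiness are needed to offset one unit of suffering. Then:

```latex
\text{generic x-risk reduction is in one's interest} \iff \frac{H}{S} > r \iff H - rS > 0
```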

Do you think that makes sense?

Comment by David Althaus (wallowinmaya) on S-risks: Why they are the worst existential risks, and how to prevent them · 2017-06-21T06:50:26.393Z · LW · GW

The article that introduced the term "s-risk" was shared on LessWrong in October 2016. The content of the article and the talk seem similar.

Did you simply not come across it or did the article just (catastrophically) fail to explain the concept of s-risks and its relevance?

Comment by David Althaus (wallowinmaya) on Requesting Questions For A 2017 LessWrong Survey · 2017-04-16T16:33:20.048Z · LW · GW

Here is another question that would be very interesting, IMO:

“For what value of X would you be indifferent about the choice between A) creating a utopia that lasts for one-hundred years and whose X inhabitants are all extremely happy, cultured, intelligent, fair, just, benevolent, etc. and lead rich, meaningful lives, and B) preventing one average human from being horribly tortured for one month?"

Comment by David Althaus (wallowinmaya) on Requesting Questions For A 2017 LessWrong Survey · 2017-04-14T09:51:17.979Z · LW · GW

I think it's great that you're doing this survey!

I would like to suggest two possible questions about acausal thinking/superrationality:

1)

Newcomb’s problem: one box or two boxes?

  • Accept: two boxes
  • Lean toward: two boxes
  • Accept: one box
  • Lean toward: one box
  • Other

(This is the formulation used in the famous PhilPapers survey.)

2)

Would you cooperate or defect against other community members in a one-shot Prisoner’s Dilemma?

  • Definitely cooperate
  • Leaning toward: cooperate
  • Leaning toward: defect
  • Definitely defect
  • Other

I think that these questions are not only interesting in and of themselves, but that they are also highly important for further research I'd like to conduct. (I can go into more detail if necessary.)

Comment by David Althaus (wallowinmaya) on Net Utility and Planetary Biocide · 2017-04-09T07:42:18.479Z · LW · GW

First of all, I don't think that morality is objective as I'm a proponent of moral anti-realism. That means that I don't believe that there is such a thing as "objective utility" that you could objectively measure.

But, to use your terms, I also believe that there currently exists more "disutility" than "utility" in the world. I'd formulate it this way: I think there exists more suffering (disutility, disvalue, etc.) than happiness (utility, value, etc.) in the world today. Note that this is just a consequence of my own personal values, in particular my "exchange rate" or "trade ratio" between happiness and suffering: I'm (roughly) utilitarian but I give more weight to suffering than to happiness. But this doesn't mean that there is "objectively" more disutility than utility in the world.

For example, I would not push a button that creates a city with 1000 extremely happy beings but where 10 people are being tortured. But a utilitarian with a more positive-leaning trade ratio might want to push the button because the happiness of the 1000 outweighs the suffering of the 10. Although we might disagree, neither of us is "wrong".

Similar reasoning applies with regards to the "expected value" of the future. Or to use a less confusing term: The ratio of expected happiness to suffering of the future. Crucially, this question has both an empirical as well as a normative component. The expected value (EV) of the future for a person will both depend on her normative trade ratio as well as her empirical beliefs about the future.

I want to emphasize, however, that even if one thinks that the EV of the future is negative, one should not try to destroy the world! There are many reasons for this, so I'll just pick a few: First of all, it's extremely unlikely that you will succeed, and you will probably only cause more suffering in the process. Secondly, planetary biocide is one of the worst possible things one can do according to many value systems. I think it's extremely important to be nice to other value systems and promote cooperation among their proponents. If you attempted to implement planetary biocide, you would cause distrust, probably violence, and the breakdown of cooperation, which would only increase future suffering, hurting everyone in expectation.

Below, I list several more relevant essays that expand on what I've written here and which I can highly recommend. Most of these link to the Foundational Research Institute (FRI) which is not a coincidence as FRI's mission is to identify cooperative and effective strategies to reduce future suffering.

I. Regarding the empirical side of future suffering

II. On the benefits of cooperation

III. On ethics

Comment by David Althaus (wallowinmaya) on The Library of Scott Alexandria · 2017-04-08T17:18:11.930Z · LW · GW

Great list!

IMO, one should add Prescriptions, Paradoxes, and Perversities to the list. Maybe to the section "Medicine, Therapy, and Human Enhancement".

Comment by David Althaus (wallowinmaya) on Seven Apocalypses · 2016-09-29T16:30:29.984Z · LW · GW

I don't understand why you exclude risks of astronomical suffering ("hell apocalypses").

Below you claim that those risks are "Pascalian" but this seems wrong.

Comment by David Althaus (wallowinmaya) on Meetup : First meetup of Rationality Zürich · 2015-10-24T09:39:46.812Z · LW · GW

Cool that you are doing this!

Is there also a facebook event?

Comment by David Althaus (wallowinmaya) on In Praise of Maximizing – With Some Caveats · 2015-03-17T10:55:09.943Z · LW · GW

That's not true -- for example, in cases where the search costs for the full space are trivial, pure maximizing is very common.

Ok, sure. I probably should have written that pure maximizing or satisficing is hard to find in important, complex, and non-contrived instances. I had in mind domains such as career, ethics, romance, and so on. I think it's hard to find a pure maximizer or satisficer here.

My objection is stronger. The behavior of optimizing for (gain - cost) does NOT lie on the continuum between satisficing and maximizing as defined in your post, primarily because they have no concept of the cost of search.

Sorry, I fear that I don't completely understand your point. Do you agree that there are individual differences between people, such that some people tend to search longer for a better solution and other people are more easily satisfied with their circumstances – be it their career, their love life, or the world in general?

Maybe I should have tried an operationalized definition: Maximizers are people who get high scores on this maximization scale (page 1182) and satisficers are people who get low scores.

Comment by David Althaus (wallowinmaya) on In Praise of Maximizing – With Some Caveats · 2015-03-16T18:23:38.354Z · LW · GW

But you don't seem to have made a compelling argument that such people are worse off than epistemic maximisers.

If we just consider personal happiness, then I agree with you – it's probably even the case that epistemic satisficers are happier than epistemic maximizers. But many of us don't live for the sake of happiness alone. Furthermore, it's probably the case that epistemic maximizers are good for society as a whole. If every human had been an epistemic satisficer, we never would have discovered the scientific method or eradicated smallpox, for example.

Also, discovering and following your terminal values is good for you almost by definition, I would say, so either we are using terms differently or I'm misunderstanding you. Let's say one of your terminal values is to increase happiness and to reduce suffering. Because you are a Catholic, you think the best way to do this is to convert as many people to Catholicism as possible (because then they won't go to hell and will go to heaven). However, if Catholicism is false, then your method is wholly suboptimal, it is in your interest to discover the truth, and being an epistemic maximizer (i.e., rational) would certainly help with this.

With regards to your romantic example, I also agree. Romantic satisficers are probably happier than romantic maximizers. Therefore I wrote in the introduction:

For example, Schwartz et al. (2002) found "negative correlations between maximization and happiness, optimism, self-esteem, and life satisfaction, and positive correlations between maximization and depression, perfectionism, and regret."

Again: in all those examples, we are only talking about your personal happiness. Satisficers are probably happier than maximizers, but they are less likely to reach their terminal values – if they value other things besides their own happiness, which many people do: many people wouldn't enter the experience machine, for example. But sure, if your only terminal value is your happiness, then you should definitely try hard to become a satisficer in every domain.

Comment by David Althaus (wallowinmaya) on In Praise of Maximizing – With Some Caveats · 2015-03-16T17:43:16.677Z · LW · GW

Continuing my previous comment

That's not satisficing because I don't take the first option alternative that is good enough. That's also not maximizing as I am not committed to searching for the global optimum.

I agree: It's neither pure satisficing nor pure maximizing. Generally speaking, in the real world it's probably very hard to find (non-contrived) instances of pure satisficing or pure maximizing. In reality, people fall on a continuum from pure satisficers to pure maximizers (I did acknowledge this in footnotes 1 and 2, but I probably should have been clearer).

But I think it makes sense to assert that certain people exhibit more satisficer-characteristics and others exhibit more maximizer-characteristics. For example, imagine that Anna travels to 127 different countries and goes to over 2500 different cafes to find the best chocolate cookie. Anna could be meaningfully described as a "cookie-maximizer", even if she gave up after 10 years of cookie-searching without having found the best chocolate cookie on planet Earth. :)

Somewhat relatedly, someone might be a maximizer in a certain domain, but a satisficer in another domain. I'm for example a satisficer when it comes to food and interior decoration, but (more of) a maximizer in other domains.

Comment by David Althaus (wallowinmaya) on In Praise of Maximizing – With Some Caveats · 2015-03-16T15:20:44.859Z · LW · GW

I see no mention of costs in these definitions.

Let's try a basic and, dare I say it, rational way of trying to achieve some outcome: you look for a better alternative until your estimate of costs for further search exceeds your estimate of the gains you would get from finding a superior option.

Agree. Thus in footnote 3 I wrote:

[3] Rational maximizers take the value of information and opportunity costs into account.

Continuation of this comment

Comment by David Althaus (wallowinmaya) on In Praise of Maximizing – With Some Caveats · 2015-03-15T08:33:47.907Z · LW · GW

You've got me there :)

Comment by David Althaus (wallowinmaya) on In Praise of Maximizing – With Some Caveats · 2015-03-14T22:03:14.634Z · LW · GW

But what does one maximize?

Expected utility :)

We can not maximize more than one thing (except in trivial cases).

I guess I have to disagree. Sure, in any given moment you can maximize only one thing, but this is simply not true for larger time horizons. Let's illustrate this with a typical day of Imaginary John: He wakes up and goes to work at an investment bank to earn money (money maximizing), which he later donates to GiveWell (ethical maximizing). Later at night he goes on OKCupid or to a party to find his true soulmate (romantic maximizing). He maximized three different things in just one day. But I agree that there are always trade-offs. John could have worked all day instead of going to the party.

I imagine that most of the components of that function are subject to diminishing returns, and such components I would satisfice. So I understand this whole thing as saying that these things have the potential for unbounded linear or superlinear utility?

I think that some components of my utility function are not subject to diminishing returns. Let's use your first example, "epistemic rationality". Epistemic rationality is basically about acquiring true beliefs or new (true) information. But sometimes learning new information can radically change your whole life and thus is not subject to diminishing marginal returns. To use an example: let's imagine you are a consequentialist and donate to charities to help blind people in the USA. Then you learn about effective altruism and cost-effectiveness and decide to donate to the most effective charities. Reading such arguments has just increased your positive impact on the world a hundredfold! (Btw, Bostrom uses the term "crucial consideration" for exactly such things.) But sure, at some point, you're going to hit diminishing returns.

On to the next issue – ethics: Let's say one value of mine is to reduce suffering (what could be called non-suffering maximizing). This value is also not subject to diminishing marginal returns. For example, imagine 10,000 people getting tortured (sorry). Saving the first 100 people from getting tortured is as valuable to me as saving the last 100 people.
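
As a minimal formalization of the difference (the symbols are illustrative only): a value with constant marginal returns versus one with diminishing marginal returns might look like:

```latex
U_{\text{constant}}(n) = c \cdot n
\qquad\text{vs.}\qquad
U_{\text{diminishing}}(n) = c \cdot \log(1+n)
```

Under the constant form, preventing the torture of the first 100 people and of the last 100 people each add the same value, c · 100; under the diminishing form, the last 100 would add almost nothing.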

Admittedly, with regards to social interactions there is an upper bound. But this upper bound is probably higher than most seem to assume. Also, it occurred to me that one has to distinguish between the quality and the quantity of one's social interactions. The quality of one's social interactions is unlikely to be subject to diminishing marginal returns any time soon. However, the quantity of social interactions definitely is subject to diminishing marginal returns (see e.g. Dunbar's number).

Btw, "attention" is another resource that actually has increasing marginal returns (I've stolen this example from Valentine Smith who used it in a CFAR workshop).

But I agree that unbounded utility functions can be problematic (though bounded ones can be, too). However, satisficing might not help you with this.

Comment by David Althaus (wallowinmaya) on Open thread, Mar. 9 - Mar. 15, 2015 · 2015-03-12T17:35:33.869Z · LW · GW

Again, I'm just giving quick feedback. Hopefully you've already given more detail in essay. Other than that, your summary seems fine to me.

Thanks! And yeah, ending aging and death are some of the examples I gave in the complete essay.

Comment by David Althaus (wallowinmaya) on Open thread, Mar. 9 - Mar. 15, 2015 · 2015-03-09T09:57:22.291Z · LW · GW

I wrote an essay about the advantages (and disadvantages) of maximizing over satisficing, but I’m a bit unsure about its quality; that’s why I would like to ask for feedback here before I post it on LessWrong.

Here’s a short summary:

According to research, there are so-called “maximizers” who tend to search extensively for the optimal solution. Other people — “satisficers” — settle for good enough and tend to accept the status quo. One can apply this distinction to many areas:

Epistemology/Belief systems: Some people, one could describe them as epistemic maximizers, try to update their beliefs until they are maximally coherent and maximally consistent with the available data. Other people, epistemic satisficers, are not as curious and are content with their belief system, even if it has serious flaws and is not particularly coherent or accurate. But they don’t go to great lengths to search for a better alternative because their current belief system is good enough for them.

Ethics: Many people are as altruistic as is necessary to feel good enough; phenomena like “moral licensing” and “purchasing of moral satisfaction” are evidence in favor of this. One could describe this as ethical satisficing. But there are also people who try to search extensively for the best moral action, i.e. for the action that does the most good (with regard to their axiology). Effective altruists are a good example of this type of ethical maximizing.

Social realm/relationships: This point is pretty obvious.

Existential/big picture questions: I’m less sure about this point, but it seems like one could apply the distinction here as well. Some people wonder a lot about the big picture and spend a lot of time reflecting on their terminal values and how to reach them in an optimal way. Nick Bostrom would be a good example of the type of person I have in mind here and of what could be called “existential maximizing”. In contrast, other people, not necessarily less intelligent or curious, don’t spend much time thinking about such crucial considerations. They take the fundamental rules of existence and the human condition (the “existential status quo”) as a given and don’t try to change it. Relatedly, transhumanists could also be thought of as existential maximizers in the sense that they are not satisfied with the human condition and try to change it – and maybe ultimately reach an “optimal mode of existence”.

What is “better”? Well, research shows that satisficers are happier and more easygoing. Maximizers tend to be more depressed and “picky”. They can also be quite arrogant and annoying. On the other hand, maximizers are more curious and always try hard to improve their lives – and the lives of other people, which is nice.

I would really love to get some feedback on it.

Comment by David Althaus (wallowinmaya) on Attempted Telekinesis · 2015-02-25T11:04:38.076Z · LW · GW

Great post. Some cases of "attempted telekinesis" seem to be similar to "shoulding at the universe".

To stay with your example: I can easily imagine that if I were in your place and experienced this stressful situation with CFAR, my system 1 would have become emotionally upset and "shoulded" at the universe: "I shouldn't have to do this alone. Someone should help me. It is so unfair that I have so much responsibility."

This is similar to attempted telekinesis in the sense that my system 1 somehow thinks that just by becoming emotionally upset it will magic someone (or the universe itself) into helping me and improving my situation.

Shoulding at the universe is also a paradigmatic example of a wasted motion. Realizing this helped me a lot because I used to should at the universe all the time ("I shouldn't have to learn useless stuff for university because I don't have enough time to do important work."; "This guy shouldn't be so irrational and strawman my arguments"; etc. etc.)

Comment by David Althaus (wallowinmaya) on question: the 40 hour work week vs Silicon Valley? · 2014-10-24T12:54:33.548Z · LW · GW

Two words: Interindividual differences.

They also recommend 8-9 hours sleep. Some people need more, some people need less. The same point applies to many different phenomena.

Comment by David Althaus (wallowinmaya) on [LINK] 2014 Fields Medals and Nevanlinna Prize anounced · 2014-08-13T13:26:24.346Z · LW · GW

I think Bostrom puts it nicely in his new book "Superintelligence":

A colleague of mine likes to point out that a Fields Medal (the highest honor in mathematics) indicates two things about the recipient: that he was capable of accomplishing something important, and that he didn't.

Comment by David Althaus (wallowinmaya) on Bragging Thread, August 2014 · 2014-08-05T18:27:46.940Z · LW · GW

I translated the essay Superintelligence and the paper In Defense of Posthuman Dignity by Nick Bostrom into German in order to publish them on the blog of GBS Schweiz.

He thanked me by sending me a signed copy of his new book "Superintelligence". Which made me pretty happy.

Comment by David Althaus (wallowinmaya) on Meetup : First LW Meetup in Warsaw · 2014-03-23T19:45:01.176Z · LW · GW

I changed the privacy settings. Link should work now.

Comment by David Althaus (wallowinmaya) on Meetup : First LW Meetup in Warsaw · 2014-03-22T22:36:26.764Z · LW · GW

Don't know how useful that is, but I created a FB event: https://www.facebook.com/events/360486800773506/?ref_dashboard_filter=upcoming&source=1

Comment by David Althaus (wallowinmaya) on Meetup : First LW Meetup in Warsaw · 2014-03-22T22:28:22.971Z · LW · GW

Cool, yeah, I'm going to the Berlin Meetup. See you there!

Comment by David Althaus (wallowinmaya) on Rationalists Are Less Credulous But Better At Taking Ideas Seriously · 2014-02-05T23:16:51.224Z · LW · GW

You got me kinda scared. I just use Evernote or wordpress for all my important writing. That should be enough, right?

Comment by David Althaus (wallowinmaya) on Are Your Enemies Innately Evil? · 2014-02-03T13:19:03.391Z · LW · GW

Great post of course.

If it took a mutant to do monstrous things, the history of the human species would look very different. Mutants would be rare.

Maybe I'm missing something, but shouldn't it read: "Mutants would not be rare."? Many monstrous things happened in human history, so if only mutants could do evil deeds, there would have to be a lot of them. Furthermore, mutants are rare, so there's no need for the subjunctive "would".

Comment by David Althaus (wallowinmaya) on Literature-review on cognitive effects of modafinil (my bachelor thesis) · 2014-01-10T19:14:24.834Z · LW · GW

But... I read quickly through it, and I saw no meta-analysis. Just a literature review. What's with the post title?

You're right. I don't remember why I wrote "meta-analysis". (Probably because it sounds fancy and smart). I updated the title.

Is this referring to effect sizes or p-values?

p-values.

Eh. Absence of improvement != damage.

True.

...Randal 2004 didn't find a statistically-significant decrease...

No. In Randall et al. (2004), participants in the 200 mg modafinil condition made significantly more errors (p < 0.05) in the Intra/Extradimensional Set Shift task than participants in the placebo and 100 mg modafinil conditions. (The 200 mg group made on average around 27 errors, the 100 mg group around 14, and the control group around 17.)

Actually, you linked to a different study. The results can be found in the complete study I linked to. I can upload it if you want to see it yourself.

Reprinted from Baranski et al. (2004) without permission.

Every single graphic in this whole thing is reprinted without permission, to tell the truth. (Is this a problem?)

Comment by David Althaus (wallowinmaya) on Literature-review on cognitive effects of modafinil (my bachelor thesis) · 2014-01-09T12:14:49.102Z · LW · GW

Well, I take modafinil primarily as a motivation-enhancer.

Comment by David Althaus (wallowinmaya) on Meetup : First Meetup in Cologne (Köln) · 2013-12-12T12:41:57.107Z · LW · GW

In my opinion it went well. Maybe I'll organize another meetup in January or February. Sure, we can play "Paranoid Debating".