AI Safety reading group

post by SoerenE · 2017-01-28T12:07:17.681Z · LW · GW · Legacy · 10 comments

I am hosting a weekly AI Safety reading group, and perhaps someone here would be interested in joining.

Here is what the reading group has covered so far:

Next week, on Wednesday the 1st of February at 19:45 UTC, we will discuss "How Feasible is the Rapid Development of Artificial Superintelligence?" by Kaj Sotala. I publish some slides before each meeting and present the article, so you can also join if you have not read the article.

To join, add me on Skype ("soeren.elverlin"). General coordination happens on a Facebook group, at

You can see the time in your local timezone here:


Comments sorted by top scores.

comment by ignoranceprior · 2017-01-28T23:19:02.173Z · LW(p) · GW(p)

You could advertise this on /r/ControlProblem too.

Replies from: SoerenE
comment by SoerenE · 2017-01-29T19:57:21.653Z · LW(p) · GW(p)

Good idea. I will do so.

comment by LawrenceC · 2017-01-30T17:03:31.407Z · LW(p) · GW(p)

Thanks Søren! Could I ask what you're planning on covering in the future? Is this mainly going to be a technical or non-technical reading group?

I noticed that your group seems to have covered a lot of the basic readings on AI Safety, but I'm curious what your future plans are.

Replies from: SoerenE
comment by SoerenE · 2017-01-30T20:10:51.540Z · LW(p) · GW(p)

There are no specific plans - at the end of each session we discuss briefly what we should read for next time. I expect it will remain a mostly non-technical reading group.

comment by RedMan · 2017-01-30T14:33:59.804Z · LW(p) · GW(p)

What evil can be perpetrated by AGI that cannot be perpetrated by a sufficiently capable human or group of colluding humans?

Leo Szilard could probably have built a bomb that would wipe out the human race. We are still here, and we do not credit that to the success of developing a 'Friendly Hungarian' or the success of the 'Hungarian Safety' research community. Arguably, Edward Teller was a 'slightly unfriendly' Hungarian, and we did OK with him too.

Replies from: SoerenE
comment by SoerenE · 2017-01-30T14:50:51.154Z · LW(p) · GW(p)

The word 'sufficiently' makes your claim a tautology. A 'sufficiently' capable human is capable of anything, by definition.

Your claim that Leo Szilard probably could have wiped out the human race seems very far from the historical consensus.

Replies from: RedMan
comment by RedMan · 2017-01-30T14:56:25.693Z · LW(p) · GW(p)

He produced a then novel scenario for a technological development which could potentially have that consequence:

He also worked in the field of nuclear weapons development, and may have had access to the necessary material, equipment, and personnel required to construct such a device, or modify an existing device intended for use in a nuclear test.

I assert that my use of 'sufficiently' in this context is appropriate: the intellectual threshold for humanity-destroying action is fairly low, and certainly within the capacity of many humans today.

Replies from: SoerenE
comment by SoerenE · 2017-01-30T15:55:40.645Z · LW(p) · GW(p)

Do you think Leo Szilard would have had more success through overt means (political campaigning to end the human race) or by surreptitiously adding kilotons of cobalt to a device intended for use in a nuclear test? I think both strategies would be unsuccessful (p<0.001 conditional on Szilard wishing to kill all humans).

I fully accept the following proposition: IF many humans currently have the capability to kill all humans THEN worrying about long-term AI Safety is probably a bad priority. I strongly deny the antecedent.

I guess the two most plausible candidates would be Trump and Putin, and I believe they are exceedingly likely to leave survivors (p=0.9999).

Replies from: RedMan
comment by RedMan · 2017-01-30T16:49:38.537Z · LW(p) · GW(p)

Addressing your question, Szilard's political action (the Einstein–Szilárd letter) directly led to the construction of the a-bomb and the nuclear arms race. The jury is still out on whether that wipes out the human race.

I assert that at present, the number of AGIs capable of doing as much damage as the two human figures you named is zero. I further assert that the number of humans capable of doing tremendous damage to the earth or the human race is likely to increase, not decrease.

I assert that the risk of AGI, acting without human influence, destroying the human race will never exceed the risk of humans, making use of technology (including AGI), destroying the human race through malice or incompetence.

Therefore, I assert that your If-Then statement is more likely to become true in the future than the opposite (if no humans have the capability to kill all humans then long-term ai safety is probably a good priority).

Replies from: SoerenE
comment by SoerenE · 2017-01-30T20:40:27.703Z · LW(p) · GW(p)

I think I agree with all your assertions :).

(Please forgive me for a nitpick: The opposite statement would be "Many humans have the ability to kill all humans AND AI Safety is a good priority". NOT (A IMPLIES B) is equivalent to A AND NOT B. )
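
For anyone who wants to check the nitpick mechanically, here is a minimal sketch that enumerates the truth table. It assumes A stands for "many humans have the capability to kill all humans" and B for "worrying about long-term AI Safety is a bad priority"; the helper name `implies` is only illustrative.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: false only when a is true and b is false."""
    return (not a) or b


# Enumerate every truth assignment of A and B and record both expressions.
rows = []
for a, b in product([True, False], repeat=2):
    negated_implication = not implies(a, b)   # NOT (A IMPLIES B)
    conjunction = a and (not b)               # A AND NOT B
    rows.append((a, b, negated_implication, conjunction))

# The two expressions are equivalent if they agree on every row.
equivalent = all(neg == conj for _, _, neg, conj in rows)

for a, b, neg, conj in rows:
    print(f"A={a!s:5} B={b!s:5} NOT(A IMPLIES B)={neg!s:5} A AND NOT B={conj!s:5}")
print("Equivalent for all assignments:", equivalent)
```

The only row where both expressions are true is A true and B false, which is the "many humans have the ability AND AI Safety is a good priority" case.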