Comments

Comment by Joern Stoehler on Most experts believe COVID-19 was probably not a lab leak · 2024-02-03T12:50:01.368Z · LW · GW

Imo mildly misleading. I expect large parts of the 85% to just not have read their mails, or to have been too busy to answer what may look to them like a mildly useful survey.

Comment by Joern Stoehler on Decent plan prize announcement (1 paragraph, $1k) · 2024-01-12T14:12:05.832Z · LW · GW

Why are you concerned in that scenario? Any more concrete details on what you expect to go wrong?

I don't think there's a cure-it-all solution, except "don't build it", and even that might be counterproductive in some edge cases.

Comment by Joern Stoehler on Why Yudkowsky is wrong about "covalently bonded equivalents of biology" · 2023-12-07T07:52:00.034Z · LW · GW

Addendum: I just learned that dipole-dipole interactions are classified as a type of vdW force in chemistry. This is different from solid state physics, where vdW is reserved for the quantum mechanical effect of induced dipole - induced dipole interaction.

So it's indeed vdW forces that keep a protein in its shape. (This might also explain why OP found different oom for their strength?)

Comment by Joern Stoehler on Why Yudkowsky is wrong about "covalently bonded equivalents of biology" · 2023-12-06T16:14:26.035Z · LW · GW

When discussing the stability of proteins, I mostly think of their folding, not whether their primary or secondary structure breaks.

The free energy difference between folded and unfolded states of a typical protein is allegedly (not an expert!) in the range 21-63 kJ/mol. So way less than a single covalent bond.
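To make the energy scales concrete, here is a rough back-of-the-envelope sketch. The 21-63 kJ/mol range is the figure quoted above; the room temperature of 298.15 K and the ~350 kJ/mol value for a C-C single bond are assumptions I'm adding for illustration (a typical textbook order of magnitude, not a number from the post).

```python
# Rough comparison of protein folding free energy with thermal energy
# and with a single covalent bond. Only the 21-63 kJ/mol range comes
# from the comment above; the other values are illustrative assumptions.

R_GAS = 8.314462618e-3       # molar gas constant in kJ/(mol*K)
T_ROOM = 298.15              # assumed room temperature in K
RT = R_GAS * T_ROOM          # thermal energy per mole, roughly 2.5 kJ/mol

FOLDING_RANGE = (21.0, 63.0)  # kJ/mol, folding free energy range quoted above
CC_BOND = 350.0               # kJ/mol, ASSUMED rough value for a C-C single bond


def in_units_of_rt(energy_kj_per_mol):
    """Express an energy in multiples of the thermal energy RT."""
    return energy_kj_per_mol / RT


def fraction_of_bond(energy_kj_per_mol):
    """Express an energy as a fraction of the assumed covalent bond energy."""
    return energy_kj_per_mol / CC_BOND


for energy in FOLDING_RANGE:
    print(
        f"{energy:5.1f} kJ/mol = {in_units_of_rt(energy):4.1f} RT "
        f"= {fraction_of_bond(energy):.2f} of a C-C bond"
    )
```

Under these assumptions the folding free energy is somewhere around 10 to 25 times the thermal energy, but only a small fraction of one covalent bond.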

I have a friend who does his physics PhD on protein folding, and from what I remember he mostly simulates the surface charge of proteins, i.e. cares about dipole-dipole interactions (the weaker version of ionic bonds) and interaction effects with the surrounding water (again dipole-dipole afaict).

This suggests that vdW forces aren't all that important, but the energy scale you get from imagining vdW forces is still way better than when imagining covalent bonds.

Regarding how to do enzyme-like catalysts with covalent nanotech: my first guess is that we'd want to build a structure that has several "folded"/usable states close in energy, e.g. due to rotational degrees of freedom in the covalent bonds. This way "unfolding"/breaking the machine requires a lot of energy, while it can still mechanically move to catalyze a chemical reaction at low activation energies.

Comment by Joern Stoehler on Lying to chess players for alignment · 2023-10-26T07:36:06.695Z · LW · GW

See Table 2 in https://www.emilkirkegaard.com/p/skill-vs-luck-in-games for

[...] the corresponding winning probability of a player who is exactly one standard deviation better than his opponent. We refer to this probability as p^sd. For comparison, we also provide the winning probabilities when a 99% percentile player is matched against a 1% percentile player, which we call p^99_1.

Go and chess (p^sd = 83.3% and 72.9%, respectively) are notably above backgammon (p^sd = 53.6%).

Comment by Joern Stoehler on Superforecasting the premises in “Is power-seeking AI an existential risk?” · 2023-10-21T09:47:53.397Z · LW · GW

Seconded for whatever group I participated in.

Comment by Joern Stoehler on A short calculation about a Twitter poll · 2023-08-15T15:29:15.560Z · LW · GW

I expect that other voters correlate with my choice, and so I am not just deciding 1 vote, but actually a significant fraction of votes.

If the fraction of uncorrelated blue voters, plus the fraction of people who vote identically to me, exceeds 50%, then by voting blue I can save the uncorrelated blue voters.

More formally: let R and B denote the fractions of uncorrelated red and uncorrelated blue voters, and let C denote the fraction of correlated voters, i.e. those who will vote the same as you do. Let S be how large a fraction of people you'd let die in order to save yourself (i.e. some measure of selfishness).

Then choosing blue over red gives you extra utility/lives saved depending on what R,B,C,S are.

If B>0.5 then the utility difference is 0.

If B<0.5 and B+C>0.5 then the difference is +B.

If B+C<0.5 then the difference is -(C+S).

By taking the expectation over your uncertainties about what B,R,C might be, for example by averaging across some randomly chosen scenarios that seem like they properly cover your uncertainty, you get the difference in expected utility between voting blue and red.
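A minimal sketch of that calculation, assuming the three cases above are the whole story. The function names, the example scenarios, and the value of S are all made up for illustration; the cases above don't say what happens when a total lands exactly on 0.5, so this sketch treats a tie as the threshold not being exceeded.

```python
# Sketch of the blue-vs-red utility difference described above.
# B: fraction of uncorrelated blue voters
# C: fraction of voters correlated with you (they vote as you do)
# S: fraction of people you'd let die to save yourself
# R is implied as 1 - B - C and does not enter the formulas directly.


def utility_difference(B, C, S):
    """Utility of voting blue minus utility of voting red."""
    if B > 0.5:
        # Blue wins regardless of what you and your correlates do.
        return 0.0
    if B + C > 0.5:
        # Your bloc tips the vote: the uncorrelated blue voters are saved.
        return B
    # Blue loses anyway: your correlated bloc and you yourself are lost.
    return -(C + S)


def expected_difference(scenarios, S):
    """Average the difference over equally weighted (B, C) scenarios."""
    total = sum(utility_difference(B, C, S) for B, C in scenarios)
    return total / len(scenarios)


# Hypothetical scenarios meant to cover one's uncertainty about B and C.
example_scenarios = [(0.6, 0.1), (0.4, 0.2), (0.3, 0.1)]
example_S = 0.05

print(expected_difference(example_scenarios, example_S))
```

A positive result favors voting blue and a negative one favors red. Weighting scenarios by probability instead of uniformly would be a straightforward extension.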

Estimating C,R,B can be done by guessing which algorithms other voters use to decide their votes, and how closely those algorithms match your own. Getting good precision on the latter part probably involves also guessing the epistemic state of other voters, i.e. their guesses for C,R,B, and doing some more complicated game theory and solving for equilibria.

Comment by Joern Stoehler on Corrigibility, Much more detail than anyone wants to Read · 2023-05-10T11:03:50.447Z · LW · GW

Thanks for this concise post :) If we set I actually worry that the agent will not do nothing, but instead prevent us from doing anything that reduces . Imo it is not easy to formalize such that we no longer want to reduce ourselves. For example, we may want to glue a vase onto a fixed location inside our house, preventing it from accidentally falling and breaking. This however also prevents us from constantly moving the vase around the house, or from breaking it and scattering the pieces for maximum entropy.

Building an aligned superintelligence may also reduce , as the SI steers the universe into a narrow set of states.