Posts

OpenAI Superalignment: Weak-to-strong generalization 2023-12-14T19:47:24.347Z
Interview with Paul Christiano: How We Prevent the AI’s from Killing us 2023-04-27T14:39:49.571Z
Personal predictions for decisions: seeking insights 2023-02-15T06:45:20.298Z

Comments

Comment by Dalmert on Reflections on Less Online · 2024-07-07T11:06:29.763Z · LW · GW

If anyone reading this feels like they missed out, or this sparked their curiosity, or they are bummed that they might have to wait 11 months for a chance at something similar, or they feel that so many cool things happen in North America and so few in Europe (all preceding "or"s are inclusive), then I can heartily recommend coming to LessWrong Community Weekend 2024 [Applications Open] in Berlin in about 2 months, over the weekend of 13 September. Applications are open as of now.

I've attended it a couple of times so far, and I quite liked it. Reading this article, it seemed very similar, and I began to wonder whether LWCW was a big inspiration for LessOnline, or whether they had a common source of inspiration. So I do mean to emphasize what I wrote in the first paragraph: if you think you might like something as described here, then I strongly encourage you to come!

(If someone attended both, then maybe they can weigh in even more authoritatively on whether my impression is accurate or whether more nuance would be beneficial.)

Comment by Dalmert on Sum-threshold attacks · 2023-09-14T00:38:50.772Z · LW · GW

In a not-too-fast and therefore requisitely stealthy ASI takeover scenario, if the intelligence explosion is not too steep, this could be a main meta-method by which the system gains increasing influence and power while fully remaining under the radar and avoiding detection until it is reasonably sure that it can no longer be opposed. This could be happening without anyone knowing or maybe even being able to know. Frightening.

Comment by Dalmert on AI: Practical Advice for the Worried · 2023-03-17T10:50:07.221Z · LW · GW

The employees of the RAND corporation, in charge of nuclear strategic planning, famously did not contribute to their retirement accounts because they did not expect to live long enough to need them.


Any sources for this? I tried searching around, to no avail so far, which is surprising if this is indeed famously known.

Comment by Dalmert on Personal predictions for decisions: seeking insights · 2023-02-21T18:51:14.276Z · LW · GW

I expect that until I find a satisfactory resolution to this topic, I might come back to it a few times, and potentially keep a bit of a log here of what I find in case it does add up to something. So far this is one of the things I found:

https://www.lesswrong.com/posts/JnDEAmNhSpBRpjD8L/resolutions-to-the-challenge-of-resolving-forecasts

This seems very relevant to part of what I was pondering, but I'm not sure yet how actionable the takeaways are.

Comment by Dalmert on Medlife Crisis: "Why Do People Keep Falling For Things That Don't Work?" · 2023-02-21T07:53:33.488Z · LW · GW

I strong-upvoted this, but I fear you won't see a lot of traction on this forum for this idea.

I have a vague understanding of why, but I don't think I've heard compelling enough reasons from other LWers yet. If someone has some, I'd be happy to read them or be pointed towards them.

I value empiricism highly, i.e. putting ideas into action to be tested against the universe; but I think I've read EY state somewhere that a superintelligence would need to perform very few or even zero experiments to find out a lot (or even most? all?) true things about our universe that we humans need painstaking effort and experiments for.

Please don't consider this very vague recollection as anywhere close to a steelman.

I think this was motivated by how many bits of information can be taken in even with human-like senses, and how a single bit of information can halve a set of hypotheses. Here is where I have not yet seen sufficient motivation for this argument: this can indeed be true for very valuable bits of information, but are we assuming that any entity will easily be able to receive those very valuable bits? Surely a lot of bits are redundant and give no novel information, and some bits are very costly to attain. Sometimes you are lucky if you can eliminate even a single potential hypothesis, and even that is costly and requires interacting with the universe instead of just passively observing it.
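
To make the bit-counting concrete, here is a toy sketch. Everything in it is an assumption made up for illustration: the 1024 hypotheses, the three predicates, the function names, and the premise that all hypotheses are equally likely. It only shows that one observed bit halves the set when the hypotheses split evenly on it, eliminates nothing when they all agree, and eliminates very little when almost all of them agree.

```python
import math

def surviving(hypotheses, predicts, observed):
    """Keep only the hypotheses whose predicted bit matches the observed bit."""
    return [h for h in hypotheses if predicts(h) == observed]

def bits_gained(before, after):
    """Information gained, in bits, when `before` equally likely
    hypotheses shrink to `after` (toy assumption: uniform prior)."""
    return math.log2(before / after)

# 1024 equally likely toy hypotheses, labelled 0..1023 (hypothetical).
hypotheses = list(range(1024))

# Ideal observation: half of the hypotheses predict 1, half predict 0.
ideal = surviving(hypotheses, lambda h: h % 2, 1)

# Redundant observation: every hypothesis predicts the same bit.
redundant = surviving(hypotheses, lambda h: 1, 1)

# Weak observation: only a single hypothesis disagrees with what was seen.
weak = surviving(hypotheses, lambda h: 0 if h == 0 else 1, 1)

for name, left in [("ideal", ideal), ("redundant", redundant), ("weak", weak)]:
    gain = bits_gained(len(hypotheses), len(left))
    print(f"{name}: {len(left)} hypotheses left, {gain:.4f} bits gained")
```

In this toy setup the ideal observation is worth exactly one bit, the redundant one is worth zero, and the weak one is worth a small fraction of a bit, even though each of them is a single received bit.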

But let's hear it from others!

(I'm not sure if this spectrum of positions has any accepted names, maybe rationalist vs empiricist?)