LessWrong 2.0 Reader
quila on quila's Shortform
According to Seigen Ishin (Ch'ing-yüan Wei-hsin):
"Before a man studies Zen, to him mountains are mountains and waters are waters; after he gets an insight into the truth of Zen through the instruction of a good master, mountains to him are not mountains and waters are not waters; but after this when he really attains to the abode of rest, mountains are once more mountains and waters are waters."
(D. T. Suzuki, Essays in Zen Buddhism, First Series, 1926, London; New York: Published for the Buddhist Society, London by Rider, p. 24.)
Conditional on us solving alignment, I agree it's more likely that we live in an "easy-by-default" world, rather than a "hard-by-default" one in which we got lucky or played very well.
I think that language in discussions of anthropics is unintentionally prone to masking ambiguities or conflations, especially wrt logical vs indexical probability [LW · GW], so I want to be very careful writing about this. I think there may be some conceptual conflation happening here, but I'm not sure how to word it. I'll see if it becomes clear indirectly.
One difference between our intuitions may be that I'm implicitly thinking within a many-worlds frame. Within that frame it's actually certain that we'll solve alignment in some branches.
So if we then 'condition on solving alignment in the future', my mind defaults to something like this: "this is not much of an update; it just means we're in a future where the past was not a death outcome. Some of the pasts leading up to those futures had really difficult solutions, and some of them managed to find easier ones or get lucky. The probabilities of these non-death outcomes relative to each other have not changed as a result of this conditioning." (I.e., I disagree with the quote.)
The most probable reason I can see for this difference is if you're thinking in terms of a single future, where you expect to die.[1] In this frame, if you observe yourself surviving, you should update your logical belief that alignment is hard. (because P(continued observation|alignment being hard) is low, if we imagine a single future, but certain if we imagine the space of indexically possible futures)
Whereas I read it as only indexical, and am generally thinking about this in terms of indexical probabilities.
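To make that contrast concrete, here's a toy Bayes calculation; the 0.1 / 0.9 likelihoods below are made-up numbers purely for illustration, not estimates anyone in this thread has endorsed:

```python
# Toy numbers only: the likelihoods here are illustrative assumptions.

def posterior_hard(p_obs_given_hard, p_obs_given_easy, prior_hard=0.5):
    """Bayes update on 'alignment is hard' after observing continued survival."""
    p_obs = p_obs_given_hard * prior_hard + p_obs_given_easy * (1 - prior_hard)
    return p_obs_given_hard * prior_hard / p_obs

# Single-future frame: continued observation is unlikely if alignment is hard,
# so observing survival is strong evidence that it's easy.
print(posterior_hard(0.1, 0.9))  # -> 0.1

# Indexical / many-worlds frame: some surviving branch always contains the
# observer, so P(continued observation | hard) = 1 and the belief doesn't move.
print(posterior_hard(1.0, 1.0))  # -> 0.5
```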
I totally agree that we shouldn't update our logical beliefs in this way. I.e., with regard to beliefs about logical probabilities (such as 'alignment is very hard for humans'), we "shouldn't condition on solving alignment, because we haven't yet." I.e., we shouldn't condition on the future not being mostly death outcomes when we haven't averted them and have reason to think they are.
Maybe this helps clarify my position?
On another point:
the developments in non-agentic AI we're facing are still one regime change away from the dynamics that could kill us
I agree with this, and I still found the current lack of goals over the world surprising, and worth trying to achieve as a trait of superintelligent systems.
(I'm not disagreeing with this being the most common outcome)
These are quizzes you make yourself. Did OKC ever have those? It's not for a matching percentage.
A quiz in paiq is 6 questions: 3 multiple choice and 3 open. If someone gets the multiple-choice answers right, you get to see their open-question answers as a match request, and you can accept or reject the match based on that. I think it's really great.
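As a toy sketch of that flow (the class and field names here are mine, not paiq's actual data model):

```python
# Hypothetical model of the quiz flow described above.
from dataclasses import dataclass

@dataclass
class Quiz:
    multiple_choice: dict[str, str]  # question -> correct option
    open_questions: list[str]        # free-form questions

def take_quiz(quiz: Quiz, mc_answers: dict[str, str], open_answers: list[str]):
    """Only if all multiple-choice answers are right does the quiz owner
    see the open answers, as a match request to accept or reject."""
    if all(mc_answers.get(q) == correct
           for q, correct in quiz.multiple_choice.items()):
        return {"match_request": open_answers}
    return None  # the owner never sees anything
```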
You can also browse other people's tests and see if you want to take any. The tests seem more descriptive of someone than most written profiles I've read, because it's much harder to misrepresent personal traits in a quiz than in a self-declared profile.
martinsq on quila's Shortform
Everything makes sense except your second paragraph. Conditional on us solving alignment, I agree it's more likely that we live in an "easy-by-default" world, rather than a "hard-by-default" one in which we got lucky or played very well. But we shouldn't condition on solving alignment, because we haven't yet.
Thus, in our current situation, the only way anthropics pushes us towards "we should work more on non-agentic systems" is if you believe "worlds where we still exist are more likely to have easy alignment-through-non-agentic-AIs". Which you do believe, and I don't. Mostly because I think that in almost no worlds have we been killed by misalignment at this point. Or put another way, the developments in non-agentic AI we're facing are still one regime change away from the dynamics that could kill us (and information in the current regime doesn't extrapolate much to the next one).
christiankl on ChristianKl's Shortform
That's certainly also an option. Personally, I've found that I feel intuitively less drawn to NaCl+KCl than to NaCl+K2CO3 (I have both at home).
Most supplements that have mixes of electrolytes don't seem to use KCl and so would give you relatively less chloride than the NaCl+KCl mix.
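For a rough sense of the chloride difference, here's a back-of-the-envelope comparison (the 1:1 mass ratio is an illustrative assumption, not a claim about any particular product):

```python
# Approximate chloride content of the two mixes mentioned above,
# assuming a 1:1 mix by mass.
M = {"Na": 22.99, "Cl": 35.45, "K": 39.10, "C": 12.01, "O": 16.00}  # g/mol

cl_frac_nacl = M["Cl"] / (M["Na"] + M["Cl"])  # ~0.61 g Cl per g NaCl
cl_frac_kcl = M["Cl"] / (M["K"] + M["Cl"])    # ~0.48 g Cl per g KCl
# K2CO3 contains no chloride at all.

print(0.5 * cl_frac_nacl + 0.5 * cl_frac_kcl)  # ~0.54 g Cl per g of NaCl+KCl
print(0.5 * cl_frac_nacl)                      # ~0.30 g Cl per g of NaCl+K2CO3
```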
jay-bailey on How do I get better at D&D Sci?
pandas is a good library for this - it takes CSV files and turns them into Python objects you can manipulate. plotly / matplotlib let you visualise data, which is also useful. GPT-4 / Claude could help you with this. I would recommend starting by getting a language model to help you create plots of the data according to relevant subsets. For example, if you think the season matters for how much gold is collected, give the model a couple of examples of the data format and simply ask it to write a script to plot gold per season.
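A minimal sketch of such a script might look like this; the file name and column names are hypothetical stand-ins for whatever the scenario's dataset actually uses:

```python
# Sketch of the suggested workflow, assuming a CSV with 'season' and
# 'gold' columns (both names are made up for this example).
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("dnd_sci_data.csv")  # load the scenario's dataset

# Aggregate gold by season and plot it as a bar chart.
gold_per_season = df.groupby("season")["gold"].mean()
gold_per_season.plot(kind="bar", ylabel="mean gold collected",
                     title="Gold per season")
plt.tight_layout()
plt.show()
```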
quila on quila's Shortform
It sounds like you're anthropic updating on the fact that we'll exist in the future
The quote you replied to was meant to be about the past.[1]
I can see why it looks like I'm updating on existing in the future, though.[2] I think it may be more interpretable when framed as choosing actions based on what kinds of paths into the future are more likely, which I think should include assessing where our observations so far would fall.
Specifically, I think that ("we find a general agent-alignment solution right as takeoff is very near" given "early AGIs take a form that was unexpected") is less probable than ("observing early AGIs causes us to form new insights that lead to a different class of solution" given "early AGIs take a form that was unexpected"). Because I think that, and because I think we're at that point where takeoff is near, it seems like it's some evidence for being on that second path.
This should only constitute an anthropic update to the extent you think more-agentic architectures would have already killed us
I do think that. I think that superintelligence is possible to create with much less compute than is being used for SOTA LLMs. Here's a thread with some general arguments for this.
Of course, you could claim that our understanding of the past is not perfect, and thus should still update
I think my understanding of why we've survived so far re: AI is far from perfect. For example, I don't know what would have needed to happen for training setups which would have produced agentic superintelligence by now to be found first, or (framed inversely) how lucky we needed to be to survive this far.
~~~
I'm not sure if this reply will address the disagreement, or if it will still seem from your pov that I'm making some logical mistake. I'm not actually fully sure what the disagreement is. You're welcome to try to help me understand if one remains.
I'm sorry if any part of this response is confusing; I'm still learning to write clearly.
[1] I originally thought you were asking why it's true of the past, but then I realized we very probably agreed (in principle) in that case.
[2] And to an extent it internally feels like I'm doing this, and then asking "what do my actions need to be to make this be true" in a similar sense to how an FDT agent would act in transparent Newcomb's problem. But framing it like this is probably unnecessarily confusing, and I feel confused about this description.
(I think I misinterpreted your question and started drafting another response, will reply to relevant portions of this reply there)
martinsq on quila's Shortform
Yes, but
Under the anthropic principle [? · GW], we should expect there to be a 'consistent underlying reason' for our continued survival.
Why? It sounds like you're anthropic updating on the fact that we'll exist in the future, which of course wouldn't make sense because we're not yet sure of that. So what am I missing?