LessWrong 2.0 Reader

Recent comments

tag on Super additivity of consciousness

Under physicalist epiphenomenalism (which is the standard approach to the mind-matter relation), the mind is superimposed on reality, perfectly synchronized, and parallel to it.

Under dualist epiphenomenalism, that might be true.

duschkopf on Semantic Disagreement of Sleeping Beauty Problem

If it were true that the concept of "indexical sample space" does not capture the thirder position, how do you explain that it produces exactly the same probabilities that thirders entertain? Operating with indexicals is a necessary condition (and motivation) for Thirdism, which means assuming indexical sample spaces when it comes to the mathematical formalization of arguments in terms of probability theory. To my knowledge no relevant thirder literature denies that. And within the thirder model, these probabilities indeed hold true. If we assume Monday and Tuesday to be mutually exclusive, then this is mathematically the case. Math is not a judge of our assumptions here; it is merely the executive organ, which in this case produces thirder probabilities. The point at issue is whether the theoretical assumptions of the thirder model fit reality and whether its probabilities can be transferred to the real world. Thirders say yes, speaking of regular probabilities; halfers say no, speaking of irregular, "weighted" probabilities.
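As a rough illustration of the point that the math only executes whatever assumptions it is given, here is a minimal sketch that enumerates the three awakenings and computes P(Heads) under two weightings. The names `AWAKENINGS`, `thirder_model`, `halfer_model`, and `p_heads` are hypothetical, and the particular halfer weighting shown (1/2 per coin result, split evenly across that result's awakenings) is an assumption about one common way to formalize "weighted" probabilities, not something stated in the comment.

```python
from fractions import Fraction

# Awakenings in one run of the experiment, keyed by coin result.
AWAKENINGS = {"Heads": ["Monday"], "Tails": ["Monday", "Tuesday"]}


def thirder_model():
    """Indexical sample space: every (coin, day) awakening is treated as a
    mutually exclusive outcome, and all outcomes get equal weight."""
    outcomes = [(coin, day) for coin, days in AWAKENINGS.items() for day in days]
    weight = Fraction(1, len(outcomes))
    return {outcome: weight for outcome in outcomes}


def halfer_model():
    """Assumed 'weighted' alternative: each coin result gets 1/2, and that
    weight is split evenly across the awakenings it produces."""
    return {
        (coin, day): Fraction(1, 2) / len(days)
        for coin, days in AWAKENINGS.items()
        for day in days
    }


def p_heads(model):
    """Total weight of the outcomes in which the coin landed Heads."""
    return sum(w for (coin, _), w in model.items() if coin == "Heads")


print(p_heads(thirder_model()))  # 1/3
print(p_heads(halfer_model()))   # 1/2
```

Both results follow mechanically from the chosen weights; the sketch does not say which set of assumptions fits the real experiment.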

lauro-langosco on RobertM's Shortform

Yeah, fair point. I do think labs have some nonzero amount of responsibility to be proactive about what others believe about their commitments. I agree it doesn't extend to 'rebut every random rumor'.

oliver-daniels-koch on Oliver Daniels-Koch's Shortform

I think I'm mostly right, but using a somewhat confused frame.

It makes more sense to think of MAD approaches as detecting all abnormal reasons (including deceptive alignment) by default, and then if we get that working we'll try to decrease false anomalies by doing something like comparing the least common ancestor of the measurements in a novel mechanism to the least common ancestor of the measurements on trusted mechanisms.

linda-linsefors on LessWrong Community Weekend 2024 [Applications Open]

Thanks :)

ramblindash on Dating Roundup #3: Third Time’s the Charm

[M]aybe being yourself and open works for people who happen to already be relationship-compatible. People who are not would be worse off by trying to be themselves. I think I have been burned in the past a lot by that kind of advice, although my experience is too much of an anecdote to infer an average.

I think you are maybe using a different definition of "worse off." I would submit that a relationship that is maintainable only by being inauthentic and unopen is, in the long run, significantly worse than no relationship, both because of the experience of being in it and because of the opportunity cost.

That's different than holding some things back at the beginning, or keeping some impolite thoughts to yourself sometimes. But if your goal is a long-term partnership, you move further away from that goal by spending time and energy on someone you know you aren't compatible with.

molly on Forecasting: the way I think about it

Good q, yes, that's the vertical axis in all the figures.

viliam on Dating Roundup #3: Third Time’s the Charm

How can expectations exist without roles? When everyone is free to do whatever they want to, no one can expect anything specific...

Well, we can still have general, i.e. not gender-specific expectations, such as: people should be nice and emotionally mature. Nothing wrong with that. But it seems like the traditional gender roles also provided some gender-specific "hacks", and now we don't have them.

Or you could ask which traits are valued at the dating marketplace, or more specifically at the part you are interested in. But there is no general answer anymore; it depends on what you are looking for. For example, if you want to have a traditional relationship, it would make sense to behave according to the traditional roles, and expect the same from your potential partners. Other subcultures have different rules. And I suppose most people are confused, do random things, get random results, then hopefully learn and try something different.

oliver-daniels-koch on Oliver Daniels-Koch's Shortform

One confusion I have with MAD as an approach to ELK is that it seems to assume some kind of initial inner alignment. If we're flagging when the model takes actions / makes predictions for "unusual reasons", where unusual is defined with respect to some trusted set, but aligned and misaligned models are behaviorally indistinguishable on the trusted set, then a model could learn to do things for misaligned reasons on the trusted set, and then use those same reasons on the untrusted set. For example, a deceptively aligned model would appear aligned in training but attempt takeover in deployment for the "same reason" (e.g. to maximize paperclips), but a MAD approach that "properly" handles out-of-distribution cases would not flag takeover attempts because we want models to be able to respond to novel situations.

I guess this is part of what motivates measurement tampering as a subclass of ELK: instead of trying to track the motivations of the agent as reasons, we try to track the reasons for the measurement predictions, and we have some trusted set with no tampering, where we know the reason for the measurements is ~exactly the thing we want to be measuring.

Now time to check my answer by rereading https://www.alignmentforum.org/posts/vwt3wKXWaCvqZyF74/mechanistic-anomaly-detection-and-elk

keltan on some thoughts on LessOnline

That’s a great idea, thank you!

And here it is: https://manifold.markets/keltan/will-there-be-a-lessonline-2025
