LessWrong 2.0 Reader

Recent comments

dagon on Zero-Sum Defeats Nash Equilibrium

Reason for downvote: this doesn't make clear (and is probably wrong about) how the post ties to the game-theory terms "zero sum" and "Nash equilibrium". I suspect they don't mean what you think they mean, but perhaps you're just focusing on other aspects of the decisions, where the game theory is less directly important.

In fact, neither bike protections nor crime is fixed-sum. If everyone buys locks, thieves go to a bit more effort to defeat the locks, and there's probably LESS theft, but not zero. The Nash equilibrium for effort-to-secure vs effort-to-steal will depend entirely on payoffs, and there's no reason to believe it's legible enough to find (or that it even contains) a zero-crime option.
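To make the payoff-dependence concrete, here is a minimal sketch of a 2x2 lock-vs-steal game. The payoff numbers are invented purely for illustration (they are not from the post or from any data), and the function names are hypothetical. Under these assumed payoffs the game is not fixed-sum, has no pure-strategy equilibrium, and its mixed equilibrium still has a nonzero chance of theft.

```python
from fractions import Fraction
from itertools import product

# Hypothetical payoffs, chosen only for illustration.
# Rows: owner plays 0 = no lock, 1 = lock.
# Columns: thief plays 0 = leave the bike, 1 = attempt theft.
OWNER = [[0, -10], [-1, -3]]
THIEF = [[0, 5], [0, -1]]


def is_fixed_sum(a, b):
    """True if owner + thief payoffs add to the same constant in every cell."""
    sums = {a[i][j] + b[i][j] for i, j in product(range(2), repeat=2)}
    return len(sums) == 1


def pure_nash(a, b):
    """Return all cells where neither player gains by deviating alone."""
    equilibria = []
    for i, j in product(range(2), repeat=2):
        owner_ok = all(a[i][j] >= a[k][j] for k in range(2))
        thief_ok = all(b[i][j] >= b[i][k] for k in range(2))
        if owner_ok and thief_ok:
            equilibria.append((i, j))
    return equilibria


def mixed_nash_2x2(a, b):
    """Interior mixed equilibrium from the indifference conditions.

    Assumes such an equilibrium exists (true for the payoffs above).
    Returns (p_lock, q_steal).
    """
    # p makes the thief indifferent between leaving and stealing.
    p = Fraction(b[0][0] - b[0][1], b[0][0] - b[0][1] - b[1][0] + b[1][1])
    # q makes the owner indifferent between locking and not locking.
    q = Fraction(a[0][0] - a[1][0], a[0][0] - a[1][0] - a[0][1] + a[1][1])
    return p, q


if __name__ == "__main__":
    print("fixed-sum:", is_fixed_sum(OWNER, THIEF))
    print("pure equilibria:", pure_nash(OWNER, THIEF))
    p_lock, q_steal = mixed_nash_2x2(OWNER, THIEF)
    print("P(lock) =", p_lock, " P(steal) =", q_steal)
```

Changing any payoff (lock cost, theft success rate, resale value) moves the equilibrium, which is the sense in which the outcome "depends entirely on payoffs".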

erik-jenner on Oliver Daniels-Koch's Shortform

Nice overview, agree with most of it!

weak to strong generalization is a class of approaches to ELK which relies on generalizing a "weak" supervision signal to more difficult domains using the inductive biases and internal structure of the strong model.

You could also distinguish between weak-to-strong generalization, where you have a weak supervision signal on the entire distribution (which may sometimes be wrong), and easy-to-hard generalization, where you have a correct supervision signal but only on an easy part of the distribution. Of course both of these are simplifications. In reality, I'd expect the setting to be more like: you have a certain weak supervision budget (or maybe even budgets at different levels of strength), and you can probably decide how to spend the budget. You might only have an imperfect sense of which cases are "easy" vs "hard" though.
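As a toy illustration of the two simplified settings, the sketch below builds both kinds of supervision over the same synthetic examples. Everything here is an assumption made for illustration: the function name, the 50% hard fraction, and the 20% weak-label error rate are all hypothetical, and real setups would not have a clean easy/hard flag.

```python
import random


def make_supervision(n=1000, hard_fraction=0.5, weak_error_rate=0.2, seed=0):
    """Build toy labels for the two settings (all parameters are assumptions)."""
    rng = random.Random(seed)
    # Each example is (true_label, is_hard).
    examples = [(rng.random() < 0.5, rng.random() < hard_fraction) for _ in range(n)]

    # Weak-to-strong: a label on every example, but some labels are flipped.
    weak_to_strong = [
        (not truth) if rng.random() < weak_error_rate else truth
        for truth, _ in examples
    ]

    # Easy-to-hard: always-correct labels, but only on easy examples.
    easy_to_hard = [None if hard else truth for truth, hard in examples]

    return examples, weak_to_strong, easy_to_hard


if __name__ == "__main__":
    examples, w2s, e2h = make_supervision()
    wrong = sum(label != truth for label, (truth, _) in zip(w2s, examples))
    covered = sum(label is not None for label in e2h)
    print(f"weak-to-strong: {len(w2s)} labels, {wrong} wrong")
    print(f"easy-to-hard: {covered} labels, all correct, hard cases unlabeled")
```

The budgeted setting described above would sit between these two, with the labeler choosing which examples get which strength of label.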

mechanistic anomaly detection is an approach to ELK

I think going from MAD to a fully general ELK solution requires some extra ingredients. In practice, the plan might be to do MTD and then use the AI in ways such that this is enough (rather than needing a fully general ELK solution). This is related to narrow elicitation, though MTD seems even narrower. Even for MTD, you probably need something to bridge the easy-to-hard gap, but at least for that there are specific proposals that seem plausible (this [AF · GW] or, as a more concrete instance, exclusion fine-tuning from the Redwood MTD paper). I think it could turn out that general/worst-case solutions to MAD and ELK run into very similar obstacles, but I don't think a practical MAD solution (e.g. contingent on empirical facts about deep learning) obviously lets you solve ELK.

I would also add that you could motivate MAD as a method to deal with scheming (or other high-stakes failures). In that case, the things to compare against most naturally might look a bit different (e.g. AI control, coup probes, interpretability-themed things); and it wouldn't make sense to compare against debate in that setting. I think most mainstream ML problems that are similar to MAD are closer to this than to scalable oversight.

benito on Raemon's Shortform

I am fairly strongly against having faces, which I think boot up a lot of social instincts that I disprefer on LessWrong, which is about which argument is true, not who you like / have relationships with. I think some other sort of unique icon could be good.

rotatingpaguro on Dating Roundup #3: Third Time’s the Charm

Are you libertarian about this specifically? Do you think it's better if people also have the choice of dating apps? Or would you ban them if given the choice?

rotatingpaguro on Dating Roundup #3: Third Time’s the Charm

As noted last time, Rob Henderson finds that women in their twenties swipe right (‘like’) twice as often for a man with a master’s degree over a bachelor’s degree.

Causal or association?

Manifold Love: pro-tip: if a woman measures her hand against yours, this is almost always flirtation.

Totally did not know this. Is this true?

2. Authenticity and openness with your partner tends to be reciprocal and strongly predicts relationship satisfaction. That makes sense, this is underrated.

Is this causal? I mean, maybe being yourself and open works for people who happen to already be relationship-compatible. People who are not would be worse off by trying to be themselves. I think I have been burned in the past a lot by that kind of advice, although my experience is too much of an anecdote to infer an average.

unexpectedvalues on My hour of memoryless lucidity

Update: the strangely-textured fluid turned out to be a dentigerous cyst, which was the best possible outcome. I won't need a second surgery :)

unexpectedvalues on My hour of memoryless lucidity

I just asked -- it was a combination of midazolam (as you had hypothesized), propofol, fentanyl (!), and ketamine.

gunnar_zarncke on Dating Roundup #3: Third Time’s the Charm

As I have said elsewhere [LW(p) · GW(p)]:

Dating apps are broken. Maybe it's better dating apps die soon.

On the supplier side: Misaligned incentives (keep users on the platform) and opaque algorithms lead to bad matches.

On the demand side: Misaligned incentives (first impressions, low cost to exit) and no plausible deniability lead to predators being favored.

Real dating happens when you can observe many potential mates and there is a path to getting closer. Traditionally that was schools, clubs, church, work. Now, not so much. Let's build something that fosters what was lost, not double down on a failed principle - 1-to-1 matching.

mikhail-samin on How do open AI models affect incentive to race?

If the new Llama is comparable to GPT-5 in performance, there’s much less short-term economic incentive to train GPT-5.
If an open model allows some of what people would otherwise pay a closed-model developer for, there’s less incentive to be a closed-model developer.
People work on frontier models without trying to get to AGI. Talent is attracted to work at a lab that releases models and then work on random corporate ML instead of building AGI.

But:

Sharing information on frontier model architecture and/or training details, which inevitably happens if you release an open-source model, gives the whole field insights that reduce the time until someone knows how to make something that will kill everyone.
If you know a version of Llama comparable to GPT-4 is going to be released, you want to release a model comparable to GPT-4.5 before your customers stop paying you because they can switch to open-source.
People gain experience with frontier models and the talent pool for racing to AGI increases. If people want to continue working on frontier models but their workplace can’t continue to spend as much as frontier labs on training runs, they might decide to work for a frontier lab instead.
Not sure, but maybe some of the infrastructure powered by open models might be switchable to closed models, and this might increase profits for closed-source developers if customers become familiar with/integrate open-source models and then want to replace them with more capable systems, when it’s cost-effective?
Mostly less direct: availability of open-source models for irresponsible use might make it harder to put in place regulation that’d reduce the race dynamics (via various destabilizing ways they can be used).

romeostevensit on Dating Roundup #3: Third Time’s the Charm

Most of the useful ones are fairly symmetrical. Things like taking care of health and appearance for yourself but also more effort than you would otherwise on the margin because you care about your partner's experience. Taking note of things that seem specific to your partner/make them happy and noticing opportunities to do them. Noticing that the way your partner expresses care is probably the way they also wish they could receive it, and symmetrically noticing that the ways you keep expressing care for your partner are ways you secretly want care and doing the counterintuitively difficult emotional work of learning to ask for it instead of resenting their lack of mind-reading.

Creating space in which your partner can be vulnerable to expose their real preference (e.g. sexual preferences). Both men and women have a pretty hard time with this (especially any gender-narrative dystonic preferences) and often have had some pretty hurtful rejections in the past from other unthinking young people.

Then there are things that people present as if they are relationship obligations and try to avoid the emotional maturity of having them be explicitly discussed requests instead of tacitly held/resented demands. Such as coddling their coping mechanisms while not being allowed to acknowledge that you are paying costs to accommodate them (or you doing this to them).

Men often wind up in the valley of bad emotional sensitivity where they think that going into these sorts of things will make them unattractive/feminine (see: gender-narrative dystonic), mostly because they haven't had solid models of masculine emotional space-holding from their fathers and older male peers. Modern age siloing isolates people from a lot of feedback from older people at every stage of life, so they don't have a context in which to train the first awkward 100 hours of these skills. I think this is often why people report things like circling being very helpful.
