Posts

Gwern: Why So Few Matt Levines? 2024-10-29T01:07:27.564Z
Linkpost: Surely you can be serious 2024-07-18T22:18:09.271Z
Daniel Dennett has died (1942-2024) 2024-04-19T16:17:04.742Z
LessWrong's (first) album: I Have Been A Good Bing 2024-04-01T07:33:45.242Z
kave's Shortform 2024-03-05T04:35:13.510Z
If you weren't such an idiot... 2024-03-02T00:01:37.314Z
New LessWrong review winner UI ("The LeastWrong" section and full-art post pages) 2024-02-28T02:42:05.801Z
On plans for a functional society 2023-12-12T00:07:46.629Z
A bet on critical periods in neural networks 2023-11-06T23:21:17.279Z
Singular learning theory and bridging from ML to brain emulations 2023-11-01T21:31:54.789Z
The Good Life in the face of the apocalypse 2023-10-16T22:40:15.200Z
How to partition teams to move fast? Debating "low-dimensional cuts" 2023-10-13T21:43:53.067Z
Navigating an ecosystem that might or might not be bad for the world 2023-09-15T23:58:00.389Z
PSA: The Sequences don't need to be read in sequence 2022-05-23T02:53:41.957Z

Comments

Comment by kave on "It's a 10% chance which I did 10 times, so it should be 100%" · 2024-11-20T18:56:23.838Z · LW · GW

Curated! A short and sweet post, with a useful rule of thumb for those who live and die by the Fermi. There are some more good ones in the comments, too!
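
For reference, a minimal sketch of the arithmetic I take the rule of thumb to be pointing at, using the numbers from the post's title (my own illustration, not taken from the post):

    # A 10% chance tried 10 times does not give 100%;
    # the chance of at least one success is 1 - (1 - p)^n.
    p, n = 0.10, 10
    print(1 - (1 - p) ** n)  # ~0.651, close to the quick approximation 1 - e^(-p*n) ≈ 0.632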

Comment by kave on Why imperfect adversarial robustness doesn't doom AI control · 2024-11-19T03:15:39.602Z · LW · GW

I don't think this distinction is robust enough to rely on as much of a defensive property. I think it's probably not that hard to think "I probably would have tried something in direction X, or direction Y", and then gather lots of bits about how well the clusters X and Y work.

Comment by kave on Why imperfect adversarial robustness doesn't doom AI control · 2024-11-18T16:46:35.472Z · LW · GW

I'm curious about Claude's coauthorship of this post. How did it help?

Comment by kave on Why imperfect adversarial robustness doesn't doom AI control · 2024-11-18T16:46:20.686Z · LW · GW

The crucial point is: when the untrusted model gets to step 6, it sees R in the history instead of A, so it knows its attempt was replaced.

Why does it know its attempt was replaced? Because it can guess it wouldn't have produced R? If it can tell that, how come it can't guess it specifically would have produced A?

Comment by kave on Making a conservative case for alignment · 2024-11-16T01:59:26.257Z · LW · GW

I thought this title meant the post would be making a case from conservative (i.e. minimal) assumptions.

Maybe change the title to "making a politically conservative case for alignment" or something?

Comment by kave on Eli's shortform feed · 2024-11-11T04:54:26.555Z · LW · GW

I wonder what the lifetime spend on dating apps is. I expect that for most people who ever pay, it's >$100.

Comment by kave on Eli's shortform feed · 2024-11-09T19:20:45.853Z · LW · GW

I think the credit assignment is legit hard, rather than just being a case of bad norms. Do you disagree?

Comment by kave on Eli's shortform feed · 2024-11-08T23:23:27.066Z · LW · GW

I would guess they tried it because they hoped it would be competitive with their other product, and sunset it because that didn't happen with the amount of energy they wanted to allocate to the bet. There may also have been an element of updating more about how much focus their core product needed.

I only skimmed the retrospective now, but it seems mostly to be detailing problems that stymied their ability to find traction.

Comment by kave on Eli's shortform feed · 2024-11-08T22:48:29.398Z · LW · GW

It's possible no one tried literally "recreate OkC", but I think dating startups are very oversubscribed by founders, relative to interest from VCs [1] [2] [3] (and I think VCs are mostly correct that they won't make money [4] [5]).

(Edit: I want to note that those are things I found after a bit of googling to see if my sense of the consensus was borne out; they are meant in the spirit of "several samples of weak evidence")

I don't particularly believe you that OkC solves dating for a significant fraction of people. IIRC, a previous time we talked about this, @romeostevensit suggested you had not sufficiently internalised the OkCupid blog findings about how much people prioritised physical attraction.

You mention manifold.love, but also mention it's in maintenance mode – I think because the type of business you want people to build does not in fact work.

I think it's fine to lament our lack of good mechanisms for public good provision, and claim our society is failing at that. But I think you're trying to draw an update that's something like "tech startups should be doing an unbiased search through viable, valuable businesses, but they're clearly not", or maybe, "tech startups are supposed to be able to solve a large fraction of our problems, but if they can't solve this, then that's not true", and I don't think either of these conclusions seems that licensed from the dating data point.

Comment by kave on Are Your Enemies Innately Evil? · 2024-11-06T16:25:23.878Z · LW · GW

Yes, though I'm not confident.

Comment by kave on Are Your Enemies Innately Evil? · 2024-11-06T07:30:44.637Z · LW · GW

I saw this poll and thought to myself "gosh, politics, religion and cultural opinions sure are areas where I actively try to be non-heroic, as they aren't where I wish to spend my energy".

Comment by kave on Habryka's Shortform Feed · 2024-11-01T21:35:50.115Z · LW · GW

They load it in as a web font (i.e. you load Calibri from their server when you load that search page). We don't do that on LessWrong.

Comment by kave on Habryka's Shortform Feed · 2024-11-01T20:59:24.773Z · LW · GW

Yeah, that's a Google Easter egg. You can also try "Comic Sans" or "Trebuchet MS".

Comment by kave on Habryka's Shortform Feed · 2024-10-31T01:11:07.069Z · LW · GW

One sad thing about older versions of Gill Sans: capital I, lowercase l and the digit 1 all look the same. Nova at least distinguishes the 1.

IMO, we should probably move towards system fonts, though I would like to choose something that preserves character a little more.

Comment by kave on Open Thread Fall 2024 · 2024-10-29T02:49:29.358Z · LW · GW

I don't think we've changed how often we use serifs vs sans serifs. Is there anything particular you're thinking of?

Comment by kave on Gwern: Why So Few Matt Levines? · 2024-10-29T01:07:56.095Z · LW · GW

@gwern I think it prolly makes sense for me to assign this post to your account? Let me know if you're OK with that.

Comment by kave on Dark Forest Theories · 2024-10-28T19:15:26.236Z · LW · GW

For me, Dark Forest Theory reads strongly as "everyone is hiding, (because) everyone is hunting", rather than just "everyone is hiding".

Comment by kave on The hostile telepaths problem · 2024-10-28T05:39:37.922Z · LW · GW

From the related book Elephant in the Brain:

Here is the thesis we’ll be exploring in this book: We, human beings, are a species that’s not only capable of acting on hidden motives—we’re designed to do it. Our brains are built to act in our self-interest while at the same time trying hard not to appear selfish in front of other people. And in order to throw them off the trail, our brains often keep “us,” our conscious minds, in the dark. The less we know of our own ugly motives, the easier it is to hide them from others.

Comment by kave on johnswentworth's Shortform · 2024-10-25T18:03:45.165Z · LW · GW

I think Steve Hsu has written some about the evidence for additivity on his blog (Information Processing). He also talks about it a bit in section 3.1 of this paper.

Comment by kave on Automation collapse · 2024-10-23T07:52:56.255Z · LW · GW

It seems like there's a general principle here, that it's hard to use pure empiricism to bound behaviour over large input and action spaces. You either need to design the behaviour, or understand it mechanistically.

Comment by kave on What is the alpha in one bit of evidence? · 2024-10-23T02:55:32.931Z · LW · GW

I don't understand why you would short the market if your P(Doom) is high. I think most Dooms don't involve shorts paying off?

Comment by kave on Mark Xu's Shortform · 2024-10-10T01:27:41.752Z · LW · GW

ANT has a stronger safety culture, and so it is a more pleasant experience to work at ANT for the average safety researcher. This suggests that there might be a systematic bias towards ANT that pulls away from the "optimal allocation".

I think this depends on whether you think AI safety at a lab is more of an O-ring process or a swiss-cheese process. Also, if you think it's more of an O-ring process, you might be generally less excited about working at a scaling lab.
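
(A toy contrast to illustrate why the distinction matters, with made-up numbers rather than anything from the thread: in an O-ring model every component must succeed, so success probabilities multiply; in a swiss-cheese model a failure only gets through if it slips past every layer, so the layers' miss probabilities multiply instead.)

    from math import prod

    layers = [0.9, 0.9, 0.9]  # hypothetical per-layer success / catch probabilities

    # O-ring: everything must go right; one weak link sinks the whole process.
    o_ring_success = prod(layers)  # 0.729

    # Swiss cheese: a failure must evade every layer to get through.
    swiss_cheese_success = 1 - prod(1 - p for p in layers)  # 0.999

    print(o_ring_success, swiss_cheese_success)

Under the O-ring reading, overall safety is only as good as the weakest critical step; under the swiss-cheese reading, each added layer independently helps.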

Comment by kave on sarahconstantin's Shortform · 2024-10-07T18:31:41.280Z · LW · GW

the idea that social media was sending them personalized messages

I imagine they were obsessed with false versions of this idea, rather than obsession about targeted advertising?

Comment by kave on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually" · 2024-10-04T18:29:37.658Z · LW · GW

I'm not sure I'm understanding your setup (I only skimmed the post). Are you using takeoff to mean something like "takeoff from now" or "takeoff from [some specific event that is now in the past]"? If I look at your graph at the end, it looks to me like "Paul Slow" is a faster timeline but a longer takeoff (Paul Slow's takeoff beginning near the beginning of the graph, and Fast takeoff beginning around the intersection of the two blue lines).

Comment by kave on MichaelDickens's Shortform · 2024-10-03T18:53:25.824Z · LW · GW

Wasn't the relevant part of your argument like, "AI safety research outside of the labs is not that good, so that's a contributing factor among many to it not being bad to lose the ability to do safety funding for governance work"? If so, I think that "most of OpenPhil's actual safety funding has gone to building a robust safety research ecosystem outside of the labs" is not a good rejoinder to "isn't there a large benefit to building a robust safety research ecosystem outside of the labs?", because the rejoinder is focusing on relative allocations within "(technical) safety research", and the complaint was about the allocation between "(technical) safety research" vs "other AI x-risk stuff".

Comment by kave on Leon Lang's Shortform · 2024-10-03T18:41:44.936Z · LW · GW

I've not seen the claim that the scaling laws are bending. Where should I look?

Comment by kave on Open Thread Summer 2024 · 2024-10-01T18:31:48.239Z · LW · GW

possible worlds that split off when the photon was created

I don't think this is a very good way of thinking about what happens. I think worlds appear as fairly robust features of the wavefunction when quantum superpositions get entangled with large systems that differ in lots of degrees of freedom based on the state of the superposition.

So, when the intergalactic photon interacts non-trivially with a large system (e.g. Earth), a world becomes distinct in the wavefunction, because there's a lump of amplitude that is separated from other lumps of amplitude by distance in many, many dimensions. This means it basically doesn't interact with the rest of the wavefunction, and so looks like a distinct world.
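
(For concreteness, a standard decoherence sketch of what I mean, rather than anything specific to this thread: once the superposition is recorded in an environment state that differs in many degrees of freedom, the branches stop interfering.)

    \left(\alpha\,\lvert 0\rangle + \beta\,\lvert 1\rangle\right)\otimes\lvert E\rangle
    \;\longrightarrow\;
    \alpha\,\lvert 0\rangle\lvert E_0\rangle \;+\; \beta\,\lvert 1\rangle\lvert E_1\rangle,
    \qquad \langle E_0 \vert E_1 \rangle \approx 0

The near-orthogonality of E_0 and E_1 is the "distance in many, many dimensions" above: each lump of amplitude then evolves essentially independently, which is what makes it look like a distinct world.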

Comment by kave on Nisan's Shortform · 2024-09-28T04:01:44.804Z · LW · GW

I tried to replicate. At 20 it went on to 25, and I explained what it got wrong. I tried again. I interrupted at 6 and it stopped at 7, saying "Gotcha, stopped right at eleven!". I explained what happened and it said something like "Good job, you found the horrible, marrow cricket" (these last 3 words are verbatim) and then broke.

Comment by kave on Reformative Hypocrisy, and Paying Close Enough Attention to Selectively Reward It. · 2024-09-12T05:28:24.667Z · LW · GW

Thanks. I think a bunch of discussions I've seen or been part of could have been more focused by establishing whether the crux was "1 is bad" vs "I think this is an instance of 3, not 1".

Comment by kave on Building an Inexpensive, Aesthetic, Private Forum · 2024-09-12T05:23:52.452Z · LW · GW

IMO, pro Slack instances are wonderful for searching & good for many different kinds of media, though not mixed media (i.e. you can upload videos, photos and PDFs, and search over them all, including with speech recognition, but inserting photos into a message is annoying).

I'm not really familiar with Zulip or Discord.

(Also, I'm not sure whether pro Slack instances really qualify for (2) anymore.)

Comment by kave on Building an Inexpensive, Aesthetic, Private Forum · 2024-09-11T23:40:28.498Z · LW · GW

I don't know what "private" means to you, but if you just mean you can control who joins, I think Google Groups are a good choice for 2 - 4.

Zulip, Discord and Slack are all options as well, though they all (to differing degrees) encourage shorter, chattier posts.

Comment by kave on Building an Inexpensive, Aesthetic, Private Forum · 2024-09-10T02:20:51.856Z · LW · GW

I also expect it would be a bit more expensive than something like Said’s suggestions

Comment by kave on instruction tuning and autoregressive distribution shift · 2024-09-05T20:35:04.397Z · LW · GW

Is the central argumentative line of this post that high-quality & informative text in the training distribution rarely corrects itself, post-training locates the high-quality part of the distribution, and so LLMs rarely correct themselves?

Or is it the more specific claim that post-training is locating parts of the distribution where the text is generated by someone in a context that highlights their prestige from their competence, and such text rarely corrects itself?

I don't see yet why the latter would be true, so my guess is you meant the former. (Though I do think the latter prompt would more strongly imply non-self-correction).

Comment by kave on instruction tuning and autoregressive distribution shift · 2024-09-05T19:01:39.728Z · LW · GW

I'm not sure whether this is important to the main thrust of the post, but I disagree with most of this paragraph:

Again, they're an expert in the field -- and this is the sort of claim that would be fairly easy to check even if you're not an expert yourself, just by Googling around and skimming recent papers.  It's also not the sort of claim where there's any obvious incentive for deception.  It's hard to think of a plausible scenario in which this person writes this sentence, and yet the sentence is false or even controversial.

In my experience, it's quite hard to check what "the gold standard" of something is, particularly in cutting-edge research fields. There are lots of different metrics on which methods compete, and it's hard to know their importance as an outsider.

And the obvious incentive for deception is that the physics prof works on NPsM, and so is talking it up (or has developed a method that beats NPsM on some benchmark, and so is talking it up to impress people with their new method ...)

Comment by kave on Habryka's Shortform Feed · 2024-09-04T21:38:29.951Z · LW · GW

Regarding the sign of Lightcone Offices: I think one sort of score for a charity is the stuff that it has done, and another is the quality of its generator of new projects (and the past work is evidence for that generator).

I'm not sure exactly the correct way to combine those scores, but my guess is most people who think the offices and their legacy were good should like us having money because of the high first score. And people who think they were bad should definitely be aware that we ran them (and chose to close them) when evaluating our second score.

So, I want us to list it on our impact track record section, somewhat regardless of sign.

Comment by kave on Eli's shortform feed · 2024-08-30T20:01:36.448Z · LW · GW

What are the semantics of "otherwise"? Are they more like:

  • X otherwise Y ↦ X → ¬Y, or
  • X otherwise Y ↦ X ↔ ¬Y

Comment by kave on Eli's shortform feed · 2024-08-30T17:46:03.514Z · LW · GW

Presumably you also want the policy to include that you don't want "Y" and weren't going to do "X" anyway?

Comment by kave on Eli's shortform feed · 2024-08-29T20:16:34.461Z · LW · GW

What is the "don't give in to threats" policy that this is more complex than? In particular, what are 'threats'?

Comment by kave on Open Thread Summer 2024 · 2024-08-22T03:56:01.472Z · LW · GW

Yeah, I think if we don’t do a UI rework soon to get rid of it (while still giving some prominence to the markets where they exist), we should at least do some special casing of its commenting behaviour.

Comment by kave on Please do not use AI to write for you · 2024-08-21T23:11:51.871Z · LW · GW

I agree. I realise the irony of this given that I worked on the big splash pages for the review winner posts.

Comment by kave on You don't know how bad most things are nor precisely how they're bad. · 2024-08-21T01:01:16.154Z · LW · GW

Curated.

This was a fun post to read! I liked learning about God's prank on musicians, a little bit about how pianos work and how tuning works. I particularly appreciated how Solenoid_Entity shared what it was like trying to hear the problems and what the fixes sounded like. I feel like I got a lot of detail about what the actual subtle problems in the audio waves were, whether or not they were perceptible to plebs like me.

I'm not sure if I agree about the importance of preserving high-quality tunings like this. I lean towards yes, but mainly because I expect a bunch of people would actually enjoy music slightly more in a world with better tunings. Not least, because it might make a difference to the production processes of music makers.

The comments were good on this one. I particularly liked the thread under Garrett's comment, which made me think about the tradeoffs between the abundance of mass production and the often higher quality cap of artisanal work (though I think the absolute quality cap is normally higher for industrial production).

For more on potential incommensurability of skills, see: What Money Cannot Buy.

Comment by kave on Why you should be using a retinoid · 2024-08-19T21:47:15.532Z · LW · GW

No. Topical adapalene has side effects. Also, not sure why you're capitalizing retinoids.

I just checked 5 of the individual wiki pages linked from the retinoid page. You suggest they have side effects (together with their mechanisms of action) that "[indicate] negative long-term effects".

None of the linked pages' listed topical side effects indicated that to me. Here is the section on side effects for adapalene:

Of the three topical retinoids, adapalene is often regarded as the best tolerated. It can cause mild adverse effects such as photosensitivity, irritation, redness, dryness, itching, and burning, and 1% to 10% of users experience a brief sensation of warmth or stinging, as well as dry skin, peeling and redness during the first two to four weeks using the medication. These effects are considered mild and usually decrease over time. Serious allergic reactions are rare.

Pregnancy & lactation

Use of topical adapalene in pregnancy has not been well studied but has a theoretical risk of retinoid embryopathy. Thus far, there is no evidence that the cream causes problems in the baby if used during pregnancy. Use is at the consumer's own risk.

Topical adapalene has poor systemic absorption and results in low blood levels (less than 0.025 mcg/L) even after long term use, suggesting that there is low risk of harm for a nursing infant. However, it is recommended that the topical medication should not be applied to the nipple or any other area that may come into direct contact with the infant's skin.

What in here indicates negative long-term effects?

Comment by kave on Why you should be using a retinoid · 2024-08-19T16:11:27.471Z · LW · GW

After checking out Melissa55’s YouTube channel, probably worth noting she was on HRT from her 40s until recently, so that might confound the retinoids effect for her in particular.

Comment by kave on 80,000 hours should remove OpenAI from the Job Board (and similar EA orgs should do similarly) · 2024-08-17T07:07:07.522Z · LW · GW

I think if you made a bot that posted essentially the same comment on every post, differing only in, say, a link to a high-quality audio narration of that post, it would probably be acceptable behaviour.

EDIT: Though my true rejection is more like, I wouldn't rule out the site admins making an auto commenter that reminded people of argumentative norms or something like that. Of course, it seems likely that whatever end the auto commenter was supposed to serve would be better served using a different UI element than a comment (as also seems true here), but it's not something I would say we should never try.

I think as site admins we should be trying to serve something like the overall health and vision of the site, and not just locally the user's level of annoyance, though I do think the user's level of annoyance is a relevant thing to take into account!

There's something a little loopy here that's hard to reason about. People might be annoyed because a comment burns the commons. But I think there's a difference in opinion about whether it's burning or contributing to the commons. And then, I imagine, those who think it's burning the commons want to offer their annoyance as proof of the burn. But there's a circularity there I don't know quite how to think through.

Comment by kave on Fields that I reference when thinking about AI takeover prevention · 2024-08-17T01:23:52.145Z · LW · GW

I think that would be a good series of posts! Especially if the person was reviewing the recommendations analytically, trying to figure out if they make sense in the source domain, seeing if they make sense for AI, and so on.

Comment by kave on Recommendation: reports on the search for missing hiker Bill Ewasko · 2024-08-15T19:36:33.146Z · LW · GW

See also frontier64 and eukaryote on helicopter searches.

Comment by kave on Open Thread Summer 2024 · 2024-08-15T02:24:36.779Z · LW · GW

To the extent you're saying that the "Personal" name for the category is confusing, I agree. I'm not sure what a better name is, but I'd like to use one.

Your last paragraph is in the right ballpark, but by my lights the central concern isn't so much about LessWrong mods getting involved in fights over what goes on the frontpage. It's more about keeping the frontpage free of certain kinds of context requirements and social forces.

LessWrong is meant for thinking and communicating about rationality, AI x-risk and related ideas. It shouldn't require familiarity with the social scenes around those topics.

Organisations aren't exactly "a social scene". And they are relevant to modeling the space's development. But I think there's two reasons to keep information about those organisations off the frontpage.

  1. While relevant to the development of ideas, that information is not the same as the development of those ideas. We can focus on orgs' contributions to the ideas without focusing on organisational changes.
  2. It helps limit certain social forces. My model for why LessWrong keeps politics off the frontpage is to minimize the risk of coöption by mainstream political forces and fights. Similarly, I think keeping org updates off the frontpage helps prevent LessWrong from overly identifying with particular movements or orgs. I'm afraid this would muck up our truth-seeking. Powerful, high-status organizations can easily warp discourse. "Everyone knows that they're basically right about stuff". I think this already happens to some degree – comments from staff at MIRI, ARC, Redwood, Lightcone seem to me to gain momentum solely from who wrote them. Though of course it's hard to be sure, as the comments are often also pretty good on their merits.

As AI news heats up, I do think our categories are straining a bit. There's a lot of relevant but news-y content. I still feel good about keeping things like Zvi's AI newsletters off the frontpage, but I worry that putting them in the "Personal" category de-emphasizes them too much.

Comment by kave on Californians, tell your reps to vote yes on SB 1047! · 2024-08-14T21:16:08.983Z · LW · GW

I did this. They noted down my support! Though they also didn't really give me a sign they understood what I was saying (and I did a pretty poor job of explaining).

Rep: Hello, office of Buffy Wicks.
kave: Oh, uh ... hi. I'm calling ... about SB-1047 ... and ... I guess I wanted to check if I can register my support given that I live in Buffy Wicks' area but I can't vote ...
Rep: OK I'll note that down as support. Have a great day!

Comment by kave on Californians, tell your reps to vote yes on SB 1047! · 2024-08-14T17:44:46.870Z · LW · GW

I guess if I'm worried that this is important to them, I can just proactively bring it up.

Comment by kave on Californians, tell your reps to vote yes on SB 1047! · 2024-08-14T17:30:39.328Z · LW · GW

I've been thinking about calling to support this bill, but haven't because I'm worried that as a resident who can't vote, they don't want to hear from me. My understanding is that if you tell the Californian rep you don't have the right to vote (e.g. because you're on a visa), they will ignore you. And that you can probably mislead without lying, but it will be necessary to mislead.

Anyone know better?