Posts

Gwern: Why So Few Matt Levines? 2024-10-29T01:07:27.564Z
Linkpost: Surely you can be serious 2024-07-18T22:18:09.271Z
Daniel Dennett has died (1942-2024) 2024-04-19T16:17:04.742Z
LessWrong's (first) album: I Have Been A Good Bing 2024-04-01T07:33:45.242Z
kave's Shortform 2024-03-05T04:35:13.510Z
If you weren't such an idiot... 2024-03-02T00:01:37.314Z
New LessWrong review winner UI ("The LeastWrong" section and full-art post pages) 2024-02-28T02:42:05.801Z
On plans for a functional society 2023-12-12T00:07:46.629Z
A bet on critical periods in neural networks 2023-11-06T23:21:17.279Z
Singular learning theory and bridging from ML to brain emulations 2023-11-01T21:31:54.789Z
The Good Life in the face of the apocalypse 2023-10-16T22:40:15.200Z
How to partition teams to move fast? Debating "low-dimensional cuts" 2023-10-13T21:43:53.067Z
Navigating an ecosystem that might or might not be bad for the world 2023-09-15T23:58:00.389Z
PSA: The Sequences don't need to be read in sequence 2022-05-23T02:53:41.957Z

Comments

Comment by kave on Yoav Ravid's Shortform · 2024-12-19T17:24:07.309Z · LW · GW

I do feel like it would be good to start with a more optimistic prior on new posts. Over the last year, the mean post karma was a little over 13, and the median was 5.

Comment by kave on Understanding Shapley Values with Venn Diagrams · 2024-12-19T00:38:35.879Z · LW · GW

This seems unlikely to satisfy linearity, as A/B + C/D is not equal to (A+C)/(B+D)
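To make that concrete with made-up numbers:

```latex
% A/B + C/D vs. (A+C)/(B+D), with A=1, B=2, C=1, D=3:
\frac{1}{2} + \frac{1}{3} = \frac{5}{6}
\qquad\neq\qquad
\frac{1+1}{2+3} = \frac{2}{5}
```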

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-17T22:06:48.395Z · LW · GW

I don't feel particularly uncertain. This EA Forum comment and its parents inform my view quite a bit.

Comment by kave on D&D.Sci Dungeonbuilding: the Dungeon Tournament · 2024-12-16T20:07:31.140Z · LW · GW

Maybe sometimes a team will die in the dungeon?

Comment by kave on [deleted post] 2024-12-15T16:25:27.926Z

<details>blah blah</details>

Comment by kave on D&D.Sci Dungeonbuilding: the Dungeon Tournament · 2024-12-14T23:04:21.392Z · LW · GW

So I did some super dumb modelling.

I was like: let's assume that there aren't interaction effects between the encounters either in the difficulty along a path or in the tendency to co-occur. And let's assume position doesn't matter. Let's also assume that the adventurers choose the minimally difficult path, only moving across room edges.

To estimate the value of an encounter, let's look at how the dungeons where it occurs in one of the two unavoidable locations (1 and 9) differ on average from the overall average.

Assuming ChatGPT did all the implementation correctly, this predictor never overestimates the score by much, though it frequently, and sometimes egregiously, underestimates it.

Anyway, using this model and this pathing assumption, we have DBN/OWH/NOC

We skip the goblins and put our fairly rubbish trap in the middle to stop adventurers picking and choosing which parts of the outside paths they take. The optimal path for the adventurers is DONOC, which has a predicted score of 30.29, which ChatGPT tells me is ~95th percentile.

I'd love to come at this with saner modelling (especially of adventurer behaviour), but I somewhat doubt I will.
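For anyone curious, here's a minimal sketch of the dumb model. It assumes a 3×3 grid entered at room 1 and exited at room 9, and the data layout and names are made up, not the actual dataset's:

```python
from collections import defaultdict

# Hypothetical data layout: each dungeon is {"rooms": tuple of 9 encounter
# codes, row by row on a 3x3 grid, "score": the observed score}.

def estimate_encounter_values(dungeons):
    """Value of an encounter = mean deviation from the overall average
    score among dungeons where it sits in an unavoidable room (1 or 9)."""
    overall = sum(d["score"] for d in dungeons) / len(dungeons)
    sums, counts = defaultdict(float), defaultdict(int)
    for d in dungeons:
        for pos in (0, 8):  # rooms 1 and 9, zero-indexed
            enc = d["rooms"][pos]
            sums[enc] += d["score"] - overall
            counts[enc] += 1
    return {e: sums[e] / counts[e] for e in sums}, overall

def predicted_score(rooms, values, overall):
    """Adventurers take the minimum-difficulty simple path from room 1 to
    room 9, moving only across room edges. The grid is tiny, so brute
    force over simple paths is fine (and copes with negative values)."""
    def neighbours(i):
        r, c = divmod(i, 3)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if 0 <= r + dr < 3 and 0 <= c + dc < 3:
                yield (r + dr) * 3 + (c + dc)

    best = float("inf")

    def walk(node, visited, cost):
        nonlocal best
        if node == 8:
            best = min(best, cost)
            return
        for nxt in neighbours(node):
            if nxt not in visited:
                walk(nxt, visited | {nxt},
                     cost + values.get(rooms[nxt], 0.0))

    walk(0, {0}, values.get(rooms[0], 0.0))
    return overall + best  # mean score plus the chosen path's deviation
```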

Comment by kave on D&D.Sci Dungeonbuilding: the Dungeon Tournament · 2024-12-14T21:14:32.110Z · LW · GW

I'm guessing encounter 4 (rather than encounter 6) follows encounter 3?

Comment by kave on How to Price a Futures Contract · 2024-12-14T19:58:30.753Z · LW · GW

You can simulate a future by short-selling the underlying security and buying a bond with the revenue. You can simulate short-selling the same future by borrowing money (selling a bond) and using the money to buy the underlying security.

I think these are backwards. At the end of your simulated future, you end up with one less of the stock, but you have k extra cash. At the end of your simulated short sell, you end up with one extra of the stock and k less cash.
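A toy check with made-up numbers:

```python
S0 = 100.0  # spot price of the underlying today (made up)
r = 0.05    # one-period riskless rate (made up)
ST = 110.0  # spot price at expiry; any value works
k = S0 * (1 + r)  # bond payoff of the short-sale proceeds

# The quoted construction for a "long future": short the stock, buy a bond.
# Terminal position: k in cash, minus the one share you owe.
quoted_long = k - ST     # = -5.0 here

# An actual long future struck at the no-arbitrage forward price k:
actual_long = ST - k     # = +5.0 here

# They're opposites: the quoted construction replicates the SHORT side.
assert quoted_long == -actual_long
```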

Comment by kave on Efficiency and resource use scaling parity · 2024-12-12T23:58:45.285Z · LW · GW

A neat stylised fact, if it's true. It would be cool to see people checking it in more domains.

I appreciate that Ege included all of: examples, theory, and predictions of the theory. I think there's lots of room for criticism of this model, which it would be cool to see tried. In particular, as far as I understand the formalism, it doesn't seem like it is obviously discussing the costs of the investments, as opposed to their returns.

But I still like this as a rule of thumb (open to revision).

Comment by kave on Deception Chess: Game #1 · 2024-12-12T19:35:28.295Z · LW · GW

I still think this post is cool. Ultimately, I don't think the evidence presented here bears that strongly on the underlying question: "can humans get AIs to do their alignment homework?". But I think it bears on it at all, and was conducted quickly and competently.

I would like to live in a world where lots of people gather lots of weak pieces of evidence on important questions.

Comment by kave on Open Thread Fall 2024 · 2024-12-08T19:13:55.669Z · LW · GW

Yep, if the first vote takes the score to ≤ 0, then the post will be dropped off the latest list. This is somewhat ameliorated by:

(a) a fair number of people browsing https://lesswrong.com/allPosts

(b) https://greaterwrong.com having chronological sort by default

(c) posts appearing in recent discussion in the order that they're posted (though I do wonder if we filter out negative karma posts from recent discussion)

I often play around with different karma / sorting mechanisms, and I do think it would be nice to have a more Bayesian approach that started with a stronger prior. My guess is the effect you're talking about isn't a big issue in practice, though probably worth a bit of my time to sample some negative karma posts.
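A minimal sketch of the kind of stronger-prior scoring I mean (all numbers illustrative: the prior mean is last year's average post karma, and the pseudo-vote weight is made up):

```python
def sort_score(karma: float, n_votes: int,
               prior_mean: float = 13.0, prior_weight: float = 5.0) -> float:
    # Blend observed karma with a prior mean, weighting the prior like
    # `prior_weight` extra votes, so one early downvote can't drag a
    # brand-new post's ranking below zero.
    return (prior_mean * prior_weight + karma) / (prior_weight + n_votes)

# e.g. a new post with a single -1 vote: (13*5 - 1) / 6 ≈ 10.7, still positive
```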

Comment by kave on Open Thread Fall 2024 · 2024-12-08T19:07:08.365Z · LW · GW

I had a quick look in the database, and you do have some tag filters set, which could cause the behaviour you describe

Comment by kave on Algebraic Linguistics · 2024-12-08T00:12:31.168Z · LW · GW
  • Because it's a number and a vector, you're unlikely to see anyone (other than programmers) trying to use i as a variable.

I think it's quite common to use i as an index variable (for example, in a sum)

(edit: whoops, I see several people have mentioned this) 

Comment by kave on johnswentworth's Shortform · 2024-12-06T19:52:15.644Z · LW · GW

In this case sitting down with someone doing similar tasks but getting more use out of LMs would likely help.

I would contribute to a bounty for y'all to do this. I would like to know whether the slow progress is prompting-induced or not.

Comment by kave on Open Thread Fall 2024 · 2024-12-06T18:29:50.944Z · LW · GW

Click on the gear icon next to the feed selector 

Comment by kave on Open Thread Fall 2024 · 2024-12-06T17:59:20.340Z · LW · GW

A quick question re: your list: do you have any tag filters set?

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-06T16:34:59.139Z · LW · GW

I think "unacceptable reputational costs" here basically means the conjunction of "Dustin doesn't like the work" and "incurs reputational costs for Dustin". Because of the first conjunct, I don't think this suggestion would help Lightcone, sadly.

Comment by kave on Open Thread Fall 2024 · 2024-12-06T16:05:57.210Z · LW · GW

The "latest" tab works via the hacker news algorithm. Ruby has a footnote about it here. I think we set the "starting age" to 2 hours, and the power for the decay rate to 1.15.

Comment by kave on Headphones hook · 2024-12-05T17:29:25.229Z · LW · GW

mod note: this post used to say "LessWrong doesn't seem to support the <details> element, otherwise I would put this code block in it".

We do now support it, so I've edited the post to put the code block in such an element

Comment by kave on Overcoming Bias Anthology · 2024-12-04T23:03:31.860Z · LW · GW

Robin Hanson is one of the intellectual fathers of LessWrong, and I'm very glad there's a curated, organised list of some of his main themes.

He's the first thinker I remember reading and thinking "what? that's completely wrong", who went on to have a big influence on my thought. Apparently I'm not an isolated case (paragraph 3 page 94).

Thanks, Arjun and Richard.

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-03T02:37:33.571Z · LW · GW

37bvhXnjRz4hipURrq2EMAXN2w6xproa9T

I've updated the post with it.

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-02T18:53:39.590Z · LW · GW

FTX did successfully retrieve the $1M from the title company! We didn't have any control over those funds, so I don't think we were involved apart from pointing FTX in the right direction.

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-02T15:58:16.051Z · LW · GW

Habryka means we would have to pick one number per Stripe link (e.g. one link for $5/month, one for $100/month, etc.)

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-01T22:51:25.448Z · LW · GW

Are you checking the box for “Save my info for 1-click checkout with Link”? That’s the only way I’ve figured out to get Stripe to ask for my phone number. If so, you can safely uncheck that

(Also, I don’t know if it’s important to you, but I don’t think we would see your phone number if you gave it to Stripe)

Comment by kave on Which Biases are most important to Overcome? · 2024-12-01T20:39:46.361Z · LW · GW

What do you mean by A?

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-01T20:13:31.043Z · LW · GW

Habryka is slightly sloppily referring to using Janus' 'base model jailbreak' for Claude 3.5 Sonnet

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-01T18:42:59.464Z · LW · GW

as I understand it, the majority of this money will go towards supporting Lighthaven

I think if you take Habryka's numbers at face value, a hair under half of the money this year will go to Lighthaven (35% of core staff salaries @ $1.4M = $0.49M, plus $1M for a deferred interest payment, and then the claim that otherwise Lighthaven is breaking even). And in future years, well less than half.

I worry that the future of LW will be endangered by the financial burden of Lighthaven

I think this is a reasonable worry, but I again want to note that Habryka is projecting a neutral or positive cashflow from Lighthaven to the org.

That said, I can think of a couple of reasons for financial pessimism[1]. One is that having Lighthaven is riskier: it involves a bunch of hard-to-avoid costs, so if Lighthaven has a bad year, that does indeed endanger the project as a whole.

Another reason to be worried: Lightcone might stop trying to make Lighthaven break even. Lightcone is currently fairly focused on using Lighthaven in revenue-producing ways. My guess is that we'll always try and structure stuff at Lighthaven such that it pays its own way (for example, when we ran LessOnline we sold tickets[2]). But maybe not! Maybe Lightcone will pivot Lighthaven to a loss-making plan, because it foresees greater altruistic benefit (and expects to be able to fundraise to cover it).

So the bundling of the two projects still leaks some risk.

Of course, you might also think Lighthaven makes LessWrong more financially robust, if on the mainline it ends up producing a modest profit that can be used to subsidise LessWrong.

  1. ^

    Other than just doubting Habryka's projections, which also might make sense.

  2. ^

    My understanding of the numbers is that we lost money once you take into account staff time, but broke even if you don't. And it seems the people most involved with running it are hopeful about cutting a bunch of costs in future.

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-01T16:55:34.961Z · LW · GW

I worry that cos this hasn't received a reply in a bit, people might think it's not in the spirit of the post. I'm even more worried people might think that critical comments aren't in the spirit of the post.

Both critical comments and high-effort-demanding questions are in the spirit of the post, IMO! But the latter might take a while to get a response

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-01T01:54:41.291Z · LW · GW

The EIN is 92-0861538

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-11-30T22:29:36.303Z · LW · GW

My impression matches your initial one, to be clear. My point estimate of the median is something like 85%, but my confidence only extends to >50%

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-11-30T22:23:00.231Z · LW · GW

Lightcone is also heterogeneous, but I think it's accurate that the median view at Lightcone is >50% on misaligned takeover

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-11-30T19:25:54.054Z · LW · GW

Maybe remove decimal numbers entirely throughout the graphs? This is what it looked like for me, and led to the error. And this image is way zoomed in compared to what I see naturally on my screen.

Good idea. Done.

Comment by kave on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-11-30T17:00:26.392Z · LW · GW

Yes, I think you're right. I was confused by Shoshannah's numbers last night, but it was late and I didn't manage to summon enough sapience to realise something was wrong and offer a correction. Thanks for doing that!

Comment by kave on You are not too "irrational" to know your preferences. · 2024-11-27T22:32:07.517Z · LW · GW

as applied to them

A slight nitpick: I think this treats their liking of the activity as applied to them as a scalar, but I think it's also plausibly a function of how you, the applier, go about it. Like maybe they are very into this activity as applied to them, but not in the way you do it.

Comment by kave on Epistemic status: poetry (and other poems) · 2024-11-22T03:13:38.911Z · LW · GW

If I had to pick a favourite, I'd probably go for Fire and AIs, but The GPT is also great: very terrifying sublime 

Comment by kave on lemonhope's Shortform · 2024-11-22T01:48:52.403Z · LW · GW

Perhaps say some words on why they might want to?

Comment by kave on "It's a 10% chance which I did 10 times, so it should be 100%" · 2024-11-20T18:56:23.838Z · LW · GW

Curated! A short and sweet post, with a useful rule of thumb for those who live and die by the Fermi. There are some more good ones in the comments, too!
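For reference, the arithmetic the title is gesturing at:

```latex
1 - (1 - 0.1)^{10} = 1 - 0.9^{10} \approx 0.65
```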

Comment by kave on Why imperfect adversarial robustness doesn't doom AI control · 2024-11-19T03:15:39.602Z · LW · GW

I don't think this distinction is robust enough to rely on as much of a defensive property. I think it's probably not that hard to think "I probably would have tried something in direction X, or direction Y", and then gather lots of bits about how well the clusters X and Y work.

Comment by kave on Why imperfect adversarial robustness doesn't doom AI control · 2024-11-18T16:46:35.472Z · LW · GW

I'm curious about Claude's coauthorship of this post. How did it help?

Comment by kave on Why imperfect adversarial robustness doesn't doom AI control · 2024-11-18T16:46:20.686Z · LW · GW

The crucial point is: when the untrusted model gets to step 6, it sees R in the history instead of A, so it knows its attempt was replaced.

Why does it know its attempt was replaced? Because it can guess it wouldn't have produced R? If it can tell that, how come it can't guess it specifically would have produced A?

Comment by kave on Making a conservative case for alignment · 2024-11-16T01:59:26.257Z · LW · GW

I thought this title meant the post would be making a case from conservative (i.e. minimal) assumptions.

Maybe change the title to "making a politically conservative case for alignment" or something?

Comment by kave on Eli's shortform feed · 2024-11-11T04:54:26.555Z · LW · GW

I wonder what the lifetime spend on dating apps is. I expect that for most people who ever pay it's >$100

Comment by kave on Eli's shortform feed · 2024-11-09T19:20:45.853Z · LW · GW

I think the credit assignment is legit hard, rather than just being a case of bad norms. Do you disagree?

Comment by kave on Eli's shortform feed · 2024-11-08T23:23:27.066Z · LW · GW

I would guess they tried it because they hoped it would be competitive with their other product, and sunset it because that didn't happen with the amount of energy they wanted to allocate to the bet. There may also have been an element of updating more about how much focus their core product needed.

I only skimmed the retrospective now, but it seems mostly to be detailing problems that stymied their ability to find traction.

Comment by kave on Eli's shortform feed · 2024-11-08T22:48:29.398Z · LW · GW

It's possible no one tried literally "recreate OkC", but I think dating startups are very oversubscribed by founders, relative to interest from VCs [1] [2] [3] (and I think VCs are mostly correct that they won't make money [4] [5]).

(Edit: I want to note that those are things I found after a bit of googling to see if my sense of the consensus was borne out; they are meant in the spirit of "several samples of weak evidence")

I don't particularly believe you that OkC solves dating for a significant fraction of people. IIRC, a previous time we talked about this, @romeostevensit suggested you had not sufficiently internalised the OkCupid blog findings about how much people prioritised physical attraction.

You mention manifold.love, but also mention it's in maintenance mode – I think because the type of business you want people to build does not in fact work.

I think it's fine to lament our lack of good mechanisms for public good provision, and claim our society is failing at that. But I think you're trying to draw an update that's something like "tech startups should be doing an unbiased search through viable, valuable businesses, but they're clearly not", or maybe, "tech startups are supposed to be able to solve a large fraction of our problems, but if they can't solve this, then that's not true", and I don't think either of these conclusions seems that licensed from the dating data point.

Comment by kave on Are Your Enemies Innately Evil? · 2024-11-06T16:25:23.878Z · LW · GW

Yes, though I'm not confident.

Comment by kave on Are Your Enemies Innately Evil? · 2024-11-06T07:30:44.637Z · LW · GW

I saw this poll and thought to myself "gosh, politics, religion and cultural opinions sure are areas where I actively try to be non-heroic, as they aren't where I wish to spend my energy".

Comment by kave on Habryka's Shortform Feed · 2024-11-01T21:35:50.115Z · LW · GW

They load it in as a web font (i.e. you load Calibri from their server when you load that search page). We don't do that on LessWrong

Comment by kave on Habryka's Shortform Feed · 2024-11-01T20:59:24.773Z · LW · GW

Yeah, that's a Google Easter egg. You can also try "Comic Sans" or "Trebuchet MS".

Comment by kave on Habryka's Shortform Feed · 2024-10-31T01:11:07.069Z · LW · GW

One sad thing about older versions of Gill Sans: capital I, lowercase l, and the digit 1 all look the same. Nova at least distinguishes the 1.

IMO, we should probably move towards system fonts, though I would like to choose something that preserves character a little more.