Posts

Announcing the Q1 2025 Long-Term Future Fund grant round 2024-12-20T02:20:22.448Z
Sorry for the downtime, looks like we got DDosd 2024-12-02T04:14:30.209Z
(The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser 2024-11-30T02:55:16.077Z
OpenAI Email Archives (from Musk v. Altman and OpenAI blog) 2024-11-16T06:38:03.937Z
Using Dangerous AI, But Safely? 2024-11-16T04:29:20.914Z
Open Thread Fall 2024 2024-10-05T22:28:50.398Z
If-Then Commitments for AI Risk Reduction [by Holden Karnofsky] 2024-09-13T19:38:53.194Z
Open Thread Summer 2024 2024-06-11T20:57:18.805Z
"AI Safety for Fleshy Humans" an AI Safety explainer by Nicky Case 2024-05-03T18:10:12.478Z
Goal oriented cognition in "a single forward pass" 2024-04-22T05:03:18.649Z
Express interest in an "FHI of the West" 2024-04-18T03:32:58.592Z
Structured Transparency: a framework for addressing use/mis-use trade-offs when sharing information 2024-04-11T18:35:44.824Z
LessWrong's (first) album: I Have Been A Good Bing 2024-04-01T07:33:45.242Z
How useful is "AI Control" as a framing on AI X-Risk? 2024-03-14T18:06:30.459Z
Open Thread Spring 2024 2024-03-11T19:17:23.833Z
Is a random box of gas predictable after 20 seconds? 2024-01-24T23:00:53.184Z
Will quantum randomness affect the 2028 election? 2024-01-24T22:54:30.800Z
Vote in the LessWrong review! (LW 2022 Review voting phase) 2024-01-17T07:22:17.921Z
AI Impacts 2023 Expert Survey on Progress in AI 2024-01-05T19:42:17.226Z
Originality vs. Correctness 2023-12-06T18:51:49.531Z
The LessWrong 2022 Review 2023-12-05T04:00:00.000Z
Open Thread – Winter 2023/2024 2023-12-04T22:59:49.957Z
Complex systems research as a field (and its relevance to AI Alignment) 2023-12-01T22:10:25.801Z
How useful is mechanistic interpretability? 2023-12-01T02:54:53.488Z
My techno-optimism [By Vitalik Buterin] 2023-11-27T23:53:35.859Z
"Epistemic range of motion" and LessWrong moderation 2023-11-27T21:58:40.834Z
Debate helps supervise human experts [Paper] 2023-11-17T05:25:17.030Z
How much to update on recent AI governance moves? 2023-11-16T23:46:01.601Z
AI Timelines 2023-11-10T05:28:24.841Z
How to (hopefully ethically) make money off of AGI 2023-11-06T23:35:16.476Z
Integrity in AI Governance and Advocacy 2023-11-03T19:52:33.180Z
What's up with "Responsible Scaling Policies"? 2023-10-29T04:17:07.839Z
Trying to understand John Wentworth's research agenda 2023-10-20T00:05:40.929Z
Trying to deconfuse some core AI x-risk problems 2023-10-17T18:36:56.189Z
How should TurnTrout handle his DeepMind equity situation? 2023-10-16T18:25:38.895Z
The Lighthaven Campus is open for bookings 2023-09-30T01:08:12.664Z
Navigating an ecosystem that might or might not be bad for the world 2023-09-15T23:58:00.389Z
Long-Term Future Fund Ask Us Anything (September 2023) 2023-08-31T00:28:13.953Z
Open Thread - August 2023 2023-08-09T03:52:55.729Z
Long-Term Future Fund: April 2023 grant recommendations 2023-08-02T07:54:49.083Z
Final Lightspeed Grants coworking/office hours before the application deadline 2023-07-05T06:03:37.649Z
Correctly Calibrated Trust 2023-06-24T19:48:05.702Z
My tentative best guess on how EAs and Rationalists sometimes turn crazy 2023-06-21T04:11:28.518Z
Lightcone Infrastructure/LessWrong is looking for funding 2023-06-14T04:45:53.425Z
Launching Lightspeed Grants (Apply by July 6th) 2023-06-07T02:53:29.227Z
Yoshua Bengio argues for tool-AI and to ban "executive-AI" 2023-05-09T00:13:08.719Z
Open & Welcome Thread – April 2023 2023-04-10T06:36:03.545Z
Shutting Down the Lightcone Offices 2023-03-14T22:47:51.539Z
Review AI Alignment posts to help figure out how to make a proper AI Alignment review 2023-01-10T00:19:23.503Z
Kurzgesagt – The Last Human (Youtube) 2022-06-29T03:28:44.213Z

Comments

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-21T08:54:09.593Z · LW · GW

FWIW, de-facto I have never looked at DMs or DM metadata, unless multiple people reached out to us about a person spamming or harassing them, and then we still only looked at the DMs that that person sent. 

So I think your prior here wasn't crazy. It is indeed the case that we've never acted against it, as far as I know.

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-20T20:44:15.618Z · LW · GW

An obvious thing to have would be a very easy "flag" button that a user can press if they receive a DM, and if they press that we can look at the DM content they flagged, and then take appropriate action. That's still kind of late in the game (I would like to avoid most spam and harassment before it reaches the user), but it does seem like something we should have.

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-20T20:21:26.323Z · LW · GW

Is it OK for LW admins to look at DM metadata for spam prevention reasons? 

Sometimes new users show up and spam a bunch of other users in DMs (in particular high-profile users). We can't limit DM usage to only users with activity on the site, because many valuable DMs get sent by people who don't want to post publicly. We have some basic rate limits for DMs, but of course those can't capture many forms of harassment or spam. 

Right now, admins can only see how many DMs users have sent, not whom they have messaged, unless we make a manual database query, which we have a policy of not doing unless we have a high level of suspicion of malicious behavior. However, I feel like also being able to see whom users have sent DMs to would be quite useful for identifying who is doing spammy things, though of course this might feel bad to people from a privacy perspective.

So I am curious about what others think. Should admins be able to look at DM metadata to help us identify who is abusing the DM system? Or should we stick to aggregate statistics like we do right now? (React or vote "agree" if you think we should use DM metadata, and react or vote "disagree" if you think we should not use DM metadata).

Comment by habryka (habryka4) on Yoav Ravid's Shortform · 2024-12-19T19:02:47.118Z · LW · GW

I mean, the difference between 7 and 5 karma on frontpage ranking is minuscule, so I don't think that made any difference. The real question is "why did nobody upvote it?" Like, I think there physically isn't enough space on the frontpage to give 5-karma posts visibility for very long without filling most of the frontpage with new unvetted content.

Comment by habryka (habryka4) on Yoav Ravid's Shortform · 2024-12-19T18:08:11.987Z · LW · GW

I agree this is true for content by new users, but honestly, we kind of need to hide most users' content from the frontpage until someone decides to upvote it.

For more active users, their strong-vote strength gets applied by default to the post, which helps a good amount with early downvotes not hurting visibility that much.

Comment by habryka (habryka4) on Yoav Ravid's Shortform · 2024-12-19T18:06:40.320Z · LW · GW

I think it's a hard tradeoff. I do think lots of people take psychological hits, but it is also genuinely important that people who are not a good fit for the site learn quickly and get the hint that they either have to shape up or get out. Otherwise we are at risk of quickly deteriorating in discussion quality. I do think this still makes it valuable to reduce variance, but I think we've already largely done that with the strong-vote and vote-weighting system.

Upvotes by senior users matter a lot more, and any senior user can dig you out of multiple junior users downvoting you, which helps.

Comment by habryka (habryka4) on Yoav Ravid's Shortform · 2024-12-19T15:56:56.516Z · LW · GW

I don't know either! Early voting is often quite noisy, and this thing is a bit politics-adjacent. I expect it won't end up downvoted too long. I've considered hiding vote-scores for the first few hours, but we do ultimately still have to use something for visibility calculations, and I don't like withholding information from users.

Comment by habryka (habryka4) on Yoav Ravid's Shortform · 2024-12-19T15:55:11.503Z · LW · GW

I think this would train the wrong habits in LessWrong users, and also skew the incentive landscape that is already tilted somewhat too much in the direction of "you get karma if you post content" away from "you get karma if your content on average makes the site better".

Comment by habryka (habryka4) on Open Thread Fall 2024 · 2024-12-19T04:41:12.634Z · LW · GW

Huh, what browser and OS?

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-14T06:30:35.836Z · LW · GW

You are correct. Seems like I got confused. Obvious in retrospect. Thank you for catching the error!

Comment by habryka (habryka4) on OpenAI Email Archives (from Musk v. Altman and OpenAI blog) · 2024-12-14T05:06:02.973Z · LW · GW

My bet would be on the Musk lawsuit document being correct. The OpenAI emails seemed edited in a few different ways (and also had some careless redaction failures).

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-14T01:37:41.762Z · LW · GW

I have updated the OpenAI Email Archives to now also include all emails that OpenAI has published in their March 2024 and December 2024 blogposts!

I continue to think reading through these is quite valuable, and even more interesting with the March 2024 and December 2024 emails included.

Comment by habryka (habryka4) on OpenAI Email Archives (from Musk v. Altman and OpenAI blog) · 2024-12-14T01:30:32.137Z · LW · GW

Update: This is now done!

Comment by habryka (habryka4) on OpenAI Email Archives (from Musk v. Altman and OpenAI blog) · 2024-12-13T23:10:01.881Z · LW · GW

Yep! I am working on updating this post with the new emails (as well as the emails from the March OpenAI blogpost that also had a bunch of emails not in this post).

Comment by habryka (habryka4) on Haotian's Shortform · 2024-12-13T02:44:02.445Z · LW · GW

I think beyond insightfulness, there is also a "groundedness" component that is different. LLM-written text either lies about personal experience or is entirely devoid of references to personal experience. That usually makes the writing much less concrete and worse, or actively deceptive.

Comment by habryka (habryka4) on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-13T00:09:56.950Z · LW · GW

Thank you! 

After many other people have said similar things I am now pretty sure we will either keep the fundraiser open until like the second or third week of January, or add some way of pledging funds to be donated in the next year.

Comment by habryka (habryka4) on Haotian's Shortform · 2024-12-12T22:40:57.984Z · LW · GW

For the record, I wrote it late at night and ran the response through an LLM to improve readability for her benefit. 

This is IMO generally considered bad form on LW. Please clearly mark when an LLM was involved in writing a comment, unless the final content genuinely reflects your own voice and writing style and you hold it to the same standard as your own writing. Like, it's OK to iterate on a paragraph or two with an LLM without marking that super prominently, but if a whole comment is clearly a straightforward copy-paste from an LLM, that should get you downvoted (and banned if you do it repeatedly).

Comment by habryka (habryka4) on How to (hopefully ethically) make money off of AGI · 2024-12-11T08:23:51.586Z · LW · GW

Going through the post, I figured I would backtest the mentioned strategies to see how well they performed.

Starting with NoahK's suggested big stock tickers: "TSM, MSFT, GOOG, AMZN, ASML, NVDA"

If you naively bought these stocks weighted by market cap, you would have made a 60% annual return.

You would have also very strongly outperformed the S&P 500. That is quite good. 
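To be concrete about what "weighted by market cap" means here, a minimal sketch of the arithmetic (hypothetical Python with placeholder numbers, not the actual 2024 data or the backtesting tool I used):

```python
# Naive market-cap-weighted backtest; all figures are placeholders for illustration.
market_caps = {"TSM": 900, "MSFT": 3100, "GOOG": 2200,
               "AMZN": 1900, "ASML": 350, "NVDA": 3300}      # hypothetical $B
yearly_returns = {"TSM": 0.80, "MSFT": 0.15, "GOOG": 0.35,
                  "AMZN": 0.45, "ASML": -0.05, "NVDA": 1.70}  # hypothetical 1-year returns

total_cap = sum(market_caps.values())
weights = {ticker: cap / total_cap for ticker, cap in market_caps.items()}

# The portfolio return is the cap-weighted average of the individual returns.
portfolio_return = sum(weights[t] * yearly_returns[t] for t in market_caps)
print(f"cap-weighted portfolio return: {portfolio_return:.1%}")
```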

Let's look at one of the proposed AI index funds that was mentioned: 

iShares has one under ticker IRBO. Let's see what it holds... Looks like very low concentration (all <2%) but the top names are... Faraday, Meitu, Alchip, Splunk, Microstrategy. (???)

That is... OK. Honestly, also looking at the composition of this index fund, I am not very impressed. Making only a 12% return on AI stocks in 2024 does feel like failing at actually indexing the AI market. My guess is that even at the time, someone investing on the basis of this post would have chosen something more like IYW, which is up 34% YTD.

Overall, the investment advice in this post backtests well. 

Comment by habryka (habryka4) on Investing for a World Transformed by AI · 2024-12-11T08:11:20.292Z · LW · GW

I plugged the stocks mentioned in here into Double's backtesting tool. I couldn't get 6 of the stocks (Samsung, one of the solar ones, 4 other random ones). At least in 2024, the listed companies weighted by market cap produced a return of about 36%, roughly on par with the S&P 500 (which clearly had an amazing year).

Comment by habryka (habryka4) on How to (hopefully ethically) make money off of AGI · 2024-12-11T07:58:19.962Z · LW · GW

I couldn't get Samsung into the backtest, but the portfolio went up roughly 20% over the first three months, drastically outperforming the S&P 500. 

Since then it went down relative to the S&P 500 and is now roughly on par with it, but man, that one sure is up at roughly +35%. Overall a pretty good portfolio, though it's surprising to me that over the last year it still didn't really outperform the S&P 500.

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-10T19:53:37.946Z · LW · GW

Thank you! I appreciate the quick oops here, and agree it was a mistake (but fixing it as quickly as you did I think basically made up for all the costs, and I greatly appreciate it).

Just to clarify, I don't want to make a strong statement that it's worth updating the data and maintaining the dashboard. By my lights it would be good enough to just have a static snapshot of it forever. The thing that seemed so costly to me was breaking old links and getting rid of data that you did think was correct. 

Thanks again!

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-10T17:45:56.781Z · LW · GW

(Did Ben indicate he didn’t consider it? My guess is he considered it, but thinks it’s not that likely and doesn’t have amazingly interesting things to say on it.

I think having a norm of explicitly saying "I considered whether you were telling the truth, but I don't believe it" seems OK, but not obviously great. In this case Ben also responded to a comment of mine which already said this, so I really don't see a reason for repeating it.)

Comment by habryka (habryka4) on Hazard's Shortform Feed · 2024-12-09T20:07:06.469Z · LW · GW

I've talked to Michael Vassar many times in person. I'm somewhat confident he has taken LSD based on him saying so (although if this turned out wrong I wouldn't be too surprised, my memory is hazy)

I would take bets at 9:1 odds that Michael has taken large amounts of psychedelics. I would also take bets at similar odds that he promotes the use of psychedelics.

Comment by habryka (habryka4) on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-09T19:53:12.576Z · LW · GW

We have a giant fundraiser bar at the top of the frontpage, and a link to this post in the navbar. I feel like that's plenty spam already :P

Comment by habryka (habryka4) on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-09T19:19:31.220Z · LW · GW

Good question. We don't pay down the principal; the mortgage is structured as a balloon payment due in ~17 years (at which point the default thing to do would be to refinance or sell the property).

Comment by habryka (habryka4) on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-09T17:20:52.686Z · LW · GW

No metrics, but it has come up a lot in various Intercom chats and Open Thread comments where people introduce themselves. I am really very unconfident in this number, but it does seem nontrivial.

Comment by habryka (habryka4) on Sapphire Shorts · 2024-12-09T07:23:41.825Z · LW · GW

Appreciate you sharing this. FWIW, I have heard of lots of really quite reckless and damaging-seeming drug-consuming behavior around Olivia over the years, and am sad to hear it's still going on.

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-09T03:42:42.683Z · LW · GW

I would currently be quite surprised if you had taken the same action if I was instead making an inference that positively reflects on CEA or EA. I might of course be wrong, but you did do it right after I wrote something critical of EA and CEA, and did not do it the many other times it was linked in the past year. Sadly your institution has a long history of being pretty shady with data and public comms this way, and so my priors are not very positively inclined.

I continue to think that it would make sense to at least leave up the data that CEA did feel comfortable linking in the last 1.5 years. By my norms, invalidating links like this, especially if the underlying page happens to be unscrapeable by the Internet Archive, is really very bad form.

I did really appreciate your mid 2023 post!

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-09T01:27:27.221Z · LW · GW

Oh, huh, that seems very sad. Why would you do that? Please leave up the data that we have. I think it's generally bad form to break links that people relied on. The data was accurate as far as I can tell until August 2024, and you linked to it yourself a bunch over the years; don't just break all of those links.

I am pretty up-to-date with other EA metrics and I don't really see how this would be misleading. You had a disclaimer at the top that I think gave all the relevant context. Let people make their own inferences, or add more context, but please don't just take things down.

Unfortunately, archive.org doesn't seem to have worked for that URL, so we can't even rely on that to show the relevant data trends.

Edit: I'll be honest, after thinking about it for longer, the only reason I can think of for why you would take down the data is that it makes CEA and EA look less like they are on an upward trajectory. But this seems so crazy. How can I trust data coming out of CEA if you have a policy of retracting data that doesn't align with the story you want to tell about CEA and EA? The whole point of sharing raw data is to allow other people to come to their own conclusions. This really seems like such a dumb move from a trust perspective.

Comment by habryka (habryka4) on hmys's Shortform · 2024-12-09T01:05:39.426Z · LW · GW

If we do a merch store, I would definitely want things to be high quality. Slightly ironically I can't guarantee as much for the donation tiers (since we already promised the t-shirts at least and we might not find a good manufacturer), but I will definitely still try to make it good. I can't really guarantee it in advance, based on my experiences with the stuff.

Comment by habryka (habryka4) on Hazard's Shortform Feed · 2024-12-08T23:04:54.597Z · LW · GW

Makes sense. My agree-react and my sense of Niplav's comments were specifically about Michael's writing/podcasts.

Comment by habryka (habryka4) on Hazard's Shortform Feed · 2024-12-08T22:08:25.191Z · LW · GW

(I think Jessica and Ben have both been great writers and I have learned a lot from both of them. I have also learned a bunch of things from Michael, but definitely not via his writing or podcasts or anything that wasn't in-person, or second-hand in-person. If you did learn something from the Michael podcasts or occasional piece of writing he has done, like the ones linked above, that would be a surprise to me)

Comment by habryka (habryka4) on hmys's Shortform · 2024-12-08T22:00:59.451Z · LW · GW

Yeah, I've been thinking about doing this. We do have a reward tier where you get a limited edition t-shirt or hoodie for $1000, but like, we haven't actually designed that one yet, and so there isn't that much appeal. 

Comment by habryka (habryka4) on GPTs are Predictors, not Imitators · 2024-12-08T21:06:28.971Z · LW · GW

I don't understand the problem with this sentence. Yes, the task is harder than the task of being a human (i.e., being only as good at it as a human is). Many objectives that humans optimize for are also not optimized to 100%, and as such, humans face many tasks that they would like to get better at, tasks which are therefore harder than the task of simply being a human. Indeed, if you optimized an AI system on those, you would also get no guarantee that the system would end up only as competent as a human.

This is a fact about practically all tasks (including things like calculating the nth-digit of pi, or playing chess), but it is indeed a fact that lots of people get wrong.

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-08T19:37:36.828Z · LW · GW

Yeah, makes sense. I don't think I am providing a full paper trail of evidence one can easily travel along, but I would take bets you would come to agree with it if you did spend the effort to look into it.

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-08T19:22:58.182Z · LW · GW

(Someone is welcome to link-post it, but indeed I am somewhat hoping to avoid posting over there as much, as I find it reliably stressful in mostly unproductive ways)

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-08T19:03:41.821Z · LW · GW

I do think the EA example is quite good on an illustrative level. It really strikes me as a rare case where we have an enormous pile of public empirical evidence (which is linked in the post) and it also seems by now really quite clear from a common-sense perspective. 

I don't think it makes sense to call this point "contentious". I think it's about as clear as these cases go. At least off the top of my head, I can't think of an example that would have been clearer (maybe if you had some social movement that more fully collapsed and where you could do a retrospective root cause analysis, but it's extremely rare to have as clear a natural experiment as the FTX one). I do think it's political in our local social environment, and so is harder to talk about, so I agree that on that dimension a different example would be better.

I do think it would be good/nice to add an additional datapoint, but I also think this would risk being misleading. The point about reputation being lazily evaluated is mostly true from common-sense observations and logical reasoning, and the EA point is mostly trying to provide evidence for "yes, this is a real mistake that real people make". Even if you dispute EA's reputation having gotten worse, I think the quotes from people above are still invalid and would mislead people (and I had this model before we observed the empirical evidence, and am writing it up because people told me they found it helpful for thinking through the FTX stuff as it was happening).

If I had a lot more time, I think the best thing to do would be to draw on some literature on polling errors or marketing, since the voting situation seems quite analogous. This might even get us some estimates of how strong the correlation between unevaluated and evaluated attitudes is, and how much they diverge for different levels of investment, if any measurable one exists, and that would be cool.

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-08T08:49:12.458Z · LW · GW

Practically all growth metrics are down (and have indeed turned negative on most measures), a substantial fraction of core contributors are distancing themselves from the EA affiliation, surveys among EA community builders report EA-affiliation as a major recurring obstacle[1], and many of the leaders who previously thought it wasn't a big deal now concede that it was/is a huge deal.

Also, informally, recruiting for things like EA Fund managers, or getting funding for EA Funds has become substantially harder. EA leadership positions appear to be filled by less competent people, and in most conversations I have with various people who have been around for a while, people seem to both express much less personal excitement or interest in identifying or championing anything EA-related, and report the same for most other people.

Related to the concepts in my essay, when measured, the reputational differentials also seem to reliably point towards people updating negatively on EA as they learn more about it (which shows up in the quotes you mentioned, and more recently in the latest Pulse survey, though I mostly consider that survey uninformative for roughly the reasons outlined in this post).

 

  1. ^ As reported to me by someone I trust working in the space recently. I don't have a link at hand.

Comment by habryka (habryka4) on Sapphire Shorts · 2024-12-07T23:52:15.476Z · LW · GW

Is this someone who has a parasocial relationship with Vassar, or a more direct relationship? I was under the impression that the idea that Michael Vassar supports this sort of thing was a malicious lie spread by rationalist leaders in order to purge the Vassarites from the community.

I think "psychosis is underrated" and/or "psychosis is often the sign of a good kind of cognitive processing" are things I have heard from at least people very close to Michael (I think @jessicata made some arguments in this direction): 

"Psychosis" doesn't have to be a bad thing, even if it usually is in our society; it can be an exploration of perceptions and possibilities not before imagined, in a supportive environment that helps the subject to navigate reality in a new way; some of R.D. Liang's work is relevant here, describing psychotic mental states as a result of ontological insecurity following from an internal division of the self at a previous time.

(To be clear, I don't think "jessicata is in favor of psychosis" is at all a reasonable gloss here, but I do think there is an attitude towards things like psychosis that I disagree with that is common in the relevant circles)

Comment by habryka (habryka4) on Common misconceptions about OpenAI · 2024-12-07T23:31:43.228Z · LW · GW

I explained it a bit here: https://www.lesswrong.com/posts/fjfWrKhEawwBGCTGs/a-simple-case-for-extreme-inner-misalignment?commentId=tXPrvXihTwp2hKYME 

Yeah, the principled reason (though I am not like super confident of this) is that posts are almost always too big and contain too many claims for a single agree/disagree vote to make sense. Inline reacts are the intended way for people to express agreement and disagreement on posts.

I am not super sure this is right, but I do want to avoid agreement/disagreement becoming disconnected from truth values, and I think applying them to elements that clearly don't have a single truth value weakens that connection.

Comment by habryka (habryka4) on Litigate-for-Impact: Preparing Legal Action against an AGI Frontier Lab Leader · 2024-12-07T23:06:58.494Z · LW · GW

Makes sense. My experience has been that in-person conversations are helpful for getting on the same page, but they also often come with confidentiality requests that then make it very hard for information to propagate back out into the broader social fabric, and that often makes those conversations more costly than beneficial. But I do think it's a good starting point if you don't do the very costly confidentiality stuff.

Fwiw, the contents of this original post actually have nothing to do with EA itself, or the past articles that mentioned me.

Yep, that makes sense. I wasn't trying to imply that it was (but still seems good to clarify).

Comment by habryka (habryka4) on Litigate-for-Impact: Preparing Legal Action against an AGI Frontier Lab Leader · 2024-12-07T22:53:39.079Z · LW · GW

Sure, happy to chat sometime. 

I haven't looked into the things I mentioned in a ton of detail (though I have spent a few hours on it), but I have learned to err on the side of sharing my takes here (where even if they are wrong, it seems better to have them out in the open so that people can correct them and track what I believe, even if they think it's dumb/wrong).

Comment by habryka (habryka4) on Understanding Shapley Values with Venn Diagrams · 2024-12-07T22:51:59.645Z · LW · GW

Do you know whether the person who wrote this would be OK with crossposting the complete content of the article to LW? I would be interested in curating it and sending it out in our 30,000 subscriber curation newsletter, if they were up for it.

Comment by habryka (habryka4) on Common misconceptions about OpenAI · 2024-12-07T22:51:01.128Z · LW · GW

I think people were happy to have the conversation happen. I did strong-downvote it, but I don't think upvotes are the correct measure here. If we had something like agree/disagree-votes on posts, that would have been the right measure, and my guess is it would have overall been skewed pretty strongly in the disagree-vote direction.

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-12-07T22:45:53.601Z · LW · GW

Reputation is lazily evaluated

When evaluating the reputation of your organization, community, or project, many people flock to surveys in which you ask randomly selected people what they think of your thing, or what their attitudes towards your organization, community or project are. 

If you do this, you will very reliably get back data that looks like people are indifferent to you and your projects, and your results will probably be dominated by extremely shallow things like "do the words in your name evoke positive or negative associations".

People largely only form opinions of you or your projects when they have some reason to do so, like trying to figure out whether to buy your product, join your social movement, or vote for you in an election. You basically never care what people think about you while they are engaged in activities completely unrelated to you; you care about what people will do when they have to take some action that is related to your goals. But the former is exactly what you are measuring in attitude surveys.

As an example of this (used here for illustrative purposes, and what caused me to form strong opinions on this, but not intended as the central point of this post): Many leaders in the Effective Altruism community ran various surveys after the collapse of FTX trying to understand what the reputation of "Effective Altruism" is. The results were basically always the same: people mostly didn't know what EA was, and had vaguely positive associations with the term when asked. The people who had recently become familiar with it (of whom there weren't that many) did lower their opinions of EA, but the vast majority of people did not (because they mostly didn't know what it was).

As far as I can tell, these surveys left most EA leaders thinking that the reputational effects of FTX were limited. After all, most people never heard about EA in the context of FTX, and seemed to mostly have positive associations with the term, and the average like or dislike in surveys barely budged. In reflections at the time, conclusions looked like this:

  1. The fact that most people don't really care much about EA is both a blessing and a curse. But either way, it's a fact of life; and even as we internally try to learn what lessons we can from FTX, we should keep in mind that people outside EA mostly can't be bothered to pay attention.
  2. An incident rate in the single digit percents means that most community builders will have at least one example of someone raising FTX-related concerns—but our guess is that negative brand-related reactions are more likely to come from things like EA's perceived affiliation with tech or earning to give than FTX.
  3. We have some uncertainty about how well these results generalize outside the sample populations. E.g. we have heard claims that people who work in policy were unusually spooked by FTX. That seems plausible to us, though Ben would guess that policy EAs similarly overestimate the extent to which people outside EA care about EA drama.

Or this:

Yes, my best understanding is still that people mostly don't know what EA is, the small fraction that do mostly have a mildly positive opinion, and that neither of these points were affected much by FTX.[1] 

This, I think, was an extremely costly mistake to make. Since then, practically all metrics of the EA community's health and growth have sharply declined, and the extremely large and negative reputational effects have become clear.

Most programmers are familiar with the idea of a "lazily evaluated variable" - a value that isn't computed until the exact moment you try to use it. Instead of calculating the value upfront, the system maintains just enough information to be able to calculate it when needed. If you never end up using that value, you never pay the computational cost of calculating it. Similarly, most people don't form meaningful opinions about organizations or projects until the moment they need to make a decision that involves that organization. Just as a lazy variable suddenly gets evaluated when you first try to read its value, people's real opinions about projects don't materialize until they're in a position where that opinion matters - like when deciding whether to donate, join, or support the project's initiatives.
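To spell the metaphor out for readers who don't program, a minimal sketch (hypothetical Python, purely illustrative):

```python
def form_detailed_opinion():
    # Stand-in for the costly work: asking friends, searching online, etc.
    print("performing the expensive evaluation...")
    return "considered opinion"

class Lazy:
    """A value that is only computed the first time it is actually read."""
    def __init__(self, compute):
        self._compute = compute    # recipe for the value; stored, not run
        self._value = None
        self._evaluated = False

    def get(self):
        if not self._evaluated:    # first read: pay the cost now
            self._value = self._compute()
            self._evaluated = True
        return self._value         # later reads reuse the cached result

opinion = Lazy(form_detailed_opinion)   # nothing has been computed yet
print(opinion.get())                    # only now does the evaluation run
```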

Reputation is lazily evaluated. People conserve their mental energy, time, and social capital by not forming detailed opinions about things until those opinions become relevant to their decisions. When surveys try to force early evaluation of these "lazy" opinions, they get something more like a placeholder value than the actual opinion that would form in a real decision-making context.

This computation is not purely cognitive. As people encounter a product, organization, or community that they are considering doing something with, they will ask their friends whether they have any opinions, perform online searches, and generally seek out information to help them with whatever decision they are facing. This is part of the reason why this metaphorical computation is costly and put off until it's necessary.

So when you are trying to understand what people think of you, or how people's opinions of you are changing, pay much more attention to the attitudes of people who have recently put in the effort to learn about you, or were facing some decision related to you, and so are more representative of where people tend to end up when they are in a similar position. These will be much better indicators of your actual latent reputation than what happens when you ask people on a survey.

For the EA surveys, these indicators looked very bleak: 

"Results demonstrated that FTX had decreased satisfaction by 0.5-1 points on a 10-point scale within the EA community"

"Among those aware of EA, attitudes remain positive and actually maybe increased post-FTX —though they were lower (d = -1.5, with large uncertainty) among those who were additionally aware of FTX."

"Most respondents reported continuing to trust EA organizations, though over 30% said they had substantially lost trust in EA public figures or leadership."

If various people in EA had paid attention to these, instead of to the approximately meaningless placeholder variables that you get when you ask people what they think of you without actually getting them to perform the costly computation associated with forming an opinion of you, I think they would have made substantially better predictions.

Comment by habryka (habryka4) on Litigate-for-Impact: Preparing Legal Action against an AGI Frontier Lab Leader · 2024-12-07T22:05:24.958Z · LW · GW

FWIW, if anyone is interested in my take, my guess is it doesn't make sense to support this (and I mild-downvoted the post).

I am pretty worried about some of your past reporting/activism in the space somewhat intentionally conflating some broader Bay Area VC and tech culture with the "EA community", in a way that IMO ended up being more misleading than informing (and you then ended up promoting media articles that I think were misleading, despite, I think, many people pointing this out).

People can form their own opinions on this: https://forum.effectivealtruism.org/posts/JCyX29F77Jak5gbwq/ea-sexual-harassment-and-abuse?commentId=DAxFgmWe3acigvTfi 

I might also be wrong here, and I don't feel super confident, but I at least have some of my flags firing and would have a prior that lawsuits in the space, driven by the people who currently seem involved, would be bad. I think it's reasonable for people to have very different takes on this. 

I am obviously generally quite in favor of people sharing bad experiences they had, but would currently make bets that most people on LW would regret getting involved with this (but am also open to argument and don't feel super robust in this).

Comment by habryka (habryka4) on Litigate-for-Impact: Preparing Legal Action against an AGI Frontier Lab Leader · 2024-12-07T21:44:49.493Z · LW · GW

(The crosspost link isn't working)

Comment by habryka (habryka4) on Open Thread Fall 2024 · 2024-12-07T17:29:30.078Z · LW · GW

We have a few kinds of potential bonus a post could get, but yeah, something seems very off about your sort order, and I would really like to dig into it. A screenshot would still be quite valuable.

Comment by habryka (habryka4) on 1a3orn's Shortform · 2024-12-06T21:25:09.546Z · LW · GW

I was just thinking of adding some kind of donation tier where if you donate $20k to us we will custom-build a Gerver sofa, and dedicate it to you.

Comment by habryka (habryka4) on johnswentworth's Shortform · 2024-12-06T20:04:10.026Z · LW · GW

My guess is neither of you is very good at using them, and getting value out of them somewhat scales with skill. 

Models can easily replace on the order of 50% of my coding work these days, and if I have any major task, my guess is I quite reliably get 20%-30% productivity improvements out of them. It does take time to figure out which things they are good at, and how to prompt them.