Posts

Two easy things that maybe Just Work to improve AI discourse 2024-06-08T15:51:18.078Z
What does it look like for AI to significantly improve human coordination, before superintelligence? 2024-01-15T19:22:50.079Z
How do you feel about LessWrong these days? [Open feedback thread] 2023-12-05T20:54:42.317Z
Vote on worthwhile OpenAI topics to discuss 2023-11-21T00:03:03.898Z
New LessWrong feature: Dialogue Matching 2023-11-16T21:27:16.763Z
Does davidad's uploading moonshot work? 2023-11-03T02:21:51.720Z
Holly Elmore and Rob Miles dialogue on AI Safety Advocacy 2023-10-20T21:04:32.645Z
How to partition teams to move fast? Debating "low-dimensional cuts" 2023-10-13T21:43:53.067Z
Thomas Kwa's MIRI research experience 2023-10-02T16:42:37.886Z
Feedback-loops, Deliberate Practice, and Transfer Learning 2023-09-07T01:57:33.066Z
A Golden Age of Building? Excerpts and lessons from Empire State, Pentagon, Skunk Works and SpaceX 2023-09-01T04:03:41.067Z
Consider applying to a 2-week alignment project with former GitHub CEO 2023-04-04T06:20:49.532Z
How I buy things when Lightcone wants them fast 2022-09-26T05:02:09.003Z
How my team at Lightcone sometimes gets stuff done 2022-09-19T05:47:06.787Z
($1000 bounty) How effective are marginal vaccine doses against the covid delta variant? 2021-07-22T01:26:26.117Z
What other peptide vaccines might it be useful to make? 2021-03-03T06:25:40.130Z
Credence polls for 26 claims from the 2019 Review 2021-01-09T07:13:24.166Z
Weekend Review Bash: Guided review writing, Forecasting and co-working, in EU and US times 2021-01-08T21:04:12.332Z
Thread for making 2019 Review accountability commitments 2020-12-18T05:07:25.533Z
Which sources do you trust the most on nutrition advice for exercise? 2020-12-16T03:22:40.088Z
The LessWrong 2018 Book is Available for Pre-order 2020-12-01T08:00:00.000Z
Why is there a "clogged drainpipe" effect in idea generation? 2020-11-20T19:08:08.461Z
Final Babble Challenge (for now): 100 ways to light a candle 2020-11-12T23:17:07.790Z
Babble Challenge: 50 thoughts on stable, cooperative institutions 2020-11-05T06:38:38.997Z
Babble challenge: 50 consequences of intelligent ant colonies 2020-10-29T07:21:33.379Z
Babble challenge: 50 ways of solving a problem in your life 2020-10-22T04:49:42.661Z
What are some beautiful, rationalist artworks? 2020-10-17T06:32:43.142Z
Babble challenge: 50 ways of hiding Einstein's pen for fifty years 2020-10-15T07:23:48.541Z
Babble challenge: 50 ways to escape a locked room 2020-10-08T05:13:06.985Z
Babble challenge: 50 ways of sending something to the moon 2020-10-01T04:20:24.016Z
Sunday August 16, 12pm (PDT) — talks by Ozzie Gooen, habryka, Ben Pace 2020-08-14T18:32:35.378Z
Sunday August 9, 1pm (PDT) — talks by elityre, jacobjacob, Ruby 2020-08-06T22:50:21.550Z
Sunday August 2, 12pm (PDT) — talks by jimrandomh, johnswentworth, Daniel Filan, Jacobian 2020-07-30T23:55:44.712Z
$1000 bounty for OpenAI to show whether GPT3 was "deliberately" pretending to be stupider than it is 2020-07-21T18:42:44.704Z
Lessons on AI Takeover from the conquistadors 2020-07-17T22:35:32.265Z
Meta-preferences are weird 2020-07-16T23:03:40.226Z
Sunday July 19, 1pm (PDT) — talks by Raemon, ricraz, mr-hire, Jameson Quinn 2020-07-16T20:04:37.974Z
Mazes and Duality 2020-07-14T19:54:42.479Z
Sunday July 12 — talks by Scott Garrabrant, Alexflint, alexei, Stuart_Armstrong 2020-07-08T00:27:57.876Z
Public Positions and Private Guts 2020-06-26T23:00:52.838Z
Missing dog reasoning 2020-06-26T21:30:00.491Z
Sunday June 28 – talks by johnswentworth, Daniel kokotajlo, Charlie Steiner, TurnTrout 2020-06-26T19:13:23.754Z
DontDoxScottAlexander.com - A Petition 2020-06-25T05:44:50.050Z
Sunday June 21st – talks by Abram Demski, alkjash, orthonormal, eukaryote, Vaniver 2020-06-18T20:10:38.978Z
FHI paper on COVID-19 government countermeasures 2020-06-04T21:06:51.287Z
[Job ad] Lead an ambitious COVID-19 forecasting project [Deadline extended: June 10th] 2020-05-27T16:38:04.084Z
Crisis and opportunity during coronavirus 2020-03-12T20:20:55.703Z
[Link] Beyond the hill: thoughts on ontologies for thinking, essay-completeness and forecasting 2020-02-02T12:39:06.563Z
[Part 1] Amplifying generalist research via forecasting – Models of impact and challenges 2019-12-19T15:50:33.412Z
[Part 2] Amplifying generalist research via forecasting – results from a preliminary exploration 2019-12-19T15:49:45.901Z

Comments

Comment by jacobjacob on o3 · 2024-12-20T23:43:45.584Z · LW · GW

For people who don't expect a strong government response... remember that Elon is First Buddy now. 🎢

Comment by jacobjacob on jacobjacob's Shortform Feed · 2024-12-20T23:19:58.834Z · LW · GW

It is January 2020. 

Comment by jacobjacob on Anthropic leadership conversation · 2024-12-20T22:47:15.214Z · LW · GW

Okay, well, I'm not going to post "Anthropic leadership conversation [fewer likes]" 😂

Comment by jacobjacob on Anthropic leadership conversation · 2024-12-20T22:34:23.774Z · LW · GW

(Can you edit out all the "like"s, or give permission for an admin to edit them out? I think in written text it makes speakers sound, for lack of a better word, unflatteringly moronic) 

Comment by jacobjacob on A Golden Age of Building? Excerpts and lessons from Empire State, Pentagon, Skunk Works and SpaceX · 2024-12-10T20:58:52.378Z · LW · GW

McNamara was at Ford, not Toyota. I reckon he modelled manufacturing like an efficient Boeing manager, not an efficient SpaceX manager.

Comment by jacobjacob on Secret Collusion: Will We Know When to Unplug AI? · 2024-09-19T17:18:38.603Z · LW · GW

(Nitpick: I'd find the first paragraphs much easier to read if they didn't have any of the bolding)

Comment by jacobjacob on Proveably Safe Self Driving Cars [Modulo Assumptions] · 2024-09-19T12:30:59.014Z · LW · GW

rename the "provable safety" area as "provable safety modulo assumptions" area and be very explicit about our assumptions.

Very much agree. I gave some feedback along those lines as the term was coined, and am sad it didn't catch on. But of course "provable safety modulo assumptions" isn't very short and catchy...

I do like the word "guarantee" as a substitute. We can talk of formal guarantees, but also of a store guaranteeing that an item you buy will meet a certain standard. So its connotations are nicely in the direction of proof but without, as it were, "proving too much" :)

Comment by jacobjacob on Dario Amodei leaves OpenAI · 2024-08-24T15:56:44.980Z · LW · GW

Interesting thread to return to, 4 years later. 

Comment by jacobjacob on Leaving MIRI, Seeking Funding · 2024-08-12T18:07:27.438Z · LW · GW

FYI: I skimmed the post quickly and didn't realize there was a Patreon! 

If you wanted to change that, you might want to put it at the very end of the post, on a new line, saying something like: "If you'd like to fund my work directly, you can do so via Patreon [here](link)."

Comment by jacobjacob on jacobjacob's Shortform Feed · 2024-07-23T06:24:27.592Z · LW · GW

Someone posted these quotes in a Slack I'm in... what Ellsberg said to Kissinger: 

“Henry, there’s something I would like to tell you, for what it’s worth, something I wish I had been told years ago. You’ve been a consultant for a long time, and you’ve dealt a great deal with top secret information. But you’re about to receive a whole slew of special clearances, maybe fifteen or twenty of them, that are higher than top secret.

“I’ve had a number of these myself, and I’ve known other people who have just acquired them, and I have a pretty good sense of what the effects of receiving these clearances are on a person who didn’t previously know they even existed. And the effects of reading the information that they will make available to you.

[...]

“In the meantime it will have become very hard for you to learn from anybody who doesn’t have these clearances. Because you’ll be thinking as you listen to them: ‘What would this man be telling me if he knew what I know? Would he be giving me the same advice, or would it totally change his predictions and recommendations?’ And that mental exercise is so torturous that after a while you give it up and just stop listening. I’ve seen this with my superiors, my colleagues….and with myself.

“You will deal with a person who doesn’t have those clearances only from the point of view of what you want him to believe and what impression you want him to go away with, since you’ll have to lie carefully to him about what you know. In effect, you will have to manipulate him. You’ll give up trying to assess what he has to say. The danger is, you’ll become something like a moron. You’ll become incapable of learning from most people in the world, no matter how much experience they may have in their particular areas that may be much greater than yours.”

(link)

Comment by jacobjacob on Against Aschenbrenner: How 'Situational Awareness' constructs a narrative that undermines safety and threatens humanity · 2024-07-18T04:26:46.949Z · LW · GW

tbf I never realized "sic" was mostly meant to point out errors, specifically. I thought it was used to mean "this might sound extreme --- but I am in fact quoting literally"

Comment by jacobjacob on Against Aschenbrenner: How 'Situational Awareness' constructs a narrative that undermines safety and threatens humanity · 2024-07-16T22:12:58.935Z · LW · GW

I mean that in both cases he used literally those words.

Comment by jacobjacob on Against Aschenbrenner: How 'Situational Awareness' constructs a narrative that undermines safety and threatens humanity · 2024-07-16T18:57:48.989Z · LW · GW

It's not epistemically poor to say these things if they're actually true.

Invalid. 

Compare: 

A: "So I had some questions about your finances, it seems your trading desk and exchange operate sort of closely together? There were some things that confused me..."

B: "our team is 20 insanely smart engineers" 

A: "right, but i had a concern that i thought perhaps ---"

B: "if you join us and succeed you'll be a multi millionaire"  

A: "...okay, but what if there's a sudden downturn ---" 

B: "bull market is inevitable right now"

 

Maybe not false. But epistemically poor form. 

Comment by jacobjacob on Against Aschenbrenner: How 'Situational Awareness' constructs a narrative that undermines safety and threatens humanity · 2024-07-16T18:52:51.632Z · LW · GW

(crossposted to the EA Forum)

(😭 there has to be a better way of doing this, lol)

Comment by jacobjacob on Against Aschenbrenner: How 'Situational Awareness' constructs a narrative that undermines safety and threatens humanity · 2024-07-16T03:20:15.448Z · LW · GW

(crossposted to EA forum)

I agree with many of Leopold's empirical claims, timelines, and much of his analysis. I'm acting on it myself in my planning, as something like a mainline scenario. 

Nonetheless, the piece exhibited some patterns that gave me a pretty strong allergic reaction. It made or implied claims like:

  • a small circle of the smartest people believe this
  • I will give you a view into this small elite group, who are the only ones who are situationally aware
  • the inner circle was long TSMC way before you
  • if you believe me, you can get 100x richer -- there's still alpha, you can still be early
  • this geopolitical outcome is "inevitable" (sic!)
  • in the future the coolest and most elite group will work on The Project. "see you in the desert" (sic)
  • etc.

Combined with a lot of retweets, with praise, on launch day, that were clearly coordinated behind the scenes, it gives me the feeling of something deliberately written to meme a narrative into existence via self-fulfilling prophecy, rather than to infer a forecast via analysis.

As a sidenote, this felt to me like an indication of how different the AI safety-adjacent community is now from when I joined it about a decade ago. In the early days of this space, I expect a piece like this would have been something like "epistemically cancelled": fairly strongly decried as violating important norms around reasoning and cooperation. I actually expect that had someone written this publicly in 2016, they would've plausibly been uninvited as a speaker to any EAGs in 2017.

I don't particularly want to debate whether these epistemic boundaries were correct --- I'd just like to claim that, empirically, I think they de facto would have been enforced. Though, if others who have been around have a different impression of how this would've played out, I'd be curious to hear.

Comment by jacobjacob on 80,000 hours should remove OpenAI from the Job Board (and similar EA orgs should do similarly) · 2024-07-06T23:45:37.286Z · LW · GW

[censored_meme.png]

I like review bot and think it's good

Comment by jacobjacob on Habryka's Shortform Feed · 2024-07-04T20:13:22.961Z · LW · GW

(Sidenote: it seems Sam was kind of explicitly asking to be pressured, so your comment seems legit :)  
But I also think that, had Sam not done so, I would still really appreciate him showing up and responding to Oli's top-level post, and I think it should be fine for folks from companies to show up and engage with the topic at hand (NDAs), without also having to do a general AMA about all kinds of other aspects of their strategy and policies. If Zach's questions do get very upvoted, though, it might suggest there's demand for some kind of Anthropic AMA event.) 

Comment by jacobjacob on 80,000 hours should remove OpenAI from the Job Board (and similar EA orgs should do similarly) · 2024-07-04T19:17:58.001Z · LW · GW

Poor Review Bot, why do you get so downvoted? :(

Comment by jacobjacob on 80,000 hours should remove OpenAI from the Job Board (and similar EA orgs should do similarly) · 2024-07-04T17:30:53.095Z · LW · GW

I was around a few years ago when there were already debates about whether 80k should recommend OpenAI jobs. And that was before any of the fishy stuff leaked out, when they were stacking up cool governance commitments like becoming a capped-profit and having a merge-and-assist clause. 

And, well, it sure seems like a mistake in hindsight how much advertising they got. 

Comment by jacobjacob on Actually, Power Plants May Be an AI Training Bottleneck. · 2024-07-03T01:50:49.344Z · LW · GW

30 kW

typo

Comment by jacobjacob on Habryka's Shortform Feed · 2024-07-01T01:13:46.443Z · LW · GW

Not sure how to interpret the "agree" votes on this comment. If someone is able to share that they agree with the core claim because of object-level evidence, I am interested. (Rather than agreeing with the claim that this state of affairs is "quite sad".)

Comment by jacobjacob on Habryka's Shortform Feed · 2024-07-01T01:11:35.472Z · LW · GW

Does anyone from Anthropic want to explicitly deny that they are under an agreement like this? 

(I know the post talks about some and not necessarily all employees, but am still interested). 

Comment by jacobjacob on Boycott OpenAI · 2024-06-22T04:13:51.678Z · LW · GW

Note that, by the grapevine, sometimes serving inference requests might lose OpenAI money due to them subsidising it. Not sure how this relates to boycott incentives. 

Comment by jacobjacob on What distinguishes "early", "mid" and "end" games? · 2024-06-21T18:38:36.138Z · LW · GW

That metaphor suddenly slid from chess into poker. 

Comment by jacobjacob on The Leopold Model: Analysis and Reactions · 2024-06-16T12:01:39.757Z · LW · GW

If AI ends up intelligent enough and with enough manufacturing capability to threaten nuclear deterrence; I'd expect it to also deduce any conclusions I would.

So it seems mostly a question of what the world would do with those conclusions earlier, rather than not at all.

A key exception is if later AGI would be blocked on certain kinds of manufacturing to create its destabilizing tech, and if drawing attention to that earlier starts the serially-blocking work earlier.

Comment by jacobjacob on The Leopold Model: Analysis and Reactions · 2024-06-15T15:24:22.616Z · LW · GW

I have thoughts on the impact of AI on nuclear deterrence, and on the claims made about it in the post.

But I'm uncertain whether it's wise to discuss such things publicly.

Curious if folks have takes on that. (The meta question)

Comment by jacobjacob on Access to powerful AI might make computer security radically easier · 2024-06-12T14:59:33.813Z · LW · GW

y'know, come to think of it... Training and inference differ massively in how much compute they consume. So after you've trained a massive system, you have a lot of compute free to do inference (modulo needing to use it to generate revenue, run your apps, etc). Meaning that for large-scale, critical applications, it might in fact be feasible to tolerate a big, multiple-OOM hit to the compute cost of your inference, if that's all that's required to get the zero-knowledge benefits, and if those are crucial.

Comment by jacobjacob on Two easy things that maybe Just Work to improve AI discourse · 2024-06-12T10:27:42.833Z · LW · GW

"arguments" is perhaps a bit generous of a term...

Comment by jacobjacob on Two easy things that maybe Just Work to improve AI discourse · 2024-06-09T21:00:38.296Z · LW · GW

(also, lol at this being voted into the negative! Giving karma as encouragement seems like a great thing. It's the whole point of it. It's even a venerable LW tradition, and was how people incentivised participation in the annual community surveys in the olden days)

Comment by jacobjacob on Two easy things that maybe Just Work to improve AI discourse · 2024-06-09T09:42:18.075Z · LW · GW

(Also the arguments of this comment do not apply to Community Notes.)

Comment by jacobjacob on Two easy things that maybe Just Work to improve AI discourse · 2024-06-09T08:47:49.701Z · LW · GW

the amount of people who could write sensible arguments is small

Disagree. The quality of arguments that need debunking is often way below the average LWer's intellectual pay grade. And there are actually quite a lot of us.

Comment by jacobjacob on Two easy things that maybe Just Work to improve AI discourse · 2024-06-09T08:46:19.103Z · LW · GW

Cross-posting sure seems cheap. Though I think replying and engaging with existing discourse is easier than building a following for one's top-level posts from scratch.

Comment by jacobjacob on Two easy things that maybe Just Work to improve AI discourse · 2024-06-09T08:44:24.680Z · LW · GW

Yeah, my hypothesis is something like this might work.

(Though I can totally see how it wouldn't, and I wouldn't have thought it a few years ago, so my intuition might just be mistaken)

Comment by jacobjacob on Two easy things that maybe Just Work to improve AI discourse · 2024-06-09T08:42:43.769Z · LW · GW

I don't think the numbers really check out on your claim. Only a small proportion of people reading this are alignment researchers. And of the remaining folks, many are probably on Twitter anyway, or otherwise have some similarly slack part of their daily schedule filled with sort of random, non-high-opportunity-cost stuff.

Historically there sadly haven't been scalable ways for the average LW lurker to contribute to safety progress; now there might be a little one.

Comment by jacobjacob on Access to powerful AI might make computer security radically easier · 2024-06-08T22:21:48.692Z · LW · GW

never thought I'd die fighting side by side with an elf...

https://www.coindesk.com/tech/2024/04/09/venture-firm-a16z-releases-jolt-a-zero-knowledge-virtual-machine/

Comment by jacobjacob on Two easy things that maybe Just Work to improve AI discourse · 2024-06-08T22:12:35.794Z · LW · GW

If anyone signed up to Community Notes because of this post, feel free to comment below and I'll give you upvote karma :) (not agreement karma)

Comment by jacobjacob on Two easy things that maybe Just Work to improve AI discourse · 2024-06-08T21:37:07.613Z · LW · GW

Yes, I've felt some silent majority patterns.

Collective action problem idea: we could run an experiment -- 30 ppl opt in to writing 10 comments and liking 10 comments they think raise the sanity waterline, conditional on a total of 29 other people opting in too. (A "kickstarter".) Then we see if it seemed like it made a difference.

I'd join. If anyone is also down for that, feel free to use this comment as a Schelling point and reply with your interest below.

(I'm not sure the right number of folks, but if we like the result we could just do another round.)

Comment by jacobjacob on Two easy things that maybe Just Work to improve AI discourse · 2024-06-08T18:05:00.281Z · LW · GW

there could still be founder effects in the discourse, or particularly influential people could be engaged in the twitter discourse.

I think that's the case. Mostly the latter, some of the former.

Comment by jacobjacob on Closed-Source Evaluations · 2024-06-08T16:29:56.052Z · LW · GW

Without commenting on the proposal itself: I think the term "eval test set" is clearer for this purpose than "closed source eval".

Comment by jacobjacob on Closed-Source Evaluations · 2024-06-08T16:23:37.877Z · LW · GW

I'm writing a quick and dirty post because the alternative is that I wait for months and maybe not write it after all.

This is the way. 

Comment by jacobjacob on Access to powerful AI might make computer security radically easier · 2024-06-08T16:20:15.857Z · LW · GW

I think this is an application of a more general, very powerful principle of mechanism design: when cognitive labor is abundant, near omni-present surveillance becomes feasible. 

For domestic life, this is terrifying. 

But for some high stakes, arms race-style scenarios, it might have applications. 

Beyond what you mentioned, I'm particularly interested in this being a game-changer for bilateral negotiation. Two parties make an agreement, consent to being monitored by an AI auditor, and verify that the auditor's design will communicate with the other party if and only if there has been a rule breach. (Beyond the rule breach, it won't be able to leak any other information. And, being an AI, it can be designed to have its memory erased, never be recruited as a spy, etc.) However, one big challenge of building this is how two adversarial parties could ever gain enough confidence to allow such a hardware/software package into a secure facility, especially if its whole point is to have a communication channel to their adversary. 

Comment by jacobjacob on Announcing ILIAD — Theoretical AI Alignment Conference · 2024-06-05T14:25:38.033Z · LW · GW

ah that makes sense thanks

Comment by jacobjacob on Announcing ILIAD — Theoretical AI Alignment Conference · 2024-06-05T10:15:24.724Z · LW · GW

Sidenote: I'm a bit confused by the name. The all caps makes it seem like an acronym. But it seems to not be? 

Comment by jacobjacob on Talent Needs of Technical AI Safety Teams · 2024-05-25T16:34:56.616Z · LW · GW

Sure that works! Maybe use a term like "importantly misguided" instead of "correct"? (Seems easier for me to evaluate)

Comment by jacobjacob on Talent Needs of Technical AI Safety Teams · 2024-05-24T02:44:58.242Z · LW · GW

To anyone reading this who is considering working in alignment --

Following the recent revelations, I now believe OpenAI should be regarded as a bad faith actor. If you go work at OpenAI, I believe your work will be net negative, and will most likely be used to "safetywash" or "governance-wash" Sam Altman's mad dash to AGI. It now appears Sam Altman is at least as sketchy as SBF. Attempts to build "social capital" or "affect the culture from the inside" will not work under current leadership (indeed, what we're currently seeing are the failed results of 5+ years of such attempts). I would very strongly encourage anyone looking to contribute to stay away from OpenAI.

I recognize this is a statement, and not an argument. I don't have the time to write out the full argument. But I'm leaving this comment here, such that others can signal agreement with it. 

Comment by jacobjacob on jacobjacob's Shortform Feed · 2024-05-08T08:30:34.408Z · LW · GW

That's more about me being interested in key global infrastructure; I've been curious about them for quite a lot of years, after realising the combination of how significant what they're building is vs how few folks know about them. I don't know that they have any particular generative-AI-related projects in the short term. 

Comment by jacobjacob on jacobjacob's Shortform Feed · 2024-05-08T07:54:31.698Z · LW · GW

Anyone know folks working on semiconductors in Taiwan and Abu Dhabi, or on fiber at Tata Industries in Mumbai? 

I'm currently travelling around the world and talking to folks about various kinds of AI infrastructure, and looking for recommendations of folks to meet! 

If so, feel free to DM me! 

(If you don't know me, I'm a dev here on LessWrong and was also part of founding Lightcone Infrastructure.)

Comment by jacobjacob on Express interest in an "FHI of the West" · 2024-04-19T08:45:13.435Z · LW · GW

Noting that a nicer name that's just waiting to be had, in this context, is "Future of the Lightcone Institute" :) 

Comment by jacobjacob on Express interest in an "FHI of the West" · 2024-04-18T09:17:00.726Z · LW · GW

Two notes: 

  1. I think the title is a somewhat obscure pun referencing the old saying that Stanford was the "Harvard of the West". If one is not familiar with that saying, I guess some of the nuance is lost in the choice of term. (I personally had never heard that saying before recently, and I'm not even quite sure I'm referencing the right "X of the West" pun)
  2. habryka did have a call with Nick Bostrom a few weeks back, to discuss his idea for an "FHI of the West", and I'm quite confident he referred to it with that phrase on the call, too. As far as I'm aware, Nick didn't particularly react to it with more than a bit of humor. 

Comment by jacobjacob on Does anyone know good essays on how different AI timelines will affect asset prices? · 2024-03-06T20:17:55.814Z · LW · GW

See this: https://www.lesswrong.com/posts/CTBta9i8sav7tjC2r/how-to-hopefully-ethically-make-money-off-of-agi