Posts

[Cross-post] Book Review: Bureaucracy, by James Q Wilson 2024-08-19T13:57:10.872Z
davekasten's Shortform 2024-05-01T22:07:24.997Z

Comments

Comment by davekasten on Zach Stein-Perlman's Shortform · 2024-09-06T22:18:54.902Z · LW · GW

Is this where we think our pressuring-Anthropic points are best spent?

Comment by davekasten on Habryka's Shortform Feed · 2024-09-04T02:52:58.520Z · LW · GW

I personally endorse this as an example of us being a community that Has The Will To Try To Build Nice Things.

Comment by davekasten on The Checklist: What Succeeding at AI Safety Will Involve · 2024-09-04T02:52:22.220Z · LW · GW

To say the obvious thing: I think if Anthropic isn't able to make at least somewhat-roughly-meaningful predictions about AI welfare, then their core current public research agendas have failed?

Comment by davekasten on Buck's Shortform · 2024-09-02T18:01:07.203Z · LW · GW

Fair enough! 

Comment by davekasten on Buck's Shortform · 2024-09-02T15:23:52.982Z · LW · GW

Possibly misguided question given the context -- I see you incorporating imperfect information in "the attack fails silently"; why not also draw a distinction between "the attack succeeds noisily, the AI wins and we know it won" and "the attack succeeds silently, the AI wins and we don't know it won"?

Comment by davekasten on Verification methods for international AI agreements · 2024-08-31T18:43:02.621Z · LW · GW

I would suggest that the set of means available to nation-states to unilaterally surveil another nation state is far more expansive than the list you have.  For example, the good-old-fashioned "Paying two hundred and eighty-two thousand dollars in a Grand Cayman banking account to a Chinese bureaucrat"* appears nowhere in your list.  


*If you get that this is a reference to the movie Spy Game, you are cool.  If you don't, go watch Spy Game.  It has a worldview on power that is extremely relevant to rationalists. 

Comment by davekasten on "Deception Genre" What Books are like Project Lawful? · 2024-08-29T16:16:14.851Z · LW · GW

I think you could argue plausibly that the climax of Vernor Vinge's A Deepness In the Sky has aspects of this, though it's subverted in multiple interesting spoilery ways.

In fact, I think you could argue that a lot of Vinge's writing tends to have major climaxes dependent on Xanatos Gambit pileups based on deception themes. 

Comment by davekasten on Why Large Bureaucratic Organizations? · 2024-08-27T21:16:49.552Z · LW · GW

This feels like a great theory for one motivation, but it isn't at all complete. 

For example: this theory doesn't really predict why anyone is ever hired above the bottom level of an organization at the margin.  

Comment by davekasten on Would catching your AIs trying to escape convince AI developers to slow down or undeploy? · 2024-08-27T02:59:33.026Z · LW · GW

That's a fair criticism!  Season 1 is definitely slower on that front compared to the others.  I think season 1 is the most normal "crime of the week" season by far, which is why I view it as a good on-ramp for folks less familiar.  Arguably, for someone situated as you are, you should just watch the pilot, read a quick wiki summary of every other episode in season 1 except for the last 2, watch those last 2, and move into season 2 when things get moving a little faster.  (Finch needs a character that doesn't appear until season 2 to do a lot of useful exposition on how he thinks about the Machine's alignment). 

Comment by davekasten on Would catching your AIs trying to escape convince AI developers to slow down or undeploy? · 2024-08-27T02:36:52.421Z · LW · GW

I will continue to pitch on the idea that Person of Interest is a TV show chock full of extremely popular TV people, including one lead beloved by Republicans, and we inexplicably fail to push people towards its Actually Did The Research presentation of loss-of-control stuff.

We should do that.  You all should, unironically, recommend it, streaming now on Amazon Prime for free, to your normie parents and aunts and uncles if they're curious about what you do at work. 

Comment by davekasten on davekasten's Shortform · 2024-08-23T22:21:14.342Z · LW · GW

Why are we so much more worried about LLMs having CBRN risk than super-radicalization risk, precisely?

(Or is this just an expected-harm metric rather than a probability metric?)

Comment by davekasten on Zach Stein-Perlman's Shortform · 2024-08-23T14:39:24.690Z · LW · GW

I think you're eliding the difference between "powerful capabilities" being developed, the window of risk, and the best solution.  

For example, if Anthropic believes "_we_ will have it internally in 1-3 years, but no small labs will, and we can contain it internally," then they might conclude that the warrant for a state-level FMD is low.  Alternatively, you might conclude, "we will have it internally in 1-3 years, other small labs will be close behind, and an American state's capabilities won't be sufficient; we need DoD, FBI, and IC authorities to go stompy on this threat," and thus think a state-level FMD is low-value-add.  

Very unsure I agree with either of these hypos, to be clear!  Just trying to explore the possibility space and point out that this is complex. 

Comment by davekasten on davekasten's Shortform · 2024-08-22T22:39:11.918Z · LW · GW

I am (speaking personally) pleasantly surprised by Anthropic's letter.  https://cdn.sanity.io/files/4zrzovbb/website/6a3b14a98a781a6b69b9a3c5b65da26a44ecddc6.pdf

Comment by davekasten on davekasten's Shortform · 2024-08-22T21:51:04.419Z · LW · GW

I think at the meta level I very much doubt that I am responsible enough to create and curate a list of human beings for the most dangerous hazards.  For example, I am very confident that I could not 100% successfully detect a foreign government spy inside my friend group, because even the US intelligence community can't do that...  you need other mitigating controls, instead.

Comment by davekasten on davekasten's Shortform · 2024-08-22T21:49:36.656Z · LW · GW

Yeah, that's a useful taxonomy to be reminded of.  I think it's interesting how the "development hazard", item 8, with maybe a smidge of "adversary hazard", is the driver of people's thinking on AI.  I'm pretty unconvinced that good infohazard doctrine, even for AI, can be written based on thinking mainly about that!

Comment by davekasten on davekasten's Shortform · 2024-08-22T21:42:22.982Z · LW · GW

I think a lot of my underlying instinctive opposition to this concept boils down to thinking that we can and do coordinate on this stuff quite a lot.  Arguably, AI is the weird counterexample of a thought that wants to be thunk -- I think modern Western society is very nearly tailor-made to seek a thing that is abstract, maximizing, systematizing of knowledge, and useful, especially if it fills a hole left by the collapse of organized religion.  

I think for most other infohazards, the proper approach requires setting up an (often-government) team that handles them, which requires those employees to expose themselves to the infohazard to manage it.  And, yeah, sometimes they suffer real damage from it.  There's no way to analyze ISIS beheading videos to stop their perpetrators without seeing some beheading videos; I think that's the more-common varietal of infohazard I'm thinking of.

Comment by davekasten on davekasten's Shortform · 2024-08-22T21:21:17.398Z · LW · GW

I think this is plausibly describing some folks!  

But I also think there's a separate piece -- I observe, with pretty high odds that it isn't just an act, that at least some people are trying to associate themselves with the near-term harms and AI ethics stuff because they think that is the higher-status stuff, despite direct obvious evidence that the highest-status people in the room disagree.  

Comment by davekasten on davekasten's Shortform · 2024-08-21T15:13:43.661Z · LW · GW

I'm pretty sure that I think "infohazard" is a conceptual dead-end that embeds some really false understandings of how secrets are used by humans.  It is an orphan of a concept -- it doesn't go anywhere.  Ok, the information's harmful.  You need humans to touch that info anyways to do responsible risk-mitigation.  So now what? 

Comment by davekasten on Zach Stein-Perlman's Shortform · 2024-08-20T22:27:00.443Z · LW · GW

I just want to note that people who've never worked in a true high-confidentiality environment (professional services, national defense, professional services for national defense) probably radically underestimate the level of brain damage and friction that Zac is describing here:

"Imagine, if you will, trying to hold a long conversation about AI risk - but you can’t reveal any information about, or learned from, or even just informative about LessWrong.  Every claim needs an independent public source, as do any jargon or concepts that would give an informed listener information about the site, etc.; you have to find different analogies and check that citations are public and for all that you get pretty regular hostility anyway because of… well, there are plenty of misunderstandings and caricatures to go around."

Confidentiality is really, really hard to maintain.  Doing so while also engaging the public is terrifying.  I really admire the frontier labs folks who try to engage publicly despite that quite severe constraint, and really worry a lot as a policy guy about the incentives we're creating to make that even less likely in the future.

Comment by davekasten on Akash's Shortform · 2024-08-18T00:28:22.060Z · LW · GW

One question I have is whether Nancy Pelosi was asked and agreed to do this, or whether she identified this proactively as an opportunity to try to win back some tech folks to the Dem side.  The answer substantially changes our estimate of how much influence the labs have in this conversation. 

Comment by davekasten on Provably Safe AI: Worldview and Projects · 2024-08-18T00:26:56.268Z · LW · GW

I mean, I think it's worth doing an initial loose and qualitative discussion to make sure that you're thinking about overlapping spaces conceptually.  Otherwise, not worth the more detailed effort. 

Comment by davekasten on Provably Safe AI: Worldview and Projects · 2024-08-17T21:58:45.029Z · LW · GW

Unsolicited suggestion: it is probably useful for y'all to define further what "pick a lock" means -- e.g., if someone builds a custom defeat device that operates non-destructively but engages in a mechanical operation very surprising to someone thinking of traditional lock-picking methods, does that count? 

(I think you'd probably say yes, so long as the device isn't, e.g., a robot arm that's nondestructively grabbing the master key for the lock out of Zac's pocket and inserting it into the lock, but some sort of defining-in-advance would likely help.)

Nonetheless, I think this would be awesome as an open challenge at Defcon (I suspect you can convince them to Black Badge the challenge...)

Comment by davekasten on Fields that I reference when thinking about AI takeover prevention · 2024-08-17T16:23:13.318Z · LW · GW

Thanks, this is helpful!  I also think this helps me understand a lot better what is intended to be different about @Buck's research agenda from others', which I didn't understand previously.

Comment by davekasten on Fields that I reference when thinking about AI takeover prevention · 2024-08-17T01:13:00.899Z · LW · GW

So, I really, really am not trying to be snarky here but am worried this comment will come across this way regardless.  I think this is actually quite important as a core factual question given that you've been around this community for a while, and I'm asking you in your capacity as "person who's been around for a minute".  It's non-hyperbolically true that no one has published this sort of list before in this community?  

I'm asking, because if that's the case, someone should, e.g., just write a series of posts that just marches through US government best-practices documents on these domains (e.g., Chemical Safety Board, DoD NISPOM, etc.) and draws out conclusions on AI policy.   

Comment by davekasten on Akash's Shortform · 2024-08-14T22:32:47.072Z · LW · GW

I think I agree with much-to-all of this.  One further amplification I'd make about the last point: the culture of DC policymaking is one where people are expected to be quick studies and it's OK to be new to a topic; talent is much more funged from topic to topic in response to changing priorities than you'd expect.  Your Lesswrong-informed outside view of how much you need to know on a topic to start commenting on policy ideas is probably wrong.

(Yes, I know, someone is about to say "but what if you are WRONG about the big idea given weird corner case X or second-order effects Y?"   Look, reversed stupidity is not wisdom, but also, sometimes you can just quickly identify stupid-across-almost-all-possible-worlds ideas and convince people just not to do them rather than having to advocate for an explicit good-idea alternative.)

Comment by davekasten on AI #75: Math is Easier · 2024-08-01T22:02:46.417Z · LW · GW

Everyone assumes that it was named after Claude Shannon, but it appears they've never actually confirmed that.

Comment by davekasten on davekasten's Shortform · 2024-07-28T15:06:54.355Z · LW · GW

(This is not an endorsement of Jim Caviezel's beliefs, in case anyone somehow missed my point here.)

Comment by davekasten on davekasten's Shortform · 2024-07-28T15:06:23.565Z · LW · GW

I feel like one of the trivially most obvious signs that AI safety comms hasn't gone actually mainstream yet is that we don't say, "yeah, superintelligent AI is very risky.  No, I don't mean Terminator.  I'm thinking more Person of Interest, you know, that show with the guy from the Sound of Freedom and the other guy who was on Lost and Evil?"
 

Comment by davekasten on davekasten's Shortform · 2024-07-25T18:06:58.696Z · LW · GW

Yup!  I think those are potentially very plausible, and similar things were on my short list of possible explanations. I would be very not shocked if those are the true reasons.  I just don't think I have anywhere near enough evidence yet to actually conclude that, so I'm just reporting the random observation for now :)

Comment by davekasten on davekasten's Shortform · 2024-07-24T20:16:13.706Z · LW · GW

A random observation from a think tank event last night in DC -- the average person in those rooms is convinced there's a problem, but that it's the near-term harms, the AI ethics stuff, etc.  The highest-status and highest-rank people in those rooms seem to be much more concerned about catastrophic harms. 

This is a very weird set of selection effects.  I'm not sure what to make of it, honestly.

Comment by davekasten on Trust as a bottleneck to growing teams quickly · 2024-07-13T18:34:42.699Z · LW · GW

One item that I think I see missing from this list is what you might call "ritual" -- agreed-upon ways of knowing what to do in a given context that two members of an organization can have shared mental models of, whether or not they've worked together in the past.  This allows you to scale trust by reducing the amount of trust needed to handle the same amount of ambiguity, at some loss of flexibility. 

For example, when I was at McKinsey, calling a meeting a "problem solving" as opposed to a "status update" or a "steerco" would invoke three distinct sets of behaviors and expectations.  As a result, each participant had some sense of what other participants might by-default expect the meeting to feel like and be, and so even if participants hadn't worked much with each other in the past, they could know how to act in a trust-building way in that meeting context.  The flip side is that if the meeting needed something very different from the normal behaviors, it became slightly harder to break out of the default mode. 

Comment by davekasten on If AI starts to end the world, is suicide a good idea? · 2024-07-10T04:28:06.242Z · LW · GW

I would politely but urgently suggest that if you're thinking a lot about scenarios where you could justify suicide, you might not be as interested in the scenarios as in the permission you think they might give you.  And you might not realize that!  Motivated reasoning is a powerful force for folks who are feeling some mental troubles.  

This is the sort of thing where checking in with a loved one generally about how they perceive your general affect and mood is a really good idea.  I urge you to do that.  You're probably fine and just playing with some abstract ideas, but why not check in with a loved one just in case? 

Comment by davekasten on Reflections on Less Online · 2024-07-08T19:25:09.214Z · LW · GW

One of the things I greatly enjoyed about this writeup is that it reminded me how lovely the "empty-plate" vibe was, and that it's something I want to try to create more of in my own day-to-day.  

Tangible specific action: I have been raving about how much I loved the Lighthaven supply cabinets.  I literally just now purchased a set of organizers shaped for my own bookcases to be able to recreate a similar thing in my own home; thank you for your reminder that caused me to do this.

Comment by davekasten on Reflections on Less Online · 2024-07-08T19:04:52.438Z · LW · GW

I would like to politely request that, if you happen to have a chance to tell Leo's owner that Leo is clearly a very happy dog that feels loved, you please do so on my behalf. 

Comment by davekasten on jacquesthibs's Shortform · 2024-07-07T17:07:52.514Z · LW · GW

I am fairly skeptical that we don't already have something close enough to approximate this, if we had access to all the private email logs of the relevant institutions matched to some sort of "when this led to an outcome" metric (e.g., when the relevant preprint paper or strategy deck or whatever was released).

Comment by davekasten on davekasten's Shortform · 2024-07-01T19:20:58.257Z · LW · GW

You know, you're not the first person to make that argument to me recently.  I admit that I find it more persuasive than I used to.

Put another way: "will AI take all the jobs" is another way of saying* "will I suddenly lose the ability to feed and protect those I love."  It's an apocalypse in microcosm, and it's one that doesn't require a lot of theory to grasp.  

*Yes, yes, you could imagine universal basic income or whatever.  Do you think the average person is Actually Expecting to Get That? 

Comment by davekasten on davekasten's Shortform · 2024-07-01T01:21:18.702Z · LW · GW

I totally think it's true that there are warning shots that would be non-mass-casualty events, to be clear, and I agree that the scenarios you note could maybe be those.

(I was trying to use "plausibly" to gesture at a wide range of scenarios, but I totally agree the comment as written doesn't clearly convey that.)

I don't think folks intended anything Orwellian -- it's just sort of something we stumbled into -- and heck, if we can both be less Orwellian and be more compelling policy advocates at the same time, why not, I figure. 

Comment by davekasten on davekasten's Shortform · 2024-07-01T00:14:14.863Z · LW · GW

I really dislike the term "warning shot," and I'm trying to get it out of my vocabulary.  I understand how it came to be a term people use.  But, if we think it might actually be something that happens, and when it happens, it plausibly and tragically results in the deaths of many folks, isn't the right term "mass casualty event"? 

Comment by davekasten on The Incredible Fentanyl-Detecting Machine · 2024-06-30T15:11:34.225Z · LW · GW

I think this reveals something interesting about how US policymakers think about technology.  They don't really care how it works; they care that if they put budgetary dollars on this, they might plausibly get an outcome where (in combination with the social system that is the border and its policing) they get fentanyl detected.

Comment by davekasten on The Leeroy Jenkins principle: How faulty AI could guarantee "warning shots" · 2024-06-29T23:06:44.114Z · LW · GW

I am glad you wrote this, as I have been spending some time wondering about this possibility space. 

One more option: an AI can have a utility function where it seeks to maximize its time alive, and have enough cognition to think it is likely to die regardless once humans decide it is dangerous.  Even if it thinks it cannot win, it might seek to cause chaos that increases its total time to live.

Comment by davekasten on Buck's Shortform · 2024-06-28T23:04:09.962Z · LW · GW

This is definitely a tradeoff space!  

YES, there is a tradeoff here and yes, regulatory capture is real, but there are also plenty of benign agencies that balance these concerns fairly well.  Most people on these forums live in nations where regulators do a pretty darn good job on the well-understood problems.  (Inside Context Problems?)  

You tend to see regulatory processes designed to require stakeholder input; in particular, the modern American regulatory state's reliance on the Administrative Procedure Act means that it's very difficult for a regulator to regulate without getting feedback from a wide variety of external stakeholders, ensuring that they have some flexibility without being arbitrary. 

I also think, contrary to conventional wisdom, that your concern is part of why many regulators end up in a "revolving-door" mechanism -- you often want individuals moving back and forth between those two worlds to cross-populate assumptions and check for areas where regulation has gotten misaligned with end goals.

 

Comment by davekasten on Eli's shortform feed · 2024-06-20T04:28:11.275Z · LW · GW

No clue if true, but even if true, DARPA is not at all comparable to Intel.  It's an entity set up for very different purposes and engaging in very different patterns of capital investment.

Also very unclear to me why R&D is the relevant bucket.  Presumably buying GPUs is either capex or, if rented, is recognized under a different opex bucket (for secure cloud services) than R&D? 

My claim isn't that the USG is running its own research and fabs at levels of capability equivalent to Intel or TSMC.  It's just that if a war starts, it has access to plenty of GPUs through its own capacity and its ability to mandate borrowing of hardware at scale from the private sector.  

 

Comment by davekasten on Eli's shortform feed · 2024-06-19T20:14:02.322Z · LW · GW

I meant more "already in a data center," though probably some in a warehouse, too.

I roll to disbelieve that the people who read Hacker News in Ft. Meade, MD and have giant budgets aren't making some of the same decisions that people who read Hacker News in Palo Alto, CA and Redmond, WA would. 
 

Comment by davekasten on Eli's shortform feed · 2024-06-19T04:11:57.740Z · LW · GW

As you note, TSMC is building fabs in the US (and Europe) to reduce this risk.

I also think that it's worth noting that, at least in the short run, if the US didn't have shipments of new chips and was at war, the US government would just use wartime powers to take existing GPUs from whichever companies they felt weren't using them optimally for war and give them to the companies (or US Govt labs) that are.  

Plus, are you really gonna bet that the intelligence community and DoD and DoE don't have a HUUUUGE stack of H100s? I sure wouldn't take that action.

Comment by davekasten on jacquesthibs's Shortform · 2024-06-14T00:54:40.995Z · LW · GW

I think I am very doubtful of the ability of outsiders to correctly predict -- especially outsiders new to government contracting -- what the government might pull in.  I'd love to be wrong, though!  Someone should try it, and I think I was probably too definitive in my comment above.

Comment by davekasten on jacquesthibs's Shortform · 2024-06-13T20:19:42.599Z · LW · GW

If you think nationalization is near and the default, you shouldn't try to build projects and hope they get scooped into the nationalized thing.  You should try to directly influence the policy apparatus through writing, speaking on podcasts, and getting to know officials in the agencies most likely to be in charge of that.

(Note: not a huge fan of nationalization myself due to red-queen's-race concerns)

Comment by davekasten on davekasten's Shortform · 2024-06-07T20:39:02.974Z · LW · GW

I totally understand your point, agree that many folks would use your phrasing, and nonetheless think there is something uniquely descriptively true about the phrasing I chose and I stand by it.

Comment by davekasten on davekasten's Shortform · 2024-06-07T20:35:18.410Z · LW · GW

Say more? 

Comment by davekasten on How was Less Online for you? · 2024-06-05T18:22:07.236Z · LW · GW

I personally did not find it messy compared to similar events I have gone to; not going to tell you your opinions are wrong to feel or to hold, just wanted to avoid having one person's opinion get fossilized as everyone's opinion :)