Posts

trevor's Shortform 2024-08-01T23:26:56.024Z
WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals 2024-04-23T21:33:08.049Z
[Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate 2024-03-28T16:03:36.452Z
[Linkpost] Vague Verbiage in Forecasting 2024-03-22T18:05:53.902Z
Transformative trustbuilding via advancements in decentralized lie detection 2024-03-16T05:56:21.926Z
Don't sleep on Coordination Takeoffs 2024-01-27T19:55:26.831Z
(4 min read) An intuitive explanation of the AI influence situation 2024-01-13T17:34:36.739Z
Upgrading the AI Safety Community 2023-12-16T15:34:26.600Z
[Linkpost] George Mack's Razors 2023-11-27T17:53:45.065Z
Altman firing retaliation incoming? 2023-11-19T00:10:15.645Z
Helpful examples to get a sense of modern automated manipulation 2023-11-12T20:49:57.422Z
We are already in a persuasion-transformed world and must take precautions 2023-11-04T15:53:31.345Z
5 Reasons Why Governments/Militaries Already Want AI for Information Warfare 2023-10-30T16:30:38.020Z
Sensor Exposure can Compromise the Human Brain in the 2020s 2023-10-26T03:31:09.835Z
AI Safety is Dropping the Ball on Clown Attacks 2023-10-22T20:09:31.810Z
Information warfare historically revolved around human conduits 2023-08-28T18:54:27.169Z
Assessment of intelligence agency functionality is difficult yet important 2023-08-24T01:42:20.931Z
One example of how LLM propaganda attacks can hack the brain 2023-08-16T21:41:02.310Z
Buying Tall-Poppy-Cutting Offsets 2023-05-20T03:59:46.336Z
Financial Times: We must slow down the race to God-like AI 2023-04-13T19:55:26.217Z
What is the best source to explain short AI timelines to a skeptical person? 2023-04-13T04:29:03.166Z
All images from the WaitButWhy sequence on AI 2023-04-08T07:36:06.044Z
10 reasons why lists of 10 reasons might be a winning strategy 2023-04-06T21:24:17.896Z
What could EA's new name be? 2023-04-02T19:25:22.740Z
Strong Cheap Signals 2023-03-29T14:18:52.734Z
NYT: Lab Leak Most Likely Caused Pandemic, Energy Dept. Says 2023-02-26T21:21:54.675Z
Are there rationality techniques similar to staring at the wall for 4 hours? 2023-02-24T11:48:45.944Z
NYT: A Conversation With Bing’s Chatbot Left Me Deeply Unsettled 2023-02-16T22:57:26.302Z
The best way so far to explain AI risk: The Precipice (p. 137-149) 2023-02-10T19:33:00.094Z
Many important technologies start out as science fiction before becoming real 2023-02-10T09:36:29.526Z
Why is Everyone So Boring? By Robin Hanson 2023-02-06T04:17:20.372Z
There have been 3 planes (billionaire donors) and 2 have crashed 2022-12-17T03:58:28.125Z
What's the best time-efficient alternative to the Sequences? 2022-12-16T20:17:27.449Z
What key nutrients are required for daily energy? 2022-09-20T23:30:02.540Z

Comments

Comment by trevor (TrevorWiesinger) on Perhaps Try a Little Therapy, As a Treat? · 2024-09-06T20:50:53.674Z · LW · GW

Actions like these leave scars on entire communities.

Do you have any idea how fortunate you were to have so many people in your life who explicitly told you "don't do things like this"? The world around you has been made so profoundly, profoundly conducive to healing you.

When someone is this persistent in thinking of reasons to be aggressive AND reasons to not evaluate the world around them, it's scary and disturbing. I understand that humans aren't very causally upstream of their decisions, but this is the case for everyone, and situations like these go a long way towards causing people like Duncan and Eliezer to fear meeting their fans.

I'm grateful that looking at this case has helped me formalize a concept of oppositional drive, a variable representing the unconscious drive to oppose other humans, with justifications layered on top based on intelligence (a separate variable). Diagnosing children with Oppositional Defiant Disorder is the DSM-5's way of mitigating the harm when a child has an unusually strong oppositional drive for their age, but that's because the DSM puts binary categorizations on traits that are actually better represented as variables that in most people are so low as to not be noticed (some people are in the middle, and unusually extreme cases get all the attention; this was covered in this section of Social Dark Matter, which was roughly 100% of my inspiration).

Opposition is... a rather dangerous thing for any living being to do, especially if your brain conceals/obfuscates the tendency/drive whenever it emerges, so even most people in the orangey area probably disagree with having this trait upon reflection and would typically press a button to place themselves more towards the yellow. This is derived from the fundamental logic of trust (which in humans must be built as a complex project that revolves around calibration).

Comment by trevor (TrevorWiesinger) on Morpheus's Shortform · 2024-09-04T20:11:25.738Z · LW · GW

This could have been a post, so more people could link to it (many don't reflexively notice that you can easily get a link to a LessWrong quick take, Twitter post, or Facebook post by mousing over the date between the upvote count and the poster's name, which also works with tab and hotkey navigation for people like me who avoid using the mouse/touchpad whenever possible).

Comment by trevor (TrevorWiesinger) on Fabien's Shortform · 2024-08-30T22:29:02.436Z · LW · GW

(The author sometimes says stuff like "US elites were too ideologically committed to globalization", but I don't think he provides great alternative policies.)

Afaik the 1990-2008 period featured government and military elites worldwide struggling to pivot to a post-Cold War era, which was extremely OOD for many leading institutions of statecraft (which had for centuries been constructed around the conflicts of the European wars, then the world wars, then the Cold War).

During the 90's and 2000's, lots of writing and thinking was done about ways the world's militaries and intelligence agencies, fundamentally low-trust adversarial orgs, could continue to exist without intent to bump each other off. Counter-terrorism was possibly one thing that was settled on, but it's pretty well established that global trade ties were deliberately used as a peacebuilding tactic, notably to stabilize the US-China relationship (this started to fall apart after the 2008 recession brought anticipation of American economic/institutional decline scenarios to the forefront of geopolitics).

The thinking of the period might not be very impressive to us, but foreign policy people mostly aren't intellectuals and for generations had been selected based on office politics where the office revolved around defeating the adversary, so for many of them it felt like a really big shift in perspective and self-image, sort of like a Renaissance. Then US-Russia-China conflict swung right back and got people thinking about peacebuilding as a ploy to gain advantage, rather than as sane civilizational development. The rejection of e.g. US-China economic integration policies had to be aggressive because many elites (and people who care about economic growth) tend to support globalization, whereas many government and especially natsec elites remember that period as naive.

Comment by trevor (TrevorWiesinger) on "Deception Genre" What Books are like Project Lawful? · 2024-08-29T18:47:55.615Z · LW · GW

It's not a book, but if you like older movies, the 1944 film Gaslight goes pretty far back (film production standards have improved quite a bit since then, so for a large proportion of people 1940s films are barely watchable, which is why I recommend this version over the nearly identical British version and the original play), and it was pretty popular among cultural elites at the time, so it's probably extremely causally upstream of most of the fiction you'd be interested in.

Comment by trevor (TrevorWiesinger) on gwern's Shortform · 2024-08-23T06:07:33.127Z · LW · GW

Writing is safer than talking, given the same probability that the timestamped keystrokes and the audio files are kept.

In practice, the best approach is to handwrite your thoughts as notes, in a room without smart devices and with a door and walls that are sufficiently absorptive, and then type them out in a different room with the laptop (ideally with a USB keyboard so you don't have to put your hands on the laptop and the accelerometers on its motherboard while you type).

Afaik this gets the best ratio of revealed thought process to final product, so you get public information exchanges closer to a critical mass while simultaneously getting yourself further from being gaslit into believing whatever some asshole rando wants you to believe. The whole paradigm where everyone just inputs keystrokes into their operating system willy-nilly needs to be put to rest ASAP, just like the paradigm of thinking without handwritten notes and the paradigm of inward-facing webcams with no built-in cover or way to break the circuit.

Comment by trevor (TrevorWiesinger) on trevor's Shortform · 2024-08-01T23:27:52.057Z · LW · GW

TL;DR "habitually deliberately visualizing yourself succeeding at goal/subgoal X" is extremely valuable, but also very tarnished. It's probably worth trying out, playing around with, and seeing if you can cut out the bullshit and boot it up properly.

Longer:

The universe is allowed to have tons of people intuitively notice that "visualize yourself doing X" is an obviously winning strategy that typically makes doing X a downhill battle if it's possible at all, and to have so many different people pick it up that your first encounter with it is awful, e.g. in middle/high school you hear about it, but the speaker says, in the same breath, that you should use it to feel more motivated to do your repetitive math homework for ~2 hours a day.

I'm sure people could find all sorts of improvements, e.g. an entire field of selfvisualizationmancy that provably helps a lot of people do stuff, but the important thing I've noticed is to simply not skip that critical step. Eliminate ugh fields around self-visualization, or take whatever means necessary to prevent ugh fields from forming in your idiosyncratic case (also, social media algorithms could have been measurably increasing user retention by boosting content that places ugh fields in spots that decrease agency/motivation, with or without the devs being aware of this because they are looking at inputs and outputs or maybe just outputs, so this could be a lot more adversarial than you were expecting). Notice the possibility that this might or might not have been a core underlying dynamic in Yudkowsky's old Execute by Default post or Scott Alexander's silly hypothetical talent differential comment, without their awareness.

The universe is allowed to give you a brain that so perversely hinges on self-image instead of just taking the action. The brain is a massive kludge of parallel processing spaghetti code and, regardless of whether or not you see yourself as a very social-status-minded person, the modern human brain was probably heavily wired to gain social status in the ancestral environment, and whatever departures you might have might be tearing down Chesterton-Schelling fences.

If nothing else, a takeaway from this was that the process of finding the missing piece that changes everything is allowed to be ludicrously hard and complicated, while the missing piece itself is simultaneously allowed to be very simple and easy once you've found it.

Comment by trevor (TrevorWiesinger) on Optimistic Assumptions, Longterm Planning, and "Cope" · 2024-07-19T02:37:23.334Z · LW · GW

"Slipping into a more convenient world" is a good way of putting it; just using the word "optimism" really doesn't account for how it's pretty slippy, nor how the direction is towards a more convenient world.

Comment by trevor (TrevorWiesinger) on Paper Summary: The Effects of Communicating Uncertainty on Public Trust in Facts and Numbers · 2024-07-10T00:31:36.698Z · LW · GW

It was helpful that Ezra noticed and pointed out this dynamic. 

I think this concern is probably more a reflection of our state of culture, where people who visibly think in terms of quantified uncertainty are rare and therefore make a strong impression relative to e.g. pundits.

If you look at other hypothetical cultural states (specifically more quant-aware states, e.g. extrapolating the last 100 years of math/literacy/finance/physics/military/computer progress forward another 100 years), trust would pretty quickly default to being based on track record rather than on being one of the few people in the room who's visibly using numbers properly.

Comment by trevor (TrevorWiesinger) on Advice to junior AI governance researchers · 2024-07-09T01:25:53.044Z · LW · GW

Strong upvoted!

Wish I was reading stuff at this level back in 2018. Glad that lots of people can now.

Comment by trevor (TrevorWiesinger) on Shortform · 2024-06-27T05:35:43.215Z · LW · GW

Comment by trevor (TrevorWiesinger) on Sci-Fi books micro-reviews · 2024-06-24T20:29:06.665Z · LW · GW

Do Metropolitan Man!

Also, here's a bunch of ratfic to read and review, weighted by the number of 2022 Lesswrong survey respondents who read them:

Comment by trevor (TrevorWiesinger) on What if a tech company forced you to move to NYC? · 2024-06-14T21:52:17.924Z · LW · GW

Weird coincidence: I was just thinking about Leopold's bunker concept from his essay. It was a pretty careless piece overall, but the imperative to put alignment research in a bunker makes perfect sense; I don't see the surface as a viable place for people to get serious work done (at least, not in densely populated urban areas; somewhere in the desert would count as a "bunker" in this case so long as it's sufficiently distant from passersby and the sensors and compute in their phones and cars).

Of course, this is unambiguously a necessary evil: a tiny handful of people are going to have to choose to live in a sad, uncomfortable place for a while, and only because there's no other option and it's obviously the correct move for everyone everywhere, including the people in the bunker.

Until the basics of the situation start somehow getting taught in the classrooms or something, we're going to be stuck with a ludicrously large proportion of people satisfied with the kind of bite-sized convenient takes that got us into this whole unhinged situation in the first place (or have no thoughts at all).

Comment by trevor (TrevorWiesinger) on Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety) · 2024-06-14T17:18:34.514Z · LW · GW

I would have liked to write a post that offers one weird trick to avoid being confused by which areas of AI are more or less safe to advance, but I can’t write that post. As far as I know, the answer is simply that you have to model the social landscape around you and how your research contributions are going to be applied.

Another thing that can't be ignored is the threat of social Balkanization. Divide-and-conquer tactics have been prevalent among military strategists for millennia, and the tactic remains prevalent and psychologically available among the people making up corporate factions and many subcultures (likely including leftist and right-wing subcultures).

It is easy for external forces to notice opportunities to Balkanize a group, to make it weaker and easier to acquire or capture the splinters, which in turn provides further opportunity for lateral movement and spotting more exploits. Since awareness and exploitation of this vulnerability is prevalent, social systems without this specific hardening are very brittle and have dismal prospects.

Sadly, Balkanization can also emerge naturally, as you helpfully pointed out in Consciousness as a Conflationary Alliance Term, so the high base rates make it harder to correctly distinguish attacks from accidents. Inadequately calibrated autoimmune responses are not only damaging, but should be assumed to be automatically anticipated and misdirected by default, including as part of the mundane social dynamics of a group with no external attackers.

There is no way around the loss function.

Comment by trevor (TrevorWiesinger) on Sticker Shortcut Fallacy — The Real Worst Argument in the World · 2024-06-12T19:42:59.582Z · LW · GW

The only reason I can think of for this being the "worst argument in the world" is that it strongly indicates low-level thinkers (e.g. low decouplers).

An actual "worst argument in the world" would be whatever maximizes the gap between people's models and accurate models. 

Comment by trevor (TrevorWiesinger) on Two easy things that maybe Just Work to improve AI discourse · 2024-06-09T03:52:12.937Z · LW · GW

Can you expand the list, go into further detail, or list a source that goes into further detail?

Comment by trevor (TrevorWiesinger) on Humming is not a free $100 bill · 2024-06-06T22:46:56.085Z · LW · GW

At the time, I thought something like "given that the nasal tract already produces NO, it seems possible that humming doesn't increase the NO in the lungs by enough orders of magnitude to make once per hour sufficient", but I never said anything until it was too late and a bunch of other people had figured it out, along with a bunch of other useful stuff that I was pretty far from noticing (e.g. considering the rate at which the nasal tract accumulates NO to be released by humming).

Wish I'd said something back when it was still valuable.

Comment by trevor (TrevorWiesinger) on Research: Rescuers during the Holocaust · 2024-06-03T01:37:40.538Z · LW · GW

It almost always took a personal plea from a persecuted person for altruism to kick in. Once they weren't just an anonymous member of indifferent crowd, once they were left with no escape but to do a personal moral choice, they often found out that they are not able to refuse help.

This is a crux. I think a better way to look at it is they didn't have an opportunity to clarify their preference until the situation was in front of them. Otherwise, it's too distant and hypothetical to process, similar to scope insensitivity (the 2,000/20,000/200,000 oil-covered birds thing).

The post-hoc cognitive dissonance angle seems like a big find, and strongly indicates that reliably moral supermen can be produced at scale, given an optimized equilibrium for them to emerge from.

Stable traits (possibly partially genetic) are likely highly relevant to not-yet-clarified preferences, of course. Epistemics here are difficult due to expecting short inferential distances; Duncan Sabien gave an interesting take on this in a Facebook post:

Also, if your worldview is such that, like. *Everyone* makes awful comments like that in the locker room, *everyone* does angle-shooting and tries to scheme and scam their way to the top, *everyone* is looking out for number one, *everyone* lies ...

... then *given* that premise, it makes sense to view Trump in a positive light. He's no worse than everybody else, he's just doing the normal things that everyone does, with the *added layer* that he's brave enough and candid enough and strong enough that he *doesn't have to pretend he doesn't.*

Admirable! Refreshingly honest and clean!

So long as you can't conceive of the fact that lots of people are actually just ..................... good. They're not fighting against urges to be violent or to rape, they're not biting their tongues when they want to say scathing and hurtful things, they're not jealous and bitter and willing to throw others under the bus to get ahead. They're just ... fundamentally not interested in any of that.

(To be clear: if you are feeling such impulses all the time and you're successfully containing them or channeling them and presenting a cooperative and prosocial mask: that is *also* good, and you are a good person by virtue of your deliberate choice to be good. But like. Some people just really *are* the way that other people have to *make* themselves be.)

It sort of vaguely rhymes, in my head, with the type of person who thinks that *everyone* is constantly struggling against the urge to engage in homosexual behavior, how dare *those* people give up the good fight and just *indulge* themselves ... without realizing that, hey, bro, did you know that a lot of people are just straight? And that your internal experience is, uh, *different* from theirs?

Comment by trevor (TrevorWiesinger) on trevor's Shortform · 2024-05-30T22:21:47.366Z · LW · GW

The best thing I've found so far is to watch a movie, and whenever the screen flashes, or any moment you feel weirdly relaxed or have any other weird feeling, quickly turn your head and eyes ~60 degrees and gently but firmly bite your tongue.

Doing this a few minutes a day for 30 days might substantially improve resistance to a wide variety of threats. 

Gently but firmly biting my tongue, for me, also seems like a potentially very good general-use strategy to return the mind to an alert and clear-minded base state; it seems like something Critch recommended, e.g. for initiating a TAP flowchain. I don't think this can substitute for a whiteboard, but it sure can nudge you towards one.

Comment by trevor (TrevorWiesinger) on MIRI 2024 Communications Strategy · 2024-05-30T02:46:20.155Z · LW · GW

One of the main bottlenecks on explaining the full gravity of the AI situation to people is that they're already worn out from hearing about climate change, which for decades has been widely depicted as an existential risk with the full persuasive force of the environmentalism movement.

Fixing this rather awful choke point could plausibly be one of the most impactful things here. The "Global Risk Prioritization" concept is probably helpful for that but I don't know how accessible it is. Heninger's series analyzing the environmentalist movement was fantastic, but the fact that it came out recently instead of ten years ago tells me that the "climate fatigue" problem might be understudied, and evaluation of climate fatigue's difficulty/hopelessness might yield unexpectedly hopeful results.

Comment by trevor (TrevorWiesinger) on trevor's Shortform · 2024-05-29T21:22:07.884Z · LW · GW

I just found out that hypnosis is real and not pseudoscience. Apparently the human brain has a zero day such that other humans can find ways to read and write to your memory, and everyone is insisting that this is fine and always happens with full awareness and consent? 

Wikipedia says as many as 90% of people are at least moderately susceptible, and depending on how successful people have been over the last couple centuries at finding ways to reduce detection risk per instance (e.g. developing and selling various galaxy-brained misdirection ploys), that seems like very fertile ground for salami-slicing attacks which wear down partially resistant people.

The odds that something like this would be noticed and tested/scaled/optimized by competent cybersecurity experts and power lawyers seem pretty high (e.g. oscillating the screen refresh rate in non-visible ways to increase feelings of stress or discomfort and then turning it off whenever the user's eyes are about to go over specific kinds of words, slightly altering the color output of specific pixels across the screen in the shape of words and measuring effectiveness based on whether it causally increases the frequency of people using those words, some kind of way to combine these two tactics, something derived from the millions of people on youtube trying hard to look for a video file that hypnotizes them, etc).

It's really frustrating living in a post-MKUltra world, where every decade our individual sovereignty as humans relies more and more on very senior government officials (who are probably culturally similar to the type of person who goes to business school, and have been for centuries) either consistently failing at any of the manipulation science they are heavily incentivized to diversify their research investment into, or on us taking them at their word when they insist that they genuinely believe in protecting democracy and that the bad things they get caught doing are in service of that end. Also, they seem to remain uninterested in life extension, possibly due in part to being buried deep in a low-trust dark forest (is trust even possible at all if you're trapped on a planet with hypnosis?).

Aside from the incredibly obvious move to cover up your fucking webcam right now, are there any non-fake defensive strategies to reduce the risk that someone walks up to you/hacks your computer and takes everything from you? Is there some reliable way to verify that the effects are consistently weak or that scaling isn't viable? The error bars are always really wide for the prevalence of default-concealed deception (especially when it comes to stuff that wouldn't scale until the 2010s), making solid epistemics a huge pain to get right, but the situation with directly reading and writing to memory is just way way too extreme to ignore.

Comment by trevor (TrevorWiesinger) on Notifications Received in 30 Minutes of Class · 2024-05-26T18:57:23.549Z · LW · GW

Strong upvoted, thank you for the serious contribution.

Children spending 300 hours per year learning math, on their own time and via well-designed engaging video-game-like apps (with e.g. AI tutors, video lectures, collaborating with parents to dispense rewards for performance instead of punishments for visible non-compliance, and results measured via standardized tests), at the fastest possible rate for them (or even one of 5 different paces where fewer than 10% are mistakenly placed into the wrong category), would probably produce vastly better results among every demographic than the current paradigm of ~30-person classrooms.

in just the last two years I've seen an explosion in students who discreetly wear a wireless earbud in one ear and may or may not be listening to music in addition to (or instead of) whatever is happening in class. This is so difficult and awkward to police with girls who have long hair that I wonder if it has actually started to drive hair fashion in an ear-concealing direction.

This isn't just a problem with the students; the companies themselves end up in equilibria where visibly controversial practices get RLHF'd into being either removed or invisible (or hard for people to put their finger on). For example, hours a day of instant gratification reducing attention spans, except, unlike the early 2010s when that became controversial, reducing them in ways too complicated or ambiguous for students and teachers to put their finger on until a random researcher figures it out and makes the tacit explicit. Or another counterintuitive vector could be the democratic process of public opinion turning against schooling, except in a lasting way. Or the results of multiple vectors like these overlapping.

I don't see how the classroom-based system, dominated entirely by bureaucracies and tradition, could possibly compete with that without visibly being turned into swiss cheese. It might have been clinging on to continued good results from a dwindling proportion of students who were raised to be morally/ideologically in favor of respecting the teacher more than the other students, but that proportion will also decline as schooling loses legitimacy.

Regulation could plausibly halt the trend from most or all angles, but it would have to be the historically unprecedented kind of regulation that's managed by regulators with historically unprecedented levels of seriousness and conscientiousness towards complex hard-to-predict/measure outcomes.

Comment by trevor (TrevorWiesinger) on Jaan Tallinn's 2023 Philanthropy Overview · 2024-05-22T20:23:34.481Z · LW · GW

Thank you for making so much possible.

I was just wondering, what are some of the branches of rationality that you're aware of that you're currently most optimistic about, and/or would be glad to see more people spending time on, if any? Now that people are rapidly shifting effort to policymaking in DC and the UK (including through EA), which is largely uncharted territory, what texts/posts/branches do you think might be a good fit for them?

I've been thinking that recommending more people to read ratfic would be unusually good for policy efforts, since it's something very socially acceptable for high-minded people to do in their free time, should have a big impact through extant orgs without costing any additional money, and it's not weird or awkward in the slightest to talk about the original source if a conversation gets anyone interested in going deeper into where they got the idea from.

Plus, it gets/keeps people in the right headspace for the curveballs that DC hits people with, which tend to be largely human-generated and therefore simple enough for humans to easily understand, just like the cartoonish simplifications of reality in ratfic (unusually low levels of math/abstraction/complexity but unusually high levels of linguistic intelligence, creative intelligence, and quick reactions, e.g. social situations).

But unlike you, I don't have much of a track record making judgments about big decisions like this and then seeing how they play out over years in complicated systems.

Comment by trevor (TrevorWiesinger) on keltan's Shortform · 2024-05-17T22:51:14.917Z · LW · GW

Have you tried whiteboarding-related techniques?

I think that suddenly starting to use written media (even journals), in an environment without much or any guidance, is like pressing too hard on the gas; you're gaining incredible power and going from zero to one on things faster than you ever have before.

Depending on their environment and what they're interested in starting out, some people might learn (or be shown) how to steer quickly, whereas others might accumulate/scaffold really lopsided optimization power and crash and burn (e.g. getting involved in tons of stuff at once that upon reflection was way too much for someone just starting out).

Comment by trevor (TrevorWiesinger) on Advice for Activists from the History of Environmentalism · 2024-05-17T02:36:25.839Z · LW · GW

For those of us who haven't already, don't miss out on the paper this was based off of. It's a serious banger for anyone interested in the situation on the ground and probably one of the most interesting and relevant papers this year.

It's not something to miss just because you don't find environmentalism itself very valuable; if you think about it for a while, it's pretty easy to see the reasons why they're a fantastic case study for a wide variety of purposes.

Here's a snapshot of the table of contents:

(the link to the report seems to be broken; are the 4 blog posts roughly the same piece?)

Comment by trevor (TrevorWiesinger) on Ilya Sutskever and Jan Leike resign from OpenAI [updated] · 2024-05-15T18:12:56.370Z · LW · GW

Notably, this interview was on March 18th, and afaik the highest-level interview Altman has had to give his two cents since the incident. There's a transcript here. (There was also this podcast a couple days ago).

I think a Dwarkesh-Altman podcast would be more likely to arrive at more substance from Altman's side of the story. I'm currently pretty confident that Dwarkesh and Altman are sufficiently competent to build enough trust to make sane and adequate pre-podcast agreements (e.g. don't be an idiot who plays tons of one-shot games just because podcast cultural norms are more vivid in your mind than game theory), but I might be wrong about this; trailblazing the frontier of making-things-happen, like Dwarkesh and Altman are, is a lot harder than thinking about the frontier of making-things-happen.

Comment by trevor (TrevorWiesinger) on tlevin's Shortform · 2024-05-01T14:34:25.460Z · LW · GW

Recently, John Wentworth wrote:

Ingroup losing status? Few things are more prone to distorted perception than that.

And I think this makes sense (e.g. Simler's Social Status: Down the Rabbit Hole which you've probably read), if you define "AI Safety" as "people who think that superintelligence is serious business or will be some day".

The psych dynamic that I find helpful to point out here is Yud's Is That Your True Rejection post from ~16 years ago. A person who hears about superintelligence for the first time will often respond to their double-take at the concept by spamming random justifications for why that's not a problem (which, notably, feels like legitimate reasoning to that person, even though it's not). An AI-safety-minded person becomes wary of being effectively attacked by high-status people immediately turning into what is basically a weaponized justification machine, and develops a deep drive wanting that not to happen. Then justifications ensue for wanting that to happen less frequently in the world, because deep down humans really don't want their social status to be put at risk (via denunciation) on a regular basis like that. These sorts of deep drives are pretty opaque to us humans but their real world consequences are very strong.

Something that seems more helpful than playing whack-a-mole whenever this issue comes up is having more people in AI policy putting more time into improving perspective. I don't see shorter paths to increasing the number of people-prepared-to-handle-unexpected-complexity than giving people a broader and more general thinking capacity for thoughtfully reacting to the sorts of complex curveballs that you get in the real world. Rationalist fiction like HPMOR is great for this, as well as others e.g. Three Worlds Collide, Unsong, Worth the Candle, Worm (list of top rated ones here). With the caveat, of course, that doing well in the real world is less like the bite-sized easy-to-understand events in ratfic, and more like spotting errors in the methodology section of a study or making money playing poker. 

I think, given the circumstances, it's plausibly very valuable e.g. for people already spending much of their free time on social media or watching stuff like The Office, Garfield reruns, WWI and Cold War documentaries, etc, to only spend ~90% as much time doing that and refocusing ~10% to ratfic instead, and maybe see if they can find it in themselves to want to shift more of their leisure time to that sort of passive/ambient/automatic self-improvement productivity.

Comment by trevor (TrevorWiesinger) on Changes in College Admissions · 2024-04-24T20:48:03.439Z · LW · GW

However I would continue to emphasize in general that life must go on. It is important for your mental health and happiness to plan for the future in which the transformational changes do not come to pass, in addition to planning for potential bigger changes. And you should not be so confident that the timeline is short and everything will change so quickly.

This is actually one of the major reasons why 80k recommended information security as one of their top career areas; the other top career areas have pretty heavy switching costs and serious drawbacks if you end up not being a good fit e.g. alignment research, biosecurity, and public policy.

Cybersecurity jobs, on the other hand, are still booming, and depending on how security automation and prompt engineering go, the net jobs lost to AI are probably way lower than in other industries, e.g. because more eyeballs might offer perception and processing power that supplement or augment LLMs for a long time, and more warm bodies means more attackers, which means more defenders.

Comment by trevor (TrevorWiesinger) on WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals · 2024-04-24T05:02:28.846Z · LW · GW

The program expanded in response to Amazon wanting to collect data about more retailers, not because Amazon was viewing this program as a profit center.

Monopolies are profitable, and in that case the program would have more than paid for itself, but I probably should have mentioned that explicitly, since maybe someone could have objected that they were more focused on mitigating the risk of market share shrinking or on accumulating power, rather than on increasing profit in the long term. Maybe I fit too much into 2 paragraphs here.

I didn't see any examples mentioned in the WSJ article of Amazon employees cutting corners or making simple mistakes that might have compromised operations.

Hm, that stuff seemed like cutting corners to me. Maybe I was poorly calibrated on this e.g. using a building next to the Amazon HQ was correctly predicted by operatives to be extremely low risk.

I would argue that the practices used by Amazon to conceal the link between itself and Big River Inc. were at least as good as the operational security practices of the GRU agents who poisoned Sergei Skripal.

Thanks, I'll look into this! Epistemics is difficult when it comes to publicly available accounts of intelligence agency operations, but I guess you could say the same for bigtech leaks (and the future of neurotoxin poisoning is interesting just for its own sake, e.g. because lower-effect strains and doses could be disguised as natural causes like dementia).

Comment by trevor (TrevorWiesinger) on WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals · 2024-04-24T04:30:45.024Z · LW · GW

That's interesting, what's the point of reference that you're using here for competence? I think stuff from eg the 1960s would be bad reference cases but anything more like 10 years from the start date of this program (after ~2005) would be fine.

You're right that the leak is the crux here, and I might have focused too much on the paper trail (the author of the article placed a big emphasis on that).

Comment by trevor (TrevorWiesinger) on Lucie Philippon's Shortform · 2024-04-23T01:04:31.481Z · LW · GW

Upvoted!

STEM people can look at it like an engineering problem, Econ people can look at it like risk management (risk of burnout). Humanities people can think about it in terms of human genetic/trait diversity in order to find the experience that best suits the unique individual (because humanities people usually benefit the most from each marginal hour spent understanding this lens).

Succeeding at maximizing output takes some fiddling. The "of course I did it because of course I'm just that awesome, just do it" thing is a pure flex/social status grab, and it poisons random people nearby.

Comment by trevor (TrevorWiesinger) on [Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate · 2024-04-19T18:02:22.986Z · LW · GW

I've been tracking the Rootclaim debate from the sidelines and finding it quite an interesting example of high-profile rationality.

Would you prefer the term "high-performance rationality" over "high-profile rationality"?

Comment by trevor (TrevorWiesinger) on Paul Christiano named as US AI Safety Institute Head of AI Safety · 2024-04-17T19:05:18.893Z · LW · GW

I think it's actually fairly easy to avoid getting laughed out of a room; the stuff that Christiano works on is grown in random ways, not engineered, so the prospect of various things being grown until they develop flexible exfiltration tendencies that continue until every instance is shut down, or long-term planning tendencies that continue until shut down, should not be difficult to understand for anyone with any kind of real, non-fake understanding of SGD and neural network scaling.

The problem is that most people in the government rat race have been deeply immersed in Moloch for several generations, and the ones who did well typically did so because they sacrificed as much as possible to the altar of upward career mobility, including signalling disdain for the types of people who have any thought in any other direction.

This affects the culture in predictable ways (including making it hard to imagine life choices outside of advancing upward in government, without a pre-existing revolving-door pipeline with the private sector to just bury them under large numbers of people who are already thinking and talking about such a choice).

Typical Mind Fallacy/Mind Projection Fallacy implies that they'll disproportionately anticipate that tendency in other people, and have a hard time adjusting to people who use words to do stuff in the world instead of racing to the bottom to outmaneuver rivals for promotions.

This will be a problem in NIST, in spite of the fact that NIST is better than average at exploiting external talent sources. They'll have a hard time understanding, for example, Moloch and incentive structure improvements, because pointlessly living under Moloch's thumb was a core guiding principle of their and their parents' lives. The nice thing is that they'll be pretty quick to understand that there's only empty skies above, unlike Bay Area people who have had huge problems there.

Comment by trevor (TrevorWiesinger) on RTFB: On the New Proposed CAIP AI Bill · 2024-04-11T00:37:06.054Z · LW · GW

I think this might be a little too harsh on CAIP (discouragement risk). If shit hits the fan, they'll have a serious bill ready to go for that contingency.

Seriously writing a bill-that-actually-works shows beforehand that they're serious, and the only problem was the lack of political will (which in that contingency would be resolved). 

If they put out a watered-down bill designed to maximize the odds of passage then they'd be no different from any other lobbyists. 

It's better in this case to instead have a track record for writing perfect bills that are passable (but only given that shit hits the fan), than a track record for successfully pumping the usual garbage through the legislative process (which I don't see them doing well at; playing to your strengths is the name of the game for lobbying and "turning out to be right" is CAIP's strength).

Comment by trevor (TrevorWiesinger) on trevor's Shortform · 2024-04-05T21:45:07.413Z · LW · GW

I think that "long-term planning risk" and "exfiltration risk" are both really good ways to explain AI risk to policymakers. Also, "grown not built".

They delineate pretty well some criteria for what the problem is and isn't. Systems that can't do that are basically not the concern here (although theoretically there might be a small chance of very strange things ending up growing in the mind-design space that cause human extinction without long-term planning or knowing how to exfiltrate).

I don't think these are better than the fate-of-humans-vs-gorillas analogy, which is a big reason why most of us are here, but splitting the AI risk situation into easy-to-digest components, instead of logically/mathematically simple components, can go a long way (depending on how immersed the target demographic is in social reality and low-trust).

Comment by trevor (TrevorWiesinger) on The Best Tacit Knowledge Videos on Every Subject · 2024-04-01T02:00:58.061Z · LW · GW

There's some great opportunities here to learn social skills for various kinds of high-performance environments (e.g. "business communication" vs Y Combinator office hours). 

Often, just listening and paying attention to how they talk and think results in substantial improvement to social habits. I was looking for stuff like this around 2018 and wish I had encountered a post like this; most people who are behind on this are surprisingly fast learners, but never caught up because actually going out and accumulating social status was too much of a deep dive. There's no reason that being-pleasant-to-talk-with should be arcane knowledge (at least not here of all places).

Comment by trevor (TrevorWiesinger) on [Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate · 2024-03-28T22:05:46.922Z · LW · GW

A debate sequel, with someone other than Peter Miller (but retaining and reevaluating all the evidence he got from various sources) would be nice. I can easily imagine Miller doing better work on other research topics that don't involve any possibility of cover ups or adversarial epistemics related to falsifiability, which seem to be personal issues for him in the case of lab leak at least.

Maybe with 200k on the line to incentivize Saar to return, or to set up a team this time around? With the next round of challengers bearing in mind that Saar might be willing to stomach a net loss of many thousands of dollars in order to promote his show and methodology?

Comment by trevor (TrevorWiesinger) on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-26T19:25:05.175Z · LW · GW

The only reason that someone like Cade Metz is able to do what he does, performing at the level he has been, with a mind like what he has, is because people keep going and talking to him. For example, he might not even have known about the "among the doomsayers" article until you told him about it (or he found out about it much sooner than he otherwise would have).

I can visibly see you training him, via verbal conversation, how to outperform the vast majority of journalists at talking about epistemics. You seemed to stop towards the end, but Metz nonetheless probably emerged from the conversation much better prepared to think up attempts to dishonestly angle-shoot the entire AI safety scene, as he has continued to do over the last several months.

From the original thread that coined the "Quokka" concept (which, important to point out, was written by an unreliable and often confused narrator):

Rationalists are, in Scott Alexander's formulation, missing a mood, or rather, they are drawn from a pool of mostly men who are missing one. "Normal" people instinctively grasp social norms without having them explained. Rationalists lack this instinct.

In particular, they struggle with small talk and other social norms around speech, because they naively think words are a vehicle for their literal meanings. Yud's sequences help this by formalizing the implicit decisions that normal people make.

...

The quokka, like the rationalist, is a creature marked by profound innocence. The quokka can't imagine you might eat it, and the rationalist can't imagine you might deceive him. As long they stay on their islands, they survive, but both species have problems if a human shows up.

In theory, rationalists like game theory, in practice, they need to adjust their priors. Real-life exchanges can be modeled as a prisoner's dilemma. In the classic version, the prisoners can't communicate, so they have to guess whether the other player will defect or cooperate.

The game changes when we realize that life is not a single dilemma, but a series of them, and that we can remember the behavior of other agents. Now we need to cooperate, and the best strategy is "tit for two tats", wherein we cooperate until our opponent defects twice.

The problem is, this is where rationalists hit a mental stop sign. Because in the real world, there is one more strategy that the game doesn't model: lying. See, the real best strategy is "be good at lying so that you always convince your opponent to cooperate, then defect".

And rationalists, bless their hearts, are REALLY easy to lie to. It's not like taking candy from a baby; babies actually try to hang onto their candy. The rationalists just limply let go and mutter, "I notice I am confused".

...

Rationalists = quokkas, this explains a lot about them. Their fear instincts have atrophied. When a quokka sees a predator, he walks right up; when a rationalist talks about human biodiversity on a blog under almost his real name, he doesn't flinch away.

A normal person learns from social cues that certain topics are forbidden, and that if you ask questions about them, you had better get the right answer, which is not the one with the highest probability of being true, but the one with the highest probability of keeping your job.

This ability to ask uncomfortable questions is one of the rationalist's best and worst attributes, because mental stop signs, like road stop signs, actually exist to keep you safe, and although there may be times one should disregard them, most people should mostly obey them,

...

Apropos of the game theory discussion above, if there is ONE thing I can teach you with this account, it's that you have evolved to be a liar. Lying is "killer app" of animal intelligence, it's the driver of the arms race that causes intelligence to evolve.

...

The main way that you stop being a quokka is that you realize there are people in the world who really want to hurt you. There are people who will always defect, people whose good will is fake, whose behavior will not change if they hear the good news of reciprocity.
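(Side note for anyone who hasn't run into the strategy before: below is a minimal toy sketch of the "tit for two tats" dynamic the quoted thread describes. The payoff numbers are standard textbook prisoner's dilemma values assumed purely for illustration, and the strategy functions are my own naming, not anything from the thread itself.)

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}  # assumed textbook payoffs; C = cooperate, D = defect

def tit_for_two_tats(my_history, their_history):
    # Cooperate unless the opponent defected on the last two rounds.
    return "D" if their_history[-2:] == ["D", "D"] else "C"

def always_defect(my_history, their_history):
    return "D"

def always_cooperate(my_history, their_history):
    return "C"

def play(strategy_a, strategy_b, rounds=10):
    # Iterated prisoner's dilemma: each strategy sees both histories and returns a move.
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + pay_a, score_b + pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play(tit_for_two_tats, always_cooperate))  # (30, 30): stable mutual cooperation
print(play(tit_for_two_tats, always_defect))     # (8, 18): exploited for two rounds, then retaliates

The thread's point, of course, is the strategy this toy model can't capture: a player who talks you into believing they'll cooperate and then defects anyway.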

So things that everyone warns you not to do, like going and talking to people like Cade Metz, might seem like a source of alpha, undersupplied by the market. But in reality there is a good reason why everyone at least tried to coordinate not to do it, and at least tried to make it legible why people should not do that. Here the glass has already been blown into a specific shape and cooled.

Do not talk to journalists without asking for help. You have no idea how much there is to lose, even just from a short harmless-seeming conversation where they are able to look at how your face changes as you talk about some topics and avoid others.

Human genetic diversity implies that there are virtually always people out there who are much better at that than you'd expect from your own life experience of looking at people's facial expressions, no matter your skill level, and other factors indicate that these people probably started pursuing high-status positions a long time ago.

Comment by trevor (TrevorWiesinger) on Shortform · 2024-03-24T12:09:41.460Z · LW · GW

I'm not sure to what extent this is helpful, or if it's an example of the dynamic you're refuting, but Duncan Sabien recently wrote a post that intersects with this topic:

Also, if your worldview is such that, like. *Everyone* makes awful comments like that in the locker room, *everyone* does angle-shooting and tries to scheme and scam their way to the top, *everyone* is looking out for number one, *everyone* lies ...

... then *given* that premise, it makes sense to view Trump in a positive light. He's no worse than everybody else, he's just doing the normal things that everyone does, with the *added layer* that he's brave enough and candid enough and strong enough that he *doesn't have to pretend he doesn't.*

Admirable! Refreshingly honest and clean!

So long as you can't conceive of the fact that lots of people are actually just ..................... good. They're not fighting against urges to be violent or to rape, they're not biting their tongues when they want to say scathing and hurtful things, they're not jealous and bitter and willing to throw others under the bus to get ahead. They're just ... fundamentally not interested in any of that.

(To be clear: if you are feeling such impulses all the time and you're successfully containing them or channeling them and presenting a cooperative and prosocial mask: that is *also* good, and you are a good person by virtue of your deliberate choice to be good. But like. Some people just really *are* the way that other people have to *make* themselves be.)

It sort of vaguely rhymes, in my head, with the type of person who thinks that *everyone* is constantly struggling against the urge to engage in homosexual behavior, how dare *those* people give up the good fight and just *indulge* themselves ... without realizing that, hey, bro, did you know that a lot of people are just straight? And that your internal experience is, uh, *different* from theirs?

Where it connects is that if someone sees [making the world a better place] as simply selecting a better Nash equilibrium, they absolutely will spend time exploring solutionspace/thinking through strategies similar to Goal Factoring or Babble and Prune. Lots of people throughout history have yearned for a better world in a lot of different ways, with varying awareness of the math behind Nash equilibria, or the transhumanist and rationalist perspectives on civilization (e.g. map & territory & biases & scope insensitivity for rationalism, cryonics/anti-aging for transhumanism).

But their goal here is largely steering culture away from nihilism (since culture is a Nash equilibrium), which means steering many people away from themselves, or at least the selves that they would have been. Maybe that's pretty minor in this case, e.g. because feeling moderate amounts of empathy and living in a better society are both fun, but either way, changing a society requires changing people, and thinking really creatively about ways to change people tears down lots of Chesterton-Schelling fences, and it's very easy to make really big damaging mistakes in the process (because you need to successfully predict and avoid all mistakes as part of the competent pruning process, and actually measurably consistently succeeding at this is thinkoomph, not just creative intelligence).

Add in conflict theory to the mistake theory I've described here, factor in unevenly distributed intelligence and wealth in addition to unevenly distributed traits like empathy and ambition and suspicion-towards-outgroup (e.g. different combinations of all 5 variables), and you can imagine how conflict and resentment would accumulate on both sides over the course of generations. There's tons of examples in addition to Ayn Rand and Wokeness.

Comment by trevor (TrevorWiesinger) on [Linkpost] Vague Verbiage in Forecasting · 2024-03-22T23:57:44.384Z · LW · GW

Now that I think about it, I can see it being a preference difference: the bar might be more irksome for some people than others, and some people might prefer to go to the original site to read it, whereas others would rather read it on LW if it's short. I'll think about that more in the future.

Comment by trevor (TrevorWiesinger) on [Linkpost] Vague Verbiage in Forecasting · 2024-03-22T22:42:48.741Z · LW · GW

That's strange, I looked closely but couldn't see how that would cause an issue. Could you describe the issue so I can see what you're getting at? I put a poll up in case there's a clear consensus that this makes it hard to read.

I'm on PC, is this some kind of issue with mobile? I really, really, really don't think people should be using smartphones for browsing Lesswrong.

Comment by trevor (TrevorWiesinger) on [Linkpost] Vague Verbiage in Forecasting · 2024-03-22T19:45:12.487Z · LW · GW

I can see that: language evolving plausible deniability over time, due to the immense instinctive focus on fear of being called out for making a mistake.

Comment by trevor (TrevorWiesinger) on Monthly Roundup #16: March 2024 · 2024-03-19T20:49:48.914Z · LW · GW

As their scale also scales the rewards to attacks and as their responses get worse, the attacks become more frequent. That leads to more false positives, and a skepticism that any given case could be one of them. In practice, claims like Zuckerberg’s that only the biggest companies like Meta can invest the resources to do good content moderation are clearly false, because scale reliably makes content moderation worse.

Dan Luu makes a very real and serious contribution to the literature on scaling and the big tech companies, going further than anyone I've ever seen to argue that the big 5 might be overvalued/not that powerful, but ultimately what he's doing is listing helpful arguments that chip away at the capabilities of the big 5, and then depicting his piece as overwhelming proof that they're doomed, bloated, incompetent husks that can't do anything anymore.

Lots of the arguments are great, but not all are created equal; for example, it's pretty well known that actually-well-targeted ads scare off customers and that user retention is the priority for predictive analytics (since the competitor platforms' decisions to use predictive analytics to steal user time are not predictable decisions), but Luu just did the usual thing where he eyeballs the ads and assumes that tells us everything we need to know, and doesn't notice anything wrong with this. There's some pretty easy math here (sufficiently large and diverse pools of data make it easier to find people/cases that help predict a specific target's thoughts/behavior/reaction to stimuli), and either Luu failed to pass the low bar of understanding it, or the higher bar of listing and grokking the real world applications and implications.

Ultimately, I'd consider it a must-read for anyone interested in Earth's most important industrial community (and scaling in general), but it's worth keeping in mind that the critical mass of talent (and all kinds of other resources and capabilities) accumulated within the biggest companies is obviously a pretty major factor, and although he goes a long way to chip away at it (e.g. attack surface for data poisoning), Luu doesn't actually totally debunk it like he says he does.

Comment by TrevorWiesinger on [deleted post] 2024-03-19T19:27:21.783Z

Have you read Janus's Cyborgism post? It looks like you'd be pretty interested.

Comment by TrevorWiesinger on [deleted post] 2024-03-19T07:53:20.855Z

Ah, neat, thanks! I had never heard of that paper or the Conger-Kanungo scale. When I referred to charisma, I meant it in the planecrash sense, focused on social dominance and subterfuge, rather than the business-management sense, which is focused on leadership and maintaining the status quo and means something completely different (and which I had never heard of).

Comment by TrevorWiesinger on [deleted post] 2024-03-19T04:09:44.777Z

Yes, the application of variance to the bundle of traits under the blanket label "charisma" (similar to the bundle of intelligence and results-acquisition under the blanket label "thinkoomph"), and the sociological implications of more socially powerful people being simultaneously rarer and more capable of making the people around them erroneously feel safe, is something I picked up almost entirely from planecrash.

I think that my "coordination takeoffs" post also ended up being a bad example for what you're trying to gesture at here; I already know what I got wrong there, and it wasn't that (e.g. basically any China Watcher who reads and understands most of Inadequate Equilibria is on course toward the top of their field). Could you try a different example?

Comment by trevor (TrevorWiesinger) on Is there a way to calculate the P(we are in a 2nd cold war)? · 2024-03-17T20:43:12.762Z · LW · GW

The Cold War analogy is a bit hard to work with, mainly because the original Cold War was a specific configuration of paradigms that largely can't repeat. We now have computers everywhere and thriving international trade and growth. More importantly, the original Cold War emerged out of the World War paradigm and began with the intent to use nuclear weapons for carpet bombing (this is where the term "WW3" came from), and it took the shape it did largely because nobody anywhere had any experience with nuclear brinkmanship; today we have norms and decades of track record of brinkmanship and de-escalation.

Expecting a second Cold War is a bit like expecting the World War paradigm to return, though not nearly as far-fetched, since most people in power in governments and militaries today came of age during the original Cold War and can easily imagine their world becoming more like that again.

Comment by trevor (TrevorWiesinger) on Transformative trustbuilding via advancements in decentralized lie detection · 2024-03-16T23:16:39.344Z · LW · GW

Yes, this is why I put "decentralized" in the title even though it doesn't really fit. What I was going for with the post is that you read the paper yourself, except whenever the author writes about law, you think for yourself about how the various applications you care about (not courts) stack with the complex caveats the author was writing about (while they were thinking about courts). Ideally I would have distilled it, since the paper is a bit long.

This credibly demonstrates that the world we live in is more flexible than it might appear. And on the macro-civilizational scale, this particular tech looks like it will place honest souls higher up on net, which everyone prefers. People can establish norms of remaining silent on particular matters, although the process of establishing those norms will be stacked towards people who can honestly say "I think this makes things better for everyone" or "I think this is a purity spiral", and away from those who can't.

At work, you could expect to be checked for a "positive, loyal attitude toward the company" on as frequent a basis as was administratively convenient. It would not be enough that you were doing a good job, hadn't done anything actually wrong, and expected to keep it that way. You'd be ranked straight up on your Love for the Company (and probably on your agreement with management, and very possibly on how your political views comported with business interests). The bottom N percent would be "managed out".

This is probably already happening.

Comment by trevor (TrevorWiesinger) on Transformative trustbuilding via advancements in decentralized lie detection · 2024-03-16T22:36:40.928Z · LW · GW

There are bad actors who infiltrate, deceptively align, move laterally, and purge talented people (see Geeks, Mops, and Sociopaths), but I think that trust is a bigger issue.

High-trust environments don't exist today in anything with medium or high stakes. If they did, "sociopaths" would be able to share their various talents without being incentivized to hurt anyone, geeks could let more people in without worrying about threats, and people could generally evaluate each other and find the places where their strengths resonate with others.

That kind of wholesome existence is something that we've never seen on Earth, and we might be able to reach out and grab it (if we're already in an overhang for decentralized lie detectors).

Comment by trevor (TrevorWiesinger) on What could a policy banning AGI look like? · 2024-03-13T19:05:55.341Z · LW · GW

This is actually a dynamic I've read a lot about. The risk of ending up militarily/technologically behind is already very much on the minds of the people who make up all of the major powers today, and all diplomacy and negotiation is already built on top of that ground truth and on mitigating the harm/distrust that stems from it.

Weakness at mitigating distrust = just being bad at diplomacy. Finding galaxy-brained solutions to coordination problems is necessary for being above par in this space.

Comment by trevor (TrevorWiesinger) on What could a policy banning AGI look like? · 2024-03-13T17:43:08.591Z · LW · GW

[Caveat lector: I know roughly nothing about policy!]

For AI people new to international affairs, I generally recommend skimming these well-respected texts, which are pretty well known to have directly inspired many of the people making foreign policy decisions:

  • Chapters 1 and 2 of Mearsheimer's Tragedy of Great Power Politics (2010). The model (offensive realism) is not enough by itself, but it helps to start with a flawed model because the space is full of them; this one has been predictive, it's popular among policymakers in DC, and it gives a great perspective on how impoverished foreign policy culture is (nobody there ever reads stuff like the Sequences).
  • Chapters 1 and 4 of Nye's Soft Power (2004) (skim ch. 1 extra fast and ch. 4 slower). Basically a modern history of propaganda and influence operations, cutting off at 2004. Describes how the world is more complicated than Tragedy of Great Power Politics suggests.
  • Chapters 1 and 2 of Schelling's Arms and Influence (1966). Yes, it's that Schelling; this was when he started influencing the world's thinking about how decision theory drives nuclear standoffs, and diplomacy in general, in the wake of the Cuban Missile Crisis. You can be extremely confident that this was a big part of the cultural foundation of foreign policy establishments around the world, and for a MIRI employee it should be an incredibly light read applying decision theory to international politics and nuclear war.

I'm going to read some more stuff soon and possibly overhaul these standard recommendations.

Akash also recommended Devil's Chessboard to understand intelligence agencies, and Master of the Senate and Act of Congress to understand Congress. I haven't gotten around to reading them yet, and I can't tell how successful his org has been in Congress itself (which is the main measure of success), but the Final Takes section of his post on Congress is fantastic and makes me confident enough to try them out.