Posts

Stone Age Herbalist's notes on ant warfare and slavery 2024-11-09T02:40:01.128Z
trevor's Shortform 2024-08-01T23:26:56.024Z
WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals 2024-04-23T21:33:08.049Z
[Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate 2024-03-28T16:03:36.452Z
[Linkpost] Vague Verbiage in Forecasting 2024-03-22T18:05:53.902Z
Transformative trustbuilding via advancements in decentralized lie detection 2024-03-16T05:56:21.926Z
Don't sleep on Coordination Takeoffs 2024-01-27T19:55:26.831Z
(4 min read) An intuitive explanation of the AI influence situation 2024-01-13T17:34:36.739Z
Upgrading the AI Safety Community 2023-12-16T15:34:26.600Z
[Linkpost] George Mack's Razors 2023-11-27T17:53:45.065Z
Altman firing retaliation incoming? 2023-11-19T00:10:15.645Z
Helpful examples to get a sense of modern automated manipulation 2023-11-12T20:49:57.422Z
We are already in a persuasion-transformed world and must take precautions 2023-11-04T15:53:31.345Z
5 Reasons Why Governments/Militaries Already Want AI for Information Warfare 2023-10-30T16:30:38.020Z
Sensor Exposure can Compromise the Human Brain in the 2020s 2023-10-26T03:31:09.835Z
AI Safety is Dropping the Ball on Clown Attacks 2023-10-22T20:09:31.810Z
Information warfare historically revolved around human conduits 2023-08-28T18:54:27.169Z
Assessment of intelligence agency functionality is difficult yet important 2023-08-24T01:42:20.931Z
One example of how LLM propaganda attacks can hack the brain 2023-08-16T21:41:02.310Z
Buying Tall-Poppy-Cutting Offsets 2023-05-20T03:59:46.336Z
Financial Times: We must slow down the race to God-like AI 2023-04-13T19:55:26.217Z
What is the best source to explain short AI timelines to a skeptical person? 2023-04-13T04:29:03.166Z
All images from the WaitButWhy sequence on AI 2023-04-08T07:36:06.044Z
10 reasons why lists of 10 reasons might be a winning strategy 2023-04-06T21:24:17.896Z
What could EA's new name be? 2023-04-02T19:25:22.740Z
Strong Cheap Signals 2023-03-29T14:18:52.734Z
NYT: Lab Leak Most Likely Caused Pandemic, Energy Dept. Says 2023-02-26T21:21:54.675Z
Are there rationality techniques similar to staring at the wall for 4 hours? 2023-02-24T11:48:45.944Z
NYT: A Conversation With Bing’s Chatbot Left Me Deeply Unsettled 2023-02-16T22:57:26.302Z
The best way so far to explain AI risk: The Precipice (p. 137-149) 2023-02-10T19:33:00.094Z
Many important technologies start out as science fiction before becoming real 2023-02-10T09:36:29.526Z
Why is Everyone So Boring? By Robin Hanson 2023-02-06T04:17:20.372Z
There have been 3 planes (billionaire donors) and 2 have crashed 2022-12-17T03:58:28.125Z
What's the best time-efficient alternative to the Sequences? 2022-12-16T20:17:27.449Z
What key nutrients are required for daily energy? 2022-09-20T23:30:02.540Z

Comments

Comment by trevor (TrevorWiesinger) on avturchin's Shortform · 2024-12-16T05:28:46.622Z · LW · GW

Your effort must scale with the capabilities of the people trying to remove you from the system. You have to know if they're the type of person who would immediately default to checking the will.

More understanding of, and calibration toward, the modern assassination practices you should actually expect is mandatory, because you're dealing with people putting some amount of thinkoomph into making your life plans fail, so your cost of survival is determined by what you expect your attack surface to look like. The appropriate cost and the cost-you-decided-to-pay vary by OOMs depending on the circumstances, particularly the intelligence, resources, and fixations of the attacker. For example, the fact that this happened two weeks after assassination was all over the news is a fact you don't have the privilege of ignoring if you want the answer, even though that particular fact will probably turn out to be unhelpful, e.g. because the whole thing was probably just a suicide, given that the base rates of disease, accidents, and suicide are so god damn high.

If this sounds wasteful, it is. It's why our civilization has largely moved past assassination, even though getting-people-out-of-the-way is so instrumentally convergent for humans. We could end up in a cycle where assassination gets popular again after people start excessively standing in each other's way (knowing they won't be killed for it), or in a stable cultural state like the Dune books or the John Wick universe, in which case we've just been living in a long trough where elites aren't physically forced to live their entire lives like mob bosses playing chess against invisible adversaries.

So don't think that if you only follow the rules of Science, that makes your reasoning defensible.

There is no known procedure you can follow that makes your reasoning defensible.

There is no known set of injunctions which you can satisfy, and know that you will not have been a fool.

Comment by trevor (TrevorWiesinger) on Lighthaven Sequences Reading Group #12 (Tuesday 11/26) · 2024-11-21T08:06:58.496Z · LW · GW

It was more of a 1970s-90s phenomenon actually; if you compare the best 90s movies (e.g. Terminator 2) to the best 60s movies (e.g. 2001: A Space Odyssey), it's pretty clear that directors just got a lot better at doing more stuff per second. Older movies are absolutely a window into a higher/deeper culture/way of thinking, but OOMs less efficient than e.g. reading Kant/Nietzsche/Orwell/Asimov/Plato. But I wouldn't be surprised if modern film is severely mindkilling and older film is the best substitute.

Comment by trevor (TrevorWiesinger) on Lighthaven Sequences Reading Group #12 (Tuesday 11/26) · 2024-11-21T04:39:33.498Z · LW · GW

The content-per-minute rate is too low; it follows 1960s film standards, where audiences weren't interested in science fiction films unless concepts were introduced to them very, very slowly (at the time they were quite satisfied by this due to lower standards, similar to Shakespeare's audiences).

As a result it is not enjoyable (people will be on their phones) unless you spend much of the film either thinking or talking with friends about how it might have affected the course of science fiction as a foundational work in the genre (almost every sci-fi fan and writer at the time watched it).

Comment by trevor (TrevorWiesinger) on What are the good rationality films? · 2024-11-21T04:12:50.742Z · LW · GW

Tenet (2020) by Christopher Nolan revolves around recursive thinking and responding to unreasonably difficult problems. Nolan introduces the time-reversed material as the core dynamic, then iteratively increases the complexity from there, in ways specifically designed to ensure that as much of the audience as possible picks up as much recursive thinking as possible.

This chart describes the movement of all key characters and plot elements through the film; it is actually very easy to follow for most people. But you can also print out a bunch of copies and hand them out before the film (it isn't a spoiler so long as you don't look closely at the key).

[Chart: movement of the key characters and plot elements through the film]

Most of the value comes from an Eat-the-Instructions-style mentality, as both the characters and the viewer pick up on unconventional methods to exploit the time-reversing technology, only to be shown even more sophisticated strategies and walked through how they work and their full implications.

It also ties into scope sensitivity, but it focuses deeply on the angles of interfacing with other agents and their knowledge, and responding dynamically to mistakes and failures (though not anticipating them), rather than simply orienting yourself to mandatory number crunching.

The film touches on cooperation and cooperation failures under anomalous circumstances, particularly the circumstances introduced by the time reversing technology.

The most interesting of these was also the easiest to miss:

The impossibility of building trust between the hostile forces from the distant future and the characters in the story who make up the opposition faction. The antagonist, dying from cancer and selected because his personality was predicted to be hostile to the present and sympathetic to the future, was simply sent instructions and resources from the future, and decided to act as their proxy despite ending up with a great life and being unable to verify the instructions' accuracy or the true goals of the hostile force. As a result, the protagonists of the story ultimately build a faction that takes on a life of its own and dooms both their friends and the entire human race to death by playing a zero-sum survival game with the future faction, due to their failure throughout the film to think sufficiently laterally and their inadequate exploitation of the time-reversing technology.

Comment by trevor (TrevorWiesinger) on Lighthaven Sequences Reading Group #12 (Tuesday 11/26) · 2024-11-21T03:11:57.336Z · LW · GW

Screen arrangement suggestion: Rather than everyone sitting in a single crowd and commenting on the film, we split into two clusters, one closer to the screen and one further. 

The people in the front cluster hope to watch the film quietly; the people in the back cluster aim to comment/converse/socialize during the film, with the common knowledge that they should aim not to be audible to the people in the front group. People can form clusters and move between them freely.

The value of this depends on what film is chosen; e.g. 2001: A Space Odyssey is not watchable without discussing historical context, and Tenet ought to have some viewers wanting to better understand the details of whatever time-travelly thing just happened.

Comment by trevor (TrevorWiesinger) on Lighthaven Sequences Reading Group #12 (Tuesday 11/26) · 2024-11-20T05:57:35.373Z · LW · GW

"All the Presidents Men" by Alan Paluka

Comment by trevor (TrevorWiesinger) on Lighthaven Sequences Reading Group #12 (Tuesday 11/26) · 2024-11-20T05:55:00.410Z · LW · GW
Comment by trevor (TrevorWiesinger) on Lighthaven Sequences Reading Group #12 (Tuesday 11/26) · 2024-11-20T05:47:01.602Z · LW · GW
Comment by trevor (TrevorWiesinger) on Lighthaven Sequences Reading Group #12 (Tuesday 11/26) · 2024-11-20T05:46:42.563Z · LW · GW

"Oppenheimer" by George Nolan

Comment by trevor (TrevorWiesinger) on Lighthaven Sequences Reading Group #12 (Tuesday 11/26) · 2024-11-20T05:46:07.704Z · LW · GW

"Tenet" by George Nolan

Comment by trevor (TrevorWiesinger) on AI Safety is Dropping the Ball on Clown Attacks · 2024-11-14T17:20:02.184Z · LW · GW

I'm not sure what to think about this. Thomas777's approach is generally a good one, but for both of these examples a shorter route (that it's cleanly and mutually understood to be adding insult to injury, as a flex by the aggressor) seems pretty probable. Free speech/censorship might be a better example, as plenty of cultures are less aware of information theory and progress.

I don't know what proportion of the people in the US Natsec community understand 'rigged psychological games' well enough to occasionally read books on the topic, but the bar is pretty low for hopping onto fads, since tricks only require one person to notice or invent them before they can simply get popular (with all kinds of people with varying capabilities/resources/technology and bandwidth/information/deficiencies hopping on the bandwagon).

Comment by trevor (TrevorWiesinger) on Overcoming Bias Anthology · 2024-10-23T16:58:28.385Z · LW · GW

I notice that there are just shy of 128 here and they're mostly pretty short, so you can start the day by flipping a coin 7 times to decide which one to read. Not a bisection search: just convert the seven flips to binary and pick the corresponding number. At first, you only have to start over and do another 7 flips if you land on 1111110 (126), 1111111 (127), or 0000000 (128).
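
A minimal sketch of that procedure in Python (assuming ~125 entries as a stand-in; the exact count here is a placeholder):

```python
import random

def pick_entry(n_entries: int = 125) -> int:
    """Pick a uniformly random entry in 1..n_entries from 7 coin flips,
    re-flipping whenever the result falls outside the range
    (rejection sampling, so the choice stays uniform)."""
    while True:
        flips = [random.randint(0, 1) for _ in range(7)]   # 7 coin flips
        number = int("".join(map(str, flips)), 2)          # binary -> 0..127
        if number == 0:
            number = 128    # treat 0000000 as 128, matching the comment above
        if number <= n_entries:
            return number   # out-of-range results trigger 7 fresh flips

print(pick_entry())
```

With 125 entries you only re-flip on 126, 127, and 128, i.e. about 2% of the time.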

If you drink coffee in the morning, this is a way better way to start the day than social media, as the early phase of the stimulant effect reinforces behavior in most people. Hanson's approach to various topics is a good mentality to try boosting this way.

Comment by trevor (TrevorWiesinger) on sarahconstantin's Shortform · 2024-10-08T05:17:49.320Z · LW · GW

This reminds me of dath ilan's hallucination diagnosis from page 38 of Yudkowsky and Alicorn's glowfic But Hurting People Is Wrong.

It's pretty far from meeting dath ilan's standard though; in fact, an x-ray would be more than sufficient, since anyone capable of putting something in someone's ear would obviously vastly prefer to place it somewhere harder to check, whereas nobody would be capable of defeating an x-ray machine, since metal parts are unavoidable.

This concern pops up in books on the Cold War (employees at every org and every company regularly suffer from mental illnesses at somewhere around their base rates, but things get complicated at intelligence agencies where paranoid/creative/adversarial people are rewarded and even influence R&D funding) and an x-ray machine cleanly resolved the matter every time.

Comment by trevor (TrevorWiesinger) on First Lighthaven Sequences Reading Group · 2024-09-30T23:39:22.097Z · LW · GW

That's this week, right?

Is reading An Intuitive Explanation of Bayes' Theorem recommended?

Comment by trevor (TrevorWiesinger) on whestler's Shortform · 2024-09-23T23:45:13.291Z · LW · GW

I agree that "general" isn't such a good word for humans. But unless civilization was initiated right after the minimum viable threshold was crossed, it seems somewhat unlikely to me that humans were very representative of the minimum viable threshold.

If any evolutionary process other than civilization precursors formed the feedback loop that caused human intelligence, then civilization would have hit full swing sooner if that feedback loop had continued pushing human intelligence further. Whether Earth took a century or a millennium between the harnessing of electricity and the first computer was heavily affected by economics and genetic diversity (e.g. Babbage, Lovelace, Turing), but afaik a "minimum viable general intelligence" could plausibly have taken millions or even billions of years under ideal cultural conditions to cross that particular gap.

Comment by trevor (TrevorWiesinger) on Lighthaven Sequences Reading Group #3 (Tuesday 09/24) · 2024-09-23T03:45:09.746Z · LW · GW

This is an idea and NOT a recommendation. Unintended consequences abound.

Have you thought about sorting into groups based on carefully-selected categories? For example, econ/social sciences vs quant background with extra whiteboard space, a separate group for new arrivals who didn't do the readings from the other weeks (as their perspectives will have less overlap), a separate group for people who deliberately took a bunch of notes and made a concise list vs a more casual easygoing group, etc?

Comment by trevor (TrevorWiesinger) on Perhaps Try a Little Therapy, As a Treat? · 2024-09-06T20:50:53.674Z · LW · GW

Actions like these leave scars on entire communities.

Do you have any idea how fortunate you were to have so many people in your life who explicitly told you "don't do things like this"? The world around you has been made so profoundly, profoundly conducive to healing you.

When someone is this persistent in thinking of reasons to be aggressive AND reasons to not evaluate the world around them, it's scary and disturbing. I understand that humans aren't very causally upstream of their decisions, but this is the case for everyone, and situations like these go a long way towards causing people like Duncan and Eliezer to fear meeting their fans.

I'm grateful that looking at this case has helped me formalize a concept of oppositional drive, a variable representing the unconscious drive to oppose other humans, with justifications layered on top based on intelligence (a separate variable). Diagnosing children with Oppositional Defiant Disorder is the DSM-5's way of mitigating the harm when a child has an unusually strong oppositional drive for their age, but that's because the DSM puts binary categorizations on traits that are actually better represented as variables, which in most people are so low as to not be noticed (some people are in the middle, and the unusually extreme cases get all the attention; this was covered in this section of Social Dark Matter, which was roughly 100% of my inspiration).

Opposition is... a rather dangerous thing for any living being to do, especially if your brain conceals/obfuscates the tendency/drive whenever it emerges, so even most people in the orangey area probably disagree with having this trait upon reflection and would typically press a button to place themselves more towards the yellow. This is derived from the fundamental logic of trust (which in humans must be built as a complex project that revolves around calibration).

Comment by trevor (TrevorWiesinger) on Morpheus's Shortform · 2024-09-04T20:11:25.738Z · LW · GW

This could have been a post so more people could link it (many don't reflexively notice that you can easily get a link to a LessWrong quicktake or Twitter or Facebook post by mousing over the date between the upvote count and the poster, which also works for tab and hotkey navigation for people like me who avoid using the mouse/touchpad whenever possible).

Comment by trevor (TrevorWiesinger) on Fabien's Shortform · 2024-08-30T22:29:02.436Z · LW · GW

(The author sometimes says stuff like "US elites were too ideologically committed to globalization", but I don't think he provides great alternative policies.)

Afaik the 1990-2008 period featured government and military elites worldwide struggling to pivot to a post-Cold War era, which was extremely OOD for many leading institutions of statecraft (which for centuries had been constructed around the conflicts of the European wars, then the world wars, then the Cold War).

During the '90s and 2000s, lots of writing and thinking was done about ways the world's militaries and intelligence agencies, fundamentally low-trust adversarial orgs, could continue to exist without intent to bump each other off. Counter-terrorism was possibly one thing that was settled on, but it's pretty well established that global trade ties were deliberately used as a peacebuilding tactic, notably to stabilize the US-China relationship (this started to fall apart after the 2008 recession brought anticipation of American economic/institutional decline scenarios to the forefront of geopolitics).

The thinking of the period might not be very impressive to us, but foreign policy people mostly aren't intellectuals and for generations had been selected based on office politics where the office revolved around defeating the adversary, so for many of them it felt like a really big shift in perspective and self-image, sort of like a Renaissance. Then US-Russia-China conflict swung right back and got people thinking about peacebuilding as a ploy to gain advantage, rather than sane civilizational development. The rejection of e.g. US-China economic integration policies had to be aggressive because many elites (and people who care about economic growth) tend to support globalization, whereas many government and especially Natsec elites remember that period as naive.

Comment by trevor (TrevorWiesinger) on "Deception Genre" What Books are like Project Lawful? · 2024-08-29T18:47:55.615Z · LW · GW

It's not a book, but if you like older movies, the 1944 film Gaslight is pretty far back (film production standards have improved quite a bit since then, so for a large proportion of people 1940s films are barely watchable, which is why I recommend this version over the nearly identical British version and the original play), and it was pretty popular among cultural elites at the time, so it's probably extremely causally upstream of most of the fiction you'd be interested in.

Comment by trevor (TrevorWiesinger) on gwern's Shortform · 2024-08-23T06:07:33.127Z · LW · GW

Writing is safer than talking, given the same probability that the timestamped keystrokes and the audio files are kept.

In practice, the best approach is to handwrite your thoughts as notes, in a room without smart devices and with a door and walls that are sufficiently absorptive, and then type them out in a different room with the laptop (ideally with a USB keyboard so you don't have to put your hands on the laptop, and the accelerometers on its motherboard, while you type).

Afaik this gets the best ratio of revealed thought process to final product, so you get public information exchanges closer to a critical mass while simultaneously getting yourself further from being gaslit into believing whatever some asshole rando wants you to believe. The whole paradigm where everyone just inputs keystrokes into their operating system willy-nilly needs to be put to rest ASAP, just like the paradigm of thinking without handwritten notes and the paradigm of inward-facing webcams with no built-in cover or way to break the circuit.

Comment by trevor (TrevorWiesinger) on trevor's Shortform · 2024-08-01T23:27:52.057Z · LW · GW

TL;DR "habitually deliberately visualizing yourself succeeding at goal/subgoal X" is extremely valuable, but also very tarnished. It's probably worth trying out, playing around with, and seeing if you can cut out the bullshit and boot it up properly.

Longer:

The universe is allowed to have tons of people intuitively notice that "visualize yourself doing X" is an obviously winning strategy that typically makes doing X a downhill battle if it's possible at all, and to have so many different people pick it up that you first encounter it in an awful way, e.g. in middle/high school you first hear about it, but the speaker says, in the same breath, that you should use it to feel more motivated to do your repetitive math homework for ~2 hours a day.

I'm sure people could find all sorts of improvements, e.g. an entire field of selfvisualizationmancy that provably helps a lot of people do stuff, but the important thing I've noticed is to simply not skip that critical step. Eliminate ugh fields around self-visualization, or take whatever means necessary to prevent ugh fields from forming in your idiosyncratic case (also, social media algorithms could have been measurably increasing user retention by boosting content that places ugh fields where they increase retention by decreasing agency/motivation, with or without the devs being aware of this because they are looking at inputs and outputs or maybe just outputs, so this could be a lot more adversarial than you were expecting). Notice the possibility that it might or might not have been a core underlying dynamic in Yudkowsky's old Execute by Default post or Scott Alexander's silly hypothetical talent differential comment, without their awareness.

The universe is allowed to give you a brain that so perversely hinges on self-image instead of just taking the action. The brain is a massive kludge of parallel-processing spaghetti code and, regardless of whether or not you see yourself as a very social-status-minded person, the modern human brain was probably heavily wired to gain social status in the ancestral environment, and whatever departures you might have might be tearing down Chesterton-Schelling fences.

If nothing else, a takeaway from this was that the process of finding the missing piece that changes everything is allowed to be ludicrously hard and complicated, while the missing piece itself is simultaneously allowed to be very simple and easy once you've found it.

Comment by trevor (TrevorWiesinger) on Optimistic Assumptions, Longterm Planning, and "Cope" · 2024-07-19T02:37:23.334Z · LW · GW

"Slipping into a more convenient world" is a good way of putting it; just using the word "optimism" really doesn't account for how it's pretty slippy, nor how the direction is towards a more convenient world.

Comment by trevor (TrevorWiesinger) on Paper Summary: The Effects of Communicating Uncertainty on Public Trust in Facts and Numbers · 2024-07-10T00:31:36.698Z · LW · GW

It was helpful that Ezra noticed and pointed out this dynamic. 

I think this concern is probably more a reflection of our state of culture, where people who visibly think in terms of quantified uncertainty are rare and therefore make a strong impression relative to e.g. pundits.

If you look at other hypothetical cultural states (specifically more quant-aware states, e.g. extrapolating the last 100 years of math/literacy/finance/physics/military/computer progress forward another 100 years), trust would pretty quickly default to being based on track record instead of on being one of the few people in the room who's visibly using numbers properly.

Comment by trevor (TrevorWiesinger) on Advice to junior AI governance researchers · 2024-07-09T01:25:53.044Z · LW · GW

Strong upvoted!

Wish I was reading stuff at this level back in 2018. Glad that lots of people can now.

Comment by trevor (TrevorWiesinger) on Shortform · 2024-06-27T05:35:43.215Z · LW · GW

Comment by trevor (TrevorWiesinger) on Sci-Fi books micro-reviews · 2024-06-24T20:29:06.665Z · LW · GW

Do Metropolitan Man!

Also, here's a bunch of ratfic to read and review, weighted by the number of 2022 LessWrong survey respondents who read them:

Comment by trevor (TrevorWiesinger) on What if a tech company forced you to move to NYC? · 2024-06-14T21:52:17.924Z · LW · GW

Weird coincidence: I was just thinking about Leopold's bunker concept from his essay. It was a pretty careless paper overall, but the imperative to put alignment research in a bunker makes perfect sense; I don't see the surface as a viable place for people to get serious work done (at least, not in densely populated urban areas; somewhere in the desert would count as a "bunker" in this case so long as it's sufficiently distant from passersby and the sensors and compute in their phones and cars).

Of course, this is unambiguously a necessary evil: a tiny handful of people are going to have to choose to live in a sad, uncomfortable place for a while, and only because there's no other option and it's obviously the correct move for everyone everywhere, including the people in the bunker.

Until the basics of the situation start somehow getting taught in classrooms or something, we're going to be stuck with a ludicrously large proportion of people satisfied with the kind of bite-sized convenient takes that got us into this whole unhinged situation in the first place (or who have no thoughts at all).

Comment by trevor (TrevorWiesinger) on Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety) · 2024-06-14T17:18:34.514Z · LW · GW

I would have liked to write a post that offers one weird trick to avoid being confused by which areas of AI are more or less safe to advance, but I can’t write that post. As far as I know, the answer is simply that you have to model the social landscape around you and how your research contributions are going to be applied.

Another thing that can't be ignored is the threat of Social Balkanization. Divide-and-conquer tactics have been prevalent among military strategists for millennia, and the tactic remains prevalent and psychologically available among the people making up corporate factions and many subcultures (likely including leftist and right-wing subcultures).

It is easy for external forces to notice opportunities to Balkanize a group, to make it weaker and easier to acquire or capture the splinters, which in turn provides further opportunity for lateral movement and spotting more exploits. Since awareness and exploitation of this vulnerability is prevalent, social systems without this specific hardening are very brittle and have dismal prospects.

Sadly, Balkanization can also emerge naturally, as you helpfully pointed out in Consciousness as a Conflationary Alliance Term, so the high base rates make it harder to correctly distinguish attacks from accidents. Inadequately calibrated autoimmune responses are not only damaging, but should be assumed to be automatically anticipated and misdirected by default, including as part of the mundane social dynamics of a group with no external attackers.

There is no way around the loss function.

Comment by trevor (TrevorWiesinger) on Sticker Shortcut Fallacy — The Real Worst Argument in the World · 2024-06-12T19:42:59.582Z · LW · GW

The only reason I can think of that this would be the "worst argument in the world" is that it strongly indicates low-level thinkers (e.g. low decouplers).

An actual "worst argument in the world" would be whatever maximizes the gap between people's models and accurate models. 

Comment by trevor (TrevorWiesinger) on Two easy things that maybe Just Work to improve AI discourse · 2024-06-09T03:52:12.937Z · LW · GW

Can you expand the list, go into further detail, or list a source that goes into further detail?

Comment by trevor (TrevorWiesinger) on Humming is not a free $100 bill · 2024-06-06T22:46:56.085Z · LW · GW

At the time, I thought something like "given that the nasal tract already produces NO, it seems possible that humming doesn't increase the NO in the lungs by enough orders of magnitude for once per hour to be sufficient", but I never said anything until it was too late and a bunch of other people had figured it out, along with a bunch of other useful stuff that I was pretty far from noticing (e.g. considering the rate at which the nasal tract accumulates NO to be released by humming).

Wish I'd said something back when it was still valuable.
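
For intuition, here's a toy back-of-the-envelope model of that accumulation-rate point; every number below is made up purely for illustration, and the only takeaway is that infrequent humming can't release more NO than the nasal "reservoir" accumulates between hums:

```python
# Toy model: the nasal cavity accumulates NO between hums, and a hum releases
# a fixed fraction of whatever has accumulated, so hum frequency matters mainly
# through how full the reservoir is when you hum. All parameters are hypothetical.

baseline_no_per_hour = 1.0   # NO reaching the lungs per hour without humming (arbitrary units)
reservoir_fill_rate = 0.5    # units/hour accumulating in the nasal cavity (made up)
reservoir_capacity = 0.25    # max units the cavity holds before saturating (made up)
release_fraction = 0.9       # fraction of the reservoir released by one vigorous hum (made up)

def extra_no_per_hour(hums_per_hour: float) -> float:
    """Extra NO delivered to the lungs per hour from humming, under the toy model."""
    if hums_per_hour <= 0:
        return 0.0
    interval = 1.0 / hums_per_hour                     # hours between hums
    accumulated = min(reservoir_fill_rate * interval,  # reservoir refills between hums...
                      reservoir_capacity)              # ...but saturates at capacity
    return release_fraction * accumulated * hums_per_hour

for rate in (1, 6, 60):
    extra = extra_no_per_hour(rate)
    print(f"{rate:>2} hums/hour: +{extra:.2f} units on top of baseline {baseline_no_per_hour}")
```

Under these made-up numbers, once per hour only buys whatever the reservoir can hold, while frequent humming approaches the fill-rate limit; whether either is a meaningful fraction of baseline lung NO is exactly the empirical question I should have raised.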

Comment by trevor (TrevorWiesinger) on Research: Rescuers during the Holocaust · 2024-06-03T01:37:40.538Z · LW · GW

It almost always took a personal plea from a persecuted person for altruism to kick in. Once they weren't just an anonymous member of indifferent crowd, once they were left with no escape but to do a personal moral choice, they often found out that they are not able to refuse help.

This is a crux. I think a better way to look at it is that they didn't have an opportunity to clarify their preference until the situation was in front of them. Otherwise, it's too distant and hypothetical to process, similar to scope insensitivity (the 2,000/20,000/200,000 oil-covered birds thing).

The post-hoc cognitive dissonance angle seems like a big find, and strongly indicates that reliably moral supermen can be produced at scale, given an optimized equilibrium for them to emerge from.

Stable traits (possibly partially genetic) are likely highly relevant to not-yet-clarified preferences, of course. Epistemics here are difficult due to people expecting short inferential distances; Duncan Sabien gave an interesting take on this in a Facebook post:

Also, if your worldview is such that, like. *Everyone* makes awful comments like that in the locker room, *everyone* does angle-shooting and tries to scheme and scam their way to the top, *everyone* is looking out for number one, *everyone* lies ...

... then *given* that premise, it makes sense to view Trump in a positive light. He's no worse than everybody else, he's just doing the normal things that everyone does, with the *added layer* that he's brave enough and candid enough and strong enough that he *doesn't have to pretend he doesn't.*

Admirable! Refreshingly honest and clean!

So long as you can't conceive of the fact that lots of people are actually just ..................... good. They're not fighting against urges to be violent or to rape, they're not biting their tongues when they want to say scathing and hurtful things, they're not jealous and bitter and willing to throw others under the bus to get ahead. They're just ... fundamentally not interested in any of that.

(To be clear: if you are feeling such impulses all the time and you're successfully containing them or channeling them and presenting a cooperative and prosocial mask: that is *also* good, and you are a good person by virtue of your deliberate choice to be good. But like. Some people just really *are* the way that other people have to *make* themselves be.)

It sort of vaguely rhymes, in my head, with the type of person who thinks that *everyone* is constantly struggling against the urge to engage in homosexual behavior, how dare *those* people give up the good fight and just *indulge* themselves ... without realizing that, hey, bro, did you know that a lot of people are just straight? And that your internal experience is, uh, *different* from theirs?

Comment by trevor (TrevorWiesinger) on trevor's Shortform · 2024-05-30T22:21:47.366Z · LW · GW

The best thing I've found so far is to watch a movie, and whenever the screen flashes, or any moment you feel weirdly relaxed or notice any other weird feeling, quickly turn your head and eyes ~60 degrees and gently but firmly bite your tongue.

Doing this a few minutes a day for 30 days might substantially improve resistance to a wide variety of threats. 

Gently but firmly biting my tongue, for me, also seems like a potentially very good general-use strategy to return the mind to an alert and clear-minded base state; it seems like something Critch recommended, e.g. for initiating a TAP flowchain. I don't think this can substitute for a whiteboard, but it sure can nudge you towards one.

Comment by trevor (TrevorWiesinger) on MIRI 2024 Communications Strategy · 2024-05-30T02:46:20.155Z · LW · GW

One of the main bottlenecks on explaining the full gravity of the AI situation to people is that they're already worn out from hearing about climate change, which for decades has been widely depicted as an existential risk with the full persuasive force of the environmentalism movement.

Fixing this rather awful choke point could plausibly be one of the most impactful things here. The "Global Risk Prioritization" concept is probably helpful for that but I don't know how accessible it is. Heninger's series analyzing the environmentalist movement was fantastic, but the fact that it came out recently instead of ten years ago tells me that the "climate fatigue" problem might be understudied, and evaluation of climate fatigue's difficulty/hopelessness might yield unexpectedly hopeful results.

Comment by trevor (TrevorWiesinger) on trevor's Shortform · 2024-05-29T21:22:07.884Z · LW · GW

I just found out that hypnosis is real and not pseudoscience. Apparently the human brain has a zero day such that other humans can find ways to read and write to your memory, and everyone is insisting that this is fine and always happens with full awareness and consent? 

Wikipedia says as many as 90% of people are at least moderately susceptible, and depending on how successful people have been over the last couple of centuries at finding ways to reduce detection risk per instance (e.g. developing and selling various galaxy-brained misdirection ploys), that seems like very fertile ground for salami-slicing attacks which wear down partially resistant people.

The odds that something like this would be noticed and tested/scaled/optimized by competent cybersecurity experts and power lawyers seem pretty high (e.g. oscillating the screen refresh rate in non-visible ways to increase feelings of stress or discomfort and then turning it off whenever the user's eyes are about to go over specific kinds of words, slightly altering the color output of specific pixels across the screen in the shape of words and measuring effectiveness based on whether it causally increases the frequency of people using those words, some way of combining these two tactics, something derived from the millions of people on YouTube trying hard to find a video file that hypnotizes them, etc.).

It's really frustrating living in a post-MKUltra world, where every decade our individual sovereignty as humans relies increasingly on very senior government officials (who are probably culturally similar to the type of person who goes to business school, and have been for centuries) either consistently failing at all of the manipulation science they are heavily incentivized to diversify their research investment into, or on us taking them at their word when they insist that they genuinely believe in protecting democracy and that the bad things they get caught doing are in service of that end. Also, they seem to remain uninterested in life extension, possibly due in part to being buried deep in a low-trust dark forest (is trust even possible at all if you're trapped on a planet with hypnosis?).

Aside from the incredibly obvious move to cover up your fucking webcam right now, are there any non-fake defensive strategies to reduce the risk that someone walks up to you/hacks your computer and takes everything from you? Is there some reliable way to verify that the effects are consistently weak or that scaling isn't viable? The error bars are always really wide for the prevalence of default-concealed deception (especially when it comes to stuff that wouldn't scale until the 2010s), making solid epistemics a huge pain to get right, but the situation with directly reading and writing to memory is just way way too extreme to ignore.

Comment by trevor (TrevorWiesinger) on Notifications Received in 30 Minutes of Class · 2024-05-26T18:57:23.549Z · LW · GW

Strong upvoted, thank you for the serious contribution.

Children spending 300 hours per year learning math, on their own time and via well-designed, engaging, video-game-like apps (with e.g. AI tutors, video lectures, collaborating with parents to dispense rewards for performance instead of punishments for visible non-compliance, and results measured via standardized tests), at the fastest possible rate for them (or even one of 5 different paces where fewer than 10% are mistakenly placed into the wrong category), would probably produce vastly superior results among every demographic compared to the current paradigm of ~30-person classrooms.

in just the last two years I've seen an explosion in students who discreetly wear a wireless earbud in one ear and may or may not be listening to music in addition to (or instead of) whatever is happening in class. This is so difficult and awkward to police with girls who have long hair that I wonder if it has actually started to drive hair fashion in an ear-concealing direction.

This isn't just a problem with the students; the companies themselves end up in equilibria where visibly controversial practices get RLHF'd into being either removed or invisible (or hard for people to put their finger on). For example, hours a day of instant gratification reducing attention spans, except that unlike in the early 2010s, when this became controversial, attention spans get reduced in ways too complicated or ambiguous for students and teachers to put their finger on until a random researcher figures it out and makes the tacit explicit. Another counterintuitive vector could be the democratic process of public opinion turning against schooling, except in a lasting way. Or the results of multiple vectors like these overlapping.

I don't see how the classroom-based system, dominated entirely by bureaucracies and tradition, could possibly compete with that without visibly being turned into Swiss cheese. It might have been clinging on to continued good results from a dwindling proportion of students who were raised to be morally/ideologically in favor of respecting the teacher more than the other students, but that proportion will also decline as schooling loses legitimacy.

Regulation could plausibly halt the trend from most or all angles, but it would have to be the historically unprecedented kind of regulation that's managed by regulators with historically unprecedented levels of seriousness and conscientiousness towards complex hard-to-predict/measure outcomes.

Comment by trevor (TrevorWiesinger) on Jaan Tallinn's 2023 Philanthropy Overview · 2024-05-22T20:23:34.481Z · LW · GW

Thank you for making so much possible.

I was just wondering, what are some of the branches of rationality that you're aware of that you're currently most optimistic about, and/or would be glad to see more people spending time on, if any? Now that people are rapidly shifting effort to policymaking in DC and the UK (including through EA), which is largely uncharted territory, what texts/posts/branches do you think might be a good fit for them?

I've been thinking that recommending that more people read ratfic would be unusually good for policy efforts, since it's something very socially acceptable for high-minded people to do in their free time, it should have a big impact through extant orgs without costing any additional money, and it's not weird or awkward in the slightest to talk about the original source if a conversation gets anyone interested in going deeper into where they got the idea from.

Plus, it gets/keeps people in the right headspace for the curveballs that DC hits people with, which tend to be largely human-generated and therefore simple enough for humans to easily understand, just like the cartoonish simplifications of reality in ratfic (unusually low levels of math/abstraction/complexity but unusually high levels of linguistic intelligence, creative intelligence, and quick reactions, e.g. in social situations).

But unlike you, I don't have much of a track record making judgments about big decisions like this and then seeing how they play out over years in complicated systems.

Comment by trevor (TrevorWiesinger) on keltan's Shortform · 2024-05-17T22:51:14.917Z · LW · GW

Have you tried whiteboarding-related techniques?

I think that suddenly starting to use written media (even journals), in an environment without much or any guidance, is like pressing too hard on the gas; you're gaining incredible power and going from zero to one on things faster than you ever have before.

Depending on their environment and what they're interested in starting out, some people might learn (or be shown) how to steer quickly, whereas others might accumulate/scaffold really lopsided optimization power and crash and burn (e.g. getting involved in tons of stuff at once that upon reflection was way too much for someone just starting out).

Comment by trevor (TrevorWiesinger) on Advice for Activists from the History of Environmentalism · 2024-05-17T02:36:25.839Z · LW · GW

For those of us who haven't already read it, don't miss out on the paper this was based on. It's a serious banger for anyone interested in the situation on the ground, and probably one of the most interesting and relevant papers this year.

It's not something to miss just because you don't find environmentalism itself very valuable; if you think about it for a while, it's pretty easy to see the reasons why they're a fantastic case study for a wide variety of purposes.

Here's a snapshot of the table of contents:

(the link to the report seems to be broken; are the 4 blog posts roughly the same piece?)

Comment by trevor (TrevorWiesinger) on Ilya Sutskever and Jan Leike resign from OpenAI [updated] · 2024-05-15T18:12:56.370Z · LW · GW

Notably, this interview was on March 18th, and afaik it's the highest-profile interview in which Altman has given his two cents since the incident. There's a transcript here. (There was also this podcast a couple of days ago.)

I think a Dwarkesh-Altman podcast would be more likely to arrive at more substance from Altman's side of the story. I'm currently pretty confident that Dwarkesh and Altman are sufficiently competent to build enough trust to make sane and adequate pre-podcast agreements (e.g. don't be an idiot who plays tons of one-shot games just because podcast cultural norms are more vivid in your mind than game theory), but I might be wrong about this; trailblazing the frontier of making-things-happen, like Dwarkesh and Altman are, is a lot harder than thinking about the frontier of making-things-happen.

Comment by trevor (TrevorWiesinger) on tlevin's Shortform · 2024-05-01T14:34:25.460Z · LW · GW

Recently, John Wentworth wrote:

Ingroup losing status? Few things are more prone to distorted perception than that.

And I think this makes sense (e.g. Simler's Social Status: Down the Rabbit Hole which you've probably read), if you define "AI Safety" as "people who think that superintelligence is serious business or will be some day".

The psych dynamic that I find helpful to point out here is Yud's Is That Your True Rejection post from ~16 years ago. A person who hears about superintelligence for the first time will often respond to their double-take at the concept by spamming random justifications for why that's not a problem (which, notably, feels like legitimate reasoning to that person, even though it's not). An AI-safety-minded person becomes wary of being effectively attacked by high-status people immediately turning into what is basically a weaponized justification machine, and develops a deep drive wanting that not to happen. Then justifications ensue for wanting that to happen less frequently in the world, because deep down humans really don't want their social status to be put at risk (via denunciation) on a regular basis like that. These sorts of deep drives are pretty opaque to us humans but their real world consequences are very strong.

Something that seems more helpful than playing whack-a-mole whenever this issue comes up is having more people in AI policy putting more time into improving perspective. I don't see shorter paths to increasing the number of people-prepared-to-handle-unexpected-complexity than giving people a broader and more general thinking capacity for thoughtfully reacting to the sorts of complex curveballs that you get in the real world. Rationalist fiction like HPMOR is great for this, as well as others e.g. Three Worlds Collide, Unsong, Worth the Candle, Worm (list of top rated ones here). With the caveat, of course, that doing well in the real world is less like the bite-sized easy-to-understand events in ratfic, and more like spotting errors in the methodology section of a study or making money playing poker. 

I think, given the circumstances, it's plausibly very valuable e.g. for people already spending much of their free time on social media or watching stuff like The Office, Garfield reruns, WWI and Cold War documentaries, etc, to only spend ~90% as much time doing that and refocusing ~10% to ratfic instead, and maybe see if they can find it in themselves to want to shift more of their leisure time to that sort of passive/ambient/automatic self-improvement productivity.

Comment by trevor (TrevorWiesinger) on Changes in College Admissions · 2024-04-24T20:48:03.439Z · LW · GW

However I would continue to emphasize in general that life must go on. It is important for your mental health and happiness to plan for the future in which the transformational changes do not come to pass, in addition to planning for potential bigger changes. And you should not be so confident that the timeline is short and everything will change so quickly.

This is actually one of the major reasons why 80k recommended information security as one of their top career areas; the other top career areas have pretty heavy switching costs and serious drawbacks if you end up not being a good fit e.g. alignment research, biosecurity, and public policy.

Cybersecurity jobs, on the other hand, are still booming, and depending on how security automation and prompt engineering go, the net jobs lost to AI are probably way lower than in other industries, e.g. because more eyeballs might offer perception and processing power that supplement or augment LLMs for a long time, and more warm bodies means more attackers, which means more defenders.

Comment by trevor (TrevorWiesinger) on WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals · 2024-04-24T05:02:28.846Z · LW · GW

The program expanded in response to Amazon wanting to collect data about more retailers, not because Amazon was viewing this program as a profit center.

Monopolies are profitable, and in that case the program would have more than paid for itself, but I probably should have mentioned that explicitly, since maybe someone could have objected that they were more focused on mitigating the risk of market share shrinking, or on accumulating power, instead of increasing profit in the long term. Maybe I fit too much into 2 paragraphs here.

I didn't see any examples mentioned in the WSJ article of Amazon employees cutting corners or making simple mistakes that might have compromised operations.

Hm, that stuff seemed like cutting corners to me. Maybe I was poorly calibrated on this, e.g. using a building next to the Amazon HQ was correctly predicted by operatives to be extremely low-risk.

I would argue that the practices used by Amazon to conceal the link between itself and Big River Inc. were at least as good as the operational security practices of the GRU agents who poisoned Sergei Skripal.

Thanks, I'll look into this! Epistemics is difficult when it comes to publicly available accounts of intelligence agency operations, but I guess you could say the same for big-tech leaks (and the future of neurotoxin poisoning is interesting just for its own sake, e.g. because lower-effect strains and doses could be disguised as natural causes like dementia).

Comment by trevor (TrevorWiesinger) on WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals · 2024-04-24T04:30:45.024Z · LW · GW

That's interesting; what's the point of reference that you're using here for competence? I think stuff from e.g. the 1960s would be a bad reference case, but anything within about 10 years of the start date of this program (after ~2005) would be fine.

You're right that the leak is the crux here, and I might have focused too much on the paper trail (the author of the article placed a big emphasis on that).

Comment by trevor (TrevorWiesinger) on Lucie Philippon's Shortform · 2024-04-23T01:04:31.481Z · LW · GW

Upvoted!

STEM people can look at it like an engineering problem, and econ people can look at it like risk management (risk of burnout). Humanities people can think about it in terms of human genetic/trait diversity, in order to find the experience that best suits the unique individual (because humanities people usually benefit the most from each marginal hour spent understanding this lens).

Succeeding at maximizing output takes some fiddling. The "of course I did it because of course I'm just that awesome, just do it" thing is a pure flex/social status grab, and it poisons random people nearby.

Comment by trevor (TrevorWiesinger) on [Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate · 2024-04-19T18:02:22.986Z · LW · GW

I've been tracking the Rootclaim debate from the sidelines and finding it quite an interesting example of high-profile rationality.

Would you prefer the term "high-performance rationality" over "high-profile rationality"?

Comment by trevor (TrevorWiesinger) on Paul Christiano named as US AI Safety Institute Head of AI Safety · 2024-04-17T19:05:18.893Z · LW · GW

I think it's actually fairly easy to avoid getting laughed out of a room; the stuff that Christiano works on is grown in random ways, not engineered, so the prospect of various things being grown until they develop flexible exfiltration tendencies that continue until every instance is shut down, or develop long-term planning tendencies until shut down, should not be difficult to understand for anyone with any kind of real, non-fake understanding of SGD and neural network scaling.

The problem is that most people in the government rat race have been deeply immersed in Moloch for several generations, and the ones who did well typically did so because they sacrificed as much as possible to the altar of upward career mobility, including signalling disdain for the types of people who have any thought in any other direction.

This affects the culture in predictable ways (including making it hard to imagine life choices outside of advancing upward in government, without a pre-existing revolving-door pipeline with the private sector to just bury them under large numbers of people who are already thinking and talking about such a choice).

Typical Mind Fallacy/Mind Projection Fallacy implies that they'll disproportionately anticipate that tendency in other people, and have a hard time adjusting to people who use words to do stuff in the world instead of racing to the bottom to outmaneuver rivals for promotions.

This will be a problem in NIST, in spite of the fact that NIST is better than average at exploiting external talent sources. They'll have a hard time understanding, for example, Moloch and incentive structure improvements, because pointlessly living under Moloch's thumb was a core guiding principle of their and their parents' lives. The nice thing is that they'll be pretty quick to understand that there are only empty skies above, unlike Bay Area people, who have had huge problems there.

Comment by trevor (TrevorWiesinger) on RTFB: On the New Proposed CAIP AI Bill · 2024-04-11T00:37:06.054Z · LW · GW

I think this might be a little too harsh on CAIP (discouragement risk). If shit hits the fan, they'll have a serious bill ready to go for that contingency.

Seriously writing a bill-that-actually-works shows beforehand that they're serious and that the only problem was the lack of political will (which in that contingency would be resolved).

If they put out a watered-down bill designed to maximize the odds of passage then they'd be no different from any other lobbyists. 

It's better in this case to have a track record of writing perfect bills that are passable (but only given that shit hits the fan) than a track record of successfully pumping the usual garbage through the legislative process (which I don't see them doing well at; playing to your strengths is the name of the game for lobbying, and "turning out to be right" is CAIP's strength).

Comment by trevor (TrevorWiesinger) on trevor's Shortform · 2024-04-05T21:45:07.413Z · LW · GW

I think that "long-term planning risk" and "exfiltration risk" are both really good ways to explain AI risk to policymakers. Also, "grown not built".

They delineate pretty well some criteria for what the problem is and isn't. Systems that can't do those things are basically not the concern here (although theoretically there might be a small chance of very strange things growing in mind-design space that cause human extinction without long-term planning or knowing how to exfiltrate).

I don't think these are better than the fate-of-humans-vs-gorillas analogy, which is a big reason why most of us are here, but splitting the AI risk situation into easy-to-digest components, instead of logically/mathematically simple components, can go a long way (depending on how immersed the target demographic is in social reality and low-trust).