[Linkpost] George Mack's Razors 2023-11-27T17:53:45.065Z
Altman firing retaliation incoming? 2023-11-19T00:10:15.645Z
Helpful examples to get a sense of modern automated manipulation 2023-11-12T20:49:57.422Z
We are already in a persuasion-transformed world and must take precautions 2023-11-04T15:53:31.345Z
5 Reasons Why Governments/Militaries Already Want AI for Information Warfare 2023-10-30T16:30:38.020Z
Sensor Exposure can Compromise the Human Brain in the 2020s 2023-10-26T03:31:09.835Z
AI Safety is Dropping the Ball on Clown Attacks 2023-10-22T20:09:31.810Z
Information warfare historically revolved around human conduits 2023-08-28T18:54:27.169Z
Assessment of intelligence agency functionality is difficult yet important 2023-08-24T01:42:20.931Z
One example of how LLM propaganda attacks can hack the brain 2023-08-16T21:41:02.310Z
Buying Tall-Poppy-Cutting Offsets 2023-05-20T03:59:46.336Z
Financial Times: We must slow down the race to God-like AI 2023-04-13T19:55:26.217Z
What is the best source to explain short AI timelines to a skeptical person? 2023-04-13T04:29:03.166Z
All images from the WaitButWhy sequence on AI 2023-04-08T07:36:06.044Z
10 reasons why lists of 10 reasons might be a winning strategy 2023-04-06T21:24:17.896Z
What could EA's new name be? 2023-04-02T19:25:22.740Z
Strong Cheap Signals 2023-03-29T14:18:52.734Z
NYT: Lab Leak Most Likely Caused Pandemic, Energy Dept. Says 2023-02-26T21:21:54.675Z
Are there rationality techniques similar to staring at the wall for 4 hours? 2023-02-24T11:48:45.944Z
NYT: A Conversation With Bing’s Chatbot Left Me Deeply Unsettled 2023-02-16T22:57:26.302Z
The best way so far to explain AI risk: The Precipice (p. 137-149) 2023-02-10T19:33:00.094Z
Many important technologies start out as science fiction before becoming real 2023-02-10T09:36:29.526Z
Why is Everyone So Boring? By Robin Hanson 2023-02-06T04:17:20.372Z
There have been 3 planes (billionaire donors) and 2 have crashed 2022-12-17T03:58:28.125Z
What's the best time-efficient alternative to the Sequences? 2022-12-16T20:17:27.449Z
You are better at math (and alignment) than you think 2022-10-13T03:07:52.202Z
What key nutrients are required for daily energy? 2022-09-20T23:30:02.540Z


Comment by trevor (TrevorWiesinger) on The Witness · 2023-12-04T02:29:08.757Z · LW · GW

Strong upvoted because I liked the ending.

This story reminds me of a Twitter debate between Yud and D'Angelo (NOTE: this is from 6 MONTHS AGO and it is a snapshot of their thinking from a specific point in time):

Adam D'Angelo:

What are the strongest arguments against the possibility of an outcome where strong AI is widely accessible but there is a “balance of power” between different AI systems (or humans empowered by AI) that enables the law to be enforced and otherwise maintains stability?

Eliezer Yudkowsky:

That superintelligences that can eg do logical handshakes with each other or coordinate on building mutually trusted cognitive systems, form a natural coalition that excludes mammals. So there's a balance of power among them, but not with us in it.


You could argue a similar thing about lawyers, that prosecutors and defense lawyers speak the same jargon and have more of a repeated game going than citizens they represent. And yet we have a system that mostly works.


Lawyers combined cannot casually exterminate nonlawyers.


Even if they could (and assuming AGI could) they wouldn’t want to; it would be worse for them than keeping the rest of humanity alive, and also against their values. So I wouldn’t expect them to.


I agree that many lawyers wouldn't want to exterminate humanity, but building at least one AGI like that is indeed the alignment problem; failing that, an AGI coalition has no instrumental interest in protecting us.


Can you remind us again of the apparently obvious logic that the default behavior for an AGI is to want to exterminate us?


1: You don't want humans building other SIs that could compete with you for resources. 
2: You may want to do large-scale stuff that eg builds lots of fusion plants and boils the oceans as a heatsink for an early burst of computation. 
3: You might directly use those atoms.


For 1, seems much easier to just stop humans from doing that than to exterminate them all. 
For 2, if you have that kind of power it's probably easy to preserve humanity. 
For 3, I have trouble seeing a shortage of atoms as being a bottleneck to anything.

David Xu:

1: With a large enough power disparity, the easiest way to “stop” an opponent from doing something is to make them stop existing entirely. 
2: Easier still to get rid of them, as per 1. 
3: It’s not a bottleneck; but you’d still rather have those atoms than not have them.


1: Humans aren't a single opponent. If an ant gets into my house I might kill it but I don't waste my time trying to kill all the ants outside. 
2: This assumes no value placed on humans, which I think is unlikely 
3: But it takes energy, which likely is a bottleneck


If you have literally any noticeable value placed on humans living happily ever after, it's a trivial cost to upload their mind-states as you kill their bodies, and run them on a basketball-sized computer somewhere in some galaxy, experiencing a million years of glorious transhumanist future over the course of a few days - modulo that it's not possible for them to have real children - before you shut down and repurpose the basketball. 

We do not know how to make any AGI with any goal that is maximally satisfied by doing that rather than something else. Mostly because we don't know how to build an AGI that ends up with any precise preference period, but also because it's not trivial to specify a utility function whose maximum falls there rather than somewhere else. If we could pull off that kind of hat trick, even to the extent of getting a million years of subjective afterlife over the course of a few days, we could just as easily get all the galaxies in reach for sentient life.

Comment by trevor (TrevorWiesinger) on Out-of-distribution Bioattacks · 2023-12-02T17:10:33.676Z · LW · GW

Strong upvoted. I'm really glad that people like you are thinking about this.

Something that people often miss with bioattacks is the economic dimension. After the 2008 financial crisis, economic failure/collapse became perhaps the #1 goalpost of the US-China conflict.

It's even debatable whether the 2008 financial crisis was the cause of the entire US-China conflict (e.g. lots of people in DC and Beijing would put the odds at >60% that >50% of the current US-China conflict was caused by the 2008 recession alone, in contrast to other variables like the emergence of unpredictable changes in cybersecurity).

Unlike conventional war e.g. over Taiwan and cyberattacks, economic downturns have massive and clear effects on the balance of power between the US and China, with very little risk of a pyrrhic victory (I don't currently know how this compares to things like cognitive warfare which also yield high-stakes victories and defeats that are hard to distinguish from natural causes).

Notably, the imperative to cause massive economic damage, rather than destroy the country itself, allows attackers to ratchet down the lethality as far as they want, so long as it's enough to cause lockdowns which cause economic damage (maybe mass IQ reduction or other brain effects could achieve this instead). 

GOF research is filled with people who spent >5 years deeply immersed in a medical perspective e.g. virology, so it seems fairly likely to me that GOF researchers will think about the wider variety of capabilities of bioattacks, rather than inflexibly sticking to the bodycount-maximizing mindset of the Cold War.

I think that due to disorganization and compartmentalization within intelligence agencies, as well as unclear patterns of emergence and decay of competent groups of competent people, it's actually more likely that easier-access biological attacks would first be caused by radicals with privileged access within state agencies or state-adjacent organizations (like Booz Allen Hamilton, or the Internet Research Agency which was accused of interfering with the 2016 election on behalf of the Russian government). 

These radicals might incorrectly (or even correctly) predict that their country is a sinking ship and that the only way out is to personally change the balance of power; theoretically, they could even correctly predict that they are the only ones left competent enough to do this before it's too late.

Comment by trevor (TrevorWiesinger) on Stupid Question: Why am I getting consistently downvoted? · 2023-11-30T01:57:08.346Z · LW · GW

It looks like bad luck.

  1. It seems that you're averse to social status, or reject the premise in some way. That is a common cause of self-deprecation. The dynamics underlying social status, and the downstream effects, are in fact awful and shouldn't exist. It makes sense to look at the situation with social status and recoil in horror and disgust. I did something similar from 2014-2016, declined to participate, and it made my life hell. A proper fix is not currently within reach (accelerating AI might do a lot, building aligned AGI almost certainly will), and failing to jump through the hoops will make everything painful for you and the people around you (or at least unpleasant). Self-deprecation requires way more charisma than it appears, since the people who pull it off are merely pretending to throw away social status; we are social status-pursuing monkeys in a very deep way, and hemorrhaging one's own social status for real is the wrong move in our civilization's current form. This will be fixed eventually, unless we all die. Until then, "is this cringe" is a surprisingly easy subroutine to set up; I know, I've done it.
  2. Read Ngo's AI safety from first principles summary in order to make sure you're not missing anything important, and the Superintelligence FAQ and the Unilateralists Curse if that's not enough and you still get the sense that you're not on the same page as everyone else.
  3. If all of this seems a bit much, amplify daily motivation by reading the Execute by default, AI timelines dialog, Buck's freshman year, and What to do in response to an emergency.

Comment by trevor (TrevorWiesinger) on The Alignment Problem · 2023-11-30T00:39:41.570Z · LW · GW

Retracted because I used the word "fundamentally" incorrectly, resulting in a mathematically provably false statement (in fact it might be reasonable to assume that neural networks are both fundamentally predictable and even fundamentally explainable, although I can't say for sure since as of Nov 2023 I don't have a sufficient understanding of chaos theory). They sure are unpredictable and unexplainable right now, but there's nothing fundamental about that.

This comment shouldn't have been upvoted by anyone. It said something that isn't true.

Comment by trevor (TrevorWiesinger) on Lying Alignment Chart · 2023-11-29T16:37:41.942Z · LW · GW

I think this is a neat model improvement from Scott Alexander's list of media lies from his series on media/news companies:

  1. Reasoning well, and getting things right
  2. Reasoning well, but getting things wrong because the world is complicated and you got unlucky.
  3. Reasoning badly, because you are dumb.
  4. Reasoning badly, because you are biased, and on some more-or-less subconscious level not even trying to reason well.
  5. Reasoning well, having a clear model of the world in your mind, but more-or-less subconsciously and unthinkingly presenting technically true facts in a deceptive way that leaves other people confused, without ever technically lying.
  6. Reasoning well, having a clear model of the world in your mind, but very consciously, and with full knowledge of what you’re doing, presenting technically true facts in a deceptive way intended to make other people confused, without ever technically lying.
  7. Reasoning well, having a clear model of the world in your mind, and literally lying and making up false facts to deceive other people.

In a perfect world, we would have separate words for all of these. In our own world, to save time and energy we usually apply a few pre-existing words to all of them.

(I think that last statement is wrong; we aren't applying a few pre-existing words in order to save time, we're applying pre-existing words because the millions of people who created and established the use of those few pre-existing words were largely clueless about the differences between these 7 separate instances, because Scott Alexander wrote this list in 2023 instead of hundreds of years ago).

Comment by trevor (TrevorWiesinger) on My techno-optimism [By Vitalik Buterin] · 2023-11-29T16:04:37.178Z · LW · GW

Strong epistemic upvoted; this is very helpful for any reader. I only wrote the original comment because I thought it was worth putting out there.

I'm still glad I included Bostrom's infographic though.

Comment by trevor (TrevorWiesinger) on [Linkpost] George Mack's Razors · 2023-11-28T19:35:29.629Z · LW · GW

I think that came from James Clear's Atomic Habits, talking about how if you get 1% better at something every day, then you get >30 times better at it after a year (1.01^365 ≈ 37.8). But it has to be something where improvement by a factor of 30 is possible e.g. running a mile.
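To make the arithmetic concrete, here is a minimal sketch of the compounding calculation. The function names `compound_growth` and `linear_growth` are made up for illustration, and the linear version is only an assumed point of comparison (gaining 1% of your starting ability each day, with no compounding); it is not something from the book.

```python
def compound_growth(daily_rate: float, days: int) -> float:
    """Multiplicative improvement when each day's gain builds on the last."""
    return (1.0 + daily_rate) ** days


def linear_growth(daily_rate: float, days: int) -> float:
    """Improvement when each day adds a fixed fraction of the starting level."""
    return 1.0 + daily_rate * days


if __name__ == "__main__":
    days = 365
    rate = 0.01  # 1% per day
    print(f"compounding: {compound_growth(rate, days):.2f}x")
    print(f"linear:      {linear_growth(rate, days):.2f}x")
```

The gap between the two numbers is the whole point of the razor: the >30x figure only shows up if each day's improvement actually feeds into the next day's.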

I think it makes sense that you can repeatedly get 30x better at, say, reducing p(doom), especially if you're starting from zero, but the 1% per day dynamic depends on how different types of things compound (e.g. applying the techniques from the CFAR handbook compounding with getting better at integrating Bayesian thinking into your thoughts, and how those compound with getting an intuitive understanding of the Yudkowsky-Christiano debate or AI timelines).

Comment by trevor (TrevorWiesinger) on Could Germany have won World War I with high probability given the benefit of hindsight? · 2023-11-28T02:18:49.627Z · LW · GW

I think it's definitely possible that it increases defection rates and/or decreases morale among the officers, or that it completely bounces off most of the troops or increases defection rates there. Especially because you can't test it on officers and measure effectiveness in the environment of long trench wars, where nihilism ran rampant, because that environment wouldn't exist until it was far too late to use it as a testing environment.

But propaganda and war recruitment were generally pretty inferior to what exists today, e.g. the world's best psychologist was Sigmund Freud and behavioral economics was ~a century away. They were far worse than most people today at writing really good books that are easy to read and that anyone could enjoy, and the contemporary advances in propaganda that they did have resulted in massive and unprecedented scaling in nationalism and war capabilities, even though what they had at the time was vastly less effective than what we're used to today.

Comment by trevor (TrevorWiesinger) on Could Germany have won World War I with high probability given the benefit of hindsight? · 2023-11-28T00:40:17.799Z · LW · GW

I don't really see how you lose; you have a cultural renaissance, an economic boom, and a coordination takeoff in your pocket, and you have substantial degrees of freedom to convert it into German Nationalism that's an order of magnitude memetically stronger than the original WW1. 

The risk comes from Britain and France getting their own cultural renaissance, and that's actually a pretty easy fix; just insult the French and British every single time you write something, and that will probably be enough.

Comment by trevor (TrevorWiesinger) on My techno-optimism [By Vitalik Buterin] · 2023-11-28T00:14:15.107Z · LW · GW

Vitalik's take is galaxy-brained (is there an opposite of the term "scissor statement"?). Bostrom published the paper Existential Risk as a Global Priority in 2013 containing this picture:

Existential Risks: Threats to Humanity's Survival

and Yudkowsky probably already wrote a ton about this ~15 years ago, and yet both of them seem to have failed to rise to the challenge today of resolving the escalating situation with e/acc, at least not to this degree of effectiveness. Yudkowsky was trying to get massive twitter dunks and Bostrom was trying to bait controversy to get podcast views, and that sure looks a lot like both of them ended up as slaves to the algorithm's attention-maximizing behavior reinforcement (something something thermodynamic downhill).

Comment by trevor (TrevorWiesinger) on Could Germany have won World War I with high probability given the benefit of hindsight? · 2023-11-27T23:51:18.945Z · LW · GW

How much of your knowledge of game theory, FDT, and prediction markets/forecasting can you bring?

You wouldn't need to start with respect. People are much better at writing now than they were 100 years ago, on an instant-gratification basis. Plus, the Hobbit was 1937, I, Robot was the 1950s, Dune was 1965, so you'd be inspired by HPMOR and Three Body Problem and Hyperion while all your competitors would be running off of Frankenstein and Sherlock Holmes and Shakespeare. Not to mention real stuff like the Sequences, and the Extropian list, and what actually went down with the people involved in AI alignment from 2016-2023. 

Anyone here could do what Yudkowsky did if they went to 1904. The world's best behavioral economist was Sigmund Freud. Most people would quickly notice what they're capable of, after having nothing to do in their spare time but read Frankenstein and Sherlock Holmes and Shakespeare.

Write an awesome book about German greatness, optimize for maximizing troop morale, but explain things well enough that most German officers can understand it. If you depict the French and British as sufficiently stupid and crude, it will probably never occur to a single British or French person to plagiarize it and use it for their own side's propaganda. It would probably get translated word-for-word and get super popular in Britain and France anyway.

Comment by trevor (TrevorWiesinger) on [Linkpost] George Mack's Razors · 2023-11-27T21:37:20.629Z · LW · GW

Absolutely! I like some of these razors better than others.

Competence assessment is way more accurate in some fields than others e.g. software engineering (does the code run?) vs legal work or policy research (where have they worked before?).

Comment by trevor (TrevorWiesinger) on why did OpenAI employees sign · 2023-11-27T16:03:42.906Z · LW · GW

Citing a relevant part of the Lex Fridman interview (transcript) which people will probably find helpful to watch, so you can at least eyeball Altman's facial expressions:

LEX FRIDMAN: How do you hire? How do you hire great teams? The folks I’ve interacted with, some of the most amazing folks I’ve ever met.

SAM ALTMAN: It takes a lot of time. I mean, I think a lot of people claim to spend a third of their time hiring. I for real truly do. I still approve every single hire at OpenAI. And I think we’re working on a problem that is like very cool and that great people want to work on. We have great people and people want to be around them. But even with that, I think there’s just no shortcut for putting a ton of effort into this.

I think it's also important to do three-body-problem thinking with this situation; it's also possible that Microsoft or some other third party might have gradually but successfully orchestrated distrust/conflict between two good-guy factions or acquired access to the minds/culture of OpenAI employees, in which case it's critical for the surviving good guys to mitigate the damage and maximize robustness against third parties in the future. 

For example, Altman was misled to believe that the board was probably compromised and he had to throw everything at them, and the board was misled to believe that Altman was hopelessly compromised and they had to throw everything at him (or maybe one of them was actually compromised). I actually wrote about that 5 days before the OpenAI conflict started (I'd call that a fun fact but not a suspicious coincidence, because things are going faster now, 5 days in 2023 is like 30 days in 2019 time).

Comment by trevor (TrevorWiesinger) on What are the results of more parental supervision and less outdoor play? · 2023-11-26T19:11:57.634Z · LW · GW

But you can’t create the social environment that existed when all the kids had less supervision. This isn’t just the “someone will call the police” fear; it’s more prosaic too. At some point other parents will view you as suspect and won’t let their kids play with yours, which defeats some of the purpose.

If we're venturing into historically unprecedented changes in child upbringing either way, then it might be good to keep in mind that children spending time with other children is important for developing social skills in preparation for harsher social environments later on, but introspection and time spent talking with smart adults/tutors might result in substantially improved intelligence by the time they become adults.

As gaining approval from other parents becomes increasingly costly to the children themselves (due to other parents hovering and expecting you to consistently hover), it might be a good idea to just reduce stuff (including playdates and birthday parties) that requires costly investment in getting other parents' approval.

Plus, other kids will basically impose their phones and tablets on your kids, notably games and services like TikTok and Youtube (kid version) and intensely attention-optimized repetitive mobile games which are far more harmful than Minecraft (which Kelsey Piper claimed her friend's kids generally preferred over any other activity).

Comment by trevor (TrevorWiesinger) on Announcing New Beginner-friendly Book on AI Safety and Risk · 2023-11-25T19:38:52.358Z · LW · GW

Strong upvoted. I myself still don't know what form public outreach should take; the billionaires we've had so far (Jaan, Dustin, etc) were the cute and cuddly and friendly billionaires, and there are probably some seriously mean mother fuckers in the greater ecosystem.

However, I was really impressed by the decisions behind WWOTF, and MIRI's policies before and after MIRI's shift. I still have strong sign uncertainty for the scenario where this book succeeds and gets something like 10 million people thinking in AI safety. We really don't know what that world would look like, e.g. it could end up as the same death game as right now but with more players. 

But one way or another, it is probably highly valuable and optimized reference material for getting an intuitive sense for how to explain AI safety to people, similar to Scott Alexander's Superintelligence FAQ which was endorsed as #1 by Raemon, or the top ~10% of the AI safety arguments competition.

Comment by trevor (TrevorWiesinger) on Raemon's Deliberate (“Purposeful?”) Practice Club · 2023-11-25T03:47:37.959Z · LW · GW

I'm predicting that much of the stuff that causes measurable cognitive improvement will be by the mechanism of making people spend less time on social media or otherwise dithering about on the internet. 

e.g. something like 20% of the measured benefit from things like reading the Sequences, the CFAR handbook, or singlemindedly playing a specific indie game is from being the rare thing sufficient to shake people out of habits formed around the addictive gravity-well of the commercialized internet's click/scroll maximization.

People should not have "shower thoughts"; in the 90s and 2000s people would zone out and have "shower thoughts" while reading books, the extropy email list, and sometimes even watching TV.

Specifically, somewhere around a 20% chance that >30% of the benefit unexpectedly comes from this dynamic, and a 50% chance that 10-30% of the benefit unexpectedly comes from this dynamic. 

If MIRI or CFAR or EA's extremophile ascetics were already successful at getting their best thinkers to consistently spend time thinking or pacing or writing on whiteboards/notes after work, instead of on the commercialized internet, that's a strong update against my hypothesis.

I already expect feedbackloopfirst rationality to cause substantial cognitive enhancement on its own. This problem is a confounding variable; the goal of the practice club is to find ways to amplify intelligence, but the experiments will show high measured effectiveness from things like singlemindedly playing a specific indie game or driving or taking long showers, even though the causal mechanism actually comes from those things increasing the proportion of time a rationalist spends thinking at all, not increasing intelligence or mitigating intelligence-reducing feedback loops.

People need to already be spending 2+ hours a day distraction-free, in order to see if the results are coming from cognitive enhancement, rather than from distraction-removal like long showers or driving.

Comment by trevor (TrevorWiesinger) on OpenAI: The Battle of the Board · 2023-11-22T20:22:31.768Z · LW · GW

From Planecrash page 10:

On the first random page Keltham opened to, the author was saying what some 'Duke' (high-level Government official) was thinking while ordering the east gates to be sealed, which, like, what, how would the historian know what somebody was thinking, at best you get somebody else's autobiographical account of what they claim they were thinking, and then the writer is supposed to say that what was observed was the claim, and mark separately any inferences from the observation, because one distinguishes observations from inferences.

This post generally does well on this point (obviously written by a bayesian, Zvi's coverage is generally superior), except for the claim "Altman started it". We don't know who started it, and most of the major players might not even know who started it. According to Alex A:

Smart people who have an objective of accumulating and keeping control—who are skilled at persuasion and manipulation —will often leave little trace of wrongdoing. They’re optimizing for alibis and plausible deniability. Being around them and trying to collaborate with them is frustrating. If you’re self-aware enough, you can recognize that your contributions are being twisted, that your voice is going unheard, and that critical information is being withheld from you, but it’s not easy. And when you try to bring up concerns, they are very good at convincing you that those concerns are actually your fault.

We do know that Microsoft is a giant startup-eating machine with incredible influence in the US government and military, and also globally (e.g. possibly better at planting OS backdoors than the NSA and undoubtedly irreplaceable in that process for most computers in the world), and has been planting roots into Altman himself for almost a year now.

Microsoft has >50 Altman-types for every Altman at OpenAI (I couldn't find any publicly available information about the correct number of executives at Microsoft). Investors are also savvy, and the number of consultants consulted is not knowable. A friend characterized this basically as "acquiring a ~$50B company for free".

The obvious outcome where the giant powerful tech company ultimately ends up on top is telling (although the gauche affirmation of allegiance to Microsoft in the wake of the conflict is a weak update since that's a reasonable thing for Microsoft to expect due to its vulnerability to FUD) because Microsoft is a Usual Suspect, even if purely internal conflict is generally a better explanation and somewhat more likely to be what happened here.

Comment by trevor (TrevorWiesinger) on OpenAI: Facts from a Weekend · 2023-11-22T08:55:42.272Z · LW · GW

For those of us who don't know yet, criticizing the accuracy of mainstream Western news outlets is NOT a strong bayesian update against someone's epistemics, especially on a site like Lesswrong (doesn't matter how many idiots you might remember ranting about "mainstream media" on other sites, the numbers are completely different here).

There is a well-known dynamic called Gell-Mann Amnesia, where people strongly lose trust in mainstream Western news outlets on a topic they are an expert on, but routinely forget about this loss of trust when they read coverage on a topic that they can't evaluate accuracy on. Western news outlets Goodhart readers by depicting themselves as reliable instead of prioritizing reliability.

If you read major Western news outlets, or are new to major news outlets due to people linking to them on Lesswrong recently, some basic epistemic prep can be found in Scott Alexander's The Media Very Rarely Lies and if it's important, the follow up posts.

Comment by trevor (TrevorWiesinger) on OpenAI: Facts from a Weekend · 2023-11-22T07:38:55.463Z · LW · GW

The Verge article is better; it shows tweets by Toner and Nadella confirming that it wasn't just someone getting access to the OpenAI twitter/x account (unless of course someone acquired access to all the accounts, which doesn't seem likely).

Comment by TrevorWiesinger on [deleted post] 2023-11-22T00:12:49.497Z

Also worth calling out explicitly: There aren't that many derivatives traders in the world, and the profession favors secrecy. I think the total influence of derivatives-trading on elite culture is pretty small.

Can you go into more detail about this? This makes derivative traders sound like the kind of thing that would be close to the heart of power players in wall street e.g. due to facilitating a wide variety of liquidity exchanges in low-trust environments.

In a larger derivatives market, AI would allow power and information to accrue to the large players hosting the computers on the market (and possibly hackers) due to AI improving the effectiveness of predictive analytics running on the data. It wouldn't work nearly as well as social media, but that's largely because social media systems are far more data-rich, and the derivatives environment might benefit from algorithmic advances in order to get better results from less data.

Comment by trevor (TrevorWiesinger) on Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI" · 2023-11-22T00:02:23.522Z · LW · GW

That largely depends on where AI safety's talent has been going, and could go. 

I'm thinking that most of the smarter quant thinkers have been doing AI alignment 8 hours a day and probably won't succeed, especially without access to AI architectures that haven't been invented yet, and most of the people researching policy and cooperation weren't our best.

If our best quant thinkers are doing alignment research for 8 hours a day with systems that probably aren't good enough to extrapolate to the crunch time systems, and our best thinkers haven't been researching policy and coordination (e.g. historically unprecedented coordination takeoffs), then the expected hope from policy and coordination is much higher, and our best quant thinkers should be doing policy and coordination during this time period; even if we're 4 years away, they can mostly do human research for freshman and sophomore year and go back to alignment research for junior and senior year. Same if we're two years away.

Comment by trevor (TrevorWiesinger) on Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI" · 2023-11-21T20:45:44.596Z · LW · GW

I read the whole thing, glad I did. It really makes me think that many of AI safety's best minds are doing technical work like technical alignment 8 hours a day, when it would be better for them to do 2 hours a day to keep their skills honed, and spend 6 hours a day acting as generalists to think through the most important problems of the moment.

They should have shared their reasons/excuses for the firing. (For some reason, in politics/corporate politics, people try to be secretive all the time and this seems-to-me to be very stupid in like 80+% of cases, including this one.)

Hard disagree in the OpenAI case. I'm putting >50% that they were correctly worried about people correctly deducing all kinds of things from honest statements, because AI safety is unusually smart and bayesian. There are literally prediction markets here. 

I'm putting >50% on that alone; also, if the true reason was anything super weird e.g. Altman accepting bribes or cutting deals with NSA operatives, then it would also be reasonable not to share it, even if AI safety didn't have tons of high-agency people that made it like herding cats.

That this makes it a lot harder for our cluster to be trusted to be cooperative/good faith/competent partners in things...

If the things you want people to do differently are costly, e.g. your safer AI is more expensive, but you are seen as untrustworthy, low-integrity, low-transparency, low political competence, then I think you'll have a hard time getting buy-in for it.

I think this gets into the complicated issue of security dilemmas; AI safety has a tradeoff of sovereignty and trustworthiness, since groups that are more powerful and sovereign have a risk of betraying their allies and/or going on the offensive (a discount rate since the risk accumulates over time), but not enough sovereignty means the group can't defend itself against infiltration and absorption.

The situation with slow takeoff means that historically unprecedented things will happen and it's not clear what the correct course of action is for EA and AI safety. I've argued that targeted influence is already a significant risk due to the social media paradigm already being really good at human manipulation by default and due to major governments and militaries already being interested in the use of AI for information warfare. But that's only one potential facet of the sovereignty-tradeoff problem and it's only going to get more multifaceted from here; hence why we need more Rubys and Wentworths spending more hours on the problem.

Comment by trevor (TrevorWiesinger) on Vote on worthwhile OpenAI topics to discuss · 2023-11-21T02:53:30.212Z · LW · GW

Comment by trevor (TrevorWiesinger) on Vote on worthwhile OpenAI topics to discuss · 2023-11-21T02:42:20.549Z · LW · GW

If >80% of Microsoft employees were signed up for Cryonics, as opposed to ~0% now, that would indicate that Microsoft is sufficiently future-conscious to make it probably net-positive for them to absorb OpenAI.

Comment by trevor (TrevorWiesinger) on Vote on worthwhile OpenAI topics to discuss · 2023-11-21T02:39:30.976Z · LW · GW

There is a >80% chance that US-China affairs (including the AI race between the US and China) are an extremely valuable or crucial lens for understanding the current conflict over OpenAI (the conflict itself, not the downstream implications), as opposed to being a merely somewhat-helpful lens.

Comment by trevor (TrevorWiesinger) on Vote on worthwhile OpenAI topics to discuss · 2023-11-21T01:31:05.612Z · LW · GW

Microsoft is not actively aggressive or trying to capture OpenAI, and is largely passive in the conflict, e.g. Sam or Greg or investors approached Microsoft with the exodus and letter idea, and Microsoft was either clueless or misled about the connection between the board and EA/AI safety.

Comment by trevor (TrevorWiesinger) on Vote on worthwhile OpenAI topics to discuss · 2023-11-21T01:22:56.115Z · LW · GW

If most of OpenAI is absorbed into Microsoft, they will ultimately remain sovereign and resist further capture, e.g. using rationality, FDT/UDT, coordination theory, CFAR training, glowfic osmosis, or Microsoft underestimating them sufficiently to fail to capture them further, or overestimating their predictability, or being deterred, or Microsoft not having the appetite to make an attempt at all.

Comment by trevor (TrevorWiesinger) on Vote on worthwhile OpenAI topics to discuss · 2023-11-21T01:10:45.992Z · LW · GW

If a mass exodus happens, it will mostly be the fodder employees, and more than 30% of OpenAI's talent will remain (e.g. if the minds of Ilya and two other people contain more than 30% of OpenAI's talent, and they all stay).

Comment by trevor (TrevorWiesinger) on Vote on worthwhile OpenAI topics to discuss · 2023-11-21T01:00:20.115Z · LW · GW

The conflict was not started by the board, but rather the board reacting to a move made by someone else, or a discovery of a hostile plot previously initiated and advanced by someone else.

Comment by trevor (TrevorWiesinger) on Vote on worthwhile OpenAI topics to discuss · 2023-11-21T00:56:06.450Z · LW · GW

Richard Ngo, the person, signed the letter (as opposed to a fake signature)

Comment by trevor (TrevorWiesinger) on Vote on worthwhile OpenAI topics to discuss · 2023-11-21T00:52:40.800Z · LW · GW

If Microsoft absorbs most of OpenAI, someone should retaliate by publicly revealing/whistleblowing the darkest secrets of Microsoft's core business model, initiate a lawsuit, and/or crash Microsoft's stock as much as possible, so long as they do not violate any contracts or commit any major or minor crimes, and only do socially acceptable behavior like whistleblowing that will not harm AI safety's reputation.

Comment by trevor (TrevorWiesinger) on Vote on worthwhile OpenAI topics to discuss · 2023-11-21T00:46:11.821Z · LW · GW

The letter indicating that ~700 employees will leave if Altman and Brockman do not return contains >50 fake signatures.

Comment by trevor (TrevorWiesinger) on OpenAI: Facts from a Weekend · 2023-11-20T20:00:46.562Z · LW · GW

This conflict has inescapably taken place in the context of US-China competition over AI, as leaders in both countries are well known to pursue AI acceleration for applications like autonomous low-flying nuclear cruise missiles (e.g. in contingencies where military GPS networks fail), economic growth faster than the US/China/rest of the world, and information warfare.

I think I could confidently bet against Chinese involvement; that seems quite reasonable. I can't bet so confidently against US involvement; although I agree that it remains largely unclear, it's also plausible that this situation has a usual suspect and we could have seen it coming. Either way, victory/conquest by ludicrously powerful orgs like Microsoft seems like the obvious default outcome.

Comment by trevor (TrevorWiesinger) on OpenAI Staff (including Sutskever) Threaten to Quit Unless Board Resigns · 2023-11-20T19:32:50.530Z · LW · GW

One explanation that comes to mind is that AI already offers extremely powerful manipulation capabilities and governments are already racing to acquire these capabilities.

I'm very confused about the events that have been taking place, but one factor that I have very little doubt about is that the NSA has acquired access to smartphone operating systems and smartphone microphones throughout the OpenAI building (it's just one building, and a really important one, so it's reasonably likely that it's been bugged as well). Whether they were doing anything with that access is much less clear.

Comment by trevor (TrevorWiesinger) on OpenAI Staff (including Sutskever) Threaten to Quit Unless Board Resigns · 2023-11-20T18:46:46.649Z · LW · GW

The US Natsec community (which is probably decentralized and not a "deep state") has a very strong interest in accelerating AI faster than China and Russia e.g. for use in military hardware like cruise missiles, economic growth in an era of technological stagnation, and for defending/counteracting/mitigating SOTA foreign influence operations e.g. Russian botnets that use AI and user data for targeted manipulation. Current-gen AI is pretty well known to be highly valuable for these uses.

This is what makes "the super dangerous people who already badly want AI" one major hypothesis, but not at all the default explanation. Considering who seems to be benefiting the most, Microsoft (which AFAIK probably has the strongest ties to the military out of the big 5 tech companies), this is pretty clearly worth consideration.

Comment by trevor (TrevorWiesinger) on OpenAI Staff (including Sutskever) Threaten to Quit Unless Board Resigns · 2023-11-20T18:26:50.587Z · LW · GW

This is a good point: much of the data we have comes from leaked operations in South America (e.g. the Church Committee hearings), and CIA operations are probably much easier there than on American soil.

However, there are also different kinds of systems pointed inward which look more like normal power games e.g. FBI informants, or lobbyists forming complex agreements/arrangements (like how their lawyer counterparts develop clever value-handshake-like agreements/arrangements to settle out-of-court). It shouldn't be surprising that domestic ops are more complicated and look like ordinary domestic power plays (possibly occasionally augmented by advanced technology).

The profit motive alone could motivate Microsoft execs to leverage their access to advanced technology to get a better outcome for Microsoft. I was pretty surprised by the possibility that Silicon Valley VCs alone could potentially set up sophisticated operations, e.g. using pre-established connections to journalists to leak false information, or using access to large tech companies with manipulation capabilities (e.g. Andreessen Horowitz's access to Facebook's manipulation research).

Comment by trevor (TrevorWiesinger) on OpenAI Staff (including Sutskever) Threaten to Quit Unless Board Resigns · 2023-11-20T18:10:59.185Z · LW · GW

This is the wrong way to think about this, and an even worse way to write about this. 

I think this comment is closing the Overton window on extremely important concepts, in a really bad way. There probably isn't a "deep state"; we can infer that the US Natsec establishment is probably decentralized, a la lc's "Don't take the organization chart literally", due to the structure of the human brain. I've argued that competent networks of people emerge stochastically based on technological advances like better lie detection technology, whereas Gwern argued that competence mainly emerges based on proximity to the core leadership and leadership's wavering focus.

I've argued pretty persuasively that there are good odds that excessively powerful people at tech companies and intelligence agencies might follow the national interest, and hijack or damage the AI safety community in order to get better positioned to access AI tech.

I'm really glad to see people take this problem seriously, and I'm frankly pretty disturbed that the conflict I sketched out might have ended up happening so soon after I first went public about it a month ago. I think your comment also gets big parts of the problem right e.g. "events at the AI frontier are not something that the national security elite can just allow to unfold unattended" which is well-put. 

But everything is on fire right now, and people have unfairly high standards for getting everything right the first time (it is even more unfair for the world which is ending). Like learning a language, a system as obscure and complicated as this required me to make tons of mistakes before I had something ready for presentation.

Comment by trevor (TrevorWiesinger) on Altman firing retaliation incoming? · 2023-11-19T04:06:52.976Z · LW · GW

Yeah, the impression I get is that if investors are going to the trouble to leak their plot to journalists (even if anonymously), then they are probably hoping to benefit from it acting as a rallying cry/Schelling point. This is a pretty basic thing for intra-org conflicts.

This indicates, at minimum, that the coalition they're aiming for will become stronger than if they had taken the default action and not leaked their plan to journalists. It seems to me more likely that the coalition they're hoping for doesn't exist at all or is too diffused, and they're trying to set people up to join a pro-Altman faction by claiming one is already strong and is due to receive outside support (from "anonymous sources familiar with the matter").

Comment by trevor (TrevorWiesinger) on Altman firing retaliation incoming? · 2023-11-19T02:13:49.405Z · LW · GW

It is really hard to use social media to measure public opinion. Even if Twitter/X doesn't have nearly as much security or influence capability as Facebook/Instagram, botnet accounts run by state-adjacent agencies can still game Twitter's newsfeed algorithm for human users by emulating human behavior and upvoting specific posts.

Social media has never been an environment that is friendly to independent researchers; if it were easy, then foreign intelligence agencies would run circles around independent researchers in order to research advanced strategies to manipulate public opinion (e.g. via their own social media botnets, or merely just knowing what to say when their leaders give speeches).

But yes. E/acc seems to be really fired up about this.

Comment by trevor (TrevorWiesinger) on Altman firing retaliation incoming? · 2023-11-19T01:53:00.825Z · LW · GW

Comment by trevor (TrevorWiesinger) on Sam Altman fired from OpenAI · 2023-11-18T16:55:27.933Z · LW · GW

Can you please link to it or say what app or website this is?

Comment by trevor (TrevorWiesinger) on Sam Altman fired from OpenAI · 2023-11-18T16:48:06.733Z · LW · GW

The human brain seems to be structured such that

  1. Factional lines are often drawn splitting up large groups like corporations, government agencies, and nonprofits, with the lines tracing networks of alliances, and also retaliatory commitments that are often used to make factions and individuals hardened against removal by rivals.
  2. People are nonetheless occasionally purged along these lines, rather than conflicts being resolved through more efficient decision theory like values handshakes.
  3. These conflicts and purges are followed by harsh rhetoric, since people feel urges to search languagespace and find combinations of words that optimize for retaliatory harm against others.

I would be very grateful for sufficient evidence that the new leadership at OpenAI is popular or unpopular among a large portion of the employees, rather than a small number of anonymous people who might have been allied to the purged people. 

I think it might be better to donate that info e.g. message LW mods via the intercom feature in the lower right corner, than to post it publicly.

Comment by trevor (TrevorWiesinger) on Sam Altman fired from OpenAI · 2023-11-18T16:24:15.569Z · LW · GW

This is the market itself, not a screenshot! Click one of the "bet" buttons. An excellent feature.

Comment by trevor (TrevorWiesinger) on Sam Altman fired from OpenAI · 2023-11-18T02:17:11.267Z · LW · GW

I think this makes sense as an incentive for AI acceleration: even if someone is trying to accelerate AI for altruistic reasons, e.g. differential tech development (maybe they calculate that LLMs have better odds of interpretability succeeding because they think in English), they should still lose access to their AI lab shortly after accelerating AI. 

They get so much personal profit from accelerating AI that only people prepared to personally lose it all within 3 years are sacrificing enough to do something as extreme as burning the remaining timeline.

I'm generally not on board with leadership shakeups in the AI safety community, because the disrupted alliance webs create opportunities for resourceful outsiders to worm their way in. I worry especially about incentives for the US natsec community to do this. But when I look at it from the game theory/Moloch perspective, it might be worth the risk, if it means setting things up so that the people who accelerate AI always fail to be the ones who profit off of it, and therefore can only accelerate because they think it will benefit the world.

Comment by trevor (TrevorWiesinger) on Social Dark Matter · 2023-11-17T20:50:43.725Z · LW · GW

In something like 75% of possible futures, this will be the last essay that I publish on LessWrong.  Future content will be available on my substack, where I'm hoping people will be willing to chip in a little commensurate with the value of the writing, and (after a delay) on my personal site (not yet live).

Are you willing to take commissions? If I were a software engineer instead of doing AI governance in DC, I would feel like paying you $5000 every now and then to write on a topic that we're both interested in but I'm more interested in (like what would happen to the world if a critical mass of a million people read WWOTF or HPMOR or dath ilan/duncanverse or signed up for cryopreservation) and post it on LessWrong. I think that's weird enough to be one of the 25% of possible futures where you sometimes post on LW again.

Maybe Lightcone could set up some kind of regranting system where they get a 10% cut for matching wealthy earn-to-givers with less-wealthy brilliant writers ("This post was funded by Sam Samson"); that seems like a potential way to get more Scott Alexanders and Zvis. I have no idea how that kind of thing works though.

Comment by trevor (TrevorWiesinger) on Social Dark Matter · 2023-11-16T22:18:07.620Z · LW · GW

Sorry for a minor nitpick, but it's worth making. It doesn't detract from Duncan's overall point at all, and if people lose Bayes points from every single nitpick, then Duncan nonetheless loses very few from this.

Because they never heard people openly proselytizing Nazi ideology, they assumed that practically no one sympathized with Nazi ideology.

And then one Donald J. Trump smashed into the Overton window like a wrecking ball, and all of a sudden a bunch of sentiment that had been previously shamed into private settings was on display for the whole world to see, and millions of dismayed onlookers were disoriented at the lack of universal condemnation because surely none of us think that any of this is okay??

But in fact, a rather large fraction of people do think this is okay, and have always felt this was okay, and were only holding their tongues for fear of punishment, and only adding their voices to the punishment of others to avoid being punished as a sympathizer. When the threat of punishment diminished, so too did their hesitation.

In the case of political ideologies, it might be hard to create potential energy for intense political views (or transformative technology could suddenly blindside us and just make it really easy). But generally, it's always been really easy for elites to engineer societal/psychological positive feedback loops sufficient to generate millions of new ideological adherents, or at least to take existing sentiments among millions of people and hone them into something specific and targeted like Nazism. In the particular case of political ideologies, it's wrong to assume that because a bunch of Nazis appeared, they were mostly there all along but hidden. 

The other examples in the post are still very solid and helpful afaik (the Autism label itself might be too intractably broken in our current society for practical use, but staying silent about the problem doesn't seem helpful). 

I like this post better than Raemon's Dark Forest Theories, even though Raemon's post is more wordcount-efficient and uses examples that I find more interesting and relevant. I think this post potentially does a very good job getting at the core of why things like Cryonics and Longtermism did not rapidly become mainstream (there might be substantial near-term EV in doing work to understand the unpopular-cryonics problem and the unpopular-longtermism problem, because those two problems are surprisingly closely linked to why human civilization is currently failing horribly at AI safety, and might even generate rapid solutions that are viable for short timelines e.g. providing momentum for rapidly raising the sanity waterline).

Comment by trevor (TrevorWiesinger) on Helpful examples to get a sense of modern automated manipulation · 2023-11-12T22:00:15.945Z · LW · GW

I haven't really looked into the Twitter Files, or the right-wing narratives of FBI/Biden suppression of right-wing views (I do know that Musk and the Right are separate and the overlap isn't necessarily his fault, e.g. criticism of the CDC and Ukraine War ended up consigned to the realm of right-wing clowns regardless of the wishes of the critics).

AFAIK the Twitter Files came nowhere near confirming the level of manipulation technology that I describe here, mostly focusing on covert informal government operatives de facto facilitating censorship in plausibly deniable ways. The reason I put a number as extreme as 95% is that weird scenarios during 2020-22 still count, so long as they describe intensely powerful use of AI and statistical analytics for targeted manipulation of humans at around the level of power I described here.

The whole point is that I'm arguing that existing systems are already powerful and dangerous; it's not a far-off future thing or even 4 years away. If it did end up being ONLY the dumb censorship described in the Twitter Files and by the Right, then that would falsify my model.

Comment by trevor (TrevorWiesinger) on Vote on Interesting Disagreements · 2023-11-08T13:58:09.903Z · LW · GW

The most valuable new people joining AI safety will usually take ~1-3 years of effort to begin to be adequately sorted and acknowledged for their worth, unless they are unusually good at self-promotion e.g. gift of gab, networking experience, and stellar resume.

Comment by trevor (TrevorWiesinger) on Vote on Interesting Disagreements · 2023-11-08T13:46:57.581Z · LW · GW

I now think that it should go back to the binary yes and no responses; adding bells and whistles will complicate things too much.

Comment by trevor (TrevorWiesinger) on Vote on Interesting Disagreements · 2023-11-08T01:56:54.240Z · LW · GW

Any activity or action taken after drinking coffee in the morning will be strongly rewarded/reinforced.