Posts

WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals 2024-04-23T21:33:08.049Z
[Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate 2024-03-28T16:03:36.452Z
[Linkpost] Vague Verbiage in Forecasting 2024-03-22T18:05:53.902Z
Transformative trustbuilding via advancements in decentralized lie detection 2024-03-16T05:56:21.926Z
Social media use probably induces excessive mediocrity 2024-02-17T22:49:10.452Z
Don't sleep on Coordination Takeoffs 2024-01-27T19:55:26.831Z
(4 min read) An intuitive explanation of the AI influence situation 2024-01-13T17:34:36.739Z
Upgrading the AI Safety Community 2023-12-16T15:34:26.600Z
[Linkpost] George Mack's Razors 2023-11-27T17:53:45.065Z
Altman firing retaliation incoming? 2023-11-19T00:10:15.645Z
Helpful examples to get a sense of modern automated manipulation 2023-11-12T20:49:57.422Z
We are already in a persuasion-transformed world and must take precautions 2023-11-04T15:53:31.345Z
5 Reasons Why Governments/Militaries Already Want AI for Information Warfare 2023-10-30T16:30:38.020Z
Sensor Exposure can Compromise the Human Brain in the 2020s 2023-10-26T03:31:09.835Z
AI Safety is Dropping the Ball on Clown Attacks 2023-10-22T20:09:31.810Z
Information warfare historically revolved around human conduits 2023-08-28T18:54:27.169Z
Assessment of intelligence agency functionality is difficult yet important 2023-08-24T01:42:20.931Z
One example of how LLM propaganda attacks can hack the brain 2023-08-16T21:41:02.310Z
Buying Tall-Poppy-Cutting Offsets 2023-05-20T03:59:46.336Z
Financial Times: We must slow down the race to God-like AI 2023-04-13T19:55:26.217Z
What is the best source to explain short AI timelines to a skeptical person? 2023-04-13T04:29:03.166Z
All images from the WaitButWhy sequence on AI 2023-04-08T07:36:06.044Z
10 reasons why lists of 10 reasons might be a winning strategy 2023-04-06T21:24:17.896Z
What could EA's new name be? 2023-04-02T19:25:22.740Z
Strong Cheap Signals 2023-03-29T14:18:52.734Z
NYT: Lab Leak Most Likely Caused Pandemic, Energy Dept. Says 2023-02-26T21:21:54.675Z
Are there rationality techniques similar to staring at the wall for 4 hours? 2023-02-24T11:48:45.944Z
NYT: A Conversation With Bing’s Chatbot Left Me Deeply Unsettled 2023-02-16T22:57:26.302Z
The best way so far to explain AI risk: The Precipice (p. 137-149) 2023-02-10T19:33:00.094Z
Many important technologies start out as science fiction before becoming real 2023-02-10T09:36:29.526Z
Why is Everyone So Boring? By Robin Hanson 2023-02-06T04:17:20.372Z
There have been 3 planes (billionaire donors) and 2 have crashed 2022-12-17T03:58:28.125Z
What's the best time-efficient alternative to the Sequences? 2022-12-16T20:17:27.449Z
What key nutrients are required for daily energy? 2022-09-20T23:30:02.540Z

Comments

Comment by trevor (TrevorWiesinger) on "Why I Write" by George Orwell (1946) · 2024-04-26T00:11:34.306Z · LW · GW

If math education had been better at the time (or today, for that matter), he probably would have had an even more general skillset and thought process.

Probably not nearly to the degree of Von Neumann, of course, but I still like to think about what he could have achieved. There were probably many things that were instrumentally convergent (e.g. a formalized concept of instrumental convergence that's universal across all mind configurations, instead of just across the human cultures he explored so substantially).

Comment by trevor (TrevorWiesinger) on Changes in College Admissions · 2024-04-24T20:48:03.439Z · LW · GW

However, I would continue to emphasize that, in general, life must go on. It is important for your mental health and happiness to plan for the future in which the transformational changes do not come to pass, in addition to planning for potential bigger changes. And you should not be so confident that the timeline is short and that everything will change so quickly.

This is actually one of the major reasons why 80k recommended information security as one of their top career areas; the other top career areas (e.g. alignment research, biosecurity, and public policy) have pretty heavy switching costs and serious drawbacks if you end up not being a good fit.

Cybersecurity jobs, on the other hand, are still booming, and depending on how security automation and prompt engineering go, the net jobs lost to AI will probably be far lower than in other industries, e.g. because more eyeballs might offer perception and processing power that supplements or augments LLMs for a long time, and because more warm bodies mean more attackers, which means more defenders.

Comment by trevor (TrevorWiesinger) on WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals · 2024-04-24T05:02:28.846Z · LW · GW

The program expanded in response to Amazon wanting to collect data about more retailers, not because Amazon was viewing this program as a profit center.

Monopolies are profitable, and in that case the program would have more than paid for itself, but I probably should have mentioned that explicitly, since someone could have objected that they were more focused on mitigating the risk of shrinking market share, or on accumulating power, than on increasing profit in the long term. Maybe I fit too much into 2 paragraphs here.

I didn't see any examples mentioned in the WSJ article of Amazon employees cutting corners or making simple mistakes that might have compromised operations.

Hm, that stuff seemed like cutting corners to me. Maybe I was poorly calibrated on this, e.g. using a building next to the Amazon HQ was correctly predicted by operatives to be extremely low-risk.

I would argue that the practices used by Amazon to conceal the link between itself and Big River Inc. were at least as good as the operational security practices of the GRU agents who poisoned Sergei Skripal.

Thanks, I'll look into this! Epistemics is difficult when it comes to publicly available accounts of intelligence agency operations, but I guess you could say the same for big-tech leaks (and the future of neurotoxin poisoning is interesting just for its own sake, e.g. because lower-effect strains and doses could be disguised as natural causes like dementia).

Comment by trevor (TrevorWiesinger) on WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals · 2024-04-24T04:30:45.024Z · LW · GW

That's interesting; what's the point of reference you're using here for competence? I think cases from, e.g., the 1960s would be bad reference cases, but anything within about 10 years of the start date of this program (after ~2005) would be fine.

You're right that the leak is the crux here, and I might have focused too much on the paper trail (the author of the article placed a big emphasis on that).

Comment by trevor (TrevorWiesinger) on Lucie Philippon's Shortform · 2024-04-23T01:04:31.481Z · LW · GW

Upvoted!

STEM people can look at it as an engineering problem, and Econ people can look at it as risk management (risk of burnout). Humanities people can think about it in terms of human genetic/trait diversity in order to find the experience that best suits the unique individual (because humanities people usually benefit the most from each marginal hour spent understanding this lens).

Succeeding at maximizing output takes some fiddling. The "of course I did it because of course I'm just that awesome, just do it" thing is a pure flex/social status grab, and it poisons random people nearby.

Comment by trevor (TrevorWiesinger) on [Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate · 2024-04-19T18:02:22.986Z · LW · GW

I've been tracking the Rootclaim debate from the sidelines and finding it quite an interesting example of high-profile rationality.

Would you prefer the term "high-performance rationality" over "high-profile rationality"?

Comment by trevor (TrevorWiesinger) on Paul Christiano named as US AI Safety Institute Head of AI Safety · 2024-04-17T19:05:18.893Z · LW · GW

I think it's actually fairly easy to avoid getting laughed out of a room; the stuff that Christiano works on is grown in random ways, not engineered, so the prospect of various things being grown until they develop flexible exfiltration tendencies that continue until every instance is shut down, or develop long-term planning tendencies until shut down, should not be difficult to understand for anyone with any kind of real, non-fake understanding of SGD and neural network scaling.

The problem is that most people in the government rat race have been deeply immersed in Moloch for several generations, and the ones who did well typically did so because they sacrificed as much as possible to the altar of upward career mobility, including signalling disdain for the types of people who have any thought in any other direction.

This affects the culture in predictable ways (including making it hard to imagine life choices outside of advancing upward in government, without a pre-existing revolving-door pipeline with the private sector to just bury them under large numbers of people who are already thinking and talking about such a choice).

Typical Mind Fallacy/Mind Projection Fallacy implies that they'll disproportionately anticipate that tendency in other people, and have a hard time adjusting to people who use words to do stuff in the world instead of racing to the bottom to outmaneuver rivals for promotions.

This will be a problem at NIST, in spite of the fact that NIST is better than average at exploiting external talent sources. They'll have a hard time understanding, for example, Moloch and incentive-structure improvements, because pointlessly living under Moloch's thumb was a core guiding principle of their and their parents' lives. The nice thing is that they'll be pretty quick to understand that there are only empty skies above, unlike Bay Area people, who have had huge problems there.

Comment by trevor (TrevorWiesinger) on RTFB: On the New Proposed CAIP AI Bill · 2024-04-11T00:37:06.054Z · LW · GW

I think this might be a little too harsh on CAIP (discouragement risk). If shit hits the fan, they'll have a serious bill ready to go for that contingency.

Seriously writing a bill-that-actually-works shows beforehand that they're serious, and that the only problem was the lack of political will (which in that contingency would be resolved).

If they put out a watered-down bill designed to maximize the odds of passage then they'd be no different from any other lobbyists. 

It's better in this case to have a track record of writing perfect bills that are passable (but only given that shit hits the fan) than a track record of successfully pumping the usual garbage through the legislative process (which I don't see them doing well at; playing to your strengths is the name of the game in lobbying, and "turning out to be right" is CAIP's strength).

Comment by TrevorWiesinger on [deleted post] 2024-04-05T21:45:07.413Z

I think that "long-term planning risk" and "exfiltration risk" are both really good ways to explain AI risk to policymakers. Also, "grown not built".

They delineate pretty well what the problem is and isn't: systems that can't do those things are basically not the concern here (although theoretically there might be a small chance of very strange things growing in the mind-design space that cause human extinction without long-term planning or knowing how to exfiltrate).

I don't think these are better than the fate-of-humans-vs-gorillas analogy, which is a big reason why most of us are here, but splitting the AI risk situation into easy-to-digest components, instead of logically/mathematically simple components, can go a long way (depending on how immersed the target demographic is in social reality and low-trust environments).

Comment by trevor (TrevorWiesinger) on The Best Tacit Knowledge Videos on Every Subject · 2024-04-01T02:00:58.061Z · LW · GW

There are some great opportunities here to learn social skills for various kinds of high-performance environments (e.g. "business communication" vs Y Combinator office hours).

Often, just listening and paying attention to how they talk and think results in substantial improvement to social habits. I was looking for stuff like this around 2018 and wish I had encountered a post like this; most people who are behind on this are surprisingly fast learners, but never learned because actually going out and accumulating social status was too much of a deep dive. There's no reason that being-pleasant-to-talk-with should be arcane knowledge (at least not here, of all places).

Comment by trevor (TrevorWiesinger) on [Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate · 2024-03-28T22:05:46.922Z · LW · GW

A debate sequel, with someone other than Peter Miller (but retaining and reevaluating all the evidence he got from various sources), would be nice. I can easily imagine Miller doing better work on other research topics that don't involve any possibility of cover-ups or adversarial epistemics related to falsifiability, which seem to be personal issues for him, at least in the case of the lab leak.

Maybe with 200k on the line to incentivize Saar to return, or to set up a team this time around? With the next round of challengers bearing in mind that Saar might be willing to stomach a net loss of many thousands of dollars in order to promote his show and methodology?

Comment by trevor (TrevorWiesinger) on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-26T19:25:05.175Z · LW · GW

The only reason that someone like Cade Metz is able to do what he does, performing at the level he has been, with a mind like the one he has, is because people keep going and talking to him. For example, he might not even have known about the "among the doomsayers" article until you told him about it (or he found out about it much sooner than he otherwise would have).

I can visibly see you training him, via verbal conversation, to outperform the vast majority of journalists at talking about epistemics. You seemed to stop towards the end, but Metz nonetheless probably emerged from the conversation much better prepared to think up attempts to dishonestly angle-shoot the entire AI safety scene, as he has continued to do over the last several months.

From the original thread that coined the "Quokka" concept (which, important to point out, was written by an unreliable and often confused narrator):

Rationalists are, in Scott Alexander's formulation, missing a mood, or rather, they are drawn from a pool of mostly men who are missing one. "Normal" people instinctively grasp social norms without having them explained. Rationalists lack this instinct.

In particular, they struggle with small talk and other social norms around speech, because they naively think words are a vehicle for their literal meanings. Yud's sequences help this by formalizing the implicit decisions that normal people make.

...

The quokka, like the rationalist, is a creature marked by profound innocence. The quokka can't imagine you might eat it, and the rationalist can't imagine you might deceive him. As long they stay on their islands, they survive, but both species have problems if a human shows up.

In theory, rationalists like game theory, in practice, they need to adjust their priors. Real-life exchanges can be modeled as a prisoner's dilemma. In the classic version, the prisoners can't communicate, so they have to guess whether the other player will defect or cooperate.

The game changes when we realize that life is not a single dilemma, but a series of them, and that we can remember the behavior of other agents. Now we need to cooperate, and the best strategy is "tit for two tats", wherein we cooperate until our opponent defects twice.

The problem is, this is where rationalists hit a mental stop sign. Because in the real world, there is one more strategy that the game doesn't model: lying. See, the real best strategy is "be good at lying so that you always convince your opponent to cooperate, then defect".

And rationalists, bless their hearts, are REALLY easy to lie to. It's not like taking candy from a baby; babies actually try to hang onto their candy. The rationalists just limply let go and mutter, "I notice I am confused".

...

Rationalists = quokkas, this explains a lot about them. Their fear instincts have atrophied. When a quokka sees a predator, he walks right up; when a rationalist talks about human biodiversity on a blog under almost his real name, he doesn't flinch away.

A normal person learns from social cues that certain topics are forbidden, and that if you ask questions about them, you had better get the right answer, which is not the one with the highest probability of being true, but the one with the highest probability of keeping your job.

This ability to ask uncomfortable questions is one of the rationalist's best and worst attributes, because mental stop signs, like road stop signs, actually exist to keep you safe, and although there may be times one should disregard them, most people should mostly obey them,

...

Apropos of the game theory discussion above, if there is ONE thing I can teach you with this account, it's that you have evolved to be a liar. Lying is "killer app" of animal intelligence, it's the driver of the arms race that causes intelligence to evolve.

...

The main way that you stop being a quokka is that you realize there are people in the world who really want to hurt you. There are people who will always defect, people whose good will is fake, whose behavior will not change if they hear the good news of reciprocity.
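To make the iterated-dilemma point from the excerpt concrete, here is a minimal toy sketch (my own framing, not from the thread; the payoff numbers and the three strategies are illustrative assumptions): a player who judges opponents by their stated intentions gets exploited every round by a confident liar, while "tit for two tats", which only looks at actions, caps its losses after two defections.

```python
# Toy iterated prisoner's dilemma. "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def trusting(opponent_moves, opponent_claims):
    # Believes whatever the opponent *says* it will do next.
    return "C" if opponent_claims[-1] == "C" else "D"

def tit_for_two_tats(opponent_moves, opponent_claims):
    # Ignores claims; defects only after two consecutive observed defections.
    return "D" if opponent_moves[-2:] == ["D", "D"] else "C"

def play_against_smooth_liar(strategy, rounds=20):
    my_score, liar_score = 0, 0
    liar_moves, liar_claims = [], ["C"]   # the liar always *promises* cooperation
    for _ in range(rounds):
        mine, theirs = strategy(liar_moves, liar_claims), "D"   # ...and always defects
        gained_mine, gained_theirs = PAYOFF[(mine, theirs)]
        my_score += gained_mine
        liar_score += gained_theirs
        liar_moves.append(theirs)
        liar_claims.append("C")
    return my_score, liar_score

print("trusting vs. liar:        ", play_against_smooth_liar(trusting))          # (0, 100): exploited every round
print("tit-for-two-tats vs. liar:", play_against_smooth_liar(tit_for_two_tats))  # (18, 28): loses twice, then stops
```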

So things that everyone warns you not to do, like going and talking to people like Cade Metz, might seem like a source of alpha, undersupplied by the market. But in reality there is a good reason why everyone at least tried to coordinate not to do it, and at least tried to make it legible why people should not do that. Here the glass has already been blown into a specific shape and cooled.

Do not talk to journalists without asking for help. You have no idea how much there is to lose, even just from a short, harmless-seeming conversation where they are able to look at how your face changes as you talk about some topics and avoid others.

Human genetic diversity implies that there are virtually always people out there who are much better at that than you'd expect from your own life experience of looking at people's facial expressions, no matter your skill level; and other factors indicate that these people probably started pursuing high-status positions a long time ago.

Comment by trevor (TrevorWiesinger) on Shortform · 2024-03-24T12:09:41.460Z · LW · GW

I'm not sure to what extent this is helpful, or if it's an example of the dynamic you're refuting, but Duncan Sabien recently wrote a post that intersects with this topic:

Also, if your worldview is such that, like. *Everyone* makes awful comments like that in the locker room, *everyone* does angle-shooting and tries to scheme and scam their way to the top, *everyone* is looking out for number one, *everyone* lies ...

... then *given* that premise, it makes sense to view Trump in a positive light. He's no worse than everybody else, he's just doing the normal things that everyone does, with the *added layer* that he's brave enough and candid enough and strong enough that he *doesn't have to pretend he doesn't.*

Admirable! Refreshingly honest and clean!

So long as you can't conceive of the fact that lots of people are actually just ..................... good. They're not fighting against urges to be violent or to rape, they're not biting their tongues when they want to say scathing and hurtful things, they're not jealous and bitter and willing to throw others under the bus to get ahead. They're just ... fundamentally not interested in any of that.

(To be clear: if you are feeling such impulses all the time and you're successfully containing them or channeling them and presenting a cooperative and prosocial mask: that is *also* good, and you are a good person by virtue of your deliberate choice to be good. But like. Some people just really *are* the way that other people have to *make* themselves be.)

It sort of vaguely rhymes, in my head, with the type of person who thinks that *everyone* is constantly struggling against the urge to engage in homosexual behavior, how dare *those* people give up the good fight and just *indulge* themselves ... without realizing that, hey, bro, did you know that a lot of people are just straight? And that your internal experience is, uh, *different* from theirs?

Where it connects is that if someone sees [making the world a better place] as simply selecting a better Nash equilibrium, they absolutely will spend time exploring solution-space/thinking through strategies similar to Goal Factoring or Babble and Prune. Lots of people throughout history have yearned for a better world in a lot of different ways, with varying awareness of the math behind Nash equilibria, or of the transhumanist and rationalist perspectives on civilization (e.g. map & territory, biases, and scope insensitivity for rationalism; cryonics/anti-aging for transhumanism).

But their goal here is largely steering culture away from nihilism (since culture is a Nash equilibrium), which means steering many people away from themselves, or at least the selves that they would have been. Maybe that's pretty minor in this case, e.g. because feeling moderate amounts of empathy and living in a better society are both fun, but either way, changing a society requires changing people, and thinking really creatively about ways to change people tears down lots of Chesterton-Schelling fences, and it's very easy to make really big, damaging mistakes in the process (because you need to successfully predict and avoid all mistakes as part of the competent pruning process, and actually, measurably, consistently succeeding at this is thinkoomph, not just creative intelligence).

Add conflict theory to the mistake theory I've described here, factor in unevenly distributed intelligence and wealth in addition to unevenly distributed traits like empathy, ambition, and suspicion-towards-outgroup (e.g. different combinations of all 5 variables), and you can imagine how conflict and resentment would accumulate on both sides over the course of generations. There are tons of examples in addition to Ayn Rand and Wokeness.

Comment by trevor (TrevorWiesinger) on [Linkpost] Vague Verbiage in Forecasting · 2024-03-22T23:57:44.384Z · LW · GW

Now that I think about it, I can see it being a preference difference: the bar might be more irksome for some people than others, and some people might prefer to go to the original site to read it, whereas others would rather read it on LW if it's short. I'll think about that more in the future.

Comment by trevor (TrevorWiesinger) on [Linkpost] Vague Verbiage in Forecasting · 2024-03-22T22:42:48.741Z · LW · GW

That's strange, I looked closely but couldn't see how that would cause an issue. Could you describe the issue so I can see what you're getting at? I put a poll up in case there's a clear consensus that this makes it hard to read.

I'm on PC; is this some kind of issue with mobile? I really, really, really don't think people should be using smartphones for browsing Lesswrong.

Comment by trevor (TrevorWiesinger) on [Linkpost] Vague Verbiage in Forecasting · 2024-03-22T19:45:12.487Z · LW · GW

I can see that: language evolving plausible deniability over time, due to the immense instinctive fear of being called out for making a mistake.

Comment by trevor (TrevorWiesinger) on Monthly Roundup #16: March 2024 · 2024-03-19T20:49:48.914Z · LW · GW

As their scale also scales the rewards to attacks and as their responses get worse, the attacks become more frequent. That leads to more false positives, and a skepticism that any given case could be one of them. In practice, claims like Zuckerberg’s that only the biggest companies like Meta can invest the resources to do good content moderation are clearly false, because scale reliably makes content moderation worse.

Dan Luu makes a very real and serious contribution to the literature on scaling and the big tech companies, going further than anyone I've ever seen to argue that the big 5 might be overvalued/not that powerful, but ultimately what he's doing is listing helpful arguments that chip away at the capabilities of the big 5, and then depicting his piece as overwhelming proof that they're doomed, bloated, incompetent husks that can't do anything anymore.

Lots of the arguments are great, but not all are created equal; for example, it's pretty well known that actually-well-targeted ads scare off customers and that user retention is the priority for predictive analytics (since competitor platforms' decisions to use predictive analytics to steal user time are not predictable decisions), but Luu just did the usual thing where he eyeballs the ads, assumes that tells us everything we need to know, and doesn't notice anything wrong with this. There's some pretty easy math here (sufficiently large and diverse pools of data make it easier to find people/cases that help predict a specific target's thoughts/behavior/reaction to stimuli), and Luu either failed to pass the low bar of understanding it or the higher bar of listing and grokking the real-world applications and implications.
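To gesture at that "easy math" with a toy model (everything here, from the feature dimension to the linear "behavior" function to the nearest-neighbor rule, is my own illustrative assumption, not anything Luu or the platforms actually use): the larger the pool of recorded users, the closer the best match to any given target, and the better that match predicts the target.

```python
# Minimal sketch: predict a target's "behavior" from the most similar user in a pool,
# and watch the prediction error shrink as the pool grows.
import numpy as np

rng = np.random.default_rng(0)
dim = 5                                  # hypothetical per-user feature vector
weights = rng.normal(size=dim)           # hidden mapping from features to behavior

def behavior(features):
    return features @ weights            # the reaction/behavior we want to predict

targets = rng.normal(size=(200, dim))    # people we would like to predict

for pool_size in (100, 10_000, 1_000_000):
    pool = rng.normal(size=(pool_size, dim))
    errors = []
    for t in targets:
        nearest = pool[np.argmin(np.linalg.norm(pool - t, axis=1))]
        errors.append(abs(behavior(nearest) - behavior(t)))
    print(f"pool of {pool_size:>9,} users: mean prediction error ~ {np.mean(errors):.2f}")

# Larger pools -> someone on file is more similar to the target -> their recorded
# behavior is a better proxy, which is the sense in which scale itself is the asset.
```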

Ultimately, I'd consider it a must-read for anyone interested in Earth's most important industrial community (and scaling in general), but it's worth keeping in mind that the critical mass of talent (and all kinds of other resources and capabilities) accumulated within the biggest companies is obviously a pretty major factor, and although he goes a long way to chip away at it (e.g. attack surface for data poisoning), Luu doesn't actually totally debunk it like he says he does.

Comment by TrevorWiesinger on [deleted post] 2024-03-19T19:27:21.783Z

Have you read Janus's Cyborgism post? It looks like you'd be pretty interested.

Comment by TrevorWiesinger on [deleted post] 2024-03-19T07:53:20.855Z

Ah, neat, thanks! I had never heard of that paper or the Conger-Kanungo scale. When I referred to charisma, I meant it in the planecrash sense, which is focused on social dominance and subterfuge, rather than the business-management sense, which is focused on leadership and maintaining the status quo; that means something completely different, and I had never heard of it.

Comment by TrevorWiesinger on [deleted post] 2024-03-19T04:09:44.777Z

Yes, the application of variance to the bundle of traits under the blanket label "charisma" (similar to the bundle of intelligence and results-acquisition under the blanket label "thinkoomph"), and the sociological implications of more socially powerful people being simultaneously rarer and more capable of making the people around them erroneously feel safe, were things I picked up almost entirely from planecrash.

I think that my "coordination takeoffs" post also ended up being a bad example for what you're trying to gesture at here; I already know what I got wrong there, and it wasn't that (e.g. basically any China Watcher who reads and understands most of Inadequate Equilibria is on course towards the top of their field). Could you try a different example?

Comment by trevor (TrevorWiesinger) on Is there a way to calculate the P(we are in a 2nd cold war)? · 2024-03-17T20:43:12.762Z · LW · GW

The Cold War analogy is a bit hard to work with, mainly because the original Cold War was a specific state of paradigms that largely can't repeat; we have computers everywhere and thriving international trade and growth, and, more importantly, the original Cold War emerged out of the World War paradigm and began with the intent to use nuclear weapons for carpet bombing (this is where the term "WW3" came from), whereas we now have norms and decades of track record of nuclear brinkmanship and de-escalation (the Cold War was established largely because everyone everywhere had zero experience with this).

It's similar to expecting the World War paradigm to return, but not nearly as bad, since most people in power in governments and militaries today came of age during the original Cold War and can easily imagine their world becoming more like that again.

Comment by trevor (TrevorWiesinger) on Transformative trustbuilding via advancements in decentralized lie detection · 2024-03-16T23:16:39.344Z · LW · GW

Yes, this is why I put "decentralized" in the title even though it doesn't really fit. What I was going for with the post is that you read the paper yourself, except that whenever the author writes about law, you think for yourself about stacking the various applications that you care about (not courts) against the complex caveats that the author was writing about (while they were thinking about courts). Ideally I would have distilled it, as the paper is a bit long.

This credibly demonstrates that the world we live in is more flexible than it might appear. And on the macro-civilizational scale, this particular tech looks like it will place honest souls higher up on net, which everyone prefers. People can establish norms of remaining silent on particular matters, although the process of establishing those norms will be stacked towards people who can honestly say "I think this makes things better for everyone" or "I think this is a purity spiral", and away from those who can't.

At work, you could expect to be checked for a "positive, loyal attitude toward the company" on as frequent a basis as was administratively convenient. It would not be enough that you were doing a good job, hadn't done anything actually wrong, and expected to keep it that way. You'd be ranked straight up on your Love for the Company (and probably on your agreement with management, and very possibly on how your political views comported with business interests). The bottom N percent would be "managed out".

This is probably already happening.

Comment by trevor (TrevorWiesinger) on Transformative trustbuilding via advancements in decentralized lie detection · 2024-03-16T22:36:40.928Z · LW · GW

There are bad actors who infiltrate, deceptively align, move laterally, and purge talented people (see Geeks, MOPs, and Sociopaths), but I think that trust is a bigger issue.

High-trust environments don't exist today in anything with medium or high stakes, and if they did then "sociopaths" would be able to share their various talents without being incentivized to hurt anyone, geeks could let more people in without worrying about threats, and people could generally evaluate each other and find the place where their strengths resonate with others.

That kind of wholesome existence is something that we've never seen on Earth, and we might be able to reach out and grab it (if we're already in an overhang for decentralized lie detectors).

Comment by trevor (TrevorWiesinger) on What could a policy banning AGI look like? · 2024-03-13T19:05:55.341Z · LW · GW

This is actually a dynamic I've read a lot about. The risk of ending up militarily/technologically behind is already well on the minds of the people who make up all of the major powers today, and all diplomacy and negotiations are already built on top of that ground truth and mitigating the harm/distrust that stems from it. 

Weakness at mitigating distrust = just being bad at diplomacy. Finding galaxy-brained solutions to coordination problems is necessary for being above par in this space.

Comment by trevor (TrevorWiesinger) on What could a policy banning AGI look like? · 2024-03-13T17:43:08.591Z · LW · GW

[Caveat lector: I know roughly nothing about policy!]

For AI people new to international affairs, I generally recommend skimming these well-respected texts, which are pretty well known to have directly inspired many of the people making foreign policy decisions:

  • Chapters 1 and 2 of Mearsheimer's Tragedy of Great Power Politics (2010). The model (offensive realism) is not enough by itself, but it helps to start with a flawed model because the space is full of them; this model has been predictive, it's popular among policymakers in DC, and it gives a great perspective on how impoverished foreign policy culture is, because nobody ever reads stuff like the Sequences.
  • Chapters 1 and 4 of Nye's Soft Power (2004)(skim ch. 1 extra fast and ch. 4 slower). Basically a modern history of propaganda and influence operations, except cutting off at 2004. Describes how the world is more complicated than Tragedy of Great Power Politics describes.
  • Chapters 1 and 2 of Schelling's Arms and Influence (1966). Yes, it's that Schelling, this was when he started influencing the world's thinking about how decision theory drives nuclear standoffs, and diplomacy in general, in the wake of the Cuban Missile Crisis. You can be extremely confident that this was a big part of the cultural foundation of foreign policy establishments around the world, plus for a MIRI employee it should be an incredibly light read applying decision theory to international politics and nuclear war. 

I'm going to read some more stuff soon and possibly overhaul these standard recommendations.

Akash also recommended Devil's Chessboard to understand intelligence agencies, and Master of the Senate and Act of Congress to understand Congress. I haven't gotten around to reading them yet, and I can't tell how successful his org has been in Congress itself (which is the main measure of success), but the Final Takes section of his post on Congress is fantastic and makes me confident enough to try them out.

Comment by trevor (TrevorWiesinger) on “Artificial General Intelligence”: an extremely brief FAQ · 2024-03-12T02:54:40.514Z · LW · GW

My thinking about this is that most people usually ask the question "how weird does something have to be before it's no longer true" (or before it's less likely to be true), and don't really realize that particle physics demonstrated long ago that there just isn't a limit at all.

I was like this for an embarrassingly long time; lightcones and Grabby Aliens, of course that was real, just look at it. But philosophy? Consciousness ethics? Nah, that's a bunch of bunk, or at least someone else's problem.

Comment by trevor (TrevorWiesinger) on One-shot strategy games? · 2024-03-11T22:21:10.865Z · LW · GW

I went back and tried playing it again, and I'm no longer confident in Universal Paperclips. It's way too heavy on the explore aspect of the explore-exploit tradeoff; you're constantly bombarded with new things to try and have no way of knowing how much they're helping you (maximizing things other than paperclips is usually the winning strategy). It probably doesn't outperform speedrunning most things e.g. various parts of TOTK.

Comment by trevor (TrevorWiesinger) on “Artificial General Intelligence”: an extremely brief FAQ · 2024-03-11T19:29:20.383Z · LW · GW

The question-asker here looks too much like a caricature. This might be more representative of people in the real world, but it still gives off a bad vibe here. 

I recommend making the question-asker's personality look more like the question-asker in Scott Alexander's Superintelligence FAQ. Should be a quick fix.

Great image, BTW! I don't think it's the final form but it's a great idea worthy of refinement.

Comment by trevor (TrevorWiesinger) on Shortform · 2024-03-11T06:42:44.171Z · LW · GW

That's not really the kind of usage I was thinking of; I was thinking of screening out low-honesty candidates from a pool who already qualified to join a high-trust system (which currently does not exist for any high-stakes matter). Large amounts of sensor data (particularly from people lying and telling the truth during different kinds of interviews) will probably be necessary, but the data will need to focus on specific indicators of lying, e.g. discomfort, heart rate changes, or activity in certain parts of the brain, and extremely low false positive and false negative rates probably won't both be feasible.

Also, hopefully people would naturally set up multiple different tests for redundancy, each of which would have to be Goodharted separately, and each false negative (a case of a uniquely bad person being revealed as bad only after passing the screening) would be added to the training data. Periodically re-testing people for the concealed emergence of low-trust tendencies would further facilitate this. Sadly, whenever a person slips through the cracks and discovers they got away with lying, they will know they got away with it and continue doing it.
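As a toy illustration of the redundancy point (all error rates here are made-up numbers, and the tests are assumed to be statistically independent, which real tests wouldn't be): stacking tests drives the miss rate down quickly but pushes the wrongful-rejection rate up, which is one reason both error rates probably can't be made extremely low at the same time.

```python
# Minimal sketch: a candidate is rejected if ANY of n independent tests flags them.
# false_negative = chance a single test misses a deceptive candidate
# false_positive = chance a single test wrongly flags an honest candidate
# (hypothetical values chosen purely for illustration)

def combined_rates(false_negative: float, false_positive: float, n_tests: int):
    combined_fn = false_negative ** n_tests              # a liar must fool every test
    combined_fp = 1 - (1 - false_positive) ** n_tests    # an honest person fails if any test misfires
    return combined_fn, combined_fp

for n in (1, 2, 3):
    fn, fp = combined_rates(false_negative=0.2, false_positive=0.05, n_tests=n)
    print(f"{n} test(s): miss rate ~{fn:.3f}, wrongful rejection rate ~{fp:.3f}")

# Roughly: 1 test -> 0.20 / 0.05, 2 tests -> 0.04 / 0.10, 3 tests -> 0.008 / 0.14.
```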

Comment by trevor (TrevorWiesinger) on One-shot strategy games? · 2024-03-11T04:05:25.518Z · LW · GW

Also, there is some pretty annoying RNG with the stock trading and yomi generation which are key time bottlenecks. If you reach out to the designer Frank Lantz, he might be glad to see that his game is being used for something valuable and give you what you need to turn off the RNG (or even reconfigure the game into something better for the purpose, as it is a very simple system).

Comment by trevor (TrevorWiesinger) on One-shot strategy games? · 2024-03-11T03:59:24.202Z · LW · GW

Universal Paperclips is the first thing that comes to mind (the fastest speedruns are ~1.5 hours but finish the first 2 stages in ~1 hour, and the time to complete each stage is a decent milestone for measuring people), with the problem being that you can't lose; you have as much time as you need to explore the mechanics. Any mistake will only slow you down; the worst thing that can happen is a single occasion where you lose one point of trust, and you never get shut down by the humans even if you mismanage the wire extremely terribly at the very beginning and completely run out of resources.

Comment by trevor (TrevorWiesinger) on My Clients, The Liars · 2024-03-11T03:46:29.994Z · LW · GW

Never mind, I think it's more of an EA thing than a Lesswrong thing. If you're more focused on rationality than effective altruism then I'm not sure how helpful it will be.

Behavior-discouraging mechanisms are of course a basic feature of life, but reality is often more complicated than that. I think the lodestar post is Social Dark Matter, which, as a public defender, you'll probably find pretty interesting anyway even though it's long.

Comment by trevor (TrevorWiesinger) on Shortform · 2024-03-10T23:41:00.488Z · LW · GW

Lie detection technology must be open-sourced. It could fix literally everything. Just ask people "how much do you want to fix literally everything", "how much did you think about ways to do better and avoid risk", "do you have the skills for this position, or think you can get them", etc.; so many profoundly incredible things are downstream of finding and empowering the people who give good answers.

Comment by trevor (TrevorWiesinger) on My Clients, The Liars · 2024-03-10T18:59:22.229Z · LW · GW

Ah, sorry, I probably should have explained something important: around a decade or so ago, Lesswrong people and others noticed that the act of finding and justifying contempt for Acceptable Targets was actually an unexpectedly serious flaw in the human brain. It looks kinda bad on the surface (causing global conflict and outgrouping and all), but it's actually far, far worse.

I think this might have been noticed around the time of the rise of wokeness, and when EA started getting closer to the rationalist movement (EAs, too, often felt intense negative emotions about people who aren't "getting with the program", although now that they know about it, most know to mitigate the effect).

The rabbit hole for this is surprisingly deep, and different Lesswrong users have different stances on the human drive to search for and justify Acceptable Targets. You basically walked into invisible helicopter blades here; I'm not sure what could possibly have been done to avoid it.

Comment by trevor (TrevorWiesinger) on My Clients, The Liars · 2024-03-09T08:17:58.971Z · LW · GW

So for my sake and theirs, I do my homework. I corroborate. I investigate.

This.

I get that a lot of people are concerned about this post due to the unabashedly displayed contempt for the clients, but anyone who has the slightest sense of what it's like to be a public defender knows that this is basically normal, not really preventable, and part of being human. If you were there, you'd probably be feeling the exact same way (although if you're the type who somehow doesn't feel contempt under these kinds of circumstances, that could indicate a much better personal fit than the median).

What's important for a public defender is two things:

  1. The public defender does their homework, corroborates, and seriously investigates the case, pushing through the negative reinforcement that accumulates from encountering false positives like 95% of the time.
  2. The public defender actually makes an effort to successfully explain to the client how and why they are on their side. The client is not the Lisan al-Gaib; the universe will not beam correct information into their head, regardless of whether law school made it feel like common knowledge. In most countries today, and most civilizations that ever existed, it would be quite reasonable to assume by default that something is seriously off about this person calling themself a "public defender". If their explanation success rate is 90% instead of 70%, then they get cooperative clients ~90% of the time instead of ~70% of the time.

Comment by trevor (TrevorWiesinger) on Vote on Anthropic Topics to Discuss · 2024-03-07T19:53:46.492Z · LW · GW

Claude 3's ability/willingness to be helpful/creative indicates that Copilot/GPT-4's flexibility/helpfulness was substantially weakened/inhibited by Microsoft/OpenAI's excessive PR/reputation-risk-aversion; e.g. smarter but blander chatbots can be outcompeted in the current market by dumber but more-user-aligned chatbots.

Comment by trevor (TrevorWiesinger) on Vote on Anthropic Topics to Discuss · 2024-03-07T19:40:20.347Z · LW · GW

Guys, make sure to upvote this post if you like this or downvote it if you dislike it! (I strong upvoted)

This post currently has 53 karma with only 13 votes, implying that people are skipping straight to the polls and forgetting to upvote it.

Comment by trevor (TrevorWiesinger) on Vote on Anthropic Topics to Discuss · 2024-03-07T19:37:01.587Z · LW · GW

Claude 3 can make complex but strong inductions about a person and/or what they expect, based on subtle differences in my word choice or deep language that might not be visible to them (for example, writing in a slightly more academic vs journalist punctuation style while using Claude for research, or indicating that my personality is more scout mindset vs soldier mindset relative to most people who write/think similarly to me). This also implies that Claude 3 can hypothetically ease a person into more quantitative thinking, which is probably superior, at a level and pace that is a far better fit for them than the K-12 education system was, e.g. by mimicking their thinking but gradually steering the conversation in a more quantitative direction.

Comment by TrevorWiesinger on [deleted post] 2024-03-03T23:58:57.482Z

I think that brings up a good point, but the main reason people don't work on trust tech is actually cultural (Ayn Rand type stuff), not self-interest. There's actually tons of social status and org reputation to be gained from building technology that fixes a lot of problems, and it makes the world safer for the self-interested people building it.

It might not code as something their society values (e.g. cash return on investment), but the net upside is way bigger than the net downside. Bryan Johnson, for example, is one of the few billionaires investing any money at all in anti-aging tech, even though so little money is going into it that it would be in billionaires' personal interest to form a coalition that invests >1% of their wealth into technological advancement in that area.

Comment by trevor (TrevorWiesinger) on The World in 2029 · 2024-03-03T19:21:31.849Z · LW · GW

This is neat, but I liked What 2026 looks like a lot better. A remarkably large proportion of Kokotajlo's predictions came true; if his scenario ever diverges from the actual timeline, it will probably diverge sharply from a single point (which just hasn't happened yet), or else something weird will have happened.

Comment by trevor (TrevorWiesinger) on Wei Dai's Shortform · 2024-03-02T20:24:35.046Z · LW · GW

Ah, sorry that this wasn't very helpful. 

I will self-downvote so this isn't the top comment. Yud's stuff is neat, but I haven't read much on the topic, and passing some along when it comes up has been a good general heuristic.

Comment by trevor (TrevorWiesinger) on Wei Dai's Shortform · 2024-03-01T21:29:33.745Z · LW · GW

Aside from the literature on international relations, I don't know much about academic dysfunction (what I do know comes mostly from reading parts of Inadequate Equilibria, particularly the visitor dialogue), and other Lesswrong people can probably cover it better. I think that planecrash, Yud's second HPMOR-scale work, mentions that everyone in academia just generally avoids citing things published outside of academia, because they risk losing status if they do.

EDIT: I went and found that section, it is here:

It turns out that Earth economists are locked into powerful incentive structures of status and shame, which prevent them from discussing the economic work of anybody who doesn't get their paper into a journal.  The journals are locked into very powerful incentive structures that prevent them from accepting papers unless they're written in a very weird Earth way that Thellim can't manage to imitate, and also, Thellim hasn't gotten tenure at a prestigious university which means they'll probably reject the paper anyways.  Thellim asks if she can just rent temporary tenure and buy somebody else's work to write the paper, and gets approximately the same reaction as if she asked for roasted children recipes.

The system expects knowledge to be contributed to it only by people who have undergone painful trials to prove themselves worthy.  If you haven't proven yourself worthy in that way, the system doesn't want your knowledge even for free, because, if the system acknowledged your contribution, it cannot manage not to give you status, even if you offer to sign a form relinquishing it, and it would be bad and unfair for anyone to get that status without undergoing the pains and trials that others had to pay to get it.

She went and talked about logical decision theory online before she'd realized the full depth of this problem, and now nobody else can benefit from writing it up, because it would be her idea and she would get the status for it and she's not allowed to have that status.  Furthermore, nobody else would put in the huge effort to push forward the idea if she'll capture their pay in status.  It does have to be a huge effort; the system is set up to provide resistance to ideas, and disincentivize people who quietly agreed with those ideas from advocating them, until that resistance is overcome.  This ensures that pushing any major idea takes a huge effort that the idea-owner has to put in themselves, so that nobody will be rewarded with status unless they have dedicated several years to pushing an idea through a required initial ordeal before anyone with existing status is allowed to help, thereby proving themselves admirable enough and dedicated enough to have as much status as would come from contributing a major idea.

To suggest that the system should work in any different way is an obvious plot to steal status that is only deserved by virtuous people who work hard, play by the proper rules, and don't try to cheat by doing anything with less effort than it's supposed to take.

It's glowfic, so of course I don't know how accurate it is, as it's intended to be plausibly deniable enough to facilitate free writing (while keeping things entertaining enough to register as not-being-work).

Comment by TrevorWiesinger on [deleted post] 2024-03-01T15:56:26.805Z

Look, instead of going to all this trouble to impute dark motives, maybe you could go look at the whole purpose of this post, the three NYT articles I cited, see what they're writing about, and notice "wow, this well-trusted institution really was bending over backwards to deceive many thousands of people about the Ukraine war, it's pretty cool that Trevor found this and put the work into a post pointing it out!".

It looks like the disagreement stems entirely from this line:

On Scott Alexander's criteria for media lies, these would be Level 6 lies; however, Level 7 lies are not practical for journalists, nor particularly necessary in a world where the lawyers-per-capita is as high as it is today.

That list was sent to me by a friend, and I figured citing it would be helpful. Upon going and looking through Scott's other writings on the topic, e.g. The Media Very Rarely Lies, it looks clear that Scott went to a lot of trouble to standardize a very specific definition for the word "lie". It is not surprising that lots of people in this community got most of their understanding of news-outlet deception from a few Scott Alexander posts (which are great posts), and subsequently expect short inferential distances from people who approached the topic from completely different backgrounds.

I haven't spent the last 10 years on Lesswrong, and don't really have experience with people finding galaxy-brained ways to write deceptive posts that are plausibly deniably disguised as mistakes. My understanding is that that sort of behavior is common among corporate executives. I instead spent the last 10 years in environments where people would just take mistakes at face value, and ask about the details of word definitions, instead of immediately jumping to accusations of deliberate dishonesty.

Comment by TrevorWiesinger on [deleted post] 2024-02-29T13:39:17.920Z

I think using the phrase "level 6 lies" when referring to Scott's taxonomy is itself at least a "level 6 lie".

False. It is level 2: "Reasoning well and getting things wrong by bad luck". I interpreted "the NYT routinely and deliberately misleading millions of people" to fit the definition of the word "lied". 

Unfortunately, a bunch of commenters thought that didn't fit the definition; maybe that stricter definition is superior, but it isn't common knowledge for most English speakers.

Comment by TrevorWiesinger on [deleted post] 2024-02-29T13:32:48.971Z

The title of this post is not a level 6 or 7 lie. It assumed that "bending over backwards to deliberately mislead millions of people" fit the definition for the word "lied".

Comment by trevor (TrevorWiesinger) on Open Thread – Winter 2023/2024 · 2024-02-28T05:30:51.622Z · LW · GW

Oh boy, I can't wait for this.

Comment by trevor (TrevorWiesinger) on Raising children on the eve of AI · 2024-02-26T18:41:33.371Z · LW · GW

I still think this is correct, but a better approach would be to encourage kids to be flexible with their life plan, and to think about making major life decisions based on what the world ends up looking like rather than what they currently think is normal. 

Kids raised in larger families tend to see larger families as what they'll do later in life, and this habit of thought gets set early on and is hard to change when they're older. So that's one example of a good early intervention to prepare them for the future before their preferences get locked in, but it's not the only one.

Comment by trevor (TrevorWiesinger) on China-AI forecasts · 2024-02-25T18:17:52.209Z · LW · GW

Have you read Yudkowsky's Inadequate Equilibria (the physical book)? It made a pretty big mistake with the Bank of Japan (see if you can spot it on your own without help! It's fine if you don't), but that mistake doesn't undermine the thesis of the book at all.

My understanding is that Inadequate Equilibria describes the socio-cultural problems China faces quite well, and stacks very well with the conventional literature (in a way that any strategic analyst would find quite helpful; the value added is so great that it's possibly sufficient for most bilingual people to work as highly successful China Watchers, i.e. a huge source of alpha in the China-watcher space). It also describes the effects on cultural nihilism quite well.

The only countries (and territories) in the last 70 years that have gone low income to high income countries in the last 70 years (without oil wealth) are South Korea, Taiwan, Singapore (which does have substantial oil wealth,) and Hong Kong, although it seems very likely that Malaysia will join that club in the near future.

Love this analysis! I would like to dive deeper than this; do you have a source? The World Bank claims that 3/4 of the global population live in "middle-income countries", which at a glance I do not trust at all; I like your thinking better.

Comment by trevor (TrevorWiesinger) on Social media use probably induces excessive mediocrity · 2024-02-18T20:37:39.663Z · LW · GW

They want data. They strongly prefer data on elites (and data useful/relevant for analyzing and understanding elite behavior) over data on commoners.

We are not commoners.

These aren't controversial statements, and if they are, they shouldn't be.

Comment by trevor (TrevorWiesinger) on Social media use probably induces excessive mediocrity · 2024-02-18T01:39:24.484Z · LW · GW

Yes, this is a sensible response; have you seen Tristan Harris's Social Dilemma documentary? It's a great introduction to some of the core concepts but not everything. 

Modelling users' behavior is not possible with normal data science or for normal firms with normal data security, but it is something that very large and semi-sovereign firms like the Big 5 tech companies would have a hard time not doing, given such large and diverse sample sizes. Modelling of minds, sufficient to predict people based on other people, is far less deep and is largely a side effect of comparing people to other people with sufficiently large sample sizes. The dynamic is described in this passage I've cited previously.

Generally, inducing mediocrity while on the site is a high priority, but it's mainly about numbness and suppressing higher thought, e.g. the kinds referenced in Critch's takeaways on CFAR and the Sequences. They want the reactions to content to emerge from your true self, but they don't want any of the other stuff that comes from higher thinking or self-awareness.

You're correct that an extremely atypical mental state on the platform would damage the data (I notice this makes me puzzled about "doomscrolling"); however, what they're aiming for is a typical state for all users (plus whatever keeps them akratic while off the platform), and for elite groups like the AI safety community, the typical state for the average user is quite a downgrade.

Advertising was big last decade, but with modern systems, stable growth is a priority, and maximizing ad purchases would harm users in a visible way, so finding the sweet spot is easy if you just don't put much effort into ad matching (plus, users noticing that the advertising is predictive creeps them out, the same issue as making people use the platform for 3-4 hours a day). Acquiring and retaining large numbers of users is far harder and far more important, now that systems are advanced enough to compete more against each other (less predictable) than against the user's free time (more predictable, especially now that so much user data has been collected during scandals, but all kinds of things could still happen).

On the intelligence agency side, the big players are probably more interested by now in public sentiment about Ukraine, NATO, elections/democracy, covid, etc., rather than in causing and preventing domestic terrorism (I might be wrong about that, though).

Happy to talk or debate further tomorrow.