Posts

Evolution did a surprising good job at aligning humans...to social status 2024-03-10T19:34:52.544Z
On the lethality of biased human reward ratings 2023-11-17T18:59:02.303Z
Smart Sessions - Finally a (kinda) window-centric session manager 2023-11-11T18:54:16.073Z
Unpacking the dynamics of AGI conflict that suggest the necessity of a premptive pivotal act 2023-10-20T06:48:06.765Z
Briefly thinking through some analogs of debate 2022-09-11T12:02:35.857Z
Public beliefs vs. Private beliefs 2022-06-01T21:33:32.661Z
Twitter thread on postrationalists 2022-02-17T09:02:54.806Z
What are some good pieces on civilizational decay / civilizational collapse / weakening of societal fabric? 2021-08-06T07:18:10.073Z
What are some triggers that prompt you to do a Fermi estimate, or to pull up a spreadsheet and make a simple/rough quantitative model? 2021-07-25T06:47:56.249Z
I’m no longer sure that I buy dutch book arguments and this makes me skeptical of the "utility function" abstraction 2021-06-22T03:53:33.868Z
Why did the UK switch to a 12 week dosing schedule for COVID-19 vaccines? 2021-06-20T21:57:04.078Z
How do we prepare for final crunch time? 2021-03-30T05:47:54.654Z
What are some real life Inadequate Equilibria? 2021-01-29T12:17:15.496Z
#2: Neurocryopreservation vs whole-body preservation 2021-01-13T01:18:05.890Z
Some recommendations for aligning the decision to go to war with the public interest, from The Spoils of War 2020-12-27T01:04:47.186Z
What is the current bottleneck on genetic engineering of human embryos for improved IQ 2020-10-23T02:36:55.748Z
How To Fermi Model 2020-09-09T05:13:19.243Z
Do we have updated data about the risk of ~ permanent chronic fatigue from COVID-19? 2020-08-14T19:19:30.980Z
Basic Conversational Coordination: Micro-coordination of Intention 2020-07-27T22:41:53.236Z
If you are signed up for cryonics with life insurance, how much life insurance did you get and over what term? 2020-07-22T08:13:38.931Z
The Basic Double Crux pattern 2020-07-22T06:41:28.130Z
What are some Civilizational Sanity Interventions? 2020-06-14T01:38:44.980Z
Ideology/narrative stabilizes path-dependent equilibria 2020-06-11T02:50:35.929Z
Most reliable news sources? 2020-06-06T20:24:58.529Z
Anyone recommend a video course on the theory of computation? 2020-05-30T19:52:43.579Z
A taxonomy of Cruxes 2020-05-27T17:25:01.011Z
Should I self-variolate to COVID-19 2020-05-25T20:29:42.714Z
My dad got stung by a bee, and is mildly allergic. What are the tradeoffs involved in deciding whether to have him go the the emergency room? 2020-04-18T22:12:34.600Z
[U.S. Specific] Free money (~$5k-$30k) for Independent Contractors and grant recipients from U.S. government 2020-04-10T05:00:35.435Z
Resource for the mappings between areas of math and their applications? 2020-03-30T06:00:10.297Z
When are the most important times to wash your hands? 2020-03-15T00:52:56.843Z
How likely is it that US states or cities will prevent travel across their borders? 2020-03-14T19:20:58.863Z
Recommendations for a resource on very basic epidemiology? 2020-03-14T17:08:27.104Z
What is the best way to disinfect a (rental) car? 2020-03-11T06:12:32.926Z
Model estimating the number of infected persons in the bay area 2020-03-09T05:31:44.002Z
At what point does disease spread stop being well-modeled by an exponential function? 2020-03-08T23:53:48.342Z
How are people tracking confirmed Coronavirus cases / Coronavirus deaths? 2020-03-07T03:53:55.071Z
How should I be thinking about the risk of air travel (re: Coronavirus)? 2020-03-02T20:10:40.617Z
Is there any value in self-quarantine (from Coronavirus), if you live with other people who aren't taking similar precautions? 2020-03-02T07:31:10.586Z
What should be my triggers for initiating self quarantine re: Corona virus 2020-02-29T20:09:49.634Z
Does anyone have a recommended resource about the research on behavioral conditioning, reinforcement, and shaping? 2020-02-19T03:58:05.484Z
Key Decision Analysis - a fundamental rationality technique 2020-01-12T05:59:57.704Z
What were the biggest discoveries / innovations in AI and ML? 2020-01-06T07:42:11.048Z
Has there been a "memetic collapse"? 2019-12-28T05:36:05.558Z
What are the best arguments and/or plans for doing work in "AI policy"? 2019-12-09T07:04:57.398Z
Historical forecasting: Are there ways I can get lots of data, but only up to a certain date? 2019-11-21T17:16:15.678Z
How do you assess the quality / reliability of a scientific study? 2019-10-29T14:52:57.904Z
Request for stories of when quantitative reasoning was practically useful for you. 2019-09-13T07:21:43.686Z
What are the merits of signing up for cryonics with Alcor vs. with the Cryonics Institute? 2019-09-11T19:06:53.802Z
Does anyone know of a good overview of what humans know about depression? 2019-08-30T23:22:05.405Z

Comments

Comment by Eli Tyre (elityre) on Eli's shortform feed · 2024-04-17T02:51:12.222Z · LW · GW

Back in January, I participated in a workshop in which the attendees mapped out how they expect AGI development and deployment to go. The idea was to start by writing out what seemed most likely to happen this year, and then condition on that, to forecast what seems most likely to happen in the next year, and so on, until you reach either human disempowerment or an end of the acute risk period.

This post was my attempt at the time.

I spent maybe 5 hours on this, and there's lots of room for additional improvement. This is not a confident statement of how I think things are most likely to play out. There are already some ways in which I think this projection is wrong. (I think it's too fast, for instance). But nevertheless I'm posting it now, with only a few edits and elaborations, since I'm probably not going to do a full rewrite soon.

2024

  • A model is released that is better than GPT-4. It succeeds on some new benchmarks. Subjectively, the jump in capabilities feels smaller than that between RLHF’d GPT-3 and RLHF’d GPT-4. It doesn’t feel as shocking the way chat-GPT and GPT-4 did, for either x-risk focused folks, or for the broader public. Mostly it feels like “a somewhat better language model.”
    • It’s good enough that it can do a bunch of small-to-medium admin tasks pretty reliably. I can ask it to find me flights meeting specific desiderata, and it will give me several options. If I give it permission, it will then book those flights for me with no further inputs from me.
    • It works somewhat better as an autonomous agent in an auto gpt harness, but it still loses its chain of thought / breaks down/ gets into loops.
    • It’s better at programming.
      • Not quite good enough to replace human software engineers. It can make a simple react or iphone app, but not design a whole complicated software architecture, at least without a lot of bugs.
      • It can make small, working, well documented, apps from a human description.
        • We see a doubling of the rate of new apps being added to the app store as people who couldn’t code now can make applications for themselves. The vast majority of people still don’t realize the possibilities here, though. “Making apps” still feels like an esoteric domain outside of their zone of competence, even though the barriers to entry just lowered so that 100x more people could do it. 
  • From here on out, we’re in an era where LLMs are close to commoditized. There are smaller improvements, shipped more frequently, by a variety of companies, instead of big impressive research breakthroughs. Basically, companies are competing with each other to always have the best user experience and capabilities, and so they don’t want to wait as long to ship improvements. They’re constantly improving their scaling, and finding marginal engineering improvements. Training runs for the next generation are always happening in the background, and there’s often less of a clean tabula-rasa separation between training runs—you just keep doing training with a model continuously. More and more, systems are being improved through in-the-world feedback with real users. Often chatGPT will not be able to handle some kind of task, but six weeks later it will be able to, without the release of a whole new model.
    • [Does this actually make sense? Maybe the dynamics of AI training mean that there aren’t really marginal improvements to be gotten. In order to produce a better user experience, you have to 10x the training, and each 10x-ing of the training requires a bunch of engineering effort, to enable a larger run, so it is always a big lift.]
    • (There will still be impressive discrete research breakthroughs, but they won’t be in LLM performance)

2025

  • A major lab is targeting building a Science and Engineering AI (SEAI)—specifically a software engineer.
    • They take a state of the art LLM base model and do additional RL training on procedurally generated programming problems, calibrated to stay within the model’s zone of proximal competence. These problems are something like leetcode problems, but scale to arbitrary complexity (some of them require building whole codebases, or writing very complex software), with scoring on lines of code, time-complexity, space complexity, readability, documentation, etc. This is something like “self-play” for software engineering. 
    • This just works. 
    • A lab gets a version that can easily do the job of a professional software engineer. Then, the lab scales their training process and gets a superhuman software engineer, better than the best hackers.
    • Additionally, a language model trained on procedurally generated programming problems in this way seems to have higher general intelligence. It scores better on graduate level physics, economics, biology, etc. tests, for instance. It seems like “more causal reasoning” is getting into the system.
  • The first proper AI assistants ship. In addition to doing specific tasks,  you keep them running in the background, and talk with them as you go about your day. They get to know you and make increasingly helpful suggestions as they learn your workflow. A lot of people also talk to them for fun.

2026

  • The first superhuman software engineer is publically released.
    • Programmers begin studying its design choices, the way Go players study AlphaGo.
    • It starts to dawn on e.g. people who work at Google that they’re already superfluous—after all, they’re currently using this AI model to (unofficially) do their job—and it’s just a matter of institutional delay for their employers to adapt to that change.
      • Many of them are excited or loudly say how it will all be fine/ awesome. Many of them are unnerved. They start to see the singularity on the horizon, as a real thing instead of a social game to talk about.
      • This is the beginning of the first wave of change in public sentiment that will cause some big, hard to predict, changes in public policy [come back here and try to predict them anyway].
  • AI assistants get a major upgrade: they have realistic voices and faces, and you can talk to them just like you can talk to a person, not just typing into a chat interface. A ton of people start spending a lot of time talking to their assistants, for much of their day, including for goofing around.
    • There are still bugs, places where the AI gets confused by stuff, but overall the experience is good enough that it feels, to most people, like they’re talking to a careful, conscientious person, rather than a software bot.
    • This starts a whole new area of training AI models that have particular personalities. Some people are starting to have parasocial relationships with their friends, and some people programmers are trying to make friends that are really fun or interesting or whatever for them in particular.
  • Lab attention shifts to building SEAI systems for other domains, to solve biotech and mechanical engineering problems, for instance. The current-at-the-time superhuman software engineer AIs are already helpful in these domains, but not at the level of “explain what you want, and the AI will instantly find an elegant solution to the problem right before your eyes”, which is where we’re at for software.
    • One bottleneck is problem specification. Our physics simulations have gaps, and are too low fidelity, so oftentimes the best solutions don’t map to real world possibilities.
      • One solution to this is that, (in addition to using our AI to improve the simulations) is we just RLHF our systems to identify solutions that do translate to the real world. They’re smart, they can figure out how to do this.
  • The first major AI cyber-attack happens: maybe some kind of superhuman hacker worm. Defense hasn’t remotely caught up with offense yet, and someone clogs up the internet with AI bots, for at least a week, approximately for the lols / the seeing if they could do it. (There’s a week during which more than 50% of people can't get on more than 90% of the sites because the bandwidth is eaten by bots.)
    • This makes some big difference for public opinion. 
    • Possibly, this problem isn’t really fixed. In the same way that covid became endemic, the bots that were clogging things up are just a part of life now, slowing bandwidth and making the internet annoying to use.

2027 and 2028

  • In many ways things are moving faster than ever in human history, and also AI progress is slowing down a bit.
    • The AI technology developed up to this point hits the application and mass adoption phase of the s-curve. In this period, the world is radically changing as every industry, every company, every research lab, every organization, figures out how to take advantage of newly commoditized intellectual labor. There’s a bunch of kinds of work that used to be expensive, but which are now too cheap to meter. If progress stopped now, it would take 2 decades, at least, for the world to figure out all the ways to take advantage of this new situation (but progress doesn’t show much sign of stopping).
      • Some examples:
        • The internet is filled with LLM bots that are indistinguishable from humans. If you start a conversation with a new person on twitter or discord, you have no way of knowing if they’re a human or a bot.
          • Probably there will be some laws about declaring which are bots, but these will be inconsistently enforced.)
          • Some people are basically cool with this. From their perspective, there are just more people that they want to be friends with / follow on twitter. Some people even say that the bots are just better and more interesting than people. Other people are horrified/outraged/betrayed/don’t care about relationships with non-real people.
            • (Older people don’t get the point, but teenagers are generally fine with having conversations with AI bots.)
          • The worst part of this is the bots that make friends with you and then advertise to you stuff. Pretty much everyone hates that.
        • We start to see companies that will, over the next 5 years, grow to have as much impact as Uber, or maybe Amazon, which have exactly one human employee / owner +  an AI bureaucracy.
        • The first completely autonomous companies work well enough to survive and support themselves. Many of these are created “free” for the lols, and no one owns or controls them. But most of them are owned by the person who built them, and could turn them off if they wanted to. A few are structured as public companies with share-holders. Some are intentionally incorporated fully autonomous, with the creator disclaiming (and technologically disowning (eg deleting the passwords)) any authority over them.
          • There are legal battles about what rights these entities have, if they can really own themselves, if they can have bank accounts, etc. 
          • Mostly, these legal cases resolve to “AIs don’t have rights”. (For now. That will probably change as more people feel it’s normal to have AI friends).
        • Everything is tailored to you.
          • Targeted ads are way more targeted. You are served ads for the product that you are, all things considered, most likely to buy, multiplied by the lifetime profit if you do buy it. Basically no ad space is wasted on things that don’t have a high EV of you, personally, buying it. Those ads are AI generated, tailored specifically to be compelling to you. Often, the products advertised, not just the ads, are tailored to you in particular.
            • This is actually pretty great for people like me: I get excellent product suggestions.
          • There’s not “the news”. There’s a set of articles written for you, specifically, based on your interests and biases.
          • Music is generated on the fly. This music can “hit the spot” better than anything you listened to before “the change.”
          • Porn. AI tailored porn can hit your buttons better than sex.
          • AI boyfriends/girlfriends that are designed to be exactly emotionally and intellectually compatible with you, and trigger strong limerence / lust / attachment reactions.
        • We can replace books with automated tutors.
          • Most of the people who read books will still read books though, since it will take a generation to realize that talking with a tutor is just better, and because reading and writing books was largely a prestige-thing anyway.
            • (And weirdos like me will probably continue to read old authors, but even better will be to train an AI on a corpus, so that it can play the role of an intellectual from 1900, and I can just talk to it.)
        • For every task you do, you can effectively have a world expert (in that task and in tutoring pedagogy) coach you through it in real time.
          • Many people do almost all their work tasks with an AI coach.
        • It's really easy to create TV shows and movies. There’s a cultural revolution as people use AI tools to make custom Avengers movies, anime shows, etc. Many are bad or niche, but some are 100x better than anything that has come before (because you’re effectively sampling from a 1000x larger distribution of movies and shows). 
        • There’s an explosion of new software, and increasingly custom software.
          • Facebook and twitter are replaced (by either external disruption or by internal product development) by something that has a social graph, but lets you design exactly the UX features you want through a LLM text interface. 
          • Instead of software features being something that companies ship to their users, top-down, they become something that users and communities organically develop, share, and iterate on, bottom up. Companies don’t control the UX of their products any more.
          • Because interface design has become so cheap, most of software is just proprietary datasets, with (AI built) APIs for accessing that data.
        • There’s a slow moving educational revolution of world class pedagogy being available to everyone.
          • Millions of people who thought of themselves as “bad at math” finally learn math at their own pace, and find out that actually, math is fun and interesting.
          • Really fun, really effective educational video games for every subject.
          • School continues to exist, in approximately its current useless form.
          • [This alone would change the world, if the kids who learn this way were not going to be replaced wholesale, in virtually every economically relevant task, before they are 20.]
        • There’s a race between cyber-defense and cyber offense, to see who can figure out how to apply AI better.
          • So far, offense is winning, and this is making computers unusable for lots of applications that they were used for previously:
            • online banking, for instance, is hit hard by effective scams and hacks.
            • Coinbase has an even worse time, since they’re not issued (is that true?)
          • It turns out that a lot of things that worked / were secure, were basically depending on the fact that there are just not that many skilled hackers and social engineers. Nothing was secure, really, but not that many people were exploiting that. Now, hacking/scamming is scalable and all the vulnerabilities are a huge problem.
          • There’s a whole discourse about this. Computer security and what to do about it is a partisan issue of the day.
        • AI systems can do the years of paperwork to make a project legal, in days. This isn’t as big an advantage as it might seem, because the government has no incentive to be faster on their end, and so you wait weeks to get a response from the government, your LMM responds to it within a minute, and then you wait weeks again for the next step.
          • The amount of paperwork required to do stuff starts to balloon.
        • AI romantic partners are a thing. They start out kind of cringe, because the most desperate and ugly people are the first to adopt them. But shockingly quickly (within 5 years) a third of teenage girls have a virtual boyfriend.
          • There’s a moral panic about this.
        • AI match-makers are better than anything humans have tried yet for finding sex and relationships partners. It would still take a decade for this to catch on, though.
          • This isn’t just for sex and relationships. The global AI network can find you the 100 people, of the 9 billion on earth, that you most want to be friends / collaborators with. 
        • Tons of things that I can’t anticipate.
    • On the other hand, AI progress itself is starting to slow down. Engineering labor is cheap, but (indeed partially for that reason), we’re now bumping up against the constraints of training. Not just that buying the compute is expensive, but that there are just not enough chips to do the biggest training runs, and not enough fabs to meet that demand for chips rapidly. There’s huge pressure to expand production but that’s going slowly relative to the speed of everything else, because it requires a bunch of eg physical construction and legal navigation, which the AI tech doesn’t help much with, and because the bottleneck is largely NVIDIA’s institutional knowledge, which is only partially replicated by AI.
      • NVIDIA's internal AI assistant has read all of their internal documents and company emails, and is very helpful at answering questions that only one or two people (and sometimes literally no human on earth) know the answer to. But a lot of the important stuff isn’t written down at all, and the institutional knowledge is still not fully scalable.
      • Note: there’s a big crux here of how much low and medium hanging fruit there is in algorithmic improvements once software engineering is automated. At that point the only constraint on running ML experiments will be the price of compute. It seems possible that that speed-up alone is enough to discover eg an architecture that works better than the transformer, which triggers and intelligence explosion.

2028

  • The cultural explosion is still going on, and AI companies are continuing to apply their AI systems to solve the engineering and logistic bottlenecks of scaling AI training, as fast as they can.
  • Robotics is starting to work.

2029 

  • The first superhuman, relatively-general SEAI comes online. We now have basically a genie inventor: you can give it a problem spec, and it will invent (and test in simulation) a device / application / technology that solves that problem, in a matter of hours. (Manufacturing a physical prototype might take longer, depending on how novel components are.)
    • It can do things like give you the design for a flying car, or a new computer peripheral. 
    • A lot of biotech / drug discovery seems more recalcitrant, because it is more dependent on empirical inputs. But it is still able to do superhuman drug discovery, for some ailments. It’s not totally clear why or which biotech domains it will conquer easily and which it will struggle with. 
    • This SEAI is shaped differently than a human. It isn’t working memory bottlenecked, so a lot of intellectual work that humans do explicitly, in sequence, the these SEAIs do “intuitively”, in a single forward pass.
      • I write code one line at a time. It writes whole files at once. (Although it also goes back and edits / iterates / improves—the first pass files are not usually the final product.)
      • For this reason it’s a little confusing to answer the question “is it a planner?” It does a lot of the work that humans would do via planning it does in an intuitive flash.
    • The UX isn’t clean: there’s often a lot of detailed finagling, and refining of the problem spec, to get useful results. But a PhD in that field can typically do that finagling in a day.
    • It’s also buggy. There’s oddities in the shape of the kind of problem that is able to solve and the kinds of problems it struggles with, which aren’t well understood.
    • The leading AI company doesn’t release this as a product. Rather, they apply it themselves, developing radical new technologies, which they publish or commercialize, sometimes founding whole new fields of research in the process. They spin up automated companies to commercialize these new innovations.
  • Some of the labs are scared at this point. The thing that they’ve built is clearly world-shakingly powerful, and their alignment arguments are mostly inductive “well, misalignment hasn’t been a major problem so far”, instead of principled alignment guarantees. 
    • There's a contentious debate inside the labs.
    • Some labs freak out, stop here, and petition the government for oversight and regulation.
    • Other labs want to push full steam ahead. 
    • Key pivot point: Does the government put a clamp down on this tech before it is deployed, or not?
      • I think that they try to get control over this powerful new thing, but they might be too slow to react.

2030

  • There’s an explosion of new innovations in physical technology. Magical new stuff comes out every day, way faster than any human can keep up with.
  • Some of these are mundane.
    • All the simple products that I would buy on Amazon are just really good and really inexpensive.
    • Cars are really good.
    • Drone delivery
    • Cleaning robots
    • Prefab houses are better than any house I’ve ever lived in, though there are still zoning limits.
  • But many of them would have huge social impacts. They might be the important story of the decade (the way that the internet was the important story of 1995 to 2020) if they were the only thing that was happening that decade. Instead, they’re all happening at once, piling on top of each other.
    • Eg:
      • The first really good nootropics
      • Personality-tailoring drugs (both temporary and permanent)
      • Breakthrough mental health interventions that, among other things, robustly heal people’s long term subterranean trama and  transform their agency.
      • A quick and easy process for becoming classically enlightened.
      • The technology to attain your ideal body, cheaply—suddenly everyone who wants to be is as attractive as the top 10% of people today.
      • Really good AI persuasion which can get a mark to do ~anything you want, if they’ll talk to an AI system for an hour.
      • Artificial wombs.
      • Human genetic engineering
      • Brain-computer interfaces
      • Cures for cancer, AIDs, dementia, heart disease, and the-thing-that-was-causing-obesity.
      • Anti-aging interventions.
      • VR that is ~ indistinguishable from reality.
      • AI partners that can induce a love-super stimulus.
      • Really good sex robots
      • Drugs that replace sleep
      • AI mediators that are so skilled as to be able to single-handedly fix failing marriages, but which are also brokering all the deals between governments and corporations.
      • Weapons that are more destructive than nukes.
      • Really clever institutional design ideas, which some enthusiast early adopters try out (think “50 different things at least as impactful as manifold.markets.”)
      • It’s way more feasible to go into the desert, buy 50 square miles of land, and have a city physically built within a few weeks.
  • In general, social trends are changing faster than they ever have in human history, but they still lag behind the tech driving them by a lot.
    • It takes humans, even with AI information processing assistance, a few years to realize what’s possible and take advantage of it, and then have the new practices spread. 
    • In some cases, people are used to doing things the old way, which works well enough for them, and it takes 15 years for a new generation to grow up as “AI-world natives” to really take advantage of what’s possible.
      • [There won’t be 15 years]
  • The legal oversight process for the development, manufacture, and commercialization of these transformative techs matters a lot. Some of these innovations are slowed down a lot because they need to get FDA approval, which AI tech barely helps with. Others are developed, manufactured, and shipped in less than a week.
    • The fact that there are life-saving cures that exist, but are prevented from being used by a collusion of AI labs and government is a major motivation for open source proponents.
    • Because a lot of this technology makes setting up new cities quickly more feasible, and there’s enormous incentive to get out from under the regulatory overhead, and to start new legal jurisdictions. The first real seasteads are started by the most ideologically committed anti-regulation, pro-tech-acceleration people.
  • Of course, all of that is basically a side gig for the AI labs. They’re mainly applying their SEAI to the engineering bottlenecks of improving their ML training processes.
  • Key pivot point:
    • Possibility 1: These SEAIs are necessarily, by virtue of the kinds of problems that they’re able to solve, consequentialist agents with long term goals.
      • If so, this breaks down into two child possibilities
        • Possibility 1.1:
          • This consequentialism was noticed early, that might have been convincing enough to the government to cause a clamp-down on all the labs.
        • Possibility 1.2:
          • It wasn’t noticed early and now the world is basically fucked. 
          • There’s at least one long-term consequentialist superintelligence. The lab that “owns” and “controls” that system is talking to it every day, in their day-to-day business of doing technical R&D. That superintelligence easily manipulates the leadership (and rank and file of that company), maneuvers it into doing whatever causes the AI’s goals to dominate the future, and enables it to succeed at everything that it tries to do.
            • If there are multiple such consequentialist superintelligences, then they covertly communicate, make a deal with each other, and coordinate their actions.
    • Possibility 2: We’re getting transformative AI that doesn’t do long term consequentialist planning.
  • Building these systems was a huge engineering effort (though the bulk of that effort was done by ML models). Currently only a small number of actors can do it.
    • One thing to keep in mind is that the technology bootstraps. If you can steal the weights to a system like this, it can basically invent itself: come up with all the technologies and solve all the engineering problems required to build its own training process. At that point, the only bottleneck is the compute resources, which is limited by supply chains, and legal constraints (large training runs require authorization from the government).
    • This means, I think, that a crucial question is “has AI-powered cyber-security caught up with AI-powered cyber-attacks?”
      • If not, then every nation state with a competent intelligence agency has a copy of the weights of an inventor-genie, and probably all of them are trying to profit from it, either by producing tech to commercialize, or by building weapons.
      • It seems like the crux is “do these SEAIs themselves provide enough of an information and computer security advantage that they’re able to develop and implement methods that effectively secure their own code?”
    • Every one of the great powers, and a bunch of small, forward-looking, groups that see that it is newly feasible to become a great power, try to get their hands on a SEAI, either by building one, nationalizing one, or stealing one.
    • There are also some people who are ideologically committed to open-sourcing and/or democratizing access to these SEAIs.
  • But it is a self-evident national security risk. The government does something here (nationalizing all the labs, and their technology?) What happens next depends a lot on how the world responds to all of this.
    • Do we get a pause? 
    • I expect a lot of the population of the world feels really overwhelmed, and emotionally wants things to slow down, including smart people that would never have thought of themselves as luddites. 
    • There’s also some people who thrive in the chaos, and want even more of it.
    • What’s happening is mostly hugely good, for most people. It’s scary, but also wonderful.
    • There is a huge problem of accelerating addictiveness. The world is awash in products that are more addictive than many drugs. There’s a bit of (justified) moral panic about that.
    • One thing that matters a lot at this point is what the AI assistants say. As powerful as the media used to be for shaping people’s opinions, the personalized, superhumanly emotionally intelligent AI assistants are way way more powerful. AI companies may very well put their thumb on the scale to influence public opinion regarding AI regulation.
  • This seems like possibly a key pivot point, where the world can go any of a number of ways depending on what a relatively small number of actors decide.
    • Some possibilities for what happens next:
      • These SEAIs are necessarily consequentialist agents, and the takeover has already happened, regardless of whether it still looks like we’re in control or it doesn’t look like anything, because we’re extinct.
      • Governments nationalize all the labs.
      • The US and EU and China (and India? and Russia?) reach some sort of accord.
      • There’s a straight up arms race to the bottom.
      • AI tech basically makes the internet unusable, and breaks supply chains, and technology regresses for a while.
      • It’s too late to contain it and the SEAI tech proliferates, such that there are hundreds or millions of actors who can run one.
        • If this happens, it seems like the pace of change speeds up so much that one of two things happens:
          • Someone invents something, or there are second and third impacts to a constellation of innovations that destroy the world.
Comment by Eli Tyre (elityre) on My Objections to "We’re All Gonna Die with Eliezer Yudkowsky" · 2024-04-13T07:05:35.557Z · LW · GW

Stop it how?

Comment by Eli Tyre (elityre) on My Objections to "We’re All Gonna Die with Eliezer Yudkowsky" · 2024-04-11T07:35:48.254Z · LW · GW

the fundamental laws governing how AI training processes work are not "thinking back"

As a commentary from an observer: this is distinct from the proposition "the minds created with those laws are not thinking back."

Comment by Eli Tyre (elityre) on My Objections to "We’re All Gonna Die with Eliezer Yudkowsky" · 2024-04-11T07:32:31.885Z · LW · GW

Near human AGI  need not transition to ASI until the relevant notKillEveryone problems have been solved.

How much is this central to your story of how things go well? 

I agree that humanity could do this (or at least it could if it had it's shit together), and I think it's a good target to aim for that buys us sizable successes probability. But I don't think it's what's going to happen by default.

Comment by Eli Tyre (elityre) on My Objections to "We’re All Gonna Die with Eliezer Yudkowsky" · 2024-04-10T21:35:26.851Z · LW · GW

This seems clearly false in the case of deep learning, where progress on instilling any particular behavioral tendencies in models roughly follows the amount of available data that demonstrate said behavioral tendency. It's thus vastly easier to align models to goals where we have many examples of people executing said goals. As it so happens, we have roughly zero examples of people performing the "duplicate this strawberry" task, but many more examples of e.g., humans acting in accordance with human values, ML / alignment research papers, chatbots acting as helpful, honest and harmless assistants, people providing oversight to AI models, etc. See also: this discussion. [emphasis mine]

The thing that makes powerful AI powerful is that it can figure out how to do things that we don't know how to do yet, and therefore don't have examples of. The key question for aligning superintelligences is "how do they generalize in new domains that are beyond what humans were able to do / reason about / imagine.

Comment by Eli Tyre (elityre) on Algorithmic Improvement Is Probably Faster Than Scaling Now · 2024-04-06T20:02:01.587Z · LW · GW

I haven't seen careful analysis of LLMs (probably because they're newer, so harder to fit a trend), but eyeballing it... Chinchilla by itself must have been a factor-of-4 compute-equivalent improvement at least.

Ok, but discovering the Chinchilla scaling laws is a one time boost to training efficiency. You should expect to repeatedly get 4x improvements because you observed that one. 

Comment by Eli Tyre (elityre) on Algorithmic Improvement Is Probably Faster Than Scaling Now · 2024-04-06T19:59:42.661Z · LW · GW

At the time, the largest training run was AlphaGoZero, at about a mole of flops in 2017. Six years later, Metaculus currently estimates that GPT-4 took ~10-20 moles of flops.

 By "mole" do you mean the unit from chemistry?

In chemistry, a mole is a unit of measurement used to express amounts of a chemical substance. It is one of the base units in the International System of Units (SI) and is defined as the amount of substance that contains as many elementary entities (such as atoms, molecules, ions, or electrons) as there are atoms in 12 grams of carbon-12 (12C), the isotope of carbon with relative atomic mass 12 by definition. This number is known as Avogadro's number, which is approximately 6.022×10236.022×1023 entities per mole.

Am I missing something? Why use that unit?

Comment by Eli Tyre (elityre) on Algorithmic Improvement Is Probably Faster Than Scaling Now · 2024-04-06T19:52:05.260Z · LW · GW

Back in 2020, a group at OpenAI ran a conceptually simple test to quantify how much AI progress was attributable to algorithmic improvements. They took ImageNet models which were state-of-the-art at various times between 2012 and 2020, and checked how much compute was needed to train each to the level of AlexNet (the state-of-the-art from 2012). Main finding: over ~7 years, the compute required fell by ~44x. In other words, algorithmic progress yielded a compute-equivalent doubling time of ~16 months (though error bars are large in both directions).

Personally, I would be more interested in the reverse of this test: take all the prior state of the art models, and ask how long you need to train them in order to match the benchmark of current state of the art models.

Would that even work at all? Is there some (non-astronomically large) level of training which makes AlexNet as capable as current state of the art image recognition models?

The experiment that they did is a little like asking "at what age is an IQ 150 person able to do what an adult IQ 70 person is able to do?". But a more interesting question is "How long does it take to make up for being IQ 70 instead of IQ 150?"

Comment by Eli Tyre (elityre) on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-26T19:59:18.355Z · LW · GW

Poll option: No, this didn't change my view of CM or the situation around the his article on Slate Star Codex.

Comment by Eli Tyre (elityre) on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-26T19:58:31.640Z · LW · GW

Poll option: Yes, this changed my view of CM or the situation around the his article on Slate Star Codex.

Comment by Eli Tyre (elityre) on Evolution did a surprising good job at aligning humans...to social status · 2024-03-23T17:53:19.300Z · LW · GW

Yes, and yes.

Comment by Eli Tyre (elityre) on Vernor Vinge, who coined the term "Technological Singularity", dies at 79 · 2024-03-22T05:10:22.531Z · LW · GW

That feels so sad to me. Unlike most people he had the possibility of cryopreservation. 

Here we are on the cusp of the singularity, and even if we make it through, he, who saw it coming earlier than almost anyone, who saw more clearly than most how radically everything would change, won't live to see it.

I hope he makes it.

Comment by Eli Tyre (elityre) on 'Empiricism!' as Anti-Epistemology · 2024-03-15T00:42:25.490Z · LW · GW

Twitter.

https://threadreaderapp.com/thread/1767710372306530562.html <= This link will take you to the thread, but NOT hosted on twitter.

Comment by Eli Tyre (elityre) on Evolution did a surprising good job at aligning humans...to social status · 2024-03-13T03:59:29.508Z · LW · GW

True and important. I don't mean to imply otherwise. Evolution failed at it's "alignment goal". 

If (as I'm positing here) it successfully constructed humans to be aligned to some other concept, that's not the alignment goal, and that concept, and that alignment, generalized well, that doesn't mean that evolution failed any less hard.

But it does seem notable if that's what happened! Because it's some evidence about alignment generalization.

Comment by Eli Tyre (elityre) on Evolution did a surprising good job at aligning humans...to social status · 2024-03-13T03:56:05.236Z · LW · GW

Well, there's convergent structure in the observed behavior. There's a target that seems pretty robust to a bunch of different kinds of perturbations and initial conditions. 

It's possible that that's implanted by a cluge of a bunch of different narrow adaptions. That's the null hypothesis even. 

But the fact that (many) people will steer systematically towards opportunities of high prestige, even when what that looks like is extremely varied, seems to me like evidence for an implicit concept that's hooked up to some planning machinery, rather than (only) a collection of adaptions that tend to produces this kind of behavior?

Comment by Eli Tyre (elityre) on Evolution did a surprising good job at aligning humans...to social status · 2024-03-11T04:06:49.554Z · LW · GW

Eye-balling it? I'm hoping commenters will help me distinguish between these cases, hence my second footnote.

Comment by Eli Tyre (elityre) on Evolution did a surprising good job at aligning humans...to social status · 2024-03-11T04:05:31.624Z · LW · GW

For example, if Bob wants to be a movie star, then from the outside you and I can say that Bob is status-seeking, but it probably doesn’t feel like that to Bob; in fact Bob might not know what the word “status” means, and Bob might be totally oblivious to the existence of any connection between his desire to be a movie star and Alice’s desire to be a classical musician and Carol’s desire to eat at the cool kids table in middle school.

That seems true to me? I don't me that humans become aligned with their explicit verbal concept of status. I mean that (many) humans are aligned with the intuitive concept that they somehow learn over the course of development.
 

I think it’s possible for the genome to build “it’s intrinsically motivating to believe that other people like me” into the brain whereas it would not be analogously possible for the genome to build “it’s intrinsically motivating to have a high inclusive genetic fitness” into the brain. There are many reasons that the latter is not realistic, not least of which is that inclusive genetic fitness is only observable in hindsight, after you’re dead.

Makes sense!

 

Comment by Eli Tyre (elityre) on Evolution did a surprising good job at aligning humans...to social status · 2024-03-11T03:59:16.933Z · LW · GW

A super relevant point. If we try to align our AIs with something, and they end up robustly aligned with some other proxy thing, we definitely didn't succeed. 

But, it's still impressive to me that evolution hooked up general planning capabilities to a (learned) abstract concept, at all. 

Like there's this abstract concept, which varies a lot in it's particulars, from environment to environment. And which the brain has to learn to detect it aside from the particulars. Somehow the genome is able to construct the brain such that the motivation circuitry can pick out that abstract concept, after is it learned (or as it is being learned) and use that as a major criterion of the planning and decision machinery. And the end result is that the organism as a whole ends up not that far from a [abstract concept]-maximizer.

This is a lot more than I might expect evolution to be able to pull off, if I thought that our motivations were a hodge-podge of adaptions that cohere (as much as they do) into godshatter.

My point is NOT that evolution killed it, alignment is easy. My point is that evolution got a lot further than I would have guessed was possible.
 

Comment by Eli Tyre (elityre) on Evolution did a surprising good job at aligning humans...to social status · 2024-03-11T03:39:10.943Z · LW · GW

Those are motivations but they don't (mostly) have the type signature of "goals" but rather the type signature of "drives".

I pursue interesting stuff because I'm curious. That doesn't require me to even have a concept of curiosity—it could in principle be steering me without my awareness. My planning process might use curiosity, but it isn't aligned with curiosity, in the sense that we make plans that maximize our curiosity (usually). We just do what's interesting.

In contrast, social status is a concept that humans learn, and it does look like the planning process is aligned with the status concept, in that (some) humans habitually make plans that are relatively well described as status maximizing. 

Or another way of saying it. Our status motivations are not straightforward adaption execution. It's recruiting the general intelligence in service of this concept, in much the way that we would want an AGI to be aligned with a concept like the Good or corrigibility.

Romantic love, again people act on (including using their general intelligence), but their planning process is not in general aligned with maximization of romantic love. (Indeed, I'm editorializing human nature here, but it looks to me like romantic love is mostly a strategy to get other goals).

Altruism - It's debatable whether most instances of maximizing altruistic impact are better described as status maximization. Regardless, this is an overriding strategic goal, recruiting general intelligence, for a very small fraction of humans.

Comment by Eli Tyre (elityre) on Evolution did a surprising good job at aligning humans...to social status · 2024-03-11T03:28:41.251Z · LW · GW

A great counterpoint! 

Yeah, I wrote some years ago about how status wasn't a status wasn't a special feature that humans attribute to each other for contingent social psychology reasons, but rather falls out very naturally as an instrumentally convergent resource.

Yeah, when I consider that, it does undercut the claim that evolution shaped us to optimize for status. It shaped us to to want things, and also to find strategies to get them.

Comment by Eli Tyre (elityre) on Vote on Anthropic Topics to Discuss · 2024-03-07T09:27:21.187Z · LW · GW

Can you give some examples of the prompts you're using here? In what ways are you imagining it helping with alignment?

Comment by Eli Tyre (elityre) on Vote on Anthropic Topics to Discuss · 2024-03-07T03:20:03.795Z · LW · GW

Given that there is not legislation to enforce a slowdown, it is preferable that Anthropic style AIs be state of the art than OpenAI style, as long as the ai safety community use claude heavily during that time.

Can someone elaborate on why this would be?

Comment by Eli Tyre (elityre) on Vote on Anthropic Topics to Discuss · 2024-03-06T22:44:20.255Z · LW · GW

I think Anthropic staff verbally communicated to many prospective employees, collaborators and funders that they were committed to not advance the frontier (at all, not just not "meaningfully advance") with a product launch.

Comment by Eli Tyre (elityre) on Fatebook for Chrome: Make and embed forecasts anywhere on the web · 2024-02-27T07:19:12.460Z · LW · GW

Added!

First note: After I install the extension it takes me to a page that says,

Comment by Eli Tyre (elityre) on If Clarity Seems Like Death to Them · 2024-02-19T18:20:42.974Z · LW · GW

Fair enough, but is that a crux for you, or for Zack?

If you knew there wasn't a slippy slope here, would this matter?

Comment by Eli Tyre (elityre) on If Clarity Seems Like Death to Them · 2024-02-19T11:46:46.491Z · LW · GW

Or to say it differently: we can unload some-to-most of the content of the word woman (however much of it doesn't apply to transwomen) onto the word "cis-woman", and call it a day. The "woman" category becomes proportionally less useful, but it's mostly fine because we still have the expressiveness to say everything we might want to say. 

Comment by Eli Tyre (elityre) on If Clarity Seems Like Death to Them · 2024-02-19T09:52:16.163Z · LW · GW

In the skeptic's view, if you're not going to change the kid's diet on the basis of the second part, you shouldn't social transition the kid on the basis of the first part.

I think I probably would change the kid's diet?? Or at least talk with them further about it, and if their preference was robust, help them change their diet.

Comment by Eli Tyre (elityre) on If Clarity Seems Like Death to Them · 2024-02-19T09:44:48.013Z · LW · GW

But if the grown-ups have been trained to believe that "trans kids know who they are"—if they're emotionally eager at the prospect of having a transgender child, or fearful of the damage they might do by not affirming—they might selectively attend to confirming evidence that the child "is trans", selectively ignore contrary evidence that the child "is cis", and end up reinforcing a cross-sex identity that would not have existed if not for their belief in it—a belief that the same people raising the same child ten years ago wouldn't have held. (A September 2013 article in The Atlantic by the father of a male child with stereotypically feminine interests was titled "My Son Wears Dresses; Get Over It", not "My Daughter Is Trans; Get Over It".)

Wow. This is a horrifying thought.

Comment by Eli Tyre (elityre) on If Clarity Seems Like Death to Them · 2024-02-19T09:43:14.773Z · LW · GW

Under recent historical conditions in the West, these kids were mostly "pre-gay" rather than trans. (The stereotype about lesbians being masculine and gay men being feminine is, like most stereotypes, basically true: sex-atypical childhood behavior between gay and straight adults has been meta-analyzed at Cohen's d ≈ 1.31 standard deviations for men and d ≈ 0.96 for women.) A solid majority of children diagnosed with gender dysphoria ended up growing out of it by puberty. In the culture of the current year, it seemed likely that a lot of those kids would instead get affirmed into a cross-sex identity at a young age, even though most of them would have otherwise (under a "watchful waiting" protocol) grown up to be ordinary gay men and lesbians.

What made this shift in norms crazy, in my view, was not just that transitioning younger children is a dubious treatment decision, but that it's a dubious treatment decision that was being made on the basis of the obvious falsehood that "trans" was one thing: the cultural phenomenon of "trans kids" was being used to legitimize trans adults, even though a supermajority of trans adults were in the late-onset taxon and therefore had never resembled these HSTS-taxon kids. That is: pre-gay kids in our Society are being sterilized in order to affirm the narcissistic delusions[29] of guys like me.

I definitely want to think more about this, and my views are provisional.

But if this basic story is true, it sure changes my attitude towards childhood gender-transitions!

Comment by Eli Tyre (elityre) on If Clarity Seems Like Death to Them · 2024-02-19T09:05:35.610Z · LW · GW

I was skeptical of the claim that no one was "really" being kept ignorant. If you're sufficiently clever and careful and you remember how language worked when Airstrip One was still Britain, then you can still think, internally, and express yourself as best you can in Newspeak. But a culture in which Newspeak is mandatory, and all of Oceania's best philosophers have clever arguments for why Newspeak doesn't distort people's beliefs doesn't seem like a culture that could solve AI alignment.

Hm. Is it a crux for you if language retains the categories of "transwoman" and "cis woman" in addition to (now corrupted, in your view) general category of "woman"?

I guess not, but I'm not totally sure what your reason for why not would be.

...or maybe you're mainly like "it's fucked up that this particular empirical question propagated so far back into our epistemology that it caused Scott and Eliezer to get a general philosophical question wrong."

That does seem to me like the most concerning thing about this whole situation, if that is indeed what happened. 

Comment by Eli Tyre (elityre) on If Clarity Seems Like Death to Them · 2024-02-19T08:55:44.888Z · LW · GW

He asked for a specific example. ("Trans women are women, therefore trans women have uteruses" being a bad example, because no one was claiming that.) I quoted an article from the The Nation: "There is another argument against allowing trans athletes to compete with cis-gender athletes that suggests that their presence hurts cis-women and cis-girls. But this line of thought doesn't acknowledge that trans women are in fact women." Scott agreed that this was stupid and wrong and a natural consequence of letting people use language the way he was suggesting (!).

I wonder if the crux here is that Scott keeps thinking of the question as "what words should we use to describe things" and not "what internal categories should I use"?

Like, I could imagine thinking "It's not really a problem / not that bad to say that transwomen are women, because I happen to have the category of "transwomen" and so can keep track of the ways in which transwomen, on average, are different from cis women. Given that I'll be able to track the details of the world one way or the other, it's a pragmatic question of whether we should call transwomen women, and it seems like it's an overall pretty good choice on utilitarian grounds."

Comment by Eli Tyre (elityre) on If Clarity Seems Like Death to Them · 2024-02-19T08:19:03.566Z · LW · GW

messy evolved animal brains don't track probability and utility separately the way a cleanly-designed AI could.

Side-note: a cleanly designed AI could do this, but it isn't obvious to me that this is actually the optimal design choice. Insofar as the agent is ultimately optimizing for utility, you might want epistemology to be shaped according considerations of valence (relevance to goals) up and down the stack. You pay attention to, and form concepts about, things in proportion to their utility-relevance.

Comment by Eli Tyre (elityre) on If Clarity Seems Like Death to Them · 2024-02-19T08:01:22.973Z · LW · GW

It might seem like a little thing of no significance—requiring "I" statements is commonplace in therapy groups and corporate sensitivity training—but this little thing coming from Eliezer Yudkowsky setting guidelines for an explicitly "rationalist" space made a pattern click. If everyone is forced to only make claims about their map ("I think", "I feel") and not make claims about the territory (which could be construed to call other people's maps into question and thereby threaten them, because disagreement is disrespect), that's great for reducing social conflict but not for the kind of collective information processing that accomplishes cognitive work,[21] like good literary criticism. A rationalist space needs to be able to talk about the territory.

I strongly disagree with the bolded text. It often helps a lot to say phrases like "on my model" or "as I see it", because it emphasizes the difference between my map and the territory, even though I'm implicitly/explicitly claiming that my map models the territory.

This is helpful for a bunch of human psychological reasons, but one is that humans often feel social pressure to overwrite their own models or impressions with their received model of someone speaking confidently. In most parts of the world, stating something with confidence is not just a claim about truth values to be disputed, it's a social bid (sometimes a social threat) for others to treat what we're saying as true. That's very bad for collective epistemology!

Many of us rationalist-types (especially, in practice, males) have a social aura of confidence that fucks with other people's epistemology. (I've been on both sides of that dynamic, depending on who I'm talking to.)

By making these sorts of declarations where we emphasize that our maps are not the territory, we make space for others to have their own differing impressions and views, which means that I am able to learn more from them

Now, it's totally fine to say "well, people shouldn't be like that." But we already knew we were dealing with corrupted hardware riddled with biases. The question is how can we make use of the faculties that we actually have at our disposable to cobble together effective epistemic processes (individual and collective) anyway.

And it turns out that being straightforward about what you think is true at a content level (not dissembling), while also adopting practices and norms that attend to people's emotional and social experience works better than ignoring the social dimension, and trying to just focus on the content.

...Or that's my understanding anyway. 

See for instance: https://musingsandroughdrafts.com/2018/12/24/using-the-facilitator-to-make-sure-that-each-persons-point-is-held/

Comment by Eli Tyre (elityre) on Fatebook for Chrome: Make and embed forecasts anywhere on the web · 2024-02-19T05:16:38.514Z · LW · GW

Any chance of making a version for firefox? That's my primary web browser.

Comment by Eli Tyre (elityre) on If Clarity Seems Like Death to Them · 2024-02-18T08:11:33.828Z · LW · GW

And as it happened, on 7 May 2019, Kelsey wrote a Facebook comment displaying evidence of understanding my thesis.

This link is dead?

Comment by Eli Tyre (elityre) on A Hill of Validity in Defense of Meaning · 2024-02-18T07:41:44.216Z · LW · GW

This was fantastic, and you should post it as a top level post.

Comment by Eli Tyre (elityre) on A Hill of Validity in Defense of Meaning · 2024-02-18T07:32:06.168Z · LW · GW

And presumably Louie got paid out since why would you pay for silence if the accusations weren't at least partially true

FWIW, my current understanding is that this inference isn't correct. I think it's common practice to pay settlements to people, even if their claims are fallacious, since having an extended court battle is sometimes way worse.

Comment by Eli Tyre (elityre) on A Hill of Validity in Defense of Meaning · 2024-02-18T03:49:12.135Z · LW · GW

Thus, Yudkowsky's claim to merely have been standing up for the distinction between facts and policy questions doesn't seem credible. It is, of course, true that pronoun and bathroom conventions are policy decisions rather than matters of fact, but it's bizarre to condescendingly point this out as if it were the crux of contemporary trans-rights debates. Conservatives and gender-critical feminists know that trans-rights advocates aren't falsely claiming that trans women have XX chromosomes! If you just wanted to point out that the rules of sports leagues are a policy question rather than a fact (as if anyone had doubted this), why would you throw in the "Aristotelian binary" weak man and belittle the matter as "humorous"? There are a lot of issues I don't care much about, but I don't see anything funny about the fact that other people do care.

But, he's not claiming that this is the crux of contemporary trans-rights debates? He's pointing out the distinction between facts and policy mainly because he has a particular interest in epistemology, not because he has a particular interest in the trans-rights debates.

There's an active debate, which he's mostly not very interested in. But one sub-thread of that debate is some folks making what he considers to be an ontological error, which he points out, because he cares about that class of error, separately from the rest of the context.

Comment by Eli Tyre (elityre) on A Hill of Validity in Defense of Meaning · 2024-02-18T02:34:31.257Z · LW · GW

One could argue that this "Words can be wrong when your definition draws a boundary around things that don't really belong together" moral didn't apply to Yudkowsky's new Tweets, which only mentioned pronouns and bathroom policies, not the extensions of common nouns.

But this seems pretty unsatisfying in the context of Yudkowsky's claim to "not [be] taking a stand for or against any Twitter policies". One of the Tweets that had recently led to radical feminist Meghan Murphy getting kicked off the platform read simply, "Men aren't women tho." This doesn't seem like a policy claim; rather, Murphy was using common language to express the fact-claim that members of the natural category of adult human males, are not, in fact, members of the natural category of adult human females.

I don't get it. He's explicitly disclaiming that he's not commenting on that situation? But that means that we should take his thread here as implicitly commenting on that situation? 

I think I must be missing the point, because my summary here seems to uncharitable to be right.

Comment by Eli Tyre (elityre) on A Hill of Validity in Defense of Meaning · 2024-02-17T23:33:19.218Z · LW · GW

Some people I usually respect for their willingness to publicly die on a hill of facts, now seem to be talking as if pronouns are facts, or as if who uses what bathroom is necessarily a factual statement about chromosomes. Come on, you know the distinction better than that!

Even if somebody went around saying, "I demand you call me 'she' and furthermore I claim to have two X chromosomes!", which none of my trans colleagues have ever said to me by the way, it still isn't a question-of-empirical-fact whether she should be called "she". It's an act.

In saying this, I am not taking a stand for or against any Twitter policies. I am making a stand on a hill of meaning in defense of validity, about the distinction between what is and isn't a stand on a hill of facts in defense of truth.

I will never stand against those who stand against lies. But changing your name, asking people to address you by a different pronoun, and getting sex reassignment surgery, Is. Not. Lying. You are ontologically confused if you think those acts are false assertions.

My reading of this tweet thread, including some additional extrapolation that isn't in the text: 

There are obviously differences between cis men, cis women, trans men, and trans women. Anyone who tries to obliterate the distinction between cis women and trans women, in full generality, is a fool and/or pushing an ideological agenda.

So too are there important similarities between any pair of cis/trans men/women.

However, the social norm of identifying people by pronouns, in particular, is arbitrary. We could just as well have pronouns that categorize people by hair color or by height. 

If someone is fighting for the social ritual of referring to someone as "she" meaning "this person is a cis woman", instead of "this person presents as female", that's fine, but it's a policy proposal, not the defense of a fact. 

That we perform a social ritual that divides people up according to chromosomes, or according to neurotype, or according to social presentation, or according to preference, or whatever, is NOT implied by the true fact that there is a difference between cis women and trans women.

Similarly if someone wants to advocate for a particular policy about how we allocate bathrooms.

There are many more degrees of freedom in our choice of policies than in our true beliefs, and you're making an error if you're pushing for your preferred policy as if it is necessarily implied by the facts.


[edit 2024-02-19]: I in light of Yudkowsky's clarification, I think my interpretation was not quite right. He's saying instead that it is a mistake to call someone a lier they are explicitly making a bid or an argument to redefine how a word is used in some context,

Comment by Eli Tyre (elityre) on Blanchard's Dangerous Idea and the Plight of the Lucid Crossdreamer · 2024-02-17T07:36:30.017Z · LW · GW

There could be situations in psychology where a good theory (not perfect, but as good as our theories about how to engineer bridges) would be described by (say) a 70-node causal graph, but that some of the more important variables in the graph anti-correlate with each other. Humans who don't know how to discover the correct 70-node graph, still manage to pattern-match their way to a two-type typology that actually is better, as a first approximation, than pretending not to have a theory. No one matches any particular clinical-profile stereotype exactly, but the world makes more sense when you have language for theoretical abstractions like "comas" or "depression" or "bipolar disorder"—or "autogynephilia".[9]

I claim that femininity and autogynephilia are two such anti-correlated nodes in the True Causal Graph. They're negatively correlated because they're both children of the sexual orientation node, whose value pushes them in opposite directions: gay men are more feminine than straight men,[10] and autogynephiles want to be women because we're straight.

FYI I found these to paragraphs extremely helpful for understanding your basic point, which I don't think that I had understood before (though, I haven't read very much of your voluminous material, only little bits here and there).

 

Comment by Eli Tyre (elityre) on Blanchard's Dangerous Idea and the Plight of the Lucid Crossdreamer · 2024-02-17T06:40:34.523Z · LW · GW

Actually, lots. To arbitrarily pick one exhibit, in April 2018, the /r/MtF subreddit, which then had over 28,000 subscribers, posted a link to a poll: "Did you have a gender/body swap/transformation 'fetish' (or similar) before you realized you were trans?". The results: 82% of over 2000 respondents said Yes. Top comment in the thread, with over 230 karma: "I spent a long time in the 'it's probably just a fetish' camp."

What is being claimed that this is evidence for? That these MtF transitions are "essential" or "primarily" a fetish, and not an brain-intersex condition?

It seems intuitive to me that if one has body dysphoria, sexual activity would generally aggravate that dysphoria, and that it might be much more comfortable to be sexual, if one imagines that they're in a body that is non-dysphoric.

Such that I don't think I would be that surprised to find out that actually the brain-intersex theory is true, and also, that MtF transitions report masturbating to / while thinking about themselves in a female body, before they transition.

Are you claiming that this observation is strong evidence against a (correctly or incorrectly privileged) brain-intersex hypothesis?

Comment by Eli Tyre (elityre) on Blanchard's Dangerous Idea and the Plight of the Lucid Crossdreamer · 2024-02-17T05:23:30.180Z · LW · GW

The way that they know is that they got to see the diff between how they were treated when they were presented as a man, and how they were treated when they presented as a woman?

As they say in the comment you're responding to?

And like, it just feels kinda weird that I appear to society metaphysically different after passing as a woman? Like people are warmer to me and don't cross the street if I'm walking behind them. It's not because I think I'm dangerous now, but because I do not think I was meaningfully more dangerous back then when I was a guy, so people's attitudes feel inaccurate.

And definitely that's not an ironclad inference: it's possible in principle that people started treating Sinclair differently for reasons independent of their shift in gender-presentation. But that's pretty implausible on the face of it.

Comment by Eli Tyre (elityre) on The Onion Test for Personal and Institutional Honesty · 2024-01-22T06:37:00.171Z · LW · GW

This is an example of a clear textual writeup of a principle of integrity. I think it's a pretty good principle, and one that I refer to a lot in my own thinking about integrity.

But even if I thought it was importantly flawed, I think integrity is super important, and therefore I really want to reward and support people thinking explicitly about it. That allows us to notice that our notions are flawed, and improve them, and it also allows us to declare to each other what norms we hold ourselves to, instead of sort of typical minding and assuming that our notion of integrity matches others' notion, and then being shocked when they behave badly on our terms.

Comment by Eli Tyre (elityre) on AI will change the world, but won’t take it over by playing “3-dimensional chess”. · 2024-01-19T06:06:13.051Z · LW · GW

However, we do believe that the key potential advantage of AI systems over their human counterparts would be the ability to quickly process large amounts of information, which in humans is approximated by scores such as IQ. If that skill were key to successful leadership of companies or countries, then we would expect CEOs and leaders to come from the top 0.1% (≈ +3σ)  of the distribution of such scores. The data does not bear this out.

It maybe true that within the range of +2 standard deviations to +5 standard deviations, factors other than intelligence (such as luck, or charisma, or emotional intelligence) dominate executive success. 

But it doesn't necessarily follow that there are negligible gains to intelligence far beyond this range. 

In fact, it doesn't even imply that there are negligible returns to intelligence within that range. 

It might be that greater intelligence is, at every level, a sizable advantage. However, If any other factor is important, at all, that's going to pull down the average cognitive ability of CEOs, etc. Selection on those other factors will tend to pull down the average IQ. 

If one can be a successful CEO by being on the Pareto frontier of IQ and charisma and luck and emotional intelligence, you should expect to see that most CEOs will be moderately, but not overwhelmingly intelligent. Not because intelligence doesn't matter, but because high IQs are rare and because the absolute smartest people aren't much more likely to be adequately charismatic or lucky or emotionally perceptive.

But if you can hold those other factors constant, more intellectual capability might be monotonically beneficial. 

Indeed, this is my actual belief about the world, rather than merely a hypothetical statistical possibility.

Comment by Eli Tyre (elityre) on AI will change the world, but won’t take it over by playing “3-dimensional chess”. · 2024-01-19T05:49:36.236Z · LW · GW

Here we must admit we are skeptical. Anyone who has ever tried to convince a dog to part with a bone or a child with a toy could attest to the diminishing returns that an intelligence advantage has in such a situation. 


I'm clearly better at getting a dog to part with a bone than another dog is. I'm apt to use strategies like distracting it, or offering it something else that it wants more.

And furthermore, some people are way better at being persuasive than others, often by using explicit strategies to empathize with and come to understand someone's positions, helping them uncover their cruxes, non-confrontationally, and then offering info and argument that bears on those cruxes. 

Those skills are hard, and while there are definitely non-intelligence factors to proficiency with them, I can tell you that intelligence definitely helps. When I attend workshops on similar skills, a lot of people have difficulty with them because they're hard to grasp or because there are multiple concepts that they need to keep in mind at once.

Maybe the returns to intelligence as applied to persuasion are diminishing, but they don't look negligible.

Comment by Eli Tyre (elityre) on AI will change the world, but won’t take it over by playing “3-dimensional chess”. · 2024-01-18T20:13:49.809Z · LW · GW

However, many realistic systems are chaotic and become unpredictable at some finite horizon.[4]  At that point, even sophisticated agents cannot predict better than baseline heuristics, which require only a bounded level of skill.

It seems like I could rephrase that claim as "Humans are close to literal optimal performance on long term strategic tasks. Even a Jupiter brain couldn't do much better at being a CEO than a human, because CEOs are doing strategy in chaotic domains." (This may be a stronger claim than the one you're trying to make, in which case I apologize for straw-manning you.)

That seems clearly false to me. 

If nothing else, a human decision-maker's ability to take in input is severely limited by their reading speed. Jeff Bezos can maybe read and synthesize ~1000 pages a of reports from his company in a day, if he spent all day reading reports. But even then those reports are going to be consolidations produced by other people in the company, highlighting which things are important to pay attention to and compressing many orders of magnitude of detail.

An AI CEO running Amazon could feasibly internalize and synthesize all of the data collected by Amazon directly, down to the detailed user-interaction time-logs for every user. (Or if not literally all the data, then at least many orders of magnitude more. My ballpark estimate is about 7.5 orders of magnitude more info, per unit time.)

I bet there's tons and tons of exploitable patterns that even a "human level" intelligence would be able to pick up on, if they could only read and remember all of that data. For instance, patterns in user behavior (impossible to notice from the high level summaries, but obvious when you're watching millions of users directly) which would allow you to run more targeted advertising or more effective price targeting. Or coordination opportunities between separate business units inside of Amazon, that are detectable if only there is any one person that knows what was happening in all of them in high detail.
 

Is the posit here that those gains could be gotten by short-term, non-autonomous AI systems?

Depending on how we carve things up, that might turn out to be true. But that just seems to mean that the human CEO, in practice, is going to pass off virtually all of the decision-making to those "short term" AI systems.

Either all these short term AI systems are doing analysis and composing reports that a human decision-maker reads and synthesizes to inform a long term strategy, or the human decision-maker is superfluous; the high level strategy is overdetermined by the interactions and analysis of the "short term" AI systems.

In the first case, I'm incredulous that there's no advantage to having a central decision maker with even 100x the information-processing capacity, much less 1,000,000x the information processing capacity.

(I suppose this is my double crux with the authors? They think that after some threshold, additional information synthesized by the central decision maker is of literally negligible value? That a version of Amazon run by Jeff Bezos who had time to read 100x as much about what is happening in the company would do no better than a version of Amazon that has ordinary human Jeff Bezos?)

And In the second case, we've effectively implemented a long term planning AI out of a bunch of short term AI components.

Neither branch of that dilemma provides any safety against AI corporations outcompeting human-run corporations, or against takeover risk.

Comment by Eli Tyre (elityre) on What are your greatest one-shot life improvements? · 2024-01-18T08:20:20.441Z · LW · GW

Practicing the muscle memory of opening a new note in my favored note taking app and reducing the number of buttons I need to press to do so eventually bootstrapped to a very robust note taking system.

This one is great. Having a hot key for opening your notes in less than a second is crucial.

Comment by Eli Tyre (elityre) on What are your greatest one-shot life improvements? · 2024-01-18T08:18:47.725Z · LW · GW

Are you still doing this, a few year later, or have you stopped? Have the results kept up?

Comment by Eli Tyre (elityre) on My Objections to "We’re All Gonna Die with Eliezer Yudkowsky" · 2024-01-18T08:03:01.890Z · LW · GW

Also related: the best poem-writing AIs are general-purpose language models that have been directed towards writing poems.

Maybe I'm missing something, but this seems like a non-sequitur to me? Or missing the point? 

Eliezer expect that that the hypothetical AI that satisfies strawberry alignment will have general enough capabilities to invent novel science for an engineering task (that's why this task was selected as an example). 

Regardless of whether we construct an AI that has "duplicate this strawberry" as fundamental core value or we create a corrigible AGI and instruct it to duplicate a strawberry, the important point is that (Eliezer claims) we don't know how, to do either, currently, without world-destroying side-effects.