Comments
That implies the ability to mix and match human chromosomes commercially is really far off
I agree that the issues of avoiding damage and having the correct epigenetics seem like huge open questions, and successfully switching a fruit fly chromosome isn't sufficient to settle them
Would this sequence be sufficient?
1. Switch a chromosome in a fruit fly
Success = normal fruit fly development
2a. Switch a chromosome in a rat
Success = normal rat development
2b. (in parallel, doesn't depend on 2a) Combine several chromosomes in a fruit fly to optimize aggressively for a particular trait
Success = fruit fly develops with a lot of the desired trait, but without serious negative consequences
3. Repeat 2b on a rat
4. Repeat 2a and 2b on a primate
Can you think of a faster way? It seems like a very long time to get something commercially viable
Maybe the test case is to delete one chromosome and insert another chromosome in a fruit fly. Fruit flies have only 4 pairs of chromosomes, and they're already used for genetic modification with CRISPR
Goal = complete the insertion and still develop a normal fruit fly. I bet this is a fairly inexpensive experiment, within reach of many people on LessWrong
Chromosome selection seems like the most consequential idea here if it's possible
Is it possible now, even in animals? Can you isolate chromosomes without damaging them and assemble them into a viable nucleus?
Edit: also -- strong upvoted because I want to see more of this on LW. Not directly AI but massively affects the gameboard
My model of "steering" the military is a little different from that. It's over a thousand partially autonomous headquarters, each with its own interests. The right hand usually doesn't know what the left is doing
Of the thousand+ headquarters, there are probably 10 that have the necessary legitimacy and can get the necessary resources. Winning over any one of those 10 is sufficient to get the results I described above
In other words, you don't have to steer the whole ship. Just a small part of it. I bet that can be done in 6 months
I don't agree, because a world of misaligned AI is known to be really bad. Whereas a world of AI successfully aligned by some opposing faction probably has a lot in common with your own values
Extreme case: ISIS successfully builds the first aligned AI and locks in its values. This is bad, but it's way better than misaligned AI. ISIS wants to turn the world into an idealized 7th Century Middle East, which is a pretty nice place compared to much of human history. There's still a lot in common with your own values
I bet that's true
But it doesn't seem sufficient to settle the issue. A world where aligning/slowing AI is a major US priority, which China sometimes supports in exchange for policy concessions, sounds like a massive improvement over today's world
The theory of impact here is that there's a lot of policy actions to slow down AI, but they're bottlenecked on legitimacy. The US military could provide legitimacy
They might also help alignment, if the right person is in charge and has a lot of resources. But even if 100% of their alignment research is noise that doesn't advance the field, military involvement could be a huge net positive
So the real questions are:
- Is the theory of impact plausible?
- Are there big risks that mean this does more harm than good?
Because maximizing the geometric rate of return, irrespective of the risk of ruin, doesn't reflect most people's true preferences
In the scenario above with the red and blue lines, the full Kelly has a 9.3% chance of losing at least half your money, but the 0.4 Kelly only has a 0.58% chance of getting an outcome at least that bad
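To make the drawdown gap concrete, here's a minimal Monte Carlo sketch. The parameters (an even-money bet with a 60% win probability, 100 rounds) are placeholders I made up, not the actual numbers behind the red and blue lines, so it won't reproduce the 9.3% / 0.58% figures -- it just shows the qualitative difference between full Kelly and 0.4 Kelly.

```python
import numpy as np

# Toy Monte Carlo: chance of ever dropping to half the starting bankroll,
# betting full Kelly vs 0.4 Kelly. Parameters are illustrative, not the
# scenario from the post.
rng = np.random.default_rng(0)
p = 0.6            # assumed win probability of an even-money bet
n_rounds = 100     # assumed number of bets
n_sims = 100_000
full_kelly = 2 * p - 1  # Kelly fraction for an even-money bet

def prob_halved(kelly_multiplier: float) -> float:
    f = kelly_multiplier * full_kelly   # fraction of bankroll staked each round
    wealth = np.ones(n_sims)
    ever_halved = np.zeros(n_sims, dtype=bool)
    for _ in range(n_rounds):
        wins = rng.random(n_sims) < p
        wealth *= np.where(wins, 1 + f, 1 - f)
        ever_halved |= wealth <= 0.5
    return ever_halved.mean()

print("P(lose half), full Kelly:", prob_halved(1.0))
print("P(lose half), 0.4 Kelly :", prob_halved(0.4))
```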
I agree. I think this basically resolves the issue. Once you've added a bunch of caveats:
- The bet is mind-bogglingly favorable. More like the million-to-one, and less like the 51% doubling
- The bet reflects the preferences of most of the world. It's not a unilateral action
- You're very confident that the results will actually happen (we have good reason to believe that the new Earths will definitely be created)
Then it's actually fine to take the bet. At that point, our natural aversion is based on our inability to comprehend the vast scale of a million Earths. I still want to say no, but I'd probably be a yes at reflective equilibrium
Therefore... there's not much of a dilemma anymore
It doesn't matter who said an idea. I'd rather just consider each idea on its own merits
I don't think that solves it. A bounded utility function would stop you from doing infinite doublings, but it still doesn't prevent some finite number of doublings in the million-Earths case
That is, if the first round multiplies Earth a millionfold, then you just have to agree that a million Earths is at least twice as good as one Earth
There are definitely betting patterns superior to risking everything -- but only if you're allowed to play a very large number of rounds
If the pharmacist is only willing to play 10 rounds, then there's no way to beat betting everything every round
As the number of rounds you play approaches infinity (assuming you can bet fractions of a coin), your chance of saving everyone approaches 1. But this takes a huge number of rounds. Even at 10,000 rounds, the best possible strategy only gives each person around a 1 in 20 chance of survival
Updated! Excuse the delay
I buy that… so many of the folks funded by Emergent Ventures are EAs that directly arguing against AI risk might alienate his audience
Still, this Straussian approach is a terrible way to have a productive argument
My mistake! Fixed
Many thanks for the update… and if it’s true that you could write the very best primer, that sounds like a high value activity
I don’t understand the astroid analogy though. Does this assume the impact is inevitable? If so, I agree with taking no action. But in any other case, doing everything you can to prevent it seems like the single most important way to spend your days
Many thanks! It looks like EA was the right angle... found some very active English-speaking EA groups right next to where I'll be
I bet you're right that a perceived lack of policy options is a key reason people don't write about this to mainstream audiences
Still, I think policy options exist
The easiest one is adding the right types of AI capabilities research to the US Munitions List, so they're covered under ITAR. These regulations are mind-bogglingly burdensome to comply with (so it's effectively a tax on capabilities research). They also make it illegal to share certain parts of your research publicly
It's not quite the secrecy regime that Eliezer is looking for, but it's a big step in that direction
I think 2, 3, and 8 are true but pretty easy to overcome. Just get someone knowledgeable to help you
4 (low demand for these essays) seems like a calibration question. Most writers probably would lose their audience if they wrote about it as often as Holden. But more than zero is probably ok. Scott Alexander seems to be following that rule: he said he was summarizing the 2021 MIRI conversations at a steady drip so as not to alienate the part of his audience that doesn't want to see that
I think 6 (look weird) used to be true, but it’s not any more. It’s hard to know for sure without talking to Kelsey Piper or Ezra Klein, but I suspect they didn’t lose any status for their Vox/NYT statements
I agree that it's hard, but there are all sorts of possible moves (like LessWrong folks choosing to work at this future regulatory agency, or putting massive amounts of lobbying funds into making sure the rules are strict)
If the alternative (solving alignment) seems impossible given 30 years and massive amounts of money, then even a really hard policy seems easy by comparison
Eliezer gives alignment a 0% chance of succeeding. I think policy, if tried seriously, has >50%. So it's a giant opportunity that's gotten way too little attention
I'm optimistic about policy for big companies in particular. They have a lot to lose from breaking the law, they're easy to inspect (because there's so few), and there's lots of precedent (ITAR already covers some software). Right now, serious AI capabilities research just isn't profitable outside of the big tech companies
Voluntary compliance is also a very real thing. Lots of AI researchers are wealthy and high-status, and they'd have a lot to lose from breaking the law. At the very least, a law would stop them from publishing their research. A field like this also lends itself to undercover enforcement
I think an agreement with China is impossible now, because prominent folks don't even believe the threat exists. Two factors could change the art of the possible. First, if there were a widely known argument about the dangers of AI, on which most public intellectuals agreed. Second, since the US has a technological lead, an agreement could actually be to its advantage.
It sounds like Eliezer is confident that alignment will fail. If so, the way out is to make sure AGI isn’t built. I think that’s more realistic than it sounds
1. LessWrong is influential enough to achieve policy goals
Right now, the Yann LeCun view of AI is probably more mainstream, but that can change fast.
LessWrong is upstream of influential thinkers. For example:
- Zvi and Scott Alexander read LessWrong. Let’s call folks like them Filter #1
- Tyler Cowen reads Zvi and Scott Alexander. (Filter #2)
- Malcolm Gladwell, a mainstream influencer, reads Tyler Cowen every morning (Filter #3)
I could’ve made a similar chain with Ezra Klein or Holden Karnofsky. All these chains put together add up to a lot of influence
Right now, I think Eliezer’s argument (AI capabilities research will destroy the world) is blocked at Filter #1. None of the Filter #1 authors have endorsed it. Why should they? The argument relies on intuition. There’s no way for Filter #1 to evaluate it. I think that’s why Scott Alexander and Holden Karnofsky hedged, neither explicitly endorsing nor rejecting the doom theory.
Even if they believed Eliezer, Filter #1 authors need to communicate more than an intuition to Filter #2. Imagine the article: “Eliezer et al have a strong intuition that the sky is falling. We’re working on finding some evidence. In the meantime, you need to pass some policies real fast.”
In short, ideas from LessWrong can exert a strong influence on policymakers. This particular idea hasn’t because it isn’t legible and Filter #1 isn’t persuaded.
2. If implemented early, government policy can prevent AGI development
AGI development is expensive. If Google/Facebook/Huawei didn’t expect to make a lot of money from capabilities development, they’d stop investing in it. This means that the pace of AI is very responsive to government policy.
If the US, China, and EU want to prevent AGI development, I bet they’d get their way. This seems a job for a regulatory agency. Pick a (hopefully narrow) set of technologies and make it illegal to research them without approval.
This isn’t as awful as it sounds. The FAA basically worked, and accidents in the air are very rare. If Eliezer’s argument is true, the costs are tiny compared to the benefits. A burdensome bureaucracy vs destruction of the universe.
Imagine a hypothetical world, where mainstream opinion (like you’d find in the New York Times) says that AGI would destroy the world, and a powerful regulatory agency has the law on its side. I bet AGI is delayed by decades.
3. Don’t underestimate how effectively the US government can do this job
Don’t over-index on Covid or climate change. AI safety is different. Covid and climate change both demand sacrifices from the entire population. This is hugely unpopular. AI safety, on the other hand, only demands sacrifices from a small number of companies
For now, I think the top priority is to clearly and persuasively demonstrate why alignment won’t be solved in the next 30 years. This is crazy hard, but it might be way easier than actually solving alignment
Is there a good write up of the case against rapid tests? I see Tom Frieden’s statement that rapid tests don’t correlate with infectivity, but I can’t imagine what that’s based on
In other words, there’s got to be a good reason why so many smart people oppose using rapid tests to make isolation decisions
Could you spell out your objection? It’s a big ask, having to read a book just to find out what you mean!
Short summary: Biological anchors are a bad way to predict AGI. It’s a case of “argument from comparable resource consumption.” Analogy: human brains use 20 Watts. Therefore, when we have computers with 20 Watts, we’ll have AGI! The 2020 OpenPhil estimate of 2050 is based on a biological anchor, so we should ignore it.
Longer summary:
Lots of folks made bad AGI predictions by asking:
1. How much compute is needed for AGI?
2. When will that compute be available?
To find (1), they use a “biological anchor,” like the computing power of the human brain, or the total compute used to evolve human brains.
Hans Moravec, 1988: the human brain uses 10^13 ops/s, and computers with this power will be available in 2010.
Eliezer objects that:
- “We’ll have computers as fast as human brains in 2010” doesn’t imply “we’ll have strong AI in 2010.”
- The compute needed depends on how well we understand cognition and computer science. It might be done with a hypercomputer but very little knowledge, or a modest computer but lots of knowledge.
- An AGI wouldn’t actually need 10^13 ops/s, because human brains are inefficient. For example, they do lots of operations in parallel that could be replaced with fewer operations in series.
Eliezer, 1999: Eliezer mentions that he too made bad AGI predictions as a teenager
Ray Kurzweil, 2001: Same idea as Moravec, but 10^16 ops/s. Not worth repeating
Someone, 2006: it took ~10^43 ops for evolution to create human brains. It’ll be a very long time before a computer can reach 10^43 ops, so AGI is very far away
Eliezer objects that the use of a biological anchor is sufficient to make this estimate useless. It’s a case of a more general “argument from comparable resource consumption.”
Analogy: human brains use 20 Watts. Therefore, when we have computers with 20 Watts, we’ll have AGI!
OpenPhil, 2020: A much more sophisticated estimate, but still based on a biological anchor. They predict AGI in 2050.
How the new model works (a toy numerical sketch of this kind of extrapolation follows the list below):
Demand side: Estimate how many neural-network parameters would emulate a brain. Use this to find the computational cost of training such a model. (I think this part mischaracterizes OpenPhil's work; my comments are at the bottom)
Supply side: Moore’s law, assuming
- Willingness to spend on AGI training is a fixed percent of GDP
- “Computation required to accomplish a fixed task decreases by half every 2-3 years due to better algorithms.”
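Aside: here's a toy version of this kind of supply-vs-demand extrapolation, just to show the mechanics. Every number below (the anchor requirement, today's affordable compute, the growth rates) is a placeholder I made up, not a figure from the OpenPhil report.

```python
# Toy supply-vs-demand extrapolation (illustrative numbers only, not OpenPhil's).
required_flop = 1e30      # placeholder "biological anchor" training requirement
affordable_flop = 1e24    # placeholder compute affordable today
year = 2022

while affordable_flop < required_flop:
    affordable_flop *= 2 ** (1 / 2.5)  # hardware price-performance doubles every ~2.5 years
    affordable_flop *= 1.03            # spending grows with GDP (~3%/year, fixed share)
    required_flop /= 2 ** (1 / 2.5)    # algorithmic progress halves the requirement every ~2.5 years
    year += 1

print("Crossover year under these toy assumptions:", year)
```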
Eliezer’s objections:
- (Surprise!) It’s still founded on a biological anchor, which is sufficient to make it invalid
- OpenPhil models theoretical AI progress as algorithms getting twice as efficient every 2-3 years. This is a bad model, because folks keep finding entirely new approaches. Specifically, it implies “we should be able to replicate any modern feat of deep learning performed in 2021, using techniques from before deep learning and around fifty times as much computing power."
- Some of OpenPhil’s parameters make it easy for the modelers to cheat, and make sure it comes up with an answer they like:
“I was wondering what sort of tunable underdetermined parameters enabled your model to nail the psychologically overdetermined final figure of '30 years' so exactly.”
Can’t we use this as an upper bound? Maybe AGI will come sooner, but surely it won’t take longer than this estimate.
Eliezer thinks this is the same non-sequitur as Moravec’s. If you train a model big enough to emulate a brain, that doesn’t mean AGI will pop out at the end.
Other commentary: Eliezer mentions several times that he’s feeling old, tired, and unhealthy. He feels frustrated that researchers today repeat decades-old bad arguments. It takes him a lot of energy to rebut these claims
My thoughts:
I found this persuasive, but I also think it mischaracterized the OpenPhil model
My understanding is that OpenPhil didn’t just estimate the number of neural network parameters required to emulate a human brain. They used six different biological anchors, including the “evolution anchor,” which I find very useful for an upper bound.
Holden Karnofsky, who seems to put much more stock in the Bio Anchors model than Eliezer, explains the model really well here. But I was frustrated to see that the write-up on Holden’s blog gives 50% by 2090 (first graph) using the evolution anchor, while the same graph in the old calcs gives only 11%. Was this model tuned after seeing the results?
My conclusion: Bio Anchors is a terrible way to model when AGI will actually arrive. But I don’t agree with Eliezer’s dismissal of using Bio Anchors to get an upper bound, because I think the evolution anchor achieves this.
What particular counterproductive actions by the public are we hoping to avoid?
Zvi just posted EY's model
I should’ve been more clear…export controls don’t just apply to physical items. Depending on the specific controls, it can be illegal to publicly share technical data, including source code, drawings, and sometimes even technical concepts
This makes it really hard to publish papers, and it stops you from putting source code or instructions online
Why isn’t there a persuasive write-up of the “current alignment research efforts are doomed” theory?
EY wrote hundreds of thousands of words to show that alignment is a hard and important problem. And it worked! Lots of people listened and started researching this
But that discussion now claims these efforts are no good. And I can’t find good evidence, other than folks talking past each other
I agree with everything in your comment except the value of showing EY’s claim to be wrong:
- Believing a problem is harder than it is can stop you from finding creative solutions
- False belief in your impending doom leads to all sorts of bad decisions (like misallocating resources, or making innocent researchers’ lives worse)
- Belief in your impending doom is terrible for your mental health (tbh I sensed a bit of this in the EY discussion)
- Insulting groups like OpenAI destroys a lot of value, especially if EY is actually wrong
- If alignment were solved, then developing AGI would be the best event in human history. It’d be a shame to prevent that
In other words, if EY is right, we really need to know that. And know it in a way that makes it easy to persuade others. If EY is wrong, we need to know that too, and stop this gloom and doom
I agree. This wasn’t meant as an object level discussion of whether the “alignment is doomed” claim is true. What I’d hoped to convey is that, even if the research is on the wrong track, we can still massively increase the chances of a good outcome, using some of the options I described
That said, I don’t think Starship is a good analogy. We already knew that such a rocket can work in theory, so it was a matter of engineering, experimentation, and making a big organization work. What if a closer analogy to seeing alignment solved was seeing a proof of P=NP this year?
In fact, what I’d really like to see from this is Leverage and CFAR’s actual research, including negative results
What experiments did they try? Is there anything true and surprising that came out of this? What dead ends did they discover (plus the evidence that these are truly dead ends)?
It’d be especially interesting if someone annotated Geoff’s giant agenda flowchart with what they were thinking at the time and what, if anything, they actually tried
Also interested in the root causes of the harms that came to Zoe et al. Is this an inevitable consequence of Leverage’s beliefs? Or do the particular beliefs not really matter, and it’s really about the social dynamics in their group house?
I don’t agree with the characterization of this topic as self-obsessed community gossip. For context, I’m quite new and don’t have a dog in the fight. But I drew memorable conclusions from this that I couldn’t have gotten from more traditional posts
First, experimenting with our own psychology is tempting and really dangerous. Next time, I’d turn up the caution dial way higher than Leverage did
Second, a lot of us (probably including me) have an exploitable weakness brought on by high scrupulosity combined with openness to crazy-sounding ideas. Next time, I’d be more cautious (but not too cautious!) about proposals like joining Leverage
Third, if we ever need to maintain the public’s goodwill, I’ll try not to use words like “demonic seance”… even if I don’t mean it literally
In short, this is the sort of mistake worth learning about, including for those not personally affected, because it’s the kind of mistake we could plausibly make again. I think it’s useful to have here, and the right attitude for the investigation is “what do these events teach us about how rationalist groups can go wrong?” I also don’t think posting a summary would’ve been sufficient. It was necessary to hear Geoff and Anna’s exact words
So is this an accurate summary of your thinking?
- You agree with FDT on some issues. The goal of decision theory is to determine what kind of agent you should be. The kind of agent you are (your "source code") affects other agents' decisions
- FDT requires you to construct counterfactual worlds. For example, if I'm faced with Newcomb's problem, I have to imagine a counterfactual world in which I'm a two-boxer
- We don't know how to construct counterfactual worlds. Imagining a consistent world in which I'm a two-boxer is just as hard as imagining one where objects fall up, or where 2+2 is 5
- You get around this by constructing counterfactual models, instead of counterfactual worlds. Instead of trying to imagine a consistent world that caused me to become a two-boxer, I just make a simple mental model. For example, my model might indicate that my "source code" causes both the predictor's decision and mine. From here, I can model myself as a two-boxer, even though that model doesn't represent any possible world (a toy expected-value calculation along these lines is below)
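If it helps, here's the kind of toy calculation I have in mind for that last point. The predictor accuracy (99%) and the standard $1M / $1K payoffs are my assumptions, not anything from your post; it just shows what the "source code causes both" model buys you in Newcomb's problem.

```python
# Toy Newcomb EV calculation under the "my source code causes both the
# predictor's fill and my choice" model. Accuracy and payoffs are assumed.
ACCURACY = 0.99          # assumed predictor accuracy
BIG, SMALL = 1_000_000, 1_000

# If I'm the kind of agent that one-boxes, the opaque box is full with
# probability ACCURACY; if I'm a two-boxer, it's full with 1 - ACCURACY.
ev_one_box = ACCURACY * BIG
ev_two_box = (1 - ACCURACY) * BIG + SMALL

print("EV(one-box):", ev_one_box)   # ~990,000
print("EV(two-box):", ev_two_box)   # ~11,000
```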
Really enjoyed this. I’m skeptical, because (1) a huge number of things have to go right, and (2) some of them depend on the goodwill of people who are disincentivized to help
Most likely: the Vacated Territory flounders, much like Birobidzhan (which is a really fun story, by the way. In the 1930s, the Soviet Union created a mostly-autonomous colony for its Jews in Siberia. Masha Gessen tells the story here)
Best case:
In September 2021, the first 10,000 Siuslaw Syrians touched down in Siuslaw National Forest, land that was previously part of Oregon.
It was an auspicious start. The US and state governments had happily passed all necessary laws, amendments, and regulations. Likewise, the locals were friendly. Those already living on the Siuslaw River happily gave up their homes in exchange for government compensation.
Their new President, Abdul Ali, dreamt of a shining city-state that would attract the best of the Syrian diaspora. The US government and various NGOs offered all sorts of goods and hands-on assistance, but Ali declined everything except cash. Ali received $20,000 per person for the first year, giving him an initial budget of $200M. Donations per person decreased by about 10% per year.
The land wasn’t great for farming, but the fishing was excellent. Ali’s new Exclusive Economic Zone had previously been a marine mammal sanctuary. The Siuslaw Syrians bought used commercial fishing boats (and, in the early days, hired some of their crew). Ali managed the fishery with an iron fist, and the fish population remained high for many years.
In the first year, Ali invested a full 25% of his budget in an Islamic University. He believed the key to success would be attracting the best and brightest of the diaspora – especially those of working age. Ali marketed his vision widely, and it worked. His population grew like wildfire.
One day, someone was caught smuggling fentanyl into the United States through Siuslaw. Ali saw how this could harm relations with the US, so he had the drug traffickers publicly beaten and put to death. Siuslaw continued this harsh regime, and the US never saw Siuslaw as a security threat.
Ali had hoped to thrive as a city-state, but he just didn’t have the advantages of Singapore – a massive port at the crossroads of the world’s shipping. He also wasn’t receiving as much external cash as Israel did at a similar point in its development. So he made two moves:
First, to keep his population growth high, Ali invested in making Siuslaw the “authentic Syrian culture.” Most of the diaspora believed they would feel more at home in Siuslaw than in Syria. It was a genuinely good place to live, striking a balance between devout and modern. Siuslaw reached 1M people by 2030.
Second, he built semiconductors. Initially, poor quality ones. But Siuslaw’s skilled manufacturing sector eventually developed.
In the early days, Ali tried to accept a Chinese team’s investment and on-the-ground assistance with a semiconductor plant. The US threatened to cut off all aid and close its border with Siuslaw. Ali backed down. Later, he met similar pressure when trying to purchase military equipment from Russia for his police force. Ali learned that the US would have a quiet veto power on his foreign policy. Outside of foreign policy, Ali jealously – and successfully – guarded his independence.
By 2040, Siuslaw has reached a population of 2 million and matches the United States’ GDP per capita.
The Syrian refugee camps in Jordan and Turkey have now been operating for 28 years, and their population has actually increased. Ali makes generous public gestures, but only those who are young, healthy, productive, or smart are allowed to immigrate. Siuslaw has no plans to solve the refugee problem. Nevertheless, everyone agrees: the Vacated Territories Project was a success.