A Brief Review of Current and Near-Future Methods of Genetic Engineering
post by GeneSmith · 2021-04-10T19:16:01.169Z · LW · GW · 33 comments
Part 1: The Case for Human Genetic Engineering [LW · GW]
Part 2: The Case for Increasing Intelligence [LW · GW]
My Purpose in Writing This Post
I've spent the last 6 months or so looking into the possibility of pursuing human genetic engineering as a means of improving human lives and increasing the probability of a desirable future. If you'd like more details about why I think improving health and intelligence is desirable, read my previous two posts.
In this post I'm going to summarize my understanding of the research on how genetic engineering will likely be done, the limitations of current techniques, and how they might be improved in the near future.
One last thing before I get started: the genetic engineering I am interested in, and which I think holds the most potential for increasing the likelihood of a good future, does not incorporate any type of selective breeding or eugenics program. Though one could theoretically increase intelligence or any other trait by banning those with undesirable traits from having children and encouraging those with desirable traits to have more children, I think this is a bad approach for reasons I have summarized in another post [LW · GW]. This post examines methods that do not require coercion to work.
A Summary of Current Techniques
Step 1: Make a yardstick
All non-coercive efforts to genetically engineer humans have one essential prerequisite task: finding which genes contribute to the expression of different traits. In the modern world of genetics, this is done with a test called a Genome Wide Association Study, or GWAS. These are truly massive studies: a typical GWAS done today usually has hundreds of thousands of participants. Genetic material is collected from all participants, often with a blood draw or cheek swab, and their genome is analyzed with a machine like Illumina's MiSeq analyzer.
For cost reasons, nearly all studies today read only a small portion of the genome using a device called an SNP microarray. SNP stands for Single Nucleotide Polymorphism, and is the term geneticists use to refer to a base pair that differs between individuals. Because it's technically possible for any base pair to differ between humans, geneticists usually reserve the term for positions at which at least 1% of study participants carry a different base pair. An SNP microarray is an amazing device that can read a portion of a genome without sequencing all of it, and saves money by doing so. This is done by attaching a bunch of short DNA sequences (probes) to a substrate (basically a really flat plate), then spreading fragmented DNA from the sample over the array and measuring which of the plate-attached probes have complementary sequences in the sample. A signal is then obtained (I think with a laser? The articles I read weren't clear on this), whose strength varies depending on how many of the base pairs of the sequences attached properly. In other words, they're measuring how well the two strands bonded to one another. There's a whole bunch of fancy signal processing that happens after this to deal with noisy data from strands that are only partially complementary, but at the end we have data about which of the plate-attached probes had complementary sequences in the sample, and for those that didn't, how much they differed by.
Once this data is obtained, either with the SNP chip model described above or with whole-genome sequencing, a linear effect model is used to construct predictors for the influence of each SNP on the expression of a particular trait. If that sounds confusing, just understand that they're basically modelling the expression of a trait as a linear equation like y = mx + b, where each letter in the genome is an input (an x) and the m represents the effect size of that letter on the expression of a trait. The value of the coefficient m is determined by minimizing the prediction error of a linear equation. Surprisingly, most genetically caused variance in trait expression can be explained with linear models. For those of you interested in the mathematical details, I suggest you take a look at the "Methods" section in the Wikipedia article on GWAS.
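To make the y = mx + b idea concrete, here is a minimal sketch in Python. It is not the pipeline any real GWAS uses: the cohort is simulated, the effect sizes in `true_effects` are made up, the SNPs are assumed to be independent of one another, and each SNP's coefficient is fit on its own with ordinary least squares. All function names are hypothetical.

```python
import random

def simulate_cohort(n_people, true_effects, allele_freq=0.5, noise_sd=1.0, seed=0):
    """Simulate genotypes (0, 1 or 2 copies of an allele per SNP) and a trait value."""
    rng = random.Random(seed)
    genotypes, traits = [], []
    for _ in range(n_people):
        # Two independent draws per SNP: one allele from each parent.
        g = [int(rng.random() < allele_freq) + int(rng.random() < allele_freq)
             for _ in true_effects]
        # Trait = additive genetic part + environmental noise.
        y = sum(b * x for b, x in zip(true_effects, g)) + rng.gauss(0, noise_sd)
        genotypes.append(g)
        traits.append(y)
    return genotypes, traits

def estimate_effect(xs, ys):
    """Least-squares slope m of y = m*x + b for a single SNP."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

def fit_effects(genotypes, traits):
    """Fit one slope per SNP, treating each SNP separately."""
    n_snps = len(genotypes[0])
    return [estimate_effect([g[j] for g in genotypes], traits) for j in range(n_snps)]

def polygenic_score(genotype, effects):
    """Predicted genetic contribution to the trait: sum of effect * allele count."""
    return sum(m * x for m, x in zip(effects, genotype))

true_effects = [0.5, -0.3, 0.0, 0.2]   # made-up effect sizes
genotypes, traits = simulate_cohort(5000, true_effects)
estimated = fit_effects(genotypes, traits)
print("estimated effects:", [round(m, 2) for m in estimated])
print("score for first person:", round(polygenic_score(genotypes[0], estimated), 2))
```

With only four SNPs and 5,000 simulated people the estimates land close to the true values. Real traits involve thousands of variants with much smaller effects, which is one reason the sample sizes have to be so large.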
Unfortunately, it seems like state-of-the-art methods are still not fantastic at predicting the exact expression of highly polygenic traits, with the notable exception of height. Here's a paper from late 2018 that attempted to predict height, heel bone density, and educational attainment from SNP data alone. The authors were able to explain 40% of height variance with genetic data, 20% of heel bone density variance, and 9% of educational attainment variance. Height has the unusual combination of being both highly polygenic and extremely easy to measure, so many of the ideas and techniques underlying modern GWAS were pioneered on studies of height.
More recent studies have been able to explain a higher share of the variance in educational attainment and intelligence. A study in April 2019 was able to explain 16% of the variance in educational attainment and 11% of the variance in intelligence. Still, this is a long way from capturing the 50-80% of variance in intelligence that studies indicate comes from genes.
Step 2: Generate desirable variance
Geneticists have several tools to generate desirable genetic variance. Though tools like CRISPR seem to get the most attention from the mainstream press, CRISPR is not a particularly cost-effective way to engender desirable traits in a future human. There are use cases for CRISPR, such as when both parents have a recessive disease like sickle cell anemia and all of their children will have the disease. In that case, CRISPR could be used to replace the disease allele with its normal counterpart, allowing the couple to conceive children without the disease in question. CRISPR is also a very useful therapeutic for treating those who have a Mendelian genetic condition and have already been born. For example, here's a study where researchers used CRISPR to cure one patient of beta-thalassemia and another of sickle cell anemia by extracting blood stem cells from their bone marrow, replacing the diseased gene, and reinjecting those modified cells back into the patients.
But for most traits, CRISPR is not a cost-effective tool for one simple reason. Most of the traits we care the most about, including heart disease risk, diabetes risk, cancer risks of all types, and many others, are highly polygenic: each is influenced by tens of thousands of letters in the genome. Given CRISPR's tendency to occasionally make off-target edits, and the expense of editing so many places in the genome, I don't see this being a viable strategy to decrease risk of heart disease any time soon. This may change in the future, but it appears to be the case for now.
The best way to generate desirable genetic variance in the near term is by generating a large number of embryos. When sperm and eggs (referred to as "gametes" by biologists) are formed, their precursor cells go through a process called meiosis, in which matching chromosomes swap parts, so every gamete carries its own unique mix of the parent's DNA. When two gametes fuse, the result is a new organism with its own unique genome. This process generates variance. The resulting offspring will incorporate traits from both parents, but trait expression will not always match the mean of the parents. If one parent has a 10% chance of experiencing a heart attack during their lifetime and another has a 15% chance of experiencing a heart attack, the offspring will not always have a 12.5% chance of getting a heart attack. Instead, generally speaking, the offspring's risk of heart attack will be drawn from a normal distribution with a mean of 12.5%. The expression of all heritable polygenic traits will show variance in offspring.
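Here is a rough simulation of that claim, under some loud simplifying assumptions: the trait is purely additive, every locus has the same effect, loci are inherited independently (no linkage), and the "score" is just a count of trait-increasing alleles rather than a disease risk. The function names are invented for illustration.

```python
import random
import statistics

def make_parent(n_loci, rng):
    """Each locus holds two alleles; 1 marks the trait-increasing variant."""
    return [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n_loci)]

def make_gamete(parent, rng):
    """Pick one of the two alleles at every locus (loci treated as unlinked)."""
    return [rng.choice(pair) for pair in parent]

def conceive(mother, father, rng):
    """Combine one gamete from each parent into a new two-allele genome."""
    return list(zip(make_gamete(mother, rng), make_gamete(father, rng)))

def score(genome):
    """Additive score: number of trait-increasing alleles carried."""
    return sum(a + b for a, b in genome)

rng = random.Random(1)
mother, father = make_parent(1000, rng), make_parent(1000, rng)
midparent = (score(mother) + score(father)) / 2
embryo_scores = [score(conceive(mother, father, rng)) for _ in range(2000)]

print("mid-parent score:", midparent)
print("embryo mean:", round(statistics.mean(embryo_scores), 1))
print("embryo standard deviation:", round(statistics.stdev(embryo_scores), 1))
```

The embryo scores cluster around the mid-parent value but spread out on both sides of it. That spread is the variance that embryo selection has to work with.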
Step 3: Select an embryo for implantation
Once genetic variance has been generated via the production of a number of embryos, the next step is to identify the likely trait values of each one. Embryos within a few days of fertilization have a very interesting property: one may remove several cells from the embryo and it will still retain the capacity to develop into a fully functional adult organism. This regenerative capacity allows us to gain unique insights into the genetic potential of each embryo; one may perform a biopsy on each, removing several cells, then amplify and sequence the DNA from the removed cells.
We may therefore discern the genetic sequence of an embryo before we choose to implant it. This gives us the chance to tilt the odds in favor of a future child: we can reduce their risk of serious polygenic diseases like heart disease, breast cancer, and type 2 diabetes, and we can virtually eliminate serious Mendelian diseases like sickle cell anemia, cystic fibrosis, Huntington's disease, and others. This is done by creating an overall "score" for each embryo, which represents the embryo's expected expression of a set of traits. For example, we would give embryos at higher risk of developing coronary artery disease or type 2 diabetes a lower score. The expression of each trait is given a weight in accordance with how important we believe it is. These weights can be adjusted to reflect parental preferences with the help of a genetic counselor knowledgeable about the tests and about the diseases themselves.
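As a toy illustration of such a score (not the formula any clinic actually uses), the sketch below treats the score as a negative weighted sum of predicted disease risks. The embryo names, risk numbers, and weights are all invented.

```python
def embryo_score(predicted_risk, weights):
    """Higher is better: each predicted disease risk is penalised by its weight."""
    return -sum(weights[trait] * risk for trait, risk in predicted_risk.items())

# Hypothetical weights reflecting how much the parents care about each condition.
weights = {"coronary_artery_disease": 1.0, "type_2_diabetes": 0.5}

# Hypothetical predicted lifetime risks for three embryos.
embryos = {
    "embryo_A": {"coronary_artery_disease": 0.10, "type_2_diabetes": 0.30},
    "embryo_B": {"coronary_artery_disease": 0.15, "type_2_diabetes": 0.10},
    "embryo_C": {"coronary_artery_disease": 0.20, "type_2_diabetes": 0.25},
}

ranked = sorted(embryos, key=lambda name: embryo_score(embryos[name], weights), reverse=True)
best = ranked[0]
print("ranking, best first:", ranked)
```

Changing the weights changes the ranking. Here embryo_A has the lowest heart disease risk, but embryo_B comes out ahead once diabetes risk is counted at half weight.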
Any trait with a strong genetic component may be selected for or against using this method. One merely needs to generate variance among a pool of embryos and develop a test that is able to capture a large enough portion of this genetically caused variance in trait expression.
What type of genetic engineering can we do today?
It may surprise some to learn that the techniques I described above are actually accessible right now in some capacity to any parents that can afford to do In-Vitro Fertilization. In IVF, a father donates his semen and a mother donates eggs, and a reproductive specialist uses those pools of reproductive cells to produce embryos in a cell culture. These embryos may then be screened using the process I described above, and the parents, with the help of a reproductive specialist, may choose which embryo they would like to implant.
There are companies offering this service right now. Among them are Orchid Health and Genomic Prediction. So far as I know, no companies in Europe or the United States are offering screening for intelligence, skin color, or any other cosmetic traits. Instead, they focus exclusively on genetic predictors of health, such as heart attack risk, type 2 diabetes risk, etc. This is at least partially explained by the fact that we don't have great polygenic predictors of intelligence yet, but given that the predictors for some of the diseases they DO screen for aren't much better, the main reason seems to be to avoid the controversy that would inevitably follow.
Even with the fairly limited testing we have today, pre-implantation genetic screening can have a remarkable effect on children's future health. This is especially true for individuals with a family history of disease. As a topic of a future paper, I would like to quantify exactly how cost-effective IVF + embryo selection is for couples with no fertility issues, but my gut feel after reading some research is that the expected reduction in medical costs alone more than pays for the cost of IVF. To give you a rough idea of how effective this technology is at reducing disease risk, here's Genomic Prediction's chart showing how much we would expect the risk of different diseases to decrease if we were to choose the better-scoring of two embryos.
You can play around with the tool yourself to see how changing the number of embryos affects the expected reduction in disease risk. Though their current web interface is limited, it's clear that even just selecting from two embryos, the expected reduction in chronic disease risk is substantial. And as the number of embryos selected from goes up, risk of disease decreases even further.
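To get a feel for why more embryos help, and why the returns diminish, here is a small Monte Carlo sketch. It is not Genomic Prediction's model. It assumes embryo scores are independent draws from a standard normal distribution, measured in units of the spread among sibling embryos, and that the score is perfectly known. Real predictors are noisy, so real gains are smaller.

```python
import random
import statistics

def expected_best_of(n_embryos, trials=20000, seed=0):
    """Average score of the best embryo among n standard-normal draws."""
    rng = random.Random(seed)
    return statistics.mean(
        max(rng.gauss(0, 1) for _ in range(n_embryos)) for _ in range(trials)
    )

for n in (1, 2, 5, 10):
    print(n, "embryos: expected best score =", round(expected_best_of(n), 2))
```

For two draws the exact answer is 1/sqrt(pi), roughly 0.56 standard deviations, and the simulation should land close to that. Each additional embryo adds less than the one before it.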
In fact, the reductions are so substantial that I suspect we are not too far from the day when conceiving a child through sex will be viewed similarly to how giving birth at home is viewed today: unnecessarily risky and something to be avoided if one can afford it.
If you are considering having children in the future, you may want to look into pre-implantation genetic screening. If you're a woman you may want to consider doing this earlier rather than later, as the number of eggs that can be harvested per cycle tends to decline with age, making the process more expensive if done later. The same goes for men, though since semen extraction is rather easier and does not require a specialist, the main consideration is semen quality rather than cost.
The Future
Suppose you buy the argument that we're likely to be able to improve human health, happiness, intelligence, etc. using genetic engineering. What is the critical path to the development of this technology? How do we get there faster?
Screening for intelligence
Though there will doubtless be some people that oppose this in the near future, it seems inevitable that when the predictive tests for intelligence improve enough, some lab in some country will start offering pre-implantation genetic tests for predicted intelligence. Intelligence has a strong genetic basis (estimates vary between 35-80%) and positively correlates with too many objectively important outcomes for us to ignore it for long. The moral case for doing so is strong as well: if we are able to enhance intelligence without having any negative effect on other important traits, why wouldn't we do so? We already spend significant amounts of money to help our children realize their intellectual potential. Raising their potential through genetic intervention seems completely consistent with the values already clearly expressed by many people.
And once this service starts being offered anywhere, it will only be a matter of time before it's offered pretty much everywhere. If a country does not allow embryo screening for intelligence and other desirable traits, wealthy parents will simply have their embryos genotyped, then send the data files off to a clinic in another country where they can be analyzed, and implant the ones that score the best according to some scoring system that takes intelligence into account.
And if data transfers are banned, then those parents will simply take a vacation somewhere the procedure is legal and have it done there. Eventually, the huge accrued disadvantage faced by countries that don't allow the technology will create overwhelming pressure to legalize the technology in some capacity, and no country other than dictatorships will be able to resist the pressure. My guess for where this will first be legalized is somewhere in Asia, possibly South Korea or China. And just as IVF itself became normalized, so will preimplantation genetic screening designed to give one's child the best life possible.
For intelligence screening, in particular, I have concluded there are two key technologies needed to enable dramatic improvements.
1. Improve tests for the genetic component of intelligence
The first is a test better able to capture genetically caused variance in intelligence. Plomin & von Stumm seem to think that the two missing ingredients for really good predictors of the genetic portion of intelligence are larger sample sizes and whole-genome sequencing instead of SNP-based approaches, as well as possibly non-linear models of gene effects and gene-environment interactions (see the last paragraph of box 4 on page 6 from the above link). They estimate that we can capture half the genetic variance in intelligence with SNP data alone, but that we'll need whole-genome sequencing for the remainder. SNP tests usually cost around $100, while whole-genome sequencing currently costs around $300. Here's a nice graph showing the current state of our tests as of 2018.
It's worth pointing out that this ratio of environmental influence on intelligence to genetic influence on intelligence is not fixed. If half the population had chronic exposure to lead in their drinking water and the other half had clean drinking water, the percentage of variance in intelligence explained by environmental factors would go up. Similarly, if half the population was genetically engineered to be unusually intelligent while the other half was not, the percentage of variance explained by genes would go up.
The more important thing to note here is how large an increase in intelligence we could get by simply increasing the frequency of the SNPs positively correlated with it. Professor Stephen Hsu has estimated that there is enough additive variance in the human population to create people with IQs of over 1000 if we were to add them all together. We almost certainly wouldn't want to incorporate all these variants into a single person, as some likely have tradeoffs with health, reproductive propensity, or other things we care about that would make incorporating them a poor choice. Another concern is whether or not IQ, as measured by tests like progressive matrices, will continue to correlate with the things we actually care about at such extreme levels. It's likely that the use of today's IQ tests to predict intelligence will break down if we push it far enough in either direction. But exactly where that point is remains an important open question. It seems likely to me that we will be able to raise average human IQ into the high 100's without any serious downsides. We already have thousands of examples of people with IQs this high, most of whom were functional in other ways we care about. In fact, if we get much better tests of genetic intelligence and we are also able to get iterated embryo selection to work, this question of how far we can safely push trait expression will become the chief remaining question.
I've followed AI safety research as a hobby for the last few years and one of the lessons I've learned from the research is that optimizing for any objective X will eventually impact another objective Y if one pushes hard enough. This will doubtless be the case with intelligence.
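A back-of-the-envelope version of that kind of estimate can be sketched as follows. This is my own reconstruction of the style of argument, not necessarily Hsu's exact calculation. It assumes roughly 10,000 causal variants, that every variant has the same effect size and a frequency of 50%, that effects are purely additive, and that one standard deviation equals 15 IQ points. For simplicity it counts one copy per locus.

```python
import math

n_variants = 10_000       # assumed number of variants affecting the trait
allele_frequency = 0.5    # simplifying assumption: every variant is common
iq_points_per_sd = 15     # conventional scaling of IQ tests

# Treat the number of "plus" variants a person carries as a binomial count.
mean_plus = n_variants * allele_frequency
sd_plus = math.sqrt(n_variants * allele_frequency * (1 - allele_frequency))

# Distance from the population mean to a genome carrying every plus variant.
max_gain_in_sd = (n_variants - mean_plus) / sd_plus
max_gain_in_iq_points = max_gain_in_sd * iq_points_per_sd

print("population SD, in variants:", sd_plus)
print("maximum gain, in SDs:", max_gain_in_sd)
print("maximum gain, in IQ points:", max_gain_in_iq_points)
```

The key point is the gap between the two scalings: the population spread grows with the square root of the number of variants, while the theoretical maximum grows linearly. That is why the ceiling sits so many standard deviations above the mean under these assumptions.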
It's quite difficult to estimate how hard it will be to develop better tests. One obvious step is to increase the sample size of the study. This will help detect genetic variants with smaller effect sizes on intelligence and to detect rarer variants. Another obvious step is to perform whole-genome sequencing, which would help capture rare variants that may account for currently uncaptured variance.
The best paper I've found on this topic is Stephen Hsu's 2014 paper On the genetic architecture of intelligence and other quantitative traits, which estimates that a sample size of a million would be enough to capture nearly all the variance. However, since its publication, a study examining ~250k individuals was only able to explain 7-10% of the variance in cognitive performance (see the top of page 2). Furthermore, this study only identified 225 significant SNP hits, well short of the 10,000 that Hsu estimates play a role in intelligence. The relationship between sample size and discovered SNPs is not linear, but it's not clear how much of the missing heritability is due to smaller sample size as opposed to other things. Are there more variants that influence intelligence with smaller effect sizes than Hsu predicted? Do non-linear effects play a bigger role? Is there some other confounding factor? I don't yet know the answer to these questions.
So after a couple of days of research trying to complete this section, I am stuck with no clear answers. It is not clear to me how large of a sample size we'll need to get an accurate measurement of intelligence, nor is it clear what additions we'll need to basic additive models to obtain high performance on such tests.
If I had to hazard a very rough guess, I would say that a sample size of 10 million with full genome sequencing performed on every participant would probably be sufficient to capture >80% of the genetically caused variance in intelligence. Assuming $300 per genome sequenced and $100 to administer each test, this comes out to a price tag of $4 billion. Not cheap, but well within the realm of feasibility. And hopefully, economies of scale would help lower the price, at least for the genome sequencing portion.
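The arithmetic behind that figure, written out so the inputs are easy to change. The per-person costs are the rough numbers quoted above, and the function name is just for illustration.

```python
def study_cost(participants, sequencing_cost, test_cost):
    """Total cost in dollars: every participant is sequenced and tested once."""
    return participants * (sequencing_cost + test_cost)

total = study_cost(participants=10_000_000, sequencing_cost=300, test_cost=100)
print(f"${total:,}")

# A hypothetical cheaper sequencing price, to show how the total responds.
cheaper = study_cost(participants=10_000_000, sequencing_cost=100, test_cost=100)
print(f"${cheaper:,}")
```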
2. Solve Iterated Embryo Selection
The second technology needed to allow for dramatic improvements in polygenic traits such as intelligence on a short timeline is Iterated Embryo Selection, or IES. IES is a technology that theoretically allows for arbitrarily large increases in trait values on a much shorter time horizon than any other near-term technology. IES involves the following 6 steps:
1. Extract somatic cells from an organism or tissue (usually skin cells or blood cells)
2. Revert these cells back to induced pluripotent stem cells
3. Develop those stem cells into gametes (reproductive cells like sperm or eggs)
4. Fertilize the gametes to create a batch of new embryos
5. Sequence the DNA of the embryos, selecting the best of the batch
6. Develop the selected embryos into a larger amount of tissue, like skin cells or blood cells
Then repeat steps 1-6.
IES essentially takes the reproductive cycle from 20+ years down to 6 months. Whereas normal IVF ends when an embryo is selected for implantation, IES takes the selected embryo through another cycle of meiosis and recombination (possibly introducing new genetic material from another group of embryos in the process). After each round of iteration, the mean trait values of the new pool of embryos will be roughly equal to those of the embryos selected in the previous round. This is the true magic of iterated embryo selection; once feasible, it allows for arbitrary gains in any genetically influenced trait.
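The ratchet effect can be sketched with a toy simulation. It makes heavy assumptions: a purely additive trait, equal effect sizes, unlinked loci, a score that measures the true genetic value perfectly, and only two founders, with the two best embryos of each round crossed with each other to seed the next. No new genetic material is introduced, so the gains eventually run out as variance is used up. Names and parameters are invented for illustration.

```python
import random
import statistics

def make_genome(n_loci, rng):
    """Two alleles per locus; 1 marks the trait-increasing variant."""
    return [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n_loci)]

def gamete(genome, rng):
    """Pick one allele per locus at random (loci treated as unlinked)."""
    return [rng.choice(pair) for pair in genome]

def cross(a, b, rng):
    """Fertilization: one gamete from each of two genomes."""
    return list(zip(gamete(a, rng), gamete(b, rng)))

def score(genome):
    """Additive score: number of trait-increasing alleles carried."""
    return sum(x + y for x, y in genome)

def iterate_selection(rounds=5, embryos_per_round=20, n_loci=500, seed=2):
    """Return the mean embryo score of each round."""
    rng = random.Random(seed)
    parents = [make_genome(n_loci, rng), make_genome(n_loci, rng)]
    mean_scores = []
    for _ in range(rounds):
        batch = [cross(parents[0], parents[1], rng) for _ in range(embryos_per_round)]
        mean_scores.append(statistics.mean(score(e) for e in batch))
        batch.sort(key=score, reverse=True)
        parents = batch[:2]   # the two best embryos seed the next round
    return mean_scores

mean_scores = iterate_selection()
print("mean embryo score per round:", [round(m, 1) for m in mean_scores])
```

Each round's batch is centered near the average of the two embryos selected in the previous round, so the mean climbs from round to round. Running more rounds with the same two founders shows the climb flattening, which is one argument for bringing in genetic material from other embryos.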
So what ingredients are we still missing? Step 1 is trivial. Step 2 has been possible since 2006 when Shinya Yamanaka's lab produced the first induced pluripotent stem cells using "Yamanaka factors", and is in fact an active step in most stem cell therapies. Step 4 is a standard part of IVF and step 5 is becoming more common. Step 6 seems like it's already possible given that most research into tissue engineering assumes embryonic stem cells or some other pluripotent stem cells as a starting point. As far as I can tell, the only step that has not yet been accomplished to completion in humans is step 3: differentiation of pluripotent stem cells into gametes.
We have already gotten step 3 to work in mice. In 2016, Hikabe et al. showed reconstitution of the entire female mouse germline in vitro (steps 1-3 in the list above). This process, known as In-Vitro Gametogenesis, is critical to all attempts at Iterated Embryo Selection. Hikabe et al. were able to harvest a sample of skin cells from the tail of the mouse, revert those cells back to a pluripotent state using Yamanaka factors, differentiate those iPSCs into oocytes, then fertilize the resulting oocytes to create mouse embryos, which were then implanted in a female mouse who gave birth to healthy pups.
There's a really fantastic summary of current progress of this technology in humans by Dr. Sherman Silber on YouTube. We are very close. The only remaining unrealized step is getting from primordial germ cells to sperm and eggs. What makes this step so difficult is recreating the conditions in which primordial germ cells mature into spermatogonial stem cells and eggs within the human body.
Silber believes this may be easier for oocytes than for sperm. To culture PGCs into oocytes, the PGCs must develop in the presence of fetal granulosa cells. These cells are critical because they emit a set of growth factors that tell the PGCs to develop into oocytes. Silber believes we should be able to replicate these conditions by isolating the growth factors and applying them to the PGCs.
Sperm are trickier. According to Dr. Silber, the only method they've been able to use so far to mature primordial germ cells into spermatogonial stem cells is injection of PGCs into the rete testis of a prepubescent boy. The pubescent development process, as it turns out, is critical to maturing PGCs into spermatogonial stem cells, and those conditions cannot be found in adult testes.
While this type of injection works well for restoring fertility in individuals who lost it in childhood (usually due to cancer treatments), it will not work for Iterated Embryo Selection. Unfortunately, a cursory search yielded no results for spermatogenesis via growth factors. Either research into spermatogenesis has not been funded or I have simply been unable to find the published papers.
So to summarize: we are very close to making Iterated Embryo Selection possible. The missing piece is the ability to turn primordial germ cells into oocytes and sperm. Ongoing research will likely make this possible for oocytes in the next 5-10 years, but the path for spermatogenesis is less clear.
Reflections on the Value of Human Genetic Engineering
This will not be my last paper on the topic, but I wanted to take a brief moment to reflect on why I think human genetic engineering is important. Apart from the obvious near-term benefits of reducing chronic disease, I think in the long run, genetic engineering will only matter if it affects the development of transformative artificial intelligence.
I don't remember exactly where I read this, but in another post I read on LessWrong, the author suggested that biological systems may simply become obsolete in the future because computer-based information processing systems will become better at turning energy into utility. I suspect that in the long run, this will probably be true.
I am very worried that current humans are simply incapable of aligning powerful AI with our interests due to the incredible technical complexity of the problem. My goal in pursuing a career in genetics with a focus on human reproduction is to increase human capability to deal with incredibly technical problems like those involved in creating TAI. Along the way I hope we can create a kinder, healthier society with fewer mismatches between our genes and our environment.
If some of the more pessimistic projections about the timelines to TAI are realized, my efforts in this field will have no effect. It is going to take at least 30 years for dramatically more capable humans to be able to meaningfully contribute to work in this field. Using Ajeya Cotra's estimate of the timeline to TAI [AF · GW], which estimates a 50% chance of TAI by 2052, I estimate that there is at most a 50% probability that these efforts will have an impact, and a ~25% chance that they will have a large impact.
Those odds are good enough for me.
33 comments
comment by lsusr · 2021-04-11T03:10:46.924Z · LW(p) · GW(p)
What an informative, well-researched, well-written post. I am curious about the Iterated Embryo Selection. If you use two parents then would it result in inbreeding? Would you need more than two parents to avoid inbreeding? If the latter then that could reduce the rate of adoption.
You also mention that "optimizing for any objective X will eventually impact another objective Y if one pushes hard enough". This is true. I wonder how much of it can be avoided by both optimizing for a positive trait X while simultaneously optimizing against the traits of people with negative life outcomes.
comment by GeneSmith · 2021-04-11T06:56:18.743Z · LW(p) · GW(p)
Thanks! I spent an embarrassingly long time writing it.
Your question about iterated embryo selection is an interesting one. I suspect that performing this procedure multiple times without adding genetic material WOULD result in higher defect rates, though I'm not positive. If one is already selecting against the types of negative traits that inbreeding increases, would we still expect to see higher rates of health conditions even after selection, or would inbreeding simply decrease the average quality of embryos due to a higher percentage having health issues?
Part of my problem is not understanding exactly why inbreeding is bad. I'm familiar with the standard answer that "inbreeding increases the chance that offspring inherit recessive diseases", but why exactly is that? One answer is Muller's ratchet, which says that environmental damage leads to a constant increase in deleterious mutations to the germline, and the only feasible way to decrease mutational load is through sex. Under this model, sex is kind of like a simple error correction mechanism: a single mutation is unlikely to occur in both organisms, so given the production of enough offspring, one of them is likely to have reduced mutational load.
So under this model, inbreeding is bad because it correlates genetic mutations. If two organisms share a larger portion of their DNA, they are likely to inherit many of the same mutations, preventing their descendants from shedding mutational load through lucky recombination.
But if that analysis is correct and inability to shed mutational load is the main reason for increased health problems among inbred offspring, then perhaps it wouldn't be such a big deal for iterated embryo selection after all, since there is very little time between generations for the embryos to accrue deleterious mutations.
In the end though, one almost certainly would want to introduce genetic material from other parents simply because it would allow for additional valuable genetic material from which to select.
"I wonder how much of it can be avoided by both optimizing for a positive trait X while simultaneously optimizing against the traits of people with negative life outcomes."
You've put your finger right on one of the most important questions for the future of this field. If we simply add enough important traits to our linear equation, can we push as far as we want into the tails of these trait distributions? I think we should be able to get at least 3 or 4 standard deviations from the mean of most traits with this method. Possibly much further. But how far can we actually go before we end up optimizing against some important trait that we simply didn't include in the equation because we either didn't think about it or didn't realize it was important?
This is one of the reasons I've tried to keep up with AI safety research. They are far far ahead of biologists in understanding how to properly frame and begin to answer these types of questions. If you squint hard enough, trait enhancement with iterated embryo selection starts to look a lot like iterated amplification and distillation, and the question of how far we should push into the tails of the distributions becomes a question of how high we can safely set the learning rate.
One of the topics I'd like to write about in the future is how to apply ideas from AI safety to the field of genetic enhancement. It's actually quite hard to do this rigorously because many of the assumptions that underlie techniques to align AI don't really work with genetic engineering. With humans, you have an insanely long lag before you can validate that the changes actually worked, and the cost of getting things wrong is huge. With software models, you can explore actions that your current policy says aren't optimal. Models can be updated very quickly. But with humans the cost of exploration is extremely high.
And unless everyone in this community is wildly off about their timelines to transformative AI, we may only get one genetically engineered generation before biological organisms like ourselves cease to be relevant to the larger strategic picture.
comment by Mitchell_Porter · 2021-04-11T06:29:22.842Z · LW(p) · GW(p)
This is not what I expected. I thought this article would be about molecular methods of directly altering the genome - CRISPR, artificial chromosomes, etc.
But instead I only see one method mentioned, and it consists of a quasi-darwinian cycle in which lots of eggs are fertilized, allowed to divide a few times, genetically screened for desired traits, and then cells from these early-stage embryos are used to make a new generation of sperm and eggs so as to repeat the cycle.
Darwinian evolution consists of variation followed by selection, and here the engine of variation is the all-natural process of chromosomal recombination that occurs during sexual reproduction. In nature, the fertilized egg then grows into an organism, and the selective filter is how well it survives and reproduces out in the world. But in the described process of accelerated artificial selection, the fertilized eggs don't grow into organisms. Instead, they are sequenced in order to discover the individual genotypes produced, and evaluated on the basis of a guess as to how they would fare, if they did grow into an organism.
To put it another way, natural selection is a cycle of genotypes that grow into phenotypes that mate and create new genotypes, but this accelerated artificial selection uses virtual phenotypes obtained by combining sequence information with GWAS-based interpretation.
I'll admit that's ingenious. And it would be interesting to know if an analogous method has ever been used successfully, on any kind of organism.
I see two opportunities for doubt: the selection criteria, and the safety of repeated artificial fertilization/gametogenesis. Regarding the first, one may doubt GWAS on the grounds of reliability (false positives) and power (not enough variance accounted for). Regarding the second, one would like to know that this process isn't creating e.g. some cumulative epigenetic artefact.
A few further comments:
This article is headlined as a "review of current and near-future methods", but it really seems to be about promoting this one particular method (iterated embryo selection). There's discussion in the comments here [LW · GW] about the history of this idea - it was mentioned in a bioethics journal in 2012, under the name "in vitro eugenics"; it was discussed by Carl Shulman at MIRI in 2009; and Gwern found a precursor dating from 1998.
I think a genuine review would have to say more about direct genetic modification. The one instance of human genetic engineering that we know about, performed in China in 2018, of course used CRISPR. I believe this is now illegal in China (see draft item 39 here), as of last month. And CRISPR ends up modifying more than just the targeted gene. Nonetheless, genome editing will surely be part of future human genetic engineering.
Meanwhile, iterated gametogenesis will just as surely have its own safety issues. They say there were 276 failed attempts before the successful cloning of a sheep (Dolly). Cumulative epigenetic modifications, of a kind not occurring in nature, seem an extremely likely risk.
Speaking of epigenetics, I've just discovered the existence of another class of methods, epigenome editing... And then there's the topic of nonheritable (and possibly temporary) genetic modifications made to mature organisms. If what you care about is biological intelligence increase, somatic gene-hacking seems likely to get there before germline gene-hacking, because you don't have to wait for your first generation to grow up.
Replies from: GeneSmith, GeneSmith↑ comment by GeneSmith · 2021-05-25T05:02:27.662Z · LW(p) · GW(p)
Speaking of epigenetics, I've just discovered the existence of another class of methods, epigenome editing... And then there's the topic of nonheritable (and possibly temporary) genetic modifications made to mature organisms. If what you care about is biological intelligence increase, somatic gene-hacking seems likely to get there before germline gene-hacking, because you don't have to wait for your first generation to grow up.
I read something relevant to this idea tonight that I think makes it less likely we will be able to significantly impact intelligence with epigenetic editing. A paper in PNAS from last year looked at which functional regions of the genome saw enrichment of educational-attainment associated SNP hits:
The EA3 study on educational attainment, a highly polygenic trait, is another notable recent example of this type of analysis (39). A very large number of category enrichment analyses was performed on 1,271 independent genome-wide significant signals detected in a GWAS of 1.1 million individuals with educational attainment data. The authors highlight two broad findings. First, the most significantly prioritized genes that were implicated as causal show trajectories of expression in the brain that are increased before the late prenatal stage of development and decline thereafter. Weaker, newly discovered, associations showed no such trajectory. This suggests a modestly disproportionate influence of brain development relative to active brain functioning in determining differences between individual abilities underlying educational attainment, which is perhaps not surprising.
This suggests that even if we were somehow able to inject into brains some epigenome-modifying vector capable of modifying a significant fraction of neurons, and even if the inevitable cell mosaicism induced by such changes had no negative impact on cognitive function, we would STILL be severely limited in the proportion of genetically influenced intelligence we could impact.
Not to mention it seems very likely that the cost of modifying 86 billion cells in the brain would far exceed the cost of sequencing embryo DNA.
↑ comment by GeneSmith · 2021-04-12T20:13:08.030Z · LW(p) · GW(p)
Thank you for writing such a thoughtful comment. I have to confess, I probably gave this post the wrong title. For the longest time I simply titled it "Genetic Engineering Part 3" as I wasn't sure what to call it when I first started. I then accidentally left that title in when I first published it and hastily changed it to its current title even though that doesn't quite fit either.
You're correct, of course, that I did not comprehensively review all possible techniques for genetic engineering. Most notable among the omissions is whole-genome synthesis, with which we could theoretically create an entire genome with any base pairs we wanted. In my research I estimated that synthesizing a whole human genome from scratch would cost about $200 million. So we still have a few orders of magnitude to go before whole-genome synthesis becomes a viable method for creating superhumans.
I also have some serious concerns about other, much more dangerous uses of whole-genome synthesis. If the technology becomes cheap enough and widely enough available, it could become an incredibly dangerous tool for engineering biological weapons. This is such a big worry that I think pursuing human genetic modification via genome synthesis might actually end up INCREASING the risk of human extinction rather than decreasing it.
Regarding the first, one may doubt GWAS on the grounds of reliability (false positives) and power (not enough variance accounted for)
If there were false positives in a GWAS then the model would have poor performance on the test set. Of course there ARE issues with GWAS predictive power when you try to generalize to other populations with a high ancestral distance from your training set. For example I remember reading about a GWAS for general cognitive ability that predicted about 10% of variance in Europeans, but only 2.5% for people of African descent. However that isn't an issue of false positives. It's an issue of different genes having different frequencies in each population. We could create a good predictor for people of African descent if we had data sets that included more people from those populations.
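A toy calculation can show how allele frequency differences alone shrink a predictor's variance explained. The sketch below uses the standard additive result that a biallelic SNP with allele frequency p and per-allele effect β contributes 2p(1-p)β² to trait variance under Hardy-Weinberg proportions. The three SNPs, their effects, and the populations `pop_a` and `pop_b` are invented for illustration, and the sketch ignores everything else that differs between populations.

```python
def additive_variance(freq, effect):
    """Variance contributed by one biallelic SNP under Hardy-Weinberg
    proportions: 2 * p * (1 - p) * beta**2."""
    return 2.0 * freq * (1.0 - freq) * effect ** 2

def score_variance(snps, population):
    """Total variance captured by a score built from independent SNPs."""
    return sum(additive_variance(s["freq"][population], s["effect"]) for s in snps)

# Hypothetical SNPs discovered in pop_a, where they happen to be common.
# The effect sizes are assumed identical in both populations.
snps = [
    {"effect": 0.10, "freq": {"pop_a": 0.50, "pop_b": 0.05}},
    {"effect": 0.08, "freq": {"pop_a": 0.40, "pop_b": 0.10}},
    {"effect": 0.12, "freq": {"pop_a": 0.30, "pop_b": 0.02}},
]

var_a = score_variance(snps, "pop_a")
var_b = score_variance(snps, "pop_b")
print(f"variance captured in pop_a: {var_a:.5f}")
print(f"variance captured in pop_b: {var_b:.5f}")
print(f"ratio pop_b / pop_a: {var_b / var_a:.2f}")
```

None of the effects here is a false positive, yet the same score captures much less variance in the second population simply because the variants it relies on are rarer there.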
Regarding the second, one would like to know that this process isn't creating e.g. some cumulative epigenetic artefact.
This is something I didn't even think about when writing the post, so thanks for bringing it up. I would think that the epigenome would be preserved throughout this process, but that assumption might be wrong.
comment by gilch · 2021-04-10T21:05:25.400Z · LW(p) · GW(p)
Wouldn't a more limited form of iterated embryo selection still be possible with the oocytes alone? You'd still have to fertilize the eggs with sperm from the current generation, but the eggs could be derived from a selected female embryo, then you do it again, sperm from the current generation, but eggs from second-generation selected female embryos, etc.
Replies from: GeneSmith↑ comment by GeneSmith · 2021-04-11T05:58:37.871Z · LW(p) · GW(p)
This is an interesting idea. I suspect that this type of selection would asymptotically approach twice the per-generation gain of simple embryo selection. So useful but not really transformative.
Replies from: gilch↑ comment by gilch · 2021-04-11T16:30:36.562Z · LW(p) · GW(p)
Are sperm necessary at all? Eggs have also gone through meiosis, so they're haploid just like a sperm nucleus. Can you just implant the nucleus from a selected embryo's egg into another selected egg and then add the "you've been fertilized" chemical signal? I'm not sure how complex that process is.
If that's too hard, what about surgically swapping in the nucleus from an egg cell produced from a selected embryo into a healthy sperm cell? Would the sperm function for long enough to fertilize an egg?
Of course, we can only produce females this way, but that could still be transformative (unless you can make oocytes from male cells to get a haploid nucleus? They do have one X chromosome, so they should have the required genes.)
Replies from: GeneSmith↑ comment by GeneSmith · 2021-04-11T18:47:25.079Z · LW(p) · GW(p)
This is an interesting idea. Nuclear transfer has been used in cloning before, but it is not particularly reliable. That being said, perhaps future research could improve the success rate (and more importantly the ease of doing so).
At the end of the day, the entire iterated embryo selection process is about generating a complete DNA sequence that scores better on our tests. I left out whole-genome synthesis from the original post because, from the brief reading I did on the topic, it seemed prohibitively expensive. But that could change in the future, as the cost per base pair has been declining exponentially for some time now. The most notable recent use of nucleic acid synthesis was to create the mRNA in the Pfizer-BioNTech and Moderna vaccines.
Maybe I'll write a future post about this topic.
comment by Samuel Clamons (samuel-clamons) · 2021-04-13T18:43:39.876Z · LW(p) · GW(p)
To clear up a possible confusion around microarrays, SNP sequencing, and GWAS - microarrays are also used to directly measure gene expression (as opposed to trait expression) by extracting mRNA from a tissue sample and hybridizing it against a library of known RNA sequences for different genes. This uses the same technology as microarray-based GWAS, but for a different purpose (gene expression vs. genomic variation), and with different material (mRNA vs. amplified genomic DNA) and analysis math.
Also, there's increasingly less reason to use microarrays for anything. It's cheap enough to just sequence a whole genome now that I'm pretty sure newer studies just use whole genome sequencing. For scale, the lab I worked in during undergrad (midsized lab at a medium sized liberal arts college, running on a few 100k $/yr) was transitioning from microarray gene expression data to whole-transcriptome sequencing back in 2014. There's a lot of historical microarray data out there that I'm sure researchers will still be reanalyzing for years, but high throughput sequencing is the present and future of genomics.
Replies from: GeneSmith↑ comment by GeneSmith · 2021-04-13T20:20:40.024Z · LW(p) · GW(p)
Thanks for the detail about microarrays.
Do you have any sense as to how much it costs to sequence a whole human genome right now? I estimated about $300, but that was based on essentially one vendor.
Replies from: samuel-clamons↑ comment by Samuel Clamons (samuel-clamons) · 2021-05-22T02:36:25.931Z · LW(p) · GW(p)
Hey, sorry for the long time replying - last I checked, it was a few hundred $s to sequence exome-only (that is, only DNA that actually gets translated into protein) and about $1-1.5k for whole genome - but that was a couple of years ago, and I'm not sure how much cheaper it is now.
comment by gilch · 2021-04-10T21:14:26.854Z · LW(p) · GW(p)
There's another problem with iterated embryo selection that I haven't seen accounted for. I don't recall the exact numbers, but some surprisingly large fraction of natural human pregnancies result in a spontaneous abortion. Exact causes of this may vary, but I think a significant fraction of those are simply not viable due to genetic mutations. Adult parents have at least proven they have the genes necessary to both survive until adulthood and find a mate. Embryonic parents haven't proven that. Certainly we can use genetic tests to screen them for known genetic diseases, and that is kind of the point, but how do we screen them for unknown genetic diseases?
Replies from: gilch, TurnTrout↑ comment by gilch · 2021-04-11T16:08:50.971Z · LW(p) · GW(p)
Searching around the web, it looks like most miscarriages are due to aneuploidy. That would be easy to detect and select against.
It's hard to find good numbers for the human mutation rate. I saw numbers ranging from 42 to 200 per generation. Sperm seem to have more mutations than eggs on average. It can vary based on environmental exposure to mutagens, and older parents tend to have more mutations on average. Perhaps embryonic parents simply wouldn't have the time to accumulate many mutations. On the other hand, one has to do unnatural things to get these embryonic cells to turn into gametes. If any of these steps are mutagenic, then the mutation rate could be even worse.
↑ comment by TurnTrout · 2021-04-10T21:57:52.796Z · LW(p) · GW(p)
I feel confused wrt the genetic mutation hypothesis for the spontaneous abortion phenomenon. Wouldn't genes which stop the baby from being born, quickly exit the gene pool? Similarly for gamete formation processes which allow such mutations to arise?
Replies from: gilch↑ comment by gilch · 2021-04-11T00:41:48.345Z · LW(p) · GW(p)
Wouldn't genes which stop the baby from being born, quickly exit the gene pool?
Yes, by killing the fetus before it's born. New mutations still happen all the time. Usually they hit junk DNA and not much happens, but what if one breaks something vital? And it's possible to inherit deleterious recessive alleles from both parents. That's why incest is still a problem, from a genetic standpoint.
Similarly for gamete formation processes which allow such mutations to arise?
And yet we still have transposons. Evolution requires some amount of mutation, which is occasionally beneficial to the species. Species that were too good at preventing mutations would be unable to adapt to changing environmental conditions, and thus die out.
Replies from: TurnTrout↑ comment by TurnTrout · 2021-04-11T02:23:10.384Z · LW(p) · GW(p)
Evolution requires some amount of mutation, which is occasionally beneficial to the species. Species that were too good at preventing mutations would be unable to adapt to changing environmental conditions, and thus die out.
We're aware of many species which evolved to extinction. I guess I'm looking for why there's no plausible "path" in genome-space between this arrangement and an arrangement which makes fatal errors happen less frequently. EG why wouldn't it be locally beneficial to the individual genes to code for more robustness against spontaneous abortions, or an argument that this just isn't possible for evolution to find (like wheels instead of legs, or machine guns instead of claws).
comment by steven0461 · 2021-04-13T20:39:45.581Z · LW(p) · GW(p)
Great post, very informative
Step 7 seems like it's already possible given that most research into tissue engineering assumes embryonic stem cells or some other pluripotent stem cells as a starting point.
Typo for "Step 6"?
Replies from: GeneSmith
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-04-13T12:51:04.602Z · LW(p) · GW(p)
Thanks for this post!
If some of the more pessimistic projections about the timelines to TAI are realized, my efforts in this field will have no effect. It is going to take at least 30 years for dramatically more capable humans to be able to meaningfully contribute to work in this field. Using Ajeya Cotra's estimate of the timeline to TAI [LW · GW], which estimates a 50% chance of TAI by 2052, I estimate that there is at most a 50% probability that these efforts will have an impact, and a ~25% chance that they will have a large impact.
Those odds are good enough for me.
How low would the odds have to be before you would switch to doing something else? Would you continue with your current plan if the odds were 20-10 instead of 50-25?
Replies from: GeneSmith, GeneSmith↑ comment by GeneSmith · 2021-04-13T20:07:40.586Z · LW(p) · GW(p)
I think if the odds were below 10% I would probably switch. Other than faster-than-expected progress in AI, the biggest thing I'm worried about is iterated embryo selection taking too long. That seems like the only technology capable of creating truly superlative humans capable of making a significant impact before TAI is created.
Replies from: TurnTrout↑ comment by TurnTrout · 2021-04-13T20:13:31.147Z · LW(p) · GW(p)
Do you think such humans would have a high probability of working on TAI alignment, compared to working on actually making TAI?
Replies from: GeneSmith↑ comment by GeneSmith · 2021-04-13T21:36:56.632Z · LW(p) · GW(p)
This is a really good question. I'm not sure I have a satisfying answer to this other than to say that awareness of the dangers of both nuclear weapons and computers has been disproportionately high among extremely smart people. John von Neumann literally woke up from a dream in 1945 and dictated to his wife the outcome of both the Manhattan Project and the more general project of computation.
One night in early 1945, just back from Los Alamos, vN woke in a state of alarm in the middle of the night and told his wife Klari:
“… we are creating … a monster whose influence is going to change history … this is only the beginning! The energy source which is now being made available will make scientists the most hated and most wanted citizens in any country.
The world could be conquered, but this nation of puritans will not grab its chance; we will be able to go into space way beyond the moon if only people could keep pace with what they create …”
He then predicted the future indispensable role of automation, becoming so agitated that he had to be put to sleep by a strong drink and sleeping pills.
In his obituary for John von Neumann, Ulam recalled a conversation with von Neumann about the “ever accelerating progress of technology and changes in the mode of human life, which gives the appearance of approaching some essential singularity in the history of the race beyond which human affairs, as we know them, could not continue.”
Or Alan Turing around the same time:
“It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers… They would be able to converse with each other to sharpen their wits. At some stage therefore, we should have to expect the machines to take control.”
Another one from him:
“Let us return for a moment to Lady Lovelace’s objection, which stated that the machine can only do what we tell it to do. One could say that a man can "inject" an idea into the machine, and that it will respond to a certain extent and then drop into quiescence, like a piano string struck by a hammer. Another simile would be an atomic pile of less than critical size: an injected idea is to correspond to a neutron entering the pile from without. Each such neutron will cause a certain disturbance which eventually dies away. If, however, the size of the pile is sufficiently increased, the disturbance caused by such an incoming neutron will very likely go on and on increasing until the whole pile is destroyed. Is there a corresponding phenomenon for minds, and is there one for machines? There does seem to be one for the human mind. The majority of them seem to be "sub critical," i.e. to correspond in this analogy to piles of sub-critical size. An idea presented to such a mind will on average give rise to less than one idea in reply. A smallish proportion are supercritical. An idea presented to such a mind may give rise to a whole "theory" consisting of secondary, tertiary and more remote ideas. Animals’ minds seem to be very definitely sub-critical. Adhering to this analogy we ask, "Can a machine be made to be super-critical?”
Granted, these are just anecdotes. And let it be noted that von Neumann and Turing both went on to make significant progress in their respective fields despite these concerns. My current theory is that yes, they are more likely to both recognize the danger of AI and do something about it. But that could be wrong. I will have to think more about this.
↑ comment by GeneSmith · 2021-04-13T20:26:43.535Z · LW(p) · GW(p)
I'm not sure about the exact threshold. If the odds were below 10% I think that would be enough for me to switch to AI.
There is one other way in which I think a career in genetics could translate into a career in existential risk mitigation: through reducing the likelihood of engineered pandemics. One of the key technologies that holds incredible potential for good and for harm is genome synthesis. Given the recent rates of cost decline, I worry that someone might be able to re-create super smallpox or something before we even get to TAI. A career in genetics would put me closer to that technology, so maybe I could help design systems to prevent that particular type of disaster.
comment by [deleted] · 2021-04-11T22:52:26.022Z · LW(p) · GW(p)
There's a hole in the assumptions in your last paragraph. Implicitly you are saying that you believe TAI will benefit from or require the actions of a few 'super-genius' human beings to make possible.
There are some flaws in your statements to unpack:
a. The existence of human 'super geniuses'. Nature can only do so much to improve our intelligence, being stuck with living cells as computational circuits in a finite brain volume, with finite energy supply. It isn't clear how meaningful the intelligence differences really are in terms of utility on actual tasks.
b. The kind of tasks that intelligence testing can measure being relevant to the task of designing a TAI. Thing is, the road to get there isn't going to involve a whole lot of someone solving math problems in their head as they pound a keyboard through the night writing reams of custom code. A whole lot of it will be careful, methodical organization of your problem into clear layers and carefully checked assumptions to prevent math leaks (a math leak would be where a heuristic being optimized for is slightly incorrect, leading to the system building a suboptimal solution. I think of it as 'leaking' the delta between the incorrect approximation and the correct approximation). A lot of the "keyboard pounding" can be automated by building early bootstrap agents that find for us a near optimal algorithm for a given piece of the AI problem. Moreover, most code should be reused so we don't have humans just re-resolving the same problems over and over.
c. A lot of the pieces needed to get there from here are probably organizational. You need thousands of people and some way to standardize everyone's efforts and build APIs and frameworks and other mechanisms to gain benefit from all these separate workers. A single person is not going to meaningfully solve this problem by themselves. You'll very likely need an immense framework of support software, and some method of iteratively improving it over time without significant regression. (the failure mode of most large software projects)
If a-c has a 90% chance of being correct, then the actual probability would be 0.1*0.25 or 2.5%, and probably not worth the hassle. Note that there is a cost - the medical procedures to create genetically modified embryos have risks of screwing something up, giving you humans who are doomed to die some horrific way.
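The arithmetic behind that 2.5% figure, written out as a minimal sketch. The function name and its parameters are illustrative, and the calculation assumes the effort can only matter if objections a-c are wrong, in which case the post's ~25% chance of a large impact applies unchanged.

```python
def chance_of_large_impact(p_objections_correct, p_large_impact_if_wrong):
    """Probability of a large impact, assuming the effort only matters
    when the objections turn out to be wrong."""
    return (1.0 - p_objections_correct) * p_large_impact_if_wrong

# 90% chance that objections a-c hold, 25% chance of large impact otherwise.
estimate = chance_of_large_impact(0.90, 0.25)
print(f"{estimate:.1%}")
```

The result is only as good as the 90% input, and it moves in proportion to the probability assigned to the objections being wrong.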
Just as a general policy, anything current flesh and blood humans are having trouble with, that smarter humans have less trouble with, current humans can probably write a piece of software that is better than the efforts of any humans. With today's techniques.
Replies from: GeneSmith↑ comment by GeneSmith · 2021-04-12T20:45:15.590Z · LW(p) · GW(p)
Nature can only do so much to improve our intelligence, being stuck with living cells as computational circuits in a finite brain volume, with finite energy supply.
This is true, and it's one of the main reasons I think AI will eventually overtake us no matter how much genetic engineering we do. But like I said in the post, there is enough additive variance in the existing gene pool to create humans with predicted IQs of over a thousand. We are far from the actual physical limits on brain size, neural conduction speed, neuron size, and many others.
It isn't clear how meaningful the intelligence differences really are in terms of utility on actual tasks.
How many company founders emphasize that "attracting talent" is the most difficult part of making their company successful? How many Nobel winners have IQs several standard deviations above the mean? It is very clear that intelligence has a huge impact on performance on a wide variety of tasks. If you want more examples of this, I suggest you read one of Gwern's many essays on the topic.
Thing is, the road to get there isn't going to involve a whole lot of someone solving math problems in their head as they pound a keyboard through the night writing reams of custom code.
This isn't an accurate characterization of the type of task that unusually intelligent people excel at. I agree with you that raw intelligence isn't the only thing needed to solve this problem. Creating proper organizational incentive structures will matter a lot too, as will clear-headed thinking about the problem. But intelligent people are actually very good at exactly those types of things. Look at how many of the top scientists in the country played a significant role in the Manhattan Project.
A single person is not going to meaningfully solve this problem by themselves.
Which is why I didn't suggest we create one super-genius and call it a day. I don't want access to pre-implantation genetic screening or iterated embryo selection to be available to only the privileged elite. It needs to be broadly available to any parent that wants it. And the benefits can and will go to many fields, not just AI.
Just as a general policy, anything current flesh and blood humans are having trouble with, that smarter humans have less trouble with, current humans can probably write a piece of software that is better than the efforts of any humans. With today's techniques.
This is just obviously wrong. TAI cannot be created by current humans with today's techniques. It's likely going to take decades to create the technology to do so, and it's going to take some of the smartest researchers in the world to do it.
Replies from: None↑ comment by [deleted] · 2021-04-12T21:28:58.132Z · LW(p) · GW(p)
My support for the last paragraph is that many of the things we credit "exceptionally smart" people with doing, like solving equations, can be automated. Or exploring function spaces for a better solution. Or, well, any problem that has a checkable answer, which are the very things IQ tests measure.
It's not on an IQ test how to imagine a better aircraft that is both creative and meets design specs. It's always problems that a clear answer exists for.
Anyways in my personal experience I have met a lot of "brittle" people. They have no outer visualization for how a machine actually works and just get stuck the moment they hit a problem that wasn't in a training exercise at school. Basic ideas just don't occur to them.
But yeah if you put me up against them on rigidly defined problems taught in a book I might be slightly slower.
Note that I personally test at around 80-97th percentile depending on the test. (MCATs was 97). This tells me that whatever intelligence I have lucked into having is substantially above average but not the best.
I am saying an army of people only as good as me - top quintile - can and will create TAI decades before genetic engineering will matter.
Replies from: GeneSmith↑ comment by GeneSmith · 2021-04-13T02:37:51.833Z · LW(p) · GW(p)
I am saying an army of people only as good as me - top quintile - can and will create TAI decades before genetic engineering will matter.
Yes, this is a concern for the utility of this approach. If TAI is created before 2050, none of this work will matter much because none of the unusually intelligent people we've been able to create will have had time to make meaningful contributions to the field of AI. In that sense, research in this field is a gamble that only starts paying off if AI takes until at least 2050. Genetic engineering will have a progressively larger impact the longer it takes to develop TAI.
This timing concern was actually one of my chief worries about going into genetics as a career. I won't be able to switch careers and start having a large impact on AI if research in that field progresses faster than expected. So it's possible there will come a point in the future where I am stuck on the sidelines in the final years before TAI is created, watching 30 years of work come to nothing.
But I think 50% odds of having a huge impact are worth taking, and I think the biological route to superintelligence is severely neglected right now. Who is actually working on genetic engineering right now? I literally know one person who has both expressed an interest in genetic engineering for intelligence and has real scientific expertise in the field: Steven Hsu. And sadly he seems to have turned away from his earlier goals after his public humiliation at the hands of misguided student activists at Michigan State University.
I am hopeful that as pre-implantation genetic screening via IVF becomes a more normalized part of the pregnancy process, attitudes will change. It's pretty silly that so many people think enhancing our children's potential via physical exercise and healthy food is acceptable but that genetic intervention should be off-limits.
Replies from: None↑ comment by [deleted] · 2021-04-13T17:28:20.956Z · LW(p) · GW(p)
Oh. The reason you shouldn't go into genetics as a career is you will not be permitted to do anything on humans until after we have TAI. Your career will just be wasted. You should work on AI unless you are already in a PhD program.
There are countless legal and structural barriers in the way.
Replies from: Zack_M_Davis, GeneSmith↑ comment by Zack_M_Davis · 2021-04-13T18:15:00.346Z · LW(p) · GW(p)
The effective altruist case for regime change??
Replies from: None↑ comment by GeneSmith · 2021-04-13T20:16:38.267Z · LW(p) · GW(p)
There are at least two companies in the US alone already doing pre-implantation screening for polygenic disease risk right now, and one of them is offering screening for unusually low IQ already. It's not that big of a stretch to imagine that parents will want to actively screen for IQ or other important traits in the next decade.
There are no legal barriers to embryo selection for intelligence. There may be some put up at some point in the future (which is a source of worry for me), but the current barriers are technological, not legal.
There was a survey done in Singapore and 87% of parents said they would be willing to intervene genetically to make their children smarter if the option was available. Attitudes in Korea are similar. If worst comes to worst, I'll just work for a company or in a lab somewhere that hasn't banned it.
Replies from: None