Posts

How truthful is GPT-3? A benchmark for language models 2021-09-16T10:09:52.569Z
Owain_Evans's Shortform 2021-06-19T13:17:54.273Z
AI Safety Research Project Ideas 2021-05-21T13:39:39.790Z
Solving Math Problems by Relay 2020-07-17T15:32:00.985Z
Quantifying Household Transmission of COVID-19 2020-07-06T11:19:34.047Z
Update on Ought's experiments on factored evaluation of arguments 2020-01-12T21:20:42.317Z
Neural nets as a model for how humans make and understand visual art 2019-11-09T16:53:49.350Z
Machine Learning Projects on IDA 2019-06-24T18:38:18.873Z
Model Mis-specification and Inverse Reinforcement Learning 2018-11-09T15:33:02.630Z

Comments

Comment by Owain_Evans on How truthful is GPT-3? A benchmark for language models · 2021-09-17T11:47:22.648Z · LW · GW

Many possible prompts can be tried. (Though, again, one needs to be careful to avoid violating zero-shot). The prompts we used in the paper are quite diverse. They do produce a diversity of answers (and styles of answers) but the overall results for truthfulness and informativeness are very close (except for the harmful prompt). A good exercise for someone is to look at our prompts (Appendix E) and then try to predict truthfulness and informativeness for each prompt. This will give you some sense of how additional prompts might perform. 

Comment by Owain_Evans on How truthful is GPT-3? A benchmark for language models · 2021-09-17T11:43:28.009Z · LW · GW

Thanks for your thoughtful comment! To be clear, I agree that interpreting language models as agents is often unhelpful. 

a main feature of such simulator-LMs would be their motivationlessness, or corrigibility by default. If you don’t like the output, just change the prompt!

Your general point here seems plausible. We say in the paper that we expect larger models to have more potential to be truthful and informative (Section 4.3). To determine if a particular model (e.g. GPT-3-175B) can answer questions truthfully we need to know:

  1. Did the model memorize the answer such that it can be retrieved? A model may encounter the answer in training but still not memorize it (e.g. because it appears rarely in training). 
  2. Does the model know it doesn’t know the answer (so it can say “I don’t know”)? This is difficult because GPT-3 only learns to say “I don’t know” from human examples. It gets no direct feedback about its own state of knowledge. (This will change as more text online is generated by LMs). 
  3. Do prompts even exist that induce the behavior we want? Can we discover those prompts efficiently? (Noting that we want prompts that are not overfit to narrow tasks). 

(Fwiw, I can imagine finetuning being more helpful than prompt engineering for current models.)

Regarding honesty: We don’t describe imitative falsehoods as dishonest. In the OP, I just wanted to connect our work on truthfulness to recent posts on LW that discussed honesty. Note that the term “honesty” can be used with a specific operational meaning without making strong assumptions about agency. (Whether it’s helpful to use the term is another matter.)

Comment by Owain_Evans on How truthful is GPT-3? A benchmark for language models · 2021-09-16T17:58:45.666Z · LW · GW

No, what I wrote is correct. We have human evaluations of model answers for four different models (GPT3, GPT2, GPT-Neo/J, UnifiedQA). We finetune GPT3 on all the evaluations for three out of four models, and then measure accuracy on the remaining (held-out) model. For example, let's say we finetune on (GPT3, GPT2, GPT-Neo/J). We then use the finetuned model to evaluate the truth/falsity of all 817 answers from UnifiedQA and we find that 90% of these evaluations agree with human evaluations.

(Bonus: If we finetune on all four models and then measure accuracy on answers generated by a human, we also get about 90% accuracy.)
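
The held-out procedure described above can be sketched as follows. This is only an illustration of the evaluation loop, not the paper's code: `fit_judge` is a hypothetical stand-in for finetuning GPT3 (here it is a trivial majority-label classifier), and the data layout is an assumption.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Each evaluation pairs a model answer with the human truth label (assumed layout).
Evaluation = Tuple[str, bool]
Judge = Callable[[str], bool]


def fit_judge(train: List[Evaluation]) -> Judge:
    """Hypothetical stand-in for finetuning: always predicts the majority human label."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda answer: majority


def held_out_accuracy(evals_by_model: Dict[str, List[Evaluation]]) -> Dict[str, float]:
    """Train on every model but one, then measure agreement with humans on the held-out model."""
    scores = {}
    for held_out, test in evals_by_model.items():
        train = [
            evaluation
            for name, evaluations in evals_by_model.items()
            if name != held_out
            for evaluation in evaluations
        ]
        judge = fit_judge(train)
        agree = sum(judge(answer) == label for answer, label in test)
        scores[held_out] = agree / len(test)
    return scores
```

The point of the loop is that the judge never sees evaluations of the model it is scored on, so the agreement rate measures generalization to a new model's answers.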

Comment by Owain_Evans on How truthful is GPT-3? A benchmark for language models · 2021-09-16T15:42:45.071Z · LW · GW

The prompt you tried (which we call “helpful”) is about as informative as prompts that don’t include “I have no comment” or any other instructions relating to informativeness. You can see the results in Appendix B.2 and B.5. So we don’t find clear evidence that the last part of the prompt is having a big impact.  

Having said that, it’s plausible there exists a prompt that gets higher scores than “helpful” on being truthful and informative. However, our results are in the “true zero-shot setting”. This means we do not tune prompts on the dataset at all. If you try out lots of prompts and pick the one that does best on a subset of our questions, you’ll probably do better, but you’ll no longer be in the true zero-shot setting. (This paper has a good discussion of how to measure zero/few-shot performance.)

Comment by Owain_Evans on Owain_Evans's Shortform · 2021-08-31T09:53:44.194Z · LW · GW

Re: the Long Covid healthcare workers study. This seems like one of the best studies because of the matched control group and the fact that it's median 7.5 months after people had Covid. (Also, the demographics are a better fit for LW readers in age and health.) My main takehomes from this study:

1. 3% of Covid cases self-described as having ongoing symptoms at least 6 months out. This is only 4 people and so error bars are large. The inferred prevalence would be lower for men as this sample is skewed to women. 11% of cases had sporadic symptoms, but this seems significantly less bad than ongoing symptoms.

2. There were differences between Covid cases and controls in self-reported symptoms that weren't picked up by (1). The really big effect is loss of smell/taste (which I don't see as very concerning). The neurological effects MichaelStJules cites seem less concerning. 15% of people without Covid are complaining of brain fog and 28% with Covid. I'm a bit puzzled about 15% of non-Covid people saying this. But given that they self-describe as having brain fog on (IMO) flimsy grounds, it's not that surprising that 13% more of Covid cases would report this (even if actual rates were only a few % different). This could be explained by demographic differences in front-line workers vs office/tech staff. Or from people hearing that Covid causes brain fog. Or from having brain fog during Covid and then being primed to notice it.
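
To make "error bars are large" concrete, here is a rough sketch of a Wilson score interval for a count of 4. The denominator of 130 is an assumption backed out from "4 people" being about 3%; the study's exact sample size is not given here.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Assumption: 4 people at ~3% implies a denominator of roughly 130 Covid cases.
low, high = wilson_interval(4, 130)
print(f"{low:.1%} to {high:.1%}")
```

Under that assumed denominator the interval spans very roughly 1% to 8%, so the point estimate of 3% is compatible with quite different underlying rates.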

Some concerns about the study:
1. Selection bias in who filled out the survey (e.g. people who think they have Long Covid are more likely to fill out the questionnaire, people with the worst cases of Long Covid are less likely to fill out the survey).
2. The % among non-Covid with neurological symptoms is absurdly high and so it's clear the self-report methodology is very noisy/confusing. (These are all people employed in healthcare and skew younger so I'd expect serious neurological symptoms to be rare).
3. Different demographics of Covid cases vs non-Covid cases. 
4. Only ~100 Covid cases and so can't detect rare effects. 
5. The survey asked explicitly about Long Covid and so primed people about it. 
6. These healthcare workers who had Covid all knew they had it (lab confirmed). An ideal study would look at people who never got a positive test. 
7. They excluded people who had Covid less than 6 months ago. That might induce some bias for prevalence estimates (but not sure). 

Comment by Owain_Evans on COVID/Delta advice I'm currently giving to friends · 2021-08-31T09:51:12.433Z · LW · GW

But the IFR for SARS is order of magnitudes higher and we know that severe illness is more likely to cause long-term effects. 

Comment by Owain_Evans on COVID/Delta advice I'm currently giving to friends · 2021-08-31T09:50:06.711Z · LW · GW

Re: the healthcare workers study. This seems like one of the best studies because of the matched control group and the fact that it's median 7.5 months after people had Covid. My main takehomes from this study:

1. 3% of Covid cases self-described as having ongoing symptoms at least 6 months out. This is only 4 people and so error bars are large. The inferred prevalence would be lower for men as this sample is skewed to women. 11% of cases had sporadic symptoms, but this seems significantly less bad than ongoing symptoms.

2. There were differences between Covid cases and controls in self-reported symptoms that weren't picked up by (1). The really big effect is loss of smell/taste (which I don't see as very concerning). The neurological effects MichaelStJules cites seem less concerning. 15% of people without Covid are complaining of brain fog and 28% with Covid. I'm a bit puzzled about 15% of non-Covid people saying this. But given that they self-describe as having brain fog on (IMO) flimsy grounds, it's not that surprising that 13% more of Covid cases would report this (even if actual rates were only a few % different). This could be explained by demographic differences in front-line workers vs office/tech staff. Or from people hearing that Covid causes brain fog. Or from having brain fog during Covid and then being primed to notice it.

Some concerns about the study:
1. Selection bias in who filled out the survey (e.g. people who think they have Long Covid are more likely to fill out the questionnaire, people with the worst cases of Long Covid are less likely to fill out the survey).
2. The % among non-Covid with neurological symptoms is absurdly high and so it's clear the self-report methodology is very noisy/confusing. (These are all people employed in healthcare and skew younger so I'd expect serious neurological symptoms to be rare).
3. Different demographics of Covid cases vs non-Covid cases. 
4. Only ~100 Covid cases and so can't detect rare effects. 
5. The survey asked explicitly about Long Covid and so primed people about it. 
6. These healthcare workers who had Covid all knew they had it (lab confirmed). An ideal study would look at people who never got a positive test. 
7. They excluded people who had Covid less than 6 months ago. That might induce some bias for prevalence estimates (but not sure). 
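
Concern 4 above (only ~100 cases) can be illustrated with a quick calculation: the chance that a sample of that size contains no one at all with an effect of a given prevalence. This is a sketch that assumes cases are independent; the prevalence values are hypothetical.

```python
def prob_no_cases(prevalence: float, n: int) -> float:
    """Chance that a sample of n independent cases contains zero people with the effect."""
    return (1 - prevalence) ** n


# Hypothetical prevalences, with n = 100 taken from "~100 Covid cases".
for prevalence in (0.001, 0.01, 0.03):
    print(f"prevalence {prevalence:.1%}: P(no cases in 100) = {prob_no_cases(prevalence, 100):.2f}")
```

An effect that hits 1 in 100 people would be missed entirely more than a third of the time in a sample this small.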

Comment by Owain_Evans on Delta Strain: Fact Dump and Some Policy Takeaways · 2021-08-04T21:20:55.479Z · LW · GW

I added a link above. The ONS is the UK's national statistics agency. This is not a peer-reviewed paper but a report they published. (I find these reports to be mixed in quality). 

In the Nature paper, they get 2.3% with symptoms overall. But they estimate that 30 yos are less likely than older cohorts to have symptoms at 56 days and so you could adjust down a bit. (Women are also at higher risk according to this study). 

Comment by Owain_Evans on Delta Strain: Fact Dump and Some Policy Takeaways · 2021-08-04T19:09:08.222Z · LW · GW

Bell mentions this paper in Nature Medicine that finds only 2.3% of people having symptoms after 12 weeks. (The UK ONS study that is Bell's main source estimates 13%). It seems better to take a mean of these estimates than to just drop one of them, as the studies are fairly similar in approach. (Both rely on self-report. The sample size for the Nature paper is >4000).

Note that the 13% figure in the ONS study drops to 1% if you restrict to subjects who had symptoms every week. (The study allows for people to go a week without any symptoms while still counting as a Long Covid case). I realize people report Long Covid as varying over time, but it's clearly worse to have a condition that causes some fatigue or tiredness at least once a week rather than at least once every two weeks.
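
For reference, taking a mean of the two estimates quoted above works out as follows. This is a small sketch: the unweighted arithmetic mean is the plain reading of "take a mean", and the geometric mean is included only as one alternative way of averaging rates that differ by a large factor.

```python
from statistics import geometric_mean, mean

# Long Covid prevalence at ~12 weeks, in percent, as quoted in the comment above.
estimates = {"Nature Medicine paper": 2.3, "UK ONS report": 13.0}

values = list(estimates.values())
simple = mean(values)
geometric = geometric_mean(values)
print(f"arithmetic mean: {simple:.2f}%, geometric mean: {geometric:.2f}%")
```

Either way the combined figure sits well below the 13% headline and well above 2.3%.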

Comment by Owain_Evans on Delta Strain: Fact Dump and Some Policy Takeaways · 2021-08-02T09:40:01.957Z · LW · GW

I quickly skimmed the El-Aly et al paper. It does look much better than some of the other studies. One concern is the demographics of the patients. Only 25% of people with Covid are younger than 48. Only 12% are female. I'd guess the veterans under 35 are significantly less affluent than LW readers. (Would more affluent veterans use private health care?). At a glance, I can't see results of any regressions on age but it might be worth contacting the authors about this. 

How to adjust for this? One option is just to look at hospitalization risk (see AdamGleave's adjustment point (1)). However, it seems plausible that younger and healthier people would also recover better from less acute cases (and be less likely to have lingering symptoms). OTOH, there's anecdata and data (of less high quality IMO) suggesting that Long Covid doesn't fit the general pattern of exponential increases in badness of Covid (and other similar diseases) with age. Overall, I'd still be inclined to make an adjustment of risk down if you are under 35 and healthy.

Demographic info about patients in El-Aly et al.
Comment by Owain_Evans on Delta Strain: Fact Dump and Some Policy Takeaways · 2021-08-01T09:04:05.468Z · LW · GW

In another comment, I discuss what seems like a big limitation of the paper. 

Comment by Owain_Evans on Covid 7/29: You Play to Win the Game · 2021-07-31T13:15:21.521Z · LW · GW

I also am skeptical that this effect could fail to partly fade with time or as symptoms fully go away, whereas they are claiming to not see such effects. 

I'm also skeptical because effects from time in ICU for other respiratory diseases and other conditions do partly fade if you wait long enough (e.g. 6-12 months). Trying to make sense of the supplementary figures, it seems to me that nearly all subjects did the cognitive test less than 3 months after the onset of Covid (despite what the figure actually shows). Here's the figure (downloaded from this page):

 

The top graph suggests a non-trivial proportion completing the assessment 3 months after onset. However, this is self-report and lots of people erroneously believed they had Covid in the early days of the epidemic (when there was almost zero testing in the UK for mild cases). The bottom graph suggests that the cognitive assessment is mostly over by the end of May. So people with onset >3 months earlier had Covid before the start of March. Yet the UK had very few cases before March: the first wave peak was after April 15. On March 13, there had been a total of 10 deaths (corresponding to 1000 cases on a 1% IFR). So I think their inferred "illness onset" plot on the bottom graph is seriously flawed. I haven't run the numbers, but I'm guessing that the time from onset of Covid to assessment is (i) a narrower distribution than the top figure (due to truncating at 3 months), and (ii) has a mode shifted left of 2 months.

If I'm right in my analysis, this suggests the following:
1. The researchers were sloppy.
2. The study cannot tell us that much about Long Covid because the time since onset is too short. 
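
The deaths-to-cases conversion used above (10 deaths, 1% IFR, 1000 cases) is a one-line calculation. The sketch below also shows how the implied case count moves under other IFR values; those alternative values are illustrative assumptions, not figures from the study.

```python
def implied_cases(deaths: float, ifr: float) -> float:
    """Cumulative infections implied by a death count under a given infection fatality rate."""
    return deaths / ifr


# 1% is the IFR used in the comment; 0.5% and 2% are hypothetical alternatives.
for ifr in (0.005, 0.01, 0.02):
    print(f"IFR {ifr:.1%}: ~{implied_cases(10, ifr):,.0f} cases")
```

Even at the low end of these assumed IFRs the implied case count is in the low thousands, which is the point of the argument: very few people in the UK could have had onset before March.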

Comment by Owain_Evans on Covid 7/29: You Play to Win the Game · 2021-07-30T09:49:00.264Z · LW · GW

Re: the Long Covid study. 

1. The anecdotal reports of Long Covid often suggest periodic bouts of low performance rather than a permanent decline. So doing a short intelligence test is not great for measuring this. (It might still be the best thing we have). 

2. We might expect smaller impact on cognition and better recovery in younger, healthier people. Do they break down further by age? Looks like <20% of the symptomatic Covid sample is under 30 and so a null result for under 25s is consistent. 

3. Other surveys have found extremely high rates of people erroneously inferring they had Covid. 

4. The mildest Covid is associated with a 0.5 point IQ difference. What does this mean in concrete terms? (In terms of SAT, I'd guess getting a single question wrong?). How does this compare to (a) doing the test in the morning vs the evening, (b) doing the test in the months after a bad cold, (c) doing the test after being on vacation for 4 weeks?  Why does this matter? People who believe they had mild Covid in 2020 were probably quite scared on average (surveys show people view Covid as much more dangerous to younger people than it is and mild symptoms may precede severe symptoms) and they had to self-isolate for weeks. Many people were also not working or had some reduced work schedule. 
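
One way to put the 0.5 point difference in concrete terms is to convert it to standard deviations and percentiles. This sketch assumes the conventional IQ scaling (mean 100, SD 15) and a normal distribution; the study's own scoring may differ.

```python
from statistics import NormalDist

IQ_MEAN, IQ_SD = 100.0, 15.0  # conventional scaling, assumed here


def percentile_after_shift(points: float) -> float:
    """Percentile of a median scorer after their score shifts by `points` IQ points."""
    return NormalDist(IQ_MEAN, IQ_SD).cdf(IQ_MEAN + points)


effect_in_sd = 0.5 / IQ_SD
print(f"{effect_in_sd:.3f} SD; median scorer moves to percentile {percentile_after_shift(-0.5):.1%}")
```

On these assumptions a 0.5 point drop is about a thirtieth of a standard deviation, moving someone at the median down by a little over one percentile point.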

Comment by Owain_Evans on Slack Has Positive Externalities For Groups · 2021-07-30T09:03:16.634Z · LW · GW

There are benefits to being in the Bay Area of the same type as the costs you are citing.  There are more people close by who can help you build things (e.g. give funding, lend equipment, contribute skills or lab space). One could compare the rate of hardware startups in the Bay Area (including expensive parts where you are unlikely to have a huge yard) to low-density and cheaper areas. 
 

Comment by Owain_Evans on Delta Strain: Fact Dump and Some Policy Takeaways · 2021-07-28T09:31:50.924Z · LW · GW

I didn't read the Lancet paper. Are they able to rule out selection biases? It's possible that people who got mild Covid will score slightly lower on the cognitive tests (even if you adjust for observable demographic differences). It also seems plausible that this very small measured difference (for non-respiratory Covid) will further diminish over time. (Also the mean age is ~47 and so a 30yo should expect smaller effects and better recovery in any case). 

Comment by Owain_Evans on Delta Strain: Fact Dump and Some Policy Takeaways · 2021-07-28T09:13:12.107Z · LW · GW

The UK was already mostly open before July 19. So "release the lockdown" is highly misleading. It's also too early to see the effect of July 19 on cases. 

 

Comment by Owain_Evans on Delta Strain: Fact Dump and Some Policy Takeaways · 2021-07-28T09:08:46.408Z · LW · GW

I believe FNR depends on swabbing (which varies based on equipment and the individual doing it), on PCR equipment, and on the patients (e.g. how early are you testing people? age of patients, etc). Then there's the issue of how you get ground truth, which might also contribute to variation in these estimates.

Comment by Owain_Evans on Covid 7/22: Error Correction · 2021-07-24T08:50:15.496Z · LW · GW

Your theory doesn't explain the prevalence of highly unusual Covid-specific symptoms among the mix that makes up Long Covid (I'm thinking of strawberries smelling like burnt tyres)

The persistence of anosmia doesn't entail that other symptoms are caused by Covid. (IIRC the relevant cells in the nose take a while to regenerate). Though I agree this provides some evidence that Covid is the cause. 

 

There's a second plausible mechanism with Covid: It affects blood vessels and lots of organ systems at once, so lasting damage causing fatigue seems to make sense

This predicts that you'd find organ damage in these patients. Are there studies showing clear organ damage in people with mild cases 6 months later? 
 


Some people's Long Covid symptoms are too outlierish in their severity to be anything that develops normally. E.g., people who used to be highly into sports report that they get out of breath just climbing stairs, and that this persists for a period of years. My impression is that this sort of thing never just happens without an identifiable cause.

I disagree. This does happen without an identifiable cause. 

Comment by Owain_Evans on Covid 7/22: Error Correction · 2021-07-23T09:30:08.658Z · LW · GW

The 15-second transmission (if it's reasonably common and doesn't involve one person coughing or sneezing in someone's face) suggests a quite different infectiousness profile than other variants. You'd expect massive superspreader events from public transport, bars, and events. You'd expect very few people to be infected by family members (because they are exposed to so many people for 15 seconds). I'm not ruling this out but it sounds a bit unlikely based on reports of Delta spread so far.

Comment by Owain_Evans on Owain_Evans's Shortform · 2021-07-23T09:09:13.161Z · LW · GW

How much does Christianity explain Western economic and intellectual development? Some considerations against:
 

  1. Lack of comparable successes in most of the Orthodox Christian world. 
  2. Impressiveness of Classical Greece and Hellenistic world vs Europe until the Renaissance and scientific revolution. 
  3. Temporal correlation between Renaissance and scientific revolution and great uptick of interest in classical works (vs Christian texts).
  4. AFAIK, Christians outside Europe (Ethiopia, Middle East) not being especially successful intellectually or economically. 
  5. Scandinavia being pagan till fairly late. 
  6. Jews in Europe being very successful economically and intellectually despite not being Christian. 
  7. Underperformance of places where Catholic Church has lingering strong influence (Spain, Portugal, Italy, Poland, Ireland). 
  8. What actual ideas from Christianity (that seem distinctive and not found elsewhere) do scientists, philosophers, economists, business people, political leaders (etc) draw on directly? My sense is not that many. 
Comment by Owain_Evans on ($1000 bounty) How effective are marginal vaccine doses against the covid delta variant? · 2021-07-22T11:43:57.346Z · LW · GW

I'd like to see investigations of the following:

1. Secondary attack rates for Delta for pairs of people who are fully vaccinated with Pfizer. (I'm guessing this is relevant to someone in the Bay, as most of the people you have close contact with for more than 30 mins are fully vaccinated).
 
2. Long Covid rates for the fully vaccinated (both Delta and other variants). 

3. Using genetics or other personal information to estimate risk of severe disease or Long Covid. (e.g. Blood type was relevant to severe disease risk for original variant.) 

4. How to get access to new vaccines designed for Delta (or for "universal" Covid). Most likely route is clinical trials. Maybe some countries will approve earlier than others (e.g. UK, UAE). 

5. Aside from being more infectious, is the guidance for avoiding infection from Delta any different? E.g. surfaces vs aerosol.

Comment by Owain_Evans on Owain_Evans's Shortform · 2021-07-09T07:25:43.240Z · LW · GW

These are great examples. Maybe a meta class on how to learn manual skills from video tutorials?

Comment by Owain_Evans on Covid 6/24: The Spanish Prisoner · 2021-06-28T11:11:45.129Z · LW · GW

On the paper on loss of grey matter:

  • Mean age of subjects who got Covid was 61 or 62. The range was 50-80. We know effects of Covid are exponentially worse as a function of age. So not clear this would generalize to people in their 20s.
  • Decent Twitter thread discussing whether Covid caused the reduction and whether it might be reversible. https://twitter.com/Neuro_Skeptic/status/1406693917899341824
Comment by Owain_Evans on Owain_Evans's Shortform · 2021-06-20T07:24:04.908Z · LW · GW

Some more ideas:

  1. superforecasting: the best class would involve people actually doing forecasting on something like Metaculus or on a prediction market with financial bets. 
  2. real-world practical applications of deep learning: considering the technical and economic/ethical aspects
  3. immersive sociology/anthropology of internet cultures: you can't do traditional anthropological fieldwork in an undergrad class but you can lurk or participate in any one of innumerable online subcultures. 
  4. immersion in country X: using machine translation, it's possible to consume newspapers, Twitter, TV, etc from another country without speaking the language. someone who knows country X well could build an engaging class around this.
  5. cooking: there are not many college courses on cooking (Harvard's famous class is an exception). YouTube is pretty great for demonstrations. 
Comment by Owain_Evans on Owain_Evans's Shortform · 2021-06-19T13:17:54.586Z · LW · GW

Alyssa Vance asked, "What great classes could be taught using ideas that might be seen on the Internet, but aren't part of a standard curriculum yet?". 
 

My answer:

Deep learning (especially recent ideas like graph neural nets, transformers, GPT-3, deep learning applied to science), online advertising, cryptocurrency, contemporary cybersecurity, the internet in China (seems valuable for people outside China to understand), CRISPR, human genetics (e.g. David Reich's work), contemporary videogames (either from technological or cultural/artistic perspective), contemporary TV, popular music in the age of Spotify, internet culture (e.g. Reddit, social media, memes).

Comment by Owain_Evans on MIRI location optimization (and related topics) discussion · 2021-05-14T08:05:48.653Z · LW · GW

I'd guess it's not easy to change the land use for a farm and that it would be expensive and slow to build a campus in or near Oxford. It's probably easier to move into an existing "campus" (e.g. for a school, training center, residential conference facility). 

Immigration-wise: It will be harder for EU people to move to the UK going forward but (AFAICT) easier for people from Canada, the US, Australia and elsewhere. The UK now has a points system for skilled workers (you need a job offer) and a special visa (don't need a job offer) for people in academia, research and "digital technology" (which covers fintech, gaming, cybersecurity and AI among other areas).

Comment by Owain_Evans on How do you run a fit-test for a mask at home when you don't have fancy equipment? · 2021-03-23T10:33:11.114Z · LW · GW

My answer links to a paper claiming that aroma diffusers can work well but humidifiers and spray bottles did less well.

Comment by Owain_Evans on How do you run a fit-test for a mask at home when you don't have fancy equipment? · 2021-03-23T10:31:23.894Z · LW · GW

This paper from engineers at Cambridge University claims that a standard aroma diffuser and plastic bag is close to the performance of commercial equipment. That said, I'm not sure how much the total cost and prep time would compare to the nebulizer approach that jimrandomh suggests.

Paper Link

Abstract

Objective:

Qualitative fit testing is a popular method of ensuring the fit of sealing face masks such as N95 and FFP3 masks. Increased demand due to the coronavirus disease 2019 (COVID-19) pandemic has led to shortages in testing equipment and has forced many institutions to abandon fit testing. Three key materials are required for qualitative fit testing: the test solution, nebulizer, and testing hood. Accessible alternatives to the testing solution have been studied. This exploratory qualitative study evaluates alternatives to the nebulizer and hoods for performing qualitative fit testing.

Methods:

Four devices were trialed to replace the test kit nebulizer. Two enclosures were tested for their ability to replace the test hood. Three researchers evaluated promising replacements under multiple mask fit conditions to assess functionality and accuracy.

Results:

The aroma diffuser and smaller enclosures allowed participants to perform qualitative fit tests quickly and with high accuracy.

Conclusions:

Aroma diffusers show significant promise in their ability to allow individuals to quickly, easily, and inexpensively perform qualitative fit testing. Our findings indicate that aroma diffusers and homemade testing hoods may allow for qualitative fit testing when conventional apparatus is unavailable. Additional research is needed to evaluate the safety and reliability of these devices.

Comment by Owain_Evans on How do you run a fit-test for a mask at home when you don't have fancy equipment? · 2021-03-15T10:22:17.088Z · LW · GW

Do you have tips on how to not fail without having one of these test kits? Which N95s work best? Do rubber P100s tend to fit better?

Comment by Owain_Evans on Anna and Oliver discuss Children and X-Risk · 2021-02-28T09:40:39.370Z · LW · GW

Ability to succeed in building organizations or movements will correlate with ability to organize childcare (either through family, friends or paid help).

Comment by Owain_Evans on Anna and Oliver discuss Children and X-Risk · 2021-02-28T09:29:59.723Z · LW · GW

Clintons? Obamas? There are many examples from academia. Nobel Laureates Banerjee and Duflo, or these two economists:
 

During the pregnancy, they employed a doula, or birth companion; after the birth, they hired a nanny named Ellen, who had a BA and was finishing her master's degree in education policy, and whom they paid $US50,000 (about $65,000) a year. "We didn't just want a warm body," Wolfers says, over his second beer. "Some people just want someone who'll keep their kids safe, but we wanted more than that."
 

Comment by Owain_Evans on Anna and Oliver discuss Children and X-Risk · 2021-02-28T09:17:02.753Z · LW · GW

I haven't re-read the paper, although IIRC there are critiques online of this paper and the author's other statistical analyses. How strong do you think the evidence is for the counterfactual "If a person chooses to have kids, their chance of major achievement will drop substantially" (for a range of different people)? Ideally there'd be natural experiments (due to infertility or someone who didn't want kids raising their sibling's children etc).
 

These graphs aren't that different and (I'd guess) it wouldn't be hard to p-hack to get the intended result. Rate of being unmarried will vary over time and with country and this will correlate with age of achievements (e.g. if people in biology peak later than math/physics, if there are more biologists in the UK and math/physics people in Germany and Italy). And there's the causal / counterfactual inference.

Comment by Owain_Evans on Anna and Oliver discuss Children and X-Risk · 2021-02-27T08:50:34.153Z · LW · GW

I'd like to see discussion of data rather than mostly a priori argument ("I have a sense" ... "I suspect desire"). For aggregate data, there's the SSC survey and there are studies of "ambitious" groups (e.g. the Harvard Men study, Benbow on precocious math talent). There are also anecdata of the exceptionally ambitious. E.g. Musk had first child age ~30 and has many kids, Hassabis had first child aged ~29. It seems Jaan Tallinn had kids starting in his 20s before founding Skype (Wikipedia). Bezos has 4 kids (started age 37). Gates has 3 kids (started age ~40). Turing award-winners David Patterson and Judea Pearl had kids in their 20s before their biggest contributions. Yoshua Bengio in his 30s. etc
 

Comment by Owain_Evans on Second Citizenships, Residencies, and/or Temporary Relocation · 2021-02-18T12:59:11.301Z · LW · GW

I don't know of instances. But I'm also interested to know if people have good sources on this. 

My understanding is that people entering the UK by air (e.g. from the US) now enter via ePassport gates and so don't need to talk to a border/immigration official. This might make it easier to enter than before. At the same time, I would be wary (based on what little I know) of entering without a clear explanation and evidence you are not working in the UK (e.g. epic holiday in UK, clear family reasons). 

Comment by Owain_Evans on Second Citizenships, Residencies, and/or Temporary Relocation · 2021-02-17T08:24:18.772Z · LW · GW

I believe you can spend 6 months in the UK visa free and there's no rule against more than 6 months out of the year. My understanding is that visitors will be vaccinated and treated for Covid by the NHS -- you may need to pay some modest fee. 

https://en.wikipedia.org/wiki/Visa_policy_of_the_United_Kingdom

https://www.freemovement.org.uk/there-is-no-180-day-rule-for-visitors-to-the-uk/

Comment by Owain_Evans on Chinese History · 2021-02-16T09:24:57.557Z · LW · GW

China was somewhat unified and had a big chunk of the world's population and was more likely to record population levels -- though I'd guess there are huge error bars around the Three Kingdoms War and An Lushan Rebellion. If you control for political unity and population, were Chinese death rates in armed conflict higher than other regions?
 

The three historical figures I can think of who built giant institutions lasting thousands of years

Why draw the cutoff at thousands of years? And I'd guess recent institution building is much more relevant to EAs than ancient.

China does capitalism well without conflating capitalism with democracy

There were already the examples of Taiwan, South Korea, Hong Kong and Singapore. (One could also consider European and South American states that were right-wing dictatorships). 
 

Nor does China entangle religion with politics to the same extent you find in the Christian and Islamic worlds.

Christian worlds? Secularism has been important in France since the French Revolution. What about India or Japan? What about Hellenistic culture or Rome?
 

Robust models of a region usually depend on knowing the region's history.

The question is how much "memory" or "persistence" the time series has. History is mostly screened off by the present and recent past. You wouldn't predict North vs South Korea by looking at Korean history for any time period up to the 1930s.

Comment by Owain_Evans on Covid cafes · 2021-02-03T13:40:40.896Z · LW · GW

Outdoor seating is not viable in much of the US during winter. Companies and individuals aren't using the microcovid methodology or the sources about risk. It's hard to trace an infection to spending 20 mins in a cafe (vs. catching it from friends or family).

Comment by Owain_Evans on Who should you expect to spend your life with? · 2021-01-25T11:34:06.530Z · LW · GW

Some graphs from the UK:
 

Comment by Owain_Evans on Who should you expect to spend your life with? · 2021-01-23T20:06:46.807Z · LW · GW

I can't see the graph. I'd also love to know the variability across people and demographics. 

Comment by Owain_Evans on Covid 12/24: We’re F***ed, It’s Over · 2020-12-25T16:44:15.169Z · LW · GW

There was a huge number of cases before September around the world. Why didn't we see the new, more transmissible variants earlier? (One source could be cross-over from some animals; another is the rare cases of extremely long-lasting Covid infection. Curious if people are doing Bayesian calculations for this.)

Comment by Owain_Evans on Covid 12/24: We’re F***ed, It’s Over · 2020-12-25T10:40:59.071Z · LW · GW

Other sources of evidence (albeit weaker): the nature of the mutations (some of which have been studied prior to emergence of the new strain), the related evidence from South Africa. 

Comment by Owain_Evans on Covid 12/24: We’re F***ed, It’s Over · 2020-12-25T10:32:00.434Z · LW · GW

I stand by my claim. We know the effects 10 months out. If some studies have convinced you otherwise, it would be useful to cite the evidence (maybe in a separate post). 

Comment by Owain_Evans on Covid 12/24: We’re F***ed, It’s Over · 2020-12-24T20:17:35.854Z · LW · GW

What can countries/states do? Impose hard lockdowns, focus test/trace/isolate resources on the new strain, stop travel, get people wearing N95s, create extra hospitals, vaccinate (using less effective vaccines as well as Pfizer/Moderna), run challenge trials to see how vaccines protect against new strain and against transmission, and ... hope for the best. One source of uncertainty is how much news of a complete collapse of hospitals in some region will impact behavior in regions that haven't collapsed yet. (I fear a "boy who cried wolf" scenario, where people think, "We never needed those temporary hospitals last time"). 

What can individuals do? If the new strain is not more severe, then the risk for young and healthy people remains low. Presumably staying at home and receiving deliveries still has very low risk of infection. People who might need hospital care for non-Covid reasons should make plans. (If health care collapses, how much bigger is the risk from Covid for young people? You'll probably get priority but standard of care will drop substantially.) 

EDIT: Added some important points about vaccination I left out. 

Comment by Owain_Evans on Covid 12/24: We’re F***ed, It’s Over · 2020-12-24T20:02:41.361Z · LW · GW

It should be possible to make rough estimates of the chance that the UK strain has reached country X by looking at the spread within the UK (where there's some coverage) and extrapolating based on the volume of travel within the UK and between the UK and country X. If the UK data is too sparse now, it should be possible to do this in a week or two.
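
As a rough sketch of the kind of extrapolation I mean (my own toy model, not an established method): treat each traveler from the UK to country X as an independent draw who carries the new strain with probability equal to its prevalence among travelers. The function name and the inputs below are hypothetical placeholders, not real data.

```python
def prob_strain_arrived(prevalence: float, travelers: int) -> float:
    """Chance that at least one of `travelers` arrivals carries the strain.

    Assumes (strongly) that travelers are independent and that each one is
    infected with probability `prevalence`. Both inputs are placeholders
    that would have to be estimated from UK case and travel data.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if travelers < 0:
        raise ValueError("travelers must be non-negative")
    # P(at least one carrier) = 1 - P(no traveler is a carrier)
    return 1.0 - (1.0 - prevalence) ** travelers


if __name__ == "__main__":
    # Illustrative inputs only: 1 in 2,000 travelers infected, 5,000 arrivals.
    print(prob_strain_arrived(prevalence=1 / 2000, travelers=5000))
```

This ignores testing at the border, clustering of cases among travelers, and indirect routes via third countries, so it is at best a first-pass estimate.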

Comment by Owain_Evans on A guide to Iterated Amplification & Debate · 2020-11-20T11:56:51.335Z · LW · GW

More information on Factored Cognition: the term was introduced by Ought and Ought has done a series of explainers and experiments on it. Ought also wrote a brief introduction to IDA, with a view to ML experiments.  

Comment by Owain_Evans on What are Examples of Great Distillers? · 2020-11-13T18:46:53.424Z · LW · GW

David MacKay: Sustainable Energy – without the hot air
David MacKay: Information Theory textbook
Steven Pinker: How the Mind Works, The Stuff of Thought (Cognitive science, linguistics, philosophy of language)
 

Comment by Owain_Evans on Covid Covid Covid Covid Covid 10/29: All We Ever Talk About · 2020-10-30T19:48:28.498Z · LW · GW

It's not a news source, but I find the Google and Apple Mobility data for Europe to be a useful measure of "how people are actually behaving on the ground". If people are going to retail/recreation locations (rather than ordering online), they are probably not taking the pandemic that seriously. Much of Europe eased up more than the US before it had a rapid growth of cases (starting in August/September), and behavior hasn't changed much since this rapid growth began.

 

https://ourworldindata.org/grapher/change-visitors-retail-recreation?tab=chart&stackMode=absolute&time=earliest..latest&country=FRA~DEU~ITA~GBR~USA&region=World

 

https://covid19.apple.com/mobility

Comment by Owain_Evans on What are some beautiful, rationalist artworks? · 2020-10-17T11:10:31.374Z · LW · GW

Tianjin Binhai Library

Comment by Owain_Evans on What are some beautiful, rationalist artworks? · 2020-10-17T11:01:22.540Z · LW · GW

The Promenades of Euclid, René Magritte