Posts

How important are model sizes to your timeline predictions? 2019-09-05T17:34:14.742Z · score: 11 (6 votes)
What are some good examples of gaming that is hard to detect? 2019-05-16T16:10:38.333Z · score: 5 (2 votes)
Any rebuttals of Christiano and AI Impacts on takeoff speeds? 2019-04-21T20:39:51.076Z · score: 47 (17 votes)
Some intuition on why consciousness seems subjective 2018-07-27T22:37:44.587Z · score: 19 (10 votes)
Updating towards the simulation hypothesis because you think about AI 2016-03-05T22:23:49.424Z · score: 9 (13 votes)
Working at MIRI: An interview with Malo Bourgon 2015-11-01T12:54:58.841Z · score: 8 (9 votes)
Meetup : 'The Most Good Good You Can Do' (Effective Altruism meetup) 2015-05-14T18:32:18.446Z · score: 1 (2 votes)
Meetup : Utrecht- Brainstorm and ethics discussion at the Film Café 2014-05-19T20:49:07.529Z · score: 1 (2 votes)
Meetup : Utrecht - Social discussion at the Film Café 2014-05-12T13:10:07.746Z · score: 1 (2 votes)
Meetup : Utrecht 2014-04-20T10:14:21.859Z · score: 1 (2 votes)
Meetup : Utrecht: Behavioural economics, game theory... 2014-04-07T13:54:49.079Z · score: 2 (3 votes)
Meetup : Utrecht: More on effective altruism 2014-03-27T00:40:37.720Z · score: 1 (2 votes)
Meetup : Utrecht: Famine, Affluence and Morality 2014-03-16T19:56:44.267Z · score: 0 (1 votes)
Meetup : Utrecht: Effective Altruism 2014-03-03T19:55:11.665Z · score: 3 (4 votes)

Comments

Comment by soerenmind on AGI in a vulnerable world · 2020-03-27T13:40:32.672Z · score: 1 (1 votes) · LW · GW

Hmm, in my model most of the x-risk is gone if there is no incentive to deploy. But I expect actors will deploy systems because their systems are aligned with a proxy, which at least yields short-term gains. Maybe the crux is that you expect these actors to suffer a large private harm (death), whereas I expect a small private harm (for each system, a marginal distributed harm to all of society)?

Comment by soerenmind on AGI in a vulnerable world · 2020-03-27T13:08:31.714Z · score: 1 (0 votes) · LW · GW

I agree that coordination between mutually aligned AIs is plausible.

I think such coordination is less likely in our example because we can probably anticipate and avoid it for human-level AGI.

I also think there are strong commercial incentives to avoid building mutually aligned AGIs. You can't sell (access to) a system if there is no reason to believe the system will help your customer. Rather, I expect systems to be fine-tuned for each task, as in the current paradigm. (The systems may successfully resist fine-tuning once they become sufficiently advanced.)

I'll also add that two copies of the same system are not necessarily mutually aligned. See for example debate and other self-play algorithms.

Comment by soerenmind on AGI in a vulnerable world · 2020-03-26T18:10:58.245Z · score: 1 (1 votes) · LW · GW

This reasoning can break if deployment turns out to be very cheap (i.e. low marginal cost compared to fixed cost); then there will be lots of copies of the most impressive system, and it matters a lot who uses them. Are they kept secret and only deployed for internal use? Or are they sold in some form? (E.g. the supplier sells access to its system so customers can fine-tune it, say for financial trading.)

Comment by soerenmind on AGI in a vulnerable world · 2020-03-26T18:05:18.591Z · score: 2 (2 votes) · LW · GW
And once there is at least one AGI running around, things will either get a lot worse or a lot better very quickly.

I don't expect the first AGI to have that much influence (assuming gradual progress). Here's an example of what fits my model: there is one giant-research-project AGI that costs $10b to deploy (and maybe $100b in R&D), 100 slightly worse pre-AGIs that cost perhaps $100m each to deploy, and 1m still slightly worse pre-AGIs that cost $10k per copy. So at any point in time we have a lot of AI systems that, together, are more powerful than the small number of most impressive systems.

Comment by soerenmind on AGI in a vulnerable world · 2020-03-26T14:16:15.388Z · score: 5 (3 votes) · LW · GW

Small teams can also get cheap access to impressive results by buying them from large teams. The large team should set a low price if it has competitors who also sell to many customers.

Comment by soerenmind on What would be the consequences of commoditizing AI? · 2020-03-21T17:28:15.129Z · score: 1 (1 votes) · LW · GW

Would be pretty interested in your ideas about how to commoditize AI.

Comment by soerenmind on Coronavirus Open Thread · 2020-03-18T02:42:52.688Z · score: 1 (1 votes) · LW · GW

Right now I expect they just used hospital admission forms. If I were self-reporting 5 pages of medical history while critically ill, I'd probably skip some fields. Interesting that they did find high rates of diabetes etc., though.

Comment by soerenmind on Coronavirus Open Thread · 2020-03-17T17:28:24.174Z · score: 4 (3 votes) · LW · GW

Data point: There were no asthma patients among a group of 140 hospitalized COVID-19 cases in Wuhan.

But nobody had other allergic diseases either. No hay fever? Seems curious.

Comment by soerenmind on Coronavirus Open Thread · 2020-03-15T15:18:06.181Z · score: 9 (4 votes) · LW · GW

1 in 13 people have asthma. How much worse off are we?

Comment by soerenmind on Credibility of the CDC on SARS-CoV-2 · 2020-03-09T14:28:39.654Z · score: 3 (2 votes) · LW · GW

Is this also wrong?

It may be possible that a person can get COVID-19 by touching a surface or object that has the virus on it and then touching their own mouth, nose, or possibly their eyes, but this is not thought to be the main way the virus spreads.

It's certainly contrary to most sources I've seen. Instead, the CDC claims it spreads "between people who are in close contact with one another (within about 6 feet)" (i.e. through droplets in the air).

https://www.cdc.gov/coronavirus/2019-ncov/about/transmission.html

Comment by soerenmind on What "Saving throws" does the world have against coronavirus? (And how plausible are they?) · 2020-03-05T10:19:06.851Z · score: 5 (3 votes) · LW · GW

1. We can slow down the spread through hand-washing, social distancing etc. for long enough to develop a vaccine (or other measures) in time.

2. A vaccine is brought to market without the usual safety testing. Apparently we already have one that works in mice (from personal communication).

3. >10% get infected but the death rate has been greatly overestimated due to sampling bias. That one seems probable to me.

4. Antivirals

Comment by soerenmind on Draft: Models of Risks of Delivery Under Coronavirus · 2020-02-28T13:23:30.909Z · score: 3 (2 votes) · LW · GW

Coronaviruses may survive a lot longer, depending on specifics.

https://www.google.com/amp/s/www.medicalnewstoday.com/amp/articles/coronaviruses-how-long-can-they-survive-on-surfaces

Comment by soerenmind on [AN #78] Formalizing power and instrumental convergence, and the end-of-year AI safety charity comparison · 2019-12-26T18:26:32.644Z · score: 3 (2 votes) · LW · GW

Sounds like we agree :)

Comment by soerenmind on [AN #78] Formalizing power and instrumental convergence, and the end-of-year AI safety charity comparison · 2019-12-26T14:09:24.866Z · score: 4 (3 votes) · LW · GW

Thanks!

On 2): Being overparameterized doesn't mean you fit all your training data. It just means that you could fit it with enough optimization. Perhaps the existence of some Savant people shows that the brain could memorize way more than it does.

On 3): The number of our synaptic weights is stupendous too - about 30,000 for every second of our lives (rough arithmetic in the sketch below).

On 4): You can underfit at the evolution level and still overparameterize at the individual level.

Overall you convinced me that underparameterization is less likely though. Especially on your definition of overparameterization, which is relevant for double descent.
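
For reference, here's the back-of-envelope behind the 30,000-per-second figure in point 3 (a sketch assuming ~1e14 synapses and a ~100-year lifespan; both are order-of-magnitude guesses, not figures from the newsletter):

```python
# Rough arithmetic only; the synapse count and lifespan are assumptions.
synapses = 1e14
seconds_of_life = 100 * 365 * 24 * 3600       # ~3.15e9 seconds
print(synapses / seconds_of_life)              # ~3e4 synaptic weights per second of life
```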

Comment by soerenmind on [AN #78] Formalizing power and instrumental convergence, and the end-of-year AI safety charity comparison · 2019-12-26T01:31:27.108Z · score: 3 (2 votes) · LW · GW

Why do you think that humans are, and powerful AI systems will be, severely underparameterized?

Comment by soerenmind on [AN #76]: How dataset size affects robustness, and benchmarking safe exploration by measuring constraint violations · 2019-12-07T21:45:11.706Z · score: 2 (2 votes) · LW · GW

Potential paper from DM/Stanford for a future newsletter: https://arxiv.org/pdf/1911.00459.pdf

It addresses the problem that an RL agent will delude itself by finding loopholes in a learned reward function.

Comment by soerenmind on Strategic implications of AIs' ability to coordinate at low cost, for example by merging · 2019-12-07T15:47:56.420Z · score: 2 (2 votes) · LW · GW

Also interesting to see that all of these groups were able to coordinate to the disadvantage of less coordinated groups, but not able to reach peace among themselves.

One explanation might be that the more coordinated groups also have harder coordination problems to solve because their world is bigger and more complicated. Might be the same with AI?

Comment by soerenmind on Seeking Power is Instrumentally Convergent in MDPs · 2019-12-07T15:35:22.290Z · score: 4 (3 votes) · LW · GW

If X is "number of paperclips" and Y is something arbitrary that nobody optimizes, such as the ratio of the number of bicycles on the moon to flying horses, optimizing X should be equally likely to increase or decrease Y in expectation. Otherwise "1-Y" would go in the opposite direction, which can't be true by symmetry. But if Y is something like "number of happy people", Y will probably decrease, because the world is already set up to keep Y high and a misaligned agent could disturb that state.
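
A quick toy check of the symmetry argument (my own sketch, not from the paper; X and Y are just random linear directions over a toy world state):

```python
import numpy as np

# Optimizing a random direction X should leave an arbitrary, un-optimized
# direction Y unchanged on average: the step that increases X is as likely
# to raise Y as to lower it.
rng = np.random.default_rng(0)
dim, trials = 50, 20_000

deltas_y = []
for _ in range(trials):
    x_dir = rng.normal(size=dim)              # the proxy objective X
    y_dir = rng.normal(size=dim)              # an arbitrary quantity Y
    step = x_dir / np.linalg.norm(x_dir)      # move the world state to increase X
    deltas_y.append(step @ y_dir)             # induced change in Y

print(np.mean(deltas_y))  # close to 0: no systematic effect on Y
```

Of course this only models the "arbitrary Y" case; it says nothing about Y's that the world is already optimizing, like "number of happy people".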

Comment by soerenmind on Seeking Power is Instrumentally Convergent in MDPs · 2019-12-06T12:13:23.133Z · score: 1 (1 votes) · LW · GW

I should've specified that the strong version is "Y decreases relative to a world where neither X nor Y is being optimized". Am I right that this version is not true?

Comment by soerenmind on Seeking Power is Instrumentally Convergent in MDPs · 2019-12-05T21:40:02.180Z · score: 3 (2 votes) · LW · GW

Thanks for writing this! It always felt like a blind spot to me that we only have Goodhart's law, which says "if X is a proxy for Y and you optimize X, the correlation breaks", when we really mean a stronger version: "if you optimize X, Y will actively decrease". Your paper clarifies that what we actually mean is an intermediate version: "if you optimize X, it becomes harder to optimize Y". My conclusion would be that the intermediate version is true but the strong version is false, then. Would you say that's an accurate summary?

Comment by soerenmind on [1911.08265] Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | Arxiv · 2019-11-28T19:15:09.468Z · score: 2 (2 votes) · LW · GW

Posted a little reaction to this paper here.

Comment by soerenmind on [AN #75]: Solving Atari and Go with learned game models, and thoughts from a MIRI employee · 2019-11-28T18:56:18.011Z · score: 7 (3 votes) · LW · GW

My tentative view on MuZero:

  • Cool for board games and related tasks.
  • The Atari demo seems sketchy.
  • Not a big step towards making model-based RL work - instead, a step making it more like model-free RL.

Why?

  • A textbook benefit of model-based RL is that world models (i.e. predictions of observations) generalize to new reward functions and environments. They've removed this benefit.
  • The other textbook benefit of model-based RL is data efficiency. But on Atari, MuZero is just as inefficient as model-free RL. In fact, by removing the need to predict observations, MuZero moves a lot closer to model-free methods. Plus, it trains with 40 TPUs per game where other algorithms use a single GPU with similar training time. What if they spent that extra compute to get more data?
  • In the low-data setting they outperform model-free methods. But they suspiciously didn't compare to any model-based method. They'd probably lose there because they'd need a world model for data efficiency.
  • MuZero only plans for K=5 steps ahead - far less than AlphaZero. Two takeaways: 1) This again looks more similar to model-free RL which has essentially K=1. 2) This makes me more optimistic that model-free RL can learn Go with just a moderate efficiency (and stability?) loss (Paul has speculated this. Also, the trained AlphaZero policy net is apparently still better than Lee Sedol without MCTS).

Comment by soerenmind on AlphaStar: Impressive for RL progress, not for AGI progress · 2019-11-25T20:54:44.808Z · score: 3 (2 votes) · LW · GW
the big questions are just how large a policy you would need to train using existing methods in order to be competitive with a human (my best guess would be a ~trillion to a ~quadrillion)

Curious where this estimate comes from?

Comment by soerenmind on AlphaStar: Impressive for RL progress, not for AGI progress · 2019-11-10T08:28:51.629Z · score: 7 (3 votes) · LW · GW

Why just a 10x speedup over model-free RL? I would've expected much more.

Comment by soerenmind on [AN #72]: Alignment, robustness, methodology, and system building as research priorities for AI safety · 2019-11-07T16:01:19.229Z · score: 1 (1 votes) · LW · GW

Should I share the Alignment Research Overview in its current Google Doc form or is it about to be published somewhere more official?

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-13T01:55:36.060Z · score: 1 (1 votes) · LW · GW

Yes. To the extent that the system in question is an agent, I'd roughly think of many copies of it as a single distributed agent.

Comment by soerenmind on Towards an empirical investigation of inner alignment · 2019-10-12T10:33:20.282Z · score: 1 (1 votes) · LW · GW

Hmmm, my worry isn't so much that we have an unusual definition of inner alignment. It's more the opposite: that outsiders associate this line of research with quackery (which only gets worse if our definition is close to the standard one).

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-12T10:25:46.635Z · score: 1 (1 votes) · LW · GW

Re whether ML is easy to deploy: most compute these days goes into deployment. And there are a lot of other deployment challenges that you don't have during training, where you only train a single model under lab conditions.

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-12T10:24:52.181Z · score: 1 (1 votes) · LW · GW

Fair - I'd probably count "making lots of copies of a trained system" as a single system here.

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-12T10:17:31.278Z · score: 1 (1 votes) · LW · GW

Yes - the part that I was doubting is that it provides evidence for relatively quick takeoff.

Comment by soerenmind on How important are model sizes to your timeline predictions? · 2019-10-11T01:25:43.349Z · score: 3 (2 votes) · LW · GW

For the record, two people whom I consider authorities on this told me some version of "model sizes matter a lot".

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-11T01:14:33.463Z · score: 5 (2 votes) · LW · GW

"Continuous" vs "gradual":

I’ve also seen people internally use the word "gradual", and I prefer it to "continuous" because 1) in maths, a discontinuity can be an insignificant jump, and 2) a fast takeoff is about fast changes in the growth rate, whereas continuity is about jumps in the function value (you can have either without the other). I don’t see a natural way to say "non-gradual" or "a non-graduality" though, which is why I often say "discontinuous" instead.
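
A toy pair of trajectories (hypothetical functions of my own, just to illustrate why the two notions come apart):

```latex
% Continuous but fast: exponential capability growth with a tiny doubling time \tau.
f(t) = e^{t/\tau}
% Discontinuous but gradual: linear growth with a small jump \epsilon at time t_0.
g(t) = t + \epsilon \cdot \mathbf{1}\{t > t_0\}
```

f is continuous everywhere yet describes an arbitrarily fast takeoff as \tau shrinks, while g has a genuine discontinuity at t_0 but stays gradual whenever \epsilon is small.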

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-11T01:12:20.102Z · score: 9 (5 votes) · LW · GW
Still, the general strategy of "dealing with things as they come up" is much more viable under continuous takeoff.

Agreed. This is why I'd like to see MIRI folks argue more for their views on takeoff speeds. If they’re right, more researchers should want to work for them.

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-11T01:11:40.319Z · score: 1 (1 votes) · LW · GW

I agree that large power differentials are possible between countries, like in the industrial revolution. I think it’s worth distinguishing concentration of power among countries from concentration among AI systems. I.e.

1) each country has at most a couple of AI systems and one country has significantly better ones or

2) each country’s economy uses many systems with a range of abilities and one country has significantly better ones on average.

In 2), the countries likely want to trade and negotiate (in addition to potentially conquering each other). Systems within conquered countries still have influence. That seems more like what happened in the industrial revolution. I feel like people sometimes argue for concentration of power among AI systems by saying that we’ve seen concentration among countries or companies. But those seem pretty different. (I’m not saying that’s what you’re arguing).

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-11T01:09:03.634Z · score: 1 (1 votes) · LW · GW

I don’t see the GAN example as evidence for continuous-but-quick takeoff.

When a metric suddenly becomes a target, fast progress can follow. But we already target the most important metrics (e.g. general intelligence). Face generation became a target in 2014 - and the number of papers on GANs quickly grew from a few to thousands per year. Compute budgets also jumped. There was low-hanging fruit for face generation that people previously did not care about. I.e. we could have generated much better faces in 2014 than the one in the example if we had cared about it for some time.

Comment by soerenmind on Towards an empirical investigation of inner alignment · 2019-09-26T17:24:36.386Z · score: 1 (1 votes) · LW · GW

Just registering that I'd like to see a less loaded term than "inner alignment" being adopted.

Don't want to be confused with these people: "Inner alignment means becoming more of your true self. Your inner being is your true self, and by developing your inner potential, you express more and more of your true self."

Comment by soerenmind on [AN #63] How architecture search, meta learning, and environment design could lead to general intelligence · 2019-09-18T17:28:57.872Z · score: 1 (1 votes) · LW · GW

Would be cool to hear at some point :)

Comment by soerenmind on [AN #63] How architecture search, meta learning, and environment design could lead to general intelligence · 2019-09-11T00:09:15.814Z · score: 5 (3 votes) · LW · GW
I think that the complexity of the real world was quite crucial, and that simulating environments that reach the appropriate level of complexity will be a very difficult task.

Paul made some arguments that contradict this on the 80k podcast:


Almost all the actual complexity comes from other organisms, so that’s sort of something you get for free if you’re spending all this compute running evolution cause you get to have the agent you’re actually producing interact with itself.
I guess, other than that, you have this physical environment, which is very rich. Quantum field theory is very computationally complicated if you want to actually simulate the behavior of materials, but, it’s not an environment that’s optimized in ways that really pull out … human intelligence is not sensitive to the details of the way that materials break. If you just substitute in, if you take like, “Well, materials break when you apply stress,” and you just throw in some random complicated dynamics concerning how materials break, that’s about as good, it seems, as the dynamics from actual chemistry until you get to a point where humans are starting to build technology that depends on those properties. And, by that point, the game is already over.

Comment by soerenmind on Buck's Shortform · 2019-08-20T20:25:23.451Z · score: 10 (4 votes) · LW · GW

Hired an econ tutor based on this.

Comment by soerenmind on [AN #60] A new AI challenge: Minecraft agents that assist human players in creative mode · 2019-07-28T20:55:39.350Z · score: 3 (2 votes) · LW · GW

Yep, my comment was about the linear scale-up rather than its implications for social learning.

Comment by soerenmind on [AN #60] A new AI challenge: Minecraft agents that assist human players in creative mode · 2019-07-28T17:41:06.670Z · score: 9 (3 votes) · LW · GW

Costs don't really grow linearly with model size because utilization goes down as you spread a model across many GPUs. I.e. aggregate memory requirements grow superlinearly. Relatedly, model sizes increased <100x while compute increased 300,000x on OpenAI's dataset. That's been updating my views a bit recently.

People are trying to solve this with things like GPipe, but I don't know yet if there can be an approach that scales to many more TPUs than what they tried (8). Communication would be the next bottleneck.

https://ai.googleblog.com/2019/03/introducing-gpipe-open-source-library.html?m=1

Comment by soerenmind on [AN #60] A new AI challenge: Minecraft agents that assist human players in creative mode · 2019-07-28T17:37:35.653Z · score: 1 (1 votes) · LW · GW

Edit: double commented

Comment by soerenmind on Conditions for Mesa-Optimization · 2019-07-28T10:33:53.923Z · score: 2 (2 votes) · LW · GW

The concept of pre-training and fine-tuning in ML seems closely related to mesa-optimization. You pre-train a model on a general distribution so that it can quickly learn from little data on a specific one.

However, as the number of tasks you want to do (N) increases, there seems to be the opposite effect to what your (very neat) model in section 2.1 describes: you get higher returns for meta-optimization, so you'll want to spend relatively more on it. I think the model's assumptions are violated here because the tasks don't require completely distinct policies. E.g. GPT-2 does very well across tasks with the exact same prediction policy. I'm not completely sure about this point, but it seems fruitful to explore the analogy to pre-training, which is widely used.
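
As a concrete (hypothetical) picture of the pattern I mean, here is a minimal pre-train/fine-tune sketch with made-up dimensions and synthetic data; nothing in it is specific to the post's model:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

# 1) Pre-train on a large, general distribution.
general_x, general_y = torch.randn(10_000, 128), torch.randint(0, 10, (10_000,))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(5):
    loss = nn.functional.cross_entropy(model(general_x), general_y)
    opt.zero_grad(); loss.backward(); opt.step()

# 2) Fine-tune on a small task-specific dataset, reusing the learned features.
task_x, task_y = torch.randn(100, 128), torch.randint(0, 10, (100,))
opt = torch.optim.Adam(model.parameters(), lr=1e-4)   # smaller learning rate
for _ in range(20):
    loss = nn.functional.cross_entropy(model(task_x), task_y)
    opt.zero_grad(); loss.backward(); opt.step()
```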


Comment by soerenmind on What are principled ways for penalising complexity in practice? · 2019-06-28T04:38:36.232Z · score: 15 (5 votes) · LW · GW

The exact Bayesian solution penalizes complex models as a side effect. Each model should have a prior over its parameters. The more complex model can fit the data better, so P(data | best-fit parameters, model) is higher. But the model gets penalized because P(best-fit parameters | model) is lower under the prior. Why? The prior is thinly spread over a higher-dimensional parameter space, so it is lower for any particular set of parameters. This is called "Bayesian Occam's razor".
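
To spell out that side effect a bit (a standard Laplace-style sketch, assuming a d-dimensional parameter with per-dimension prior width σ_prior and posterior width σ_post):

```latex
P(D \mid M) \;=\; \int P(D \mid \theta, M)\, P(\theta \mid M)\, d\theta
\;\approx\; \underbrace{P(D \mid \hat{\theta}, M)}_{\text{best-fit likelihood}}
\;\times\; \underbrace{\left(\frac{\sigma_{\text{post}}}{\sigma_{\text{prior}}}\right)^{d}}_{\text{Occam factor}}
```

The first factor improves with model complexity, but the Occam factor shrinks exponentially in d, which is exactly the prior-based penalty described above.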

Comment by soerenmind on Risks from Learned Optimization: Introduction · 2019-06-24T19:22:51.332Z · score: 25 (7 votes) · LW · GW

This recent DeepMind paper seems to claim that they found a mesa optimizer. E.g. suppose their LSTM observes an initial state. You can let the LSTM 'think' about what to do by feeding it that state multiple times in a row. The more time it has to think, the better it acts. It has more properties like that. It's a pretty standard LSTM, so part of their point is that this is common.

https://arxiv.org/abs/1901.03559v1
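
A rough illustration of the "thinking time" probe (my own sketch in PyTorch; not the paper's actual architecture):

```python
import torch
import torch.nn as nn

# Feed the *same* initial observation to a recurrent policy for k steps and only
# read out the action after the last step. If larger k yields better actions,
# the recurrent state is doing something optimizer-like with the extra compute.
obs_dim, hidden_dim, n_actions = 16, 64, 4
lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
policy_head = nn.Linear(hidden_dim, n_actions)

def act_after_pondering(obs: torch.Tensor, k: int) -> torch.Tensor:
    repeated = obs.unsqueeze(1).repeat(1, k, 1)   # (batch, k, obs_dim): same state k times
    hidden_states, _ = lstm(repeated)
    return policy_head(hidden_states[:, -1])      # action logits after k "thinking" steps

obs = torch.randn(2, obs_dim)
print(act_after_pondering(obs, k=1).shape, act_after_pondering(obs, k=8).shape)
```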

Comment by soerenmind on Risks from Learned Optimization: Introduction · 2019-06-20T19:31:34.017Z · score: 3 (1 votes) · LW · GW

Terminology: the phrase 'inner alignment' is loaded with connotations of spiritual thought (https://www.amazon.com/Inner-Alignment-Dinesh-Senan-ebook/dp/B01CRI5UIY)

Comment by soerenmind on What is the evidence for productivity benefits of weightlifting? · 2019-06-20T18:26:59.175Z · score: 5 (3 votes) · LW · GW

"high intensity aerobic exercise provides the benefit, and resistance training, if it includes high intensity aerobic exercise, can capture that benefit."

Which part made you conclude that high intensity aerobic exercise is needed? Asking because most resistance training doesn't include it.

Great answer, thanks!

Comment by soerenmind on What would you need to be motivated to answer "hard" LW questions? · 2019-05-17T13:06:31.567Z · score: 5 (3 votes) · LW · GW

It would help if the poster directly approached or tagged me as a relevant expert.

Comment by soerenmind on What are some good examples of gaming that is hard to detect? · 2019-05-17T12:58:51.664Z · score: 1 (1 votes) · LW · GW

Thanks, updated.

Comment by soerenmind on What are some good examples of gaming that is hard to detect? · 2019-05-17T12:58:05.362Z · score: 1 (1 votes) · LW · GW

For example, an RL agent that learns a policy that looks good to humans but isn't. Adversarial examples that only fool a neural net wouldn't count.