Posts

How important are model sizes to your timeline predictions? 2019-09-05T17:34:14.742Z · score: 11 (6 votes)
What are some good examples of gaming that is hard to detect? 2019-05-16T16:10:38.333Z · score: 5 (2 votes)
Any rebuttals of Christiano and AI Impacts on takeoff speeds? 2019-04-21T20:39:51.076Z · score: 47 (17 votes)
Some intuition on why consciousness seems subjective 2018-07-27T22:37:44.587Z · score: 19 (10 votes)
Updating towards the simulation hypothesis because you think about AI 2016-03-05T22:23:49.424Z · score: 9 (13 votes)
Working at MIRI: An interview with Malo Bourgon 2015-11-01T12:54:58.841Z · score: 8 (9 votes)
Meetup : 'The Most Good Good You Can Do' (Effective Altruism meetup) 2015-05-14T18:32:18.446Z · score: 1 (2 votes)
Meetup : Utrecht- Brainstorm and ethics discussion at the Film Café 2014-05-19T20:49:07.529Z · score: 1 (2 votes)
Meetup : Utrecht - Social discussion at the Film Café 2014-05-12T13:10:07.746Z · score: 1 (2 votes)
Meetup : Utrecht 2014-04-20T10:14:21.859Z · score: 1 (2 votes)
Meetup : Utrecht: Behavioural economics, game theory... 2014-04-07T13:54:49.079Z · score: 2 (3 votes)
Meetup : Utrecht: More on effective altruism 2014-03-27T00:40:37.720Z · score: 1 (2 votes)
Meetup : Utrecht: Famine, Affluence and Morality 2014-03-16T19:56:44.267Z · score: 0 (1 votes)
Meetup : Utrecht: Effective Altruism 2014-03-03T19:55:11.665Z · score: 3 (4 votes)

Comments

Comment by soerenmind on AlphaStar: Impressive for RL progress, not for AGI progress · 2019-11-10T08:28:51.629Z · score: 4 (2 votes) · LW · GW

Why just a 10x speedup over model-free RL? I would've expected much more.

Comment by soerenmind on [AN #72]: Alignment, robustness, methodology, and system building as research priorities for AI safety · 2019-11-07T16:01:19.229Z · score: 1 (1 votes) · LW · GW

Should I share the Alignment Research Overview in its current Google Doc form or is it about to be published somewhere more official?

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-13T01:55:36.060Z · score: 1 (1 votes) · LW · GW

Yes. To the extent that the system in question is an agent, I'd roughly think of many copies of it as a single distributed agent.

Comment by soerenmind on Towards an empirical investigation of inner alignment · 2019-10-12T10:33:20.282Z · score: 1 (1 votes) · LW · GW

Hmmm, my worry isn't so much that we have an unusual definition of inner alignment. It's more the opposite: that outsiders associate this line of research with quackery (which only gets worse if our definition is close to the standard one).

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-12T10:25:46.635Z · score: 1 (1 votes) · LW · GW

Re whether ML is easy to deploy: most compute these days goes into deployment. And there are a lot of other deployment challenges that you don't have during training, where you train a single model under lab conditions.

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-12T10:24:52.181Z · score: 1 (1 votes) · LW · GW

Fair - I'd probably count "making lots of copies of a trained system" as a single system here.

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-12T10:17:31.278Z · score: 1 (1 votes) · LW · GW

Yes - the part that I was doubting is that it provides evidence for relatively quick takeoff.

Comment by soerenmind on How important are model sizes to your timeline predictions? · 2019-10-11T01:25:43.349Z · score: 3 (2 votes) · LW · GW

For the record, two people who I consider authorities on this told me some version of "model sizes matter a lot".

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-11T01:14:33.463Z · score: 3 (1 votes) · LW · GW

"Continuous" vs "gradual":

I’ve also seen people internally use the word gradual, and I prefer it to continuous because 1) in maths, a discontinuity can be an insignificant jump, and 2) a fast takeoff is about fast changes in the growth rate, whereas continuity is about jumps in the function value (you can have either without the other). I don’t see a natural way to say "non-gradual" or "a non-graduality" though, which is why I do often say discontinuous instead.
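A toy pair of trajectories (purely illustrative functional forms) makes the distinction concrete:

$$f(t) = \frac{1}{T - t} \quad \text{(continuous on } [0, T)\text{, yet its growth rate } f'/f = \tfrac{1}{T-t} \text{ diverges: fast but continuous)}$$

$$g(t) = t + 0.01 \cdot \mathbf{1}[t \ge T] \quad \text{(a discontinuity at } T\text{, but the jump is insignificant: discontinuous yet gradual)}$$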

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-11T01:12:20.102Z · score: 8 (4 votes) · LW · GW
Still, the general strategy of "dealing with things as they come up" is much more viable under continuous takeoff.

Agreed. This is why I'd like to see MIRI folks argue more for their views on takeoff speeds. If they’re right, more researchers should want to work for them.

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-11T01:11:40.319Z · score: 1 (1 votes) · LW · GW

I agree that large power differentials are possible between countries, as in the industrial revolution. I think it’s worth distinguishing concentration of power among countries from concentration among AI systems. I.e.

1) each country has at most a couple of AI systems and one country has significantly better ones or

2) each country’s economy uses many systems with a range of abilities and one country has significantly better ones on average.

In 2), the countries likely want to trade and negotiate (in addition to potentially conquering each other). Systems within conquered countries still have influence. That seems more like what happened in the industrial revolution. I feel like people sometimes argue for concentration of power among AI systems by saying that we’ve seen concentration among countries or companies. But those seem pretty different. (I’m not saying that’s what you’re arguing).

Comment by soerenmind on Misconceptions about continuous takeoff · 2019-10-11T01:09:03.634Z · score: 1 (1 votes) · LW · GW

I don’t see the GAN example as evidence for continuous-but-quick takeoff.

When a metric suddenly becomes a target, fast progress can follow. But we already target the most important metrics (e.g. general intelligence). Face generation became a target in 2014 - and the number of papers on GANs quickly grew from a few to thousands per year. Compute budgets also jumped. There was low-hanging fruit for face generation that people previously did not care about. I.e. we could have generated far better faces in 2014 than the one in the example if we had cared about the problem for some time.

Comment by soerenmind on Towards an empirical investigation of inner alignment · 2019-09-26T17:24:36.386Z · score: 1 (1 votes) · LW · GW

Just registering that I'd like to see a less loaded term than "inner alignment" being adopted.

Don't want to be confused with these people: "Inner alignment means becoming more of your true self. Your inner being is your true self, and by developing your inner potential, you express more and more of your true self."

Comment by soerenmind on [AN #63] How architecture search, meta learning, and environment design could lead to general intelligence · 2019-09-18T17:28:57.872Z · score: 1 (1 votes) · LW · GW

Would be cool to hear at some point :)

Comment by soerenmind on [AN #63] How architecture search, meta learning, and environment design could lead to general intelligence · 2019-09-11T00:09:15.814Z · score: 5 (3 votes) · LW · GW
I think that the complexity of the real world was quite crucial, and that simulating environments that reach the appropriate level of complexity will be a very difficult task.

Paul made some arguments that contradict this on the 80k podcast:


Almost all the actual complexity comes from other organisms, so that’s sort of something you get for free if you’re spending all this compute running evolution cause you get to have the agent you’re actually producing interact with itself.
I guess, other than that, you have this physical environment, which is very rich. Quantum field theory is very computationally complicated if you want to actually simulate the behavior of materials, but, it’s not an environment that’s optimized in ways that really pull out … human intelligence is not sensitive to the details of the way that materials break. If you just substitute in, if you take like, “Well, materials break when you apply stress,” and you just throw in some random complicated dynamics concerning how materials break, that’s about as good, it seems, as the dynamics from actual chemistry until you get to a point where humans are starting to build technology that depends on those properties. And, by that point, the game is already over.
Comment by soerenmind on Buck's Shortform · 2019-08-20T20:25:23.451Z · score: 10 (4 votes) · LW · GW

Hired an econ tutor based on this.

Comment by soerenmind on [AN #60] A new AI challenge: Minecraft agents that assist human players in creative mode · 2019-07-28T20:55:39.350Z · score: 3 (2 votes) · LW · GW

Yep, my comment was about the linear scale-up rather than its implications for social learning.

Comment by soerenmind on [AN #60] A new AI challenge: Minecraft agents that assist human players in creative mode · 2019-07-28T17:41:06.670Z · score: 9 (3 votes) · LW · GW

Costs don't really grow linearly with model size because utilization goes down as you spread a model across many GPUs. I.e. aggregate memory requirements grow superlinearly. Relatedly, model sizes increased <100x while compute increased 300,000x on OpenAI's data set. That's been updating my views a bit recently.

People are trying to solve this with things like GPipe, but I don't know yet if there can be an approach that scales to many more TPUs than what they tried (8). Communication would be the next bottleneck.

https://ai.googleblog.com/2019/03/introducing-gpipe-open-source-library.html?m=1

Comment by soerenmind on [AN #60] A new AI challenge: Minecraft agents that assist human players in creative mode · 2019-07-28T17:37:35.653Z · score: 1 (1 votes) · LW · GW

Edit: double commented

Comment by soerenmind on Conditions for Mesa-Optimization · 2019-07-28T10:33:53.923Z · score: 2 (2 votes) · LW · GW

The concept of pre-training and fine-tuning in ML seems closely related to mesa-optimization. You pre-train a model on a general distribution so that it can quickly learn from little data on a specific one.

However, as the number of tasks you want to do (N) increases, there seems to be the opposite effect to what your (very neat) model in section 2.1 describes: you get higher returns for meta-optimization, so you'll want to spend relatively more on it. I think the model's assumptions are violated here because the tasks don't require completely distinct policies. E.g. GPT-2 does very well across tasks with the exact same prediction policy. I'm not completely sure about this point, but it seems fruitful to explore the analogy to pre-training, which is widely used.
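A minimal sketch of the pre-train / fine-tune pattern I have in mind (toy regression tasks; the shared structure across tasks is an assumption of the illustration, not something from the post):

```python
import torch
import torch.nn as nn

def make_task(w, n=256):
    # Toy regression task y = x @ w + noise; tasks share structure through w.
    x = torch.randn(n, 8)
    return x, x @ w + 0.1 * torch.randn(n)

w_base = torch.randn(8)
model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Pre-train on many tasks drawn from a general distribution around w_base.
for _ in range(500):
    x, y = make_task(w_base + 0.1 * torch.randn(8))
    opt.zero_grad()
    loss_fn(model(x).squeeze(-1), y).backward()
    opt.step()

# Fine-tune on a handful of examples from one new, specific task.
x_new, y_new = make_task(w_base + 0.1 * torch.randn(8), n=16)
for _ in range(50):
    opt.zero_grad()
    loss_fn(model(x_new).squeeze(-1), y_new).backward()
    opt.step()
```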


Comment by soerenmind on What are principled ways for penalising complexity in practice? · 2019-06-28T04:38:36.232Z · score: 15 (5 votes) · LW · GW

The exact Bayesian solution penalizes complex models as a side effect. Each model should have a prior over its parameters. The more complex model can fit the data better, so P(data | best-fit parameters, model) is higher. But the model gets penalized because P(best-fit parameters | model) is lower on the prior. Why? The prior is thinly spread over a higher dimensional parameter space so it is lower for any particular set of parameters. This is called "Bayesian Occam's razor".
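In symbols (a standard rough sketch of this "Bayesian Occam's razor", with $\hat\theta$ the best-fit parameters and $\sigma_{\text{post}}$ the posterior width per parameter):

$$P(D \mid M) = \int P(D \mid \theta, M)\, P(\theta \mid M)\, d\theta \;\approx\; \underbrace{P(D \mid \hat\theta, M)}_{\text{fit}} \times \underbrace{P(\hat\theta \mid M)\, \sigma_{\text{post}}^{\,d}}_{\text{Occam factor}}$$

With $d$ parameters, the Occam factor shrinks roughly geometrically in $d$, since the prior density $P(\hat\theta \mid M)$ is spread over a $d$-dimensional space.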

Comment by soerenmind on Risks from Learned Optimization: Introduction · 2019-06-24T19:22:51.332Z · score: 25 (7 votes) · LW · GW

This recent DeepMind paper seems to claim that they found a mesa optimizer. E.g. suppose their LSTM observes an initial state. You can let the LSTM 'think' about what to do by feeding it that state multiple times in a row. The more time it has to think, the better it acts. It has more properties like that. It's a pretty standard LSTM, so part of their point is that this is common.

https://arxiv.org/abs/1901.03559v1
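A minimal sketch of the 'extra thinking time' setup (not the paper's architecture; the class, sizes and names below are made-up illustrations):

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """Toy recurrent policy: feed the same observation several times
    ("thinking" steps) before reading out an action."""
    def __init__(self, obs_dim=16, hidden_dim=64, n_actions=4):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.lstm = nn.LSTMCell(obs_dim, hidden_dim)
        self.head = nn.Linear(hidden_dim, n_actions)

    def act(self, obs, thinking_steps=1):
        h = torch.zeros(obs.shape[0], self.hidden_dim)
        c = torch.zeros(obs.shape[0], self.hidden_dim)
        for _ in range(thinking_steps):      # repeat the same input
            h, c = self.lstm(obs, (h, c))
        return self.head(h).argmax(dim=-1)   # act from the final hidden state

policy = RecurrentPolicy()
obs = torch.randn(1, 16)
print(policy.act(obs, thinking_steps=1), policy.act(obs, thinking_steps=5))
```

The paper's claim, as I read it, is that a trained LSTM of this ordinary kind acts better the more such steps it gets, which is what makes it look like it runs an internal optimization.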

Comment by soerenmind on Risks from Learned Optimization: Introduction · 2019-06-20T19:31:34.017Z · score: 3 (1 votes) · LW · GW

Terminology: the phrase 'inner alignment' is loaded with spiritual connotations (https://www.amazon.com/Inner-Alignment-Dinesh-Senan-ebook/dp/B01CRI5UIY)

Comment by soerenmind on What is the evidence for productivity benefits of weightlifting? · 2019-06-20T18:26:59.175Z · score: 5 (3 votes) · LW · GW

"high intensity aerobic exercise provides the benefit, and resistance training, if it includes high intensity aerobic exercise, can capture that benefit."

Which part made you conclude that high intensity aerobic exercise is needed? Asking because most resistance training doesn't include it.

Great answer, thanks!

Comment by soerenmind on What would you need to be motivated to answer "hard" LW questions? · 2019-05-17T13:06:31.567Z · score: 5 (3 votes) · LW · GW

It would help if the poster directly approaches or tags me as a relevant expert.

Comment by soerenmind on What are some good examples of gaming that is hard to detect? · 2019-05-17T12:58:51.664Z · score: 1 (1 votes) · LW · GW

Thanks, updated.

Comment by soerenmind on What are some good examples of gaming that is hard to detect? · 2019-05-17T12:58:05.362Z · score: 1 (1 votes) · LW · GW

For example, an RL agent that learns a policy that looks good to humans but isn't. Adversarial examples that only fool a neural net wouldn't count.

Comment by soerenmind on What failure looks like · 2019-04-29T12:00:46.534Z · score: 3 (1 votes) · LW · GW

It'd be nice to hear a response from Paul to paragraph 1. My 2 cents:

I tend to agree that we end up with extremes eventually. You seem to say that, given somewhat aligned systems, we would immediately get to full alignment, so Paul's 1st story barely plays out.

Of course, the somewhat aligned systems may aim at the wrong thing if we try to make them solve alignment. So the most plausible way it could work is if they produce solutions that we can check. But if this were the case, human supervision would be relatively easy. That's plausible but it's a scenario I care less about.

Additionally, if we could use somewhat aligned systems to make more aligned ones, iterated amplification probably works for alignment (narrowly defined by "trying to do what we want"). The only remaining challenge would be to create one system that's somewhat smarter than us and somewhat aligned (in our case that's true by assumption). The rest follows, informally speaking, by induction as long as the AI+humans system can keep improving intelligence as alignment is improved. Which seems likely. That's also plausible but it's a big assumption and may not be the most important scenario / isn't a 'tale of doom'.

Comment by soerenmind on Any rebuttals of Christiano and AI Impacts on takeoff speeds? · 2019-04-22T11:15:55.802Z · score: 3 (2 votes) · LW · GW

AFAICT Paul's definition of slow (I prefer gradual) takeoff basically implies that local takeoff and immediate unipolar outcomes are pretty unlikely. Many people still seem to put stock in local takeoff. E.g. Scott Garrabrant. Zvi and Eliezer have said they would like to write rebuttals. So I'm surprised by the scarcity of disagreement that's written up.

Comment by soerenmind on Any rebuttals of Christiano and AI Impacts on takeoff speeds? · 2019-04-22T11:00:36.676Z · score: 5 (3 votes) · LW · GW

Thanks. IIRC the comments didn't feature much disagreement, and there was little engagement from established researchers. I didn't find much of either in other threads. I'm not sure if I should infer that little disagreement exists.

Re Paul's definition, he expects there will be years between 50% and 100% GDP growth rates. I think a lot of people here would disagree but I'm not sure.

Comment by soerenmind on How much funding and researchers were in AI, and AI Safety, in 2018? · 2019-04-21T20:19:10.758Z · score: 14 (4 votes) · LW · GW

I counted 37 researchers with safety focus plus MIRI researchers in September 2018. These researchers are mostly focused on AGI and are at least at PhD level. I also counted 38 who do safety at various levels of part-time. I can email the spreadsheet. You can also find it in 80k's safety Google group.

Comment by soerenmind on Some intuition on why consciousness seems subjective · 2018-11-21T11:06:43.601Z · score: 1 (1 votes) · LW · GW

Gotcha, I'm referring to a representation encoded in neuron activity, which is the physical process.

Comment by soerenmind on Some intuition on why consciousness seems subjective · 2018-11-06T00:40:25.909Z · score: 1 (1 votes) · LW · GW

Where else would the model be if not inside the head? Or are you saying one can 'understand' physical objects without any hint of a model?

Comment by soerenmind on Some intuition on why consciousness seems subjective · 2018-10-17T19:03:04.918Z · score: 9 (2 votes) · LW · GW

Late response:

Instantiation/representation/having a model, in my view, is not binary and is needed for any understanding. You seem to say that I don't think 'understanding' requires instantiation. My example with the bowl is meant to say that you do require a non-zero degree of instantiation - although I would call it modelling instead, because instantiation makes me think of a temporal model, and bowls can be represented as static without losing their defining features. In short, my claim is: no model, no understanding. This is an attempt to make the word 'knowledge' more precise, because it can mean many things.

I then go on to describe why you need a more high fidelity model to represent the defining features of someone's brain state evolving over some time period. The human brain is obviously incapable of that. Although an exact subatomic model of a bowl contains a lot of information too, you can abstract much more of it away without losing anything essential.

I'd also like to correct that I make no claims that science, or anything, is subjective. On the contrary, I'm claiming that subjectivity is not a fundamental concept and we can taboo it in discussions like these.

Comment by soerenmind on Agents That Learn From Human Behavior Can't Learn Human Values That Humans Haven't Learned Yet · 2018-08-28T14:24:01.722Z · score: 1 (1 votes) · LW · GW

Here's how I'd summarize my disagreement with the main claim: Alice is not acting rationally in your thought experiment if she acts like Bob (under some reasonable assumptions). In particular, she is doing pure exploitation and zero (value-)exploration by just maximizing her current weighted sum. For example, she should be reading philosophy papers.

Comment by soerenmind on Shaping economic incentives for collaborative AGI · 2018-06-30T16:19:29.604Z · score: 4 (3 votes) · LW · GW

(Btw I think you may have switched your notation from theta to x in section 5.)

Comment by soerenmind on Shaping economic incentives for collaborative AGI · 2018-06-30T16:10:14.719Z · score: 4 (3 votes) · LW · GW

Neat paper, congrats!

Comment by soerenmind on Big Advance in Infinite Ethics · 2017-12-11T03:30:00.376Z · score: 2 (1 votes) · LW · GW

Warning: I haven't read the paper so take this with a grain of salt

Here's how it would go wrong if I understand it right: For exponentially discounted MDPs there's something called an effective horizon. That means everything after that time is essentially ignored.

You pick a tiny $\epsilon > 0$. Say (without loss of generality) that all utilities lie in $[0, 1]$, and let $\gamma < 1$ be the discount factor. Then there is a time $T$ with $\gamma^T / (1 - \gamma) < \epsilon$. So the discounted cumulative utility from anything after $T$ is bounded by $\epsilon$ (which follows from the limit of the geometric series). That's an arbitrarily small constant.

We can now easily construct pairs of sequences for which LDU gives counterintuitive conclusions. E.g. a sequence which is maximally better than another for any $t > T$ until the end of time but ever so slightly worse (by $\epsilon$) for $t < T$.

So anything that happens after $T$ is essentially ignored - we've effectively made the problem finite.
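A quick numeric check of that argument, assuming utilities bounded by 1 (the gamma and epsilon values below are just examples):

```python
import math

gamma, eps = 0.99, 1e-6
# Smallest T with gamma**T / (1 - gamma) < eps:
T = math.ceil(math.log(eps * (1 - gamma)) / math.log(gamma))
tail_bound = gamma ** T / (1 - gamma)
print(T, tail_bound)  # ~1833 steps; everything afterwards contributes < 1e-6
```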

Exponential discounting in MDPs is standard practice. I'm surprised that this is presented as a big advance in infinite ethics as people have certainly thought about this in economics, machine learning and ethics before.

Btw, your meta-MDP probably falls into the category of Bayes-Adaptive MDP (BAMDP) or Bayes-Adaptive partially observable MDP (BAPOMDP) with learned rewards.

Comment by soerenmind on The Three Levels of Goodhart's Curse · 2017-10-28T15:14:19.000Z · score: 0 (0 votes) · LW · GW

(also x-posted from https://arbital.com/p/goodharts_curse/#subpage-8s5)

Another, speculative point: If $V$ and $U$ were my utility function and my friend's, my intuition is that an agent that optimizes the wrong function would act more robustly. If true, this may support the theory that Goodhart's curse for AI alignment would be to a large extent a problem of defending against adversarial examples by learning robust features similar to human ones. Namely, the robust response may be because my friend and I have learned similar robust, high-level features; we just give them different importance.

Comment by soerenmind on The Three Levels of Goodhart's Curse · 2017-10-28T15:05:23.000Z · score: 0 (0 votes) · LW · GW

(x-posted from Arbital ==> Goodhart's curse)

On "Conditions for Goodhart's curse":

It seems like with AI alignment the curse happens mostly when V is defined in terms of some high-level features of the state, which are normally not easily maximized. I.e., V is something like a neural network $V(s)$ where $s$ is the state.

Now suppose U' is a neural network which outputs the AI's estimate of these features. The AI can then manipulate the state/input to maximize these features. That's just the standard problem of adversarial examples.

So it seems like the conditions we're looking for are generally met in the common setting where adversarial examples do work to maximize some loss function. One requirement there is that the input space is high-dimensional.
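A minimal sketch of that mechanism: gradient steps on the input that push up a learned proxy's score (untrained toy network; sizes, step count and step size are illustrative assumptions):

```python
import torch
import torch.nn as nn

proxy = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 1))  # U'
x = torch.zeros(1, 100, requires_grad=True)  # high-dimensional input/state

for _ in range(100):
    score = proxy(x).sum()
    score.backward()
    with torch.no_grad():
        x += 0.1 * x.grad.sign()  # FGSM-style step up the proxy's gradient
        x.grad.zero_()

print(proxy(x).item())  # the proxy's estimate climbs; the true feature V need not
```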

So why doesn't the 2D Gaussian example go wrong? [This is about the example from Arbital ==> Goodhart's Curse where there is no bound on $U$ and $V$.] There are no high-level features to optimize by using the flexibility of the input space.

On the other hand, you don't need a flexible input space to fall prey to the winner's curse. Instead of using the high flexibility of the input space you use the 'high flexibility' of the noise if you have many data points. The noise will take any possible value with enough data, causing the winner's curse. If you care about a feature that is bounded under the real-world distribution but noise is unbounded, you will find that the most promising-looking data points are actually maximizing the noise.
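A quick simulation of that point (the distributions are just assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = rng.uniform(0, 1, size=100_000)  # bounded real-world feature
noise = rng.normal(0, 1, size=100_000)        # unbounded measurement noise
measured = true_value + noise

best = measured.argmax()
print(measured[best], true_value[best], noise[best])
# The top-scoring data point owes most of its score to noise, not to its true value.
```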

There's a noise-free (i.e. no measurement errors) variant of the winner's curse which suggests another connection to adversarial examples. If you simply have $n$ data points and pick the one that maximizes some outcome measure, you can conceptualize this as evolutionary optimization in the input space. Usually, adversarial examples are generated by following the gradient in the input space. Instead, the winner's curse uses evolutionary optimization.

Comment by soerenmind on There's No Fire Alarm for Artificial General Intelligence · 2017-10-16T17:42:27.324Z · score: 20 (8 votes) · LW · GW

People had previously given Go as an example of What You See Before The End.

Who said this? I only heard of the prediction that it'll take 10+ years, made only a few years before 2015.

Comment by soerenmind on Updating towards the simulation hypothesis because you think about AI · 2016-03-22T10:12:04.303Z · score: 0 (0 votes) · LW · GW

I guess an answer to "Given that my name is Alex, what is the probability that my name is Alex?" could be that the hypothesis is highly selected. When you're still the soul that'll be assigned to a body, looking at the world from above, this guy named Alex won't stick out because of his name. But the people who will influence the most consequential event in the history of that world will.

Comment by soerenmind on Updating towards the simulation hypothesis because you think about AI · 2016-03-06T17:31:02.366Z · score: 0 (0 votes) · LW · GW

"The core of this objection is that not only you are special, but that everybody is special"

Is your point sort of the same thing I'm saying with this? "Everyone has some things in their life that are very exceptional by pure chance. I’m sure there’s some way to deal with this in statistics but I don’t know it."

Comment by soerenmind on Rational diet: Eating insects? · 2016-03-03T21:02:01.222Z · score: 1 (1 votes) · LW · GW

Seems like a bad meme to spread for precautionary reasons.

http://reducing-suffering.org/the-importance-of-insect-suffering/

http://reducing-suffering.org/why-i-dont-support-eating-insects/

Warning: Bringing this argument at a dinner party with trendsetting, ecologically conscious consumers might cost you major idiosyncrasy credits.

Comment by soerenmind on Results of a One-Year Longitudinal Study of CFAR Alumni · 2015-12-12T21:50:47.554Z · score: 3 (3 votes) · LW · GW

"We have been conducting a peer survey of CFAR workshop participants, which involves sending surveys about the participant to 2 of their friends, both before the workshop and again approximately one year later. We are in the final stages of data collection on those surveys, and expect to begin the data analysis later this month."

Comment by soerenmind on Take the EA survey, help the EA movement grow and potentially win $250 to your favorite charity · 2015-11-30T11:39:22.402Z · score: 1 (1 votes) · LW · GW

I'd also like to see the results on LW this year!

Comment by soerenmind on Take the EA survey, help the EA movement grow and potentially win $250 to your favorite charity · 2015-11-30T11:38:48.579Z · score: 2 (2 votes) · LW · GW

How about promoting it in Main? It was promoted last year, IIRC. I think the overlap of the communities can justify this. Disclosure: I'm biased as an aspiring effective altruist.

Comment by soerenmind on Open thread, Oct. 26 - Nov. 01, 2015 · 2015-11-05T17:40:58.445Z · score: 1 (1 votes) · LW · GW

The EA Global videos will be officially released soon. You can already watch them here, but I couldn't find the xrisk video among them. I'd suggest just asking the speakers for their slides. I remember two of them were Nate Soares and Owen Cotton-Barrat.

Comment by soerenmind on Working at MIRI: An interview with Malo Bourgon · 2015-11-02T12:17:42.770Z · score: 0 (0 votes) · LW · GW

Thanks for mentioning that. For some reason the link changed from effective-altruism.com to lesswrong.com when I copy-pasted the article. Fixed!

Comment by soerenmind on Effective Altruism from XYZ perspective · 2015-07-12T23:29:14.701Z · score: 0 (0 votes) · LW · GW

I don't get your argument there. After all, you might e.g. value other EAs instrumentally because they help members of other species. That is, you intrinsically value an EA like anyone else, but you're inclined to help them more because that will translate into others being helped.