Comment by rk on How can guesstimates work? · 2019-07-11T11:00:28.669Z · score: 5 (3 votes) · LW · GW

Most of the rituals were created by individuals that did actually understand the real reasons for why certain things had to happen

This is not part of my interpretation, so I was surprised to read this. Could you say more about why you think this? (Either why you think this is being argued for in Vaniver's / Scott's posts or why you believe it would be fine; I'm mostly interested in the arguments for this claim.)

For example, Scott writes:

How did [culture] form? Not through some smart Inuit or Fuegian person reasoning it out; if that had been it, smart European explorers should have been able to reason it out too.

And quotes (either from Scholar's Stage or The Secret of Our Success):

It’s possible that, with the introduction of rice, a few farmers began to use bird sightings as an indication of favorable garden sites. On-average, over a lifetime, these farmers would do better – be more successful – than farmers who relied on the Gambler’s Fallacy or on copying others’ immediate behavior.

Which I don't read as the few farmers knowing why they should use bird sightings.

Or this quote from Xunzi in Vaniver's post:

One performs divination and only then decides on important affairs. But this is not to be regarded as bringing one what one seeks, but rather is done to give things proper form.

Which doesn't sound like Xunzi understanding the specific importance of a given divination (I realise Xunzi is not the originator of the divinatory practices).

Comment by rk on Eli's shortform feed · 2019-06-04T14:49:07.731Z · score: 1 (1 votes) · LW · GW

This link (and the one for "Why do we fear the twinge of starting?") is broken (I think it's an admin view?).

Comment by rk on AI development incentive gradients are not uniformly terrible · 2019-03-14T13:04:21.358Z · score: 4 (3 votes) · LW · GW

Yes, you're quite right!

The intuition becomes a little clearer when I take the following alternative derivation:

Let us look at the change in expected value when I increase my capabilities. From the expected value stemming from worlds where I win, we have . For the other actor, their probability of winning decreases at a rate that matches my increase in probability of winning. Also, their probability of deploying a safe AI doesn't change. So the change in expected value stemming from worlds where they win is .

We should be indifferent to increasing capabilities when these sum to 0, so .

Let's choose our units so . Then, using the expressions for from your comment, we have .

Dividing through by we get . Collecting like terms we have and thus . Substituting for we have and thus

Comment by rk on When does introspection avoid the pitfalls of rumination? · 2019-02-20T21:28:29.446Z · score: 3 (2 votes) · LW · GW

It seems like keeping a part 'outside' the experience/feeling is a big part for you. Does that sound right? (Similar to the unblending Kaj talks about in his IFS post or clearing a space in Focusing)

Now of course today's structure/process is tomorrow's content

Do you mean here that as you progress, you will introspect on the nature of your previous introspections, rather than more 'object-level' thoughts and feelings?

Comment by rk on When does introspection avoid the pitfalls of rumination? · 2019-02-20T21:26:16.581Z · score: 1 (1 votes) · LW · GW

I think that though one may use the techniques looking for a solution (which I agree makes them solution-oriented in a sense), it's not right to say that in, say, Focusing, you introspect on solutions rather than causes. So maybe the difference is more the optimism than the area of focus?

Comment by rk on When does introspection avoid the pitfalls of rumination? · 2019-02-20T21:22:44.639Z · score: 2 (2 votes) · LW · GW

This points to a lot of what the difference feels like to me! It jibes with my intuition for the situation that prompted this question.

I was mildly anxious about something (I forget what), and stopped myself as I was about to move on to some work (in which I would have lost the anxiety). I thought it might be useful to be with the anxiety a bit and see what was so anxious about the situation. This felt like it would be useful, but then I wondered if I would get bad ruminative effects. It seemed like I wouldn't, but I wasn't sure why.

I'm not sure if I should be given pause by the fact you say that rumination is concerned with action; my reading of the Wikipedia page is that being concerned with action is a big missing feature of rumination.

Comment by rk on Building up to an Internal Family Systems model · 2019-02-20T14:55:19.261Z · score: 5 (3 votes) · LW · GW

I came back to this post because I was thinking about Scott's criticism of subminds where he complains about "little people who make you drink beer because they like beer".

I'd already been considering how your robot model is nice for seeing why something submind-y would be going on. However, I was still confused about thinking about these various systems as basically people who have feelings and should be negotiated with, using basically the same techniques I'd use to negotiate with people.

Revisiting, the "Personalized characters" section was pretty useful. It's nice to see it more as a claim that '[sometimes for some people] internal processes may be represented using social machinery' than 'internal agents are like fighting people'.

## When does introspection avoid the pitfalls of rumination?

2019-02-20T14:14:46.798Z · score: 23 (10 votes)

Comment by rk on The Case for a Bigger Audience · 2019-02-15T18:39:57.838Z · score: 3 (2 votes) · LW · GW

Not Ben, but I have used X Goodhart more than 20 times (summing over all the Xs)

Comment by rk on Short story: An AGI's Repugnant Physics Experiment · 2019-02-14T19:36:14.570Z · score: 10 (5 votes) · LW · GW

Section of an interesting talk relating to this by Anna Salamon. It makes the point that if an AI's ability to improve its model of fundamental physics is not linear in the amount of the Universe it controls, such an AI would be at least somewhat risk-averse (with respect to gambles that give it different proportions of our Universe).

Comment by rk on Building up to an Internal Family Systems model · 2019-01-26T14:53:00.411Z · score: 5 (3 votes) · LW · GW

I really enjoyed this post and starting with the plausible robot design was really helpful for me accessing the IFS model. I also enjoyed reflecting on your previous objections as a structure for the second part.

The part with repeated unblending sounds reminiscent of the "Clearing a space" stage of Focusing, in which one acknowledges and sets slightly to the side the problems in one's life. Importantly, you don't "go inside" the problems (I take 'going inside' to be more-or-less experiencing the affect associated with the problems). This seems pretty similar to stopping various protectors from placing negative affect into consciousness.

I noticed something at the end that it might be useful to reflect on: I pattern matched the importance of childhood traumas to woo and it definitely decreased my subjective credence in the IFS model. I'm not sure to what extent I endorse that reaction.

One thing I'd be interested in expansion on: you mention you think that IFS would benefit most people. What do you mean by 'benefit' in this case? That it would increase their wellbeing? Their personal efficacy? Or perhaps that it will increase at least one of their wellbeing and personal efficacy but not necessarily both for any given person?

Comment by rk on AI development incentive gradients are not uniformly terrible · 2019-01-23T20:54:37.835Z · score: 3 (2 votes) · LW · GW

I think this is a great summary (EDIT: this should read "I think the summary in the newsletter was great").

That said, these models are still very simplistic, and I mainly try to derive qualitative conclusions from them that my intuition agrees with in hindsight.

Yes, I agree. The best indicator I had of making a mathematical mistake was whether my intuition agreed in hindsight

Comment by rk on Does anti-malaria charity destroy the local anti-malaria industry? · 2019-01-07T23:19:03.789Z · score: 1 (1 votes) · LW · GW

Thanks! The info on parasite specificity/history of malaria is really useful.

I wonder if you know of anything specifically about the relative cost-effectiveness of nets for infected people vs uninfected people? No worries if not

Comment by rk on Does anti-malaria charity destroy the local anti-malaria industry? · 2019-01-07T14:51:30.963Z · score: 3 (2 votes) · LW · GW

Probably the most valuable nets are those deployed on people who already have malaria, to prevent it from spreading to mosquitoes, and thus to more people

I hadn't thought about this! I'd be interested in learning more about this. Do you have a suggested place to start reading or more search term suggestions (on top of Ewald)?

Also, can animals harbour malaria pathogens that harm humans? This section of the wiki page on malaria makes me think not, but it's not explicitly stated

Comment by rk on Anthropic paradoxes transposed into Anthropic Decision Theory · 2018-12-21T13:04:56.476Z · score: 1 (1 votes) · LW · GW

your decision theory maps from decisions to situations

Could you say a little more about what a situation is? One thing I thought is maybe that a situation is a result of a choice? But then it sounds like your decision theory decides whether you should, for example, take an offered piece of chocolate, regardless of whether you like chocolate or not. So I guess that's not it

But the point is that each theory should be capable of standing on its own

Can you say a little more about how ADT doesn't stand on its own? After all, ADT is just defined as:

An ADT agent is an agent that would implement a self-confirming linking with any agent that would do the same. It would then maximises its expected utility, conditional on that linking, and using the standard non-anthropic probabilities of the various worlds.

Is the problem that it mentions expected utility, but it should be agnostic over values not expressible as utilities?

Comment by rk on Anthropic paradoxes transposed into Anthropic Decision Theory · 2018-12-21T11:10:36.741Z · score: 1 (1 votes) · LW · GW

So I think an account of anthropics that says "give me your values/morality and I'll tell you what to do" is not an account of morality + anthropics, but has actually pulled out morality from an account of anthropics that shouldn't have had it. (Schematically, rather than define adt(decisionProblem) = chooseBest(someValues, decisionProblem), you now define adt(values, decisionProblem) = chooseBest(values, decisionProblem).)

Perhaps you think that an account that makes mention of morality ends up being (partly) a theory of morality? And that also we should be able to understand anthropic situations apart from values?

To try and give some intuition for my way of thinking about things, suppose I flip a fair coin and ask agent A if it came up heads. If it guesses heads and is correct, it gets $100. If it guesses tails and is correct, both agents B and C get $100. Agents B and C are not derived from A in any special way and will not be offered similar problems -- there is not supposed to be anything anthropic here.

What should agent A do? Well that depends on A's values! This is going to be true for a non-anthropic decision theory so I don't see why we should expect an anthropic decision theory to be free of this dependency.
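
The coin example above can be sketched in code to show how the same decision problem gives different answers under different values. This is only an illustration: `choose_best`, `selfish` and `total_welfare` are hypothetical names, "values" is modelled as a function that scores an outcome, and a wrong guess is assumed to pay nobody.

```python
from typing import Callable, Dict

# An outcome maps each agent's name to the dollars it receives.
Outcome = Dict[str, float]
# Hypothetical stand-in for "values": a function that scores an outcome.
Values = Callable[[Outcome], float]

P_HEADS = 0.5  # the coin is fair

# Payouts when A's guess is correct (taken from the example).
PAYOUT_IF_CORRECT: Dict[str, Outcome] = {
    "heads": {"A": 100.0, "B": 0.0, "C": 0.0},
    "tails": {"A": 0.0, "B": 100.0, "C": 100.0},
}
# Assumption: an incorrect guess pays nobody.
NOTHING: Outcome = {"A": 0.0, "B": 0.0, "C": 0.0}


def expected_value(values: Values, guess: str) -> float:
    """Expected score of a guess, under the given values."""
    p_correct = P_HEADS if guess == "heads" else 1.0 - P_HEADS
    return (p_correct * values(PAYOUT_IF_CORRECT[guess])
            + (1.0 - p_correct) * values(NOTHING))


def choose_best(values: Values) -> str:
    """The decision procedure takes the values as an input."""
    return max(["heads", "tails"], key=lambda g: expected_value(values, g))


def selfish(outcome: Outcome) -> float:
    """A only cares about its own payout."""
    return outcome["A"]


def total_welfare(outcome: Outcome) -> float:
    """A cares equally about every agent's payout."""
    return sum(outcome.values())


print(choose_best(selfish), choose_best(total_welfare))
```

The decision procedure is the same in both calls; only the `values` argument changes, and the recommended guess changes with it.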

Here's another guess at something you might think: "anthropics is about probabilities. It's cute that you can parcel up value-laden decisions and anthropics, but it's not about decisions."

Maybe that's the right take. But even if so, ADT is useful! It says that in several anthropic situations, even if you've not sorted your anthropic probabilities out, you can still know what to do.

Comment by rk on Anthropic paradoxes transposed into Anthropic Decision Theory · 2018-12-21T00:07:26.926Z · score: 1 (1 votes) · LW · GW

It seems to me that ADT separates anthropics and morality. For example, Bayesianism doesn't tell you what you should do, just how to update your beliefs. Given your beliefs, what you value decides what you should do. Similarly, ADT gives you an anthropic decision procedure. What exactly does it tell you to do? Well, that depends on your morality!

Comment by rk on Overconfident talking down, humble or hostile talking up · 2018-12-01T13:35:05.801Z · score: 6 (2 votes) · LW · GW

As I read through, the core model fit well with my intuition. But then I was surprised when I got to the section on religious schisms! I wondered why we should model the adherents of a religion as trying to join the school with the most 'accurate' claims about the religion.

On reflection, it appears to me that the model probably holds roughly as well in the religion case as the local radio intellectual case. Both of those are examples of "hostile" talking up. I wonder if the ways in which those cases diverge from pure information sharing explains the difference between humble and hostile.

In particular, perhaps some audiences are looking to reduce cognitive dissonance between their self-image as unbiased on the one hand and their particular beliefs and preferences on the other. That leaves an opening for someone to sell reasonableness/unbiasedness self-image to people holding a given set of beliefs and preferences.

Someone making reasonable counterarguments is a threat to what you've offered, and in that case your job is to provide refutation, counterargument and discredit so it is easy for that person's arguments to be dismissed (through a mixture of claimed flaws in their arguments and claimed flaws in the person promoting them). This would be a 'hostile' talking up.

Also, we should probably expect to find it hard to distinguish between some hostile talking ups and overconfident talking downs. If we could always distinguish, hostile talking up would be a clear signal of defeat.

Comment by rk on On MIRI's new research directions · 2018-11-23T17:40:34.173Z · score: 3 (2 votes) · LW · GW

When it comes to disclosure policies, if I'm uncertain between the "MIRI view" and the "Paul Christiano" view, should I bite the bullet and back one approach over the other? Or can I aim to support both views, without worrying that they're defeating each other?

My current understanding is that it's coherent to support both at once. That is, I can think that possibly intelligence needs lots of fundamental insights, and that safety needs lots of similar insights (this is supposed to be a characterisation of a MIRI-ish view). I can think that work done on figuring out more about intelligence and how to control it should only be shared cautiously, because it may accelerate the creation of AGI.

I can also think that prosaic AGI is possible, and fundamental insights aren't needed. Then I might think that I could do research that would help align prosaic AGIs but couldn't possibly align (or contribute to) an agent-based AGI.

Is the above consistent? Also do people (with better emulators of people) who worry about disclosure think that this makes sense from their point of view?

## Believing others' priors

2018-11-22T20:44:15.303Z · score: 9 (4 votes)

Comment by rk on Robust Delegation · 2018-11-20T17:37:31.275Z · score: 3 (2 votes) · LW · GW

I think you've got a lot of the core idea. But it's not important that we know that the data point has some ranking within a distribution. Let me try and explain the ideas as I understand them.

The unbiased estimator is unbiased in the sense that for any actual value of the thing being estimated, the expected value of the estimation across the possible data is the true value.

To be concrete, suppose I tell you that I will generate a true value, and then add either +1 or -1 to it with equal probability. An unbiased estimator is just to report back the value you get:

E[estimate(x)] = estimate(x + 1)/2 + estimate(x - 1)/2

If the estimate function is the identity, we have (x + 1 + x - 1)/2 = x. So it's unbiased.
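
As a quick check of the expectation above, here is a minimal sketch. The function names are made up for illustration, and the noise is exactly the +1 or -1 coin flip described.

```python
from typing import Callable


def expected_estimate(estimate: Callable[[float], float], true_value: float) -> float:
    """E[estimate(reported)] when reported is true_value + 1 or true_value - 1,
    each with probability 1/2."""
    return 0.5 * estimate(true_value + 1) + 0.5 * estimate(true_value - 1)


def identity(reported: float) -> float:
    """Report back exactly the value you were given."""
    return reported


def halved(reported: float) -> float:
    """A deliberately biased estimator, included for contrast."""
    return reported / 2


for x in [-3.0, 0.0, 7.0, 23000.0]:
    # The identity estimator recovers x in expectation for every true value x.
    print(x, expected_estimate(identity, x), expected_estimate(halved, x))
```

Unbiasedness here is a statement about every fixed true value x; it says nothing about how the true values themselves are distributed.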

Now suppose I tell you that I will generate the true value by drawing from a normal distribution with mean 0 and variance 1, and then I tell you 23,000 as the reported value. Via Bayes, you can see that it is more likely that the true value is 22,999 than 23,001. But the unbiased estimator blithely reports 23,000.

So, though the asymmetry is doing some work here (the further we move above 0, the more likely that +1 rather than -1 is doing some of the work), it could still be that 23,000 is the smallest of the values I sampled.
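
The Bayes step above (22,999 versus 23,001) can be checked numerically. This is a sketch under the stated setup: a standard normal prior on the true value, and a report equal to the true value plus or minus 1 with equal probability. It works in log space because the densities involved underflow ordinary floats; the function names are illustrative.

```python
import math


def log_normal_pdf(t: float, mean: float = 0.0, var: float = 1.0) -> float:
    """Log density of a normal distribution, used to avoid underflow."""
    return -0.5 * math.log(2 * math.pi * var) - (t - mean) ** 2 / (2 * var)


def log_odds_lower_vs_upper(reported: float) -> float:
    """Log posterior odds that the true value is reported - 1 rather than
    reported + 1. Both likelihoods are 1/2, so only the prior matters."""
    return log_normal_pdf(reported - 1) - log_normal_pdf(reported + 1)


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def posterior_mean(reported: float) -> float:
    """Posterior expectation of the true value given the report."""
    p_lower = sigmoid(log_odds_lower_vs_upper(reported))
    return p_lower * (reported - 1) + (1.0 - p_lower) * (reported + 1)


print(log_odds_lower_vs_upper(23000.0))  # analytically 2 * 23000 = 46000
print(posterior_mean(23000.0))
```

The log odds work out to twice the reported value, so at 23,000 the posterior puts essentially all its weight on 22,999, while the unbiased estimator still reports 23,000.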

Comment by rk on Act of Charity · 2018-11-18T16:12:30.748Z · score: 4 (2 votes) · LW · GW

I am also pretty interested in 2 (ex-post giving). In 2015, there was impactpurchase.org. I got in contact with them about it, and the major updates Paul reported were a) being willing to buy partial contributions (not just for people who were claiming full responsibility for things) and b) more focused on what's being funded (like for example, only asking for people to submit claims on blog posts and articles).

I realise that things like impactpurchase are possibly framed in terms of a slightly divergent reason for 2 (it seems more focused on changing the incentive landscape, whereas the posts above include thinking about whether giving slack to people with track records will lead those people to be counterfactually more effective in future).

Comment by rk on Prediction-Augmented Evaluation Systems · 2018-11-17T14:53:15.181Z · score: 2 (2 votes) · LW · GW

I'm interested in the predictors' incentives.

One problem with decision markets is that you only get paid for your information about an option if the decision is taken, which can incentivise you to overstate the case for an option (suppose an option's predicted benefit is X, its true benefit is X+k, and it would have to be at X+k+l to be chosen; if l < k, you will want to move the predicted benefit to X+k+l and make a k-l profit).

Maybe you avoid this if you pay for participation in PAES, but then you might risk people piling on to obvious judgments to get paid. Maybe you evaluate the counterfactual shift in confidence from someone making a judgment, and reward accordingly? But then it seems possible that the problems in the previous paragraph would appear again.
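
The overstating incentive from the X, k, l example above can be made concrete with a deliberately crude toy model. The assumptions are mine: a trader is paid by how much closer to the truth they leave the estimate than they found it, and trades on an option are voided if the option is not chosen. Real market scoring rules are not linear like this, and `trader_profit` is a hypothetical name.

```python
def trader_profit(predicted: float, true_benefit: float,
                  threshold: float, report: float) -> float:
    """Toy profit for moving an option's predicted benefit to `report`.

    Assumptions (not from any real market design):
    - trades are voided if the option is not chosen (report < threshold);
    - otherwise profit is the reduction in distance to the true benefit.
    """
    if report < threshold:
        return 0.0  # option not chosen, so the information is never paid for
    return abs(true_benefit - predicted) - abs(report - true_benefit)


X, k, l = 10.0, 3.0, 1.0
true_benefit = X + k   # 13
threshold = X + k + l  # 14, the level needed for the option to be chosen

honest = trader_profit(X, true_benefit, threshold, report=true_benefit)
overstated = trader_profit(X, true_benefit, threshold, report=threshold)
print(honest, overstated)  # overstating earns k - l when l < k
```

In this toy model the honest report earns nothing because the option is never chosen, whereas pushing the estimate up to the threshold earns k - l, which is positive exactly when l < k.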

Comment by rk on Prediction-Augmented Evaluation Systems · 2018-11-17T14:43:55.633Z · score: 2 (2 votes) · LW · GW

On estimating expected value, I'm reminded of some of Hanson's work where he suggests predicting later evaluation (recent example: http://www.overcomingbias.com/2018/11/how-to-fund-prestige-science.html). I think this is an interesting subcase of the evaluating subprocess. It also fits nicely with this post by PC.

Comment by rk on Prediction-Augmented Evaluation Systems · 2018-11-16T12:55:03.949Z · score: 3 (3 votes) · LW · GW

Thanks for the video! I had already skimmed this post when I noticed it, and then I watched it and reread the post. Perhaps my favourite thing about it was that it was slightly non-linear (skipping ahead to the diagram, non-linearity when covering sections).

Could you say a bit more about your worries with (scaling) prediction markets?

Do you have any thoughts about which experiments have the best expected information value per $?

Comment by rk on Annihilating aliens & Rare Earth suggest early filter · 2018-11-13T10:19:36.592Z · score: 1 (1 votes) · LW · GW

This was really interesting. I've thought of this comment on-and-off for the last month.

You raised an interesting reason for thinking that transhumans would have high anthropic measure. But if you have a reference-class based anthropic theory, couldn't transhumans have a lot of anthropic measure, but not be in our reference class (that is, for SSA, we shouldn't reason as if we were selected from a class containing all humans and transhumans)?

Even if we think that the reference class should contain transhumans, do we have positive reasons for thinking that it should contain organisations?

One thought is that you might reject reference classes in anthropic reasoning (even under SSA). Is that the case?

Comment by rk on AI development incentive gradients are not uniformly terrible · 2018-11-13T10:15:17.576Z · score: 1 (1 votes) · LW · GW

Yes, that seems an important case to consider.

You might still think the analysis in the post is relevant if there are actors that can shape the incentive gradients you talk about: Google might be able to focus its sub-entities in a particular way while maintaining profit or a government might choose to implement more or less oversight over tech companies.

Even with the above paragraph, it seems like the relative change-over-time in resources and power of the strategic entities would be important to consider, as you point out. In this case, it seems like (known) fast takeoffs might be safer!

Comment by rk on AI development incentive gradients are not uniformly terrible · 2018-11-12T16:32:02.043Z · score: 2 (2 votes) · LW · GW

I talked to a couple of people in relevant organisations about possible info hazards for talking about races (not because this model is sophisticated or non-obvious, but because it contributes to general self-fulfilling chattering). Those I talked to were not worried about (a) simple pieces with at least some nuance in general and (b) this post in particular.

Comment by rk on AI development incentive gradients are not uniformly terrible · 2018-11-12T16:29:44.914Z · score: 2 (2 votes) · LW · GW

Comment here if you have structure/writing complaints for the post

Comment by rk on AI development incentive gradients are not uniformly terrible · 2018-11-12T16:29:20.502Z · score: 2 (2 votes) · LW · GW

Comment here if you are worried about info-hazard-y-ness of talking about AI races

Comment by rk on AI development incentive gradients are not uniformly terrible · 2018-11-12T16:28:26.949Z · score: 2 (2 votes) · LW · GW

Comment here if there are maths problems

## AI development incentive gradients are not uniformly terrible

2018-11-12T16:27:31.886Z · score: 23 (10 votes)

Comment by rk on Update the best textbooks on every subject list · 2018-11-08T21:46:24.089Z · score: 4 (3 votes) · LW · GW

Having the rules in the post made me think you wanted new suggestions in this thread. The rest of the post and habryka's comment point towards new comments in the old thread.

If you want people to update the old thread, I would either remove the rules from this post, or add a caveat like "Remember, when you go to post in that thread, you should follow the rules below"

Comment by rk on Track-Back Meditation · 2018-11-03T16:33:02.506Z · score: 2 (2 votes) · LW · GW

I've been trying this for a couple of weeks now. It's hard! I often will have a missing link in the distraction chain: I know something that came at point X in the distraction chain and X-n, for n > 1. When I try and probe the missing part it's pretty uncomfortable. Like using or poking a numb limb. It can be pretty aversive, so I can't bring myself to do this meditation every time I meditate.

Comment by rk on "Now here's why I'm punching you..." · 2018-10-18T12:03:10.653Z · score: 8 (3 votes) · LW · GW

This changed my mind about the parent comment (I think the first paragraph would have done so, but the example certainly helped).

In general, I don't mind added concreteness even at the cost of some valence-loading. But seeing how well "sanction" works and some other comments that seem to disagree on the exact meaning of "punch", I guess not using "punch" would have been better

Comment by rk on The Kelly Criterion · 2018-10-17T01:08:32.401Z · score: 2 (2 votes) · LW · GW

I did indeed! So I guess this game fails (5) out of Zvi's criteria.

Comment by rk on The Kelly Criterion · 2018-10-17T00:03:27.955Z · score: 2 (2 votes) · LW · GW

Does your program assume that the Kelly bet stays a fixed size, rather than changing?

Here's a program you can paste in your browser that finds the expected value from following Kelly in Gurkenglas' game (it finds EV to be 20)

https://pastebin.com/iTDK7jX6

(You can also fiddle with the first argument to experiment to see some of the effects when 4 doesn't hold)

Comment by rk on Annihilating aliens & Rare Earth suggest early filter · 2018-09-24T15:12:04.690Z · score: 4 (2 votes) · LW · GW

It sounds like in the first part of your post you're disagreeing with my choice of reference class when using SSA? That's reasonable. My intuition is that if one ends up using a reference class-dependent anthropic principle (like SSA), transhumans would not be part of our reference class, but I suppose I don't have much reason to trust this intuition.

On anthropic measure being tied to independently-intelligent minds, what is the difference between an independently- and dependently-intelligent mind? What makes you think the mind needs to be specifically independently-intelligent?

Comment by rk on Annihilating aliens & Rare Earth suggest early filter · 2018-09-24T15:07:22.192Z · score: 2 (2 votes) · LW · GW

Yes, I suppose the only way that this would not be an issue is if the aliens are travelling at a very high fraction of the speed of light and inflation means that they will never reach spatially distant parts of the Universe in time for this to be an issue.

In SETI-attack, is the idea that the information signals are disruptive and cause the civilisations they may annihilate to be too disrupted (perhaps by war or devastating technological failures) to defend themselves?

Comment by rk on Annihilating aliens & Rare Earth suggest early filter · 2018-09-21T17:26:23.159Z · score: 2 (2 votes) · LW · GW

Yeah, that's a good point. I will amend that part at some point.

Also, the analysis might have some predictions if civilisations don't pass through a (long) observable stage before they start to expand. It increases the probability that a shockwave of intergalactic expansion will arrive at Earth soon. Still, if the region of our past light cone where young civilisations might exist is small enough, we probably just lose information on where the filter is likely to be

## Annihilating aliens & Rare Earth suggest early filter

2018-09-21T15:33:11.603Z · score: 8 (5 votes)

Comment by rk on (A -> B) -> A · 2018-09-12T12:59:02.866Z · score: 4 (3 votes) · LW · GW

I wonder if there are any plausible examples of this type where the constraints don't look like ordering on B and search on A.

To be clear about what I mean about those constraints, here's an example. One way you might be able to implement this function is if you can enumerate all the values of A and then pick the maximum B according to some ordering. If you can't enumerate A, you might have some strategy for searching through it.
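
The enumerate-and-maximise strategy just described can be sketched as follows. It assumes A is a finite collection you can iterate over and that values of B can be compared; `select_by_enumeration` is a hypothetical name.

```python
from typing import Callable, Iterable, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def select_by_enumeration(candidates: Iterable[A], f: Callable[[A], B]) -> A:
    """Return the element of A whose image under f is greatest.

    Assumes the candidates can be enumerated and that B has an ordering
    (Python's built-in comparison is used here).
    """
    return max(candidates, key=f)


# With the candidates fixed, what remains has the shape (A -> B) -> A.
def choose(f: Callable[[int], int]) -> int:
    return select_by_enumeration(range(-3, 4), f)


print(choose(lambda a: -(a - 1) ** 2))  # picks the a that maximises f
```

If A cannot be enumerated, the call to `max` would have to be replaced by some search strategy over A, which is the other constraint mentioned above.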

But that's not the only feasible strategy. For example, if you can order B, take two elements of B to C and order C, you might do something like taking the element of B that, together with the value less than it, takes you to the greatest C.

My question is whether these weirder functions have any interest

Comment by rk on Is there a practitioner's guide for rationality? · 2018-08-14T21:51:08.152Z · score: 6 (3 votes) · LW · GW

I wasn't aware that CFAR had workshops in Europe before this comment. I applied for a workshop off the back of this. Thanks!

Comment by rk on Tactical vs. Strategic Cooperation · 2018-08-14T19:11:17.315Z · score: 12 (7 votes) · LW · GW

I feel a pull towards downvoting this. I am not going to, because I think this was posted in good faith, and as you say, it's clear a lot of time and effort has gone into these comments. That said, I'd like to unpack my reaction a bit. It may be you disagree with my take, but it may also be there's something useful in it.

[EDIT: I should disclaim that my reaction may be biased from having recently received an aggressive comment.]

First, I should note that I don't know why you did these edits. Did sarahconstantin ask you to? Did you think a good post was being lost behind poor presentation? Is it to spell out your other comment in more detail? Knowing the answer to this might have changed my reaction.

My most important concern is why this feedback was public. The only (charitable) reason I can think of is to give space for pushback of the kind that I am giving.

My other major concern is presentation. The sentence 'I trust that you can see past the basic "I'm being attacked" feeling and can recognise the effort and time that has gone into the comments' felt to me like a status move, and like potentially upsetting someone and then asking them to say thank you.

Comment by rk on Rationality Retreat in Europe: Gauging Interest · 2018-08-14T10:27:39.616Z · score: 2 (2 votes) · LW · GW

It is probably true that those are the places with most engagement. However, as someone without Facebook, I'm always grateful for things (also) being posted in non-FB places (mailing lists work too, but there is a longer lag on finding out about things that way).

Comment by rk on Gears in understanding · 2018-08-13T09:29:31.215Z · score: 3 (1 votes) · LW · GW

It seems like the images of the gears have disappeared. Are they still available anywhere? EDIT: They're back!

Comment by rk on "Taking AI Risk Seriously" (thoughts by Critch) · 2018-08-13T08:49:36.629Z · score: 3 (2 votes) · LW · GW

If you can’t viscerally feel the difference between .1% and 1%, or a thousand and a million, you will probably need more of a statistics background to really understand things like “how much money is flowing into AI, and what is being accomplished, and what does it mean?”

I'm surprised at the suggestion that studying statistics strengthens gut sense of the significance of probabilities. I've updated somewhat towards that based on the above, but I would still expect something more akin to playing with and visualising data to be useful for this

Comment by rk on Sandboxing by Physical Simulation? · 2018-08-01T19:58:17.329Z · score: 3 (2 votes) · LW · GW

It seems hard to me to get information out of the AI without also giving it information. That is, presumably we will configure parts of its environment to correspond to problems in our own world, which necessarily gives some information on our world.

I suppose another option would be that this is a proposal for running AGIs that just run without us ever getting information from them. I don't think that's what you meant, but thought I'd check.

Comment by rk on Sandboxing by Physical Simulation? · 2018-08-01T14:27:19.297Z · score: 7 (2 votes) · LW · GW

I suppose the problem comes when the AI starts to communicate with us. There would be a lot of information that they could exploit. Even if they don't get any sense of our physics, if they are able to model us we might be in trouble. And even if we didn't give them any direct communication (for example manifesting puzzles in their world, the solution of which would allow us to solve our own questions), they might promote simulation to a reasonable hypothesis.

EY wrote a story that serves as an intuition pump here.

## Impulsivity in The Procrastination Equation

2018-07-30T20:46:20.303Z · score: 9 (5 votes)

Comment by rk on Prediction Markets: When Do They Work? · 2018-07-29T23:30:11.314Z · score: 11 (3 votes) · LW · GW

Any disagreement with Robin Hanson (https://twitter.com/robinhanson/status/1022535475410731009)?

It seems to me that subsidy addresses V, II and maybe III, but not so much IV and I. Does that seem right?

(Also, I think he's suggested non-anonymous trading as a cure for insider trading before)

Comment by rk on Complete Class: Consequentialist Foundations · 2018-07-20T09:58:05.980Z · score: 1 (1 votes) · LW · GW

On the other hand, there may well be a serious game-theoretic reason why it is "too harsh": someone who is getting to cooperation from the system has no reason to cooperate in turn.

Is there a typo here? ("getting to cooperation" -> "getting no cooperation"). And the idea is that there are other ways of making an impact on the world than the decisions of the "futarchy", so people who have no stake in the futarchy could mess things up other ways, right?

Comment by rk on [1607.08289] "Mammalian Value Systems" (as a starting point for human value system model created by IRL agent) · 2018-07-18T13:05:20.517Z · score: 1 (1 votes) · LW · GW

It wasn't clear to me from the paper if they thought values that came from our contingent history could be worth preserving or promoting. For example, they might think they engage our same moral intuitions as mammalian values without being worth defending

Comment by rk on [1607.08289] "Mammalian Value Systems" (as a starting point for human value system model created by IRL agent) · 2018-07-16T09:33:19.417Z · score: 3 (2 votes) · LW · GW

I felt similarly. I'd also like to see them dig more into their mammalian values + human cognition + evolution of human society/culture. Specifically, (1) defending the breakdown as a good account of human values and (2) separating out their claims about values in human cultures (and being a bit clearer about whether they claim that cultural values are less likely 'true' values, however that might be cashed out) and about values arising from historical incident

Comment by rk on [1607.08289] "Mammalian Value Systems" (as a starting point for human value system model created by IRL agent) · 2018-07-15T21:52:26.861Z · score: 1 (1 votes) · LW · GW

Anyone know how this relates to Armstrong & Mindermann showing that you can't infer the values of an irrational agent? I've not read their paper, but I am not sure if the mammalian values can count as the "normative assumptions" they mention in the abstract

## What I got out of 'Algorithms to Live By'

2018-04-11T16:35:05.155Z · score: 28 (10 votes)

## Trust the (local) expert

2018-04-09T10:22:13.133Z · score: 31 (7 votes)