Halifax Meetup -- Board Games 2019-04-15T04:00:02.799Z · score: 4 (2 votes)
Predictors as Agents 2019-01-08T20:50:49.599Z · score: 12 (8 votes)
Formal Models of Complexity and Evolution 2017-12-31T20:17:46.513Z · score: 12 (4 votes)
A Candidate Complexity Measure 2017-12-31T20:15:39.629Z · score: 29 (9 votes)
Please Help: How to make a big improvement in the alignment of political parties’ incentives with the public interest? 2017-01-18T00:51:56.355Z · score: 5 (4 votes)


Comment by interstice on Insights from the randomness/ignorance model are genuine · 2019-11-14T17:47:33.460Z · score: 1 (1 votes) · LW · GW

No, I mean Beauty's subjective credence that the coin came up heads. That should be 1/2 by the nature of a coin flip. Then, if the coin comes up tails, you need 1 bit to select between the subjectively identical states of waking up on Monday or Tuesday. So in total:

P(heads, Monday) = 1/2

P(tails, Monday) = 1/4

P(tails, Tuesday) = 1/4
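A toy calculation of the weights above (a minimal sketch; the bit counts are the ones assumed in this comment, not derived from an actual universal prior):

```python
# Each centered world gets weight 2^-(description length in bits).
# Heads needs only the coin outcome (1 bit); tails needs the outcome
# plus 1 more bit to pick out which awakening you are in.
bits = {
    ("heads", "Monday"): 1,   # coin outcome only
    ("tails", "Monday"): 2,   # coin outcome + which awakening
    ("tails", "Tuesday"): 2,
}
weights = {w: 2.0 ** -b for w, b in bits.items()}
total = sum(weights.values())
probs = {w: wt / total for w, wt in weights.items()}
print(probs)
# {('heads', 'Monday'): 0.5, ('tails', 'Monday'): 0.25, ('tails', 'Tuesday'): 0.25}
```

Here the weights already sum to 1, so normalization is a no-op, but it matters if the bit counts change.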

Comment by interstice on Insights from the randomness/ignorance model are genuine · 2019-11-14T03:35:06.331Z · score: 1 (1 votes) · LW · GW

I was aware that many people had since moved on to UDT. I meant that UDASSA seems to be satisfactory at resolving the typical questions about anthropic probabilities, setting aside decision theory/noncomputability issues.

I agree it would be nice to have all this information in a readily accessible place. Maybe the posts setting out the ideas and later counter-arguments could be put in a curated sequence.

Comment by interstice on Insights from the randomness/ignorance model are genuine · 2019-11-14T02:28:18.342Z · score: 1 (1 votes) · LW · GW

Yeah, I also had similar ideas for solving anthropics a few years ago, and was surprised when I learned that UDASSA had been around for so long. At least you can take pride in having found the right answer independently.

I think that UDASSA gives P(heads) = 1/2 on the Sleeping Beauty problem due to the way it weights different observer-moments, proportional to 2^(-description length). This might seem a bit odd, but I think it's necessary to avoid problems with Boltzmann brains and the like.

Comment by interstice on [Link] John Carmack working on AGI · 2019-11-14T00:18:11.553Z · score: 5 (3 votes) · LW · GW

Here it is:

Starting this week, I'm moving to a "Consulting CTO" position with Oculus.
I will still have a voice in the development work, but it will only be consuming a modest slice of my time.
As for what I am going to be doing with the rest of my time: When I think back over everything I have done across games, aerospace, and VR, I have always felt that I had at least a vague “line of sight” to the solutions, even if they were unconventional or unproven. I have sometimes wondered how I would fare with a problem where the solution really isn’t in sight. I decided that I should give it a try before I get too old.
I’m going to work on artificial general intelligence (AGI).
I think it is possible, enormously valuable, and that I have a non-negligible chance of making a difference there, so by a Pascal’s Mugging sort of logic, I should be working on it.
For the time being at least, I am going to be going about it “Victorian Gentleman Scientist” style, pursuing my inquiries from home, and drafting my son into the work.
Runner up for next project was cost effective nuclear fission reactors, which wouldn’t have been as suitable for that style of work. 😊
Comment by interstice on Insights from the randomness/ignorance model are genuine · 2019-11-14T00:13:11.480Z · score: 1 (1 votes) · LW · GW

It's true that UDASSA is tragically underrated, given that (it seems to me) it provides a satisfactory resolution to all anthropic problems. I think this might be a situation where people tend to leave the debate and move on to something else when they seem to have found a satisfactory position, like how most LW people don't bother arguing about whether god exists anymore.

Comment by interstice on Insights from the randomness/ignorance model are genuine · 2019-11-13T22:57:51.918Z · score: 6 (2 votes) · LW · GW

This seems like a step backwards from UDASSA, another potential solution to many anthropic problems. UDASSA has a completely formal specification, while this model relies on a somewhat unclear verbal definition. So you need to know the 'relative frequency' with which H happens. But what are we averaging over here? Our universe? All possible universes? If uncertain about which universe we are in, how should we average over the different universes? What if we are reasoning about an event which, as far as we know, will only happen once?

Comment by interstice on Honoring Petrov Day on LessWrong, in 2019 · 2019-09-28T00:50:54.455Z · score: 16 (6 votes) · LW · GW

Can't tell if joking, but they probably mean that they were "actually in the mafia" in the game, so not in the real-world mafia.

Comment by interstice on Accelerating the capitalistic tendencies of social systems to do good · 2019-09-18T14:47:18.133Z · score: 3 (2 votes) · LW · GW

A better system won't just magically form itself after the existing system has been destroyed. In all likelihood what will form will be either a far more corrupt and oligarchical system, or no system at all. I think a better target for intervention would be attempting to build superior alternatives so that something is available when the existing systems start to fail. In education for example, Lambda School is providing a better way for many people to learn programming than college.

Note also that existing systems of power are very big, so efforts to damage them probably have low marginal impact. Building initially small new things can have much higher marginal impact. If the systems are as corrupt as you think they are, they should destroy themselves on their own in any case.

Comment by interstice on Halifax Meetup -- Board Games · 2019-09-02T16:29:51.927Z · score: 5 (3 votes) · LW · GW

Update -- We're going to have a meetup on September 21st at Uncommon Grounds (1030 South Park Street). This is going to be posted on SlateStarCodex as part of the worldwide meetup event.

Comment by interstice on Self-supervised learning & manipulative predictions · 2019-08-22T19:47:46.296Z · score: 1 (1 votes) · LW · GW

No worries, I also missed the earlier posts when I wrote mine. There's lots of stuff on this website.

I endorse your rephrasing of example 1. I think my position is that it's just not that hard to create a "self-consistent probability distribution". For example, say you trained an RNN to predict sequences, like in this post. Despite being very simple, it already implicitly represents a probability distribution over sequences. If you train it with back-propagation on a confusing article involving pyrite, then its weights will be updated to try to model the article better. However, if "pyrite" itself was easy to predict, then the weights that led to it outputting "pyrite" will *not* be updated. The same thing holds for modern Transformer networks, which predict the next token based only on what they have seen so far. (Here is a paper with a recent example using GPT-2. Note the degeneracy of maximum-likelihood sampling, but also how this becomes less of a problem when just sampling from the implied distribution.)
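A minimal illustration of the locality point (toy numbers and a bare softmax, not actual RNN/Transformer training code): the cross-entropy gradient at a position depends only on that position's predicted distribution and the token actually observed there.

```python
import numpy as np

# Toy softmax next-token predictor. The cross-entropy gradient w.r.t.
# the logits is (probs - one_hot): it depends only on the current
# prediction and the observed token, with no term rewarding whatever
# makes *later* tokens easier to predict.
logits = np.array([2.0, 0.5, -1.0])          # scores for 3 candidate tokens
probs = np.exp(logits) / np.exp(logits).sum()

observed = 1                                  # index of the token that occurred
one_hot = np.eye(3)[observed]
grad = probs - one_hot                        # d(cross-entropy)/d(logits)

logits_new = logits - 0.5 * grad              # one gradient step
probs_new = np.exp(logits_new) / np.exp(logits_new).sum()
assert probs_new[observed] > probs[observed]  # observed token pushed up
```

The same local structure holds per-position when training on whole sequences with teacher forcing.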

I agree that this sort of manipulative prediction could be a problem in principle, but it does not seem to occur in recent ML systems. (Although there are some things which are somewhat like this; the earlier paper I linked and mode collapse do involve neglecting high-entropy components of the distribution. However, the most straightforward generation and training schemes do not incentivize this.)

For example 2, the point about gradient descent is this: while it might be the case that outputting "Help I'm stuck in a GPU Factory000" would ultimately result in a higher accuracy, the way the gradient is propagated would not encourage the agent to behave manipulatively. This is because, *locally*, "Help I'm stuck in a GPU Factory" decreases accuracy, so that behavior (or policies leading to it) will be dis-incentivized by gradient descent. It may be the case that this will result in easier predictions later, but the structure of the reward function does not lead to any optimization pressure towards such manipulative strategies. Learning taking place over high-level abstractions doesn't change anything, because any high-level abstractions leading to locally bad behavior will likewise be dis-incentivized by gradient descent.

Comment by interstice on Self-supervised learning & manipulative predictions · 2019-08-20T14:44:33.299Z · score: 5 (3 votes) · LW · GW

Example 1 basically seems to be the problem of output diversity in generative models. This can be a problem in generative models, but there are ways around it. e.g. instead of outputting the highest-probability individual sequence, which will certainly look "manipulative" as you say, sample from the implied distribution over sequences. Then the sentence involving "pyrite" will be output with probability proportional to how likely the model thinks "pyrite" is on its own, disregarding subsequent tokens.

For example 2, I wrote a similar post a few months ago (and in fact, this idea seems to have been proposed and forgotten a few times on LW). But for gradient descent-based learning systems, I don't think the effect described will take place.

The reason is that gradient-descent-based systems are only updated towards what they actually observe. Let's say we're training a system to predict EU laws. If it predicts "The EU will pass potato laws..." but sees "The EU will pass corn laws..." the parameters will be updated to make "corn" more likely to have been output than "potato". There is no explicit global optimization for prediction accuracy.

As you train to convergence, the predictions of the model will attempt to approach a fixed point, a set of predictions that imply themselves. However, due to the local nature of the update, this fixed point will not be selected to be globally minimal; it will just be the first minimum the model falls into. (This is different from the problems with "local minima" you may have heard about in ordinary neural network training -- those go away in the infinite-capacity limit, whereas local minima among fixed points do not.) The fixed point should look something like "what I would predict if I output [what I would predict if I output [what I would predict ...]]", where the initial prediction is some random gibberish. This might look pretty weird, but it's not optimizing for global prediction accuracy.
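A toy illustration of landing on the first self-confirming prediction (a hypothetical one-dimensional "reactive world", invented for illustration; not the actual training dynamics):

```python
# The world reacts to the prediction p, so convergence means finding a
# p with world(p) == p. Two self-fulfilling predictions exist here, and
# iteration lands on whichever one the starting guess falls toward --
# there is no global comparison of which fixed point is more accurate.
def world(p):
    # hypothetical reactive environment: listeners round the forecast
    return 0.2 if p < 0.5 else 0.8

def find_fixed_point(p, steps=100):
    for _ in range(steps):
        p = world(p)
    return p

assert find_fixed_point(0.1) == 0.2   # starts low  -> low fixed point
assert find_fixed_point(0.9) == 0.8   # starts high -> high fixed point
```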

Comment by interstice on Halifax Meetup -- Board Games · 2019-08-01T02:40:50.248Z · score: 3 (2 votes) · LW · GW

Hey! It did happen. So far there are 3 of us, we've been meeting up pretty regularly. If you're interested I can let you know the next time we're planning to meet up.

Comment by interstice on No Safe AI and Creating Optionality · 2019-06-24T02:33:21.564Z · score: 5 (3 votes) · LW · GW

What this would mean is that we would have to recalibrate our notion of "safe", as whatever definition has been proved impossible does not match our intuitive perception. We consider lots of stuff we have around now to be reasonably safe, although we don't have a formal proof of safety for almost anything.

Comment by interstice on "UDT2" and "against UD+ASSA" · 2019-05-12T16:06:40.403Z · score: 2 (2 votes) · LW · GW

In the mad scientist example, why would your measure for the die landing 0 be 0.91? I think Solomonoff Induction would assign probability 0.1 to that outcome, because you need extra bits to specify which clone you are. Or is this just meant to illustrate a problem with ASSA, UD not included?

Comment by interstice on Predictors as Agents · 2019-03-25T20:57:47.428Z · score: 1 (1 votes) · LW · GW

Yeah, if you train the algorithm by random sampling, the effect I described will take place. The same thing will happen if you use an RL algorithm to update the parameters instead of an unsupervised learning algorithm (though it seems willfully perverse to do so -- you're throwing away a lot of the structure of the problem by doing this, so training will be much slower).

I also just found an old comment which makes the exact same argument I made here. (Though it now seems to me that argument is not necessarily correct!)

Comment by interstice on Two Small Experiments on GPT-2 · 2019-03-04T23:33:55.601Z · score: 4 (2 votes) · LW · GW

If you literally ran (a powered-up version of) GPT-2 on "A brilliant solution to the AI alignment problem is..." you would get the sort of thing an average internet user would think of as a brilliant solution to the AI alignment problem. Trying to do this more usefully basically leads to Paul's agenda (which is about trying to do imitation learning of an implicit organization of humans).

Comment by interstice on Predictors as Agents · 2019-01-03T21:22:03.357Z · score: 3 (2 votes) · LW · GW

Reflective Oracles are a bit of a weird case because their 'loss' is more like a 0/1 loss than a log loss, so all of the minima are exactly the same (if we take a sample of 100000 universes to score them, the difference is merely incredibly small instead of 0). I was being a bit glib referencing them in the article; I had in mind something more like a model parameterizing a distribution over outputs, whose only influence on the world is via a random sample from this distribution. I think that such models should in general have fixed points for similar reasons, but am not sure. Regardless, these models will, I believe, favour fixed points whose distributions are easy to compute (but not fixed points with low entropy; that is, they will punish logical uncertainty but not intrinsic uncertainty). I'm planning to run some experiments with VAEs and post the results later.

Comment by interstice on Generalising CNNs · 2019-01-03T03:49:10.172Z · score: 3 (2 votes) · LW · GW

You might be interested in Transformer Networks, which use a learned pattern of attention to route data between layers. They're pretty popular and have been used in some impressive applications like this very convincing image-synthesis GAN.

re: whether this is a good research direction. The fact that neural networks are highly compressible is very interesting and I too suspect that exploiting this fact could lead to more powerful models. However, if your goal is to increase the chance that AI has a positive impact, then it seems like the relevant thing is how quickly our understanding of how to align AI systems progresses, relative to our understanding of how to build powerful AI systems. As described, this idea sounds like it would be more useful for the latter.

Comment by interstice on Predictors as Agents · 2019-01-01T21:19:50.390Z · score: 1 (1 votes) · LW · GW
Is there a reason you think a reflective oracle (or equivalent) can't just be selected "arbitrarily", and will likely be selected to maximize some score?

The gradient descent is not being done over the reflective oracles; it's being done over some general computational model like a neural net. Any high-performing solution will necessarily look like a fixed-point-finding computation of some kind, due to the self-referential nature of the predictions. Then, since this fixed-point-finder is *internal* to the model, it will be optimized for log loss just like everything else in the model.

That is, the global optimization of the model is distinct from whatever internal optimization the fixed-point-finder uses to choose the reflective oracle. The global optimization will favor internal optimizers that produce fixed-points with good score. So while fixed-point-finders in general won't optimize for anything in particular, the one this model uses will.

Comment by interstice on Announcement: AI alignment prize round 3 winners and next round · 2018-12-31T22:53:30.147Z · score: 3 (2 votes) · LW · GW

I submit Predictors as Agents.

Comment by interstice on Reflective AIXI and Anthropics · 2018-09-30T15:18:25.622Z · score: 1 (1 votes) · LW · GW
If we assume Sleeping Beauty has lots of information, we might expect that the shortest matching program will look like a simulation of physical law plus a "bridging law" that, given this simulation, tells you what symbols get written to the tape

I agree. I still think that the probabilities would be closer to 1/2, 1/4, 1/4. The bridging law could look like this: search over the universe for compact encodings of my memories so far, then see what is written next onto this encoding. In this case, it would take no more bits to specify waking up on Tuesday, because the memories are identical, in the same format, and just slightly later temporally.

In a naturalized setting, it seems like the tricky part would be getting the AIXI on Monday to care what happens after it goes to sleep. It 'knows' that it's going to lose consciousness(it can see that its current memory encoding is going to be overwritten) so its next prediction is undetermined by its world-model. There is one program that will give it the reward of its successor then terminates, as I described above, but it's not clear why the AIXI would favour that hypothesis. Maybe if it has been in situations involving memory-wiping before, or has observed other RO-AIXI's in such situations.

Comment by interstice on Deep learning - deeper flaws? · 2018-09-29T19:19:46.254Z · score: 1 (1 votes) · LW · GW

"I can't make bets on my beliefs about the Eschaton, because they are about the Eschaton." -- Well, it makes sense. Besides, I did offer you a bet taking into account a) that the money may be worth less in my branch b) I don't think DL + RL AGI is more likely than not, just plausible. If you're more than 96% certain there will be no such AI, 20:1 odds are a good deal.

But anyways, I would be fine with betting on a nearer-term challenge. How about -- in 5 years, a bipedal robot that can run on rough terrain, as in this video, using a policy learned from scratch by DL + RL (possibly including a simulated environment during training), at 1:1 odds.

Comment by interstice on Deep learning - deeper flaws? · 2018-09-28T17:07:21.607Z · score: 1 (1 votes) · LW · GW

Hmmm...but if I win the bet then the world may be destroyed, or our environment could change so much the money will become worthless. Would you take 20:1 odds that there won't be DL+RL-based HLAI in 25 years?

Comment by interstice on Reflective AIXI and Anthropics · 2018-09-28T16:59:33.354Z · score: 1 (1 votes) · LW · GW

I still don't see how you're getting those probabilities. Say it takes 1 bit to describe the outcome of the coin toss, and assume it's easy to find all the copies of yourself (i.e., your memories) in different worlds. Then you need:

1 bit to specify if the coin landed heads or tails

If the coin landed tails, you need 1 more bit to specify if it's Monday or Tuesday.

So AIXI would give these scenarios P(HM)=0.50, P(TM)=0.25, P(TT)=0.25.

Comment by interstice on Deep learning - deeper flaws? · 2018-09-27T22:08:06.447Z · score: 1 (1 votes) · LW · GW

Have something in mind?

Comment by interstice on Reflective AIXI and Anthropics · 2018-09-27T20:34:40.935Z · score: 1 (1 votes) · LW · GW

Well, it COULD be the case that the K-complexity of the memory-erased AIXI environment is lower, even when it learns that this happened. The reason for this is that there could be many possible past AIXI's who have their memory erased/altered and end up in the same subjective situation. Then the memory-erasure hypothesis can use the lowest K-complexity AIXI who ends up with these memories. As the AIXI learns more it can gradually piece together which of the potential past AIXI's it actually was and the K-complexity will go back up again.

EDIT: Oh, I see you were talking about actually having a RANDOM memory in the sense of a random sequence of 1s and 0s. Yeah, but this is no different than AIXI thinking that any random process is high K-complexity. In general, and discounting merging, the memory-altering subroutine will increase the complexity of the environment by a constant plus the complexity of whatever transformation you want to apply to the memories.

Comment by interstice on Deep learning - deeper flaws? · 2018-09-27T19:49:47.219Z · score: 2 (3 votes) · LW · GW

Well, the DotA bot pretty much just used PPO. AlphaZero used MCTS + RL, and OpenAI recently got a robot hand to do object manipulation with PPO and a simulator (the simulator was hand-built, but in principle it could be produced by unsupervised learning like in this). Clearly it's possible to get sophisticated behaviors out of pretty simple RL algorithms. It could be the case that these approaches will "run out of steam" before getting to HLAI, but it's hard to tell at the moment, because our algorithms aren't running with the same amount of compute + data as humans (for humans, I am thinking of our entire lifetime experiences as data, which is used to build a cross-domain optimizer).

re: Uber, I agree that at least in the short term most applications in the real world will feature a fair amount of engineering by hand. But the need for this could decrease as more power becomes available, as has been the case in supervised learning.

Comment by interstice on Boltzmann Brains and Within-model vs. Between-models Probability · 2018-09-27T16:29:38.356Z · score: 1 (1 votes) · LW · GW

How do the initial simple conditions relate to the branching? Our universe seems to have had simple initial conditions, but there's still been a lot of random branching, right? That is, the universe from our perspective is just one branch of a quantum state evolving simply from simple conditions, so you need O(#branching events) bits to describe it. Incidentally, this undermines Eliezer's argument for MWI based on Solomonoff induction, though MWI is probably still true.

[EDITED: Oh, from one of your other comments I see that you aren't saying the shortest program involves beginning at the start of the universe. That makes sense]

Comment by interstice on Deep learning - deeper flaws? · 2018-09-27T16:11:31.441Z · score: 1 (1 votes) · LW · GW

I agree that you do need some sort of causal structure around the function-fitting deep net. The question is how complex this structure needs to be before we can get to HLAI. It seems plausible to me (at least a 10% chance, say) that it could be quite simple, maybe just consisting of modestly more sophisticated versions of the RL algorithms we have so far, combined with really big deep networks.

Comment by interstice on Reflective AIXI and Anthropics · 2018-09-27T15:51:05.644Z · score: 1 (1 votes) · LW · GW

Incidentally, you can use the same idea to have RO-AIXI do anthropic reasoning/bargaining about observers that are in a broader reference class than 'exact same sense data', by making the mapping O -> O' some sort of coarse-graining.

Comment by interstice on Reflective AIXI and Anthropics · 2018-09-27T15:44:14.392Z · score: 1 (1 votes) · LW · GW

" P(HM)=0.49, P(TM)=0.49, P(TT)=0.2 " -- Are these supposed to be mutually exclusive probabilities?

" There is a turing machine that writes the memory-wiped contents to tape all in one pass. " - Yes, this is basically what I said. ('environment' above could include 'the world' + bridging laws). But you also need to alter the reward structure a bit to make it match our usual intuition of what 'memory-wiping' means, and this has significance for decision theory.

Consider, if your own memory was erased, you would probably still be concerned about what was going to happen to you later. But a regular AIXI won't care about what happens to its memory-wiped clone(i.e. another AIXI inducting on the 'memory-wiped' stream), because they don't share an input channel. So to fix this you give the original AIXI all of the rewards that its clone ends up getting.

Comment by interstice on Deep learning - deeper flaws? · 2018-09-26T22:36:05.115Z · score: 3 (1 votes) · LW · GW

Okay, but (e.g.) deep RL methods can solve problems that apparently require quite complex causal thinking such as playing DotA. I think what is happening here is that while there is no explicit causal modelling happening at the lowest level of the algorithm, the learned model ends up building something that serves the functions of one because that is the simplest way to solve a general class of problems. See the above meta-RL paper for good examples of this. There seems to be no obvious obstruction to scaling this sort of thing up to human-level causal modelling. Can you point to a particular task needing causal inference that you think these methods cannot solve?

Comment by interstice on Boltzmann Brains and Within-model vs. Between-models Probability · 2018-09-26T15:26:28.038Z · score: 1 (1 votes) · LW · GW

The penalty for specifying where you are in space and time is dwarfed by the penalty for specifying which Everett branch you're in.

Comment by interstice on Deep learning - deeper flaws? · 2018-09-25T19:26:08.580Z · score: 5 (4 votes) · LW · GW

In the past, people have said that neural networks could not possibly scale up to solve problems of a certain type, due to inherent limitations of the method. Neural net solutions have then been found using minor tweaks to the algorithms and (most importantly) scaling up data and compute. Ilya Sutskever gives many examples of this in his talk here. Some people consider this scaling-up to be "cheating" and evidence against neural nets really working, but it's worth noting that the human brain uses compute on the scale of today's supercomputers or greater, so perhaps we should not be surprised if a working AI design requires a similar amount of power.

On a cursory reading, it seems like most of the problems given in the papers could plausibly be solved by meta-reinforcement learning on a general-enough set of environments, of course with massively scaled-up compute and data. It may be that we will need a few more non-trivial insights to get human-level AI, but it's also plausible that scaling up neural nets even further will just work.

Comment by interstice on Reflective AIXI and Anthropics · 2018-09-25T17:57:02.891Z · score: 4 (2 votes) · LW · GW

I think the framework of RO-AIXI can be modified pretty simply to include memory-tampering.

Here's how you do it. Say you have an environment E and an RO-AIXI A running in it. You have run the AIXI for a number of steps, and it has a history of observations O. Now we want to alter its memory to have a history of observations O'. This can be implemented in the environment as follows:

1. Create a new AIXI A', with the same reward function as the original and no memories. Feed it the sequence of observations O'.

2. Run A' in place of A for the remainder of E. In the course of this execution, A' will accumulate total reward R. Terminate A'.

3. Give the original AIXI reward R, then terminate it.

This basically captures what it means for AIXI's memory to be erased. Two AIXI's are only differentiated from each other by their observations and reward function, so creating a new AIXI which shares a reward function with the original is equivalent to changing the first AIXI's observations. The new AIXI, A', will also be able to reason about the possibility that it was produced by such a 'memory-tampering program', as this is just another possible RO-Turing machine. In other words it will be able to reason about the possibility that its memory has been altered.
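The construction can be sketched with toy stubs (all names here are invented for illustration; a real RO-AIXI is uncomputable, so this only captures the reward bookkeeping of steps 1-3):

```python
# Toy sketch of the memory-tampering construction. ToyAgent stands in
# for an agent identified by its reward function plus its observation
# history; tamper_memory implements steps 1-3 above.
class ToyAgent:
    def __init__(self, reward_fn):
        self.reward_fn = reward_fn
        self.history = []
        self.total_reward = 0.0

    def observe(self, obs):
        self.history.append(obs)
        self.total_reward += self.reward_fn(obs)

def tamper_memory(agent, new_history, remaining_env):
    # Step 1: fresh agent with the same reward function, fed the
    # fabricated observation history O'.
    clone = ToyAgent(agent.reward_fn)
    for obs in new_history:
        clone.observe(obs)
    clone.total_reward = 0.0          # R counts only the remainder of E
    # Step 2: the clone runs in place of the original for the rest of E,
    # accumulating total reward R.
    for obs in remaining_env:
        clone.observe(obs)
    # Step 3: the original agent is credited with R.
    agent.total_reward += clone.total_reward
    return agent.total_reward

reward_per_obs = lambda obs: float(obs)   # hypothetical reward function
a = ToyAgent(reward_per_obs)
a.observe(1)
a.observe(2)                        # original history O; reward so far: 3
r = tamper_memory(a, [9], [5, 5])   # O' = [9], remainder of E = [5, 5]
assert r == 13.0                    # 3 earned before + 10 earned by the clone
```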

[EDITED: My original comment falsely stated that AIXI-RO avoids dutch-booking, but I no longer think it does. I've edited my reasoning below]

As applied to the Sleeping Beauty problem from the paper, I think this WILL be dutch-booked. If we assume it takes one bit to specify heads/tails, and one to specify which day one wakes on, then the agent will have probabilities

1/2 Heads,

1/4 Tails, wake on Monday

1/4 Tails, wake on Tuesday

Since memory-erasure has the effect of creating a new AIXI with no memories, the betting scenario (in section 3.2) of the paper has the structure of either a single AIXI choosing to take a bet, or two copies of the same AIXI playing a two-person game. RO-AIXI plays Nash equilibria in such scenarios. Say the AIXI has taken bet 9. From the perspective of the current AIXI, let p be the probability that it takes bet 10, and let q be the probability that its clone takes bet 10.

E[u] = 1/2 * ((-15 + 2*eps) + p*(10 + eps)) + 1/2 * ((15 + eps) + p*q*(-20 + 2*eps) + p*(1-q)*(-10 + eps) + q*(1-p)*(-10 + eps))

= (3/2)*eps + 1/2 * (2*p*eps + q*(-10 + eps))

This has the structure of a prisoner's dilemma. In particular, the expected utility of the current AIXI is maximized at p = 1. So both AIXI's will take the bet and incur a sure loss. On the other hand, for this reason the original AIXI A would not take the bet 9 on Sunday, if given the choice.
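The simplification can be sanity-checked numerically (just verifying the algebra above, with eps as a small parameter):

```python
# Check that the expanded expected-utility expression equals the
# simplified form (3/2)*eps + 1/2*(2*p*eps + q*(-10 + eps)).
def eu_full(p, q, eps):
    return (0.5 * ((-15 + 2 * eps) + p * (10 + eps))
            + 0.5 * ((15 + eps)
                     + p * q * (-20 + 2 * eps)
                     + p * (1 - q) * (-10 + eps)
                     + q * (1 - p) * (-10 + eps)))

def eu_simplified(p, q, eps):
    return 1.5 * eps + 0.5 * (2 * p * eps + q * (-10 + eps))

for p in (0.0, 0.3, 1.0):
    for q in (0.0, 0.7, 1.0):
        assert abs(eu_full(p, q, 0.01) - eu_simplified(p, q, 0.01)) < 1e-12
```

Note the coefficient on p in the simplified form is eps > 0, which is why the current AIXI's expected utility is maximized at p = 1 regardless of q.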

Comment by interstice on Exorcizing the Speed Prior? · 2018-07-24T23:37:54.165Z · score: 1 (1 votes) · LW · GW

You could think of the 'advice' given by evolution being in the form of a short program, e.g. for a neural-net-like learning algorithm. In this case, a relatively short string of advice could result in a lot of apparent optimization.

(For the book example: imagine a species that outputs books of 20Gb containing only the letter 'a'. This is very unlikely to be produced by random choice, yet it can be specified with only a few bits of 'advice')

Comment by interstice on Physics has laws, the Universe might not · 2018-06-16T19:02:36.404Z · score: 3 (1 votes) · LW · GW

I largely agree with your conception. That's sort of why I put scare quotes around exist -- I was talking about universes for which there is NO finite computational description, which (I think) is what the OP was talking about. I think it would basically be impossible for us to reason about such universes, so to say that they 'exist' is kind of strange.

Comment by interstice on Physics has laws, the Universe might not · 2018-06-14T18:31:42.618Z · score: 5 (2 votes) · LW · GW

The idea of a universe "without preset laws" seems strange to me. Say for example that you take your universe to be a uniform distribution over strings of length n. This "universe" might be highly chaotic, but it still has an orderly short description -- namely, as the uniform distribution. More generally, for us to even SPEAK about "a toy universe" coherently, we need to give some sort of description of that universe, which basically functions as the laws of that universe(probabilistic laws are still laws). So even if such universes "exist"(whatever that means), we couldn't speak or reason about them in any way, let alone run computer simulations of them.

Comment by interstice on Beyond Astronomical Waste · 2018-06-08T20:02:50.167Z · score: 8 (4 votes) · LW · GW

The weight could be something like the algorithmic probability over strings, in which case universes like ours with a concise description would get a fairly large chunk of the weight.

Comment by interstice on The simple picture on AI safety · 2018-05-28T16:50:54.670Z · score: 21 (7 votes) · LW · GW

Couldn't you say the same thing about basically any problem? "Problem X is really quite simple. It can be distilled down to these steps: 1. Solve problem X. There, wasn't that simple?"

Comment by interstice on Open question: are minimal circuits daemon-free? · 2018-05-08T23:31:32.676Z · score: 3 (1 votes) · LW · GW

By "predict sufficiently well" do you mean "predict such that we can't distinguish their output"?

Unless the noise is of a special form, can't we distinguish $f$ and $\tilde{f}$ by how well they do on $f$'s goals? It seems like for this not to be the case, the noise would have to be of the form "occasionally do something weak which looks strong to weaker agents". But then we could get this distribution by using a weak (or intermediate) agent directly, which would probably need less compute.

Comment by interstice on Open question: are minimal circuits daemon-free? · 2018-05-07T18:24:49.696Z · score: 13 (3 votes) · LW · GW

Don't know if this counts as a 'daemon', but here's one scenario where a minimal circuit could plausibly exhibit optimization we don't want.

Say we are trying to build a model of some complex environment containing agents, e.g. a bunch of humans in a room. The fastest circuit that predicts this environment will almost certainly devote more computational resources to certain parts of the environment, in particular the agents, and will try to skimp as much as possible on less relevant parts such as chairs, desks etc. This could lead to 'glitches in the matrix' where there are small discrepancies from what the agents expect.

Finding itself in such a scenario, a smart agent could reason: "I just saw something that gives me reason to believe that I'm in a small-circuit simulation. If it looks like the simulation is going to be used for an important decision, I'll act to advance my interests in the real world; otherwise, I'll act as though I didn't notice anything".

In this way, the overall simulation behavior could be very accurate on most inputs, only deviating in the cases where it is likely to be used for an important decision. In effect, the circuit is 'colluding' with the agents inside it to minimize its computational costs. Indeed, you could imagine extreme scenarios where the smallest circuit instantiates the agents in a blank environment with the message "you are inside a simulation; please provide outputs as you would in environment [X]". If the agents are good at pretending, this could be quite an accurate predictor.

Comment by interstice on On exact mathematical formulae · 2018-04-24T21:26:11.147Z · score: 9 (3 votes) · LW · GW

re: differential equation solutions, you can compute each solution to within epsilon, for any epsilon > 0, which I feel is "morally the same" as knowing if they are equal.

It's true that the concepts are not identical. I feel computability is like the "limit" of the "explicit" concept, as a community of mathematicians comes to accept more and more ways of formally specifying a number. The correspondence is still not perfect, as different families of explicit formulae will have structure (e.g. algebraic structure) that general Turing machines will not.

Comment by interstice on On exact mathematical formulae · 2018-04-23T00:36:44.785Z · score: 8 (5 votes) · LW · GW

While the concept of explicit solution can be interpreted messily, as in the quote above, there is a version of this idea that more closely cuts reality at the joints, computability. A real number is computable iff there is a Turing machine that outputs the number to any desired accuracy. This covers fractions, roots, implicit solutions, integrals, and, if you believe the Church-Turing thesis, anything else we will be able to come up with.
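To spell out the definition with a sketch I'm adding for illustration: a real like the square root of 2 is computable because a single short procedure outputs it to any requested accuracy, here via interval bisection over exact rationals.

```python
from fractions import Fraction

def sqrt2_approx(eps: Fraction) -> Fraction:
    """Return a rational within eps of sqrt(2), by bisecting [1, 2].

    One fixed procedure serves every accuracy request, which is
    exactly what makes sqrt(2) a computable real.
    """
    lo, hi = Fraction(1), Fraction(2)
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo

x = sqrt2_approx(Fraction(1, 10**6))
# |x - sqrt(2)| < eps and x + sqrt(2) < 4, so |x^2 - 2| < 4 * eps.
assert abs(x * x - 2) < Fraction(4, 10**6)
```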

Comment by interstice on Announcing the AI Alignment Prize · 2018-01-03T22:32:25.086Z · score: 3 (1 votes) · LW · GW

Hope it's not too late, but I also meant for this post (linked in original) to be part of my entry:

Comment by interstice on Announcing the AI Alignment Prize · 2017-12-31T20:18:32.437Z · score: 3 (2 votes) · LW · GW


Comment by interstice on Please Help: How to make a big improvement in the alignment of political parties’ incentives with the public interest? · 2017-01-18T00:53:44.250Z · score: 2 (2 votes) · LW · GW

Dominic Cummings asks for help in aligning the incentives of political parties. Thought this might be of interest, as aligning incentives is a common topic of discussion here, and Dominic is someone with political power (he ran the Leave campaign for Brexit), so giving him suggestions might be a good opportunity to see some of the ideas here actually implemented.

Comment by interstice on Deliberate Grad School · 2015-10-05T17:56:44.397Z · score: 1 (1 votes) · LW · GW

I think the idea is that you're supposed to deduce the last name and domain name from identifying details in the post.

Comment by interstice on Beyond Statistics 101 · 2015-06-26T16:41:12.366Z · score: 6 (6 votes) · LW · GW

What resources would you recommend for learning advanced statistics?

Comment by interstice on Help needed: nice AIs and presidential deaths · 2015-06-08T21:42:02.145Z · score: 1 (1 votes) · LW · GW

How about you ask the AI "if you were to ask a counterfactual version of you who lives in a world where the president died, what would it advise you to do?". This counterfactual AI is motivated to take nice actions, so it would advise the real AI to take nice actions as well, right?