Halifax SSC Meetup -- FEB 8 2020-02-08T00:45:37.738Z · score: 4 (1 votes)
HALIFAX SSC MEETUP -- FEB. 1 2020-01-31T03:59:05.110Z · score: 4 (1 votes)
SSC Halifax Meetup -- January 25 2020-01-25T01:15:13.090Z · score: 4 (1 votes)
Clarifying The Malignity of the Universal Prior: The Lexical Update 2020-01-15T00:00:36.682Z · score: 21 (9 votes)
Halifax SSC Meetup -- Saturday 11/1/20 2020-01-10T03:35:48.772Z · score: 4 (1 votes)
Recent Progress in the Theory of Neural Networks 2019-12-04T23:11:32.178Z · score: 64 (22 votes)
Halifax Meetup -- Board Games 2019-04-15T04:00:02.799Z · score: 4 (2 votes)
Predictors as Agents 2019-01-08T20:50:49.599Z · score: 12 (8 votes)
Formal Models of Complexity and Evolution 2017-12-31T20:17:46.513Z · score: 12 (4 votes)
A Candidate Complexity Measure 2017-12-31T20:15:39.629Z · score: 29 (9 votes)
Please Help: How to make a big improvement in the alignment of political parties’ incentives with the public interest? 2017-01-18T00:51:56.355Z · score: 5 (4 votes)


Comment by interstice on Reality-Revealing and Reality-Masking Puzzles · 2020-01-17T04:55:04.423Z · score: 14 (8 votes) · LW · GW

The post mentions problems that encourage people to hide reality from themselves. I think that constructing a 'meaningful life narrative' is a pretty ubiquitous such problem. For the majority of people, constructing a narrative where their life has intrinsic importance is going to involve a certain amount of self-deception.

Some of the problems that come from the interaction between these sorts of narratives and learning about x-risks have already been mentioned. To me, however, it looks like some of the AI x-risk memes themselves are partially the result of reality-masking optimization with the goal of increasing the perceived meaningfulness of the lives of people working on AI x-risk. As an example, consider the ongoing debate about whether we should expect the field of AI to mostly solve x-risk on its own. Clearly, if the field can't be counted upon to avoid the destruction of humanity, this greatly increases the importance of outside researchers trying to help it. So to satisfy their emotional need to feel that their actions have meaning, outside researchers have a bias towards thinking that the field is more incompetent than it is, and towards coming up with and propagating memes justifying that conclusion. People who are already in insider institutions have the opposite bias, so it makes sense that this debate divides to some extent along these lines.

From this perspective, it's no coincidence that internalizing some x-risk memes leads people to feel that their actions are meaningless. Since the memes are partially optimized to increase the perceived meaningfulness of the actions of a small group of people, by necessity they will decrease the perceived meaningfulness of everyone else's actions.

(Just to be clear, I'm not saying that these ideas have no value, that this is being done consciously, or that the originators of said memes are 'bad'; this is a pretty universal human behavior. Nor would I endorse bringing up these motives in an object-level conversation about the issues. However, since this post is about reality-masking problems, it would seem remiss not to mention it.)

Comment by interstice on Clarifying The Malignity of the Universal Prior: The Lexical Update · 2020-01-16T20:44:36.073Z · score: 1 (1 votes) · LW · GW

Thanks, that makes sense. Here is my rephrasing of the argument:

Let the 'importance function' take as inputs machines and , and output all places where is being used as a universal prior, weighted by their effect on -short programs. Suppose for the sake of argument that there is some short program computing ; this is probably the most 'natural' program of this form that we could hope for.

Even given such a program, we'll still lose to the aliens: in , directly specifying our important decisions on Earth using will require both and to be fed into , costing bits, then bits to specify us. For the aliens, getting them to be motivated to control -short programs costs bits, but then they can skip directly to specifying us given , so they save bits over the direct explanation. So the lexical update works.

(I went wrong in thinking that the aliens would need to both update their notion of importance to match ours *and* locate our world; but if we assume the 'importance function' exists then the aliens can just pick out our world using our notion of importance)

Comment by interstice on Clarifying The Malignity of the Universal Prior: The Lexical Update · 2020-01-16T01:04:31.901Z · score: 1 (1 votes) · LW · GW

What you say seems right, given that we are specifying a world, then important predictions, then U'. I was assuming a different method for specifying where we are, as follows:

For the sake of establishing whether or not the lexical update was a thing, I assume that there exists some short program Q which, given as input a description of a language X, outputs a distribution over U-important places where X is being used to make predictions. Given that Q exists, and (importantly) has a short description in U', I think the shortest way of picking out "our world" in U' would be feeding a description of U' into Q then sampling worlds from the distribution Q(U'). This would then imply that U'(our world) is equal to U'(our world, someone making predictions, U'), because the shortest way of describing our world involves specifying Q and U' anyway. The aliens can't make the 'lexical update' here because this is about their preferences, not beliefs. (That is, since they know U', they could find our world knowing only Q; but this wouldn't affect their motives to do so because we're assuming their motives are tied to simplicity in U', where our world still requires finding an up-front specification of U' within U'.)

That said, it seems like maybe I am cheating by assuming Q has a short description in U'; a more natural assumption might be a program Q(X, Y) which outputs X-important places where Y is being used to make predictions. I will think about this more.

Comment by interstice on Machine Learning Can't Handle Long-Term Time-Series Data · 2020-01-05T06:12:34.177Z · score: 9 (6 votes) · LW · GW

Today's neural networks definitely have problems solving more 'structured' problems, but I don't think that 'neural nets can't learn long time-series data' is a good way of framing this. To go through your examples:

This shouldn’t have been a major issue, except that with each switch it discarded past observations. Had the car maintained this history it would have seen that some sort of large object was progressing across the street on a collision course, and had plenty of time to stop.

From a brief reading of the report, it sounds like this control logic is part of the system surrounding the neural network, not the network itself.

One network predicts the odds of winning and another network figures out which move to perform. This turns a time-series problem (what strategy to perform) into a two separate stateless[1] problems.

I don't see how you think this is 'stateless'. AlphaStar's architecture contains an LSTM ('Core') which is then fed into the value and move networks, similar to most time-series applications of neural networks.

Most conspicuously, human beings know how to build walls with buildings. This requires a sequence of steps that don’t generate a useful result until the last of them are completed. A wall is useless until the last building is put into place. AlphaStar (the red player in the image below) does not know how to build walls.

But the network does learn how to build its economy, which also doesn't pay off for a very long time. I think the issue here is more about a lack of 'reasoning' skills than time-scales: the network can't think conceptually, and so doesn't know that a wall needs to completely block off an area to be useful. It just learns a set of associations.

ML can generate classical music just fine but can’t figure out the chorus/verse system used in rock & roll.

MuseNet was trained from scratch on MIDI data, but it's still able to generate music with lots of structure on both short and long time scales. GPT2 does the same for text. I'm not sure if MuseNet is able to generate chorus/verse structures in particular, but again this seems more like an issue of lack of logic/concepts than time scales. (That is, MuseNet can make pieces that 'sound right' but has no conceptual understanding of their structure.)

I'll note that AlphaStar, GPT2, and MuseNet all use the Transformer architecture, which seems quite effective for structured time-series data. I think this is because its attentional mechanism lets it zoom in on the relevant parts of past experiences.

I also don't see how connectome-specific-waves are supposed to help. I think (?) your suggestion is to store slow-changing data in the largest eigenvectors of the Laplacian -- but why would this be an improvement? It's already the case (by the nature of the matrix) that the largest eigenvectors of e.g. an RNN's transition matrix will tend to store data for longer time periods.
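To illustrate the last point, here is a minimal sketch, assuming a purely linear RNN with no input after the first step and reading 'largest eigenvectors' as the eigenvectors whose eigenvalues are closest to 1 in magnitude. The matrix `W`, its eigenvalues 0.99 and 0.5, and the 50-step horizon are all made up for illustration.

```python
import numpy as np

# Hypothetical 2-unit linear RNN, h_{t+1} = W @ h_t, with no input after t = 0.
# W is built to have eigenvalues 0.99 and 0.5 along known eigenvectors.
eigvals = np.array([0.99, 0.5])
V = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2.0)   # columns are orthonormal eigenvectors
W = V @ np.diag(eigvals) @ V.T

h = V @ np.array([1.0, 1.0])   # equal weight on both eigenvectors at t = 0
steps = 50
for _ in range(steps):
    h = W @ h

# Coordinates of the final state in the eigenbasis: each one has been
# multiplied by its eigenvalue once per step.
coords = V.T @ h
slow, fast = coords
```

After 50 steps the component along the 0.99 eigenvector is still about 0.6 of its initial size, while the component along the 0.5 eigenvector has decayed to essentially zero, so the slowly decaying direction is where long-lived information ends up.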

Comment by interstice on romeostevensit's Shortform · 2020-01-01T23:35:02.105Z · score: 12 (4 votes) · LW · GW

Steroids do fuck a bunch of things up, like fertility, so they make evolutionary sense. This suggests we should look to potentially dangerous or harmful alterations to get real IQ boosts. Greg Cochran has a post suggesting gout might be like this.

Comment by interstice on Understanding Machine Learning (I) · 2019-12-22T00:16:39.115Z · score: 3 (2 votes) · LW · GW

This seems much too strong; lots of interesting unsolved problems can be cast as i.i.d. Video classification, for example, can be cast as i.i.d., where the distribution is over different videos, not individual frames.

Comment by interstice on Free Speech and Triskaidekaphobic Calculators: A Reply to Hubinger on the Relevance of Public Online Discussion to Existential Risk · 2019-12-21T23:47:56.411Z · score: 22 (6 votes) · LW · GW

In the analogy, it's only possible to build a calculator that outputs the right answer on non-13 numbers because you already understand the true nature of addition. It might be more difficult if you were confused about addition, and were trying to come up with a general theory by extrapolating from known cases -- then, thinking 6 + 7 = 15 could easily send you down the wrong path. In the real world, we're similarly confused about human preferences, mind architecture, the nature of politics, etc., but some of the information we might want to use to build a general theory is taboo. I think that some of these questions are directly relevant to AI -- e.g. the nature of human preferences is relevant to building an AI to satisfy those preferences, the nature of politics could be relevant to reasoning about what the lead-up to AGI will look like, etc.

Comment by interstice on What determines the balance between intelligence signaling and virtue signaling? · 2019-12-16T03:54:40.161Z · score: 1 (1 votes) · LW · GW

Fair point, but I note that the cooperative ability only increases fitness here because it boosts the individuals' status, i.e. they are in a situation where status-jockeying and cooperative behavior are aligned. Of course it's true that they *are* often so aligned.

Comment by interstice on Many Turing Machines · 2019-12-15T22:39:37.941Z · score: 1 (1 votes) · LW · GW

I agree it's hard to get the exact details of the MUH right, but pretty much any version seems better to me than 'only observable things exist' for the reasons I explained in my comment. And pretty much any version endorses many-worlds (of course you can believe many-worlds without believing MUH). Really this is just a debate about the meaning of the word 'exist'.

Comment by interstice on What determines the balance between intelligence signaling and virtue signaling? · 2019-12-12T07:11:29.702Z · score: 4 (2 votes) · LW · GW

Ability to cooperate is important, but I think that status-jockeying is a more 'fundamental' advantage because it gives an advantage to individuals, not just groups. Any adaptation that aids groups must first be useful enough to individuals to reach fixation (or near-fixation) in some groups.

Comment by interstice on Many Turing Machines · 2019-12-10T20:53:45.197Z · score: 6 (4 votes) · LW · GW

You've essentially re-invented the Mathematical Universe Hypothesis, which many people around here do in fact believe.

For some intuition as to why people would think that things that can't ever affect our future experiences are 'real', imagine living in the distant past and watching your relatives travel to a distant land, and assume that long-distance communication such as writing is impossible. You would probably still care about them and think they are 'real', even though by your definition they no longer exist to you. Or if you want to quibble about the slight chance of seeing them again, imagine being in the future and watching them get on a spaceship which will travel beyond your observable horizon. Again, it seems like you would still care about them and consider them 'real'.

Comment by interstice on What determines the balance between intelligence signaling and virtue signaling? · 2019-12-09T06:39:56.107Z · score: 4 (2 votes) · LW · GW

Do you agree that signalling intelligence is the main explanation for the evolution of language? To me, it seems like coalition-building is a more fundamental driving force (after all, being attracted to intelligence only makes sense if intelligence is already valuable in some contexts, and coalition politics seems like an especially important domain). Miller has also argued that sexual signalling is a main explanation of art and music, which Will Buckingham has a good critique of here.

Comment by interstice on Q&A with Shane Legg on risks from AI · 2019-12-09T06:11:06.489Z · score: 2 (2 votes) · LW · GW

As far as I know no one's tried to build a unified system with all of those capacities, but we do seem to have rudimentary learned versions of each of the capacities on their own.

Comment by interstice on Recent Progress in the Theory of Neural Networks · 2019-12-06T21:18:53.425Z · score: 1 (1 votes) · LW · GW

That's an interesting link. It sounds like the results can only be applied to strictly Bayesian methods though, so they couldn't be applied to neural networks as they exist now.

Comment by interstice on Recent Progress in the Theory of Neural Networks · 2019-12-06T04:16:54.046Z · score: 3 (2 votes) · LW · GW

Not sure if I agree regarding the real-world usefulness. For the non-IID case, PAC-Bayes bounds fail, and to re-instate them you'd need assumptions about how quickly the distribution changes, but then it's plausible that you could get high probability bounds based on the most recent performance. For small datasets, the PAC-Bayes bounds suffer because they scale as . (I may edit the post to be clearer about this)

Agreed that analyzing how the bounds change under different conditions could be insightful though. Ultimately I suspect that effective bounds will require powerful ways to extract 'the signal from the noise', and examining the signal will likely be useful for understanding if a model has truly learned what it is supposed to.

Comment by interstice on Understanding “Deep Double Descent” · 2019-12-06T02:15:32.628Z · score: 13 (5 votes) · LW · GW

The neural tangent kernel guys have a paper where they give a heuristic argument explaining the double descent curve (in number of parameters) using the NTK.

Comment by interstice on Understanding “Deep Double Descent” · 2019-12-06T02:11:39.094Z · score: 13 (6 votes) · LW · GW

Nice survey. The result about double descent even occurring in dataset size is especially surprising.

Regarding the 'sharp minima can generalize' paper, they show that there exist sharp minima with good generalization, not flat minima with poor generalization, so they don't rule out flatness as an explanation for the success of SGD. The sharp minima they construct with this property are also rather unnatural: essentially they multiply the weights of layer 1 by a constant $\alpha$ and divide the weights of layer 2 by the same constant. The piecewise linearity of ReLU means the output function is unchanged. For large $\alpha$, the network is now highly sensitive to perturbations in layer 2. These solutions don't seem like they would be found by SGD, so it could still be that, for solutions found by SGD, flatness and generalization are correlated.
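The rescaling construction is easy to check numerically. Below is a minimal sketch, assuming a two-layer ReLU network without biases; the layer sizes, the random weights, the constant `alpha = 1000`, and the perturbation size are all arbitrary choices for illustration, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def net(x, W1, W2):
    # Two-layer ReLU network without biases.
    return W2 @ relu(W1 @ x)

W1 = rng.normal(size=(64, 4))
W2 = rng.normal(size=(1, 64))
x = rng.normal(size=4)

# Multiply layer 1 by alpha and divide layer 2 by alpha.
alpha = 1000.0
W1_scaled, W2_scaled = alpha * W1, W2 / alpha

# ReLU is positively homogeneous, so the output function is unchanged.
same_output = np.allclose(net(x, W1, W2), net(x, W1_scaled, W2_scaled))

# Apply the same small perturbation to layer 2 in both networks.
delta = 1e-3 * rng.normal(size=W2.shape)
change_original = abs(net(x, W1, W2 + delta) - net(x, W1, W2)).item()
change_scaled = abs(net(x, W1_scaled, W2_scaled + delta)
                    - net(x, W1_scaled, W2_scaled)).item()

# The rescaled network reacts alpha times more strongly to the perturbation.
ratio = change_scaled / change_original
```

The two networks compute the same function, yet the rescaled one is `alpha` times more sensitive to the same change in its second-layer weights, which is the sense in which the constructed minimum is 'sharp'.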

Comment by interstice on Insights from the randomness/ignorance model are genuine · 2019-11-14T17:47:33.460Z · score: 1 (1 votes) · LW · GW

No, I mean Beauty's subjective credence that the coin came up heads. That should be 1/2 by the nature of a coin flip. Then, if the coin comes up tails, you need 1 bit to select between the subjectively identical states of waking up on Monday or Tuesday. So in total:

P(heads, Monday) = 1/2,

P(tails, Monday) = 1/4

P(tails, Tuesday) = 1/4

(EDIT: actually this depends on how difficult it is to locate memories on Monday vs. Tuesday, which might be harder given that your memory has been erased. I think that for 'natural' ways of locating your consciousness it should be close to 1/2, 1/4, 1/4 though.)
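The bit-counting above can be written out as a short calculation. This is only a sketch of the accounting in this comment, assuming each state is weighted by 2^(-bits) and that the costs are exactly 1 bit for the coin flip plus 1 further bit to pick the day after tails; the dictionary `bits` is a made-up encoding of that assumption.

```python
from fractions import Fraction

# Assumed description costs, in bits, following the comment's accounting:
# 1 bit for the coin flip, plus 1 more bit to pick the day after tails.
bits = {
    ("heads", "Monday"): 1,
    ("tails", "Monday"): 2,
    ("tails", "Tuesday"): 2,
}

# Weight each state by 2^(-bits), then normalise.
weights = {state: Fraction(1, 2 ** b) for state, b in bits.items()}
total = sum(weights.values())
probs = {state: w / total for state, w in weights.items()}

# Total credence in heads, summed over the days on which Beauty wakes.
p_heads = sum(p for (coin, _), p in probs.items() if coin == "heads")
```

With these costs the weights already sum to 1, and the credence in heads comes out to 1/2. If locating a Tuesday memory cost more bits than a Monday one, as the EDIT suggests it might, the split between the two tails states would shift accordingly.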

Comment by interstice on Insights from the randomness/ignorance model are genuine · 2019-11-14T03:35:06.331Z · score: 1 (1 votes) · LW · GW

Right, I knew that many people had since moved on to UDT due to limitations of UDASSA for decision-making. What I meant was that UDASSA seems to be satisfactory at resolving the typical questions about anthropic probabilities, setting aside decision theory/noncomputability issues.

I agree it would be nice to have all this information in a readily accessible place. Maybe the posts setting out the ideas and later counter-arguments could be put in a curated sequence.

Comment by interstice on Insights from the randomness/ignorance model are genuine · 2019-11-14T02:28:18.342Z · score: 1 (1 votes) · LW · GW

Yeah, I also had similar ideas for solving anthropics a few years ago, and was surprised when I learned that UDASSA had been around for so long. At least you can take pride in having found the right answer independently.

I think that UDASSA gives P(heads) = 1/2 on the Sleeping Beauty problem due to the way it weights different observer-moments, proportional to 2^(-description length). This might seem a bit odd, but I think it's necessary to avoid problems with Boltzmann brains and the like.

Comment by interstice on [Link] John Carmack working on AGI · 2019-11-14T00:18:11.553Z · score: 8 (4 votes) · LW · GW

Here it is:

Starting this week, I’m moving to a “Consulting CTO” position with Oculus.
I will still have a voice in the development work, but it will only be consuming a modest slice of my time.
As for what I am going to be doing with the rest of my time: When I think back over everything I have done across games, aerospace, and VR, I have always felt that I had at least a vague “line of sight” to the solutions, even if they were unconventional or unproven. I have sometimes wondered how I would fare with a problem where the solution really isn’t in sight. I decided that I should give it a try before I get too old.
I’m going to work on artificial general intelligence (AGI).
I think it is possible, enormously valuable, and that I have a non-negligible chance of making a difference there, so by a Pascal’s Mugging sort of logic, I should be working on it.
For the time being at least, I am going to be going about it “Victorian Gentleman Scientist” style, pursuing my inquiries from home, and drafting my son into the work.
Runner up for next project was cost effective nuclear fission reactors, which wouldn’t have been as suitable for that style of work. 😊

Comment by interstice on Insights from the randomness/ignorance model are genuine · 2019-11-14T00:13:11.480Z · score: 1 (1 votes) · LW · GW

It's true that UDASSA is tragically underrated, given that (it seems to me) it provides a satisfactory resolution to all anthropic problems. I think this might be a situation where people tend to leave the debate and move on to something else when they seem to have found a satisfactory position, like how most LW people don't bother arguing about whether god exists anymore.

Comment by interstice on Insights from the randomness/ignorance model are genuine · 2019-11-13T22:57:51.918Z · score: 6 (2 votes) · LW · GW

This seems like a step backwards from UDASSA, another potential solution to many anthropic problems. UDASSA has a completely formal specification, while this model relies on a somewhat unclear verbal definition. So you need to know the 'relative frequency' with which H happens. But what are we averaging over here? Our universe? All possible universes? If uncertain about which universe we are in, how should we average over the different universes? What if we are reasoning about an event which, as far as we know, will only happen once?

Comment by interstice on Honoring Petrov Day on LessWrong, in 2019 · 2019-09-28T00:50:54.455Z · score: 16 (6 votes) · LW · GW

Can't tell if joking, but they probably mean that they were "actually in the mafia" in the game, so not in the real-world mafia.

Comment by interstice on [deleted post] 2019-09-18T14:47:18.133Z

A better system won't just magically form itself after the existing system has been destroyed. In all likelihood what will form will be either a far more corrupt and oligarchical system, or no system at all. I think a better target for intervention would be attempting to build superior alternatives so that something is available when the existing systems start to fail. In education for example, Lambda School is providing a better way for many people to learn programming than college.

Note also that existing systems of power are very big, so efforts to damage them probably have low marginal impact. Building initially small new things can have much higher marginal impact. If the systems are as corrupt as you think they are, they should destroy themselves on their own in any case.

Comment by interstice on Halifax Meetup -- Board Games · 2019-09-02T16:29:51.927Z · score: 5 (3 votes) · LW · GW

Update -- We're going to have a meetup on September 21st at Uncommon Grounds (1030 South Park Street). This is going to be posted on SlateStarCodex as part of the worldwide meetup event.

Comment by interstice on Self-supervised learning & manipulative predictions · 2019-08-22T19:47:46.296Z · score: 1 (1 votes) · LW · GW

No worries, I also missed the earlier posts when I wrote mine. There's lots of stuff on this website.

I endorse your rephrasing of example 1. I think my position is that it's just not that hard to create a "self-consistent probability distribution". For example, say you trained an RNN to predict sequences, like in this post. Despite being very simple, it already implicitly represents a probability distribution over sequences. If you train it with back-propagation on a confusing article involving pyrite, then its weights will be updated to try to model the article better. However, if "pyrite" itself was easy to predict, then the weights that lead to it outputting "pyrite" will *not* be updated. The same thing holds for modern Transformer networks, which predict the next token based only on what they have seen so far. (Here is a paper with a recent example using GPT-2. Note the degeneracy of maximum likelihood sampling, but how this becomes less of a problem when just sampling from the implied distribution.)

I agree that this sort of manipulative prediction could be a problem in principle, but it does not seem to occur in recent ML systems. (Although, there are some things which are somewhat like this; the earlier paper I linked and mode collapse do involve neglecting high-entropy components of the distribution. However, the most straightforward generation and training schemes do not incentivize this)

For example 2, the point about gradient descent is this: while it might be the case that outputting "Help I'm stuck in a GPU Factory000" would ultimately result in a higher accuracy, the way the gradient is propagated would not encourage the agent to behave manipulatively. This is because, *locally*, "Help I'm stuck in a GPU Factory" decreases accuracy, so that behavior (or policies leading to it) will be dis-incentivized by gradient descent. It may be the case that this will result in easier predictions later, but the structure of the reward function does not lead to any optimization pressure towards such manipulative strategies. Learning taking place over high-level abstractions doesn't change anything, because any high-level abstractions leading to locally bad behavior will likewise be dis-incentivized by gradient descent.

Comment by interstice on Self-supervised learning & manipulative predictions · 2019-08-20T14:44:33.299Z · score: 5 (3 votes) · LW · GW

Example 1 basically seems to be the problem of output diversity in generative models. This can be a problem in generative models, but there are ways around it. e.g. instead of outputting the highest-probability individual sequence, which will certainly look "manipulative" as you say, sample from the implied distribution over sequences. Then the sentence involving "pyrite" will be output with probability proportional to how likely the model thinks "pyrite" is on its own, disregarding subsequent tokens.
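Here is a toy sketch of the difference between the two decoding strategies. The two-token model below is entirely hypothetical: token "A" stands in for a word like "pyrite" that is likely on its own but followed by hard-to-predict text, and token "B" for a less likely word followed by perfectly predictable text.

```python
import random
from collections import Counter

# Hypothetical toy model over two-token sequences.
first_token = {"A": 0.6, "B": 0.4}
second_token = {
    "A": {f"a{i}": 0.1 for i in range(10)},  # hard-to-predict continuation
    "B": {"b": 1.0},                          # trivially predictable continuation
}

# Probability of each full sequence under the model.
seq_prob = {
    (t1, t2): p1 * p2
    for t1, p1 in first_token.items()
    for t2, p2 in second_token[t1].items()
}
# Decoding strategy 1: output the single most probable sequence.
most_likely_sequence = max(seq_prob, key=seq_prob.get)

def sample_sequence(rng):
    # Decoding strategy 2 (ancestral sampling): draw each token from its
    # conditional distribution given the tokens so far.
    t1 = rng.choices(list(first_token), weights=list(first_token.values()))[0]
    options = second_token[t1]
    t2 = rng.choices(list(options), weights=list(options.values()))[0]
    return t1, t2

rng = random.Random(0)
n_samples = 10_000
counts = Counter(sample_sequence(rng)[0] for _ in range(n_samples))
freq_A = counts["A"] / n_samples
```

Picking the single most probable sequence always starts with "B", because its continuation is easy to predict. Sampling instead starts with "A" about 60% of the time, matching how likely the model thinks "A" is on its own.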

For example 2, I wrote a similar post a few months ago (and in fact, this idea seems to have been proposed and forgotten a few times on LW). But for gradient descent-based learning systems, I don't think the effect described will take place.

The reason is that gradient-descent-based systems are only updated towards what they actually observe. Let's say we're training a system to predict EU laws. If it predicts "The EU will pass potato laws..." but sees "The EU will pass corn laws..." the parameters will be updated to make "corn" more likely to have been output than "potato". There is no explicit global optimization for prediction accuracy.
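As a minimal sketch of that update, assume the model's only parameters are the logits over a three-word vocabulary and that it is trained with log loss; the vocabulary, the starting logits, and the learning rate below are made up for illustration.

```python
import numpy as np

vocab = ["potato", "corn", "wheat"]
logits = np.array([2.0, 0.5, 0.0])   # hypothetical model currently favours "potato"

def softmax(z):
    z = z - z.max()                  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

target = vocab.index("corn")         # the token actually observed
probs_before = softmax(logits)

# Gradient of the log loss -log p[target] with respect to the logits
# is (predicted probabilities - one-hot target).
grad = probs_before.copy()
grad[target] -= 1.0

# One gradient descent step.
learning_rate = 0.5
logits_after = logits - learning_rate * grad
probs_after = softmax(logits_after)
```

The step raises the probability of "corn" and lowers that of "potato". The gradient depends only on the current prediction and the observed token, so nothing in it rewards outputs that would make later observations easier to predict.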

As you train to convergence, the predictions of the model will attempt to approach a fixed point, a set of predictions that imply themselves. However, due to the local nature of the update, this fixed-point will not be selected to be globally minimal; it will just be the first minimum the model falls into. (This is different from the problems with "local minima" you may have heard about in ordinary neural network training -- those go away in the infinite-capacity limit, whereas local minima among fixed-points do not.) The fixed-point should look something like "what I would predict if I output [what I would predict if I output [what I would predict .. ]]]" where the initial prediction is some random gibberish. This might look pretty weird, but it's not optimizing for global prediction accuracy.

Comment by interstice on Halifax Meetup -- Board Games · 2019-08-01T02:40:50.248Z · score: 3 (2 votes) · LW · GW

Hey! It did happen. So far there are 3 of us, and we've been meeting up pretty regularly. If you're interested I can let you know the next time we're planning to meet up.

Comment by interstice on No Safe AI and Creating Optionality · 2019-06-24T02:33:21.564Z · score: 5 (3 votes) · LW · GW

What this would mean is that we would have to recalibrate our notion of "safe", as whatever definition has been proved impossible does not match our intuitive perception. We consider lots of stuff we have around now to be reasonably safe, although we don't have a formal proof of safety for almost anything.

Comment by interstice on "UDT2" and "against UD+ASSA" · 2019-05-12T16:06:40.403Z · score: 2 (2 votes) · LW · GW

In the mad scientist example, why would your measure for the die landing 0 be 0.91? I think Solomonoff Induction would assign probability 0.1 to that outcome, because you need an extra bits to specify which clone you are. Or is this just meant to illustrate a problem with ASSA, UD not included?

Comment by interstice on Predictors as Agents · 2019-03-25T20:57:47.428Z · score: 1 (1 votes) · LW · GW

Yeah, if you train the algorithm by random sampling, the effect I described will take place. The same thing will happen if you use an RL algorithm to update the parameters instead of an unsupervised learning algorithm (though it seems willfully perverse to do so -- you're throwing away a lot of the structure of the problem by doing this, so training will be much slower).

I also just found an old comment which makes the exact same argument I made here. (Though it now seems to me that argument is not necessarily correct!)

Comment by interstice on Two Small Experiments on GPT-2 · 2019-03-04T23:33:55.601Z · score: 4 (2 votes) · LW · GW

If you literally ran (a powered-up version of) GPT-2 on "A brilliant solution to the AI alignment problem is..." you would get the sort of thing an average internet user would think of as a brilliant solution to the AI alignment problem. Trying to do this more usefully basically leads to Paul's agenda (which is about trying to do imitation learning of an implicit organization of humans)

Comment by interstice on Predictors as Agents · 2019-01-03T21:22:03.357Z · score: 3 (2 votes) · LW · GW

Reflective Oracles are a bit of a weird case because their 'loss' is more like a 0/1 loss than a log loss, so all of the minima are exactly the same. (If we take a sample of 100000 universes to score them, the difference is merely incredibly small instead of 0.) I was being a bit glib referencing them in the article; I had in mind something more like a model parameterizing a distribution over outputs, whose only influence on the world is via a random sample from this distribution. I think that such models should in general have fixed points for similar reasons, but am not sure. Regardless, these models will, I believe, favour fixed points whose distributions are easy to compute. (But not fixed points with low entropy; that is, they will punish logical uncertainty but not intrinsic uncertainty.) I'm planning to run some experiments with VAEs and post the results later.

Comment by interstice on Generalising CNNs · 2019-01-03T03:49:10.172Z · score: 3 (2 votes) · LW · GW

You might be interested in Transformer Networks, which use a learned pattern of attention to route data between layers. They're pretty popular and have been used in some impressive applications like this very convincing image-synthesis GAN.

re: whether this is a good research direction. The fact that neural networks are highly compressible is very interesting and I too suspect that exploiting this fact could lead to more powerful models. However, if your goal is to increase the chance that AI has a positive impact, then it seems like the relevant thing is how quickly our understanding of how to align AI systems progresses, relative to our understanding of how to build powerful AI systems. As described, this idea sounds like it would be more useful for the latter.

Comment by interstice on Predictors as Agents · 2019-01-01T21:19:50.390Z · score: 1 (1 votes) · LW · GW

Is there a reason you think a reflective oracle (or equivalent) can't just be selected "arbitrarily", and will likely be selected to maximize some score?

The gradient descent is not being done over the reflective oracles, it's being done over some general computational model like a neural net. Any highly-performing solution will necessarily look like a fixed-point-finding computation of some kind, due to the self-referential nature of the predictions. Then, since this fixed-point-finder is *internal* to the model, it will be optimized for log loss just like everything else in the model.

That is, the global optimization of the model is distinct from whatever internal optimization the fixed-point-finder uses to choose the reflective oracle. The global optimization will favor internal optimizers that produce fixed-points with good score. So while fixed-point-finders in general won't optimize for anything in particular, the one this model uses will.

Comment by interstice on Announcement: AI alignment prize round 3 winners and next round · 2018-12-31T22:53:30.147Z · score: 3 (2 votes) · LW · GW

I submit Predictors as Agents.

Comment by interstice on Reflective AIXI and Anthropics · 2018-09-30T15:18:25.622Z · score: 1 (1 votes) · LW · GW
If we assume Sleeping Beauty has lots of information, we might expect that the shortest matching program will look like a simulation of physical law plus a "bridging law" that, given this simulation, tells you what symbols get written to the tape

I agree. I still think that the probabilities would be closer to 1/2, 1/4, 1/4. The bridging law could look like this: search over the universe for compact encodings of my memories so far, then see what is written next onto this encoding. In this case, it would take no more bits to specify waking up on Tuesday, because the memories are identical, in the same format, and just slightly later temporally.

In a naturalized setting, it seems like the tricky part would be getting the AIXI on Monday to care what happens after it goes to sleep. It 'knows' that it's going to lose consciousness (it can see that its current memory encoding is going to be overwritten), so its next prediction is undetermined by its world-model. There is one program that will give it the reward of its successor and then terminate, as I described above, but it's not clear why the AIXI would favour that hypothesis. Maybe it would if it has been in situations involving memory-wiping before, or has observed other RO-AIXIs in such situations.

Comment by interstice on Deep learning - deeper flaws? · 2018-09-29T19:19:46.254Z · score: 1 (1 votes) · LW · GW

"I can't make bets on my beliefs about the Eschaton, because they are about the Eschaton." -- Well, it makes sense. Besides, I did offer you a bet taking into account a) that the money may be worth less in my branch b) I don't think DL + RL AGI is more likely than not, just plausible. If you're more than 96% certain there will be no such AI, 20:1 odds are a good deal.

But anyways, I would be fine with betting on a nearer-term challenge. How about -- in 5 years, a bipedal robot that can run on rough terrain, as in this video, using a policy learned from scratch by DL + RL (possibly including a simulated environment during training), at 1:1 odds?

Comment by interstice on Deep learning - deeper flaws? · 2018-09-28T17:07:21.607Z · score: 1 (1 votes) · LW · GW

Hmmm...but if I win the bet then the world may be destroyed, or our environment could change so much the money will become worthless. Would you take 20:1 odds that there won't be DL+RL-based HLAI in 25 years?
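For reference, the confidence at which 20:1 odds break even can be worked out directly. This is only a sketch, and it assumes "20:1" means the person betting against HLAI stakes 20 units to win 1; the function name `break_even_confidence` is made up for illustration.

```python
from fractions import Fraction


def break_even_confidence(odds_against: int) -> Fraction:
    """Probability of winning at which staking `odds_against` units
    to win 1 unit has zero expected value.

    Solves p * 1 - (1 - p) * odds_against = 0 for p.
    """
    return Fraction(odds_against, odds_against + 1)


# Assumed reading of the bet: the skeptic stakes 20 to win 1.
threshold = break_even_confidence(20)
print(threshold, float(threshold))
```

Under that reading the break-even point is 20/21, a little over 95%, so anyone more confident than that that there will be no such AI should expect to come out ahead.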

Comment by interstice on Reflective AIXI and Anthropics · 2018-09-28T16:59:33.354Z · score: 1 (1 votes) · LW · GW

I still don't see how you're getting those probabilities. Say it takes 1 bit to describe the outcome of the coin toss, and assume it's easy to find all the copies of yourself (i.e. your memories) in different worlds. Then you need:

1 bit to specify if the coin landed heads or tails

If the coin landed tails, you need 1 more bit to specify if it's Monday or Tuesday.

So AIXI would give these scenarios P(HM)=0.50, P(TM)=0.25, P(TT)=0.25.
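The arithmetic here is just "each extra bit of description halves the prior weight". A minimal sketch, assuming the bit counts above (1 bit for the coin, 1 more for the day if tails) and ignoring any shared constant overhead; the names `prior_from_bits` and `scenario_bits` are hypothetical.

```python
from fractions import Fraction


def prior_from_bits(bits: int) -> Fraction:
    """Weight of a hypothesis that takes `bits` extra bits to specify."""
    return Fraction(1, 2 ** bits)


# Bits needed to pick out each scenario, per the comment above.
scenario_bits = {
    "HM": 1,  # heads: 1 bit for the coin
    "TM": 2,  # tails + Monday: 1 bit for the coin, 1 for the day
    "TT": 2,  # tails + Tuesday: 1 bit for the coin, 1 for the day
}

probs = {name: prior_from_bits(b) for name, b in scenario_bits.items()}
for name, p in probs.items():
    print(name, float(p))
```

With these bit counts the weights already sum to 1, so no renormalization is needed.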

Comment by interstice on Deep learning - deeper flaws? · 2018-09-27T22:08:06.447Z · score: 1 (1 votes) · LW · GW

Have something in mind?

Comment by interstice on Reflective AIXI and Anthropics · 2018-09-27T20:34:40.935Z · score: 1 (1 votes) · LW · GW

Well, it COULD be the case that the K-complexity of the memory-erased AIXI environment is lower, even when it learns that this happened. The reason for this is that there could be many possible past AIXI's who have their memory erased/altered and end up in the same subjective situation. Then the memory-erasure hypothesis can use the lowest K-complexity AIXI who ends up with these memories. As the AIXI learns more it can gradually piece together which of the potential past AIXI's it actually was and the K-complexity will go back up again.

EDIT: Oh, I see you were talking about actually having a RANDOM memory in the sense of a random sequence of 1s and 0s. Yeah, but this is no different than AIXI thinking that any random process is high K-complexity. In general, and discounting merging, the memory-altering subroutine will increase the complexity of the environment by a constant plus the complexity of whatever transformation you want to apply to the memories.

Comment by interstice on Deep learning - deeper flaws? · 2018-09-27T19:49:47.219Z · score: 2 (3 votes) · LW · GW

Well, the DotA bot pretty much just used PPO. AlphaZero used MCTS + RL, and OpenAI recently got a robot hand to do object manipulation with PPO and a simulator (the simulator was hand-built, but in principle it could be produced by unsupervised learning like in this). Clearly it's possible to get sophisticated behaviors out of pretty simple RL algorithms. It could be the case that these approaches will "run out of steam" before getting to HLAI, but it's hard to tell at the moment, because our algorithms aren't running with the same amount of compute + data as humans (for humans, I am thinking of our entire lifetime experiences as data, which is used to build a cross-domain optimizer).

re: Uber, I agree that at least in the short term most applications in the real world will feature a fair amount of engineering by hand. But the need for this could decrease as more power becomes available, as has been the case in supervised learning.

Comment by interstice on Boltzmann Brains and Within-model vs. Between-models Probability · 2018-09-27T16:29:38.356Z · score: 1 (1 votes) · LW · GW

How do the initial simple conditions relate to the branching? Our universe seems to have had simple initial conditions but there's still been a lot of random branching, right? That is, the universe from our perspective is just one branch of a quantum state evolving simply from simple conditions, so you need O(#branching events) bits to describe it. Incidentally, this undermines Eliezer's argument for MWI based on Solomonoff induction, though MWI is probably still true.

[EDITED: Oh, from one of your other comments I see that you aren't saying the shortest program involves beginning at the start of the universe. That makes sense]

Comment by interstice on Deep learning - deeper flaws? · 2018-09-27T16:11:31.441Z · score: 1 (1 votes) · LW · GW

I agree that you do need some sort of causal structure around the function-fitting deep net. The question is how complex this structure needs to be before we can get to HLAI. It seems plausible to me (at least a 10% chance, say) that it could be quite simple, maybe just consisting of modestly more sophisticated versions of the RL algorithms we have so far, combined with really big deep networks.

Comment by interstice on Reflective AIXI and Anthropics · 2018-09-27T15:51:05.644Z · score: 1 (1 votes) · LW · GW

Incidentally, you can use the same idea to have RO-AIXI do anthropic reasoning/bargaining about observers that are in a broader reference class than 'exact same sense data', by making the mapping O -> O' some sort of coarse-graining.

Comment by interstice on Reflective AIXI and Anthropics · 2018-09-27T15:44:14.392Z · score: 1 (1 votes) · LW · GW

" P(HM)=0.49, P(TM)=0.49, P(TT)=0.2 " -- Are these supposed to be mutually exclusive probabilities?

" There is a turing machine that writes the memory-wiped contents to tape all in one pass. " - Yes, this is basically what I said. ('environment' above could include 'the world' + bridging laws). But you also need to alter the reward structure a bit to make it match our usual intuition of what 'memory-wiping' means, and this has significance for decision theory.

Consider: if your own memory were erased, you would probably still be concerned about what was going to happen to you later. But a regular AIXI won't care about what happens to its memory-wiped clone (i.e. another AIXI inducting on the 'memory-wiped' stream), because they don't share an input channel. So to fix this, you give the original AIXI all of the rewards that its clone ends up getting.

Comment by interstice on Deep learning - deeper flaws? · 2018-09-26T22:36:05.115Z · score: 3 (1 votes) · LW · GW

Okay, but (e.g.) deep RL methods can solve problems that apparently require quite complex causal thinking such as playing DotA. I think what is happening here is that while there is no explicit causal modelling happening at the lowest level of the algorithm, the learned model ends up building something that serves the functions of one because that is the simplest way to solve a general class of problems. See the above meta-RL paper for good examples of this. There seems to be no obvious obstruction to scaling this sort of thing up to human-level causal modelling. Can you point to a particular task needing causal inference that you think these methods cannot solve?

Comment by interstice on Boltzmann Brains and Within-model vs. Between-models Probability · 2018-09-26T15:26:28.038Z · score: 1 (1 votes) · LW · GW

The penalty for specifying where you are in space and time is dwarfed by the penalty for specifying which Everett branch you're in.