Abe Dillon's Shortform 2019-07-31T23:22:06.419Z


Comment by Abe Dillon (abe-dillon) on Human instincts, symbol grounding, and the blank-slate neocortex · 2020-04-12T01:21:11.709Z · LW · GW

Hey, G Gordon Worley III!

I just finished reading this post because Steve2152 was one of the two people (you being the other) to comment on my (accidentally published) post on formalizing and justifying the concept of emotions.

It's interesting to hear that you're looking for a foundational grounding of human values because I'm planning a post on that subject as well. I think you're close with the concept of error minimization. My theory reaches back to the origins of life and what sets living systems apart from non-living systems. Living systems are locally anti-entropic which means: 1) According to the second law of thermodynamics, a living system can never be a truly closed system. 2) Life is characterized by a medium that can gather information such as genetic material.

The second law of thermodynamics means that all things decay, so it's not enough to simply gather information; the system must also preserve the information it gathers. This creates an interesting dynamic because gathering information inherently means encountering entropy (the unknown) which is inherently dangerous (what does this red button do?). It's somewhat at odds with the goal of preserving information. You can even see this fundamental dichotomy manifest in the collective intelligence of the human race playing tug-of-war between conservatism (which is fundamentally about stability and preservation of norms) and liberalism (which is fundamentally about seeking progress or new ways to better society).

Another interesting consequence of the 'telos' of life being to gather and preserve information is that it inherently provides a means of assigning value to information. That is: information is more valuable the more it pertains to the goal of gathering and preserving information. If an asteroid were about to hit Earth and you were chosen to live on a space colony until Earth's atmosphere allowed humans to return and start society anew, you would probably favor taking a 16 GB thumb drive with the entire English Wikipedia article text over a server rack full of several petabytes of high-definition recordings of all the reality television ever filmed, because the latter won't be super helpful toward the goal of preserving knowledge *relevant* to mankind's survival.

The theory also opens interesting discussions like: if all living things have a common goal, why do things like parasites, conflict, and war exist? Also, how has evolution led to a set of instincts that imperfectly approximate this goal? How do we implement this goal in an intelligent system? How do we guarantee such an implementation will not result in conflict? Etc.

Anyway, I hope you'll read it when I publish it and let me know what you think!

Comment by abe-dillon on [deleted post] 2020-04-11T22:51:04.942Z

Thanks for the insight!

This is actually an incomplete draft that I didn't mean to publish, so I do intend to cover some of your points. It's probably not going to go into the depth you're hoping for since it's pretty much just a synthesis of the bit of information from a segment from a Radiolab episode and three theorems about neural networks.

My goal was to simply use those facts to provide an informal proof that a trade-off exists between latency and optimality* in neural networks and that said trade-off explains why some agents (including biological creatures) might use multiple models at different points in that trade-off instead of devoting all their computational resources to one very deep model or one low-latency model. I don't think it's a particularly earth-shattering revelation, but sometimes even pretty straightforward ideas can have an impact**.

I also don't think that subconscious processing is exactly the same as emotions.

The position I present here is a little more subtle than that. It doesn't directly equate subconscious processing to emotions. I state that emotions are: a conscious recognition of physiological processes triggered by faster stimulus-response paths in your nervous system.

The examples given in the podcast focus mostly on fight-or-flight until they get later into the discussion about research on paraplegic subjects. I think that might hint at a hierarchy of emotional complexity. It's easy to explain the most basic 'emotion' that even the most primitive brains should express. As you point out, emotions like guilt are more difficult to explain. I don't know if I can give a satisfactory response to that point because it's beyond my lay understanding, but my best guess is: this feed-back loop from stimulus to response back to stimulus and so on can be initiated from something other than direct sensory input and the information fed back might include more than physiological state.

Each path has some input which propagates through it and results in some output. The output might include more than direct physiological control signals, such as those sent to various muscles. It might include more abstract information, such as a compact representation of the internal state of the path. The input might include more than sensory input. The feedback might be more direct.

For instance, I believe I've read that some parts of the brain receive a copy of recent motor commands which may or may not correspond to physiological change. Along with the indirect feedback from sensors that measure your sweaty palms, the output of a path may directly feed back the command signals to release hormones or to blink eyes or whatever as input to other paths. A path might output signals that don't correspond to any physiological control; they may be specifically meant to be feedback signals that communicate more abstract information.

Another example is: you don't cry at the end of Schindler's List because of any direct sensory input. The emotion arises from a more complex, higher-order cognition of the situation. Perhaps there are abstract outputs from the slower paths that feed back into the faster paths which makes the whole feed-back system more complex and allows higher-order cognition paths to indirectly result in physiological responses that they don't directly control.

Another piece of the puzzle may be that the slowest path, which I, perhaps erroneously, refer to as consciousness, is supposedly where the physiological state triggered by faster paths gets labeled. That slower path almost definitely uses other context to arrive at such a label. A physiological state can have multiple causes. If you've just run a marathon on a cold day, it's unlikely you'll feel frightened when you register an elevated heart rate, sweaty palms, goosebumps, etc.

I lump all those 'faster stimulus-response paths' including reflexes under the umbrella term 'subconscious' which might not be correct. I'm not sure if any of the related fields (neurology, psychology, etc.) have a more precise definition for subconscious. The term used in the podcast is the 'autonomic nervous system' which, according to Google, means: the part of the nervous system responsible for control of the bodily functions not consciously directed, such as breathing, the heartbeat, and digestive processes.

There's a bit of a blurred line there, since reflexes are often included as part of the autonomic nervous system even though they govern responses that can also be consciously directed, such as blinking. Also, I believe the debate over what, exactly, 'consciously directed' means is still open since, AFAIK, there's no generally agreed upon formal definition of the word 'consciousness'.

In fact, the term "subconscious" lumps together "some of the things happening in the neocortex" with "everything happening elsewhere in the brain" (amygdala, tectum, etc.) which I think are profoundly different and well worth distinguishing. ... I think a neocortex by itself cannot do anything biologically useful.

I think there are a lot of words related to the phenomenon of intelligence and consciousness that have nebulous, informal meanings which vaguely reference concrete implementations (like the human mind and brain), but could and should be formalized mathematically. In that pursuit, I'd like to extract the essence of those words from the implementation details like the neocortex.

There are many other creatures, such as octopuses and crows, which are on a similar evolutionary path of increasing intelligence but have completely different anatomy from humans and each other. I agree that focusing research on the neocortex itself is a terrible way to understand intelligence. It's like trying to understand how a computer works by looking only at media files on the hard drive while ignoring the BIOS, operating system, file system, CPU, and other underlying systems that render that data useful.

I believe, for instance, that Artificial Intelligence is a misnomer. We should be studying the phenomenon of intelligence as an abstract property that a system can exhibit regardless of whether it's man-made. There is no scientific field of artificial aerodynamics or artificial chemistry. There's no fundamental difference between the way air behaves when it interacts with a wing that depends upon whether the wing is natural or man-made.

Without a formal definition of 'intelligence' we have no way of making basic claims like, "system X is more intelligent than system Y". It's similar to how fields like physics were stuck until previously vague words like force and energy were given formal mathematical definitions. The engineering of heat engines benefited greatly when thermodynamics was developed and formalized ideas like 'heat' and 'entropy'. Computer science wasn't really possible until Church and Turing formalized the vague ideas of computation and computability. Later Shannon formalized the concept of information and allowed even greater progress.

We can look to specific implementations of a phenomenon to draw inspiration and help us understand the more universal truths about the phenomenon in question (as I do in this post), but if an alien robot came from outer-space and behaved in every way like a human, I see no reason to treat its intelligence as a fundamentally distinct phenomenon. When it exhibits emotion, I see no reason to call it anything else.

Anyway, I haven't read your post yet, but I look forward to it! Thanks, again!

*here, optimality refers to producing the absolute best outputs for a given input. It's independent of the amount of resources required to arrive at those outputs.

**I mean: Special Relativity (SR) came from the fact that the velocity of light (measured in space/time) appeared constant across all reference frames according to Maxwell's equations (and backed up by observation). Einstein made the genius but obvious (in hindsight) conclusion that the only way it's possible for a value of space/time to remain constant between reference frames is if the measures of space and time themselves are variable. The Lorentz transform is the only transform consistent with such dimensional variability between reference frames. There are only three terms in c = space/time. If c is constant and different reference frames demand variability, space and time must not be constant.

Not that I think I'm presenting anything as amazing as Special Relativity or that I think I'm anywhere near Einstein. It's just a convenient example.
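
To make the footnote's claim concrete, here is a quick check that a light ray keeps the same speed in both frames. It assumes the standard textbook form of the Lorentz transform for a boost with speed $v$ along $x$; the symbols $v$, $\gamma$, $x'$, and $t'$ are notation introduced just for this sketch.

```latex
\begin{aligned}
x' &= \gamma\,(x - v t), \qquad t' = \gamma\left(t - \frac{v x}{c^2}\right), \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}} \\
\text{For } x = c t:\quad x' &= \gamma\, t\,(c - v), \qquad t' = \gamma\, t\left(1 - \frac{v}{c}\right) \\
\frac{x'}{t'} &= \frac{c - v}{1 - v/c} = c
\end{aligned}
```

Both $x'$ and $t'$ differ from $x$ and $t$, yet their ratio for the light ray is still $c$, which is the sense in which space and time have to vary for $c$ to stay fixed.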

Comment by abe-dillon on [deleted post] 2020-04-10T15:33:40.916Z

In short, your second paragraph is what I'm after.

Philosophically, I don't think the distinction you make between a design choice and an evolved feature carries much relevance. It's true that some things evolve that have no purpose, and it's easy to imagine that emotions are one of those things, especially since people often conceptualize emotion as the "opposite" of rationality. However, some things evolve that clearly do serve a purpose (in other words there is a justification for their existence), like the eye. Of course nobody sat down with the intent to design an eye. It evolved, was useful, and stuck around because of that utility. The utility of the eye (its justification for sticking around) exists independent of whether the eye exists. A designer recognizes the utility beforehand and purposefully implements it. Evolution "recognizes" the utility after stumbling into it.

Comment by Abe Dillon (abe-dillon) on There's No Fire Alarm for Artificial General Intelligence · 2019-12-30T03:48:59.412Z · LW · GW

How? The person I'm responding to gets the math of probability wrong and uses it to make a confusing claim that "there's nothing wrong" as though we have no more agency over the development of AI than we do over the chaotic motion of a die.

It's foolish to liken the development of AI to a roll of the dice. Given the stakes, we must try to study, prepare for, and guide the development of AI as best we can.

This isn't hypothetical. We've already built a machine that's more intelligent than any man alive and which brutally optimizes toward a goal that's incompatible with the good of mankind. We call it "Global Capitalism". There isn't a man alive who knows how to stock the shelves of stores all over the world with #2 pencils that cost only 2 cents each, yet it happens every day because *the system* knows how. The problem is: that system operates with a sociopathic disregard for life (human or otherwise) and has exceeded all limits of sustainability without so much as slowing down. It's a short-sighted, cruel leviathan and there's no human at the reins.

At this point, it's not about waiting for the dice to settle, it's about figuring out how to wrangle such a beast and prevent the creation of more.

Comment by Abe Dillon (abe-dillon) on An Intuitive Explanation of Solomonoff Induction · 2019-08-09T01:39:01.549Z · LW · GW

This is a pretty lame attitude towards mathematics. If William Rowan Hamilton showed you his discovery of quaternions, you'd probably scoff and say "yeah, but what can that do for ME?".

Occam's razor has been a guiding principle for science for centuries without having any proof for why it's a good policy. Now Solomonoff comes along and provides a proof and you're unimpressed. Great.

Comment by Abe Dillon (abe-dillon) on An Intuitive Explanation of Solomonoff Induction · 2019-08-09T01:04:28.782Z · LW · GW
After all, a formalization of Occam's razor is supposed to be useful in order to be considered rational.

Declaring a mathematical abstraction useless just because it is not practically applicable to whatever your purpose may be is pretty short-sighted. The concept of infinity isn't useful to engineers, but it's very useful to mathematicians. Does that make it irrational?

Comment by Abe Dillon (abe-dillon) on Occam's Razor: In need of sharpening? · 2019-08-09T00:58:15.998Z · LW · GW

Thinking this through some more, I think the real problem is that S.I. is defined in the perspective of an agent modeling an environment, so the assumption that Many Worlds has to put any unobservable on the output tape is incorrect. It's like stating that Copenhagen has to output all the probability amplitudes onto the output tape and maybe whatever dice god rolled to produce the final answer as well. Neither of those is true.

Comment by Abe Dillon (abe-dillon) on Occam's Razor: In need of sharpening? · 2019-08-09T00:44:36.068Z · LW · GW

That's a link to somebody complaining about how someone else presented an argument. I have no idea what point you think it makes that's relevant to this discussion.

Comment by Abe Dillon (abe-dillon) on Occam's Razor: In need of sharpening? · 2019-08-09T00:37:52.221Z · LW · GW
output of a TM that just runs the SWE doesn't predict your and only your observations. You have to manually perform an extra operation to extract them, and that's extra complexity that isn't part of the "complexity of the programme".

First, can you define "SWE"? I'm not familiar with the acronym.

Second, why is that a problem? You should want a theory that requires as few assumptions as possible to explain as much as possible. The fact that it explains more than just your point of view (POV) is a good thing. It lets you make predictions. The only requirement is that it explains at least your POV.

The point is to explain the patterns you observe.

>The size of the universe is not a postulate of the QFT or General Relativity.
That's not relevant to my argument.

It most certainly is. If you try to run the Copenhagen interpretation in a Turing machine to get output that matches your POV, then it has to output the whole universe and you have to find your POV on the tape somewhere.

The problem is: That's not how theories are tested. It's not like people are looking for a theory that explains electromagnetism and why they're afraid of clowns and why their uncle "Bob" visited so much when they were a teenager and why there's a white streak in their prom photo as though a cosmic ray hit the camera when the picture was taken, etc. etc.

The observations we're talking about are experiments where a particular phenomenon is invoked with minimal disturbance from the outside world (if you're lucky enough to work in a field like Physics which permits such experiments). In a simple universe that just has an electron traveling toward a double-slit wall and a detector, what happens? We can observe that and we can run our model to see what it predicts. We don't have to run the Turing machine with input of 10^80 particles for 13.8 billion years then try to sift through the output tape to find what matches our observations.

Same thing for the Many Worlds interpretation. It explains the results of our experiments just as well as Copenhagen; it just doesn't posit any special phenomenon like observation. Observation is just what entanglement looks like from the perspective of one of the entangled particles (or system of particles if you're talking about the scientist).

Operationally, something like copenhagen, ie. neglect of unobserved predictions, and renormalisation , hasto occur, because otherwise you can't make predictions.

First of all: Of course you can use many worlds to make predictions. You do it every time you use the math of QFT. You can make predictions about entangled particles, can't you? The only thing is: while the math of probability is about weighted sums of hypothetical paths, in MW you take it quite literally as paths actually being traversed. That's what you're trading for the magic dice machine in non-deterministic theories.

Secondly: Just because Many Worlds says those worlds exist, doesn't mean you have to invent some extra phenomenon to justify renormalization. At the end of the day the unobservable universe is still unobservable. When you're talking about predicting what you might observe when you run experiment X, it's fine to ultimately discard the rest of the multiverse. You just don't need to make up some story about how your perspective is special and you have some magic power to collapse waveforms that other particles don't have.

Hence my comment about SU&C. Different adds some extra baggage about what that means -- occurred in a different branch versus didn't occur -- but the operation still needs to occur.

Please stop introducing obscure acronyms without stating what they mean. It makes your argument less clear. More often than not it results in *more* typing because of the confusion it causes. I have no idea what this sentence means. SU&C = Single Universe and Collapse? Like objective collapse? "Different" what?

Comment by Abe Dillon (abe-dillon) on Occam's Razor: In need of sharpening? · 2019-08-06T21:11:02.835Z · LW · GW
Well, the original comment was about explaining lightning

You're right. I think I see your point more clearly now. I may have to think about this a little deeper. It's very hard to apply Occam's razor to theories about emergent phenomena, especially those several steps removed from basic particle interactions. There are, of course, other ways to weigh one theory against another. One of which is falsifiability.

If the Thor theory must be constantly modified to explain why nobody can directly observe Thor, then it gets pushed towards un-falsifiability. It gets ejected from science because there's no way to even test the theory, which in turn means it has no predictive power.

As I explained in one of my replies to Jimdrix_Hendri, though there is a formalization for Occam's razor, Solomonoff induction isn't really used. It's usually more like: individual phenomena are studied and characterized mathematically, then links between them are found that explain more with fewer and less complex assumptions.

In the case of Many Worlds vs. Copenhagen, it's pretty clear cut. Copenhagen has the same explanatory power as Many Worlds and shares all the postulates of Many Worlds, but adds some extra assumptions, so it's a clear violation of Occam's razor. I don't know of a *practical* way to handle situations that are less clear cut.

Comment by Abe Dillon (abe-dillon) on Occam's Razor: In need of sharpening? · 2019-08-06T18:30:21.841Z · LW · GW
Thor isn't quite as directly in the theory :-) In Norse mythology...

Tetraspace Grouping's original post clearly invokes Thor as an alternate hypothesis to Maxwell's equations to explain the phenomenon of electromagnetism. They're using Thor as a generic stand-in for the God hypothesis.

Norse mythology he's a creature born to a father and mother, a consequence of initial conditions just like you.

Now you're calling them "initial conditions". This is very different from "conditions" which are directly observable. We can observe the current conditions of the universe, come up with theories that explain the various phenomena we see and use those theories to make testable predictions about the future and somewhat harder to test predictions about the past. I would love to see a simple theory that predicts that the universe not only had a definite beginning (hint: your High School science teacher was wrong about modern cosmology) but started with sentient beings given the currently observable conditions.

Sure, you'd have to believe that initial conditions were such that would lead to Thor.

Which would be a lineage of Gods that begins with some God that created everything and is either directly or indirectly responsible for all the phenomena we observe according to the mythology.

I think you're the one missing Tetraspace Grouping's point. They weren't trying to invoke all of Norse mythology, they were trying to compare the complexity of explaining the phenomenon of electromagnetism by a few short equations vs. saying some intelligent being does it.

You wouldn't penalize the Bob hypothesis by saying "Bob's brain is too complicated", so neither should you penalize the Thor hypothesis for that reason.

The existence of Bob isn't a hypothesis; it's not used to explain any phenomenon. Thor is invoked as the cause of, not consequence of, a fundamental phenomenon. If I noticed some loud noise on my roof every full moon, and you told me that your friend Bob likes to do parkour on rooftops in my neighborhood in the light of the full moon, that would be a hypothesis for a phenomenon that I observed and I could test that hypothesis and verify that the noise is caused by Bob. If you posited that Bob was responsible for some fundamental forces of the universe, that would be much harder for me to swallow.

The true reason you penalize the Thor hypothesis is because he has supernatural powers, unlike Bob. Which is what I've been saying since the first comment.

No. The supernatural doesn't just violate Occam's Razor: it is flat-out incompatible with science. The one assumption in science is naturalism. Science is the best system we know for accumulating information without relying on trust. You have to state how you performed an experiment and what you observed so that others can recreate your result. If you say, "my neighbor picked up sticks on the sabbath and was struck by lightning" others can try to repeat that experiment.

It is, indeed, possible that life on Earth was created by an intelligent being or a group of intelligent beings. They need not be supernatural. That theory, however, is necessarily more complex than any abiogenesis theory because you have to then explain how the intelligent designer(s) came about, which would eventually involve some form of abiogenesis.

Comment by Abe Dillon (abe-dillon) on Occam's Razor: In need of sharpening? · 2019-08-06T04:00:20.192Z · LW · GW

You're trying to conflate theory, conditions, and what they entail in a not so subtle way. Occam's razor is about the complexity of a theory, not conditions, not what the theory and conditions entail. Just the theory. The Thor hypothesis puts Thor directly in the theory. It's not derived from the theory under certain conditions. In the case of the Thor theory, you have to assume more to arrive at the same conclusion.

It's really not that complicated.

Comment by Abe Dillon (abe-dillon) on There's No Fire Alarm for Artificial General Intelligence · 2019-08-06T03:05:32.394Z · LW · GW

That's not how rolling a die works. Each roll is completely independent. The expected value of rolling a 20-sided die is 10.5, but there's no logical way to assign an expected outcome to any given roll. You can calculate how many times you'd have to roll before you're more likely than not to have rolled a specific value: (1 - P(specific_value))^n < 0.5, so n > log(0.5)/log(1 - P(specific_value)). In this case P(specific_value) is 1/20 = 0.05, so n > log(0.5)/log(0.95) = 13.513. So you're more likely than not to have rolled a "1" after 14 rolls, but that still doesn't tell you what to expect your Nth roll to be.

I don't see how your dice rolling example supports a pacifist outlook. We're not rolling dice here. This is a subject we can study and gain more information about to understand the different outcomes better. You can't do that with a die. The outcomes of rolling a die are not so dire. Probability is quite useful for making decisions in the face of uncertainty if you understand it better.
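
Here's a minimal sketch of that arithmetic in Python, in case anyone wants to check it. The function name `rolls_until_likely` and its parameters are made up for this illustration, and it assumes a fair die with independent rolls.

```python
import math


def rolls_until_likely(p_single: float, threshold: float = 0.5) -> int:
    """Smallest n with P(at least one hit in n independent rolls) > threshold."""
    # (1 - p)^n < 1 - threshold  =>  n > log(1 - threshold) / log(1 - p)
    bound = math.log(1.0 - threshold) / math.log(1.0 - p_single)
    # Take the first integer strictly greater than the bound.
    return math.floor(bound) + 1


sides = 20
p = 1 / sides

# Long-run average of a fair d20: (1 + 2 + ... + 20) / 20
expected_value = sum(range(1, sides + 1)) / sides

# Number of rolls before seeing a specific face becomes more likely than not
n = rolls_until_likely(p)

# Probability of having seen that face at least once after n rolls
p_after_n = 1 - (1 - p) ** n

print(expected_value, n, round(p_after_n, 4))
```

Note that none of these numbers says anything about what the next individual roll will be; each roll still has a 1/20 chance of any given face.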

Comment by Abe Dillon (abe-dillon) on AI Alignment Open Thread August 2019 · 2019-08-06T02:20:00.433Z · LW · GW

The telos of life is to collect and preserve information. That is to say: this is the defining behavior of a living system, so it is an inherent goal. The beginning of life must have involved some replicating medium for storing information. At first, life actively preserved information by replicating, and passively collected information through the process of evolution by natural selection. Now life forms have several ways of collecting and storing information: genetics, epigenetics, brains, immune systems, gut biomes, etc.

Obviously a system that collects and preserves information is anti-entropic, so living systems can never be fully closed systems. One can think of them as turbulent vortices that form in the flow of the universe from low-entropy to high-entropy. It may never be possible to halt entropy completely, but if the vortex grows enough, it may slow the progression enough that the universe never quite reaches equilibrium. That's the hope, at least.

One nice thing about this goal is that it's also an instrumental goal. It should lead to a very general form of intelligence that's capable of solving many problems.

One question is: if all living creatures share the same goal, why is there conflict? The simple answer is that it's a flaw in evolution. Different creatures encapsulate different information about how to survive. There are few ways to share this information, so there's not much way to form an alliance with other creatures. Ideally, we would want to maximize our internal, low entropy part, and minimize our interface with high entropy.

Imagine playing a game of Risk. A good strategy is to maximize the number of countries you control while minimizing the number of access points to your territory. If you hold North America, you want to take Venezuela, Iceland, and Kamchatka too because they add to your territory without adding to your "interface". You still only have three territories to defend. This principle extends to many real-world scenarios.

Of course a better way is to form alliances with your neighbors so you don't have to spend so many resources conquering them (that's not a good way to win Risk, but it would be better in the real world).

The reason humans haven't figured out how to reach a state of peace is because we have a flawed implementation of intelligence that makes it difficult to align our interests (or to recognize that our base goals are inherently aligned).

One interesting consequence of the goal of collecting and preserving information is that it inherently implies a utility function over information. That is: information that is more relevant to the problem of collecting and preserving information is more valuable than information that's less relevant to that goal. You're not winning at life if you have an HD box set of "Happy Days" while your neighbor has only a flash drive with all of Wikipedia on it. You may have more bits of information, but those bits aren't very useful.

Another reason for conflict among humans is the hard problem of when to favor information preservation over collection. Collecting information necessarily involves risk because it means encountering the unknown. This is the basic conflict between conservatism and liberalism in the most general form of those words.

Would an AI given the goal of collecting and preserving information completely solve the alignment problem? It seems like it might. I'd like to be able to prove such a statement. Thoughts?

EDIT: Please pardon the disorganized, stream-of-consciousness, style of this post. I'm usually skeptical of posts that seem so scatter-brained and almost... hippy-dippy... for lack of a better word. Like the kind of rambling that a stoned teenager might spout. Please work with me here. I've found it hard to present this idea without coming off as a spiritualist-quack, but it is a very serious proposal.

Comment by Abe Dillon (abe-dillon) on Occam's Razor: In need of sharpening? · 2019-08-06T00:58:26.772Z · LW · GW

I think your example of interpreting quantum mechanics gets pretty close to the heart of the matter. It's one thing to point at Solomonoff induction and say, "there's your formalization". It's quite another to understand how Occam's Razor is used in practice.

Nobody actually tries to convert the Standard Model to the shortest possible computer program, count the bits, and compare it to the shortest possible computer program for string theory or whatever.

What you'll find, however, is that some theories amount to other theories but with an extra postulate or two (e.g. many worlds vs. Copenhagen). So they are strictly more complex. If the more complex theory doesn't explain more than the simpler theory, the extra complexity isn't justified.

A lot of the progression of science over the last few centuries has been toward unifying diverse theories under less complex, general frameworks. Special relativity helped unify theories about the electric and magnetic forces, which were then unified with the weak nuclear force and eventually the strong nuclear force. A lot of that work has helped explain the composition of the periodic table and the underlying mechanisms to chemistry. In other words, where there used to be many separate theories, there are now only two theories that explain almost every phenomenon in the observable universe. Those two theories are based on surprisingly few and surprisingly simple postulates.

Over the 20th century, the trend was towards reducing postulates and explaining more, so it was pretty clear that Occam's razor was being followed. Since then, we've run into a bit of an impasse with GR and QFT not nicely unifying and discoveries like dark energy and dark matter.

Comment by Abe Dillon (abe-dillon) on Occam's Razor: In need of sharpening? · 2019-08-06T00:30:04.381Z · LW · GW
if you cast SI on terms of a linear string of bits, as is standard, you are building in a kind of single universe assumption.

First, I assume you mean a sequential string of bits. "Linear" has a well defined meaning in math that doesn't make sense in the context you used it.

Second, can you explain what you mean by that? It doesn't sound correct. I mean, an agent can only make predictions about its observable universe, but that's true of humans too. We can speculate about multiverses and how they may shape our observations (e.g. the many worlds interpretation of QFT), but so could an SI agent.

Comment by Abe Dillon (abe-dillon) on Occam's Razor: In need of sharpening? · 2019-08-05T19:48:21.080Z · LW · GW

That's not how algorithmic information theory works. The output tape is not a factor in the complexity of the program. Just the length of the program.
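
To make that concrete, here's a rough sketch. It uses the length of a Python source string as a stand-in for program length on a universal Turing machine. That's only an upper-bound proxy (Kolmogorov complexity is the length of the *shortest* program), and the two toy "universes" are made up for illustration.

```python
# Two toy "programs" (Python source strings standing in for Turing machine
# programs). Each would print a run of zeros; only the run length differs.
small_universe = "print('0' * 10)"
large_universe = "print('0' * 10**9)"


def description_length_bits(program: str) -> int:
    """Length of the program text in bits (8 bits per ASCII character)."""
    return 8 * len(program.encode("ascii"))


small_bits = description_length_bits(small_universe)
large_bits = description_length_bits(large_universe)

# The second program's output would be 10**8 times longer, yet its
# description is only three characters longer. The programs are never run;
# only their lengths are compared.
print(small_bits, large_bits, large_bits - small_bits)
```

The output tape grows by eight orders of magnitude while the description grows by a few characters, which is the sense in which a bigger universe doesn't mean a more complex theory.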

The size of the universe is not a postulate of QFT or General Relativity. One could derive what a universe containing only two particles would look like using QFT or GR. It's not a fault of the theory that the universe actually contains ~ 10^80 particles†.

People used to think the solar system was the extent of the universe. Just over a century ago, the Milky Way Galaxy was thought to be the extent of the universe. Then it grew by a factor of over 100 billion when we found that there were that many galaxies. That doesn't mean that our theories got 100 billion times more complex.

If you take the Many Worlds interpretation and decide to follow the perspective of a single particle as though it were special, Copenhagen is what falls out. You're left having to explain what makes that perspective so special.

† Now we know that the observable universe may only be a tiny fraction of the universe at large, which may be infinite. In fact, there are several different types of multiverse that could exist simultaneously.

Comment by Abe Dillon (abe-dillon) on Occam's Razor: In need of sharpening? · 2019-08-05T19:37:19.643Z · LW · GW
Once you've observed a chunk of binary tape that has at least one humanlike brain (you), it shouldn't take that many bits to describe another (Thor).

Maxwell's Equations don't contain any such chunk of tape. In current physical theories (the Standard Model and General Relativity), the brains are not described in the math, rather brains are a consequence of the theories carried out under specific conditions.

Theories are based on postulates which are equivalent to axioms in mathematics. They are the statements from which everything else is derived but which can't be derived themselves. Statements like "the speed of light in a vacuum is the same for all observers, regardless of the motion of the light source or observer."

At the turn of the 20th century, scientists were confused by the apparent contradiction between Galilean Relativity and the implication from Maxwell's Equations and empirical observation that the speed of light in a vacuum is the same for all observers, regardless of the motion of the light source or observer. Einstein formulated Special Relativity by simply asserting that both were true. That is, the postulates of SR are:

  1. the laws of physics are invariant (i.e. identical) in all inertial frames of reference (i.e. non-accelerating frames of reference); and
  2. the speed of light in a vacuum is the same for all observers, regardless of the motion of the light source or observer.

The only way to reconcile those two statements is if time and space become variables. The rest of SR is derived from those two postulates.

Quantum Field Theory is similarly derived from only a few postulates. None of them postulate that some intelligent being just exists. Any program that would describe such a postulate would be relatively enormous.

Comment by Abe Dillon (abe-dillon) on Occam's Razor: In need of sharpening? · 2019-08-05T19:11:35.524Z · LW · GW
The idea of counting postulates is attractive, but it harbours a problem...
...we'd still find that each postulate encapsulates many concepts, and that a fair comparison between competing theories should consider the relative complexity of the concepts as well.

Yes, I agree. A simple postulate count is not sufficient. That's why I said complexity is *related* to it rather than the number itself. If you want a mathematical formalization of Occam's Razor, you should read up on Solomonoff's Inductive Inference.
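
As a rough illustration of the flavor of Solomonoff's approach, here's a toy sketch. Real Solomonoff induction sums over *all* programs for a universal Turing machine and is incomputable; this version uses a tiny hand-picked hypothesis class, and the description lengths assigned to each hypothesis are invented for the example.

```python
from typing import Callable, List, Tuple

# (name, assumed description length in bits, P(next bit = 1 | history)).
# Both the hypotheses and their lengths are hypothetical stand-ins.
Hypothesis = Tuple[str, int, Callable[[List[int]], float]]

HYPOTHESES: List[Hypothesis] = [
    ("all zeros", 2, lambda h: 0.0),
    ("all ones", 2, lambda h: 1.0),
    ("alternating 0,1,0,1,...", 4, lambda h: float(len(h) % 2)),
    ("fair coin", 3, lambda h: 0.5),
]


def posterior(observed: List[int]) -> List[float]:
    """Posterior weight of each hypothesis after seeing `observed`."""
    weights = []
    for _name, length, predict in HYPOTHESES:
        weight = 2.0 ** -length  # shorter description, larger prior
        for i, bit in enumerate(observed):
            p_one = predict(observed[:i])
            weight *= p_one if bit == 1 else 1.0 - p_one
        weights.append(weight)
    total = sum(weights)  # never zero: the fair coin explains any sequence
    return [w / total for w in weights]


def predict_next(observed: List[int]) -> float:
    """Posterior-weighted probability that the next bit is 1."""
    weights = posterior(observed)
    return sum(
        w * predict(observed)
        for w, (_name, _length, predict) in zip(weights, HYPOTHESES)
    )


print(predict_next([]))            # prior only
print(predict_next([0, 1, 0, 1]))  # after four observations
```

The simpler hypotheses start with more weight, but any hypothesis that contradicts the data drops to zero, so complexity only breaks ties among explanations that still fit.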

To address your point about the "complexity" of the "Many Worlds" interpretation of quantum field theory (QFT): The size of the universe is not a postulate of QFT or General Relativity. One could derive what a universe containing only two particles would look like using QFT or GR. It's not a fault of the theory that the universe actually contains ~ 10^80 particles†.

People used to think the solar system was the extent of the universe. Just over a century ago, the Milky Way Galaxy was thought to be the extent of the universe. Then it grew by a factor of over 100 billion when we found that there were that many galaxies. That doesn't mean that our theories got 100 billion times more complex.

† Now we know that the observable universe may only be a tiny fraction of the universe at large, which may be infinite. In fact, there are several different types of multiverse that could exist simultaneously.

Comment by Abe Dillon (abe-dillon) on Occam's Razor: In need of sharpening? · 2019-08-02T23:35:04.927Z · LW · GW

The Many Worlds interpretation of Quantum Mechanics is considered simple because it takes the math at face value and adds nothing more. There is no phenomenon of wave-function collapse. There is no special perspective of some observer. There is no pilot wave. There are no additional phenomena or special frames of reference imposed on the math to tell a story. You just look at the equations and that's what they say is happening.

The complexity of a theory is related to the number of postulates you have to make. For instance: Special Relativity is actually based on two postulates:

  1. the laws of physics are invariant (i.e. identical) in all inertial frames of reference (i.e. non-accelerating frames of reference); and
  2. the speed of light in a vacuum is the same for all observers, regardless of the motion of the light source or observer.

The only way to reconcile those two postulates is if space and time become variables.

The rest is derived from those postulates.

Quantum Field Theory is based on Special Relativity and the Principle of Least Action.

Comment by Abe Dillon (abe-dillon) on jacobjacob's Shortform Feed · 2019-08-02T22:05:46.297Z · LW · GW

According to the standard model of physics: information can't be created or destroyed. I don't know if science can be said to "generate" information rather than capturing it. It seems like you might be referring to a less formal notion of information, maybe "knowledge".

Are short-forms really about information and knowledge? It's my understanding that they're about short thoughts and ideas.

I've been contemplating the value alignment problem and have come to the idea that the "telos" of life is to capture and preserve information. This seemingly implies some measure of the utility of information, because information that's more relevant to the problem of capturing and preserving information is more important to capture and preserve than information that's irrelevant to capturing and preserving information. You might call such a measure "knowledge", but there's probably already an information theoretic formalization of that word.

I have to admit, I don't have a strong background in information theory. I'm not really sure if it even makes sense to discuss what some information is "about". I think there's something called the Data-Information-Knowledge-Wisdom (DIKW) hierarchy which may help sort that out. I think data is the bits used to store information. For example, the information content of an uncompressed Word document might be the same after compressing said document; it just takes up less data. Knowledge might be how information relates to other information. You might think it takes one bit of information to convey whether the British are invading by land or by sea, but if you have more information about what factors into that decision, like the weather, then the signal conveys less than one bit of information because you can make a pretty good prediction without it. In other words: our universe follows some rules and causal relationships, so treating events as independent random occurrences is rarely correct. Wisdom, I believe, is about using the knowledge and information you have to make decisions.
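
Here's a quick sketch of the land-or-sea example using Shannon entropy. The weather probabilities are made up purely for illustration; the point is only that conditioning on related information lowers the number of bits the signal carries.

```python
from math import log2


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


# With no other knowledge, land and sea are treated as equally likely.
h_signal = entropy_bits([0.5, 0.5])

# Hypothetical numbers: storms are as common as calm weather, and an
# invasion by land is much more likely during a storm.
p_weather = {"storm": 0.5, "calm": 0.5}
p_land_given = {"storm": 0.9, "calm": 0.1}

# Conditional entropy H(route | weather): the average information the
# signal still carries once you already know the weather.
h_given_weather = sum(
    p_weather[w] * entropy_bits([p_land_given[w], 1 - p_land_given[w]])
    for w in p_weather
)

print(h_signal, h_given_weather)
```

Under these assumed numbers the signal is worth one full bit on its own, but less than half a bit to someone who has already looked out the window.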

Take all that with a grain of salt.

Comment by Abe Dillon (abe-dillon) on Abe Dillon's Shortform · 2019-08-02T20:43:05.766Z · LW · GW

A flaw in the Gödel Machine may provide a formal justification for evolution

I've never been a fan of the concept of evolutionary computation. Evolution isn't fundamentally different from other forms of engineering; rather, it's the most basic concept in engineering. The idea of slightly modifying an existing solution to arrive at a better solution is a fundamental part of engineering. When you take away all of an engineer's other tools (modeling, analysis, heuristics, etc.), you're left with evolution.

Designing something can be modeled as a series of choices, like traversing a tree. There are typically far more possibilities per choice than is practical to explore, so we use heuristics and intelligence to prune the tree. Sure, an evolutionary algorithm might consider branches you never would have considered, but it's still aimless. If you have the resources, you could probably do better by simply pruning the search tree less aggressively. There will always be countless branches that are clearly not worth exploring. You want to make a flying machine? What material should the fuselage be made of? What? You didn't even consider peanut butter? Why?!

I think that some of the draw to evolution comes from the elegant forms found in nature, many of which are beyond the capabilities of human engineering, but a lot of that can be chalked up to the fact that biology started by default with the "holy grail" of manufacturing technologies: codified molecular self-assembly. If we could harness that capability and bring all the techniques we've learned over the last few centuries about managing complexity (particularly from computer science), we would quickly be able to engineer some mind-blowing technology in a matter of decades rather than billions of years.

Despite all this, people still find success using evolutionary algorithms and generate a lot of hype even though the techniques are doomed not to scale. Is there a time and place where evolution really is the best technique? Can we derive some rule for when to try evolutionary techniques? Maybe.

There's a particular sentence in the paper on the Gödel Machine that always struck me as odd:

Any formal system that encompasses arithmetics (or ZFC etc) is either flawed or allows for unprovable but true statements. Hence even a Gödel machine with unlimited computational resources must ignore those self-improvements whose effectiveness it cannot prove

It seems like the machine is making an arbitrary decision in the face of undecidability, especially after admitting that a formal system is either flawed or allows for unprovable but true statements. The more appropriate behavior would be for the Gödel machine to copy itself, where one copy implements the change and the other doesn't. This introduces some more problems, like: what is the cost of duplicating the machine, and how should that be factored in? Still, I thought that observation might provide some food for thought.

Comment by Abe Dillon (abe-dillon) on Abe Dillon's Shortform · 2019-08-01T00:01:43.494Z · LW · GW

Rough is easy to find and not worth much.

Diamonds are much harder to find and worth a lot more.

I once read a post by someone who was unimpressed with the paper that introduced Generative Adversarial Networks (GANs). They pointed out some sloppy math and other such problems and were confused why such a paper had garnered so much praise.

Someone replied that, in her decades of reading research papers, she learned that finding flaws is easy and uninteresting. The real trick is being able to find the rare glint of insight that a paper brings to the table. Understanding how even a subtle idea can move a whole field forward. I kinda sympathize as a software developer.

I remember when I first tried to slog through Marcus Hutter's book on AIXI, I found the idea absurd. I have no formal background in mathematics, so I chalked some of that up to me not fully understanding what I was reading. I kept coming back to the question (among many others): "If AIXI is incomputable, how can Hutter supposedly prove that it performs 'optimally'? What does 'optimal' even mean? Surely it should include the computational complexity of the agent itself!"

I tried to modify AIXI to include some notion of computational resource utilization until I realized that any attempt to do so would be arbitrary. Some problems are much more sensitive to computational resource utilization than others. If I'm designing a computer chip, I can afford to have the algorithm run an extra month if it means my chip will be 10% faster. The algorithm that produces a sub-optimal solution in milliseconds using less than 20 MB of RAM doesn't help me. At the same time, if a saber-toothed tiger jumps out of a bush next to me, I don't have months to figure out a 10% faster route to get away.

I believe there are problems with AIXI, but lots of digital ink has been spilled on that subject. I plan on contributing a little to that in the near future, but I also wanted to point out that it's easy to look at an idea like AIXI from the wrong perspective and miss a lot of what it truly has to say.

Comment by Abe Dillon (abe-dillon) on Abe Dillon's Shortform · 2019-07-31T23:22:06.551Z · LW · GW

Drop the "A"

Flight is a phenomenon exhibited by many creatures and machines alike. We don't say mosquitos are capable of flight and helicopters are capable of "artificial flight" as though the word means something fundamentally different for man-made devices. Flight is flight: the process by which an object moves through an atmosphere (or beyond it, as in the case of spaceflight) without contact with the surface.

So why do we feel the need to discuss intelligence as though it wasn't a phenomenon in its own right, but something fundamentally different depending on implementation?

If we were approaching this rationally, we'd first want to formalize the concept of intelligence mathematically so that we can bring to bear the full power of math to the pursuit and we could put to rest all the arguments and confusion caused by leaving the term so vaguely defined. Then we'd build a science dedicated to studying the phenomenon regardless of implementation. Then we'd develop ways to engineer intelligent systems (biological or otherwise) guided by the understanding granted us by a proper scientific field.

What we've done instead is develop a few computer science techniques, coin the term "Artificial Intelligence", and stumble around in the dark trying to wear both the hat of engineer and scientist while leaving the word itself undefined. Seriously, our best definition amounts to: "we'll know it when we see it" (i.e. the Turing Test). That doesn't provide any guidance. That doesn't allow us to say "this change will make the system more intelligent" with any confidence.

Keeping the word "Artificial" in the name of what should be a scientific field only encourages tunnel vision. We should want to understand the phenomenon of intelligence whether it be exhibited by a computer, a human, a raven, a fungus, or a space alien.

Comment by Abe Dillon (abe-dillon) on Embedded World-Models · 2019-07-30T22:34:56.440Z · LW · GW
I was arguing that a specific type of fully general theory lacks a specific type of practical value

In that case, your argument lacks value in its own right because it is vague and confusing. I don't know any theories that fall in the "specific type" of general theory you tried to describe. You used Solomonoff as an example when it doesn't match your description.

one which people sometimes expect that type of theory to have.

When someone develops a formalization, they have to explicitly state its context and any assumptions. If someone expects to use Kolmogorov complexity theory to write the next hit game, they're going to have a bad time. That's not Kolmogorov's fault.

I'm arguing that certain characterizations of ideal behavior cannot help us explain why any given implementation approximates that behavior well or poorly.

Of course it can. It provides a different way of constructing a solution. You can start with an ideal then add assumptions that allow you to arrive at a more practicable implementation.

For instance, in computer vision, determining how a depth camera is moving in a scene is very difficult if you use an ideal formalization directly, but if you assume that the differences between two point-clouds are due primarily to affine transformations, then you can use the computationally cheap iterative-closest-point method based on Procrustes analysis to approximate the formal solution. Then, when you observe anomalous behavior, your usual suspects will be the list of assumptions you made to render the problem tractable. Are there non-affine transformations dominating the deltas between point clouds? Maybe that's causing my computer vision system to glitch. Maybe I need some way to detect such situations and/or some sort of fall-back.
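
To give a feel for how cheap that approximation is, here's a minimal sketch of iterative-closest-point in 2D. It assumes the simplest case, a rigid motion (rotation plus translation, no scaling or shear), uses brute-force nearest-neighbor matching, and runs on a made-up six-point cloud. All function names are my own for illustration, not from any particular library.

```python
import math


def best_rigid_transform(source, target):
    """Closed-form 2D Procrustes step: the rotation angle and translation
    minimising squared distance between already-matched point pairs."""
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(q[0] for q in target) / n
    ty = sum(q[1] for q in target) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        ax, ay = px - sx, py - sy
        bx, by = qx - tx, qy - ty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation carries the rotated source centroid onto the target's.
    return theta, (tx - (c * sx - s * sy), ty - (s * sx + c * sy))


def apply_transform(points, theta, shift):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1])
            for x, y in points]


def nearest(point, cloud):
    return min(cloud, key=lambda q: math.dist(point, q))


def mean_error(source, target):
    """Average distance from each source point to its nearest target point."""
    return sum(math.dist(p, nearest(p, target)) for p in source) / len(source)


def icp(source, target, iterations=20):
    """Alternate between guessing correspondences and solving Procrustes."""
    current = list(source)
    for _ in range(iterations):
        matches = [nearest(p, target) for p in current]
        theta, shift = best_rigid_transform(current, matches)
        current = apply_transform(current, theta, shift)
    return current


# Made-up scene: the same cloud seen before and after a small camera motion.
TARGET = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0), (1.0, 4.0), (-1.0, 2.0)]
SOURCE = apply_transform(TARGET, 0.1, (0.15, -0.1))

aligned = icp(SOURCE, TARGET)
print(mean_error(SOURCE, TARGET), mean_error(aligned, TARGET))
```

If the real motion between frames is large, or the scene itself deforms, the nearest-neighbor guess is exactly the assumption that breaks first, which is why it belongs at the top of the list of suspects.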

Not only that, but there are many other reasons to formalize ideas like intelligence other than to guide the practical implementation of intelligent systems. You can explore the concept of intelligence and its bounds.

Again, if you understand a tool for what it is, there's no problem. Of course, trying to use a purely formalized theory directly to solve real-world problems is going to yield confusing results. Trying to engineer a bridge using the standard model of particle physics is going to be just as difficult. It's not a fault of the theory, nor does it mean studying the theory is pointless. The problem is that you want it to be something it's not.

I don't understand how the rest of your points engage with my argument.

It's hard to engage much with your argument because it's made up of vague straw men:

Most "theories of rational belief" I have encountered

I have no solid context to engage you about. If you're talking about AIXI, then you've misunderstood AIXI because it isn't about choosing strategies out of a set of all strategies. In fact, you've got Solomonoff inductive inference completely wrong too:

For example, in Solomonoff, S is defined by computability while R is allowed to be uncomputable.

Solomonoff inductive inference is defined in the context of an agent observing an environment. That's all. It doesn't take actions. It just observes and predicts. There is no set of strategies. There is no rule for selecting a strategy, and given your definition of S and R:

We have some class of "practically achievable" strategies S, which can actually be implemented by agents. We note that an agent's observations provide some information about the quality of different strategies s∈S. So if it were possible to follow a rule like R≡ "find the best s∈S given your observations, and then follow that s," this rule would spit out very good agent behavior.

It doesn't even make sense that R would be incomputable given that S is computable.

When you say:

Concretely, this talk of approximations is like saying that a very successful chess player "approximates" the rule "consult all possible chess players, then weight their moves by past performance." Yes, the skilled player will play similarly to this rule, but they are not following it, not even approximately! They are only themselves, not any other player.

On what grounds do you even justify the claim that the chess player's behavior is "not even approximately" following the rule of "consult all possible chess players, then weight their moves by past performance."?

Actually, what vanilla AIXI would prescribe is a full tree traversal similar to the minimax algorithm, which is, of course, impractical. However, there are things you can do to approximate a full tree traversal more practically. You can build approximate models based on experience like "given the state of the board, what moves should I consider" which prunes the width of the tree, and "given the state of the board, how likely am I to win" which limits the depth of the tree. So instead of considering every possible move at every possible step of the game to every possible conclusion, you only consider 3-4 possible moves per step and only maybe 4-5 steps into the future, perhaps reducing the number of moves considered per step as you look further ahead.
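
Here's a small sketch of that idea. Chess is too big for a comment, so it uses a toy subtraction game as a stand-in: players alternate taking 1 to 3 stones and whoever takes the last stone wins. The "learned" models are replaced by crude hand-written placeholders (`candidate_moves` and `estimate` are hypothetical names), so treat it as an illustration of the structure, not of a good player.

```python
from functools import lru_cache

MOVES = (1, 2, 3)


@lru_cache(maxsize=None)
def exact_value(stones: int) -> int:
    """Full tree traversal: +1 if the player to move can force a win, else -1."""
    if stones == 0:
        return -1  # the previous player took the last stone, so the mover lost
    return max(-exact_value(stones - m) for m in MOVES if m <= stones)


def candidate_moves(stones: int, width: int):
    """Placeholder for 'what moves should I consider': keeps only the
    `width` largest legal moves, which prunes the width of the tree."""
    legal = [m for m in MOVES if m <= stones]
    return sorted(legal, reverse=True)[:width]


def estimate(stones: int) -> float:
    """Placeholder for 'how likely am I to win': confident only when the
    mover can take everything immediately, otherwise 'no idea' (0.0)."""
    return 1.0 if stones <= 3 else 0.0


def approx_value(stones: int, depth: int, width: int) -> float:
    """Depth-limited, width-pruned version of the same traversal."""
    if stones == 0:
        return -1.0
    if depth == 0:
        return estimate(stones)  # stop searching and trust the estimate
    return max(
        -approx_value(stones - m, depth - 1, width)
        for m in candidate_moves(stones, width)
    )


print(exact_value(5))                     # full search
print(approx_value(5, depth=2, width=3))  # shallow search, all moves
print(approx_value(5, depth=2, width=2))  # shallow search, pruned moves
```

With five stones, the shallow search agrees with the full traversal as long as all three moves are considered, but pruning to two candidates throws away the winning move (take one stone), so the approximation is only as good as the models doing the pruning.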

Yes, there is a good reason Solomonoff does a weighted average and not an argmax

Did you edit your original comment? Because I could have sworn you said more, disparaging the use of "arbitrary" weights. At any rate, it's not a "performance-weighted average" as it isn't about performance. It's about uncertainty.

Comment by Abe Dillon (abe-dillon) on Decision Theory · 2019-06-29T01:35:47.325Z · LW · GW
The reason it's untrue is because the concept of "I/O channels" does not exist within physics as we know it.

Yes. They most certainly do. The only truly consistent interpretation I know of current physics is information theoretic anyway, but I'm not interested in debating any of that. The fact is I'm communicating to you with physical I/O channels right now so I/O channels certainly exist in the real world.

the true laws of physics make no reference to inputs, outputs, or indeed any kind of agents at all.

Agents are emergent phenomena. They don't exist on the level of particles and waves. The concept is an abstraction.

"I/O channels" are simply arrangements of matter and energy, the same as everything else in our universe. There are no special XML tags attached to those configurations of matter and energy, marking them "input", "output", "processor", etc. Such a notion is unphysical.

An I/O channel doesn't imply modern computer technology. It just means information is collected from or imprinted upon the environment. It could be ant pheromones or it could be smoke signals; its physical implementation is secondary to the abstract concept of sending and receiving information of some kind. You're not seeing the forest for the trees. Information most certainly does exist.

Why might this distinction be important? It's important because an algorithm that is implemented on physically existing hardware can be physically disrupted. Any notion of agency which fails to account for this possibility--such as, for example, AIXI, which supposes that the only interaction it has with the rest of the universe is by exchanging bits of information via the input/output channels--will fail to consider the possibility that its own operation may be disrupted.

I've explained in previous posts that AIXI is a special case of AIXI_tl. AIXI_tl can be conceived of in an embedded context, in which case its model of the world would include a model of itself which is subject to any sort of environmental disturbance.

To some extent, an agent must trust its own operation to be correct, because you quickly run into infinite regress if the agent is modeling all the possible ways that it could be malfunctioning. What if the malfunction affects the way it models the possible ways it could malfunction? It should model all the ways a malfunction could disrupt how it models all the ways it could malfunction, right? It's like saying "well the agent could malfunction, so it should be aware that it can malfunction so that it never malfunctions". If the thing malfunctions, it malfunctions. It's as simple as that.

Aside from that, AIXI is meant to be a purely mathematical formalization, not a physical implementation. It's an abstraction by design. It's meant to be used as a mathematical tool for understanding intelligence.

AIXI also fails on various decision problems that involve leaking information via a physical side channel that it doesn't consider part of its output; for example, it has no regard for the thermal emissions it may produce as a side effect of its computations.

Do you consider how the 30 Watts leaking out of your head might affect your plans every day? I mean, it might cause a typhoon in Timbuktu! If you don't consider how the waste heat produced by your mental processes affects your environment while making long or short-term plans, you must not be a real intelligent agent...

In the extreme case, AIXI is incapable of conceptualizing the possibility that an adversarial agent may be able to inspect its hardware, and hence "read its mind".

AIXI can't play tic-tac-toe with itself because that would mean it would have to model itself as part of the environment which it can't do. Yes, I know there are fundamental problems with AIXI...

This is, again, because AIXI is defined using a framework that makes it unphysical

No. It's fine to formalize something mathematically. People do it all the time. Math is a perfectly valid tool to investigate phenomena. The problem with AIXI proper is that it's limited to a context in which the agent and environment are independent entities. There are actually problems where that is a decent approximation, but it would be better to have a more general formulation, like AIXI_tl, that can be applied to contexts in which an agent is embedded in its environment.

This applies even to computable formulations of AIXI, such as AIXI-tl: they have no way to represent the possibility of being simulated by others, because they assume they are too large to fit in the universe.

That's simply not true.

I'm not sure what exactly is so hard to understand about this, considering the original post conveyed all of these ideas fairly well. It may be worth considering the assumptions you're operating under--and in particular, making sure that the post itself does not violate those assumptions--before criticizing said post based on those assumptions.

I didn't make any assumptions. I said what I believe to be correct.

I'd love to hear you or the author explain how an agent is supposed to make decisions about what to do in an environment if its agency is completely undefined.

I'd also love to hear your thoughts on the relationship between math, science, and the real world if you think comparing a physical implementation to a mathematical formalization is any more fruitful than comparing apples to oranges.

Did you know that engineers use the "ideal gas law" every day to solve real-world problems even though they know that no real-world gas actually follows the "ideal gas law"?! You should go tell them that they're doing it wrong!

Comment by Abe Dillon (abe-dillon) on Subsystem Alignment · 2019-06-29T00:29:19.928Z · LW · GW

I'll probably have a lot more to say on this entire post later, but for now I wanted to address one point. Some problems, like wireheading, may not be deal-breakers or reasons to write anything off. Humans are capable of hijacking their own reward centers and "wireheading" in many different ways (the most obvious being something like shooting heroin), yet that doesn't mean humans aren't intelligent. Things like wireheading, bad priors, or the possibility of "trolling" may just be hazards of intelligence.

If you're born and you build a model of how the world works based on your input, then you start using that model to reject noise, you might be rejecting information that would fix flaws and biases in your model. When you're young, it makes sense to believe elders when they tell you about how the world works because they probably know better than you, but if all those elders and everyone around you tell you about their belief in some false superstition, then that becomes an integral part of your world model, and evidence against your flawed world model may come long after you've weighted the model with an extremely high probability of being true. If the superstition involves great reward for adhering to and spreading it and great punishment for questioning it, then you have a trap that most valid models of intelligence will struggle with.

If some theory of intelligence is susceptible to that trap, it's not clear that said theory should be dismissed because the only current implementation of general intelligence we know of is also highly susceptible to that trap.

Comment by Abe Dillon (abe-dillon) on Decision Theory · 2019-06-28T23:34:17.725Z · LW · GW
This tends to assume that we can detangle things enough to see outcomes as a function of our actions.

No. The assumption is that an agent has *agency* over some degrees of freedom of the environment. It's not even an assumption, really; it's part of the definition of an agent. What is an agent with no agency?

If the agent's actions have no influence on the state of the environment, then it can't drive the state of the environment to satisfy any objective. The whole point of building an internal model of the environment is to understand how the agent's actions influence the environment. In other words: "detangling things enough to see outcomes as functions of [the agent's] actions" isn't just an assumption, it's essential.

The only point I can see in writing the above sentence would be if you said that a function isn't, generally, enough to describe the relationship between an agent's actions and the outcome: that you generally need some higher-level construct like a Turing machine. That would be fair enough if it weren't for the fact that the theory you're comparing yours to is AIXI, which explicitly models the relationship between actions and outcomes via Turing machines.

AIXI represents the agent and the environment as separate units which interact over time through clearly defined I/O channels so that it can then choose actions maximizing reward.

Do you propose a model in which the relationship between the agent and the environment is undefined?

When the agent model is part of the environment model, it can be significantly less clear how to consider taking alternative actions.

Really? It seems you're applying magical thinking to the consequences of embedding one Turing machine within another. Why would its I/O or internal modeling change so drastically? If I use a virtual machine to run Windows within Linux, does that make the experience of using MS Paint fundamentally different than running Windows in a native boot?

...there can be other copies of the agent, or things very similar to the agent.
Depending on how you draw the boundary around "yourself", you might think you control the action of both copies or only your own.

How is that unclear? If the agent doesn't actually control the copies, then there's no reason to imagine it does. If it's trying to figure out how best to exercise its agency to satisfy its objective, then imagining it has any more agency than it actually does is silly. You don't need to wander into the philosophical no-man's-land of defining the "self". It's irrelevant. What are your degrees of freedom? How can you use them to satisfy your objective? At some point, the I/O channels *must be* well defined. It's not like a processor has an ambiguous number of pins. It's not like a human has an ambiguous number of motor neurons.

For all intents and purposes: the agent IS the degrees of freedom it controls. The agent can only change its state, which, being a subset of the environment's state, changes the environment in some way. You can't lift a box, you can only change the position of your arms. If that results in a box being lifted, good! Or maybe you can't change the position of those arms, you can only change the electric potential on some motor neurons. If that results in arms moving, good! Play that game long enough and, at some point, the set of actions you can do is finite and clearly defined.

Your five-or-ten problem is one of many that demonstrate the brittleness problem of logic-based systems operating in the real world. This is well known. People have all but abandoned logic-based systems for stochastic systems when dealing with real-world problems specifically because it's effectively impossible to make a robust logic-based system.

This is the crux of a lot of your discussion. When you talk about an agent "knowing" its own actions or the "correctness" of counterfactuals, you're talking about definitive results which a real-world agent would never have access to.

It's possible (though unlikely) for a cosmic ray to damage your circuits, in which case you could go right -- but you would then be insane.

If a rare, spontaneous occurrence causes you to go right, you must be insane? What? Is that really the only conclusion you could draw from that situation? If I take a photo and a cosmic ray causes one of the pixels to register white, do I need to throw my camera out because it might be "insane"?!

Maybe we can force exploration actions so that we learn what happens when we do things?

First of all, who is "we" in this case? Are we the agent or are we some outside system "forcing" the agent to explore?

Ideally, nobody would have to force the agent to explore its world. It would want to explore and experiment as an instrumental goal to lower uncertainty in its model of the world so that it can better pursue its objective.

A bad prior can think that exploring is dangerous

That's not a bad prior. Exploring *is* fundamentally dangerous. You're encountering the unknown. I'm not even sure if the risk/reward ratio of exploring is decidable. It's certainly a hard problem to determine when it's better to explore, and when it's too dangerous. Millions of the most sophisticated biological neural networks the planet Earth has to offer have grappled with the question for hundreds of years with no clear answer.

Forcing it to take exploratory actions doesn't teach it what the world would look like if it took those actions deliberately.

What? Again, *who* is doing the "forcing" in this situation and how? Do you really want to tread into the other philosophical no-man's-land of free will? Why would the question of whether the agent really wanted to take an action have any bearing whatsoever on the result of that action? I'm so confused about what this sentence even means.

EDIT: The point of the discussion on counterfactuals is also unclear to me. Counterfactuals are of dubious utility for short-term evaluation of outcomes. They become less useful the further you separate the action from the result in time. I could think, "damn! I should have taken an alternate route to work this morning!" which is arguably useful and may actually be wrong, but if I think, "damn, if Eric the Red hadn't sailed to the New World, Hitler would have never risen to power!" that's extremely questionable, and what use would that pondering be even if it were correct?

It seems like you're saying an embedded agent can't enumerate the possible outcomes of its actions before taking them, so it can only do so in retrospect. In which case, why can't an embedded agent perform a pre-emptive tree search like any other agent? What's the point of counterfactuals?
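
To make "pre-emptive tree search" concrete, here is a minimal sketch of what I have in mind: the agent enumerates action sequences against its own model of the world before acting, and picks the best one. The corridor world, the payoffs, and the names `step`, `reward`, and `plan` are all hypothetical and made up for illustration; nothing here comes from the original post.

```python
from itertools import product

ACTIONS = ("left", "right")

def step(position, action):
    """Hypothetical world model: a corridor with cells 0..4, walls at both ends."""
    delta = -1 if action == "left" else 1
    return min(4, max(0, position + delta))

def reward(position):
    """Hypothetical objective: cell 4 pays 10, cell 0 pays 5, everything else 0."""
    return {0: 5, 4: 10}.get(position, 0)

def plan(position, depth):
    """Exhaustively search every action sequence of length `depth`.

    Returns (best_total_reward, best_sequence), evaluated purely inside
    the model, before any action is actually taken.
    """
    best = (float("-inf"), ())
    for seq in product(ACTIONS, repeat=depth):
        pos, total = position, 0
        for action in seq:
            pos = step(pos, action)
            total += reward(pos)
        if total > best[0]:
            best = (total, seq)
    return best

print(plan(2, 2))
```

The search only ever consults the agent's model (`step` and `reward`), so nothing about it seems to depend on whether the agent happens to be embedded in the thing being modeled. The brute-force enumeration is exponential in `depth`, which is a cost problem rather than a conceptual one.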

Comment by Abe Dillon (abe-dillon) on Embedded Agents · 2019-06-28T21:02:09.954Z · LW · GW

I don't see the point in adding so much complexity to such a simple matter. AIXI is an incomputable agent whose proofs of optimality require a computable environment. It requires a specific configuration of the classic agent-environment loop where the agent and the environment are independent machines. That specific configuration is only applicable to a subset of real-world problems in which the environment can be assumed to be much "smaller" than the agent operating upon it: problems that don't involve other agents and have very few degrees of freedom relative to the agent operating upon them.

Marcus Hutter already proposed computable versions of AIXI like AIXItl. In the context of agent-environment loops, AIXItl is actually more general than AIXI because AIXItl can be applied to all configurations of the agent-environment loop including the embedded agent configuration. AIXI is a special case of AIXItl where the limits of "t" and "l" go to infinity.

Some of the problems you bring up seem to be concerned with the problem of reconciling logic with probability while others seem to be concerned with real-world implementation. If your goal is to define concepts like "intelligence" with mathematical formalizations (which I believe is necessary), then you need to delineate that from real-world implementation. Discussing both simultaneously is extremely confusing. In the real world, an agent only has its empirical observations. It has no "seeds" to build logical proofs upon. That's why scientists talk about theories and evidence supporting them rather than proofs and axioms.

You can't prove that the sun will rise tomorrow; you can only show that it's reasonable to expect the sun to rise tomorrow based on your observations. Mathematics is the study of patterns, and mathematical notation is a language we invented to describe patterns. We can prove theorems in mathematics because we are the ones who decide the fundamental axioms. When we find patterns that don't lend themselves easily to mathematical description, we rework the tool (add concepts like zero, negative numbers, complex numbers, etc.). It happens that we live in a universe that seems to follow patterns, so we try to use mathematics to describe the patterns we see and we design experiments to investigate the extent to which those patterns actually hold.

The branch of mathematics for characterizing systems with incomplete information is probability. If you want to talk about real-world implementations, most non-trivial problems fall under this domain.
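
As a small illustration of the sunrise point, one classical way to put a number on "reasonable to expect" is Laplace's rule of succession. The sketch below assumes independent trials and a uniform prior over the unknown success rate; both assumptions, and the function name, are mine for illustration only.

```python
from fractions import Fraction

def rule_of_succession(successes, trials):
    """Laplace's estimate of P(next trial succeeds), assuming a uniform prior.

    After `successes` out of `trials` observed, the estimate is
    (successes + 1) / (trials + 2).
    """
    return Fraction(successes + 1, trials + 2)

# With no observations at all the estimate is 1/2.
print(rule_of_succession(0, 0))

# After 10,000 sunrises in 10,000 days the estimate is very close to 1,
# but it never reaches 1: evidence, not proof.
print(rule_of_succession(10_000, 10_000))
```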

Comment by Abe Dillon (abe-dillon) on Embedded World-Models · 2019-06-26T01:16:10.095Z · LW · GW
I think that grappling with embeddedness properly will inevitably make theories of this general type irrelevant or useless

I disagree. This is like saying, "we don't need fluid dynamics, we just need airplanes!". General mathematical formalizations like AIXI are just as important as special theories that apply more directly to real-world problems, like embedded agents. Without a grounded formal theory, we're stumbling in the dark. You simply need to understand it for what it is: a generalized theory. Once you do, most of the apparent paradoxes evaporate.

Kolmogorov complexity tells us there is no such thing as a universal lossless compression algorithm, yet people happily "zip" data every day. That doesn't mean Kolmogorov wasted his time coming up with his general ideas about complexity. Real-world data tends to have a lot of structure because we live in a low-entropy universe. When you take a photo or record audio, it doesn't look or sound like white noise because there's structure in the universe. In math-land, the vast majority of bit-strings would look and sound like incompressible white noise.
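
The gap between structured data and white noise is easy to see with an ordinary compressor. This is only a rough sketch using Python's standard `zlib` module; the repeated phrase is an arbitrary stand-in for "structured" data, and `os.urandom` is standing in for white noise.

```python
import os
import zlib

structured = b"the quick brown fox " * 500   # highly repetitive, low-entropy data
random_bytes = os.urandom(len(structured))   # stand-in for incompressible white noise

def ratio(data):
    """Compressed size divided by original size (smaller means more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)

structured_ratio = ratio(structured)
random_ratio = ratio(random_bytes)

print(f"structured: {structured_ratio:.3f}")
print(f"random:     {random_ratio:.3f}")
```

The structured input should shrink to a small fraction of its size, while the random input should come out at roughly its original size or slightly larger because of the format overhead.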

The same holds true for AIXI. The vast majority of problems drawn from problem space would essentially be, "map this string of random bits to some other string of random bits" in which case, the best you can hope for is a brute-force tree-search of all the possibilities weighted by Occam's razor (i.e. Solomonoff inductive inference).

Most "theories of rational belief" I have encountered -- including Bayesianism in the sense I think is meant here -- are framed at the level of an evaluator outside the universe, and have essentially no content when we try to transfer them to individual embedded agents. This is because these theories tend to be derived in the following way: ...

I can't speak to the motivations or processes of others, but these sound like assumptions without much basis. The reason I tend to define intelligence outside of the environment is that it generalizes much better. There are many problems where the system providing the solution can be decoupled both in time and space from the agent acting upon said solution. Agents solving problems in real-time are a special case, not a general case. The general case is: an intelligent system produces a solution/policy to a problem and an agent in an environment acts upon that solution/policy. An intelligent system might spend all night planning how to most efficiently route mail trucks the next morning; the drivers then follow those routes. A real-time model in which the driver has to plan her routes while driving is a special case. You can think of it as the driver's brain coming up with the solution/policy and the driver acting on it in situ.

You could make the case that the driver has to do on-line/real-time problem solving to navigate the roads, avoid collisions, etc., in which case the full solution would be a hybrid of real-time and off-line formulation (which is probably representative of most situations). Either way, constraining your definition of intelligence to only in-situ problem solving excludes many valid examples of intelligence.

Also, it doesn't seem like you understand what Solomonoff inductive inference is. The weighted average is used because there will typically be multiple world models that explain your experiences at any given point in time, and Occam's razor says to favor shorter explanations that give the same result. So you weight the predictions of each model by a factor that shrinks exponentially with the length of the model: 2 raised to the negative length (in bits, usually).
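
Here is a toy sketch of that weighted average. Real Solomonoff induction ranges over all programs and is incomputable; this version uses three hand-written hypotheses with made-up description lengths, so every name and number below is a hypothetical illustration. The one standard ingredient is the prior weight of 2 to the power of minus the length in bits.

```python
from fractions import Fraction

# Hypothetical hypotheses: (name, description length in bits, predictor).
# Each predictor maps the observed bit history to P(next bit == 1).
HYPOTHESES = [
    ("always_one", 3, lambda history: Fraction(1)),
    ("alternate", 5, lambda history: Fraction(1 - history[-1]) if history else Fraction(1, 2)),
    ("fair_coin", 8, lambda history: Fraction(1, 2)),
]

def likelihood(predict, history):
    """Probability the hypothesis assigns to the whole observed history."""
    p = Fraction(1)
    for i, bit in enumerate(history):
        p_one = predict(history[:i])
        p *= p_one if bit == 1 else 1 - p_one
    return p

def mixture_prediction(history):
    """Weighted-average P(next bit == 1) over all hypotheses.

    Each hypothesis is weighted by its prior 2**-length times how well it
    explained the history. fair_coin never assigns zero probability, so the
    normalizing total is always positive.
    """
    weights = [Fraction(1, 2 ** length) * likelihood(predict, history)
               for _, length, predict in HYPOTHESES]
    total = sum(weights)
    weighted = sum(w * predict(history)
                   for w, (_, _, predict) in zip(weights, HYPOTHESES))
    return weighted / total

print(mixture_prediction([1, 1, 1]))
print(mixture_prediction([1, 0, 1, 0]))
```

Hypotheses that contradict the observations drop out with weight zero, and among the survivors the shorter ones dominate the prediction.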

Concretely, this talk of approximations is like saying that a very successful chess player "approximates" the rule "consult all possible chess players, then weight their moves by past performance." Yes, the skilled player will play similarly to this rule, but they are not following it, not even approximately! They are only themselves, not any other player.

I think you're confusing behavior with implementation. When people talk about neural nets being "universal function approximators" they're talking about the input-output behavior, not the implementation. Obviously the implementation of an XOR gate is different than a neural net that approximates an XOR gate.
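
A minimal sketch of the distinction, assuming a tiny threshold network with hand-picked weights (the weights and function names are hypothetical, chosen only to make the example work): the two functions below have completely different implementations yet identical input-output behavior.

```python
def xor_gate(a, b):
    """Direct implementation: the bitwise XOR operator."""
    return a ^ b

def step(x):
    """Threshold activation: fires when the input is positive."""
    return 1 if x > 0 else 0

def xor_net(a, b):
    """Two-layer threshold network with hand-picked weights."""
    h_or = step(a + b - 0.5)          # hidden unit 1 behaves like OR
    h_and = step(a + b - 1.5)         # hidden unit 2 behaves like AND
    return step(h_or - h_and - 0.5)   # output fires for OR but not AND

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_gate(a, b), xor_net(a, b))
```

Nothing inside `xor_net` resembles the `^` operator, but an observer who only sees inputs and outputs cannot tell the two apart.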