Tessellating Hills: a toy model for demons in imperfect search 2020-02-20T00:12:50.125Z


Comment by DaemonicSigil on Unconvenient consequences of the logic behind the second law of thermodynamics · 2021-03-07T22:22:39.097Z · LW · GW

If you believe that our existence is a result of a mere fluctuation of low entropy, then you should not believe that the stars are real. Why? Because the frequency with which a particular fluctuation appears decreases exponentially with the size of the fluctuation. (Basically because entropy is S = log W, where W is the number of microstates.) A fluctuation that creates both the Earth and the stars is far less likely than a fluctuation that just creates the Earth, and some photons heading towards the Earth that just happen to look like they came from stars. Even though it is unbelievably unlikely that the photons would happen to arrange themselves into that exact configuration, it's even more unbelievably unlikely that of all the fluctuations, we'd happen to live in one so large that it included stars. Of course, this view also implies a prediction about starlight: since the starlight is just thermal noise, the most likely scenario is that it will stop looking like it came from stars, and start looking just like (very cold) blackbody radiation. So the most likely prediction, under the entropy fluctuation view, is that this very instant the sun and all the stars will go out.

That's a fairly absurd prediction, of course. The general consensus among people who've thought about this stuff is that our low entropy world cannot have been the sole result of a fluctuation, precisely because it would lead to absurd predictions. Perhaps a fluctuation triggered some kind of runaway process that produced a lot of low-entropy bits, or perhaps it was something else entirely, but at some point in the process, there needs to be a way of producing low entropy bits that doesn't become half as likely to happen for each additional bit produced.
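As a rough illustration of the "half as likely for each additional bit" point, here is a minimal Python sketch. It assumes the frequency of a fluctuation scales as exp(-ΔS), with ΔS the entropy deficit in nats and one bit equal to ln 2 nats; the function name `relative_frequency` is made up for this example.

```python
import math

def relative_frequency(extra_bits: float) -> float:
    """How much rarer a fluctuation becomes when it must also produce
    `extra_bits` additional low-entropy bits.

    Assumption: frequency is proportional to exp(-delta_S), where delta_S
    is the entropy deficit in nats and one bit corresponds to ln(2) nats.
    """
    delta_s = extra_bits * math.log(2)
    return math.exp(-delta_s)

# Each additional bit halves the frequency.
for n in (1, 2, 10, 100):
    print(n, relative_frequency(n))
```

Under this assumption a fluctuation that also has to produce the stars is suppressed by a factor of 2 for every extra bit, which is why the "Earth plus fake starlight" fluctuation dominates.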

Interestingly, as long as the new low entropy bits don't overwrite existing high entropy bits, it's not against the laws of thermodynamics for new low entropy bits to be produced. It's kind of a reversible computing version of malloc: you're allowed to call malloc and free as much as you like in reversible computing, as long as the bits in question are in a known state. So for example, you can call a version of malloc that always produces bits initialized to 0. And you can likewise free a chunk of bits, so long as you set them all to 0 first. If the laws of physics allow for an analogous physical process, then that could explain why we seem to live in a low entropy region of spacetime. Something must have "called malloc" early on, and produced the treasure trove of low entropy bits that we see around us. (It seems like in our universe, information content is limited by surface area. So a process that increased the information storage capacity of our universe would have to increase its surface area somehow. This points suggestively at the expansion of space and the cosmological constant, but we mostly still have no idea what's going on. And the surface area relation may break down once things get as large as the whole universe anyway.)
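To make the malloc analogy concrete, here is a toy Python sketch. It is only an illustration of the rule described above (allocation hands out bits in a known all-zero state, and freeing is allowed only once the bits are back to zero); the class `ReversibleHeap` and its methods are hypothetical names, not a real reversible-computing API.

```python
class ReversibleHeap:
    """Toy allocator in the spirit of reversible computing.

    Assumption for illustration: a chunk may only be created or destroyed
    while it is in a known state (all zeros), so no information is erased.
    """

    def __init__(self):
        self.live = {}      # handle -> list of bits
        self.next_id = 0

    def malloc(self, n_bits: int) -> int:
        """Allocate n_bits, always initialized to 0 (a known state)."""
        handle = self.next_id
        self.next_id += 1
        self.live[handle] = [0] * n_bits
        return handle

    def bits(self, handle: int) -> list:
        """Return the mutable bit list for a live chunk."""
        return self.live[handle]

    def free(self, handle: int) -> None:
        """Release a chunk, but only if every bit has been reset to 0."""
        if any(self.live[handle]):
            raise ValueError("bits must be set back to 0 before freeing")
        del self.live[handle]


heap = ReversibleHeap()
h = heap.malloc(8)
heap.bits(h)[3] = 1   # use the bits for something
heap.bits(h)[3] = 0   # uncompute, returning them to the known state
heap.free(h)          # now freeing erases no information
```

The check in `free` is the whole point: throwing away bits in an unknown state would be the irreversible step that costs entropy.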

Comment by DaemonicSigil on Neck abacus · 2021-02-19T08:50:34.855Z · LW · GW

The static friction is high enough that the beads will only move if you push them, but low enough that they are easy to push. The necklace is made of 2 strands, joined at the endpoints. Put one strand on the table in the shape of $\sin(x)$, the other goes on top of it in the shape of $-\sin(x)$. The beads go around the crossing points of the 2 strands, as rings in the plane. (You don't have to make an exact sine wave of course, anything that yields the same topological result will work, and to loop the beads around the crossings the way I've described, you'll have to thread the strands through the beads before joining them at the end.) Having the crossings pass through the beads increases the friction, since the bead is redirecting some of the tension in the strands.

In theory, putting beads in the normal way on a single strand could also work, if the diameter of the strand and the hole size of the bead were well matched. (You'd want fray-proof string for threading that not to be a huge pain, though.)

Comment by DaemonicSigil on Parable of the Dammed · 2020-12-11T03:02:45.825Z · LW · GW

Have you read Ursula K. Le Guin's book "Changing Planes" by any chance? If I recall correctly, there's a chapter where the viewpoint character is visiting a library, and reads a legend about two rival cities with a river border, and the outcome is rather similar.

Comment by DaemonicSigil on The central limit theorem in terms of convolutions · 2020-11-22T01:38:05.730Z · LW · GW

isn't actually the variance here. The variance is . Sorry for the confusing choice of variable name.

Comment by DaemonicSigil on The central limit theorem in terms of convolutions · 2020-11-21T21:40:36.960Z · LW · GW

Thanks for the interesting post. Looking forward to seeing where this series is going. Mostly for my own understanding, I'm going to mess around with Fourier transforms and convolutions in this comment.

Proof that the Fourier transform of the convolution of 2 functions is the product of their Fourier transforms:

Let . Then, holding constant, . So:
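For reference, a sketch of the standard argument in LaTeX. The convention $\hat f(k) = \int f(x)\, e^{-ikx}\, dx$, the definition $(f * g)(x) = \int f(y)\, g(x - y)\, dy$, and the symbol names $u$, $y$, $k$ are assumptions made for this sketch and may differ from the ones used in the comment.

```latex
\begin{align*}
\widehat{f * g}(k)
  &= \int_{-\infty}^{\infty} dx \, e^{-ikx} \int_{-\infty}^{\infty} dy \, f(y)\, g(x - y) \\
  &= \int_{-\infty}^{\infty} dy \, f(y) \int_{-\infty}^{\infty} dx \, g(x - y)\, e^{-ikx} \\
  % substitute u = x - y; holding y constant, du = dx
  &= \int_{-\infty}^{\infty} dy \, f(y)\, e^{-iky} \int_{-\infty}^{\infty} du \, g(u)\, e^{-iku} \\
  &= \hat f(k)\, \hat g(k)
\end{align*}
```

Other normalizations of the Fourier transform only change the result by a constant factor.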

Fourier transform of a general Gaussian distribution, :

Let .

Now we complete the square:

Let , so that .

We get a result of for the integral in using a clever trick []. So the result is:
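A sketch of the same computation in LaTeX, under assumptions that may not match the comment's notation: the standard parametrization in which $\sigma^2$ is the variance, and the convention $\hat f(k) = \int f(x)\, e^{-ikx}\, dx$. The substitution variables $y$ and $u$ are names chosen for this sketch.

```latex
\begin{align*}
f(x) &= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) \\
\hat f(k)
  &= e^{-i\mu k} \int_{-\infty}^{\infty} \frac{dy}{\sqrt{2\pi\sigma^2}}
     \exp\!\left(-\frac{y^2}{2\sigma^2} - iky\right)
     && (y = x - \mu) \\
  &= e^{-i\mu k}\, e^{-\sigma^2 k^2 / 2} \int_{-\infty}^{\infty} \frac{dy}{\sqrt{2\pi\sigma^2}}
     \exp\!\left(-\frac{(y + i\sigma^2 k)^2}{2\sigma^2}\right)
     && \text{(completing the square)} \\
  &= e^{-i\mu k}\, e^{-\sigma^2 k^2 / 2} \cdot \frac{\sqrt{2}\,\sigma}{\sqrt{2\pi\sigma^2}}
     \int_{-\infty}^{\infty} e^{-u^2}\, du
     && \left(u = \frac{y + i\sigma^2 k}{\sqrt{2}\,\sigma}\right) \\
  &= e^{-i\mu k - \sigma^2 k^2 / 2}
     && \left(\int_{-\infty}^{\infty} e^{-u^2}\, du = \sqrt{\pi}\right)
\end{align*}
```

So under these assumptions the transform of a Gaussian is again a Gaussian in $k$, with width inversely proportional to $\sigma$, times a phase set by the mean.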

Comment by DaemonicSigil on The ethics of AI for the Routledge Encyclopedia of Philosophy · 2020-11-19T01:01:20.880Z · LW · GW

I don't know if the entry is intended to cover speculative issues, but if so, I think it would be important to include a discussion of what happens when we start building machines that are generally intelligent and have internal subjective experiences. Usually with present-day AI ethics concerns, or AI alignment, we're concerned about the AI taking actions that harm humans. But as our AI systems get more sophisticated, we'll also have to start worrying about the well-being of the machines themselves.

Should they have the same rights as humans? Is it ethical to create an AGI with subjective experiences whose entire reward function is geared towards performing mundane tasks for humans? Is it even possible to build an AI that is highly intelligent and very general, and yet does not have subjective experiences and does not simulate beings with subjective experiences? And so on.

Comment by DaemonicSigil on When Money Is Abundant, Knowledge Is The Real Wealth · 2020-11-10T04:51:00.574Z · LW · GW

What is the process for choosing the state censor, and the long term planning geniuses? How do these processes systematically select for very intelligent people with reliably correct beliefs?

Comment by DaemonicSigil on Multiple Worlds, One Universal Wave Function · 2020-11-06T05:31:53.939Z · LW · GW

I'd say that only a part of the section on falsifiability is nonsense, not the whole thing.

Paragraph 2 is reasonable. Physically speaking, you more or less have 2 options: Arbitrary objects either can be placed in superposition, or they can't. Unsurprisingly, this is experimentally testable by trying to put progressively larger and larger objects into superposition and seeing if you are successful. (Or more complicated objects, or more massive objects, whatever.) This also has practical consequences for our ability to build quantum computers.

Paragraph 3 is speculative, but also reasonable: If you believe that arbitrary objects can be placed into superposition, then you had better believe that applies to very massive objects and their gravitational fields. This experiment hasn't been done yet, but it could be done in principle, and it's even possible to do without messing around with supermassive black holes or Planck-energy accelerators. [1]

Paragraph 4 is a super sketchy anthropic-style argument that should definitely have been left out.

Paragraph 5 is about simplicity, which is great and all, but doesn't really bear on how falsifiable a theory is. Also should have been left out, or at least moved to the section on simplicity. (The simplicity advantage of decoherence over objective collapse theories is overrated IMO. Yes, decoherence is probably somewhat simpler than objective collapse, but only by a handful of Kolmogorov bits. Not really enough to be conclusive, so we should just do the experiments.)


Comment by DaemonicSigil on Intuitive Lagrangian Mechanics · 2020-10-31T18:21:03.463Z · LW · GW

Ah thanks, I've got it now. My browser seems to not like CMD-4, but putting dollar signs in markdown worked.

Comment by DaemonicSigil on Intuitive Lagrangian Mechanics · 2020-10-31T18:00:01.355Z · LW · GW

Nice. Some people in the comments were asking about the actual Taylor expansion for kinetic energy. So here it is (assuming ): (Since the derivative of is , we have for small .) So, up to an additive constant, we have . Also, if we want to account for non-gravitational potential energy as well, we can note that in special relativistic units, in free space, and in a potential. ( being the energy of the particle measured in that particle's frame.) So, assuming :
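A sketch of the expansion in LaTeX. It assumes units with $c = 1$, the free relativistic Lagrangian $L = -m\sqrt{1 - v^2}$, and that a potential $V$ enters by replacing the rest-frame energy $m$ with $m + V$; these symbols and the starting Lagrangian are assumptions of this sketch and may not match the original notation.

```latex
\begin{align*}
\sqrt{1 - v^2} &\approx 1 - \tfrac{1}{2} v^2
  && \text{for small } v \\
L = -m\sqrt{1 - v^2} &\approx -m + \tfrac{1}{2} m v^2
  && \text{(free particle)} \\
L = -(m + V)\sqrt{1 - v^2}
  &\approx -m - V + \tfrac{1}{2} m v^2 + \tfrac{1}{2} V v^2
  && \text{(in a potential)} \\
  &\approx \tfrac{1}{2} m v^2 - V + \text{const}
  && \text{assuming } V \ll m
\end{align*}
```

The dropped term $\tfrac{1}{2} V v^2$ is smaller than the kinetic term by a factor of $V/m$, which is why the familiar $T - V$ form appears at this order.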

Comment by DaemonicSigil on The Darwin Game · 2020-10-12T04:37:15.174Z · LW · GW

How do you determine who gets the first 3? Maybe lsusr will be kind enough to provide a symmetry-breaking bit in the "extra" package. (It would only be fair, given that bots playing themselves are automatically given max score.) If not, and you have to do things the hard way, do you compare source code alphabetically, and favour X over Y on even rounds and Y over X on odd rounds?

Also, it may be a good idea to make the level of defection against outsiders depend on the round number. i.e. cooperate at first to maximize points, then after some number of rounds, when you're likely to be a larger proportion of the pool, switch to defecting to drive the remaining bots extinct more quickly.
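The "hard way" of breaking symmetry could be sketched like this in Python. The game rules assumed here (each bot plays an integer, and two cooperating bots want their moves to be 3 and 2 so that they sum to 5) and the names `cooperative_move`, `should_defect_against_outsiders`, and `switch_round` are assumptions for illustration, not part of lsusr's actual interface.

```python
def cooperative_move(my_source: str, their_source: str, round_number: int) -> int:
    """Pick 3 or 2 against a recognized ally so the two moves sum to 5.

    The bot whose source sorts first alphabetically takes the 3 on even
    rounds; the other bot takes it on odd rounds.
    """
    if my_source == their_source:
        # Identical source gives no way to break symmetry; play it safe.
        return 2
    i_sort_first = my_source < their_source
    even_round = (round_number % 2 == 0)
    my_turn_for_three = (even_round == i_sort_first)
    return 3 if my_turn_for_three else 2


def should_defect_against_outsiders(round_number: int, switch_round: int) -> bool:
    """Cooperate with outsiders early, defect once switch_round is reached."""
    return round_number >= switch_round


# Two allied bots with different source always split 3/2 and alternate.
x_src, y_src = "bot_a_source", "bot_b_source"
for r in range(4):
    print(r, cooperative_move(x_src, y_src, r), cooperative_move(y_src, x_src, r))
```

Because both bots evaluate the same comparison from opposite sides, they never both claim the 3 in the same round, without needing any shared random bit.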

Comment by DaemonicSigil on Fast Takeoff in Biological Intelligence · 2020-04-25T21:56:06.186Z · LW · GW

Another objection is that improvements in biological intelligence will tend to feed into improvements in artificial intelligence. For example, maybe after a couple of generations of biological improvement, the modified humans will be able to design an AI that quickly FOOMs and overtakes the slow generation by generation biological progress.

(It seems likely that once you've picked the low hanging fruit like stuffing people's genomes as full of intelligence-linked genes as possible without giving them genetic diseases, it will be much easier to implement any new intelligence improvements you can think of in code, rather than in proteins. The human brain is a much more sophisticated starting point than any current AI programs, but is probably much harder to modify significantly.)

Comment by DaemonicSigil on Tessellating Hills: a toy model for demons in imperfect search · 2020-02-22T19:15:38.938Z · LW · GW

That's very cool, thanks for making it. At first I was worried that this meant that my model didn't rely on selection effects. Then I tried a few different random seeds, and some, like 1725, didn't show demon-like behaviour. So I think we're still good.

Comment by DaemonicSigil on Tessellating Hills: a toy model for demons in imperfect search · 2020-02-21T01:28:34.289Z · LW · GW

No regularization was used.

I also can't see any periodic oscillations when I zoom in on the graphs. I think the wobbles you are observing in the third phase are just a result of the random noise that is added to the gradient at each step.

Comment by DaemonicSigil on Tessellating Hills: a toy model for demons in imperfect search · 2020-02-21T00:55:14.339Z · LW · GW

Thanks, and your summary is correct. You're also right that this is a pretty contrived model. I don't know exactly how common demons are in real life, and this doesn't really shed much light on that question. I mainly thought that it was interesting to see that demon formation was possible in a simple situation where one can understand everything that is going on.

Comment by DaemonicSigil on Tessellating Hills: a toy model for demons in imperfect search · 2020-02-21T00:45:20.534Z · LW · GW

Thanks. I initially tried putting the code in a comment on this post, but it ended up being deleted as spam. It's now up on github. It isn't particularly readable, for which I apologize.

The initial vector has all components set to 0, and the charts show the evolution of these components over time. This is just for a particular run; there isn't any averaging. x0 gets its own chart, since it changes much more than the other components. If you want to know how the loss varies with time, you can just flip figure 1 upside down to get a pretty good proxy, since the splotch functions are of secondary importance compared to the -x0 term.

Comment by DaemonicSigil on Tessellating Hills: a toy model for demons in imperfect search · 2020-02-20T00:15:04.776Z · LW · GW

Here is the code for people who want to reproduce these results, or just mess around:

import torch
import numpy as np
import matplotlib.pyplot as plt

DIMS = 16   # number of dimensions that xn has
WSUM = 5    # number of waves added together to make a splotch
EPSILON = 0.0025 # rate at which xn controls splotch strength
TRAIN_TIME = 5000 # number of iterations to train for
LEARN_RATE = 0.2   # learning rate


# knlist and k0list are integers, so the splotch functions are periodic
knlist = torch.randint(-2, 3, (DIMS, WSUM, DIMS)) # wavenumbers : list (controlling dim, wave id, k component)
k0list = torch.randint(-2, 3, (DIMS, WSUM))       # the x0 component of wavenumber : list (controlling dim, wave id)
slist = torch.randn((DIMS, WSUM))                # sin coefficients for a particular wave : list(controlling dim, wave id)
clist = torch.randn((DIMS, WSUM))                # cos coefficients for a particular wave : list (controlling dim, wave id)

# initialize x0, xn
x0 = torch.zeros(1, requires_grad=True)
xn = torch.zeros(DIMS, requires_grad=True)

# numpy arrays for plotting:
x0_hist = np.zeros((TRAIN_TIME,))
xn_hist = np.zeros((TRAIN_TIME, DIMS))

# train:
for t in range(TRAIN_TIME):
    ### model:
    # phase of each wave: k . xn + k0 * x0, shape (DIMS, WSUM)
    wavesum = torch.sum(knlist*xn, dim=2) + k0list*x0
    # add up the waves for each controlling dimension, shape (DIMS,)
    splotch_n = torch.sum(
            (slist*torch.sin(wavesum)) + (clist*torch.cos(wavesum)),
        dim=1)
    foreground_loss = EPSILON * torch.sum(xn * splotch_n)
    loss = foreground_loss - x0
    # compute gradients of the loss with respect to x0 and xn
    loss.backward()
    with torch.no_grad():
        # constant step size gradient descent, with some noise thrown in
        vlen = torch.sqrt(x0.grad*x0.grad + torch.sum(xn.grad*xn.grad))
        x0 -= LEARN_RATE*(x0.grad/vlen + torch.randn(1)/np.sqrt(1.+DIMS))
        xn -= LEARN_RATE*(xn.grad/vlen + torch.randn(DIMS)/np.sqrt(1.+DIMS))
    # clear the accumulated gradients before the next step
    x0.grad.zero_()
    xn.grad.zero_()
    x0_hist[t] = x0.item()
    xn_hist[t] = xn.detach().numpy()

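# x0 gets its own chart, since it changes much more than the other components
plt.plot(x0_hist)
plt.xlabel('number of steps')
plt.ylabel('x0')
plt.show()
# one curve per component of xn
for d in range(DIMS):
    plt.plot(xn_hist[:, d])
plt.xlabel('number of training steps')
plt.ylabel('xn')
plt.show()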