It has already got some spread. Michael Nielsen shared it on Twitter (126 likes and 29 RTs as of writing).

Thanks for this, I'll be sharing it on /r/slatestarcodex and Hacker News (rationalist discords too if it comes up).

Maybe for the most efficient possible algorithm, but even that is not clear, and it's not clear we'll discover such algorithms anytime soon.

Using only current algorithms and architecture, a scaling jump of a few orders of magnitude seems doable.

Typo:

But this seems arbitrary: why should the fact that S’s causal influence on whether there’s money in the opaque box or not go via another agent **much** such a big difference?

The **bolded** word should be "make", I think.

What’s more, the outcomes don’t scale smoothly with your level of skill. When rare, high leverage opportunities come around, being slightly more rational can make a huge difference. Bitcoin was one such opportunity; meeting my wife was another such one for me. I don’t know what the next one will be: an emerging technology startup? a political upheaval? cryonics? I know that the world is getting weirder faster, and the payouts to Rationality are going to increase commensurately.

I think COVID-19 has been another one. Many rats seem to have taken it seriously back in Jan/Feb.

Wei Dai made some money shorting the stock market.

I don't think using likelihoods when publishing in journals is tractable.

- Where did your priors come from? What if other scientists have different priors? Justifying the chosen prior seems difficult.
- Where did your likelihood ratios come from? What if other scientists disagree?

P values may have been a failed attempt at objectivity, but they're a better attempt than moving towards subjective probabilities (even though the latter is more correct).

This was very refreshing to read. I'm glad EY has realised that mocking silly ideas doesn't actually help (it makes adherents of the idea double down and become much less likely to listen to you, and it may also alienate some neutrals; this is particularly true for ideas that have gained currency, like the Abrahamic religions). I wasn't able to recommend The Sequences to Christian friends previously because of its antireligiosity; here's hoping this version will be better.

I don't understand why being an embedded agent makes Bayesian reasoning impossible. My intuition is that a hypothesis doesn't have to be perfectly correlated with reality to be useful. Furthermore, suppose you conceived of hypotheses as conjunctions of elementary hypotheses; then I see no reason why you cannot perform Bayesian reasoning of the form "hypothesis X is one of the constituents of the true hypothesis", even if the agent can't perfectly describe the true hypothesis.

Also, "the agent is larger/smaller than the environment" is not very clear, so I think it would help if you would clarify what those terms mean.

Not AGI per se, but aligned and beneficial AGI. Like I said, I'm a moral nihilist/relativist and believe no objective morality exists. I do think we'll need a coherent moral system to fulfill our cosmic endowment via AGI.

My response to this was initially going to be a tongue-in-cheek "I'm a moral nihilist, and there's no sense in which one moral system is intrinsically better than another, as morality is not a feature of the territory". However, I won't respond that way, as solving morality is essential to the problem of creating aligned AI. There may be no objectively correct moral system, or intrinsically better moral system, or any good moral system, but we still need a coherent moral framework from which to generate our AI's utility function if we want it to be aligned. So morality is important, and we do need to develop an acceptable solution to it.

I don't see why better algorithms being more complex is a problem?

I disagree that intelligence and rationality are more fundamental than physics; the territory itself is physics, and that is all that is really there. Everything else (including the body of our knowledge) is a model for navigating that territory.

Turing formalised computation and established the limits of computation given certain assumptions. However, those limits only apply as long as the assumptions are true. Turing did **not** prove that no mechanical system is superior to a Universal Turing Machine, and weird physics may enable super-Turing computation.

The point I was making is that our models are only as good as their correlation with the territory. The abstract models we have aren't part of the territory itself.

Group A was most successful in the field of computation, so I have high confidence that their approach would be successful in intelligence as well (especially in intelligence of artificial agents).

I consider myself a rational realist, but I don't believe some of the things you attribute to rational realism (particularly concerning morality and consciousness). I don't think there's a true decision theory or true morality, but I do think that you could find systems of reasoning that are provably optimal within certain formal models.

There is no sense in which our formal models are true, but as long as they have high predictive power the models would be useful, and that I think is all that matters.

If Tiffany's performance is good enough, then Tiffany is still best described as optimising for tic tac toe performance, because:

- Tic tac toe performance has high predictive power as a hypothesis for Tiffany's utility function.
- Tic tac toe performance has relatively low complexity when compared to other hypotheses with comparable predictive power.

This changes if Tiffany's performance is not sufficiently high (in which case there may be some other low complexity objective function that Tiffany is best described as optimising).

If we choose P, then X "exists" iff P appears somewhere in (a symbolic representation of) our universe. If we choose P2, then X "exists" iff P2 appears somewhere in (a symbolic representation of) our universe.

There is no reason why these two propositions should be equivalent. So whether we choose P or P2 may make a difference to whether or not X "exists".

As I understand this, the two propositions are equivalent. We do not arbitrarily pick P or P2 (we assume that one is picked and is picked consistently). What this means is that the G(TM) will also pick P or P2 consistently. If G(TM) outputs P, then it would never output P2, and vice versa. Only one of the members of the equivalence class becomes the shortest program, and that member represents the entire class every time the class is invoked. So the shortest program will be computed by G consistently when simulating our universe.

My more fundamental objection is that it seems perfectly obvious to me that whether P appears in G(Q) has nothing whatever to do with whether X exists, because there is no reason why the universe should contain programs implementing all its objects.

Do you agree that the universe can be simulated?

I think this is the reason why a distinction between subjective and objective probability is needed.

See this.

I think AlexMennen hit the nail on the head: our utility is bounded, and at a low number.

My first guesses have something to do with how likely the recipient is to return the favor in some fashion.

This sounds more like the handiwork of evolution to me.

As I said in the other thread:

I've changed my mind. I agree with this.

The probability of you getting struck by lightning and dying while making your decision is .

The probability of you dying by a meteor strike, by an earthquake, by .... is .

The probability that you don't get to complete your decision for one reason or the other is .

It doesn't make sense then to entertain probabilities vastly lower than , but not entertain probabilities much higher than .

What happens is that our utility is bounded at or below .

This is because we ignore probabilities at that order of magnitude via revealed preference theory.

If you value at times more utility than , then you are indifferent between exchanging for a times chance of getting .

I'm not indifferent between exchanging for a chance of anything, so my utility is bounded at . It is possible that others deny this is true for them, but they are being inconsistent. They ignore other events with higher probability which may prevent them from deciding, but consider events they assign vastly lower probability.

Realising that your utility function is bounded is sufficient to reject Pascal's mugging.

That said, I agree with you.

I disagree with your conclusion (as I explained here), but I'm upvoting anyway for the content of the post.

Also, it is possible to have a linear utility function and still reject Pascal's mugger if:

- Your linear utility function is bounded.
- You apply my rule.

Thus, LUH is independent of rejecting Pascal's mugger?

This result has even stronger consequences than that the Linear Utility Hypothesis is false, namely that utility is bounded.

Not necessarily. You don't need to suppose bounded utility to explain rejecting Pascal's mugger.

My reason for rejecting Pascal's mugger is this.

If there exists a set of states E_j such that:

- P(E_j) < epsilon.
- There does not exist E_k (E_k is not a subset of E_j and P(E_k) < epsilon).

Then I ignore E_j in decision making in singleton decision problems. In iterated decision problems, the value for epsilon depends on the number of iterations.

I don't have a name for this principle (and it is an ad-hoc patch I added to my decision theory to prevent EU from being dominated by tiny probabilities of vast utilities).

This patch is different from bounded utility, because you might ignore a set of states in a singleton problem, but consider the same set in an iterated problem.
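One possible reading of the rule above, as a minimal sketch: it assumes a finite list of mutually exclusive outcomes, treats every outcome whose individual probability falls below `epsilon` as part of the ignored set, and does not renormalise the remaining probabilities. The function names, the threshold, and the mugging numbers are all illustrative assumptions, and only the singleton case is covered.

```python
def expected_utility(outcomes):
    """Plain expected utility over (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)


def truncated_expected_utility(outcomes, epsilon):
    """Expected utility that ignores outcomes with probability below epsilon.

    Assumption: the remaining probabilities are not renormalised.
    """
    return sum(p * u for p, u in outcomes if p >= epsilon)


# Hypothetical mugging: lose 5 utils almost surely, with a 1e-20 chance of 1e30 utils.
pay_mugger = [(1 - 1e-20, -5.0), (1e-20, 1e30)]

naive = expected_utility(pay_mugger)
patched = truncated_expected_utility(pay_mugger, epsilon=1e-9)
```

With these made-up numbers the unpatched calculation favours paying, while the patched one does not.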

My point is that, for me, extinction is not equivalent to losing the current number of lives now.

Human extinction now for me is worse than losing 10 trillion people, if the global population was 100 trillion.

This is because extinction destroys all potential future utility. It destroys the potential of humanity.

I'm saying that extinction can't be evaluated normally, so you need a better example to state your argument against LUH.

Extinction now is worse than losing X people, if the global human population is 10X, regardless of how large X is.

That position above is **independent** of the linear utility hypothesis.

**There is no finite number of lives that reach utopia, for which I would accept Omega's bet at a 90% chance of extinction.**

This does not mean I would accept Omega's bet at 89% chance of extinction (I wouldn't), but 90% is far above my ceiling, and I'm very sure of it.

See this thread. **there is no finite number of lives that reach utopia, for which I would accept Omega's bet at a 90% chance of extinction**.

Human extinction now for me is worse than losing 10 trillion people, if the global population was 100 trillion.

My disutility of extinction isn't just the number of lives lost. It involves the termination of all future potential of humanity, and I'm not sure how to value that, but see the bolded.

I don't assign a disutility to extinction, and my preference with regard to extinction is probably lexicographic with respect to some other things (see above).

Homosexuality being influenced by genes does not mean one cannot choose to change their sexuality.

I disagree that this is true for everyone.

: 1: Humanity grows into a vast civilization of 10^100 people living long and happy lives, or 2: a 10% chance that humanity grows into a vast civilization of 10^102 people living long and happy lives, and a 90% chance of going extinct right now. I think almost everyone would pick option 1, and would think it crazy to take a reckless gamble like option 2. But the Linear Utility Hypothesis says that option 2 is much better. Most of the ways people respond to Pascal's mugger don't apply to this situation, since the probabilities and ratios of utilities involved here are not at all extreme.

This is not an airtight argument.

Extinction of humanity is not 0 utils, it's negative utils. Let the utility of human extinction be -X.

If X > 10^101, then a linear utility function would pick option 1.

Linear Expected Utility (LEU) of option 1:

1.0(10^100) = 10^100.

LEU of option 2:

0.9(-X) + 0.1(10^102) = 10^101 + 0.9(-X)

10^101 - 0.9X < 10^100

-0.9X < 10^100 - 10^101

-0.9X < -9(10^100)

X > 10^101.
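The threshold in the derivation above can be checked with exact arithmetic. This is only a sketch of the same calculation; the constant and function names are illustrative, and `X` is the assumed disutility of extinction.

```python
from fractions import Fraction

UTOPIA_SMALL = 10**100  # option 1: a certain civilisation of 10^100 people
UTOPIA_LARGE = 10**102  # option 2: a 10% chance of 10^102 people


def leu_option_1():
    """Linear expected utility of option 1."""
    return Fraction(UTOPIA_SMALL)


def leu_option_2(extinction_disutility):
    """0.9 * (-X) + 0.1 * 10^102, computed exactly."""
    return Fraction(9, 10) * (-extinction_disutility) + Fraction(1, 10) * UTOPIA_LARGE


threshold = 10**101  # the value of X at which the two options tie
```

At `X = threshold` the two options have equal linear expected utility; for any larger `X`, option 1 comes out ahead.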

I rate the disutility of human extinction very highly, as it curtails any possible future. So X is always at least as high as the utility of the utopia. I would not accept any utopia where the probability of human extinction was > 0.51, because the disutility of human extinction outweighs the utility of utopia for any possible utopia.

It's been almost four months since I wrote this thread. I've started to see the outline of an answer to my question. Over the course of the next year, I will begin documenting it.

Some values are fixed, for example, sexual orientation,

This is false, at least if you are making that statement for everyone. I don't know whether it's true for the vast majority of people.

I think this is a sensible idea; however, I don't think I'm at the level for this to be useful advice for me: I have a *severe* knowledge debt.

I'll take a look at it.

But that arbitrary choice will make a difference to whether or not we say that X and Y exist, and *that* is what bothers me.

I do not understand how this follows from:

In order to decide whether (say) X exists, we have to *make an arbitrary choice* of whether to use P or P2. We will then use the same choice for Y as well, so X and Y will come out the same way, and that's fine.

If there's a problem, then it is the inference you're making from the second quote to the first. That is the cause of the misunderstanding. I would appreciate it if you explained why you think the arbitrary choice makes a difference to whether an object exists (that is not supposed to happen, and I designed it so it wouldn't).

It would not have a different answer. If P is chosen for Y, and X and Y belong to the same equivalence class, then X would be computed as P, and vice versa for P2.

P.S: sorry for the late reply.

The constraint of consistency means that all programs equivalent to a certain program would be represented by one single program. I think it may be useful to define an equivalence class for each program. For each equivalence class, one member of that class would be substituted for any occurrence of the other members of the class.

There may be some other shortest program P2 that implements (equivalents of) them all, and P2 is completely ignored here.

If P2 is shorter than P, then according to the definition of information content, P2 would be selected instead of P.

Again: I'm happy to wait for further enlightenment until you have time to write the thing up properly. I'm replying to you because you replied to me :-), but would not want you to feel any pressure to keep responding here while you're busy with other things.

My plan prior to starting my "Meditations on Ontology" sequence:

- Complete a course (autodidactry) in analytical philosophy.
- Learn information theory.
- Learn computability theory.
- Learn causality.
- Learn relevant maths.

I graduate next year, so there's hope.

Education is not mainly a form of signalling.

Getting an uber expensive education is signalling, getting an education itself is not.

Agreed.

Remember the notion of structural and functional equivalence.

Consider X. We compute the shortest program that produces X as output. If Y is structurally equivalent to X, then Y is functionally equivalent to X.

If Y is structurally equivalent to X, then the information content of Y and X is the same.

Given that a certain object exists, what can be said to exist is any object structurally equivalent to that object. If X, X', and X'' all have the same information content, and C_G(X) is in G(Q), then X, X' and X'' can each be said to exist. They are the same object.

Restrictions on interpretation should be added as a problem, to be honest. Hopefully, the global criteria for simulation resolve that.

Examples, clarifications, etc would be in the comments.

I don't want to end up making this too long to be a stub, but too short/unrigorous to be the full article. Given that I can't write the full article yet, I'll try to make the post as short as possible.

I have edited it and I think I've satisfactorily answered the points you raised. This is meant to be a stub, so I focused on brevity—necessarily at the expense of pedagogy—but I would be willing to address any concerns you have.

A lot of heavy lifting is done by my understanding of the concepts. I'm not sure how much I can explain the ontology without writing the full article (a task I'm not ready to do yet), but I will do my best.

I have an answer. I'll edit this post in a few hours (exam next hour) with my answer.

I'll make a series of stubs that present my answer soon.

The second stub is this.

A property of an object is a relation:

The relation is between the object and another object "atoms". (Number of X that compose Y).

A composition of relations is a relation.

Paris is in France.

France is in Europe.

Paris is in Europe.

Paris is the capital of France.

France is on Earth.

Paris is the capital of a country on Earth.
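The two examples above can be sketched as composition of binary relations, with each relation modelled as a set of ordered pairs. This is only an illustration; the helper `compose` and the relation names are assumptions introduced here.

```python
def compose(first, second):
    """Return the pairs (a, c) such that (a, b) is in first and (b, c) is in second."""
    return {(a, c) for a, b in first for b2, c in second if b == b2}


is_in = {("Paris", "France"), ("France", "Europe")}
capital_of = {("Paris", "France")}
is_on = {("France", "Earth")}

# "Paris is in France" and "France is in Europe" give "Paris is in Europe".
in_then_in = compose(is_in, is_in)

# "Paris is the capital of France" and "France is on Earth" give
# "Paris is the capital of a country on Earth".
capital_of_something_on = compose(capital_of, is_on)
```

The result of each composition is itself a set of pairs, that is, a relation.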

Perhaps because I've already arrived at it independently, this felt lacking.

But I agree with what you've said.

In my ontology, I have some expansions that I will post soon.

As for "exists", I find it useful to distinguish between "is manifest in some model" and "is manifest in the territory"—you have not made that distinction.

I agree with all your criticisms. I also think the article is wrong and didn't update except against Chollet, but I found the article educational.

Things I learned.

- No free lunch theorem. I'm very grateful for this, and it made me start learning more about optimisation.
- From the above: there is no general intelligence. I had previously believed that finding a God's algorithm for optimisation would be one of the most significant achievements of the century. I discovered that's impossible.
- Exponential growth does not imply exponential progress as exponential growth may meet exponential bottlenecks. This was also something I didn't appreciate. Upgrading from a level n intelligence to a level n+1 intelligence may require more relative intelligence than upgrading from a level n-1 to a level n. Exponential bottlenecks may result in diminishing marginal growth of intelligence.

The article may have seemed of significant pedagogical value to me because I hadn't met these ideas before. For example, I have just started reading the Yudkowsky-Hanson AI foom debate.

Going to reply here. I think the author is completely wrong, but you're missing several things.

Interpret this as a steelman. I do not agree with the author's conclusions or argument, but I think the essay was of pedagogical value. I think you're prematurely dismissing it.

This is trivial to prove. If brains are not even "intelligent", they can hardly be "generally intelligent". ;)

There is no generally intelligent algorithm. If you accept that intelligence is defined in terms of optimisation power, there is no intelligent algorithm that outperforms random search on all problems.

Worse, there is no intelligent algorithm that outperforms random search on *most* problems; this profound result is called the No Free Lunch Theorem.

If you define general intelligence as an intelligent algorithm that can optimise on all problems, then random search (and its derivatives) are the only generally intelligent algorithms.
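A toy check of the averaged form of the No Free Lunch statement, as a sketch under stated assumptions: a three-point search space, binary objective values, search strategies that never revisit a point, and "best value seen within a fixed budget" as the performance measure. It enumerates every possible objective function and compares two fixed search orders with one adaptive strategy; all names here are illustrative.

```python
from fractions import Fraction
from itertools import product

DOMAIN = (0, 1, 2)
VALUES = (0, 1)


def fixed_order(order):
    """Strategy that visits points in a predetermined order."""
    def choose(history):
        return order[len(history)]
    return choose


def adaptive(history):
    """Strategy whose second query depends on the first observed value."""
    if not history:
        return 0
    if len(history) == 1:
        return 1 if history[0][1] == 1 else 2
    visited = {x for x, _ in history}
    return next(x for x in DOMAIN if x not in visited)


def best_after(choose, f, budget):
    """Best objective value found after `budget` distinct evaluations of f."""
    history = []
    for _ in range(budget):
        x = choose(history)
        history.append((x, f[x]))
    return max(y for _, y in history)


def average_performance(choose, budget):
    """Mean of best_after over every function from DOMAIN to VALUES."""
    functions = list(product(VALUES, repeat=len(DOMAIN)))
    total = sum(best_after(choose, f, budget) for f in functions)
    return Fraction(total, len(functions))
```

Averaged over all eight objective functions, the three strategies score the same for a budget of two evaluations. This illustrates only the average-over-all-functions form of the theorem on a tiny space, not any claim about particular problem classes.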

Yeah, someone has a clever definition of "highly specialized". Using this definition, even AIXI would be "highly specialized" in the problem of being AIXI. And the hypothetical recursively self-improving general artificial intelligence is also "highly specialized" in the problem of being a recursively self-improving general artificial intelligence. No need to worry about it becoming *too* smart.

This follows from the fact that there is no generally intelligent algorithm (save random search). The vast majority of potential optimisation problems are intractable (I would say pathological, but I'm not sure that makes sense when I'm talking about the majority of problems). Most optimisation problems cannot be solved except via exhaustive search. Humanity's cognitive architecture is highly specialised in the problems it can solve. This is true for all non-exhaustive search methods.

Today I learned: Exceptionally high-IQ humans are incapable of solving major problems.

The majority of exceptionally high-IQ humans do not in fact solve major problems. There are millions of people in the IQ 150+ range. How many of them are academic heavyweights (Nobel laureates, Fields medalists, ACM Turing Award winners, etc.)?

...giving up in the middle of the article, because I expect the rest to be just more of the same.

I think you should finish it.

Well, that's a bad analogy for school fees. They usually result in skills transfer.

School fees aren't purely positional.