## Comments

**rhaps0dy** on Cryonics signup guide #1: Overview · 2021-06-10T15:22:43.876Z · LW · GW

If I'm highly likely to move around the world several times in the next ~5 years, what can I do? Is it still a good idea to sign up for cryonics in whatever place I'm at the moment?

**rhaps0dy** on Löb's theorem simply shows that Peano arithmetic cannot prove its own soundness · 2021-04-22T11:34:20.458Z · LW · GW

I see, I misunderstood indeed.

What you wrote here is the literal statement of the theorem. It offers no extra understanding. I suppose you're right and I just wrote out a corollary, but I think the corollary is illuminating. I don't understand the comparison to induction in this comment or your previous one :(

**rhaps0dy** on Löb's theorem simply shows that Peano arithmetic cannot prove its own soundness · 2021-04-22T11:32:08.834Z · LW · GW

So the issue was that necessitation is a rule outside of PA. Thank you! It seems to be assumed in modal logic (or at least in the relevant modal logic), though, which would imply that that modal logic assumes PA is sound. It's still a bit weird to me that Löb's theorem is also true in that case.

**rhaps0dy** on Löb's theorem simply shows that Peano arithmetic cannot prove its own soundness · 2021-04-22T10:58:02.932Z · LW · GW

> In PA, when proving something, you get the lemma that it is provable for free

As I understand it, this verbal statement can be translated to $\vdash P \to \square P$, or at least $(\vdash P) \Rightarrow (\vdash \square P)$. The first is false[1] (Gödel's 1st incompleteness theorem), the second is not obviously related to Löb's theorem.

[1] also related to necessitation, a proof rule that substitutes $\vdash \square P$ for $\vdash P$

**rhaps0dy** on Löb's theorem simply shows that Peano arithmetic cannot prove its own soundness · 2021-04-22T09:13:27.387Z · LW · GW

I think you're right, the statement is intuitively soundness, but I'm still confused. I would be very grateful if you could clear it up. Is "soundness $\Rightarrow$ consistency" what you mean by stronger? If we can assume necessitation, soundness and consistency are equivalent; proof below.

Does that mean we cannot assume the necessitation proof rule, that is: if $\vdash P$, add $\vdash \square P$? This would be consistent with "PA cannot prove its own completeness": the necessitation rule would imply that completeness is true.

First, we prove that consistency implies soundness by modus tollens. Suppose no soundness: $\square P \wedge \neg P$ for some $P$; then $\neg P$. From $\neg P$, by necessitation, $\square \neg P$, and so overall $\square P \wedge \square \neg P$, that is, $\square \bot$. We have proven $\neg \text{soundness} \Rightarrow \neg \text{consistency}$; by modus tollens, $\text{consistency} \Rightarrow \text{soundness}$.

Second, we can prove that soundness implies consistency without necessitation, by modus tollens and contradiction. Suppose $\square \bot$. Further supposing soundness, $\square \bot \to \bot$, clearly proves $\bot$, so we reject the assumption, thus $\neg (\square \bot \to \bot)$. Thus we prove $\neg \text{consistency} \Rightarrow \neg \text{soundness}$; by modus tollens, $\text{soundness} \Rightarrow \text{consistency}$.
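As a sketch, the two arguments can be compressed into provability-logic notation ($\square$ for "provable in PA", $\bot$ for falsity); this is my own restatement of the reasoning above, not a formal derivation:

```latex
% Consistency -> soundness (uses necessitation):
% if soundness fails, there is some P with
\square P \wedge \neg P
\;\Rightarrow\; \square P \wedge \square \neg P  % necessitation on \neg P
\;\Rightarrow\; \square \bot                     % PA proves both P and \neg P
% Contrapositive: \neg\square\bot (consistency) implies soundness.

% Soundness -> consistency (no necessitation needed):
\square \bot \wedge (\square \bot \to \bot) \;\Rightarrow\; \bot
% Assuming \square\bot forces rejecting soundness;
% contrapositive: soundness implies \neg\square\bot (consistency).
```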

Where did I go wrong? Is it the necessitation deduction rule? Is it that the necessitation deduction rule is valid, but cannot be used *within* a statement of PA?

**rhaps0dy** on Predictive Coding has been Unified with Backpropagation · 2021-04-03T22:53:53.519Z · LW · GW

I don't, and my best guess is that nobody has done it :) The paper you linked is extremely recent.

You'd have to start by finding an ODE model of predictive coding, which I suppose is possible by taking the limit of the learning rate to 0.

**rhaps0dy** on Predictive Coding has been Unified with Backpropagation · 2021-04-03T16:16:57.526Z · LW · GW

Not particularly relevant, I think, but interesting nonetheless.

A first drawback of this paper is that its conclusion assumes that the NN underneath trains with *gradient flow* (GF), which is the continuous-time version of gradient descent (GD). This is a good assumption if the learning rate is very small, and the resulting GD dynamics closely track the GF differential equation.

This does not seem to be the case in practice. Larger initial learning rates help get better performance (https://arxiv.org/abs/1907.04595), so people use them in practice. If what people use in practice were well-approximated by GF, then smaller learning rates would give the same result. You can use another differential equation that does seem to approximate GD fairly well (http://ai.stanford.edu/blog/neural-mechanics/), but I don't know if the math from the paper still works out.
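As a toy illustration of the GD/GF distinction (my own example, not from either paper): on a quadratic loss, GD with step size $\eta$ is the Euler discretization of the gradient-flow ODE, so it tracks the flow closely only when $\eta$ is small.

```python
import numpy as np

# Gradient descent on f(w) = 0.5 * w^2 is the Euler discretization of the
# gradient-flow ODE dw/dt = -f'(w) = -w, whose exact solution is
# w(t) = w0 * exp(-t).

def gd(w0, lr, steps):
    w = w0
    for _ in range(steps):
        w = w - lr * w  # GD update with gradient f'(w) = w
    return w

w0 = 1.0
t = 1.0  # total "training time" = lr * steps

exact = w0 * np.exp(-t)  # gradient-flow solution at time t

# Small learning rate: GD closely tracks the flow.
small = gd(w0, lr=0.001, steps=1000)
# Large learning rate covering the same total time: GD deviates noticeably.
large = gd(w0, lr=0.5, steps=2)

print(abs(small - exact))  # tiny discrepancy
print(abs(large - exact))  # much larger discrepancy
```

The same total "training time" is covered in both runs; only the discretization coarseness differs, which is exactly the gap the GF assumption ignores.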

Second, as the paper points out, the kernel machine learned by GD is a bit strange in that the coefficients $a_i$ for weighing different $K(x, x_i)$ depend on $x$. Thus, the resulting output function is not in the reproducing kernel Hilbert space of the kernel that is purported to describe the NN. As a result, as kernel machines go, it's pretty weird. I expect that a lot of the analysis about the output of the learning process (learning theory etc) assumes that the $a_i$ do not depend on the test input $x$.

**rhaps0dy** on Strong Evidence is Common · 2021-03-16T10:54:50.071Z · LW · GW

> I’m guessing they would have happily accepted a bet at 20:1 odds that my driver’s license would say “Mark Xu” on it.

I think they wouldn't have, mostly because you (or someone) offering the bet is fairly strong evidence that the name on your driver's license is not, in fact, "Mark Xu".
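In odds form, Bayes' rule makes this point quantitative; the numbers below are purely illustrative assumptions of mine, not from the post:

```python
# Toy Bayes update: how much evidence does "offering a 20:1 bet about one's
# own license" carry? All numbers here are made-up illustrations.

def posterior_odds(prior_odds, likelihood_ratio):
    # Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
    return prior_odds * likelihood_ratio

# Suppose the prior odds that the license really says "Mark Xu" are 20:1.
prior = 20.0
# But suppose a scammer is, say, 100x more likely than an honest person to
# offer such a bet (the likelihood ratio of the *offer* itself).
lr_offer = 1.0 / 100.0

print(posterior_odds(prior, lr_offer))  # 0.2: the offer flips the odds
```

Under these (assumed) numbers, the act of offering the bet outweighs the prior, which is the worry in the comment above.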

**rhaps0dy** on AI x-risk reduction: why I chose academia over industry · 2021-03-15T19:05:38.965Z · LW · GW

Also, you can try for a top-20 tenure-track position and, if you don't get it, "fail" gracefully into a research scientist position. The paths for the two of them are very similar (± 2 years of postdoctoral academic work).

**rhaps0dy** on Bootstrapped Alignment · 2021-03-08T10:07:37.027Z · LW · GW

> Friendly AI, which is that it be designed to be aligned, say in a mathematically provable way, rather than as an engineered process that approaches alignment by approximation.

I think I understand that now, thank you!

> this avoids Goodharting because there's no optimization being applied

I'm confused again here. Is this implying that a Friendly AI, per the definition above, is not an optimizer?

I am very pessimistic about being able to align an AI without any sort of feedback loop on the reward (thus, without optimization). The world's overall transition dynamics are likely to be chaotic, so the "initial state" of an AI that is provably aligned without feedback would need to be exactly the right one to obtain the outcome we want. It could be that the chaos does not affect what we care about, but I'm unsure about that; even linear systems can exhibit extreme sensitivity to initial conditions.

It is not an endeavour as clearly impossible as "build an open-loop controller for this dynamical system", but I think it's similar.

**rhaps0dy** on Bootstrapped Alignment · 2021-03-04T19:18:41.516Z · LW · GW

I'm confused, I don't know what you mean by 'Friendly AI'. If I take my best guess for that term, I fail to see how it does not rely on optimization to stay aligned.

I take 'Friendly AI' to be either:

- An AI that has the right utility function from the start. (In my understanding, that used to be its usage.) As you point out, because of Goodhart's Law, such an AI is an impossible object.
- A mostly-aligned AI, that is designed to be corrigible. Humans can intervene to change its utility function or shut it down as needed to prevent it from taking bad actions. Ideally, it would consult human supervisors before taking a potentially bad action.

In the second case, humans are continuously optimizing the "utility function" to be closer to the true one. Or, modifying the utility function to make "shut down" the preferred action, whenever the explicit utility function presents a 'misaligned' preferred outcome. Thus, it also represents an optimization-based *weak alignment* method.

Would you argue that my second definition is also an impossible object, because it also relies on optimization?

I think part of my confusion comes from the very fuzzy definition of "optimization". How close, and how fast, do you have to get to the maximum possible value of some function U(s) to be said to optimize it? Or is this the entirely wrong framework altogether? There's no need to answer these now, I'm mostly curious about a clarification for 'Friendly AI'.

**rhaps0dy** on Luna Lovegood and the Chamber of Secrets - Part 12 · 2021-01-03T17:16:31.132Z · LW · GW

Your particular taste? I loved HPMOR but this is too absurd to be fun for me.

**rhaps0dy** on [Personal Experiment] One Year without Junk Media · 2019-12-15T16:20:56.056Z · LW · GW

Great initiative! I'm curious what is the reasoning behind this:

> Recorded university lectures and recorded conference talks are forbidden.

Do you find that you get no learning benefit from these things? If so, that would also explain why you're not allowing didactic YouTube channels (Kurzgesagt, 3Blue1Brown, CGP Grey...)

**rhaps0dy** on Interpretations of "probability" · 2019-05-19T21:21:19.863Z · LW · GW

I clicked this because it seemed interesting, but reading the Q&A:

> In a typical game we consider, one player offers bets, another decides how to bet, and a third decides the outcome of the bet. We often call the first player Forecaster, the second Skeptic, and the third Reality.

How is this *any* different from the classical Dutch Book argument, that unless you maintain beliefs as probabilities you will inevitably lose money?
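For reference, the classical Dutch book argument can be made concrete in a few lines (toy numbers of my own): if your betting prices violate the probability axioms, an opponent can construct bets you accept that lose money in every outcome.

```python
# Minimal Dutch-book illustration: these prices don't form a probability
# (P(A) + P(not A) = 1.1 > 1), so buying both $1 bets at those prices
# guarantees a loss no matter how A turns out.

price_A, price_notA = 0.4, 0.7  # your stated prices for $1 bets on A, not-A

def my_net(outcome_A: bool) -> float:
    # You buy both bets; exactly one of them pays out $1 in either outcome.
    payout = 1.0
    return payout - (price_A + price_notA)

print(my_net(True), my_net(False))  # negative in both cases: a sure loss
```

The game-theoretic framing in the book assigns these roles to named players, but the underlying sure-loss mechanism looks the same as above.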

**rhaps0dy** on Interpretations of "probability" · 2019-05-19T21:18:55.397Z · LW · GW

> Frequentist statistics were invented in a (failed) attempt to keep subjectivity out of science in a time before humanity really understood the laws of probability theory

I'm a Bayesian, but do you have a source for this claim? It was my understanding that frequentism was mostly promoted by Ronald Fisher in the 20th century, well *after* the work of Bayes.

Synthesised from Wikipedia:

While the first cited frequentist work (the weak law of large numbers, 1713, Jacob Bernoulli, Frequentist probability) predates Bayes' work (edited by Price in 1763, Bayes' Theorem), it's not by much. Further, according to the article on "Frequentist Probability", "[Bernoulli] is also credited with some appreciation for subjective probability (prior to and without Bayes theorem)."

The ones that pushed frequentism in order to achieve objectivity were Fisher, Neyman and Pearson. From "Frequentist probability": "All valued objectivity, so the best interpretation of probability available to them was frequentist". Fisher did other nasty things, such as using the fact that causality is really hard to soundly establish to argue that tobacco was not proven to cause cancer. But nothing indicates that this was done out of not understanding the laws of probability theory.

> AI scientists use the Bayesian interpretation

Sometimes yes, sometimes not. Even Bayesian AI scientists use frequentist statistics pretty often.

This post makes it sound like frequentism is useless, and that is not true. The concepts of a stochastic estimator for a quantity, its bias, and its variance were developed by frequentists to look at real-world data. AI scientists use them to analyse algorithms like gradient descent, or approximate Bayesian inference schemes; the tools are definitely useful.
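To illustrate those frequentist concepts, here is a toy simulation (my own example, not from the thread) showing that the naive variance estimator is biased while the $n/(n-1)$-corrected one is not:

```python
import numpy as np

# Estimate the variance of N(0, 4) from samples of size n = 5, many times,
# and compare the average of two estimators to the true variance.

rng = np.random.default_rng(0)
true_var = 4.0
n, trials = 5, 200_000

samples = rng.normal(0.0, np.sqrt(true_var), size=(trials, n))
xbar = samples.mean(axis=1, keepdims=True)
biased = ((samples - xbar) ** 2).mean(axis=1)           # divides by n
unbiased = ((samples - xbar) ** 2).sum(axis=1) / (n - 1)  # divides by n - 1

print(biased.mean())    # ~ (n-1)/n * true_var = 3.2: biased downward
print(unbiased.mean())  # ~ 4.0: unbiased
```

Both estimators are "stochastic estimators for a quantity" in the sense above; the simulation exhibits their bias directly, and their spread across trials is their variance.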

**rhaps0dy** on Is voting theory important? An attempt to check my bias. · 2019-02-18T00:26:30.572Z · LW · GW

After reading the title, my main objections to voting theory were:

- The theory is already understood well enough; what is hard is convincing existing institutions to change.
- Convincing existing institutions to change is *really* hard, so there's not much point in advancing the theory.

Though I do agree that public elections are a process by which huge amounts of resources get allocated in many of the world's richest countries, so it's an important problem.

You make some arguments against these two:

> The debate among reform activists between various voting methods (IRV, approval, Condorcet, score, STAR, etc.; as well as the understanding of proportional representation) *has progressed substantially* (emphasis mine) in the 20 years that I've been a part of it, and I think it can progress further.

Can you point to an example of this? Something the understanding of which has improved in the last 20 years. Also, are activists and academia one and the same? Is this improved understanding because new theory was developed, or because the activists started understanding theory that had already been developed in academia?

> Voting reform has happened before in various contexts, and it should be expected that sooner or later and somewhere or other it will happen again. Will it happen in the particular ways and places I'd like it to? There's no way to be sure either way, but I'd say that the probability is certainly over 1%,

The key question is not whether it will happen *somewhere* with a probability of at least 1%, but whether you, or the people you inspire, can move the needle, even with probability one in a million (or less), in a place that's sufficiently big. Are you thinking of influencing the US government in particular? That would certainly qualify as a big entity. What wins has the voting-reform movement had in the past few years? (I suppose the #1 USA example is Fargo, North Dakota, which passed approval voting.)

(There's probably a lot of people unhappy with voting in some way, so if you can convince them that your proposal is going to make their group more powerful, maybe it's not so hard).

**rhaps0dy** on Allowing a formal proof system to self improve while avoiding Lobian obstacles. · 2019-02-14T17:12:51.594Z · LW · GW

I've spotted a mistake, I think: the relationship inside the first prover should be an if-and-only-if. This is because if the candidate prover says everything is a theorem, then the one-directional implication trivially holds. Thus, the condition for granting "proof privileges" should be a biconditional. There might still be some problems from the modal logic; I'll check when I have some more time.

I can't believe that I was the first one to spot this. The post also has very few upvotes. Did nobody who knows this stuff see it and spot the mistake immediately? Or was the post dismissed from the start? (Unfairly, in my opinion, but perhaps not.)

**rhaps0dy** on Three Kinds of Research Documents: Exploration, Explanation, Academic · 2019-02-13T23:54:30.323Z · LW · GW

What about short technical reports / forum posts like the original "All Mathematicians are Trollable: Divergence of Naturalistic Logical Updates"? It's short, doesn't have any references, doesn't hold your hand with much background like academic articles do. On the other hand it contains more detailed information than "An Untrollable Mathematician Illustrated".

Is this an exploration post because it's the first example of an idea?

**rhaps0dy** on Complexity Penalties in Statistical Learning · 2019-02-06T22:32:49.736Z · LW · GW

Great post, thank you for writing it!

> By taking squares we are more forgiving when the model gets the answer almost right but much less forgiving when the model is way off

It's really unclear why this would be better. I was going to ask for a clarification, but I found something better instead.

The first justification for taking the *least-squares estimator* that came to mind is that it's the *maximum likelihood* solution if you assume your errors E are independent and Gaussian. Under those conditions, the probability density of the data is $\prod_i \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_i - g(x_i))^2}{2\sigma^2}\right)$. We take the product because the errors are independent for each point. If you take the logarithm (which is a transformation that preserves the maximum), this comes out to be $-\frac{1}{2\sigma^2}\sum_i (y_i - g(x_i))^2$ plus a constant that doesn't depend on $g$.
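To sanity-check that correspondence numerically, a small sketch (my own toy data; the polynomial degree and noise level are arbitrary choices): the least-squares fit and the Gaussian maximum-likelihood fit solve the same normal equations, so they produce the same coefficients.

```python
import numpy as np

# Fit a degree-2 polynomial to noisy data two ways and confirm the
# coefficients agree: (1) least squares via SVD, (2) the minimizer of the
# Gaussian negative log-likelihood, i.e. the normal equations X'X b = X'y.

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 50)
y = 2.0 * x ** 2 - x + 0.5 + rng.normal(0, 0.1, size=x.shape)

X = np.vander(x, 3)  # columns: x^2, x, 1
coef_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# The Gaussian NLL is sum of squares / (2 sigma^2) up to constants, so its
# minimizer satisfies the same normal equations.
coef_mle = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose(coef_ls, coef_mle))  # True
```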

It turns out there's a stronger justification, which doesn't require assuming the errors are Gaussian. The Gauss-Markov theorem shows that the least-squares estimator is the best *unbiased* estimator of the coefficients of $g$ among those that are *linear* in the observed values $y_i$. That it's unbiased means that, over several independent samples of data, the mean of the least-squares estimate is the true coefficients of the polynomial you used to generate the data. The estimator is the *best* because it has the lowest variance under the above sampling scheme. If you assume you have a lot of data, by the central limit theorem, this is similar to the lowest squared error in coefficient space.

However, if you're willing to drop the requirement that the estimator be unbiased, you can get even better error on average. The James-Stein estimator has lower risk than least squares, at the cost of biasing your estimates of the coefficients towards zero (or some other point). So, even within a fixed maximum allowed degree of the polynomial, it is helpful to penalise the complexity of the *coefficients themselves*.
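A quick simulation of that claim (my own toy setup; the dimension, true mean, and seed are arbitrary): for estimating a mean vector in $d \ge 3$ dimensions from one noisy observation, James-Stein shrinkage beats the unbiased estimator on average squared error.

```python
import numpy as np

# Estimate theta from a single draw x ~ N(theta, I_d), many times.
# MLE: x itself (unbiased). James-Stein: shrink x toward the origin.

rng = np.random.default_rng(2)
d, trials = 10, 100_000
theta = np.full(d, 1.0)                    # true mean vector
x = theta + rng.normal(size=(trials, d))   # one observation per trial

mle = x
norm2 = (x ** 2).sum(axis=1, keepdims=True)
js = (1.0 - (d - 2) / norm2) * x           # James-Stein shrinkage

mse_mle = ((mle - theta) ** 2).sum(axis=1).mean()
mse_js = ((js - theta) ** 2).sum(axis=1).mean()

print(mse_mle)  # ~ d = 10
print(mse_js)   # strictly smaller: the biased estimator wins on average
```

The shrinkage point (here the origin) is a design choice; shrinking toward any fixed point gives the same qualitative result.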

On a related note, for reasons not yet fully understood, neural networks that have way more parameters than necessary in a single layer seem to generalize better than networks with few parameters. See for example Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks. It seems to be related to the fact that we're not actually finding the optimal configuration of parameters with gradient descent, but only approximating it. So the standard statistical analysis that you outlined here doesn't apply in that case.

**rhaps0dy** on Reframing Superintelligence: Comprehensive AI Services as General Intelligence · 2019-01-09T01:00:18.101Z · LW · GW

> This idea also excludes the robotic direction in AI development, which will anyway produce agential AIs.

Recursive self-improvement that makes the intelligence "super" quickly is what makes a misaligned utility function actually dangerous, as opposed to dangerous like, say, a current-day automated assembly line.

A robot that self-improves would need the capacity to control its actuators and *also* the capacity to self-improve. Since neither of these capabilities directly depends on the other, each time one of them improves, the improvement is much more likely to first be demonstrated independently of an improvement in the other.

Thus we're likely to already have some experience with self-improving AI, or recursively improved AI to help us, by the time we have to deal with people wanting to build self-improving robots. Even though, with advanced AI in hand to help, we should maybe still start early on that, it seems more important to get the not-necessarily-robotic (and probably not robotic) AI right.

**rhaps0dy** on Reframing Superintelligence: Comprehensive AI Services as General Intelligence · 2019-01-09T00:17:32.587Z · LW · GW

Yes, though I'm fairly sure he's talking about using trained neural networks to e.g. classify an image, which is known to be fairly cheap, rather than training them. In other words, he's talking about using an AI service rather than creating one.

He also says that "Machine learning and human learning differ in their relationship to costs" which is also evidence for my interpretation: training is expensive, testing on one example is very cheap.

**rhaps0dy** on AI safety without goal-directed behavior · 2019-01-08T19:05:57.384Z · LW · GW

> Yup. I actually made this argument two posts ago.

Ah, that's good. I should probably read the rest of the sequence too.

> Though it's not clear how you'd use a logic-based reasoning system to act in the world

The easy way to use them would be as they are intended: as oracles that answer questions about factual statements. Humans would still do the questioning and the implementing here. It's unclear how exactly you'd ask really complicated, natural-language-based questions (obviously; otherwise we'd have solved AI), but I think it serves as an example of the paradigm.
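A minimal sketch of the "oracle for decidable questions" idea (my own toy example, far simpler than any real theorem prover): deciding propositional tautologies by truth-table enumeration.

```python
from itertools import product

# A tiny "oracle": given a propositional formula (as a function from an
# assignment of truth values to bool), decide whether it is a tautology by
# checking every assignment. Decidable by construction, so it always answers.

def is_tautology(formula, variables):
    return all(
        formula(dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )

# "Is (p -> q) equivalent to (not p or q)?" -- the oracle answers yes.
equiv = lambda v: (v["q"] if v["p"] else True) == ((not v["p"]) or v["q"])
print(is_tautology(equiv, ["p", "q"]))  # True

# A non-theorem: "p -> q" on its own is not a tautology.
print(is_tautology(lambda v: (not v["p"]) or v["q"], ["p", "q"]))  # False
```

The oracle has no goals and takes no actions; a human decides what to ask and what to do with the answer, which is the paradigm the comment describes.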

**rhaps0dy** on AI safety without goal-directed behavior · 2019-01-07T14:11:30.726Z · LW · GW

I usually think that logic-based reasoning systems are the canonical example of an AI without goal-directed behaviour. They just try to prove or disprove a statement, given a database of atoms and relationships. (Usually they're restricted to statements that are decidable by construction, so that this is always possible.)

You can also frame their behaviour as a utility function: U(time, state) = 1 if you have correctly decided the statement at t ≤ time, 0 otherwise. But your statement that

> It seems possible to build systems in such a way that these properties are inherent in the way that they reason, such that it’s not even coherent to ask what happens if we “get the utility function slightly wrong”.

very much applies. I'm fairly sure you can specify the behaviour of _anything_, including "dumb" things like trousers, screwdrivers, rocks and saucepans, as a utility function + perfect optimization, even though for most things this is a very unhelpful way of thinking. (Or at least anything that is a human artifact.) E.g. a screwdriver optimizes "transmit the rotational force that is applied to you", a rock optimizes "keep these molecules bound and respond to forces according to the laws of physics".

**rhaps0dy** on The Craft & The Community - A Post-Mortem & Resurrection · 2018-12-22T12:55:40.250Z · LW · GW

Upvoting so people see that the project failed early, and don't have to spend a couple of hours reading the main article given how this turned out.

**rhaps0dy** on Modes of Petrov Day · 2018-09-23T12:43:07.020Z · LW · GW

Typo in pg. 31 of the ceremony guide: "sir ead" -> "is read".

**rhaps0dy** on A Gym Gridworld Environment for the Treacherous Turn · 2018-08-07T10:31:54.488Z · LW · GW

Would it be possible to just apply model-based planning and show the treacherous turn the first time?

Model-based planning is also AI, and we clearly have an available model of this environment.

**rhaps0dy** on Announcing the AI Alignment Prize · 2017-11-10T13:23:39.055Z · LW · GW

Is it possible to enter the contest as a group? Meaning, can the article written for the contest have several coauthors?

**rhaps0dy** on Nick Bostrom says Google is winning the AI arms race · 2016-10-06T09:50:25.038Z · LW · GW

> I don't think we are that far away from AGI.

At the very least 20 years away. And yes, Alphabet is the closest, but in 20 years a lot of things can change.

**rhaps0dy** on Attention! Financial scam targeting Less Wrong users · 2016-06-21T07:55:39.946Z · LW · GW

This is what I thought, but ChristianKl is right: it doesn't need to. From the first false positive you're already doing damage, with almost no cost to you. Sure, your address will start to receive more spam, but it will be filtered like the spam you already get.

But having it in the ISP, or as a really popular extension, would deal a big blow to spam.

**rhaps0dy** on Attention! Financial scam targeting Less Wrong users · 2016-06-15T17:10:31.607Z · LW · GW

Today on Hacker News there's a research article about exactly this.

https://news.ycombinator.com/item?id=11909111

Makes me think that a possible method to mitigate spam would be to answer each email with an LSTM-generated blob of text, so the attackers are flooded with false positives and cannot continue the attack. Of course, this would have to be implemented by the email provider.

**rhaps0dy** on 2016 LessWrong Diaspora Survey Analysis: Part One (Meta and Demographics) · 2016-05-14T12:31:20.847Z · LW · GW

Gratitude thread.

What a load of work, Ingres. Thank you for doing this.

**rhaps0dy** on Attention! Financial scam targeting Less Wrong users · 2016-04-09T12:23:49.033Z · LW · GW

I would probably use better spelling in the messages. Poor spelling reduces the scammer's credibility.