Posts

No Anthropic Evidence 2012-09-23T10:33:06.994Z
A Mathematical Explanation of Why Charity Donations Shouldn't Be Diversified 2012-09-20T11:03:48.603Z
Consequentialist Formal Systems 2012-05-08T20:38:47.981Z
Predictability of Decisions and the Diagonal Method 2012-03-09T23:53:28.836Z
Shifting Load to Explicit Reasoning 2011-05-07T18:00:22.319Z
Karma Bubble Fix (Greasemonkey script) 2011-05-07T13:14:29.404Z
Counterfactual Calculation and Observational Knowledge 2011-01-31T16:28:15.334Z
Note on Terminology: "Rationality", not "Rationalism" 2011-01-14T21:21:55.020Z
Unpacking the Concept of "Blackmail" 2010-12-10T00:53:18.674Z
Agents of No Moral Value: Constrained Cognition? 2010-11-21T16:41:10.603Z
Value Deathism 2010-10-30T18:20:30.796Z
Recommended Reading for Friendly AI Research 2010-10-09T13:46:24.677Z
Notion of Preference in Ambient Control 2010-10-07T21:21:34.047Z
Controlling Constant Programs 2010-09-05T13:45:47.759Z
Restraint Bias 2009-11-10T17:23:53.075Z
Circular Altruism vs. Personal Preference 2009-10-26T01:43:16.174Z
Counterfactual Mugging and Logical Uncertainty 2009-09-05T22:31:27.354Z
Bloggingheads: Yudkowsky and Aaronson talk about AI and Many-worlds 2009-08-16T16:06:18.646Z
Sense, Denotation and Semantics 2009-08-11T12:47:06.014Z
Rationality Quotes - August 2009 2009-08-06T01:58:49.178Z
Bayesian Utility: Representing Preference by Probability Measures 2009-07-27T14:28:55.021Z
Eric Drexler on Learning About Everything 2009-05-27T12:57:21.590Z
Consider Representative Data Sets 2009-05-06T01:49:21.389Z
LessWrong Boo Vote (Stochastic Downvoting) 2009-04-22T01:18:01.692Z
Counterfactual Mugging 2009-03-19T06:08:37.769Z
Tarski Statements as Rationalist Exercise 2009-03-17T19:47:16.021Z
In What Ways Have You Become Stronger? 2009-03-15T20:44:47.697Z
Storm by Tim Minchin 2009-03-15T14:48:29.060Z

Comments

Comment by Vladimir_Nesov on plutonic_form's Shortform · 2022-01-24T12:09:25.749Z · LW · GW

Together with Bayes's formula (which in practice mostly means remaining aware of base rates when evidence comes to light), another key point about reasoning under uncertainty is to avoid it whenever possible. As with the long-term irrelevance of news, cognitive and methodological overhead makes uncertain knowledge less useful. There are exceptions, like wanting to keep track of news about an uncertain prospect of a war breaking out in your country, but all else equal this is not the kind of thing that's worth worrying about too much. And certainty is not the same as consensus, or as being well-known to interested people, since there are things that can simply be understood. If you study something seriously, there are many observations that can be made with certainty, mostly very hypothetical or abstract ones, that almost nobody else has made. Truth-seeking is not about seeking all available truths, or else you might as well memorize white noise all day long.
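To make the base-rate point concrete, here is a minimal sketch in Python with made-up numbers (a rare hypothesis and noisy evidence; none of this is from the original comment):

```python
# Bayes's formula with a rare hypothesis: "positive" evidence that is 20x more
# likely under H than under not-H still leaves H unlikely, because the base rate dominates.

def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """P(H | E) = P(E | H) P(H) / P(E), with P(E) expanded over H and not-H."""
    p_evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
    return p_evidence_given_h * prior / p_evidence

print(posterior(prior=0.001, p_evidence_given_h=0.9, p_evidence_given_not_h=0.045))
# ~0.02: despite the evidence, the hypothesis remains about 98% likely to be false.
```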

Comment by Vladimir_Nesov on What's Up With Confusingly Pervasive Consequentialism? · 2022-01-21T12:30:44.939Z · LW · GW

beneficial effective plans become sparse faster than the harmful effective plans

The constants are more important than the trend here: the question is whether a good plan for a pivotal act that sorts out AI risk in the medium term can be found. Discrimination of good plans only has to improve enough to clear the threshold needed to search for plans effective enough to solve that problem.

Comment by Vladimir_Nesov on Why maximize human life? · 2022-01-08T16:38:42.918Z · LW · GW

I see human values as something built by long reflection, a heavily philosophical process where it's unclear if humans (as opposed to human-adjacent aliens or AIs) doing the work is an important aspect of the outcome. This outcome is not something any extant agent knows. Maybe indirectly it's what I consider good, but I don't know what it is, so that phrasing is noncentral. Maybe long reflection is the entity that considers it good, but for this purpose it doesn't hold the role of an agent, it's not enacting the values, only declaring them.

Comment by Vladimir_Nesov on Morality is Scary · 2022-01-08T10:38:03.750Z · LW · GW

My point is that the alignment (values) part of AI alignment is least urgent/relevant to the current AI risk crisis. It's all about corrigibility and anti-goodharting. Corrigibility is hope for eventual alignment, and anti-goodharting makes inadequacy of current alignment and imperfect robustness of corrigibility less of a problem. I gave the relevant example of relatively well-understood values, preference for lower x-risks. Other values are mostly relevant in how their understanding determines the boundary of anti-goodharting, what counts as not too weird for them to apply, not in what they say is better. If anti-goodharting holds (too weird and too high impact situations are not pursued in planning and possibly actively discouraged), and some sort of long reflection is still going on, current alignment (details of what the values-in-AI prefer, as opposed to what they can make sense of) doesn't matter in the long run.

I include maintaining a well-designed long reflection somewhere within corrigibility, for without it there is no hope for eventual alignment, so a decision-theoretic agent that has long reflection within its preference is corrigible in this sense. Its corrigibility depends on following a good decision theory, so that there actually exists a way for the long reflection to determine its preference in a way that causes the agent to act as the long reflection wishes. But being an optimizer, it's horribly non-anti-goodharting, so it can't be stopped and probably eats everything else.

An AI with anti-goodharting turned up to the max is the same as an AI with its stop button pressed. An AI with minimal anti-goodharting is an optimizer, AI risk incarnate. Stronger anti-goodharting is a maintenance mode, an opportunity for fundamental change; weaker anti-goodharting makes use of more developed values to actually do things. So a way to control the level of anti-goodharting in an AI is a corrigibility technique. The two concepts work well with each other.

Comment by Vladimir_Nesov on Why maximize human life? · 2022-01-08T10:03:21.622Z · LW · GW

it makes no sense to speak of things being simply “better”, without some agent or entity whose evaluations we take to be our metric for goodness

If the agent/entity is hypothetical, we get an abstract preference without any actual agent/entity. And possibly a preference can be specified without specifying the rest of the agent. A metric of goodness doesn't necessarily originate from something in particular.

Comment by Vladimir_Nesov on Why maximize human life? · 2022-01-07T14:45:43.388Z · LW · GW

Expected utility maximization is only applicable when utility is known. When it's not, various anti-goodharting considerations become more important, maintaining the ability to further develop understanding of utility/values without leaning too much on any current guesses of what that is going to be. Keeping humans in control of our future is useful for that, but instrumentally convergent actions such as grabbing the matter in the future lightcone (without destroying potentially morally relevant information such as aliens) and moving decision making to a better substrate are also helpful for whatever our values eventually settle as. The process should be corrigible, should allow replacing humans-in-control with something better as understanding of what that is improves (without getting locked into that either). The AI risk is about failing to set up this process.

Comment by Vladimir_Nesov on Radical openness - say things that others strongly dislike · 2022-01-07T08:28:41.813Z · LW · GW

Suppose that there is a norm against saying not-X. This could be anything between widespread mild discomfort at hearing not-X, and a death sentence for anyone who claimed not-X. A norm is strong when everyone follows it. If everyone follows this norm, only claims of X will be made in public, regardless of X's truth. This is so even when X is actually clearly true and not-X is actually clearly false.

The specific norm about not-X could be opposed by not following it. The usual form of norm violation that comes to mind is to publicly say not-X when you believe not-X to be true, perhaps even when it's irrelevant, hurtful, obnoxious, and unnecessary. But the norm is also violated by staying silent and not saying X when you believe X to be true and relevant. This is not very effective, but then again saying not-X is not necessarily very effective either, and avoiding claims of X is less costly both to yourself and others.

Furthermore, existence of a norm about not-X hurts truthful discussion of X, so possibly being truthful about X in public becomes less important than opposing the norm. This brings up the option of saying not-X when you believe not-X to be false. If followed as a general strategy, this causes topics with censorship norms to become even more actively mind-killing, poisoning all less-than-perfect wells. Compare this with saying not-X when it's true, albeit unnecessary and hurtful/obnoxious. Also a form of poisoning the topic.

Another general strategy is to make sure that the fact of existence of censorship norms about X, and their influence on possibility of a sensible discussion of X, is well-known. But this also won't work if there is a norm against discussing such considerations regarding X, which occasionally there is, if the influence of a pro-X agenda was particularly strong at some point in recent history. So an even more resilient strategy is to discuss this phenomenon in general, without mentioning any particular X of actual concern.

Comment by Vladimir_Nesov on A Reaction to Wolfgang Schwarz's "On Functional Decision Theory" · 2022-01-07T07:16:23.348Z · LW · GW

The formulation quoted from Schwarz's post implicitly, and unnecessarily, disallows unpredictability. The usual more general formulation of Transparent Newcomb is to say that $1M is in the big box iff Omega succeeds in predicting that you one-box in case the big box is full. So if instead you succeed in confusing Omega, the box will be empty. A situation where Omega can't be confused also makes sense, in which case the two statements of the problem are equivalent.
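A toy rendering of the difference between the two formulations (hypothetical function names, assuming Omega's prediction is reported as "one-box", "two-box", or None for a failed prediction):

```python
def big_box_full(predicted):
    # General formulation: the $1M is in the big box iff Omega succeeds in
    # predicting that the agent one-boxes when the big box is full.
    # A failed prediction (None) therefore leaves the box empty.
    return predicted == "one-box"

# The formulation quoted from Schwarz's post amounts to assuming `predicted`
# is never None, i.e. the agent cannot confuse Omega; under that assumption
# the two statements coincide.
print(big_box_full("one-box"))  # True: full box
print(big_box_full("two-box"))  # False: empty box
print(big_box_full(None))       # False: Omega was confused, so the box is empty
```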

Comment by Vladimir_Nesov on Stop arbitrarily limiting yourself · 2021-12-10T10:52:00.494Z · LW · GW

I think of this as developing curiosity as a deliberative skill. If left at the intuitive level, it's liable to develop persistent blank spots, topics or skills you flinch away from or become indifferent to and never explore. The heuristic is to make sure to investigate everything you repeatedly encounter, to prevent the situation where you don't look into something you deal with regularly for years.

Comment by Vladimir_Nesov on Morality is Scary · 2021-12-03T22:25:03.789Z · LW · GW

The implication of doing everything that AI could do at once is unfortunate. The urgent objective of AI alignment is prevention of AI risk, where a minimal solution is to take away access to unrestricted compute from all humans in a corrigible way that would allow eventual desirable use of it. All other applications of AI could follow much later through corrigibility of this urgent application.

Comment by Vladimir_Nesov on Morality is Scary · 2021-12-03T21:49:23.618Z · LW · GW

insufficient for a subculture trying to be precise and accurate and converge on truth

The tradeoff is with verbosity and difficulty of communication; it's not always a straightforward Pareto improvement. So in this case I fully agree with dropping "everyone" or replacing it with a more accurate qualifier. But I disagree with a general principle that would discount ease for a person who is trained and talented in relevant ways. New habits of thought that become intuitive are improvements; checklists and other deliberative rituals that slow down thinking need merit that overcomes their considerable cost.

Comment by Vladimir_Nesov on Vanessa Kosoy's Shortform · 2021-12-02T21:10:06.968Z · LW · GW

Goodharting is about what happens in situations where "good" is undefined or uncertain or contentious, but still gets used for optimization. There are situations where it's better-defined, and situations where it's ill-defined, and an anti-goodharting agent strives to optimize only within the scope of where it's better-defined. I took "lovecraftian" as a proxy for situations where it's ill-defined, and the base distribution of quantilization that's intended to oppose goodharting acts as a quantitative description of where it's taken as better-defined, so for this purpose the base distribution captures non-lovecraftian situations. Of the options you listed for debate, the distribution from imitation learning seems OK for this purpose, if amended by some anti-weirdness filters to exclude debates that can't be reliably judged.

The main issues with anti-goodharting that I see are the difficulty of defining the proxy utility and base distribution, the difficulty of making it corrigible and of not locking in a fixed proxy utility and base distribution, and the question of what to do about optimization that points out of scope.

My point is that if anti-goodharting, and not development of quantilization, is taken as the goal, then calibration of quantilization is not the kind of thing that helps; it doesn't address the main issues. Like, even for quantilization, fiddling with the base distribution and proxy utility is a more natural framing that's strictly more general than fiddling with the quantilization parameter. If we are to pick a single number to improve, why privilege the quantilization parameter instead of some other parameter that influences the base distribution and proxy utility?
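For reference, a minimal sketch of quantilization itself (illustrative only, not a proposal), showing the three knobs in question: the base distribution, the proxy utility, and the quantilization parameter.

```python
import random

def quantilize(base_samples, proxy_utility, q):
    """Sample uniformly from the top q-fraction of draws from the base
    distribution, ranked by proxy utility.

    q = 1 reproduces the base distribution; small q approaches argmax,
    i.e. stronger optimization and more goodharting risk.
    """
    ranked = sorted(base_samples, key=proxy_utility, reverse=True)
    top = ranked[:max(1, int(q * len(ranked)))]
    return random.choice(top)

# Toy usage: base distribution N(0, 1) over "actions", proxy utility prefers larger ones.
actions = [random.gauss(0, 1) for _ in range(1000)]
print(quantilize(actions, proxy_utility=lambda a: a, q=0.05))
```

Changing the base distribution or the proxy utility reshapes what counts as in-scope at all, whereas q only tunes how hard the fixed proxy is pushed within that scope.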

The use of debates for amplification in this framing is for corrigibility part of anti-goodharting, a way to redefine utility proxy and expand the base distribution, learning from how the debates at the boundary of the previous base distribution go. Quantilization seems like a fine building block for this, sampling slightly lovecraftian debates that are good, which is the direction where we want to expand the scope.

Comment by Vladimir_Nesov on Morality is Scary · 2021-12-02T10:18:16.644Z · LW · GW

I'm leaning towards the more ambitious version of the project of AI alignment being about corrigible anti-goodharting, with the AI optimizing towards good trajectories within scope of relatively well-understood values, preventing overoptimized weird/controversial situations, even at the cost of astronomical waste. Absence of x-risks, including AI risks, is generally good. Within this environment, the civilization might be able to eventually work out more about values, expanding the scope of their definition and thus allowing stronger optimization. Here corrigibility is in part about continually picking up the values and their implied scope from the predictions of how they would've been worked out some time in the future.

Comment by Vladimir_Nesov on Question/Issue with the 5/10 Problem · 2021-11-29T19:12:22.820Z · LW · GW

The core of the 5-and-10 problem is not specific to a particular formalization or agent algorithm. It's fundamentally the question of what's going on with the agent's reasoning inside the 5 world. In the 10 world, the agent's reasoning proceeds in a standard way: perhaps the agent considers both the 5 and 10 worlds, evaluates them, and decides to go with 10. But what might the agent be thinking in the 5 world, so that it ends up making that decision? And if the agent in the 10 world is considering the 5 world, what does the agent in the 10 world think about the thinking of the agent in the 5 world, and about what that implies in general?

How this happens is a test for decision making algorithms, as it might lead to a breakdown along the lines of the 5-and-10 problem, or to a breakdown of an informal model of how a particular algorithm works. The breakdown is not at all inevitable, and usually the test can't even be performed without changing the algorithm to make it possible, in which case we've intentionally broken the algorithm in an interesting way that might tell us something instructive.

In the post, what agent algorithm are you testing? Note that agent's actions are not the same thing as agent's knowledge of them. Proving A = 5 in a possibly inconsistent system is not the same thing as actually doing 5 (perhaps the algorithm explicitly says to do 10 upon proving A = 5, which is the chicken rule; there is no relevant typo in this parenthetical).
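To make the last point concrete, here is a toy sketch of one possible agent algorithm with the chicken rule (hypothetical code, using a stubbed `provable` oracle in place of real proof search; it is not anyone's actual formalism):

```python
def agent(provable):
    # Chicken rule, as in the parenthetical above: upon proving A() = 5, do 10.
    if provable("A() = 5"):
        return 10
    # Otherwise act on provable consequences of the two candidate actions.
    if provable("A() = 10 -> U() = 10") and provable("A() = 5 -> U() = 5"):
        return 10
    return 5  # fallback if proof search turns up nothing useful

sound = {"A() = 10 -> U() = 10", "A() = 5 -> U() = 5"}
print(agent(lambda s: s in sound))                # 10: the standard reasoning path

# A spurious "proof" of A() = 5 (as from an inconsistent system) does not make
# the agent actually do 5 -- the action and the agent's "knowledge" of it come apart.
print(agent(lambda s: s in sound | {"A() = 5"}))  # 10: the chicken rule fires
```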

Comment by Vladimir_Nesov on Vanessa Kosoy's Shortform · 2021-11-24T23:07:33.521Z · LW · GW

I'm not sure this attacks goodharting directly enough. Optimizing a system for proxy utility moves its state out of distribution, where proxy utility generalizes training utility incorrectly. This probably holds for debate optimized towards intended objectives as much as for more concrete framings with state and utility.

Dithering across the border of goodharting (of scope of a proxy utility) with quantilization is actionable, but isn't about defining the border or formulating legible strategies for what to do about optimization when approaching the border. For example, one might try for shutdown, interrupt-for-oversight, or getting-back-inside-the-borders when optimization pushes the system outside, which is not quantilization. (Getting-back-inside-the-borders might even have weird-x-risk prevention as a convergent drive, but will oppose corrigibility. Some version of oversight/amplification might facilitate corrigibility.)

Debate seems more useful for amplification, extrapolating concepts in a way humans would, in order to become acceptable proxies in wider scopes, so that more and more debates become non-lovecraftian. This is a different concern from setting up optimization that works with some fixed proxy concepts as given.

Comment by Vladimir_Nesov on From language to ethics by automated reasoning · 2021-11-22T00:43:37.668Z · LW · GW

Please don't do this. You've already posted this comment two weeks ago.

Comment by Vladimir_Nesov on A Defense of Functional Decision Theory · 2021-11-17T17:45:25.002Z · LW · GW

Well, if something’s not actually happening, then I’m not actually seeing it happen.

Not actually, no: your seeing it happen isn't real, but this unreality of seeing it happen proceeds in a specific way. It's not indeterminate greyness, and it's not arbitrary.

if something never happens, and I never observe it, then I never respond to it, either. My response to it is nothing.

If your response (that never happens) could be 0 or 1, it couldn't be nothing. If it's 0 (despite never having been observed to be 0), the claim that it's 1 is false, and the claim that it's nothing doesn't type check.

I'm guessing that the analogy between you and an algorithm doesn't hold strongly in your thinking about this; it's the use of "you" in place of "algorithm" that does a lot of work in these judgements, which wouldn't be made when talking about an "algorithm". So let's talk about algorithms to establish common ground.

Let's say we have a pure total procedure f written in some programming language, with the signature f : O -> D, where O = Texts is the type of observations and D = {0,1} is the type of decisions. Let's say that in all plausible histories of the world, f is never evaluated on the argument "green sky". In this case I would say that it's impossible for the argument (observation) to be "green sky"; procedure f is never evaluated with this argument in actuality.

Yet it so happens that f("green sky") is 0. It's not 1 and not nothing. There could be processes sensitive to this fact that don't specifically evaluate f on this argument. And there are facts about what happens inside f with intermediate variables or states of some abstract machine that does the evaluation (procedure f's experience of observing the argument and formulating a response to it), as it's evaluated on this never-encountered argument, and these facts are never observed in actuality, yet they are well-defined by specifying f and the abstract machine.
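A concrete rendering of this setup (with a made-up f, purely for illustration):

```python
# A pure total procedure f : O -> D, with O = texts (observations) and D = {0, 1}.
def f(observation: str) -> int:
    return 1 if "blue" in observation else 0

def main():
    # In all plausible histories, f only ever gets evaluated on arguments like this:
    return f("blue sky")

# f("green sky") is never evaluated in actuality, yet its value is fixed by the
# definition of f alone: it is 0, not 1 and not nothing. A process could be
# sensitive to this fact (say, by inspecting f's source) without ever calling
# f on "green sky".
if __name__ == "__main__":
    print(main())  # 1 -- the only evaluation that actually happens here
```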

You can ask: “but if it did happen, what would be your response?”—and that’s a reasonable question. But any answer to that question would indeed have to take as given that the event in question were in fact actually happening (otherwise the question is meaningless).

The question of what f("green sky") would evaluate to isn't meaningless regardless of whether evaluation of f on the argument "green sky" is an event that in fact actually happens. Actually extant evidence for a particular answer, such as a proof that the answer is 0, is arguably also evidence of the evaluation having taken place. But reasoning about the answer doesn't necessarily pin it down exactly, in which case the evaluation didn't necessarily take place.

For example, perhaps we only know that f("green sky") is the same as g("blue sky"), but don't know what the values are. Actually proving this equality doesn't in general require either f("green sky") or g("blue sky") to be actually evaluated.

You seem to be saying: “yes, certain things that can happen are impossible”, which is very much counter to all ordinary usage.

Winning a billion dollars on the stock market by following the guidance of a random number generator technically "can happen", but I feel it's a central example of something impossible in ordinary usage of the word. I also wouldn't say that it can happen, without the scare quotes, even though technically it can.

I would not say “this is impossible and isn’t happening”.

This is mostly relevant for decisions between influencing one world and influencing another, possible when there are predictors looking from one world into the other. I don't think behavior within-world (in ordinary situations) should significantly change depending on its share of reality, but also I don't see a problem with noticing that the share of reality of some worlds is much smaller than for some other worlds. Another use is manipulating a predictor that imagines you seeing things that you (but not the predictor) know can't happen, and won't notice you noticing.

Comment by Vladimir_Nesov on A Defense of Functional Decision Theory · 2021-11-16T01:50:23.257Z · LW · GW

If I am an agent, and something is happening to me

The point is that you don't know that something is happening to you just because you are seeing it happen. Seeing it happen is what takes place when you-as-an-algorithm is evaluated on the corresponding observations. A response to seeing it happen is well-defined even if the algorithm is never actually evaluated on those observations. When we spell out what happens inside the algorithm, what we see is that the algorithm is "seeing it happen". This is so even if we don't actually look. (See also.)

So for example, if I'm asking what would be your reaction to the sky turning green, what is the status of you-in-the-question who sees the sky turn green? They see it happen in the same way that you see it not happen. Yet from the fact that they see it happen, it doesn't follow that it actually happens (the sky is not actually green).

Another point is that for you-in-the-question, it might be the green-sky world that matters, not the blue-sky world. That is a side effect of how your insertion into the green-sky world doesn't respect the semantics of your preferences, which care about blue-sky world. For you-in-the-question with preferences ending up changed to care about the green-sky world, the useful sense of actuality refers to the green-sky world, so that for them it's the blue-sky world that's impossible. But if agents share preferences, this kind of thing doesn't happen. (This is another paragraph that doesn't respect rabbit hole safety regulations.)

If what is happening to me is actually happening in a simulation… well, so what?

You typically don't know that some observation is taking place even in a simulation, yet your response to that observation that never happens in any form, and is not predicted by any predictor, is still well-defined. It makes sense to ask what it is.

Sorry, do you mean that you don’t count low-probability events as impossible, or that you don’t count them as possible (a.k.a. “happening in actuality”)?

I mean that if something does happen in actuality-as-ensemble with very low probability, that doesn't disqualify it from being impossible according to how I'm using the word. Without this caveat literally nothing would be impossible in some settings.

I… have considerable difficult parsing what you’re saying in the second paragraph of your comment.

The link is not helpful here; it's more about what goes wrong when my sense of "impossible" is taken too far, for reasons that have nothing to do with word choice (it perhaps motivates this word choice a little bit). The use of that paragraph is in what's outside the parenthetical. It's intended to convey that when you choose between options A and B, it's usually said that both taking A and taking B are possible, while my use of "impossible" in this thread is such that the option that's not actually taken is instead impossible.

Comment by Vladimir_Nesov on A Defense of Functional Decision Theory · 2021-11-15T23:36:54.175Z · LW · GW

By "impossible" I mean not happening in actuality (which might be an ensemble, in which case I'm not counting what happens with particularly low probabilities), taking into account the policy that the agent actually follows. So the agent may have no way of knowing if something is impossible (and often won't before actually making a decision). This actuality might take place outside the thought experiment, for example in Transparent Newcomb that directly presents you with two full boxes (that is, both boxes being full is part of the description of the thought experiment), and where you decide to take both, the thought experiment is describing an impossible situation (in case you do decide to take both boxes), while the actuality has the big box empty.

So for the problem where you-as-money-maximizer choose between receiving $10 and $5, and actually have chosen $10, I would say that taking $5 is impossible, which might be an unusual sense of the word (possibly misleading before making the decision; the 5-and-10 problem is about what happens if you take this impossibility too seriously in an unhelpful way). This is the perspective of an external Oracle that knows everything and doesn't make public predictions.

If this doesn't clear up the issue, could you cite a small snippet that you can't make sense of and characterize the difficulty? Focusing on Transparent-Newcomb-with-two-full-boxes might help (with respect to use of "impossible", not considerations on how to solve it), it's way cleaner than Bomb.

(The general difficulty might be from the sense in which UDT is a paradigm, its preferred ways of framing its natural problems are liable to be rounded off to noise when seen differently. But I don't know what the difficulty is on object level in any particular case, so calling this a "paradigm" is more of a hypothesis about the nature of the difficulty that's not directly helpful.)

Comment by Vladimir_Nesov on A Defense of Functional Decision Theory · 2021-11-15T21:55:54.585Z · LW · GW

UDT is about policies, not individual decisions. A thought experiment typically describes an individual decision taken in some situation. A policy specifies what decisions are to be taken in all situations. Some of these situations are impossible, but the policy is still defined for them following its type signature, and predictors can take a look at what exactly happens in the impossible situations. Furthermore, choice of a policy influences which situations are impossible, so there is no constant fact about which of them are impossible.

The general case of making a decision in an individual situation involves uncertainty about whether this situation is impossible, and the ability to influence its impossibility. This makes the requirement for thought experiments to describe real situations an unnatural constraint, so in thought experiments read in the context of UDT this requirement is absent by default.

A central example is Transparent Newcomb's Problem. When you see money in the big box, this situation is either possible (if you one-box) or impossible (if you two-box), depending on your decision. If a thought experiment is described as you facing the problem in this situation (with the full box), it's describing a situation (and observations made there) that may, depending on your decision, turn out to be impossible. Yet asking what your decision in this situation will be is always meaningful, because it's possible to evaluate you as an algorithm even on impossible observations, which is exactly what all the predictors in such thought experiments are doing all the time.
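A minimal sketch of what the predictor is doing (toy payoffs and names, not a full treatment): a policy is just a function from observations to actions, and Omega evaluates it on the observation of a full box whether or not that observation ever becomes real.

```python
def one_boxer(observation: str) -> str:
    # Defined for every observation, including ones this very policy makes impossible.
    return "one-box"

def two_boxer(observation: str) -> str:
    return "two-box"

def run_transparent_newcomb(policy):
    # Omega's prediction: evaluate the policy on the observation "big box is full".
    big_box_full = (policy("big box is full") == "one-box")
    observation = "big box is full" if big_box_full else "big box is empty"
    action = policy(observation)
    payoff = (1_000_000 if big_box_full else 0) + (1_000 if action == "two-box" else 0)
    return observation, action, payoff

print(run_transparent_newcomb(one_boxer))  # ('big box is full', 'one-box', 1000000)
# For the two-boxer, "big box is full" is an impossible observation -- and the
# policy's value on that impossible observation is exactly what made it impossible.
print(run_transparent_newcomb(two_boxer))  # ('big box is empty', 'two-box', 1000)
```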

What could the thought experiment even be about if the described scenario is not supposed to be real?

It's about evaluating you (or rather an agent) as an algorithm on the observations presented by the scenario, which is possible to do regardless of whether the scenario can be real. This in turn motivates asking what happens in other situations, not explicitly described in the thought experiment. A combined answer to all such questions is a policy.

Comment by Vladimir_Nesov on Improving on the Karma System · 2021-11-14T23:11:25.758Z · LW · GW

A fixed set of tags turns this into multiple-choice questions where all answers are inaccurate, and most answers are irrelevant. Write-in tags could be similar to voting on replies to a comment that evaluate it in some respect. Different people pay attention to different aspects, so the flexibility to vote on multiple aspects at once or differently from overall vote is unnecessary.

Comment by Vladimir_Nesov on Chris_Leong's Shortform · 2021-11-14T14:47:26.569Z · LW · GW

I don't see how there is anything here other than equivocation of different meanings of "world". Counterfactuals-as-worlds is not even a particularly convincing way of making sense of what counterfactuals are.

Comment by Vladimir_Nesov on A Defense of Functional Decision Theory · 2021-11-14T13:05:05.564Z · LW · GW

The statement of Bomb is bad at being legible outside the FDT/UDT paradigm; there it's actively misleading, so it's a terrible, confusion-and-conflict-inducing rather than clarifying example to show someone who is not familiar with the paradigm. The reason Left is reasonable is that the scenario being described is, depending on the chosen policy, almost completely not real, a figment of the predictor's imagination.

Unless you've read a lot of FDT/UDT discussion, a natural reading of a thought experiment is to include the premise "the described situation is real". And so people start talking past each other, digging into the details of how to reason about the problem, when the issue is that they read different problem statements, one where the scenario is real, and another where its reality is not at all assured.

Comment by Vladimir_Nesov on Chris_Leong's Shortform · 2021-11-14T12:08:17.525Z · LW · GW

What correspondence? Counterfactuals-as-worlds have all laws of physics broken in them, including quantum mechanics.

Comment by Vladimir_Nesov on tivelen's Shortform · 2021-11-11T02:29:08.474Z · LW · GW

It's useful, but likely not valuable-in-itself for people to strive to be primarily morality optimizers. Thus the optimally moral thing could be to care about the optimally moral thing substantially less than sustainably feasible.

Comment by Vladimir_Nesov on Transcript for Geoff Anders and Anna Salamon's Oct. 23 conversation · 2021-11-11T00:48:52.265Z · LW · GW

tension between information and adherence-to-norms

This mostly holds for information pertaining to norms. Math doesn't need controversial norms; there is no tension there. Beliefs/claims that influence transmission of norms are themselves targeted by norms, to ensure systematic transmission. This is what anti-epistemology is: it's doing valuable work in instilling norms, including norms for perpetuating anti-epistemology.

So the soft taboo on politics is about not getting into a subject matter that norms care about. And the same holds for interpersonal stuff.

Comment by Vladimir_Nesov on Speaking of Stag Hunts · 2021-11-10T20:59:08.079Z · LW · GW

epistemic hygiene

This is an example of the illusion of transparency issue. Many salient interpretations of what this means (informed by the popular posts on the topic, which are actually not explicitly on this topic) motivate actions that I consider deleterious overall, like punishing half-baked/wild/probably-wrong hypotheses or things that are not obsequiously disclaimed as such, in a way that's insensitive to the actual level of danger of being misleading. A more salient cost is nonsense hogging attention, but that doesn't distinguish it from well-reasoned clear points that don't add insight hogging attention.

The actually serious problem is when this is a symptom of not distinguishing the epistemic status of ideas on the part of the author, but then it's not at all clear that punishing publication of such thoughts helps the author fix the problem. The personal skill of correctly tagging the epistemic status of ideas in one's own mind is what I think of as epistemic hygiene, but I don't expect this to be canon, and I'm not sure that there is no serious disagreement on this point with people who have also thought about this. For one, the interpretation I have doesn't specify community norms, and I don't know what epistemic-hygiene-the-norm should be.

Comment by Vladimir_Nesov on Come for the productivity, stay for the philosophy · 2021-11-09T15:40:23.975Z · LW · GW

Many of these (or other) theory things never make you "more effective". But you do become able to interact with them.

Comment by Vladimir_Nesov on Speaking of Stag Hunts · 2021-11-09T00:20:08.451Z · LW · GW

It's often useful to have possibly false things pointed out to keep them in mind as hypotheses or even raw material for new hypotheses. When these things are confidently asserted as obviously correct, or given irredeemably faulty justifications, that doesn't diminish their value in this respect, it just creates a separate problem.

A healthy framing for this activity is to explain theories without claiming their truth or relevance. Here, judging what's true acts as a "solution" to the problem, while understanding the available theories of what might plausibly be true is the phase of discussing the problem. So when others do propose solutions, that is, do claim what's true, a useful process is to ignore that aspect at first.

Only once there is saturation, and more claims don't help new hypotheses become thinkable, does this become counterproductive and possibly mostly manipulation of popular opinion.

Comment by Vladimir_Nesov on Transcript for Geoff Anders and Anna Salamon's Oct. 23 conversation · 2021-11-08T19:39:58.604Z · LW · GW

I don't know; a lot of this is from discussion of Kuhn. New paradigms/worldviews are not necessarily incentivized to say new things or make sense of new things, and even when they do, they just frame them in a particular way. And when something doesn't fit a paradigm, it's ignored. This is good and inevitable for theorizing at the human level, and doesn't inform the usefulness or correctness of what's going on, as these things live inside the paradigm.

Comment by Vladimir_Nesov on Transcript for Geoff Anders and Anna Salamon's Oct. 23 conversation · 2021-11-08T19:31:50.174Z · LW · GW

It's about the lifecycle of theory development, confronted with the incentives of medium-term planning. Humans are not very intelligent, and the way we can do abstract theory requires developing a lot of tools that enable fluency with it, including the actual intuitive fluency that uses the tools to think more rigorously, which is what I call common sense.

My anchor is math, which is the kind of theory I'm familiar with, but the topic of the theory could be things like social structures, research methodologies, or human rationality. So when common sense has an opportunity to form, we have a "post-rigorous" stage where rigid principles (gears) that make the theory lawful can be wielded intuitively. Without getting to this stage, the theory is blind or (potentially) insane. It is blind without intuition or insane when intuition is unmoored from rigor. (It can be somewhat sane when pre-rigorous intuition is grounded in something else, even if by informal analogy.)

If left alone, a theory tends to sanity. It develops principles to organize its intuitions, and develops intuitions to wield its principles. Eventually you get something real that can be seen and shaped with purpose.

But when it's not at that stage, forcing it to change will keep it unsettled longer. If the theory opines about how an organizational medium-term plan works, what it should be, yet it's unsettled, you'll get insane opinions about the plans that shape insane plans. And reality chasing the plan, forcing it to confront what actually happens at present, gives an incentive to keep changing the theory before it's ready, keeping it in this state of limbo.

Comment by Vladimir_Nesov on Transcript for Geoff Anders and Anna Salamon's Oct. 23 conversation · 2021-11-08T11:18:14.800Z · LW · GW

This shapes up as a case study on the dangers of doing very speculative and abstract theory about medium-term planning. (Which might include examples like figuring out what kind of understanding is necessary to actually apply hypothetical future alignment theory in practice...)

The problem is that common sense doesn't work or doesn't exist in these situations, but it's still possible to do actionable planning and massage the plan into a specific enough form in time to meet reality, so that reality goes according to the plan, which on the side of the present adapts to it, even as on the side of the medium-term future it devolves into theoretical epicycles with no common sense propping it up.

This doesn't go bad when it's not in contact with reality, because then reality isn't hurrying it into a form that doesn't fit the emerging intuition of what the theory wants to be. And so it has time to mature into its own thing, or fade away into obscurity, but in any case there is more sanity to it formed of internal integrity. Whereas with a theoretical medium-term plan reality continually butchers the plan, which warps the theory, and human intuition is not good enough to reconcile the desiderata in a sensible way fast enough.

Comment by Vladimir_Nesov on Speaking of Stag Hunts · 2021-11-08T08:37:16.375Z · LW · GW

Yes, sorry, I got too excited about the absurd hypothesis supported by two datapoints, posted too soon, then tried to reproduce, and it no longer worked at all. I had the time to see the page in firefox incognito window on the same system where I'm logged in and in a normal firefox window from a different Linux username that never had facebook logged in.

Edit: Just now it worked again twice, and after that it no longer did. Bottom line: Public facebook posts are not really public, at least today, they are only public intermittently.

Comment by Vladimir_Nesov on Speaking of Stag Hunts · 2021-11-08T08:23:20.865Z · LW · GW

I can no longer see it when not logged in, even though I did before. Maybe we triggered a DDoS mitigation thingie?

Edit: Removed incorrect claim about how this worked (before seeing Said's response).

Comment by Vladimir_Nesov on Speaking of Stag Hunts · 2021-11-08T07:30:23.535Z · LW · GW

for brevity's sake

I think of robustness/redundancy as the opposite of nuance for the purposes of this thread. It's not the kind of redundancy where you set up a lot of context to gesture at an idea from different sides, specify the leg/trunk/tail to hopefully indicate the elephant. It's the kind of redundancy where saying this once in the first sentence should already be enough, the second sentence makes it inevitable, and the third sentence preempts an unreasonable misinterpretation that's probably logically impossible.

(But then maybe you add a second paragraph, and later write a fictional dialogue where characters discuss the same idea, and record a lecture where you present this yet again on a whiteboard. There's a lot of nuance, it adds depth by incising the grooves in the same pattern, and none of it is essential. Perhaps there are multiple levels of detail, but then there must be levels with little detail that make sense out of context, on their own, and the levels with a lot of detail must decompose into smaller self-contained points. I don't think I'm saying anything that's not tiresomely banal.)

Comment by Vladimir_Nesov on How do I keep myself/S1 honest? · 2021-11-08T06:04:37.818Z · LW · GW

I don't mean that S1 doesn't speak. It speaks a lot, like a talkative relative at a party, but it shouldn't be normative that its words are your words. You can disagree with its words, and it should be reasonable to hear you out when you do. You can demonstrate this distinction by allowing some of these disagreements to occur out loud in public. ("I just realized that I said X a few minutes ago. Actually I don't endorse that statement. Funny thing, I changed my mind about this a few years back, but I still occasionally parrot this more popular claim.")

Comment by Vladimir_Nesov on Speaking of Stag Hunts · 2021-11-08T05:09:01.145Z · LW · GW

The most obvious/annoying issue with karma is the false-disagreement zero-equilibrium controversy tug-of-war, which can't currently be split into more specific senses of voting to reveal that there actually is a consensus.

This can't be solved by pre-splitting, it has to act as needed, maybe co-opting the tagging system, with the default tag being "Boostworthy" (but not "Relevant" or anything specific like that), ability to see the tags if you click something, and ability to tag your vote with anything (one tag per voter, so to give a specific tag you have to untag "Boostworthy", and all tags sum up into the usual karma score that is the only thing that shows by default until you click something). This has to be sufficiently inconvenient to only get used when necessary, but then somehow become convenient enough for everyone to use (for that specific comment).

On the other hand there is Steam that only has approve/disapprove votes and gives vastly more useful quality ratings than most rating aggregators that are even a little bit more nuanced. So any good idea is likely to make things worse. (Though Steam doesn't have a zero equilibrium problem because the rating is the percentage of approve votes.)
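A tiny illustration with made-up numbers of the zero-equilibrium point versus a Steam-style percentage:

```python
# Two comments with the same net karma but very different reception.
votes = {
    "consensus-ish comment": {"up": 6, "down": 1},
    "controversial comment": {"up": 55, "down": 50},
}

for name, v in votes.items():
    net = v["up"] - v["down"]                   # karma-style score: +5 in both cases
    approval = v["up"] / (v["up"] + v["down"])  # Steam-style rating
    print(f"{name}: net {net:+d}, approval {approval:.0%}")
# consensus-ish comment: net +5, approval 86%
# controversial comment: net +5, approval 52%
```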

Comment by Vladimir_Nesov on Speaking of Stag Hunts · 2021-11-08T03:31:51.667Z · LW · GW

The benefits of nuance are not themselves nuance. Nuance is extremely useful, but not good in itself, and the bleed-through of its usefulness into positive affect is detrimental to clarity of thought and communication.

Capacity for nuance abstracts away this problem, so might be good in itself. (It's a capacity, something instrumentally convergent. Though things useful for agents can be dangerous for humans.)

Comment by Vladimir_Nesov on Speaking of Stag Hunts · 2021-11-08T03:20:21.874Z · LW · GW

I'm specifically boosting the prescriptivist point about not using the word "rational" in an inflationary way that doesn't make literal sense. Comments can be valid, explicit on their own epistemic status, true, relevant to their intended context, not making well-known mistakes, and so on and so forth, but they can't be rational, for the reason I gave, in the sense of "rational" as a property of cognitive algorithms.

I think this is a mistake

Incidentally, I like the distinction between error and mistake from linguistics, where an error is systematic or deliberatively endorsed behavior, while a mistake is intermittent behavior that's not deliberatively endorsed. That would have my comment make an error, not a mistake.

Comment by Vladimir_Nesov on Speaking of Stag Hunts · 2021-11-08T02:54:18.738Z · LW · GW

Nuance is the cost of precision and the bane of clarity. I think it's an error to feel positively about nuance (or something more specific like degrees of uncertainty), when it's a serious problem clogging up productive discourse, that should be burned with fire whenever it's not absolutely vital and impossible to avoid.

Comment by Vladimir_Nesov on Speaking of Stag Hunts · 2021-11-08T02:30:22.852Z · LW · GW

Rationality doesn't make sense as a property of comments. It's a quality of cognitive skills that work well (and might generate comments). Any judgement of comments according to the rationality of the algorithms that generated them is an ad hominem equivocation; the comments screen off the algorithms that generated them.

Comment by Vladimir_Nesov on Money Stuff · 2021-11-08T01:47:08.590Z · LW · GW

If you are about to say something socially neutral or approved, but a salient alternative to what you are saying comes with a cost (or is otherwise a target of appeals to consequences), integrity in making the claim requires a resolve to have said that alternative too if it (counterfactually) turned out to be what you believe (with some unclear "a priori" weighing that doesn't take into account your thinking on that particular topic). But that's not enough if you want others to have a fair opportunity to debate the claim you make, for they would also incur the cost of the alternative claims, and the trial preregistration pact must be acausally negotiated with them and not just accepted on your own.

See this comment and its parent for a bit more on this. This is a large topic, related to glomarization and (dis)honesty. These contraptions have to be built around anti-epistemology to counteract its distorting effects.

Comment by Vladimir_Nesov on Tell the Truth · 2021-11-08T01:31:28.809Z · LW · GW

In this case the principle that leaves the state of evidence undisturbed is to keep any argument for not murdering puppies to yourself as well, for otherwise you in expectation would create filtered evidence in favor of not murdering puppies.

This is analogous to trial preregistration, you just do the preregistration like an updateless agent, committing to act as if you've preregistered to speak publicly on any topic on which you are about to speak regardless of what it turns out you have to say on it. This either prompts you to say a socially costly thing (if you judge the preregistration a good deal) or to stay silent on a socially neutral or approved thing (if the preregistration doesn't look like a good deal).

Comment by Vladimir_Nesov on Speaking of Stag Hunts · 2021-11-06T19:10:56.345Z · LW · GW

point at small things as if they are important

Taking unimportant things seriously is important. It's often unknown that something is important, or known that it isn't, and that doesn't matter for the way in which it's appropriate to work on the details of what's going on with it. General principles of reasoning should work well for all examples, important or not. Ignoring details is a matter of curiosity, of allocating attention; it shouldn't impact how the attention that happens to fall on a topic treats it.

general enthusiasm for even rather dull and tedious and unsexy work

This is the distinction between "outer enthusiasm", considering a topic important, and "inner enthusiasm", integrity in working on a topic for however long you decide to do so, even if you don't consider the topic important. Inner enthusiasm is always worthwhile, and equivocation with outer enthusiasm makes it harder to notice that. Or that there should be less outer enthusiasm.

Comment by Vladimir_Nesov on Has LessWrong Been Mind-Killed on the Topic of God and Religion? · 2021-11-06T13:01:19.742Z · LW · GW

It doesn't matter if a discussion is sympathetic or not, that's not relevant to the problem I'm pointing out. Theism is not even an outgroup, it's too alien and far away to play that role.

Anti-epistemology is not a label for bad reasoning or disapproval of particular cultures. It's the specific phenomenon of memes and norms that promote systematically incorrect reasoning, where certain factual questions end up getting resolved to false answers that resist argument or natural intellectual exploration, certain topics or claims can't be discussed or thought about, and meaningless nothings hog all attention. It is the concept for the vectors of irrationality, the foundation of its staying power.

Comment by Vladimir_Nesov on Has LessWrong Been Mind-Killed on the Topic of God and Religion? · 2021-11-06T11:24:48.339Z · LW · GW

Closer to the object level, I like the post aesthetically, it's somewhat beautiful and well-crafted. I didn't find anything useful/interesting/specific in it, it only makes sense to me as a piece of art. At the same time, it fuels a certain process inimical to the purpose of LW.

Compare this with Scott Alexander's Moloch post or even Sarah Constantin's Ra post. There's specific content that the mythical analogies help organize and present.

The positive role of the mythical analogies would be the same in your post, but my impression was that in your post the payload is missing, and the mythical texture is closer to functional anti-epistemology, where it hasn't yet been ground down to the residue of artistic expression by distance from the origin and distortion in loose retelling.

discussion of religion in a positive light

Discussion in a negative light has its own problems, where instead of developing clarity of thought one is busy digging moats that keep an opposing ideology at bay, a different kind of activity that involves only a very pro forma version of what it takes not to drown in the more difficult questions that are relevant in their own right.

Comment by Vladimir_Nesov on Has LessWrong Been Mind-Killed on the Topic of God and Religion? · 2021-11-06T10:38:18.253Z · LW · GW

Anti-epistemology lives in particular topics and makes it hard/socially costly/illegal to discuss them without committing the errors it instills. Its presence is instrumentally convergent in topics relevant to power (over people), such as religion, politics, and anything else politically charged.

Ideas are transported by analogies, and anti-epistemology grows on new topics via analogies with topics already infected by it, if it's not fought back from the other side with sufficient clarity. The act of establishing an analogy with such a topic is, all else equal, sabotage.

Comment by Vladimir_Nesov on The Opt-Out Clause · 2021-11-04T09:16:42.936Z · LW · GW

If this is the simulated world of the thought experiment (abstract simulation), and opting-out doesn't change the abstract simulation, then the opting-out procedure did wake you up in reality, but the instance within the abstract simulation who wrote the parent comment has no way of noticing that. The concrete simulation might've ended, but that only matters for reality of the abstract simulation, not its content.

Comment by Vladimir_Nesov on How do I keep myself/S1 honest? · 2021-11-03T11:33:51.502Z · LW · GW

S1 shouldn't have the authority to speak for you. To the extent this norm is established, it helps with all sorts of situations where S1 is less than graceful (perhaps it misrepresents your attitude; there are many mistakes other than unintended lying). Unfortunately this is not a common norm, so it only starts working with sufficiently close acquaintances. And it needs an S2 that fuels the norm by dressing down S1 in public when appropriate, doesn't refuse to comment, and upholds the reputation of not making S1 a scapegoat.

Comment by Vladimir_Nesov on Money Stuff · 2021-11-03T10:51:22.606Z · LW · GW

These are statements whose truth can't be discussed, only claimed with filtered evidence. Like politics, this requires significant reframing to sidestep the epistemic landmines.