Posts

Announcing the CLR Foundations Course and CLR S-Risk Seminars 2024-11-19T01:18:10.085Z
How should I talk about optimal but not subgame-optimal play? 2022-06-30T13:58:59.576Z
[Madison] Collaborative Truthseeking 2019-03-26T04:09:55.881Z
[Madison] Meditations on Moloch 2018-11-28T19:21:08.231Z
Social Meetup: Bandung Indonesian 2018-11-17T06:44:22.671Z
Is skilled hunting unethical? 2018-02-17T18:48:21.635Z

Comments

Comment by JamesFaville (elephantiskon) on You can, in fact, bamboozle an unaligned AI into sparing your life · 2024-10-02T13:35:07.432Z · LW · GW

Strongly agree with this. How I frame the issue: If people want to say that they identify as an "experiencer" who is necessarily conscious, and don't identify with any nonconscious instances of their cognition, then they're free to do that from an egoistic perspective. But from an impartial perspective, what matters is how your cognition influences the world. Your cognition has no direct access to information about whether it's conscious such that it could condition on this and give different outputs when instantiated as conscious vs. nonconscious.

Note that in the case where some simulator deliberately creates a behavioural replica of a (possibly nonexistent) conscious agent, consciousness does enter into the chain of logical causality for why the behavioural replica says things about its conscious experience. Specifically, the role it plays is to explain what sort of behaviour the simulator is motivated to replicate. So the fact that many (or even all) non-counterfactual instances of your cognition are nonconscious doesn't seem to violate any Follow the Improbability heuristic.

Comment by JamesFaville (elephantiskon) on You can, in fact, bamboozle an unaligned AI into sparing your life · 2024-10-02T03:52:40.310Z · LW · GW

Thanks for the cool discussion, Ryan and Nate! This thread seemed pretty insightful to me. Here are some thoughts / things I'd like to clarify (mostly responding to Nate's comments).[1]

Who’s doing this trade?

In places it sounds like Ryan and Nate are talking about predecessor civilisations like humanity agreeing to the mutual insurance scheme? But humans aren’t currently capable of making our decisions logically dependent on those of aliens, or capable of rescuing them. So to be precise the entity engaging in this scheme or other acausal interactions on our behalf is our successor, probably a FAI, in the (possibly counterfactual or counterlogical) worlds where we solve alignment.

Nate says:

Roughly speaking, I suspect that the sort of civilizations that aren't totally fucked can already see that "comb through reality for people who can see me and make their decisions logically dependent on mine" is a better use of insurance resources, by the time they even consider this policy.

Unlike us, our FAI can see other aliens. So I think the operative part of that sentence is “comb through reality”—Nate’s envisioning a scenario where with ~85% probability our FAI has 0 reality-fluid before any acausal trades are made.[2] If aliens restrict themselves to counterparties with nonzero reality-fluid, and humans turn out to not be at a competence level where we can solve alignment, then our FAI doesn’t make the cut.

Note: Which FAI we deploy is unlikely to be physically overdetermined in scenarios where alignment succeeds, and definitely seems unlikely to be determined by more coarse-grained (not purely physical) models of how a successor to present-day humanity comes about. (The same goes for which UFAI we deploy.) I’m going to ignore this fact for simplicity and talk about a single FAI; let me know if you think it causes problems for what I say below.

Trading with nonexistent agents is normal

I do see an argument that agents trying to do insurance with similar motives to ours could strongly prefer to trade with agents who do ex post exist, and in particular those agents that ex post exist with more reality-fluid. It’s that insurance is an inherently risk-averse enterprise.[3] It doesn’t matter if someone offers us a fantastic but high-variance ex ante deal, when the whole reason we’re looking for insurance is in order to maximise the chances of a non-sucky ex post outcome. (One important caveat is that an agent might be able to do some trades to first increase their ex ante resources, and then leverage those increased resources in order to purchase better guarantees than they’d initially be able to buy.)

On the other hand, I think an agent with increasing utility in resources will readily trade with counterparties who wouldn’t ex post exist absent such a trade, but who have some ex ante chance of naturally existing according to a less informed prior of the agent. I get the impression Nate thinks agents would avoid such trades, but I’m not sure / this doesn’t seem to be explicit outside of the mutual insurance scenario.

There’s two major advantages to trading with ex post nonexistent agents, as opposed to updating on (facts upstream of) their existence and consequently rejecting trade with them:

  • Ex post nonexistent agents who are risk-averse w.r.t. their likelihood of meeting some future threshold of resources/value, like many humans seem to be, could offer you deals that are very attractive ex ante. 
  • Adding agents who (absent your trade) don’t ex post exist to the pool of counterparties you’re willing to trade with allows you to be much more selective when looking for the most attractive ex ante trades.

The main disadvantage is that by not conditioning on a counterparty’s existence you’re more likely to be throwing resources away ex post. The counterparty needs to be able to compensate you for this risk (as the mugger does in counterfactual mugging). I’d expect this bar is going to be met very frequently.

To recap, I'm saying that for plausible agents carrying out trades with our FAI, Nate's 2^-75 number won't matter. Instead, it would be something closer to the 85% number that matters—an ex ante rather than an ex post estimate of the FAI’s reality-fluid.
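
To make the ex ante vs. ex post point concrete, here's a toy calculation. The payoff numbers are made up; the 0.15 and 2^-75 are just the thread's ~15% deployment probability and Nate's figure.

```python
# Toy sketch: a counterparty weighs the insurance trade by its ex ante credence that
# our FAI gets deployed, not by an ex post estimate of the FAI's reality-fluid.
cost_of_rescue = 1.0        # hypothetical resources spent saving us in doom-worlds
payment_if_deployed = 10.0  # hypothetical payment from our FAI in worlds where it exists

for p_deploy in [0.15, 2**-75]:
    ev = p_deploy * payment_if_deployed - (1 - p_deploy) * cost_of_rescue
    print(f"P(FAI deployed) = {p_deploy:.3g}  ->  counterparty's EV = {ev:.3g}")
# With the ~15% prior the trade is clearly profitable; with the 2^-75 number it never is.
```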

But would our FAI do the trade if it exists?

Nate says (originally talking about aliens instead of humanity):

As I understand it, the argument goes something like "your counterfactual picture of reality should include worlds in which your whole civilization turned out to be much much less competent, and so when you imagine the multiverse where you pay for all humanity to live, you should see that, in the parts of the multiverse where you're totally utterly completely incompetent and too poor to save anything but a fraction of your own brethren, somebody else pays to save you".

I agree that doing an insurance trade on behalf of a civilisation requires not conditioning on that civilisation’s competence. Nate implies that aliens’ civilisational competence is “tightly intertwined with [aliens’] basic reasoning faculties”, and this seems probably true for alien or human members of predecessor civilisations? But I don’t know why the civilisational competence of a FAI’s predecessor would be tightly intertwined with the FAI’s cognition. As mentioned above, I think the relevant actor here is our FAI, not our current selves. 

We can further specify civilisational competence (relative to the stakes of alignment) as a function of two variables:

  1. Physical facts about a civilisation’s history (i.e. the arrangement of atoms).
  2. Logical facts (beyond those accessible to current humans) that govern the relationship between civilisations instantiated via physics, and what sort of AI certain types of civilisations are likely to deploy.

Either of these when combined with the other provides evidence about what sort of AI a predecessor civilisation deploys, but each will be uninformative on its own. I have in mind that agents executing an insurance trade would condition on all physical facts about their counterparty's civilisation—up until some truncation point that’s plausibly late enough to be capturing our current selves—but would not condition on some logical facts that are necessary to interpret those physical facts into a ~determinate answer as to whether the civilisation solves alignment.

Conditioning on those logical facts sounds pretty analogous to conditioning on a digit of pi to me. The important part is that the facts an agent chooses not to condition on aren't determined by core parts of an agent's cognition / decision procedure. Those facts being determinative of an agent's amount of reality-juice is typically fine; this just discounts the value of the resources they possess when making such trades.

Does this mean we can have nice things?

So overall, I think that aliens or earth-originating UFAIs (who aren’t motivated by insurance) would be pretty interested in trading with our FAI, and vice-versa. Counterparties would discount the FAI’s resources by their prior probability that it’s deployed (before conditioning on factors that pin this down).

Because we’re assuming our FAI would be willing to offer terms that are terrible for us if denominated in measure-weighted resources, counterparties would gain ex ante resources by engaging in an insurance trade with it. Those resources could later be used to engage in trade with others who are themselves willing to (indirectly) trade with nonexistent agents, and who don’t have much more sceptical priors about the deployment odds of our FAI. So because the trade at hand yields a net profit, I don't think it competes much with ordinary alternative demands for counterparties’ resources.

Nevertheless, here’s a few (nonexhaustive) reasons why this trade opportunity wouldn't be taken by another updateless AI:

  • The agent has a better trading opportunity which is either sensitive to when in (logical) time they start fulfilling it, or which demands all the agent's current resources (at the time of discovering the trade) without compensating the agent for further profits.
  • The expected transaction costs of finding agents like our FAI outweigh the expected benefits from trade with it.
    • This might be plausible for aliens without hypercompute; I don't think it's plausible for earth-originating UFAI absent other effects.
      • …But I'm also not sure how strongly earth-originating AIs converge on UDT, before we select for those with more ex ante resources. Even all UDT earth-originating UFAIs doing insurance trades with their predecessors could be insufficient to guarantee survival.
    • Variation: Contrary to what I expect, maybe doing "sequential" acausal trades is not possible without each trade increasing transaction costs for counterparties an agent later encounters, to an extent that a (potentially small-scale) insurance trade with our FAI would be net-negative for agents who intend to do a lot of acausal trade.
  • The agent disvalues fulfilling our end of the trade enough that it's net-negative for it.
  • A maybe-contrived example of our FAI not being very discoverable: Assume the MUH. Maybe our world looks equally attractive to prospective acausal traders as an uncountable number of others. If an overwhelming majority of measure-weighted resources in our section of the multiverse is possessed by countably many agents who don’t have access to hypercompute, we'd have an infinitesimal chance of being simulated by one of them.
  • Our FAI has some restrictions on what a counterparty is allowed to do with the resources it purchases, which could drive down the value of those resources a lot.

Overall I'd guess 30% chance humanity survives misalignment to a substantial extent through some sort of insurance trade, conditional on us not surviving to a substantial extent in another, cheaper way?

Other survival mechanisms

I’m pretty uncertain about how Evidential cooperation in large worlds works out, but at my current rough credences I do think there’s a good chance (15%) we survive through something which pattern-matches to that, or through other schemes that look similar but have more substantial differences (10%).

I also put some credence in there being very little of us in base reality, and some of those scenarios could involve substantial survival odds. (Though I weakly think the overall contribution of these scenarios is undesirable for us.)

  1. ^

    Meta: I don’t think figuring out insurance schemes is very important or time-sensitive for us. But I do think understanding the broader dynamics of acausal interactions that determine when insurance schemes would work could be very important and time-sensitive. Also note I'd bet I misinterpreted some claims here, but got to the point where it seemed more useful to post a response than work on better comprehension. (In particular I haven't read much on this page beyond this comment thread.)

  2. ^

    I don’t think Nate thinks alignment would be physically overdetermined if misalignment winds up not being overdetermined, but we can assume for simplicity there’s a 15% chance of our FAI having all the reality fluid of the Everett branches we’re in.

  3. ^

    I'm not clear on what the goal of this insurance scheme is exactly. Here's a (possibly not faithful) attempt: we want to maximise the fraction of reality-fluid devoted to minds initially ~identical to ours that are in very good scenarios as opposed to sucky ones, subject to a constraint that we not increase the reality-fluid devoted to minds initially ~identical to us in sucky scenarios. I’m kind of sympathetic to this—I think I selfishly care about something like this fraction. But it seems higher priority to me to minimise the reality-fluid devoted to us in sucky / terrible scenarios, and higher priority still to use any bargaining power we have for less parochial goals.

Comment by JamesFaville (elephantiskon) on How to Give in to Threats (without incentivizing them) · 2024-09-12T22:12:11.682Z · LW · GW

It's definitely not clear to me that updatelessness + Yudkowsky's solution prevent threats. The core issue is that a target and a threatener face a prima facie symmetric decision problem of whether to use strategies that depend on their counterpart's strategy or strategies that do not depend on their counterpart's strategy.[1]

In other words, the incentive that targets have to use non-dependent strategies which incentivise favourable (no-threat) responses from threateners is the same as the incentive that threateners have to use non-dependent strategies which incentivise favourable (give-in-to-threat) responses from targets. This problem is discussed in more detail in parts of Responses to apparent rationalist confusions about game / decision theory and in Updatelessness doesn't solve most problems.

There are potential symmetry breakers that privilege a no-threat equilibrium, such as the potential for cooperation between different targets. However, there are also potential symmetry breakers in the other direction. I expect Yudkowsky is aware of the symmetry of this problem and either thinks the symmetry breakers in favour of no-threats seem very strong, or is just very confident in the superintelligences-should-figure-this-stuff-out heuristic. Relatedly, this post argues that mutually transparent agents should be able to avoid most of the harm of threats being executed, even if they are unable to avoid threats from being made.

But these are different arguments to the one you make here, and I'm personally unconvinced even these arguments are strong enough that it's not very important for us to work on preventing harmful threats from being made by or against AIs that humanity deploys.

FYI, a lot of the Center on Long-Term Risk's research is motivated by this problem; I suggest reaching out to us if you're interested in working on it!

  1. ^

    Examples of non-dependent strategies would include

    • Refusing all threats regardless of why they were made
    • Refusing threats to the extent prescribed by Yudkowsky's solution regardless of why they were made
    • Making threats regardless of a target's refusal strategy when the target is incentivised to give in

    An example of a dependent strategy would be

    • Refusing threats more often when a threatener accurately predicted whether or not you would refuse in order to determine whether to make a threat; and refusing threats less often when they did not predict you, or did so less accurately

Comment by JamesFaville (elephantiskon) on Which rationality posts are begging for further practical development? · 2023-07-24T00:36:42.429Z · LW · GW

How to deal with crucial considerations and deliberation ladders (link goes to a transcript + audio).

Comment by JamesFaville (elephantiskon) on Worst-case thinking in AI alignment · 2023-06-18T12:44:37.094Z · LW · GW

I like this post a lot! Three other reasons came to mind, which might technically be encompassed by some of the current ones, but which at least seemed to mostly fall outside the post's framing of them.

Some (non-agentic) repeated selections won't terminate until they find a bad thing
In a world with many AI deployments, an overwhelming majority of deployed agents might be unable to mount a takeover, but the generating process for new deployed agents might not halt until a rare candidate that can mount a takeover is found. More specifically, consider a world where AI progress slows (either due to governance interventions or a new AI winter), but people continue conducting training runs at a fairly constant level of sophistication. Suppose that for these state-of-the-art training runs (i) there is only a negligible chance of finding a non-gradient-hacked AI that can mount a takeover or enable a pivotal act, but (ii) there is a tiny but nonnegligible chance of finding a gradient hacker that can mount a takeover.[1] Then eventually we will stumble across an unlikely training run that produces a gradient hacker.

This problem mostly seems like a special case of You're being optimised against, though here you are not optimised against by an agent, but rather by the nature of the problem. Alternatively, this example could be lumped into The space you’re selecting over happens to mostly contain bad things if we either (i) reframe the space under consideration from "deployed AIs" to "AIs capable of mounting a takeover" (h/t Thomas Kehrenberg), or (ii) reframe The space you’re selecting over happens to mostly contain bad things to The space you’re selecting over happens to mostly contain bad things, relative to the number of selections made.  But I think the fact that a selection may not terminate until a bad thing has been found is an important thing to pay attention to when it comes up, and weakly think it'd be useful to have a separate conceptual handle for it.
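
As a minimal sketch of why the termination condition matters (the per-run probability below is entirely made up):

```python
# Even a tiny per-run chance of producing a takeover-capable gradient hacker gets
# realised almost surely if the selection process just keeps drawing new runs.
p_bad = 1e-6  # hypothetical probability that a single training run yields a gradient hacker

for n_runs in [10**3, 10**5, 10**6, 10**7]:
    p_at_least_one = 1 - (1 - p_bad) ** n_runs
    print(f"{n_runs:>10,} runs -> P(at least one gradient hacker) = {p_at_least_one:.4f}")
# As the number of runs grows without bound this probability tends to 1: the selection
# doesn't terminate until the bad thing is found.
```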

Aiming your efforts at worst-case scenarios
As long as some failure states are worse than others, optimising for the satisfaction of a binary success criterion won't generally be sufficient to maximise your marginal impact. Instead, you should target worlds based in part on how bad failure within them would be, along with the change in success probability for a marginal contribution. For example, maybe many low P(doom) worlds are such because intent-aligning AI turns out to be pretty straightforward in them. But easy intent-alignment may imply higher misuse risk, such that if misuse risk is more concerning than accident risk then contributing towards solving alignment problems in ways robust to misuse may remain very high impact in easy-intent-alignment worlds.[2]

One alternative way to state this consideration is that in most domains, there are actually multiple overlapping success criteria. Sometimes the more easily satisfied ones will be much higher-priority to target—even if your marginal contributions result in smaller changes to the odds of satisfying them—because they are more important.

This consideration is the main reason I prioritise worst-case AI outcomes (i.e. s-risks) over ordinary x-risk from AI.

Some bad things might be really bad
In a similar vein, for The space you’re selecting over happens to mostly contain bad things, it's not the raw probability of selecting a bad thing that matters, but the product of that with the expected harm of a bad thing. Since some bad things are Really Very Terrible, sometimes it will make sense to use worst-case assumptions even when bad things are quite rare, as long as the risk of finding one isn't Pascalian. I think the EU of an insecure selection is at particular risk of being awful whenever the left tail of the utility distribution of things you're selecting for is much thicker than the right.
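
A quick toy illustration of that last point (all numbers are made up):

```python
# With a thick left tail, the expected utility of a selection can be dominated by rare,
# Really Very Terrible outcomes even though almost every draw looks fine.
import random

def draw_utility(rng: random.Random) -> float:
    if rng.random() < 1e-3:            # rare catastrophic selection
        return -1_000_000.0
    return rng.uniform(0.0, 10.0)      # typical, mildly positive outcome

rng = random.Random(0)
samples = [draw_utility(rng) for _ in range(200_000)]
print("mean utility per selection:", round(sum(samples) / len(samples), 1))
# Over 99.9% of draws are positive, yet the expectation is around -995, so worst-case
# assumptions can make sense even when bad things are rare (as long as the risk of
# finding one isn't Pascalian).
```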

  1. ^

    This is plausible to me because gradient-hacking could yield a "sharp left turn", taking us very OOD relative to the sort of models that training runs had previously been producing. Some other sharp left turn candidates should work just as well in this example.

  2. ^

    This is an interesting example, because in low P(doom) worlds of this sort marginal efforts to advance intent-alignment seem more likely to be harmful. If that were the case, alignment researchers would want to prioritise developing techniques that differentially help align AI to widely endorsed values rather than to the intent of an arbitrary deployer. Efforts to more directly intervene to prevent misuse would also look pretty valuable.

    But because of effects like these, it's not obvious that you would want to prioritise low P(doom) worlds even if you were convinced that failure within them was worse than in high P(doom) worlds, since advancing-intent-alignment interventions might be helpful in most other worlds where it might be harder for malevolent users to make use of them. (And it's definitely not apparent to me in reality that failure in low P(doom) worlds is worse than in high P(doom) worlds for this reason; I just thought this would make for a good example!)

Comment by JamesFaville (elephantiskon) on Why and When Interpretability Work is Dangerous · 2023-06-12T13:21:14.631Z · LW · GW

Another way interpretability work can be harmful: some means by which advanced AIs could do harm require them to be credible. For example, in unboxing scenarios where a human has something an AI wants (like access to the internet), the AI might be much more persuasive if the gatekeeper can verify the AI's statements using interpretability tools. Otherwise, the gatekeeper might be inclined to dismiss anything the AI says as plausibly fabricated. (And interpretability tools provided by the AI might be more suspect than those developed beforehand.)

It's unclear to me whether interpretability tools have much of a chance of becoming good enough to detect deception in highly capable AIs. And there are promising uses of low-capability-only interpretability -- like detecting early gradient hacking attempts, or designing an aligned low-capability AI that we are confident will scale well. But to the extent that detecting deception in advanced AIs is one of the main upsides of interpretability work people have in mind (or if people do think that interpretability tools are likely to scale to highly capable agents by default), the downsides of those systems being credible will be important to consider as well.

Comment by JamesFaville (elephantiskon) on MIRI announces new "Death With Dignity" strategy · 2022-04-03T13:15:13.992Z · LW · GW

There is another very important component of dying with dignity not captured by the probability of success: the badness of our failure state. While any alignment failure would destroy much of what we care about, some alignment failures would be much more horrible than others. Probably the more pessimistic we are about winning, the more we should focus on losing less absolutely (e.g. by researching priorities in worst-case AI safety).

Comment by JamesFaville (elephantiskon) on The average North Korean mathematician · 2021-03-07T22:19:08.006Z · LW · GW

I feel conflicted about this post. Its central point as I'm understanding it is that much evidence we commonly encounter in varied domains is only evidence about the abundance of extremal values in some distribution of interest, and whether/how we should update our beliefs about the non-extremal parts of the distribution is very much dependent on our prior beliefs or gears-level understanding of the domain. I think this is a very important idea, and this post explains it well.

Also, felt inspired to search out other explanations of the moments of a distribution - this one looks pretty good to me so far.

On the other hand, the men's rights discussion felt out of place to me, and unnecessarily so since I think other examples would be able to work just as well. Might be misjudging how controversial various points you bring up are, but as of now I'd rather see topics of this level of potential political heat discussed in personal blogposts or on other platforms, so long as they're mostly unrelated to central questions of interest to rationalists / EAs.

Comment by JamesFaville (elephantiskon) on Generalized Heat Engine · 2020-11-08T01:35:28.067Z · LW · GW

This is super interesting!

Quick typo note (unless I'm really misreading something): in your setups, you refer to coins that are biased towards tails, but in your analyses, you talk about the coins as though they are biased towards heads.

One is the “cold pool”, in which each coin comes up 1 (i.e. heads) with probability 0.1 and 0 with probability 0.9. The other is the “hot pool”, in which each coin comes up 1 with probability 0.2

 random coins with heads-probability 0.2

We started with only  tails

full compression would require roughly  tails, and we only have about 

Comment by JamesFaville (elephantiskon) on human psycholinguists: a critical appraisal · 2020-01-05T17:00:33.733Z · LW · GW

As far as I'm aware, there was not (in recent decades at least) any controversy that word/punctuation choice was associative. We even have famous psycholinguistics experiments telling us that thinking of the word "goose" makes us more likely to think of the word "moose" as well as "duck" (linguistic priming is the one type of priming that has held up to the replication crisis as far as I know). To the extent that linguists bothered to make computational models, I think those would have failed to produce human-like speech because their associative models were not powerful enough.

Comment by JamesFaville (elephantiskon) on human psycholinguists: a critical appraisal · 2019-12-31T13:34:24.070Z · LW · GW

This comment does not deserve to be downvoted; I think it's basically correct. GPT-2 is super-interesting as something that pushes the bounds of ML, but is not replicating what goes on under-the-hood with human language production, as Marcus and Pinker were getting at. Writing styles don't seem to reveal anything deep about cognition to me; it's a question of word/punctuation choice, length of sentences, and other quirks that people probably learn associatively as well.

Comment by JamesFaville (elephantiskon) on Information empathy · 2019-07-30T05:56:23.620Z · LW · GW

Why should we say that someone has "information empathy" instead of saying they possess a "theory of mind"?

Possible reasons: "theory of mind" is an unwieldy term, it might be useful to distinguish in fewer words a theory of mind with respect to beliefs from a theory of mind with respect to preferences, you want to emphasise a connection between empathy and information empathy.

I think if there's established terminology for something we're interested in discussing, there should be a pretty compelling reason why it doesn't suffice for us.

Comment by JamesFaville (elephantiskon) on On AI and Compute · 2019-04-04T15:30:17.491Z · LW · GW

It felt weird to me to describe shorter timeline projections as "optimistic" and longer ones as "pessimistic": AI research taking place over a longer period is going to be more likely to give us friendly AI, right?

Comment by JamesFaville (elephantiskon) on (Why) Does the Basilisk Argument fail? · 2019-02-10T14:12:26.657Z · LW · GW

This approach can be made a little more formal with FDT/LDT/TDT: being the sort of agent who robustly does not respond to blackmail maximises utility more than being the sort of agent who sometimes gives in to blackmail, because you will not wind up in situations where you're being blackmailed.

Comment by JamesFaville (elephantiskon) on Subjunctive Tenses Unnecessary for Rationalists? · 2018-10-09T16:52:24.784Z · LW · GW

The subjunctive mood and really anything involving modality is complicated. Paul Portner has a book on mood which is probably a good overview if you're willing to get technical. Right now I think of moods as expressing presuppositions on the set of possible worlds you quantify over in a clause. I don't think it's often a good idea to try to get people to speak a native language in a way incompatible with the language as they acquired it in childhood; it adds extra cognitive load and probably doesn't affect how people reason (the exception being giving them new words and categories, which I think can clearly help reasoning in some circumstances).

Comment by JamesFaville (elephantiskon) on A compendium of conundrums · 2018-10-08T21:14:48.140Z · LW · GW

These are a blast!

Comment by JamesFaville (elephantiskon) on Advice Wanted; Reconcile with religious parent · 2018-09-22T15:27:53.065Z · LW · GW

I'm atheist and had an awesome Yom Kippur this year, so believing in God isn't a pre-req for going to services and not being unhappy. I think it would be sad if your father's kids gave up ritual practices that were especially meaningful to him and presumably to his ancestors. I think it would be sad if you sat through services that were really unpleasant for you year after year. I think it would be really sad if your relationship with your father blew up over this.

I think the happiest outcome would be that you wind up finding bits of the high holidays that you can enjoy, and your dad is satisfied with you maybe doing a little less than he might like. Maybe being stuck in synagogue for an entire day is bad, but going there for an hour or two gives you some interesting ethnographic observations to mull over. Talk it out with him, see what he really values, and compromise if you can.

Comment by JamesFaville (elephantiskon) on Wirehead your Chickens · 2018-06-21T21:43:39.485Z · LW · GW

I've seen this discussed before by Rob Wiblin and Lewis Bollard on the 80,000 Hours podcast (edit: tomsittler actually beat me to the punch in mentioning this).

Robert Wiblin: Could we take that even further and ultimately make animals that have just amazing lives that are just constantly ecstatic like they’re on heroin or some other drug that makes people feel very good all the time whenever they are in the farm and they say, “Well, the problem has basically been solved because the animals are living great lives”?
Lewis Bollard: Yeah, so I think this is a really interesting ethical question for people about whether that would, in people's minds, solve the problem. I think from a pure utilitarian perspective it would. A lot of people would find that kind of perverse, having, for instance, particularly I think if you're talking about animals that might psychologically feel good even in terrible conditions. I think the reason why it's probably going to remain a thought experiment, though, is that it ultimately relies on the chicken genetics companies and the chicken producers to be on board...

I encourage anyone interested to listen to this part of the podcast or read it in the transcript, but it seems clear to me right now that it will be far easier to develop clean meat which is widely adopted than to create wireheaded chickens whose meat is widely adopted.

In particular, I think that implementing these strategies from the OP will be at least as difficult as creating clean meat:

  • breed animals who enjoy pain, not suffer from it
  • breed animals that want to be eaten, like the Ameglian Major Cow from the Hitchhiker's Guide to the Galaxy

I think that getting these strategies widely adopted is at least as difficult as getting enough welfare improvements widely adopted to make non-wireheaded chicken lives net-positive:

  • identify and surgically or chemically remove the part of the brain that is responsible for suffering
  • at birth, amputate the non-essential body parts that would give the animals discomfort later in life

I think that breeding for smaller brains is not worthwhile because smaller brain size does not guarantee reduced suffering capacity, and getting it widely adopted by chicken breeders is not obviously easier than getting many welfare improvements widely adopted.

I'm not as confident that injecting chickens with opioids would be a bad strategy, but getting this widely adopted by chicken farms is not obviously easier to me than getting many other welfare improvements widely adopted. I would be curious to see the details of the study romeostevensit mentioned, but my intuition is that outrage at that practice would far exceed outrage at current factory farm practices because of "unnaturalness", which would make adoption difficult even if the cost of opioids is low.

Comment by JamesFaville (elephantiskon) on Beyond Astronomical Waste · 2018-06-12T11:54:25.512Z · LW · GW

Nothing, if your definition of a copy is sufficiently general :-)

Am I understanding you right that you believe in something like a computational theory of identity and think there's some sort of bound on how complex something we'd attribute moral patienthood or interestingness to can get? I agree with the former, but don't see much reason for believing the latter.

Comment by JamesFaville (elephantiskon) on A Rationalist Argument for Voting · 2018-06-08T01:30:04.672Z · LW · GW

I just listened to a great talk by Nick Bostrom I'd managed to miss before now which mentions some considerations in favor and opposed to voting. He does this to illustrate a general trend that in certain domains it's easy to come across knock-down arguments ("crucial considerations") that invalidate or at least strongly counter previous knock-down arguments. Hope I summarized that OK!

When I last went to the polls, I think my main motivation for doing so was functional decision theory.

Comment by JamesFaville (elephantiskon) on Beyond Astronomical Waste · 2018-06-08T01:17:53.867Z · LW · GW

I feel like scope insensitivity is something to worry about here. I'd be really happy to learn that humanity will manage to take good care of our cosmic endowment but my happiness wouldn't scale properly with the amount of value at stake if I learned we took good care of a super-cosmic endowment. I think that's the result of my inability to grasp the quantities involved rather than a true reflection of my extrapolated values, however.

My concern is more that reasoning about entities in simpler universes capable of conducting acausal trades with us will turn out to be totally intractable (as will the other proposed escape methods), but since I'm very uncertain about that I think it's definitely worth further investigation. I'm also not convinced Tegmark's MUH is true in the first place, but this post is making me want to do more reading on the arguments in favor & opposed. It looks like there was a Rationally Speaking episode about it?

Comment by JamesFaville (elephantiskon) on Shadow · 2018-03-15T02:38:20.456Z · LW · GW

I actually like the idea of building a "rationalist pantheon" to give us handy, agenty names for important but difficult concepts. This requires more clearly specifying what the concept being named is: can you clarify a bit? Love Wizard of Earthsea, but don't get what you're pointing at here.

Comment by JamesFaville (elephantiskon) on Is skilled hunting unethical? · 2018-02-19T15:31:31.335Z · LW · GW

I think normal priors on moral beliefs come from a combination of:

  • Moral intuitions
  • Reasons for belief that upon reflection, we would accept as valid (e.g. desire for parsimony with other high-level moral intuitions, empirical discoveries like "vaccines reduce disease prevalence")
  • Reasons for belief that upon reflection, we would not accept as valid (e.g. selfish desires, societal norms that upon reflection we would consider arbitrary, shying away from the dark world)

I think the "Disney test" is useful in that it seems like it depends much more on moral intuitions than on reasons for belief. In carrying out this test, the algorithm you would follow is (i) pick a prior based on the movie heuristic, (ii) recall all consciously held reasons for belief that seem valid, (iii) update your belief in the direction of those reasons from the heuristic-derived prior. So in cases where our belief could be biased by (possibly unconscious) reasons for belief that upon reflection we would not accept as valid, where the movie heuristic isn't picking up many of these reasons, I'd expect this algorithm to be useful.

In the case of vaccinations, the algorithm makes the correct prediction: the prior-setting heuristic would give you a strong prior that vaccinations are immoral, but I think the valid reasons for belief are strong enough that the prior is easily overwhelmed.

I can come up with a few cases where the heuristic points me towards other possible moral beliefs I wouldn't have otherwise considered, whose plausibility I've come to think is undervalued upon reflection. Here's a case where I think the algorithm might fail: wealth redistribution. There's a natural bias towards not wanting strong redistributive policies if you're wealthy, and an empirical case in favor of redistribution within a first-world country with some form of social safety net doesn't seem nearly as clear-cut to me as vaccines. My moral intuition is that hoarding wealth is still bad, but I think the heuristic might point the other way (it's easy to make a film about royalty with lots of servants, although there are some examples like Robin Hood in the other direction).

Also, your comments have made me think a lot more about what I was hoping to get out of the heuristic in the first place and about possible improvements; thanks for that! :-)

Comment by JamesFaville (elephantiskon) on Is skilled hunting unethical? · 2018-02-18T00:34:12.960Z · LW · GW

I don't think the vaccination example shows that the heuristic is flawed: in the case of vaccinations, we do have strong evidence that vaccinations are net-positive (since we know their impact on disease prevalence, and know how much suffering there can be associated with vaccinatable diseases). So if we start with a prior that vaccinations are evil, we quickly update to the belief that vaccinations are good based on the strength of the evidence. This is why I phrased the section in terms of prior-setting instead of evidence, even though I'm a little unsure how a prior-setting heuristic would fit into a Bayesian epistemology. If there's decently strong evidence that skilled hunting is net-positive, I think that should outweigh any prior developed through the children's movie heuristic. But in the absence of such evidence, I think we should default to the naive position of it being unethical. Same with vaccines.

I'd be interested to know if you can think of a clearer counterexample though: right now, I'm basing my opinion of the heuristic on a notion that the duck test is valuable when it comes to extrapolating moral judgements from a mess of intuitions. What I have in mind as a counterexample is a behavior that upon reflection seems immoral but without compelling explicit arguments on either side, for which it is much easier to construct a compelling children's movie whose central conceit is that the behavior is correct than it is to construct a movie with the conceit that the behavior is wrong (or vice-versa).

Comment by JamesFaville (elephantiskon) on Is skilled hunting unethical? · 2018-02-17T21:47:31.707Z · LW · GW

Thanks for the feedback Raemon!

Concrete Concerns

I'd like to see ["when predators are removed from a system, a default thing that seems to happen is that death-by-predator is replaced by death-by-starvation" and "how do you do population control without hunting?"] at least touched on in wild-animal-suffering pieces

I'd like to see those talked about too! The reason I didn't is I really don't have any insights on how to do population control without hunting, or on which specific interventions for reducing wild animal suffering are promising. I could certainly add something indicating I think those sorts of questions are important, but that I don't really have any answers beyond "create welfare biology" and "spread anti-speciesism memes so that when we have better capabilities we will actually carry out large interventions".

have a table of contents of the issues at hand

I had a bit of one in the premise ("wild animal welfare, movement-building, habit formation, moral uncertainty, how to set epistemic priors"), but it sounds like you might be looking for something different/more specific? You're not talking about a table of contents consisting of more or less the section headings right?

Aiming to Persuade vs Inform

My methodology was "outline different reasons why skilled hunting could remain an unethical action", but I did a poor job of writing if the article seemed as though I thought each reason was likely to be true! I did put probabilities on everything to calculate the 90% figure at the top, but since I don't consider myself especially well-calibrated I thought it might be better to leave them off... The only reason that I think is actually more likely to be valid than wrong is #3, but I do assign enough probability mass to the others that I think they're of some concern.

I thought the arguments in favor of skilled hunting (make hunters happy and prevent animals from experiencing lives which might involve lots of suffering) were pretty apparent and compelling, but I might be typical-minding that. I also might be missing something more subtle?

In terms of whether that methodology was front-page appropriate, I do think that if the issue I was writing about was something slightly more political this would be very bad. But as I saw it, the main content of the piece isn't the proposition that skilled hunting is unethical, it's the different issues that come up in the process discussing it ("wild animal welfare, movement-building, habit formation, moral uncertainty, how to set epistemic priors"). My goal is not to persuade people that I'm right and you must not hunt even if you're really good at it, but to talk about interesting hammers in front of an interesting nail.

[Edit: Moved to personal blog.]

Comment by JamesFaville (elephantiskon) on Rationalist Lent · 2018-02-14T14:38:59.660Z · LW · GW

Why do you think we should be more worried about reading fiction? Associated addictiveness, time consumption, escapism?

Comment by JamesFaville (elephantiskon) on What Are Meetups Actually Trying to Accomplish? · 2018-02-09T01:10:29.794Z · LW · GW

Possible low-hanging fruit: name tags.

Comment by JamesFaville (elephantiskon) on What the Universe Wants: Anthropics from the POV of Self-Replication · 2018-01-12T20:43:43.783Z · LW · GW

What I'm taking away from this is that if (i) it is possible for child universes to be created from parent universes, and if (ii) the "fertility" of a child universe is positively correlated with that of its parent universe, then we should expect to live in a universe which will create lots of fertile child universes, whether this is accomplished through a natural process or as you suggest through inhabitants of the universe creating fertile child universes artificially.

I think that's a cool concept, and I wrote a quick Python script for a toy model to play around with. The consequences you draw seem kind of implausible to me though (I might try to write more on that later).
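
Roughly the kind of thing I mean by a toy model (this isn't the script I actually wrote, just an illustrative re-sketch; the Poisson offspring counts and the noise term are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(generations: int = 8, initial: int = 200, cap: int = 20_000) -> None:
    # fertility = expected number of child universes a universe spawns
    fertility = rng.uniform(0.0, 2.0, size=initial)
    print("mean fertility, generation 0:", round(float(fertility.mean()), 2))
    for _ in range(generations):
        kids = []
        for f in fertility:
            n_children = rng.poisson(f)
            # (ii): a child's fertility is correlated with its parent's, plus noise
            kids.extend(np.clip(f + rng.normal(0.0, 0.3, size=n_children), 0.0, None))
        if not kids:                       # the whole lineage died out
            break
        fertility = np.array(kids)[:cap]   # crude cap to keep runtime bounded
    print(f"mean fertility after {generations} generations:", round(float(fertility.mean()), 2))

simulate()
# A universe drawn from a late generation is disproportionately likely to sit in a
# high-fertility lineage, which is the anthropic point above.
```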

Comment by elephantiskon on [deleted post] 2018-01-12T05:38:43.395Z

Essentially, I read this as an attempt at continental philosophy rather than analytic philosophy, and I don't find continental-style work very interesting or useful. I believe you that the post is meaningful and thoughtful, but the costs of time or effort to understand the meanings or thoughts you're driving at are too high for me at least. I think trying to lay things out in a more organized and explicit manner would be helpful for your readers and possibly for you in developing these thoughts.

I don't want to get too precise about answering the above unless you're still interested in me doing so and don't mind me stating things in a way that might come across as pretty rude. Also, limiting myself to one more reply here since I should really stop procrastinating work, and just in case.

Comment by elephantiskon on [deleted post] 2018-01-11T20:03:38.393Z

I'm downvoting this post because I don't understand it even after your reply above, and the amount of negative karma currently on the post indicates to me that it's probably not my fault. It's possible to write a poetic and meaningful post about a topic and pleasant when someone has done so well, but I think you're better off first trying to state explicitly whatever you're trying to state to make sure the ideas are fundamentally plausible. I'm skeptical that meditations on a topic of this character are actually helpful to truth-seeking, but I might be typical-minding you.

Comment by JamesFaville (elephantiskon) on An Artificial paradise made by humans. (A bit Sci-fi idea) · 2018-01-11T03:33:38.515Z · LW · GW

I'm downvoting this because it appears to be a low-effort post which doesn't contribute or synthesize any interesting ideas. Prime Intellect is the novel that first comes to mind as discussing some of what you're talking about, but several chapters are very disturbing, and there's probably better examples out there. If you have Netflix, San Junipero (Season 3 Episode 4) of Black Mirror is fantastic and very relevant.

Comment by JamesFaville (elephantiskon) on The Loudest Alarm Is Probably False · 2018-01-03T00:30:23.591Z · LW · GW

I like this post's brevity, its usefulness, and the nice call-to-action at the end.

Comment by elephantiskon on [deleted post] 2018-01-03T00:24:45.679Z

I found the last six paragraphs of this piece extremely inspiring, to the extent that I think it non-negligibly raised the likelihood that I'll be taking "exceptional action" myself. I didn't personally connect much with the first part, though it was interesting. Did you use to want to want your reaction to idiocy to be "how can I help", even when it wasn't?

Comment by JamesFaville (elephantiskon) on The essay "Interstellar Communication Using Microbes: Implications for SETI" has implications for The Great Filter. · 2017-12-22T08:02:44.412Z · LW · GW

The case against "geospermia" here is vastly overstated: there's been a lot of research over the past decade or two establishing very plausible pathways for terrestrial abiogenesis. If you're interested, read through some work coming out of Jack Szostak's lab (there's a recent review article here). I'm not as familiar with the literature on prebiotic chemistry as I am with the literature on protocell formation, but I know we've found amino acids on meteorites, and it wouldn't be surprising if they and perhaps some other molecules which are important to life were introduced to Earth through meteorites rather than natural syntheses.

But in terms of cell formation, the null hypothesis should probably be that it occurred on Earth. Panspermia isn't ridiculous per se, but conditions on Earth appear to have been much more suitable for cell formation than those of the surrounding neighborhood, and sufficiently suitable that terrestrial abiogenesis isn't implausible in the least. When it comes to ways in which there could be wild-animal suffering on a galactic scale, I think the possibility of humans spreading life through space colonization is far more concerning.

Also, Zubrin writes:

Furthermore, it needs to be understood that the conceit that life originated on Earth is quite extraordinary. There are over 400 billion of stars in our galaxy, with multiple planets orbiting many of them. There are 51 billion hectares on Earth. The probability that life first originated on Earth, rather than another world, is thus comparable to the probability that the first human on our planet was born on any particular 0.1 hectare lot chosen at random, for example my backyard. It really requires evidence, not merely an excuse for lack of evidence, to be supported.

This is poor reasoning. A better metaphor would be that we're looking at a universe with no water except for a small pond somewhere, and wondering where the fish that currently live in that pond evolved. If water is so rare, why shouldn't we be confused that the pond exists in the first place? Anthropic principle (but be careful with this). Disclaimer: Picking this out because I thought it was the most interesting part in the piece, not because I went looking for bad metaphors.

As a meta-note, I was a little suspicious of this piece based on some bad signaling (the bio indicates potential bias, tables are made through screenshots, the article looks like it wants to be in a journal but is hosted on a private blog). I don't like judging things based on potentially spurious signals, but this might have nevertheless biased me a bit and I'm updating slightly in the direction of those signals being valuable.

Comment by JamesFaville (elephantiskon) on Rationalist Politicians · 2017-12-22T01:39:39.808Z · LW · GW

Have a look at 80K's (very brief) career profile for party politics. My rough sense is that effective altruists generally agree that pursuing elected office can be a very high-impact career path for individuals particularly well-suited to it, but think that, even with an exceptional candidate, succeeding is very difficult.

Comment by JamesFaville (elephantiskon) on Improvement Without Superstition · 2017-12-16T22:14:47.553Z · LW · GW

Upvoted mostly for surprising examples about obstetrics and CF treatment and for a cool choice of topic. I think your question, "when is one like the doctors saving CF patients and when is one like the doctors doing super-radical mastectomies?", is an important one to ask, and distinct from questions about modest epistemology.

Say there is a set A of available actions, of which a subset K has been studied intensively enough that their utility is known with a high degree of certainty, but the utility of the other available actions in A is uncertain. Then your ability to surpass the performance of an agent who chooses actions only from K essentially comes down to a combination of whether choosing uncertain-utility actions from A precludes also picking high-utility actions from K, and what the expected payoff is from choosing uncertain-utility actions in A according to your best information.

I think you could theoretically model many domains like this, and work things out just by maximizing your expected utility. But it would be nice to have some better heuristics to use in daily life. I think the most important questions to ask yourself are really (i) how likely are you to horribly screw things up by picking an uncertain-utility action, and (ii) do you care enough about the problem you're looking at to take lots of actions that have a low chance of being harmful, but a small chance of being positive.
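
A minimal sketch of that framing (the actions, utilities, and probabilities below are all made up):

```python
# Maximise expected utility across both well-studied actions (K) and uncertain ones, then
# sanity-check question (i): how likely is the uncertain option to go horribly wrong?

known_actions = {                        # K: utility known with high certainty
    "standard_treatment": 5.0,
}
uncertain_actions = {                    # the rest of A: (probability, utility) pairs
    "radical_treatment": [(0.7, 9.0), (0.3, -20.0)],
}

def expected_utility(dist):
    return sum(p * u for p, u in dist)

candidates = {**known_actions,
              **{a: expected_utility(d) for a, d in uncertain_actions.items()}}
print("expected utilities:", candidates)
print("EU-maximising choice:", max(candidates, key=candidates.get))

p_disaster = sum(p for p, u in uncertain_actions["radical_treatment"] if u < 0)
print("chance the radical option is harmful:", p_disaster)
```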

Comment by JamesFaville (elephantiskon) on Strategic High Skill Immigration · 2017-12-06T06:23:55.044Z · LW · GW

I don't have much of a thoughtful opinion on the question at hand yet (though I have some questions below), but I wanted to express a deep appreciation for your use of detail elements: it really helps readability!

One concern I would want to see addressed is an estimation of negative effects of a "brain drain" on regional economies: if a focused high-skilled immigration policy has the potential to exacerbate global poverty, the argument that it has a positive impact on the far future needs to be very compelling. So would these economic costs be significant, or negligible? And would a more broadly permissive immigration policy have similar advantages? Also, given the scope of the issues at hand I would be very surprised if the advantages you ascribe to high-skilled immigration are all of roughly equal expected value: is there one which you think dominates the others? (Like reduced x-risk from AI?)

Comment by JamesFaville (elephantiskon) on Motivating a Semantics of Logical Counterfactuals · 2017-09-23T18:09:01.577Z · LW · GW

(Disclaimer: There's a good chance you've already thought about this.)

In general, if you want to understand a system (construal of meaning) forming a model of the output of that system (truth-conditions and felicity judgements) is very helpful. So if you're interested in understanding how counterfactual statements are interpreted, I think the formal semantics literature is the right place to start (try digging through the references here, for example).

Comment by JamesFaville (elephantiskon) on Fish oil and the self-critical brain loop · 2017-09-15T13:47:28.891Z · LW · GW

Muting the self-critical brain loop (and thanks for that terminology!) is something I'm very interested in. Have you investigated vegan alternatives to fish oil at all?

Comment by JamesFaville (elephantiskon) on Open thread, Jan. 16 - Jan. 22, 2016 · 2017-01-16T21:17:17.123Z · LW · GW

At what age do you all think people have the greatest moral status? I'm tempted to say that young children (maybe aged 2-10 or so) are more important than adolescents, adults, or infants, but don't have any particularly strong arguments for why that might be the case.