Posts

How does anthropic reasoning and illusionism/eliminitivism interact? 2022-10-05T22:31:49.773Z
Shiroe's Shortform 2022-10-01T14:28:48.125Z

Comments

Comment by Shiroe on Settled questions in philosophy · 2024-10-17T20:38:51.075Z · LW · GW

The existence of God and Free Will feel like religious problems that philosophers took interest in, and good riddance to them.

Whether the experience of suffering/pain is fictional or not is a hot topic in some circles, but both sides are quite insistent about being good church-going "materialists" (whatever that means).

As for "knowledge", I agree that question falls apart into a million little subproblems. But it took the work of analytic philosophers to pull it apart, and after much labor. You're currently reaping the rewards of that work and the simplicity of hindsight.

Comment by Shiroe on Settled questions in philosophy · 2024-10-17T20:24:34.352Z · LW · GW

Do you know if Plato was claiming Euclidean geometry was physically true in that sense? Doesn't sound like something he would say.

Comment by Shiroe on the case for CoT unfaithfulness is overstated · 2024-10-02T16:51:21.416Z · LW · GW

I'd like to see how this would compare to a human organization. Suppose individual workers or individual worker-interactions are all highly faithful in a tech company. Naturally, though, the entire tech company will begin exhibiting misalignment, tending towards blind profit seeking, etc., despite the faithfulness of its individual parts.

Is that the kind of situation you're thinking of here? Is that why having mind-reading equipment that forced all the workers to dump their inner monologue wouldn't actually be of much use towards aligning the overall system, because the real problem is something like the aggregate or "emergent" behavior of the system, rather than the faithfulness of the individual parts?

Comment by Shiroe on the case for CoT unfaithfulness is overstated · 2024-10-02T16:03:10.770Z · LW · GW

What do you mean by "over the world"? Are you including human coordination problems in this?

Comment by Shiroe on Increasing IQ is trivial · 2024-09-08T19:29:25.308Z · LW · GW

Did you end up writing the list of interventions? I'd like to try some of them. (I also don't want to commit to doing 3 hours a day for two weeks until I know what the interventions are.)

Comment by Shiroe on Nick Bostrom’s new book, “Deep Utopia”, is out today · 2024-03-30T10:33:51.208Z · LW · GW

It's very surprising to me that he would think there's a real chance of all humans collectively deciding to not build AGI, and successfully enforcing the ban indefinitely.

Comment by Shiroe on Do not delete your misaligned AGI. · 2024-03-25T08:43:48.607Z · LW · GW

Patternism is usually defined as a belief about the metaphysics of consciousness, but that boils down to incoherence, so it's better defined as a property of a utility function of agents not minding being subjected to major discontinuities in functionality, ie, being frozen, deconstructed, reduced to a pattern of information, reconstructed in another time and place, and resumed.

That still sounds like a metaphysical belief, and a less empirical one, since conscious experience isn't involved in it (instead it sounds like it's just about personal identity).

Comment by Shiroe on If you weren't such an idiot... · 2024-03-03T04:32:18.488Z · LW · GW

Any suggestions for password management?

Comment by Shiroe on Increasing IQ is trivial · 2024-03-03T04:23:21.680Z · LW · GW

Because it's an individualized approach that is a WIP and if I just write it down 99% of people will execute it badly.

Why is that a problem? Do you mean this in the sense of "if I do this, it will lead to people making false claims that my experiment doesn't replicate", or in the sense of "if I do this, nothing good will come of it, so it's not even worth the effort of writing it down"?

Comment by Shiroe on Increasing IQ is trivial · 2024-03-03T04:19:41.424Z · LW · GW

I'm confused about whether:

  1. the point of this article is that IQ tests are broken, because some trivial life improvements (like doing yoga and eating blueberries) will raise your IQ score, or
  2. the point of this article is that you actually raised your "g" by doing trivial life improvements, and we should be excited by how easy it is to become more intelligent.

Skimming it again, I'm pretty sure you mean (2).

Comment by Shiroe on How is Chat-GPT4 Not Conscious? · 2024-03-03T04:06:41.930Z · LW · GW

If I understand right, the last sentence should say "does not hold".

Comment by Shiroe on Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do. · 2024-02-24T08:24:38.757Z · LW · GW

It's not easy to see the argument for treating your values as incomparable with the values of other people, but seeing your future self's values as identical to your own. Unless you've adopted some idea of a personal soul.

Comment by Shiroe on Deep atheism and AI risk · 2024-01-06T15:37:18.608Z · LW · GW

The suffering and evil present in the world has no bearing on God's existence. I've always failed to buy into that idea. Sure, it sucks. But it has no bearing on the metaphysical reality of a God. If God does not save children--yikes I guess? What difference does it make? A creator as powerful as has been hypothesised can do whatever he wants; any arguments from rationalism be damned.

Of course, the existence of pointless suffering isn't an argument against the existence of a god. But it is an old argument against the existence of a god who deserves to be worshipped with sincerity. We might even admit that there is a cruel deity, and still say non serviam, which I think is a more definite act of atheism than merely doubting any deity's existence.

Comment by Shiroe on Terminology: <something>-ware for ML? · 2024-01-03T17:37:58.188Z · LW · GW

"tensorware" sprang to mind

Comment by Shiroe on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2023-12-29T04:06:39.131Z · LW · GW

Yeah, it's hard to say whether this would require restructuring the whole reward center in the brain or if the needed functionality is already there, but just needs to be configured with different "settings" to change the origin and truncate everything below zero.

My intuition is that evolution is blind to how our experiences feel in themselves. I think it's only the relative differences between experiences that matter for signaling in our reward center. This makes a lot of sense when thinking about color and "qualia inversion" thought experiments, but it's trickier with valence. My color vision could become inverted tomorrow, and it would hardly affect my daily routine. But not so if my valences were inverted.

Comment by Shiroe on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2023-12-27T22:22:56.700Z · LW · GW

What about our pre-human ancestors? Is the twist that humans can't have negative valences either?

Comment by Shiroe on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2023-12-27T19:33:26.509Z · LW · GW

I agreed up until the "euthanize everything that remains" part. If we actually get to the stage of having aligned ASI, there are probably other options with the same or better value. The "gradients of bliss" that I described in another comment may be one.

Comment by Shiroe on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2023-12-27T14:53:33.612Z · LW · GW

Pearce has the idea of "gradients of bliss", which he uses to try to address the problem you raised about insensitivity to pain being hazardous. He thinks that even if all of the valences are positive, the animal can still be motivated to avoid danger if doing so yields an even greater positive valence than the alternatives. So the prey animals are happy to be eaten, but much happier to run away.

To me, this seems possible in principle. When I feel happy, I'm still motivated at some low level to do things that will make me even happier, even though I was already happy to begin with. But actually implementing "gradients of bliss" in biology seems like a post-ASI feat of engineering.

(By the way, your idea of predation-induced unconsciousness isn't one I had heard before, it's interesting.)

Comment by Shiroe on 5. Moral Value for Sentient Animals? Alas, Not Yet · 2023-12-27T12:40:17.800Z · LW · GW

What are your thoughts on David Pearce's "abolitionist" project? He suggests genetically engineering wild animals to not experience negative valences, but still show the same outward behavior. From a sentientist stand-point, this solves the entire problem, without visibly changing anything.

Comment by Shiroe on Here's the exit. · 2023-12-21T23:04:54.577Z · LW · GW

Same. I feel somewhat jealous of people who can have a visceral in-body emotional reaction to X-risks. For most of my life I've been trying to convince my lizard brain to feel emotions that reflect my beliefs about the future, but it's never cooperated with me.

Comment by Shiroe on How to Control an LLM's Behavior (why my P(DOOM) went down) · 2023-11-29T13:00:53.103Z · LW · GW

You can compress huge prompts into metatokens, too (just run inference with the prompt to generate the training data)

I'm very curious about this technique but couldn't find anything about it. Do you have any references I can read?
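
Since I couldn't find anything written up, here is my own rough guess at what such a procedure might look like, purely as a minimal sketch: run the model with the full prompt to generate target data, then train a single new token's embedding to imitate that behavior. The model name, example data, and hyperparameters below are placeholder assumptions of mine, not anything taken from the post.

```python
# A hypothetical sketch (my own guess, not the post's method) of compressing a
# long prompt into a single trained "metatoken": use the prompted model to
# generate target data, then train only the new token's embedding to reproduce
# that behavior. Model, data, and hyperparameters are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

huge_prompt = "You are a scrupulously honest assistant. <...many more instructions...>"
task_inputs = ["Q: Is the sky green?\nA:", "Q: What is 2 + 2?\nA:"]

# Step 1: run inference WITH the huge prompt to produce target continuations.
targets = []
for x in task_inputs:
    ids = tok(huge_prompt + "\n" + x, return_tensors="pt").to(device)
    out = model.generate(**ids, max_new_tokens=30, do_sample=False,
                         pad_token_id=tok.eos_token_id)
    targets.append(tok.decode(out[0, ids["input_ids"].shape[1]:]))

# Step 2: add one fresh metatoken and train ONLY its embedding row, so that
# "<META> + input" imitates what the model did with the full prompt in front.
tok.add_special_tokens({"additional_special_tokens": ["<META>"]})
model.resize_token_embeddings(len(tok))
for p in model.parameters():
    p.requires_grad = False
emb = model.get_input_embeddings()
emb.weight.requires_grad = True          # gradients only reach the embedding matrix
meta_id = tok.convert_tokens_to_ids("<META>")
opt = torch.optim.Adam([emb.weight], lr=1e-3)

model.train()
for _ in range(50):
    for x, y in zip(task_inputs, targets):
        ids = tok("<META>" + x + y, return_tensors="pt").to(device)
        loss = model(**ids, labels=ids["input_ids"]).loss
        opt.zero_grad()
        loss.backward()
        keep = torch.zeros(emb.weight.shape[0], dtype=torch.bool, device=device)
        keep[meta_id] = True
        emb.weight.grad[~keep] = 0        # freeze every row except the metatoken's
        opt.step()

# Afterward, "<META>" (one token) is meant to stand in for the huge prompt.
```

If that's roughly the idea, I'd still love a pointer to wherever it's written up properly.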

Comment by Shiroe on Justification for Induction · 2023-11-27T20:24:19.159Z · LW · GW

I see. Yes, "philosophy" often refers to particular academic subcultures, with people who do their philosophy for a living as "philosophers" (Plato had a better name for these people). I misread your comment at first and thought it was the "philosopher" who was arguing for the instrumentalist view, since that seems like their more stereotypical way of thinking and deconstructing things (whereas the more grounded physicist would just say "yes, you moron, electrons exist. next question.").

Comment by Shiroe on Justification for Induction · 2023-11-27T13:57:52.692Z · LW · GW

Do you have any examples of the "certain philosophers" that you mentioned? I've often heard of such people described that way, but I can't think of anyone who's insulted scientists for assuming e.g. causality is real.

Comment by Shiroe on Justification for Induction · 2023-11-27T08:59:15.580Z · LW · GW

On the contrary, it is my intention to illustrate that assertions of instances that have not been experienced (with respect to their assertion at t1) can be justified in the future in which they are observed (with respect to their observation at t2).

 

Sorry, I may not be following this right. I had thought the point of the skeptical argument was that you can't justify a prediction about the future until it happens. Induction is about predicting things that haven't happened yet. You don't seem to be denying the skeptical argument here, if we still need to wait for the prediction to resolve before it can be justified.

Comment by Shiroe on What's the evidence that LLMs will scale up efficiently beyond GPT4? i.e. couldn't GPT5, etc., be very inefficient? · 2023-11-24T22:14:27.355Z · LW · GW

I've also noticed that scaffolded LLM agents seem inherently safer. In particular, deceptive alignment would be hard for one such agent to achieve, if at every thought-step it has to reformulate its complete mind state into the English language just in order to think at all.

You might be interested in some work done by the ARC Evals team, who prioritize this type of agent for capability testing.

Comment by Shiroe on Seth Explains Consciousness · 2023-11-21T18:56:29.630Z · LW · GW

I'm sorry that comparing my position to yours led to some confusion: I don't deny the reality of 3rd person facts. They probably are real, or at least it would be more surprising if they weren't than if they were. (If not, then where would all of the apparent complexity of 1st person experience come from? Positing an external world seems like a good step toward answering this.) My comparison was about which one we consider to be essential. If I had used only "pragmatist" and "agnostic" as descriptors, it would have been less confusing.

Again, I think the main difference between our positions is how we define standards of evidence. To me, it would be surprising if someone came to know 3rd person facts without using 1st person facts in the process. If the 1st person facts are false, this casts serious doubt on the 3rd person facts which were implied. At this stage of the conversation, it seems like we could start proposing far more effective theories, such as that nothing exists at all, which would explain just as much of the evidence that is still available to us if we have no 1st person facts.

You seem to believe we can get at the true third person reality directly, maybe imagining we are equivalent to it. You can imagine a robot (i.e. one of us) having its pseudo-experiences and pseudo-observations all strictly happening in the 3rd person, even engaging in scientific pursuits, without needing to introduce an idea like the 1st person. But as you said earlier, just because you can imagine something doesn't mean that it's possible. You need to start with the evidence available to you, not with what sounds reasonable to you. The idea of that robot is occurring in your 1st person perspective as a mental experience, which means it counts as evidence for the 1st person perspective at least as much as it counts as evidence for the 3rd. So does what it feels like to think eliminativism is possible, and so does what it feels like to chew 5 Gum®, etc., etc.

To me, all of this is a boring tautology. For you, it's more like a boring absurdity, or rather it's the truth turned upside down and pulled inside out. This is why I'm more interested in finding a double crux, something that would reveal the precise points where our thinking diverges and reconverges. There are already some parallels that we've both noticed, I think. I would say that you believe in the 1st person but with only one authentic observer: God, who is and who sees everything with perfect indifference, like in Spinoza's idea. You could also reframe my notion of the 1st person to be a kind of splintered or shattered 3rd person reality, one which can never totally connect itself back together all at once. Our ways of explaining away the problems are essentially the same: we both stress that our folk theoretic concepts are untrustworthy, that we are deceiving ourselves, that we apply a theory which shapes our interpretations without us realizing it. We are also both missing quite a few teeth, from biting quite a few bullets.

There must be some precise moment where our thinking diverges. What is that point? It seems like something we need to use a double crux to find. Do you have any ideas?

Comment by Shiroe on Seth Explains Consciousness · 2023-11-04T12:59:45.214Z · LW · GW

If I had to choose between those two phrasings I would prefer the second one, for being the most compatible between both of our notions. My notion of "emerges from" is probably too different from yours.

The main difference seems to be that you're a realist about the third-person perspective, whereas I'm a nominalist about it, to use your earlier terms. Maybe "agnostic" or "pragmatist" would be good descriptors too. The third-person is a useful concept for navigating the first-person world (i.e. the one that we are actually experiencing). But that it seems useful is really all that we can say about it, due to the epistemological limitations we have as human observers.

I think this is why it would be a good double crux if we used the issue of epistemological priority: I would think very differently about Hard Problem related questions if I became convinced that the 3rd person had higher priority than the 1st person perspective. Do you think this works as a double crux? Is it symmetrical for you as well in the right way?

Comment by Shiroe on Seth Explains Consciousness · 2023-10-26T15:18:08.833Z · LW · GW

I meant subjective in the sense of "pertaining to a subject's frame of reference", not subjective in the sense of "arbitrary opinion". I'm sorry if that was unclear.

Comment by Shiroe on Seth Explains Consciousness · 2023-10-26T15:11:43.600Z · LW · GW

But all of these observations are also happening from a third-person perspective, just like the rest of reality.

This is a hypothesis, based on information in your first-person perspective. To make arguments about a third-person reality, you will always have to start with first-person facts (and not the other way around). This is why the first person is epistemologically more fundamental.

It's possible to doubt that there is a third-person perspective (e.g. to doubt that there's anything like being God). But our first person perspective is primary, and cannot be escaped from. Optical illusions and stage tricks aren't very relevant to this, except in showing that even our errors require a first-person perspective to occur.

EDIT: The third-person perspective being epistemologically more/less fundamental than the first-person perspective could work as a double crux with me. Does it work on your end as well?

Comment by Shiroe on Seth Explains Consciousness · 2023-10-25T18:02:05.149Z · LW · GW

You don't believe that all human observations are necessarily made from a first-person viewpoint? Can you give a counter-example? All I can think of are claims that involve the paranormal or supernatural.

Comment by Shiroe on Seth Explains Consciousness · 2023-10-24T13:08:37.967Z · LW · GW

I don't think I fall into either camp because I think the question is ambiguous. It could be talking about the natural structure of space and time ("mathematics") or it could be talking about our notation and calculation methods ("mathematics"). The answer to the question is "it depends what you mean".

The nominalist vs. realist issue doesn't appear very related to my understanding of the Hard Problem, which is more about the definition of what counts as valid evidence. Eliminativism says that subjective observations are problematic. But all observations are subjective (first person), so defining what counts as valid evidence is still unresolved.

Comment by Shiroe on Seth Explains Consciousness · 2023-10-20T23:38:16.719Z · LW · GW

I appreciate hearing your view; I don't have any comments to make. I'm mostly interested in finding a double crux.

This isn't really a double crux, but it could help me think of one:

If someone becomes convinced that there isn't any afterlife, would this rationally affect their behavior? Can you think of a case where someone believed in Heaven and Hell, had acted rationally in accordance with that belief, then stopped believing in Heaven and Hell, but still acted just the same way as they did before? We're assuming their utility function hasn't changed, just their ontology.

Comment by Shiroe on Seth Explains Consciousness · 2023-10-13T14:21:59.768Z · LW · GW

Here are some cruxes, stated from what I take to be your perspective:

  1. That there's nothing at stake whether or not we have first person experiences of the kind that eliminativists deny; it makes no practical difference to our lives whether we're so-called "automatons" or "zombies", such terms being only theoretical distinctions. Specifically, it should make no difference to a rational ethical utilitarian whether or not eliminativism happens to be true. Resources should be allocated the same way in either case, because there's nothing at stake.
  2. Eliminativism is a more parsimonious theory than non-eliminativism, and is strictly better than it for scientific purposes; eliminativism already explains all of the facts about our world, and adding so-called "first person experiences" is just a cog which won't connect to anything else; removing it wouldn't require arbitrary double standards for the validity of evidence.
  3. There's no way of separating experience from functionality in a system. If an organism manifests consistent and enduring behaviors of self-preservation, goal-seeking, etc. then it must have experiences, regardless of how the organism itself happens to be constructed.

I'm looking for double cruxes now. The first two don't seem very useful to me as double cruxes, but maybe the last one is. Any ideas?

Comment by Shiroe on Seth Explains Consciousness · 2023-09-26T05:23:57.304Z · LW · GW

because such sensations would be equivalent to predictions that I would be burning alive, which would be false and therefore interfere with my functioning

I don't see a necessary equivalence here. You could be fully aware that the sensations were inaccurate, or hallucinated. But it would still hurt just as much.

if you could have a body which doesn’t experience, then it’s not going to function as normal.

A human body, or any kind of body? It seems like a robot could engage in the same self-preservation behavior as a human without needing to have anything like burning sensations. I can imagine a sort of AI prosthesis for people born with congenital insensitivity to pain that would make their hand jerk away from a burning hot surface, despite them not ever experiencing pain or even knowing what it is.

Comment by Shiroe on Seth Explains Consciousness · 2023-09-09T17:26:50.395Z · LW · GW

You seem to be claiming that you have experiences, but that their role is purely functional. If you were to experience all tactile sensations as degrees of being burnt alive, but you could still make predictions just as well as before, it wouldn't make any difference to you?

Comment by Shiroe on Seth Explains Consciousness · 2023-09-09T17:15:26.755Z · LW · GW

It's plausible that reverse-engineering the human mind requires tools that are much more powerful than the human mind.

Comment by Shiroe on Seth Explains Consciousness · 2023-08-24T20:20:25.195Z · LW · GW

So you don't believe there is such a thing as first-person phenomenal experiences, sort of like Brian Tomasik? Could you give an example or counterexample of what would or wouldn't qualify as such an experience?

Comment by Shiroe on Seth Explains Consciousness · 2023-08-24T19:58:55.632Z · LW · GW

Doesn't "direct" have the implication of "certain" here?

Comment by Shiroe on Seth Explains Consciousness · 2023-08-24T19:51:20.889Z · LW · GW

Response in favor of the assumption that Signer said was detrimental.

Comment by Shiroe on Seth Explains Consciousness · 2023-08-24T15:33:52.479Z · LW · GW

but my current theory is that one such detrimental assumption is "I have direct knowledge of content of my experiences"

It's true this is the weakest link, since instances of the template "I have direct knowledge of X" sound presumptuous and have an extremely bad track record.

The only serious response in favor of the presumptuous assumption [edit] that I can think of is epiphenomenalism in the sense of "I simply am my experiences", with self-identity (i.e. X = X) filling the role of "having direct knowledge of X". As for how we're able to have conversations about "epiphenomenalism" without it playing any local causal role in us having these conversations, I'm optimistic that observation selection effects could end up explaining that.

Comment by Shiroe on Seth Explains Consciousness · 2023-08-24T15:05:28.491Z · LW · GW

The burden of proof is on those who assert that the Hard Problem is real. You can say what consciousness is not, but can you say what it is?

In the sense that you mean it, this is a general argument against the existence of everything, because ultimately words have to be defined either in terms of other words or in terms of things that aren't words. Your ontology has the same problem, to the same degree or worse. But we only need to give particular examples of conscious experience, like suffering. There's no need to prove that there is some essence of consciousness. Theories that deny the existence of these particular examples are (at best) at odds with empiricism.

Therefore I choose to accept the benefits of the sensation of experience and accept the Easy Problem of consciousness as the overwhelmingly likely Only Problem of consciousness.

It's deeply unclear to me what you mean by this. If you're denying that you have phenomenal experiences like suffering (i.e. negative valences), your rational decision making should be strongly affected by this belief, in the same way that someone who has stopped believing in Hell and Heaven should change their behavior to account for that radical change in their ontology.

Comment by Shiroe on Seth Explains Consciousness · 2023-08-23T21:14:20.365Z · LW · GW

Are you saying that you don't think there's any fact of the matter whether or not you have phenomenal experiences like suffering? Or do you mean that phenomenal experience is unreal in the same way that the hellscape described by Dante is unreal?

Comment by Shiroe on Seth Explains Consciousness · 2023-08-23T21:05:46.575Z · LW · GW

I don't like "illusionism" either, since it makes it seem like illusionists are merely claiming that consciousness is an illusion, i.e., it is something different than what it seems to be. That claim isn't very shocking or novel, but illusionists aren't claiming that. They're actually claiming that you aren't having any internal experience in the first place. There isn't any illusion.

"Fictionalism" would be a better term than "illusionism": when people say they are having a bad experience, or an experience of saltiness, they are just describing a fictional character.

Comment by Shiroe on love, not competition · 2022-10-30T21:28:33.472Z · LW · GW

Exactly. I wish the economic alignment issue was brought up more often.

Comment by Shiroe on How does anthropic reasoning and illusionism/eliminitivism interact? · 2022-10-15T15:25:11.702Z · LW · GW

You're right. I'm updating towards illusionism being orthogonal to anthropics in terms of betting behavior, though the upshot is still obscure to me.

Comment by Shiroe on Against the normative realist's wager · 2022-10-15T00:44:31.340Z · LW · GW

I agree realism is underrated. Or at least the term is underrated. It's the best way to frame ideas about sentientism (in the sense of hedonic utilitarianism). On the other hand, you seem to be talking more about the rhetorical benefits of normative realism about laws.

Most people seem to think phenomenal valence is subjective, but that conflates two senses of the word "subjective": it can mean either arbitrary or bound to a first-person subject. All observations (including valenced states like suffering) are subjective in the second sense, but not in the first. We have good evidence for believing that our qualities of experience are correlated across a great many sentient beings, rather than being some kind of private uncorrelated noise.

"Moral realism" is a good way to describe this situation that we're in as observers of such correlated valences, even if God-decreed rules of conduct isn't what we mean by that term.

Comment by Shiroe on Loss of Alignment is not the High-Order Bit for AI Risk · 2022-10-15T00:01:59.400Z · LW · GW

it is easy to cooperate on the shared goal of not dying

Were you here for Petrov Day? /snark

But I'm confused what you mean about a Pivotal Act being unnecessary. Although both you and a megacorp want to survive, you each have very different priors about what is risky. Even if the megacorp believes your alignment program will work as advertised, that only compels them to cooperate with you if they (1) are genuinely concerned about risk in the first place, (2) believe alignment is so hard that they will need your solution, and (3) actually possess the institutional coordination abilities needed.

And this is just for one org.

Comment by Shiroe on Loss of Alignment is not the High-Order Bit for AI Risk · 2022-10-14T21:45:59.597Z · LW · GW

World B has a 1, maybe minus epsilon chance of solving alignment, since the solution is already there.

That is totum pro parte. It's not World B which has a solution at hand. It's you who have a solution at hand, and a world that you have to convince to come to a screeching halt. Meanwhile people are raising millions of dollars to build AGI and don't believe it's a risk in the first place. The solution you have in hand has no significance for them. In fact, you are a threat to them, since there's very little chance that your utopian vision will match up with theirs.

You say World B has chance 1 minus epsilon. I would say epsilon is a better ballpark, unless the whole world is already at your mercy for some reason.

Comment by Shiroe on Loss of Alignment is not the High-Order Bit for AI Risk · 2022-10-14T20:45:38.410Z · LW · GW

Okay, let's operationalize this.

Button A: The state of alignment technology is unchanged, but all the world's governments develop a strong commitment to coordinate on AGI. Solving the alignment problem becomes the number one focus of human civilization, and everyone just groks how important it is and sets aside their differences to work together.

Button B: The minds and norms of humans are unchanged, but you are given a program by an alien that, if combined with an AGI, will align that AGI in some kind of way that you would ultimately find satisfying.

World B may sound like LW's dream come true, but the question looms: "Now what?" Wait for Magma Corp to build their superintelligent profit maximizer, and then kindly ask them to let you walk in and take control over it?

I would rather live in World A. If I were a billionaire or a dictator, I would consider B more seriously. Perhaps the question lurking in the background is this: do you want an unrealistic Long Reflection or a tiny chance to commit a Pivotal Act? I don't believe there's a third option, but I hope I'm wrong.

Comment by Shiroe on Loss of Alignment is not the High-Order Bit for AI Risk · 2022-10-14T19:35:28.602Z · LW · GW

I agree that the political problem of globally coordinating non-abuse is more ominous than solving technical alignment. If I had the option to solve one magically, I would definitely choose the political problem.

What it looks like right now is that we're scrambling to build alignment tech that corporations will simply ignore, because it will conflict with optimizing for (short-term) profits. In a word: Moloch.