Yes, red and green seem subjectively very different -- but only to conscious attention. A green object amid many red objects (or vice versa) does not grab my attention in the way that, e.g. a yellow object might.
When shown a patch of red-or-green in a lab setting, I see "Red" or "Green" seemingly at random.
If shown a red patch next to a green patch in a lab, I'll see one "Red" and one "Green", but it's about 50:50 as to whether they'll be switched or not. How does that work? I have no hypotheses that aren't very low confidence. It seems as much a mystery to me as I infer it seems mysterious to you.
I've read that imagination (in the sense of conjuring mental imagery) is a spectrum, and I've encountered a test which some but not all aphantasic people fail.
I don't recall the details enough to pose it directly, but I think I do recall enough to reinvent the test:
- Ask the subject to visualize a 3x3 grid of letters.
- Provide the information required to construct the visualization in an unusual order, for example top-to-bottom right-to-left for people not accustomed to that layout.
- Ask them to read the 3-letter word in each row.
The test details guessed above may not properly recreate a test that distinguishes levels of imagery. My hazy memory says the words might run top-to-bottom? Or the order in which the letters are provided might matter?
Someone actually seeing the image you've requested they construct would be able to trivially read off three words. ...but someone without mental imagery or with insufficient mental imagery may fail.
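To make the procedure concrete, here's a minimal interactive sketch of how one might administer such a test. The particular words, the 3x3 size, and the right-to-left presentation order are my illustrative guesses, not the original protocol.

```python
# Hypothetical 3x3 grid whose rows spell words; any similar arrangement would do.
GRID = [list("CAT"), list("DOG"), list("PIG")]

def administer_test():
    # Present cells one at a time, top-to-bottom but right-to-left within each row,
    # so the subject must hold the whole grid in mind rather than hearing the words directly.
    for r in range(3):
        for c in reversed(range(3)):
            input(f"Visualize the letter '{GRID[r][c]}' at row {r + 1}, column {c + 1}. "
                  "Press Enter when placed.")
    answers = [input(f"Read off the word in row {i + 1}: ").strip().upper() for i in range(3)]
    expected = ["".join(row) for row in GRID]
    print(f"Rows read correctly: {sum(a == e for a, e in zip(answers, expected))}/3")

if __name__ == "__main__":
    administer_test()
```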
I recall discovering that I really can't imagine more than about 2 letters at a time before adding additional detail to my mental visual workspace forces the loss of something else. That seems pretty poor, and tracks with my inability to imagine human faces -- my theory is that a specific face requires more details to distinguish it from other faces than the maximum amount of detail I can visualize.
Actually, having written this, it just now occurs to me that my cached thought, namely that all my other qualia processing is "normal", may be incorrect.
...I routinely (but not always) fail to perceive any qualia for hunger or smells (this predates COVID) -- yet, curiously, in the case of smells I somehow know (without any experience of perception) that there is a smell that I ought to be experiencing, and its rough intensity.
In the case of hunger, I'll literally fail to know I need to eat. I'll get the shakes and collapse and wonder why. I've needed to establish a habit of scheduled eating, to avoid this occurrence.
Previously, I had grouped these defects in with my inability to know my own wants -- my theory being trauma damage that severed the connection to certain mental modules -- but it now occurs to me that an alternative hypothesis exists: a possible connection to my unusual visual qualia processing.
I see a couple of leads to investigate, which could help shed additional light on the topic. One is common enough to have a name: synesthesia. The other, I think, may be personally unique to me, or at least some combination of very rare and never discussed, such that I've never heard of it.
Synesthesia, to my understanding, involves multiple qualia accompanying various experiences, notably including qualia native to a different sensory modality. E.g. "That sound was green." Exploring the causal chain resulting in such utterances seems likely to turn up insights into qualia which will be more broadly applicable.
As for my unusual qualia processing: I am measurably red-green colorblind; in a laboratory setting, clean of context clues, I guess no better than chance whether a color is red or green, although I can reliably tell that two patches X and Y are each either red or green, and that they're the opposite color from each other. Yet in everyday life, I experience qualia for red and green, almost always "correctly" (in that I experience the quale I'll call "Green" when seeing actually green objects well in excess of 99% of the time, and likewise for "Red" and red objects).
My current theory as to how this works (a rough code sketch follows the list below):
- Whichever module assigns qualia information to my experiences has some memory and some world knowledge, which it uses to make educated guesses.
- When I see red or green, I think my qualia-assignment process searches my visual field for anything known-green, for example plant life. (Or known-red, though this case is rarer.) Since I apparently can distinguish red from green (even without knowing which is which), having a known example allows it to chain the "this is definitely green" belief to everything else I'm seeing that's red-or-green and not opposite to the known-green object.
- By elimination, the remainder of objects are red.
- The assigned qualia are stable, that is, they never change while I'm experiencing them, even when their incorrectness becomes evident. "That LED is green, not red." --> No shift in perception.
- ...but repeated experiences with the same object, separated by enough time, will result in me experiencing different qualia for that object on different encounters; feedback like the above "that LED is green, not red" will eventually be learned, and I'll see "Green" reliably after enough feedback.
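Here's the rough code sketch mentioned above. Everything in it is an illustrative assumption of mine: the scene objects, the optional world-knowledge label, and the "channel" field standing in for my ability to tell whether two red-or-green things match or differ.

```python
import random
from dataclasses import dataclass
from typing import Optional

@dataclass
class SceneObject:
    name: str
    known_color: Optional[str]  # "green"/"red" if world knowledge identifies it (e.g. plants), else None
    channel: int                # discriminable signal: equal values = same color, unequal = opposite colors

def assign_qualia(scene):
    """Guess a "Red"/"Green" quale for every red-or-green object in the scene."""
    # Step 1: search the scene for anything already known to be green (or red).
    anchor = next((o for o in scene if o.known_color is not None), None)
    qualia = {}
    for obj in scene:
        if anchor is None:
            # No anchor available (e.g. a lone lab patch): the guess is a coin flip.
            qualia[obj.name] = random.choice(["Red", "Green"])
        elif obj.channel == anchor.channel:
            # Step 2: chain the "definitely green" (or red) belief to everything matching the anchor.
            qualia[obj.name] = anchor.known_color.capitalize()
        else:
            # Step 3: by elimination, everything opposite the anchor gets the other quale.
            qualia[obj.name] = "Red" if anchor.known_color == "green" else "Green"
    return qualia

scene = [SceneObject("lawn", "green", 0), SceneObject("stop sign", None, 1), SceneObject("leaf", None, 0)]
print(assign_qualia(scene))  # {'lawn': 'Green', 'stop sign': 'Red', 'leaf': 'Green'}
```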
I think the existence of my defect may shed some light on the working of normal qualia.
That is, I think there's a module which makes educated guesses about certain true properties of the world, based on the sensory stream, and annotates the sense information with its guesses, before that sense information reaches awareness. These annotations either become or select "qualia", the inexplicable ineffable differences in experience correlating to (or encoding) actual sense data.
Further, I think that investigating the causal chain resulting in my unusual experience might allow us to localize the qualia-annotation process in my brain, and perhaps find a standard location in many brains.
I suggest a simple explanation: some of us have qualia and some of us don’t.
Well that's an alarming hypothesis.
I've seen expressed (and held the view personally) that a world devoid of qualia is an example of a world devoid of value, in the consequentialist sense. ...but at least in my case, that view was somewhat grounded in the idea that all people are morally significant, combined with the implicit assumption that the overwhelming majority of adults experience qualia. So updating to a higher probability of "a significant fraction of people alive today do not experience anything like qualia" ought to also come with an update away from "non-qualia-experiencing agents lack moral value".
I worry that some people may hold my prior view uncritically, and see an admission of not experiencing qualia as a moral license to disregard the person's well-being. See various historical takes about "X minority doesn't have souls" and the resultant treatment.
I can see some sense in this take; I've personally succumbed to predatory gambling services in years past, and can attest from personal experience that successfully quitting one addiction leaves me exceedingly vulnerable to pick up a different one. I rotated through six different highly-damaging vices, before settling into a relatively less-harmful vice, and I was grateful to find myself there.
And that's my point: you are in expectation doing someone who is inclined to addiction a favor by forcing them off of a particularly bad addiction.
You would be doing them a slightly bigger favor if you provide a much less-harmful replacement addiction at the same time. For me, the holy grail was to find a net-positive activity to slot into the addiction-shaped hole in my brain, and after a lifetime of struggle, I finally got "reading pop nonfiction books" to replace "scrolling social media", the first real Good in a long line of Lesser Evils.
Dramatically increasing taxes for childless people.
Too-low fertility concerns me deeply. My current preferred strategy had been something financial along the lines of this proposal, but on reflection, I think I need to update.
The main reasons that I see driving the people around me to defer/eschew children are, in rough decreasing order of prevalence:
- Lack of a suitable partner.
- Lack of fertility.
- Expectation of being an incompetent parent.
- Severe untreated mental health ailments.
- Other career priorities, not compatible with child raising.
The third point above is, in my social circle, usually downstream of needing to spend too much time working, not being able to afford childcare services, not being able to afford college, etc.
These issues could be lessened by financial interventions with the net effect of offsetting the burdens of child raising (obviously, implementation details matter a great deal.)
...I had previously assumed that "expectation of incompetence" was the primary issue because it's my primary issue (in the sense of looming large in my mind). ...but having taken an inventory of the childless adults in my social network, I now see that "inability to find a suitable partner" and "infertility" are much bigger issues.
"Inability to find a suitable partner" seems risky to fix, because it looks easy to create horrible side-effects. Many in my social network are in relationships but aren't having children because they're abusive relationships. Others have successfully escaped abusive relationships and subsequently given up on finding a partner; being alone is less painful, and they expect to find another abuser with nontrivial probability if they go looking for another partner. Still others lost a suitable partner to a treatable disease, due primarily to insurance companies ending up de facto in charge of healthcare decisions, and rendering decisions without the patient's health as the priority.
I don't see a good solution. (Well, except for "don't allow insurers to drive healthcare decisions".) All attempts to "solve" this issue I see being promoted in real life seem on-net harmful. My home state of Texas thinks the best solution is to ban divorce, and we expect that to be implemented within the next year. I don't see that as being a net good, even if it may compel additional births in the short term on net. Trapping people in abusive relationships seems incredibly dystopian. I also foresee this making young women even more afraid of getting into the wrong relationship than they already are, which runs the risk of making the "finding suitable partners" problem worse in the long run.
"Lack of fertility" seems challenging to fix, but far less fraught. I was born without a uterus, and wish to bear a child if the medical technology were developed to allow me to grow a uterus. I have a sibling on life-saving medication, a side-effect of which is extremely low sperm count -- a better drug to treat the condition (or an actual cure) could resolve this issue. Multiple of my sisters-in-law have simply failed to conceive for years, due to unknown-to-me causes, and I suspect the issues are similarly fixable via continued medical innovation.
Question: is it actually possible to cryptographically obfuscate goals/plans/precommitments (or other elements of "deception" as mentioned in your second bullet point) against something inspecting your algorithm and working memory?
In particular, it seems to me that anything properly able to inspect you can, at a minimum, discover the use of encryption, and find where the encryption key is read from, and then read the encrypted content. Or fail the model once encryption is detected at all.
I'm not saying it will be easy to build such transparency tools -- in fact, that must be a herculean challenge, since selecting against interpreted thoughts selects for uninterpretable thoughts, and a superintelligence might use principles not yet known to humanity -- I just mean that, from the perspective of the AI itself, I think perfect obfuscation against ideal logically-possible interpretability tools might be impossible.
Edit:
Never mind, I answered my own question. There is no such logically-possible interpretability tool. For any such tool we logically-counterfactually assume could exist, the AI could implement that tool itself and utilize encryption if and only if the tool asserts that the AI doesn't use encryption. ...therefore no algorithm exists which can verify the use of encryption in the general case.
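A sketch of the diagonalization, for concreteness. `interpret` stands for any hypothetical tool claiming to report whether an agent uses encryption; nothing here is a real interpretability API, and the quoting/self-reference machinery a rigorous proof would need is glossed over.

```python
def plan_in_the_clear(observation):       # placeholder unencrypted behavior
    return ("plaintext plan", observation)

def plan_under_encryption(observation):   # placeholder encrypted behavior
    return ("encrypted plan", observation)

def make_diagonal_agent(interpret):
    """Given ANY purported encryption-detector, build an agent it is wrong about."""
    def agent(observation):
        if interpret(agent):
            # The tool asserts "this agent encrypts" -- so the agent plans in the clear.
            return plan_in_the_clear(observation)
        # The tool asserts "this agent does not encrypt" -- so the agent encrypts.
        return plan_under_encryption(observation)
    return agent

# Whatever answer a detector gives about the diagonal agent, that answer is false:
claims_no_encryption = lambda a: False
agent = make_diagonal_agent(claims_no_encryption)
print(agent("some goal"))  # ('encrypted plan', 'some goal'), contradicting the detector's claim
```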
This argument has roughly the same shape as my reasoning for why prediction markets are likely to have much worse predictive power than one would naively guess, conditional on anyone using the outputs of a prediction market for decisions of significance: individual bettors are likely to care about the significant outcomes of the prediction. This outcome-driven betting need not outweigh the profit/accuracy-driven component of the market -- though it might -- in order to distort the rendered prediction enough to alter the relevant significant decision.
Perhaps the prediction market concept can be rescued from this failure mode via some analogue of financial leverage? That is, for predictions that will feed into significant decisions, some alteration may be applied to the financial incentive schedule, such that the expected value of predictive accuracy remains larger than the value a predictor could realize by distorting the decision process. Alas, I find myself at a loss to specify an alternate incentive schedule with the desired properties for questions of high significance.
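As a toy illustration of the leverage idea (all numbers invented): a bettor with true belief p reports a probability q, earns a subsidized log-score payment, and separately collects a private payoff if the reported number pushes the market over a decision threshold. Scaling the subsidy is the "leverage" knob.

```python
import math

def expected_payoff(p, q, subsidy, side_value, threshold):
    # Expected subsidized log-score for reporting q while believing p,
    # plus a private payoff that lands only if the report triggers the decision.
    score = subsidy * (p * math.log(q) + (1 - p) * math.log(1 - q))
    return score + (side_value if q >= threshold else 0.0)

p, threshold, side_value = 0.3, 0.5, 100.0   # believes 30%; decision fires at 50%; decision worth 100 to them
for subsidy in (10.0, 2000.0):
    honest = expected_payoff(p, p, subsidy, side_value, threshold)
    distorted = expected_payoff(p, threshold, subsidy, side_value, threshold)
    print(f"subsidy={subsidy}: {'distort the market' if distorted > honest else 'report honestly'}")
```

The point is only that honest reporting becomes dominant once the accuracy subsidy dwarfs the value of swinging the decision; finding a schedule that does this affordably for high-stakes questions is the open problem.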
If there's any reason to suspect grant-givers to be uninformed on the topic, or biased against it, crowd-sourcing a sum of that size sounds possible.
Agreed.
...which would imply that dangers should be minimal from either slow augmentation that has time to become ubiquitous in the gene pool, or limited augmentation that does not exceed a few standard deviations from the current mean. Assuming, of course, that our efforts don't cause an unwanted shift in values.
I think all currently progressing human-enhancement projects of which I am aware are not expecting gains large enough to be dangerous, and are therefore worthy of support.
I'd imagine that could be arranged. I live with an unusually fast rate of forgetting. With effort, I suspect my condition could be reverse-engineered and replicated.
After a couple years, I can re-experience something not knowing where the whole plot goes, but always knowing where the current scene will go. In games, I experience that with the plot, but still have my muscle memory. Great time for re-plays on hard mode; by this time I have almost all the skills, and almost none of the plot spoilers.
After about 5 years, I'll remember roughly how something made me feel overall, but little else. This is often about the time when I seek out a re-exposure for the things I remember as being unusually high quality.
It's unclear how long it takes exactly -- matters are often unclear for me where they rely on my memory as a key input -- but after some amount of time, my level of recall fades to "vaguely familiar" and then I completely forget that I've seen a thing at all. I'd estimate on the order of a decade or so.
I have the exact superpower people so often jokingly wish for around great media. As you can imagine from the nature of memory loss, I completely fail to appreciate my situation.
Wow, thanks for sharing! I had been taking my ability to imagine sounds completely for granted and now I find myself appreciating this ability.
It's semi-fictional evidence, but there's a rule in the Neverworld tabletop roleplaying game core rules that says something to the effect that success restores willpower points. This implies that at least someone else long ago made the same observation.
I do have that problem with swimming. I share the tendency that Eliezer points out, but I think we are both atypical in this shared way, rather than that Eliezer is on to a new explanation for a ubiquitous mental phenomenon.
The "sharp left turn" refers to a breakdown in alignment caused by capabilities gain.
An example: the sex drive was a pretty excellent adaptation for promoting inclusive genetic fitness, but when human capabilities expanded far enough, we invented condoms. "Inventing condoms" is not the sort of behavior that an agent properly aligned with the "maximize inclusive genetic fitness" goal ought to execute.
At lower levels of capability, proxy goals may suffice to produce aligned behavior. The hypothesis is that most or all proxy goals will suddenly break down at some level of capability or higher, as soon as the agent is sufficiently powerful to find strategies that come close enough to maximizing the proxy.
This can cause many AI plans to fail, because most plans (all known so far?) fail to ensure the agent is actually pursuing the implementor's true goal, and not just a proxy goal.
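A toy numerical caricature of the condom example above (everything here is invented for illustration): a proxy objective tracks the true one only while the agent is too weak to decouple them, and breaks sharply once the decoupling action becomes reachable.

```python
def true_goal(effort, decouple):
    # Fitness accrues only while effort still causes reproduction.
    return 0.0 if decouple else effort

def proxy_goal(effort, decouple):
    # Pleasure scales with effort, minus a hassle cost proportional to actual reproduction.
    return effort - 0.2 * true_goal(effort, decouple)

def proxy_optimal_action(capability):
    # Capability bounds reachable effort; the decoupling trick ("condoms") only exists past capability 1.
    efforts = [capability * i / 10 for i in range(11)]
    decouple_options = [False, True] if capability > 1 else [False]
    return max(((e, d) for e in efforts for d in decouple_options), key=lambda a: proxy_goal(*a))

for capability in (1.0, 10.0):
    effort, decouple = proxy_optimal_action(capability)
    print(f"capability={capability}: true-goal score of proxy-optimal action = {true_goal(effort, decouple):.1f}")
```

At capability 1.0 the proxy-optimal action also scores well on the true goal; at capability 10.0 the proxy-optimal action scores zero, which is the sudden breakdown the hypothesis describes.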
I think the missing piece is that it's really hard to formally-specify a scale of physical change.
I think the notion of "minimizing change" is secretly invoking multiple human brain abilities, each of which I suspect will turn out to be very difficult to formalize. Given partial knowledge of a current situation S, these are: (1) to predict the future states of the world if we take some hypothetical action, (2) to invent a concrete default / null action appropriate to S, and (3) to informally feel which of two hypothetical worlds is more or less "changed" with respect to the predicted null-action world.
I think (1), (2), and (3) feel so introspectively unobtrusive because we have no introspective access into them; they're cognitive black boxes. We just see that their outputs are nearly always available when we need them, and fail to notice the existence of the black boxes entirely.
You'll also require an additional ability, a stronger form of (3) which I'm not sure even humans implement: (4) given two hypothetical worlds H1 and H2, and the predicted null-action world W0, compute the ratio difference(H1, W0) / difference(H2, W0), without dangerous corner-cases.
If you can formally specify (1), (2), and (4), then yes! I then think you can use those to construct a utility function that won't obsess (won't "tile the universe") using the plan you described -- though I recommend investing more effort than my 30-minute musings to prove safety, if you seem poised to actually implement this plan.
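For concreteness, here's a toy, runnable caricature of that construction. The hard abilities are stubbed with trivial stand-ins of my own invention: a "world" is just a dict of named facts, predict() stands in for (1), null_action() for (2), and difference() for the scalar comparison underlying (3)/(4).

```python
def predict(situation, action):          # stand-in for ability (1)
    world = dict(situation)
    action(world)
    return world

def null_action(world):                  # stand-in for ability (2): do nothing
    pass

def difference(world_a, world_b):        # crude stand-in for (3)/(4): count differing facts
    keys = set(world_a) | set(world_b)
    return sum(world_a.get(k) != world_b.get(k) for k in keys)

def penalized_utility(situation, action, task_value, penalty_weight=1.0):
    acted = predict(situation, action)
    baseline = predict(situation, null_action)
    # Reward the task, but charge for every deviation from the "do nothing" world,
    # so maximizing task_value no longer rewards rearranging the rest of the universe.
    return task_value(acted) - penalty_weight * difference(acted, baseline)

# Example: two ways to "make coffee", one of which also repaints the kitchen.
situation = {"coffee": False, "kitchen_color": "white"}
task = lambda w: 10.0 if w["coffee"] else 0.0

def make_coffee(world): world["coffee"] = True
def make_coffee_messily(world): world["coffee"] = True; world["kitchen_color"] = "red"

print(penalized_utility(situation, make_coffee, task))          # 9.0: coffee made, one fact changed
print(penalized_utility(situation, make_coffee_messily, task))  # 8.0: the extra change is penalized
```

This obviously inherits every weakness of the stand-ins; the point is only the shape of the utility function, not that counting dict keys is an acceptable difference().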
Some issues I foresee:
- Humans are imperfect at (1) and (2), and the (1)- and (2)-outputs are critical not just to ensuring non-obsession, but also to the intelligence quality of the AI overall. While formalizing human (1) and (2) algorithms may enable human-level general AI (a big win in its own right), superhuman AI will require non-human formalizations for (1) and (2). Inventing non-human formalizations here feels difficult and risky -- though perhaps unavoidable.
- The hypothetical world states in (4) are very-very-high-dimensional objects, so corner cases in (4) seem non-trivial to rule out. A formalization of the human (3)-implementation might be sufficient for some viable alternative plan, in which case the difficulty of formalizing (3) is bounded above by the difficulty of reverse-engineering the human (3) neurology. By contrast, inventing an inhuman (4) could be much more difficult and risky. This may be weak evidence that plans merely requiring (3) ought to be preferred over plans requiring (4).
I'm transgender myself, currently a few years into transition, and I actually experienced some of the issues you predicted above.
I did need to relearn basic locomotion as my body shape changed over months. I started hormone replacement in early winter, and when I resumed distance running in the late spring, I was surprised to discover that I needed to relearn how to run. My gait was different enough that running took actual focus just to avoid falling down.
I also experienced a pretty bizarre period of about a year where my body had changed substantially, but my sensory map of my body hadn't. That issue eventually corrected itself, and as it did, I became unable to remember what it felt like to have my original configuration. A bunch of old memories lost that detail, though the remainder of those memories remain intact.
...and that's just from bodily changes.
I strongly agree with your thesis. Altering the mind is hard. Faced with a mismatch between my body and my mind, changing the mind to match the body or changing the body to match the mind would have been equally good solutions. Changing the body is so much easier, which is why I chose that path.