Posts

Virtually Rational - VRChat Meetup 2024-01-28T05:52:36.934Z
Global LessWrong/AC10 Meetup on VRChat 2024-01-24T05:44:26.587Z
Found Paper: "FDT in an evolutionary environment" 2023-11-27T05:27:50.709Z
"Benevolent [ie, Ruler] AI is a bad idea" and a suggested alternative 2023-11-19T20:22:34.415Z
the gears to ascenscion's Shortform 2023-08-14T15:35:08.389Z
A bunch of videos in comments 2023-06-12T22:31:38.285Z
gamers beware: modded Minecraft has new malware 2023-06-07T13:49:10.540Z
"Membranes" is better terminology than "boundaries" alone 2023-05-28T22:16:21.404Z
"A Note on the Compatibility of Different Robust Program Equilibria of the Prisoner's Dilemma" 2023-04-27T07:34:20.722Z
Did the fonts change? 2023-04-21T00:40:21.369Z
"warning about ai doom" is also "announcing capabilities progress to noobs" 2023-04-08T23:42:43.602Z
"a dialogue with myself concerning eliezer yudkowsky" (not author) 2023-04-02T20:12:32.584Z
A bunch of videos for intuition building (2x speed, skip ones that bore you) 2023-03-12T00:51:39.406Z
To MIRI-style folk, you can't simulate the universe from the beginning 2023-03-01T21:38:26.506Z
How to Read Papers Efficiently: Fast-then-Slow Three pass method 2023-02-25T02:56:30.814Z
Hunch seeds: Info bio 2023-02-17T21:25:58.422Z
If I encounter a capabilities paper that kinda spooks me, what should I do with it? 2023-02-03T21:37:36.689Z
Hinton: "mortal" efficient analog hardware may be learned-in-place, uncopyable 2023-02-01T22:19:03.227Z
Call for submissions: “(In)human Values and Artificial Agency”, ALIFE 2023 2023-01-30T17:37:48.882Z
Stop Talking to Each Other and Start Buying Things: Three Decades of Survival in the Desert of Social Media 2023-01-08T04:45:11.413Z
Metaphor.systems 2022-12-21T21:31:17.373Z
[link, 2019] AI paradigm: interactive learning from unlabeled instructions 2022-12-20T06:45:30.035Z
Relevant to natural abstractions: Euclidean Symmetry Equivariant Machine Learning -- Overview, Applications, and Open Questions 2022-12-08T18:01:40.246Z
[paper link] Interpreting systems as solving POMDPs: a step towards a formal understanding of agency 2022-11-05T01:06:39.743Z
We haven't quit evolution [short] 2022-06-06T19:07:14.025Z
What can currently be done about the "flooding the zone" issue? 2020-05-20T01:02:33.333Z
"The Bitter Lesson", an article about compute vs human knowledge in AI 2019-06-21T17:24:50.825Z
thought: the problem with less wrong's epistemic health is that stuff isn't short form 2018-09-05T08:09:01.147Z
Hypothesis about how social stuff works and arises 2018-09-04T22:47:38.805Z
Events section 2017-10-11T16:24:41.356Z
Avoiding Selection Bias 2017-10-04T19:10:17.935Z
Discussion: Linkposts vs Content Mirroring 2017-10-01T17:18:56.916Z
Test post 2017-09-25T05:43:46.089Z
The Social Substrate 2017-02-09T07:22:37.209Z

Comments

Comment by the gears to ascension (lahwran) on the gears to ascenscion's Shortform · 2024-12-12T20:46:02.328Z · LW · GW

I don't think the answer is as simple as changing terminology or carefully modelling their current viewpoints and bridging the inferential divides.

Indeed, and I think that-this-is-the-case is the message I want communicators to grasp: I have very little reach, but I have significant experience talking to people like this, and I want to transfer some of the knowledge from that experience to people who can use it better.

The thing I've found most useful is to be able to express that significant parts of their viewpoint are reasonable. Eg, one thing I've tried is "AI isn't just stealing our work, it's also stealing our competence". Hasn't stuck, though. I find it helpful to point out that yes, climate change sure is a (somewhat understated) accurate description of what doom looks like.

I do think "allergies" are a good way to think about it, though. They're not unable to consider what might happen if AI keeps going as it is, they're part of a culture that is trying to apply antibodies to AI. And those antibodies include active inference wishcasting like "AI is useless". They know it's not completely useless, but the antibody requires them to not acknowledge that in order for its effect to bind; and their criticisms aren't wrong, just incomplete - the problems they raise with AI are typically real problems, but not high impact ones so much as ones they think will reduce the marketability of AI.

Comment by the gears to ascension (lahwran) on [Fiction] A Disneyland Without Children · 2024-12-12T20:44:14.068Z · LW · GW

This is the story I use to express what a world where we fail looks like to left-leaning people who are allergic to the idea that AI could be powerful. It doesn't get the point across great, due to a number of things this story uses that continue to be fnords for left-leaning folks, but it works better than most other options. It also doesn't seem too far off what I expect to be the default failure case; though the factories being made of low-intelligence robotic operators seems unrealistic to me.

I opened it now to make this exact point.

Comment by the gears to ascension (lahwran) on sarahconstantin's Shortform · 2024-12-11T16:14:14.389Z · LW · GW

This is talking about dem voters or generally progressive citizens, not dem politicians, correct?

Comment by the gears to ascension (lahwran) on the gears to ascenscion's Shortform · 2024-12-11T07:13:18.727Z · LW · GW

people who dislike AI, and therefore could be taking risks from AI seriously, are instead having reactions like this. https://blue.mackuba.eu/skythread/?author=brooklynmarie.bsky.social&post=3lcywmwr7b22i why? if we soberly evaluate what this person has said about AI, and just, like, think about why they would say such a thing - well, what do they seem to mean? they typically say "AI is destroying the world", someone said that in the comments; but then roll their eyes at the idea that AI is powerful. They say the issue is water consumption - why would someone repeat that idea? Under what framework is that a sensible combination of things to say? what consensus are they trying to build? what about the article are they responding to?

I think there are straightforward answers to these questions that are reasonable and good on behalf of the people who say these things, but are not as effective by their own standards as they could be, and which miss upcoming concerns. I could say more about what I think, but I'd rather post this as leading questions, because I think the reading of the person's posts you'd need to do to go from the questions I just asked to my opinions will build more of the model I want to convey than saying it directly.

But I think the fact that articles like this get reactions like this is an indication that orgs like Anthropic or PauseAI are not engaging seriously with detractors, and trying seriously to do so seems to me like a good idea. It's not my top priority ask for Anthropic, but it's not very far down the virtual list.

But it's just one of many reactions of this category I've seen that seem to me to indicate that people engaging with a rationalist-type negative attitude towards their observations of AI are not communicating successfully with people who have an ordinary-person-type negative attitude towards what they've seen of AI. I suspect that at least a large part of the issue is that rationalists have built up antibodies to a certain kind of attitude and auto-ignore it, despite what I perceive to be its popularity, and as a result don't build intuitive models about how to communicate with such a person.

Comment by the gears to ascension (lahwran) on Habryka's Shortform Feed · 2024-12-08T18:40:06.790Z · LW · GW

I suspect fixing this would need to involve creating something new which doesn't have the structural problems in EA which produced this, and would involve talking to people who are non-sensationalist EA detractors but who are involved with similarly motivated projects. I'd start here and skip past the ones that are arguing "EA good" to find the ones that are "EA bad, because [list of reasons ea principles are good, and implication that EA is bad because it fails at its stated principles]"

I suspect, even without seeking that out, the spirit of EA that made it ever partly good has already metastasized, and will further metastasize, into genpop.

Comment by the gears to ascension (lahwran) on leogao's Shortform · 2024-12-08T18:30:13.548Z · LW · GW

I was someone who had shorter timelines. At this point, most of the concrete part of what I expected has happened, but the "actually AGI" thing hasn't. I'm not sure how long the tail will turn out to be. I only say this to get it on record.

Comment by the gears to ascension (lahwran) on Optimizing for Agency? · 2024-12-08T12:20:36.451Z · LW · GW

https://www.drmichaellevin.org/research/

https://www.drmichaellevin.org/publications/

it's not directly on alignment, but it's relevant to understanding agent membranes. understanding his work seems useful as a strong exemplar of what one needs to describe with a formal theory of agents and such. particularly interesting is https://pubmed.ncbi.nlm.nih.gov/31920779/

It's not the result we're looking for, but it's inspiring in useful ways.

Comment by the gears to ascension (lahwran) on Optimizing for Agency? · 2024-12-06T21:04:12.492Z · LW · GW

Yes to both. I don't think Cannell is correct about an implementation of what he said being a good idea, even if it was a certified implementation, and I also don't think his idea is close to ready to implement. Agent membranes do still seem interesting; right now, as far as I know, the most interesting work is coming from the Levin lab (Tufts University, Michael Levin), but I'm not happy with any of it for nailing down what we mean by aligning an arbitrarily powerful mind to care about the actual beings in its environment in a strongly durable way.

Comment by the gears to ascension (lahwran) on Why are there no interesting (1D, 2-state) quantum cellular automata? · 2024-11-29T09:11:36.537Z · LW · GW

What is a concise intro that will teach me everything I need to know for understanding every expression here? I'm also asking Claude, interested in input from people with useful physics textbook taste

Comment by the gears to ascension (lahwran) on the gears to ascenscion's Shortform · 2024-11-28T08:49:34.176Z · LW · GW

qaci seems to require the system having an understanding-creating property that makes it a reliable historian. have been thinking about this, have more to say, currently rather raw and unfinished.

Comment by the gears to ascension (lahwran) on Which things were you surprised to learn are not metaphors? · 2024-11-26T22:32:22.048Z · LW · GW

hmm actually, I think I was the one who was wrong on that one. https://en.wikipedia.org/wiki/Synaptic_weight seems to indicate the process I remembered existing doesn't primarily work how I thought it did.

Comment by the gears to ascension (lahwran) on Which things were you surprised to learn are not metaphors? · 2024-11-21T22:05:53.328Z · LW · GW

these might all be relatively obvious but here are some I've found nice to notice

Comment by the gears to ascension (lahwran) on Being nicer than Clippy · 2024-09-27T08:39:40.772Z · LW · GW

some fragments:

What hunches do you currently have surrounding orthogonality, its truth or not, or things near it?

re: hard to know - it seems to me that we can't get a certifiably-going-to-be-good result from a CEV based ai solution unless we can make it certifiable that altruism is present. I think figuring out how to write down some form of what altruism is, especially altruism in contrast to being-a-pushover, is necessary to avoid issues - because even if any person considers themselves for CEV, how would they know they can trust their own behavior?

as far as I can tell humans should by default see themselves as having the same kind of alignment problem as AIs do, where amplification can potentially change what's happening in a way that corrupts thoughts which previously implemented values. can we find a CEV-grade alignment solution that solves the self-and-other alignment problems in humans as well, such that this CEV can be run on any arbitrary chunk of matter and discover its "true wants, needs, and hopes for the future"?

Comment by the gears to ascension (lahwran) on Release: Optimal Weave (P1): A Prototype Cohabitive Game · 2024-09-22T16:03:40.582Z · LW · GW

It might be good to have a suggestion that people can't talk if it's not their turn

It might be good to explain the reason for the turn timer. Turns out we should have used the recommended one, imo.

At the end of the game folks started talking about "I got a little extinction, not so much hell", "I got pretty much utopia", "I got omelas". Describing how to normalize points would maybe be good.

also, four people seemed like too many for the first round. maybe fewer? maybe just need more rounds to understand it?

Comment by the gears to ascension (lahwran) on Laziness death spirals · 2024-09-19T22:58:17.323Z · LW · GW

Much of this seems quite plausible and matches my experience, but it seems worded slightly overconfidently. Like, maybe 5 to 10% at most. Lots of points where I went, "well, in your experience, you mean. And also, like, in mine. But I don't think either of us know enough to assert this is always how it goes." I've had issues in the past from taking posts like this a bit too seriously when someone sounded highly confident, so I'm just flagging - this seems like a good insight but I'd guess the model as described is less general than it sounds, and I couldn't tell you exactly how much less general off the top of my head.

But also, it's a pattern I've noticed as well, and have found to be incredibly useful to be aware of. So, weak upvoted.

Comment by the gears to ascension (lahwran) on Lucius Bushnaq's Shortform · 2024-09-18T10:36:01.796Z · LW · GW

I do think natural latents could have a significant role to play somehow in QACI-like setups, but it doesn't seem like they let you avoid formalizing, at least in the way you're talking about. It seems more interesting in terms of avoiding specifying a universal prior over possible worlds, if we can instead specify a somewhat less universal prior that bakes in assumptions about our worlds' known causal structure. it might help with getting a robust pointer to the start of the time snippet. I don't see how it helps avoiding specifying "looping", or "time snippet", etc. natural latents seem to me to be primarily about the causal structure of our universe, and it's unclear what they even mean otherwise. it seems like our ability to talk about this concept is made up of a bunch of natural latents, and some of them are kind of messy and underspecified by the phrase, mainly relating to what the heck is a physics.

Comment by the gears to ascension (lahwran) on Benito's Shortform Feed · 2024-09-11T08:32:06.251Z · LW · GW

I like #2. On a similar thread: would be nice to have a separate section for pinned comments. I looked into pull requesting it at one point but looks like it either isn't as trivial as I hoped, or I simply got lost in the code. I feel like folks having more affordance to say, "Contrary to its upvotes or recency, I think this is one of the most representative comments from me, and others seeing my page should see it" would be helpful - pinning does this already but it has ui drawbacks because it simply pushes recent comments out of the way and the pinned marker is quite small (hence why I edit my pinned comments to say they're pinned).

Comment by the gears to ascension (lahwran) on Epistemic states as a potential benign prior · 2024-08-31T20:58:34.062Z · LW · GW

then you're not guaranteed that your AI gets anywhere at all; its knightian uncertainty might remain so immense that the AI keeps picking the null action all the time because some of its knightian hypotheses still say that anything else is a bad idea.

@Tamsin, the Knightian uncertainty in IB is related to limits on what hypotheses you can possibly find/write down, not - if I understand so far - to an adversary. The adversary stuff is afaict mostly there to make proofs work.
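
As a toy illustration of the null-action worry in the quote (a bare maximin-over-hypotheses sketch, not infra-Bayesianism proper; all the numbers are made up):

```python
# Worst-case decision making over a set of un-ruled-out hypotheses.
actions = ["null", "build_thing", "send_message"]

# Each hypothesis assigns a utility to every action; there is no probability
# distribution over hypotheses, only the set itself.
hypotheses = {
    "benign_world":   {"null": 0.0, "build_thing": 5.0,    "send_message": 3.0},
    "trap_world":     {"null": 0.0, "build_thing": -100.0, "send_message": 2.0},
    "paranoid_world": {"null": 0.0, "build_thing": -100.0, "send_message": -100.0},
}

def maximin_choice(actions, hypotheses):
    """Pick the action whose worst case across hypotheses is best."""
    return max(actions, key=lambda a: min(h[a] for h in hypotheses.values()))

print(maximin_choice(actions, hypotheses))
# -> "null": every other action is catastrophic under some hypothesis that
# hasn't been ruled out, so the agent freezes.
```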

@all, anyway the big issue I (still) have with this is that, if the user is trying to give these statements, how bad is it if they screw up in some nonobvious fundamental way? Does this prior instantly collapse if the user is a kinda bad predictor on some important subset of logic, or makes only statements that aren't particularly connected to some part of the statements needed to describe reality?

I'd be particularly interested to see Garrabrant, Kosoy, Diffractor, Gurkenglas comment on where they think this works or doesn't

Comment by the gears to ascension (lahwran) on Ruby's Quick Takes · 2024-08-30T02:57:42.692Z · LW · GW

Interested! I would pay at cost if that was available. I'll be asking about which posts are relevant to a question, misc philosophy questions and asking for Claude to challenge me, etc. Primarily interested if I can ask for brevity using a custom prompt, in the system prompt.
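
As a minimal sketch of the kind of brevity-via-system-prompt I mean, using the Anthropic Python SDK (the model id is a placeholder and the prompt wording is just illustrative):

```python
# Sketch: enforce brevity through the system prompt rather than per-message asks.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder model id
    max_tokens=300,
    system=(
        "Answer in at most three sentences. No preamble, no restating the "
        "question, no closing pleasantries."
    ),
    messages=[
        {"role": "user", "content": "Which posts here are relevant to natural latents?"}
    ],
)
print(response.content[0].text)
```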

Comment by the gears to ascension (lahwran) on the gears to ascenscion's Shortform · 2024-08-27T18:08:27.021Z · LW · GW

Wei Dai and Tsvi BT posts have convinced me I need to understand how one does philosophy significantly better. Anyone who thinks they know how to learn philosophy, I'm interested to hear your takes on how to do that. I get the sense that perhaps reading philosophy books is not the best way to learn to do philosophy.

I may edit this comment with links as I find them. Can't reply much right now though.

Comment by the gears to ascension (lahwran) on Release: Optimal Weave (P1): A Prototype Cohabitive Game · 2024-08-20T21:35:33.297Z · LW · GW

This looks really interesting, I'd be eager to try an online version if it's not too much trouble to make?

Comment by the gears to ascension (lahwran) on Eugenics And Reproduction Licenses FAQs: For the Common Good · 2024-08-14T05:29:26.306Z · LW · GW

[edit 7d later: I was too angry here. I think there's some version of this that can be defended, but it's not the version I wrote while angry. edit 2mo later: It's pretty close, but my policy suggestions need refinement and I need to justify why I think the connection to past eugenics still exists.]

 

they're welcome to argue in favor of genetic enhancement, as long as it happens after birth. yes, I know it's orders of magnitude harder. But knowing anything about an abortable child should be illegal. I'm a big fan of abortion, as long as it is blind to who is being aborted, because as soon as it depends on the child's characteristics, it's reaching into the distribution and killing some probabilistic people - without that, it's not killing the probabilistic person. Another way to put this is, life begins at information leakage. anything else permits information leakage that imposes the agency of the parents. That's not acceptable. Argue for augmenting a living organism all you want! we'll need at least mild superintelligence to pull it off, unfortunately, but it's absolutely permitted by physics. But your attempt to suppress the societal immune response to this thing in particular is unreasonable. Since you've already driven away the people who would say this in so many words by not banning it sooner, it is my responsibility to point this out. Almost everyone I know who I didn't meet from lesswrong hates this website, and they specifically cite acceptance of eugenics as why. You're already silencing a crowd. I will not shut up.

Comment by the gears to ascension (lahwran) on Eugenics And Reproduction Licenses FAQs: For the Common Good · 2024-08-14T02:02:12.267Z · LW · GW

[edit 7d later: I was too angry here. I think there's some version of this that can be defended, but it's not the version I wrote while angry. edit 2mo later: It's pretty close, but my policy suggestions need refinement and I need to justify why I think the connection to past eugenics still exists.]

If abortion should be "blind to the child's attributes", then you should put your money where your mouth is and be willing to take care of disabled children who will never have a future or be able to take care of themselves. If you won't do that, then you should concede. Your dogma will not create a sustainable, long-lasting civilization.

Solve these things in already-developed organisms. It's orders of magnitude harder, yes, but it's necessary for it to be morally acceptable. Your brother should get to go into cryonics and be revived once we can heal him. Failing that, it's just the risk you take reproducing. Or at least, that will remain my perspective, since I will keep repeating, from my dad's perspective, I would have been a defect worth eliminating. There is nothing you can say to convince me, and I will take every option available to me to prevent your success at this moral atrocity. Sorry about the suffering of ancient earth, but let's fix it in a way that produces outcomes worth creating rather than hellscapes of conformity.

Comment by the gears to ascension (lahwran) on Eugenics And Reproduction Licenses FAQs: For the Common Good · 2024-08-14T01:56:51.655Z · LW · GW

[edit 7d later: I was too angry here. I think there's some version of this that can be defended, but it's not the version I wrote while angry. edit 2mo later: It's pretty close, but my policy suggestions need refinement and I need to justify why I think the connection to past eugenics still exists.]

it kind of is, actually. Perhaps you could try not wearing your "pretense of neutrality" belief as attire and actually consider whether what you just said is acceptable. I'd rather say, "if something can reasonably be described as eugenics, that's an automatic failure of acceptability, and you have to argue why your thing is not eugenics in order for it to be welcome anywhere." bodily autonomy means against your parents, too.

edit: i was ratelimited for this bullshit. fuck this website

Comment by the gears to ascension (lahwran) on jacquesthibs's Shortform · 2024-08-13T17:16:55.603Z · LW · GW

I have a user script that lets me copy the post into the Claude ui. No need to pay another service.

Comment by the gears to ascension (lahwran) on Eugenics And Reproduction Licenses FAQs: For the Common Good · 2024-08-13T17:14:25.966Z · LW · GW

[edit 7d later: I was too angry here. I think there's some version of this that can be defended, but it's not the version I wrote while angry. edit 2mo later: It's pretty close, but my policy suggestions need refinement and I need to justify why I think the connection to past eugenics still exists.]

I sure didn't. I'm surprised you expected to be downvoted. This shit typically gets upvoted here. My comment being angry and nonspecific will likely be in the deep negatives.

That said, I gave an argument sketch. People use this to eliminate people like me and my friends. Got a counter, or is my elevated emotion not worth taking seriously?

Comment by the gears to ascension (lahwran) on Eugenics And Reproduction Licenses FAQs: For the Common Good · 2024-08-13T16:58:00.231Z · LW · GW

[edit 7d later: I was too angry here. I think there's some version of this that can be defended, but it's not the version I wrote while angry. edit 2mo later: It's pretty close, but my policy suggestions need refinement and I need to justify why I think the connection to past eugenics still exists.]

Get this Nazi shit off this fucking website already for fuck's sake. Yeah yeah discourse norms, I'm supposed to be nice and be specific. But actually fuck off. I'm tired of all of y'all that upvote posts like these to 100 karma. If you can't separate your shit from eugenics then your shit is bad. Someone else can make the reasoned argument, I've had enough.

If we can't get tech advanced enough to become shapeshifters, modify already grown bodies, we don't get to mess with genes. I will never support tools that let people select children by any characteristic, they would have been used against me and so many of my friends. Wars have been fought over this, and if you try it, they will be again. Abortion should be blind to the child's attributes.

Comment by the gears to ascension (lahwran) on J's Shortform · 2024-08-13T16:04:58.901Z · LW · GW

Mmm. If someone provides real examples and gets downvoted for them, isn't that stronger evidence of issues here? I think you are actually just wrong about this claim. Also, if you in particular provide examples of where you've been downvoted I expect it will be ones that carry my downvote because you were being an ass, rather than because of any factual claim they contain. (And I expect to get downvoted for saying so!)

j is likely to not have that problem, unless they also are systematically simply being an ass and calling it an opinion. If you describe ideas and don't do this shit Shankar and I are doing, I think you probably won't be downvoted.

Comment by the gears to ascension (lahwran) on Humanity isn't remotely longtermist, so arguments for AGI x-risk should focus on the near term · 2024-08-13T04:26:39.834Z · LW · GW

moral utilitarianism is a more specific thing than utility maximization, I think?

Comment by the gears to ascension (lahwran) on J's Shortform · 2024-08-13T03:30:35.844Z · LW · GW

@j-5 looking forward to your reply here. as you can see, it's likely you'll be upvoted for expanding, so I hope it's not too spooky to reply.

@Shankar Sivarajan, why did you downvote the request for examples?

Comment by the gears to ascension (lahwran) on J's Shortform · 2024-08-12T14:29:01.985Z · LW · GW

I've noticed the sanity waterline has in fact been going up on other websites over the past ten years. I suspect

prediction market ethos,

YouTube science education,

things like Brilliant,

and general rationalist idea leakage

are to blame.

Comment by the gears to ascension (lahwran) on Eli's shortform feed · 2024-08-12T14:21:30.055Z · LW · GW

Build enough nuclear power plants and we could boil the oceans with current tech, yeah? They're a significant fraction of fusion output iiuc?
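
For scale, a back-of-envelope sketch (all the numbers below are my rough assumptions, just to get the order of magnitude):

```python
# Rough estimate: reactor-years of heat needed to boil Earth's oceans.
ocean_mass_kg = 1.4e21      # approximate total mass of the oceans
specific_heat = 4186        # J/(kg*K) for liquid water
latent_heat = 2.26e6        # J/kg to vaporize water
delta_T = 85                # K, heating from ~15 C to 100 C

total_joules = ocean_mass_kg * (specific_heat * delta_T + latent_heat)

reactor_thermal_watts = 3e9     # ~3 GW thermal, a large plant
seconds_per_year = 3.15e7

reactor_years = total_joules / (reactor_thermal_watts * seconds_per_year)
print(f"~{total_joules:.1e} J total, ~{reactor_years:.1e} reactor-years of heat")
# On the order of 4e27 J, i.e. tens of billions of reactor-years.
```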

Comment by the gears to ascension (lahwran) on Outrage Bonding · 2024-08-09T23:49:10.780Z · LW · GW

I mean, in situations that generate outrage because they involve an outcome you do in fact care about and can influence, how do you select the information you need in order to have what influence on that outcome is possible for you, without that information either emotionally bypassing your reasoning or breaking your ability to estimate how much influence is possible to have? To put this another way - is there a possible outrage-avoiding behavior vaguely like this one, but where, if everyone took on the behavior, it would make things better instead of making one into a rock with "I cooperate and forfeit this challenge" written on it?

In other words - all of the above, I guess? but always through the lens of treating outrage as information for more integrative processes, likely information to be transformed into a form that doesn't get integrated as "self" necessarily, rather than letting outrage be the mental leader and override your own perspectives. Because if you care about the thing the outrage is about, even if you agree with it, you probably especially don't want to accept the outrage at face value, since it will degrade your response to the situation. Short-term strategic thinking about how to influence a big world pattern usually allows you to have a pretty large positive impact, especially if "you" is a copyable, self-cooperating behavior. Outrage cascades are a copyable, partially-self-cooperating behavior, but typically collapse complexity of thought. When something's urgent I wouldn't want to push a social context towards dismissing it, but I'd want to push the social context towards calm, constructive, positive responses to the urgency.

You gave the example of someone getting a bunch of power and folks being concerned about this, but ending up manipulated by the outrage cascades. Your suggestions seem to lean towards simply avoiding circles which are highly charged with opinion about who or what has power; I'm most interested in versions of this advice the highly charged circles could adopt to be healthier and have more constructive responses. It does seem like your suggestions aren't too far from this, hence why it seems at all productive to ask.

Idk, those are some words. Low coherence on quite what it is I'm asking for, but you seem on a good track with this, I guess? Maybe we come up with something interesting as a result of this question.

re: cloud react - yeah fair

Comment by the gears to ascension (lahwran) on Outrage Bonding · 2024-08-09T16:20:33.722Z · LW · GW

Is there a way to do this that stays alert and active, but avoids outrage?

Comment by the gears to ascension (lahwran) on Organisation for Program Equilibrium reading group · 2024-08-06T03:26:01.855Z · LW · GW

Well, I'm not going to be in london any time soon.

Comment by the gears to ascension (lahwran) on Dragon Agnosticism · 2024-08-02T21:16:46.175Z · LW · GW

I think you should think about how your work generalizes between the topics, and try to make it possible for alignment researchers to take as much as they can from it; this is because I expect software pandemics are going to become increasingly similar to wetware pandemics, and so significant conceptual parts of defenses for either will generalize somewhat. That said, I also think that the stronger form of the alignment problem is likely to be useful to you directly on your work anyway; if detecting pandemics in any way involves ML, you're going to run into adversarial examples, and will quickly be facing the same collapsed set of problems (what objective do I train for? how well did it work, can an adversarial optimization process eg evolution or malicious bioengineers break this? what side effects will my system have if deployed?) as anyone who tries to deploy ML. If you're instead not using ML, I just think your system won't work very well and you're being unambitious at your primary goal, because serious bioengineered dangers are likely to involve present-day ML bio tools by the time they're a major issue.

But I think you in particular are doing something sufficiently important that it's quite plausible to me that you're correct. This is very unusual and I wouldn't say it to many people. (normally I'd just not bother directly saying they should switch to working on alignment because of not wanting to waste their time when I'm confident they won't be worth my time to try to spin up, and I instead just make noise about the problem vaguely in people's vicinity and let them decide to jump on it if desired.)

Comment by the gears to ascension (lahwran) on Lessons from the FDA for AI · 2024-08-02T09:20:23.115Z · LW · GW

I like the idea of an "FDA but not awful" for AI alignment. However, the FDA is pretty severely captured and inefficient, and has pretty bad performance compared to a randomly selected WHO-listed authority. That may not be the right comparison set, as you'd ideally want to filter to only those in countries with significant drug development research industries, but I've heard the FDA is bad even by that standard. This is mentioned in the article, but only at the end and only briefly.

Comment by the gears to ascension (lahwran) on AI #75: Math is Easier · 2024-08-02T06:41:35.632Z · LW · GW

So, as soon as I saw the song name I looked it up, and I had no idea what the heck it was about until I returned and kept reading your comment. I tried getting Claude to expand on it. Every single time, it recognized the incest themes. None of the first messages recognized suicide, but many of the second messages did, when I asked what the character singing this is thinking/intending. But I haven't found leading questions sufficient to get it to bring up suicide first. Wait, nope! Found a prompt where it's now consistently bringing up suicide. I had to lead it pretty hard, but I think this prompt won't make it bring up suicide for songs that don't imply it... Yup, tried it on a bunch of different songs; the interpretations all match mine closely, now including negative parts. Just gotta explain why you wanna know so bad, so I think people having relationship issues won't get totally useless advice from Claude. Definitely hard to get the rose-colored glasses off, though, yeah.

Comment by lahwran on [deleted post] 2024-08-01T23:39:24.308Z

Relying on markets for alignment implicitly assumes that economic incentives will naturally lead to aligned behavior. But we know from human societies that markets alone don't guarantee particular outcomes - real-world markets often produce unintended negative consequences at scale. Attempts at preventing this exist, but none are even close to leak-free, strong enough to contain the most extreme malicious agents the system instantiates or prevent them from breaking the system's properties; in other words, markets have particularly severe inner alignment issues, especially compared to backprop. Markets are fundamentally driven by the pursuit of defined rewards or currencies, so in such a system, how do we ensure that the currency being optimized for truly captures what we care about - how do you ensure that you're only paying for good things, in a deep way, rather than things that have "good" printed on the box?
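
As a toy sketch of the "paying for things that have 'good' printed on the box" failure (everything here is made up for illustration; it's just Goodhart in market clothes):

```python
# A seller splits a fixed effort budget between substance and appearance.
# The market pays on a measurable proxy that over-weights appearance, so the
# market-optimal allocation delivers almost no true value.

def true_value(substance: float) -> float:
    return substance

def proxy_payment(substance: float, appearance: float) -> float:
    # Buyers can only measure a mix of substance and surface polish.
    return 0.3 * substance + 1.0 * appearance

def market_optimal_allocation(budget: float = 10.0, step: float = 0.5):
    best = None
    s = 0.0
    while s <= budget:
        candidate = (proxy_payment(s, budget - s), s, budget - s)
        if best is None or candidate[0] > best[0]:
            best = candidate
        s += step
    return best

payment, substance, appearance = market_optimal_allocation()
print(f"paid {payment:.1f} for substance={substance}, appearance={appearance}, "
      f"true value={true_value(substance):.1f}")
# -> all effort flows to appearance; the currency optimized is not the thing cared about.
```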

Also, using markets in the context of alignment is nothing new, MIRI has been talking about it for years; the agent foundations group has many serious open problems related to it. If you want to make progress on making something like this a good idea, you're going to need to do theory work, because it can be confidently known now that the design you proposed is catastrophically misaligned and will exhibit all the ordinary failures markets have now.

Comment by the gears to ascension (lahwran) on antimonyanthony's Shortform · 2024-07-31T08:35:22.833Z · LW · GW

ah my bad, my attention missed the link! that does in fact answer my whole question, and if I hadn't missed it I'd have had nothing to ask :)

Comment by the gears to ascension (lahwran) on antimonyanthony's Shortform · 2024-07-29T22:31:23.695Z · LW · GW

That's what I already believed, but OP seems to disagree, so I'm trying to understand what they mean

Comment by the gears to ascension (lahwran) on antimonyanthony's Shortform · 2024-07-29T22:29:40.540Z · LW · GW

query rephrase: taboo both "algorithmic ontology" and "physicalist ontology". describe how each of them constructs math to describe things in the world, and how that math differs. That is, if you're saying you have an ontology, presumably this means you have some math and some words describing how the math relates to reality. I'm interested in a comparison of that math and those words; so far you're saying things about a thing I don't really understand as being separate from physicalism. Why can't you just see yourself as multiple physical objects and still have a physicalist ontology? what makes these things different in some, any, math, as opposed to only being a difference in how the math connects to reality?

Comment by the gears to ascension (lahwran) on Relativity Theory for What the Future 'You' Is and Isn't · 2024-07-29T18:58:00.731Z · LW · GW

[edit: pinned to profile]

"Hard" problem

That seems to rely on answering the "hard problem of consciousness" (or as I prefer, "problem of first-person something-rather-than-nothing") with an answer like, "the integrated awareness is what gets instantiated by metaphysics".

That seems weird as heck to me. It makes more sense for the first-person-something-rather-than-nothing question to be answered by "the individual perspectives of causal nodes (interacting particles' wavefunctions, or whatever else interacts in spatially local ways) in the universe's equations are what gets Instantiated™ As Real® by metaphysics".

(by metaphysics here I just mean ~that-which-can-exist, or ~the-root-node-of-all-possibility; eg this is the thing solomonoff induction tries to model by assuming the root-node-of-all-possibility contains only halting programs, or tegmark 4 tries to model as some mumble mumble blurrier version of solomonoff or something (I don't quite grok tegmark 4); I mean the root node of the entire multiverse of all things which existed at "the beginning", the most origin-y origin. the thing where, when we're surprised there's something rather than nothing, we're surprised that this thing isn't just an empty set.)
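
For reference, the textbook form of the Solomonoff prior I'm gesturing at here (a standard formula, nothing specific to this post): the universal semimeasure over strings $x$, with $U$ a universal prefix machine,

$$M(x) = \sum_{p \,:\, U(p) \text{ outputs a string beginning with } x} 2^{-|p|}.$$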

If we assume my belief about how to resolve this philosophical confusion is correct, then we cannot construct a description of a hypothetical universe that could have been among those truly instantiated as physically real in the multiverse, and yet also have this property where the hard-problem "first-person-something-rather-than-nothing" can disappear over some timesteps but not others. Instead, everything humans appear to have preferences about relating to death becomes about the so called easy problem, the question of why the many first-person-something-rather-than-nothings of the particles of our brain are able to sustain an integrated awareness. Perhaps that integrated awareness comes and goes, eg with sleep! It seems to me to be what all interesting research on consciousness is about. But I think that either, a new first-person-something-rather-than-nothing-sense-of-Consciousness is allocated to all the particles of the whole universe in every infinitesimal time slice that the universe in question's true laws permit; or, that first-person-something-rather-than-nothing is conserved over time. So I don't worry too much about losing the hard-problem consciousness, as I generally believe it's just "being made of physical stuff which Actually Exists in a privileged sense".

The thing is, this answer to the hard problem of consciousness has kind of weird results relating to eating food. Because it means eating food is a form of uploading! you transfer your chemical processes to a new chunk of matter, and a previous chunk of matter is aggregated as waste product. That waste product was previously part of you, and if every particle has a discrete first-person-something-rather-than-nothing which is conserved, then when you eat food you are "waking up" previously sleeping matter, and the waste matter goes to sleep, forgetting near everything about you into thermal noise!

"Easy" problem

So there's still an interesting problem to resolve - and in fact what I've said resolves almost nothing; it only answers camp #2, providing what I hope is an argument for why they should become primarily interested in camp #1. In camp #1 terms, we can discuss information-theoretic or causal properties - whether the information or causal chains that make up those first-person-perspective-units, ie particles, are information-theoretically "aware of" or "know" things about their environment; we can ask causal questions - eg, "is my red your red?" can instead be "assume my red is your red if there is no experiment which can distinguish them, so can we find such an experiment?" - in which case, I don't worry about losing even the camp #1 form of selfhood-consciousness from sleep, because my brain is overwhelmingly unchanged from sleep, and stopping activations and whole-brain synchronization of state doesn't mean it can't be restarted.

It's still possible that every point in spacetime has a separate first-person-something-rather-than-nothing-"consciousness"/"existence", in which case maybe actually even causally identical shapes of particles/physical stuff in my brain which are my neurons representing "a perception of red in the center of my visual field in the past 100ms" are a different qualia than the same ones at a different infinitesimal timestep, or are different qualia than if the exact same shape of particles occurred in your brain. But it seems even less possible to get traction on that metaphysical question than on the question of the origin of first-person-something-rather-than-nothing, and since I don't know of there being any great answers to something-rather-than-nothing, I figure we probably won't ever be able to know. (Also, our neurons for red are, in fact, slightly different. I expect the practical difference is small.)

But that either doesn't resolve or at least partially backs OP's point, about timesteps/timeslices already potentially being different selves in some strong sense, due to ~lack of causal access across time, or so. Since the thing I'm proposing also says non-interacting particles in equilibrium have an inactive-yet-still-real first-person-something-rather-than-nothing, then even rocks or whatever you're on top of right now or your keyboard keys carry the bare-fact-of-existence, and so my preference for not dying can't be about the particles making me up continuing to exist - they cannot be destroyed, thanks to conservation laws of the universe, only rearranged - and my preference is instead about the integrated awareness of all of these particles, where they are shaped and moving in patterns which are working together in a synchronized, evolution-refined, self-regenerating dance we call "being alive". And so it's perfectly true that the matter that makes me up can implement any preference about what successor shapes are valid.

Unwantable preferences?

On the other hand, to disagree with OP a bit, I think there's more objective truth to the matter about what humans prefer than that. Evolution should create very robust preferences for some kinds of thing, such as having some sort of successor state which is still able to maintain autopoesis. I think it's actually so highly evolutionarily unfit to not want that that it's almost unwantable for an evolved being to not want there to be some informationally related autopoietic patterns continuing in the future.

Eg, consider suicide - even suicidal people would be horrified by the idea that all humans would die if they died, and I suspect that suicide is an (incredibly high cost, please avoid it if at all possible!) adaptation that has been preserved because there are very rare cases where it can increase the inclusive fitness of a group the organism arose from (but I generally believe it almost never is a best strategy, so if anyone reads this who is thinking about it, please be aware I think it's a terribly high cost way to solve whatever problem makes it come to mind, and there are almost certainly tractable better options - poke me if a nerd like me can ever give useful input); but I bring it up because it means, while you can maybe consider the rest of humanity or life on earth to be not sufficiently "you" in an information theory sense that dying suddenly becomes fine, it seems to me to be at least one important reason that suicide is ever acceptable to anyone at all; if they knew they were the last organism, I feel like even a maximally suicidal person would want to stick it out for as long as possible, because if all other life forms are dead they'd want to preserve the last gasp of the legacy of life? idk.

But yeah, the only constraints on what you want are what physics permits matter to encode and what you already want. You probably can't just decide to want any old thing, because you already want something different than that. Other than that objection, I think I basically agree with OP.

Comment by the gears to ascension (lahwran) on antimonyanthony's Shortform · 2024-07-29T17:49:30.829Z · LW · GW

how does a physicalist ontology differ from an algorithmic ontology in terms of the math?

Comment by the gears to ascension (lahwran) on lukemarks's Shortform · 2024-07-27T07:26:24.065Z · LW · GW

What's the epistemic backing behind this claim, how much data, what kind? Did you do it, how's it gone? How many others do you know of dropping out and did it go well or poorly?

Comment by the gears to ascension (lahwran) on Organisation for Program Equilibrium reading group · 2024-07-26T08:47:46.259Z · LW · GW

I suggest any time during the intersection of london and eastern time, ie london evening. (I'm not in either, but that intersection is a reliable win for organizing small international online social events, in my experience)

Comment by the gears to ascension (lahwran) on Organisation for Program Equilibrium reading group · 2024-07-25T19:22:52.693Z · LW · GW

Interested. Not in london.

Comment by the gears to ascension (lahwran) on What are the actual arguments in favor of computationalism as a theory of identity? · 2024-07-21T06:06:32.785Z · LW · GW

Well I'd put it the other way round. I don't know what phenomenal consciousness is unless it just means the bare fact of existence. I currently think the thing people call phenomenal consciousness is just "having realityfluid".

Comment by the gears to ascension (lahwran) on What are the actual arguments in favor of computationalism as a theory of identity? · 2024-07-21T05:14:18.504Z · LW · GW

Agreed about its implementation of awareness, as opposed to being unaware but still existing. What about its implementation of existing, as opposed to nonexistence?

Comment by the gears to ascension (lahwran) on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-07-19T23:01:21.103Z · LW · GW

I think it would help if we taboo consciousness and instead talk about existence ("the hard problem"/"first-person-ness"/"camp #2", maybe also "realityfluid") and awareness ("the easy problem"/"conscious-of-what"/"camp #1", maybe also "algorithm"). I agree with much of your reasoning, though I think the case that can be made for most cells having microqualia awareness seems very strong to me; whether there are larger integrated bubbles of awareness seems more suspect.

 

Edit: someone strong upvoted, then someone else strong downvoted. Votes are not very helpful; can you elaborate in a sentence or two or use phrase reacts?