post by [deleted]

Comments sorted by top scores.

comment by interstice · 2023-11-23T02:00:30.284Z · LW(p) · GW(p)

> if you act like a ratfic character, you win in ratfic and you win in real life.

Citation needed.

Replies from: carado-1
comment by Tamsin Leake (carado-1) · 2023-11-23T02:06:51.141Z · LW(p) · GW(p)

ratfic (as i'm using the term here) typically showcases characters applying lesswrong rationality well. lesswrong rationality is typically defined as ultimately instrumental to winning.

Replies from: interstice, Archimedes
comment by interstice · 2023-11-23T02:11:34.073Z · LW(p) · GW(p)

LW-rationality (and ratfic by extension) aspires to be instrumental to winning in the real world; whether it in fact does so is an empirical question.

Replies from: faul_sname
comment by faul_sname · 2023-11-23T03:53:33.541Z · LW(p) · GW(p)

The empirical answer to that question does appear to be "yes" to some extent, though that's mostly a couple of very big wins (calling out bitcoin, COVID, and LLMs as important very early, relative even to the techie population) rather than a bunch of consistent small wins. There have also been some pretty spectacular failures.

comment by Archimedes · 2023-11-23T02:16:03.197Z · LW(p) · GW(p)

That sounds rather tautological.

Assuming ratfic represents LessWrong-style rationality well, and assuming LW-style rationality is a good approximation of truly useful instrumental reasoning, the claim should hold. There's room for error in both assumptions.

comment by Trevor Hill-Hand (Jadael) · 2023-11-23T00:01:09.237Z · LW(p) · GW(p)

I see people upvoting this, and I think I can see some good insights in this post, but MAN are glowfics obnoxious to read, and this feels really hard to read in a very similar way. I'm sad it isn't easier.

comment by Mateusz Bagiński (mateusz-baginski) · 2023-11-22T19:48:10.190Z · LW(p) · GW(p)

(Meta) Why do you not use capital letters, except in acronyms? I find it harder to parse.

Replies from: SaidAchmiz, carado-1
comment by Said Achmiz (SaidAchmiz) · 2023-11-22T23:47:25.910Z · LW(p) · GW(p)

Strongly agreed. Capitalization conventions are part of standard English-language orthography for a very good reason; they make it much easier to read text.

Such things as “stylistic choices” and “to make my formal writing more representative of my casual chatting” do not even begin to approach sufficient justification for imposing such a dramatic cost on your readers.

Writing like this sends a very strong signal that you do not care about your readers’ time or cognitive resources, which in turn implies that you’re uninterested in whether you’re successfully communicating your ideas. (Of course, if that’s your intent, then fair enough. But then why post at all?)

Replies from: carado-1
comment by Tamsin Leake (carado-1) · 2023-11-23T01:33:13.760Z · LW(p) · GW(p)

you know what, fair enough. i've edited the post to be capitalized in the usual way.

comment by quiet_NaN · 2023-11-23T12:23:11.247Z · LW(p) · GW(p)

> Yes, having a general principle of being kind to others is downstream of that, because a paladin who is known to be kind and helpful will tend to have more resources to save the world with.

Well, instrumental convergence is a thing. If there is a certain sweet spot for kindness with regard to resource gain, I would expect the paladin, the sellsword, and the hellknight all to arrive at similar kindness levels.

There is a spectrum between pure deontology and pure utilitarianism. I agree with the author and EY that pure utilitarianism is not suitable for humans. In my opinion, one failure mode of pure deontology is refusing to fight evil in any way that is not totally inefficient, while one failure mode of pure utilitarianism is losing sight of the main goal while focusing on some instrumental subgoal.

Of course, in traditional D&D, paladins are generally characterized as deontological sticklers for their rules ("lawful stupid").

> Let's say you're concerned about animal suffering. You should realize that what is gonna have the most impact on how much animal suffering the future will contain is, by far, determined by what kind of AI is the one that inevitably takes over the world, and then you should decide to work on something which impacts what kind of AI is the AI that inevitably takes over the world.

If ASI comes along, I expect that animal suffering will no longer be a big concern. Economic practices which cause animal suffering tend to decline in importance as we get to higher tech levels. The plow may be pulled by the oxen, but no sensible spaceship engine will ever run on puppy torture. This leaves the following possibilities:

* An ASI which is orthogonal to animal (including human) welfare. It will simply turn the animals into something more useful to its goals, thereby ending animal suffering.

* An ASI which is somewhat aligned to human interests. This would probably result in mankind increasing the total number of pets and wild animals by orders of magnitude for human reasons. But even today, only a small minority of humans prefers to actively hurt animals for their own sake. Even if we do not get around to fixing pain for some reason, the expected suffering per animal will not be more than what we consider acceptable today. (Few people advocate driving any wild species extinct because they just suffer too much.)

* An ASI whose end goal is to cause animal suffering. I Have No Mouth And I Must Scream, But With Puppies. This is basically a null set within the space of possible ASIs. I concede that we might create a human-torturing ASI if we mess up alignment badly enough by making a sign error or whatever, but even that seems like a remote possibility.

So if one's goal is to minimize the harm per animal conditional on it existing, and one believes that ASI is within reach, the correct focus would seem to be to ignore alignment and focus on capabilities. Either you end up with a paperclip maximizer that is certain to reduce non-human animal suffering compared to our current world of factory farming within a decade, or you end up with a friendly AI which will get rid of factory farming because it upsets some humans (and is terribly inefficient for food production).

Of course, if you care about the total number of net-happy animals, and species not going extinct and all of that, then alignment will start to matter. 

Replies from: carado-1
comment by Tamsin Leake (carado-1) · 2023-11-23T14:18:46.621Z · LW(p) · GW(p)

agreed overall.

> if one's goal is to minimize the harm per animal conditional on it existing, and one believes that ASI is within reach, the correct focus would seem to be to ignore alignment and focus on capabilities

IMO aligned AI reduces suffering even more than unaligned AI, because it'll pay alien civilizations (e.g. baby eaters [LW · GW]) not to do things that we'd consider large-scale suffering (in exchange for some of our lightcone), so even people closer to the negative utilitarian side should want to solve alignment.

comment by Nicholas / Heather Kross (NicholasKross) · 2023-11-22T23:31:24.273Z · LW(p) · GW(p)

I got a lot out of the post, including self-understanding from observing my reactions to and thoughts about the post.

Replies from: Artaxerxes
comment by Artaxerxes · 2023-11-23T06:28:41.575Z · LW(p) · GW(p)

What kinds of reactions to and thoughts about the post did you have that you got a lot out of observing?

Replies from: NicholasKross
comment by Nicholas / Heather Kross (NicholasKross) · 2023-11-24T00:32:52.970Z · LW(p) · GW(p)

Non-exhaustive, and maybe not a representative selection of the useful parts: I realized how easily I can change my state of mind, down to emotions and similar things. Not instant/fully at-will, but more "I can see the X, let it pass over/through me, and then switch to Y if I want".

comment by goldenfetus · 2023-11-24T19:56:30.529Z · LW(p) · GW(p)

> For example: I expect that if I had framed this post as "you should want to save the world, and here's how" rather than "if you want to save the world, then here's how", a bunch of people who are currently on board with this post would have instead felt aggressed and reacted by finding excuses to reject this post's contents.

I'd argue (or observe) that the people who would have felt that way, had you said that, would actually have been aggressed against (as opposed to just feeling that way), and therefore would not have needed to find any excuses, since they would have had a valid reason to reject at least parts of the post's content. Your other invocations of that concept, however, appeared reasonable and insightful to me.

I enjoyed your post and found it useful for a number of reasons. It may be worthwhile to examine the possibility that other paladins exist who would not necessarily cooperate with you, because their conception of the goal state "the world is saved" is different from your own.

Replies from: carado-1
comment by Tamsin Leake (carado-1) · 2023-11-25T13:40:05.192Z · LW(p) · GW(p)

that last point is plausible for some, but for most i expect that we're far from the pareto frontier and that there are large positive-sum gains to be made through cooperation (assuming they implement a decision theory that allows such cooperation).

comment by Screwtape · 2024-12-13T18:43:25.524Z · LW(p) · GW(p)

I like this essay. I am not a paladin and do not particularly plan to become one. I do not think all the people setting out to maximize utility would stand behind this particular version of the rallying cry. 

But I do think paladins exist, I want them to have a rallying cry, and when it works, when they do manage to point themselves at the right target and are capable of making a dent, I appreciate that they exist and chose to do that. I also appreciate the "if you want to save the world, then here's how" framing.

I don't quite think someone could follow this essay to paladinhood, so I'm mixed as to whether it succeeds at what it's setting out to do. I've given this a small upward vote in the review, with the intent to say: yeah, hey paladins, I'm glad you're around LessWrong, and I think something would be lost if you all left.

comment by Ben Pace (Benito) · 2024-12-12T08:57:47.951Z · LW(p) · GW(p)

I have various disagreements with some of the points in this post, and I don't think it adds enough new ideas to be strongly worthy of winning the annual review, but I am grateful to have read it, and for worthwhile topics it helps to retread the same ground in slightly different ways with some regularity [LW · GW]. I will give this a +1 vote.

(As an example disagreement, there's a quote of a fictional character saying "There will be time enough for love and beauty and joy and family later. But first we must make the world safe for them." A contrary hypothesis I believe in more is that growing from children into adults involves bringing to life all parts of us that have been suffocated by Moloch, including many of these very powerful very human parts, and it is not good for these parts of us to be lost to the world until after the singularity.)

comment by Nicholas / Heather Kross (NicholasKross) · 2024-01-13T06:15:37.007Z · LW(p) · GW(p)

> to quote a fiend, "your mind is a labyrinth of anti-cognitive-bias safeguards, huh?"

[emphasis added]

The implied context/story this is from sure sounds interesting. Mind telling it?

comment by Mateusz Bagiński (mateusz-baginski) · 2023-12-05T16:22:54.035Z · LW(p) · GW(p)

Thanks, I think I needed this. It leaves me with a similar feeling to A Man in the Arena.