Posts

Looking for intuitions to extend bargaining notions 2024-08-24T05:00:13.995Z
Moving away from physical continuity 2024-07-12T05:05:01.231Z
Inner Optimization Mechanisms in Neural Nets 2024-05-12T17:52:52.803Z
User-inclination-guessing algorithms: registering a goal 2024-03-20T15:55:01.314Z
ProgramCrafter's Shortform 2023-07-21T05:26:03.188Z
LLM misalignment can probably be found without manual prompt engineering 2023-07-08T14:35:44.119Z
Does object permanence of simulacrum affect LLMs' reasoning? 2023-04-19T16:28:22.094Z
The frozen neutrality 2023-04-01T12:58:40.873Z
Proposal on AI evaluation: false-proving 2023-03-31T12:12:15.636Z
How AI could workaround goals if rated by people 2023-03-19T15:51:04.743Z

Comments

Comment by ProgramCrafter (programcrafter) on ProgramCrafter's Shortform · 2024-09-06T21:37:22.948Z · LW · GW

A sufficient condition for a good [possibly AI-gen] plan

Current posts mainly focus on necessary properties of the plans their authors would like to see executed. I suggest a sufficient condition:

A plan is good, should be acted upon, etc. at least when it is endorsed in advance, endorsed in retrospect, and endorsed in counterfactual.

  1. Endorsed in advance: everyone relevant hears the plan and its possible outcomes in advance, evaluates its acceptability, and accepts the plan.

  2. Endorsed in retrospect: everyone relevant looks at the intended outcomes, checks what actually happened, evaluates the plan, and has no regret.

  3. Endorsed in counterfactual: given a choice among a set of plans, each person would evaluate this specific plan as acceptable: somewhat satisfying, not inducing much desire to switch.

Choice according to these criteria is still hard, but it should be a bit less mysterious.
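A minimal sketch of the condition as code (an illustration of mine; the predicate names are hypothetical stand-ins for each stakeholder's judgement):

```python
from typing import Callable, Iterable

def plan_is_good(plan: str,
                 stakeholders: Iterable[str],
                 endorsed_in_advance: Callable[[str, str], bool],
                 endorsed_in_retrospect: Callable[[str, str], bool],
                 endorsed_in_counterfactual: Callable[[str, str], bool]) -> bool:
    """Sufficient (not necessary) condition: every relevant person
    endorses the plan in all three modes described above."""
    return all(
        endorsed_in_advance(person, plan)
        and endorsed_in_retrospect(person, plan)
        and endorsed_in_counterfactual(person, plan)
        for person in stakeholders
    )
```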

Comment by ProgramCrafter (programcrafter) on What are the effective utilitarian pros and cons of having children (in rich countries)? · 2024-09-02T16:58:56.796Z · LW · GW

Halving the global population has the same effect on climate as doubling the size of the Earth's atmosphere

assuming economic activity (and hence CO2 emissions) scales linearly with population.
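Spelling out the proportionality behind the claim (my sketch of the arithmetic: E for emissions, P for population, M for atmosphere mass, k a constant):

```latex
% Concentration growth scales with emissions over atmosphere mass:
\frac{d[\mathrm{CO_2}]}{dt} \propto \frac{E(P)}{M} = \frac{kP}{M},
\qquad
\frac{k(P/2)}{M} = \frac{kP}{2M}
% so halving P and doubling M reduce the growth rate identically.
```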

(Alternative idea) I think a large contributor to greenhouse gases is transport to remote areas, so solving housing-price problems in dense areas could concentrate people somewhat and help the environment.

Comment by ProgramCrafter (programcrafter) on Does a time-reversible physical law/Cellular Automaton always imply the First Law of Thermodynamics? · 2024-08-30T20:40:23.577Z · LW · GW

Translation-invariance means that the law is invariant in t; that is, if you test a system twice at different times, you will obtain the same results. It is a kind of continuous symmetry.

There is also a discrete, mirror symmetry, which is weaker than the continuous one but much stronger than merely allowing reverse simulation. Mirror symmetry with "axis" T means that if the system evolved from state A to B over the interval [T-t; T], then it evolves from B back to A over [T; T+t].

Reverse simulation is even weaker: it only requires that there be some law for calculating past states (the reverse law need not be the same as the forward one).
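A toy illustration of the weakest notion (my own example, not from the thread): a rule whose update depends explicitly on the time step is reverse-simulatable, yet neither translation-invariant nor mirror-symmetric.

```python
def forward(x: int, n: int) -> int:
    """State at time n+1, given state x at time n; the rule depends on n,
    so it is not translation-invariant."""
    return x + n

def backward(x: int, n: int) -> int:
    """State at time n-1, given state x at time n: the forward rule is
    simply inverted, which is all reverse simulation requires."""
    return x - (n - 1)

x = 5
for n in range(0, 4):       # evolve forward from t=0 to t=4
    x = forward(x, n)
for n in range(4, 0, -1):   # simulate backward; the initial state returns
    x = backward(x, n)
assert x == 5
```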

Comment by ProgramCrafter (programcrafter) on Does a time-reversible physical law/Cellular Automaton always imply the First Law of Thermodynamics? · 2024-08-30T20:22:45.645Z · LW · GW

if there are hypothetical time-reversible rules that don't have the first law of conservation of energy due to not implying time symmetry

Yes, there are. For instance, take a one-dimensional model with a particle (to avoid questions about the origin, you might add two reference particles which do not move) having conserved quantity E and position x. The law could be, say, x(t) = E·t²: it is symmetric around zero and reverse-simulatable, but not translation-invariant.
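A quick numeric check of those three properties (under my reading of the example; the concrete law above is my reconstruction):

```python
E = 2.0                      # the conserved quantity
x = lambda t: E * t * t      # position under the example law

assert x(-3) == x(3)         # symmetric around zero
# Reverse-simulatable: (x, E) at a known time determines all past states.
# Not translation-invariant: the same time step taken at different times
# changes the position by different amounts.
assert x(1) - x(0) != x(2) - x(1)
```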

Comment by ProgramCrafter (programcrafter) on Does a time-reversible physical law/Cellular Automaton always imply the First Law of Thermodynamics? · 2024-08-30T18:43:38.360Z · LW · GW

Do you happen to have a definition of "energy" for cellular automata? I guess you could group states reachable from one another via the reversible law (thus lying on one loop) into equivalence classes, but that says nothing about the cells in any local area.

Physics is continuous and has Noether's theorem; there, time-shift symmetry (not even time-reversal symmetry) implies conservation of energy.
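For reference, the standard statement (not quoted from the thread): if the Lagrangian has no explicit time dependence, the Hamiltonian (energy) is conserved.

```latex
\frac{\partial L}{\partial t} = 0
\quad\Longrightarrow\quad
\frac{dH}{dt} = 0,
\qquad
H = \sum_i \dot{q}_i \frac{\partial L}{\partial \dot{q}_i} - L
```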

Comment by ProgramCrafter (programcrafter) on Ruby's Quick Takes · 2024-08-30T14:25:54.426Z · LW · GW

I'm interested! Among other uses, I hope to use it for finding posts that explore similar topics under different names.

By the way, I have an idea for what to use instead of a payment model: interacting with the user's local LLM, like one started within LM Studio. That'd require a checkbox/field for entering an API URL, some recommendations on which model to use, and working out how to reduce the amount of content fed into the model (as user-run LLMs tend to have smaller context windows than needed).
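A rough sketch of what that integration could look like, assuming the local server exposes an OpenAI-compatible endpoint (LM Studio does by default; the URL, model name, and truncation limit below are illustrative):

```python
import requests

API_URL = "http://localhost:1234/v1/chat/completions"  # user-entered setting

def query_local_llm(post_text: str, max_chars: int = 8000) -> str:
    """Ask the user's local model about a post, truncating the input to
    respect the smaller context windows of locally-run models."""
    response = requests.post(API_URL, json={
        "model": "local-model",  # placeholder; local servers typically use whatever model is loaded
        "messages": [
            {"role": "system", "content": "Summarize the key claims of this post."},
            {"role": "user", "content": post_text[:max_chars]},
        ],
        "temperature": 0.2,
    })
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```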

Comment by ProgramCrafter (programcrafter) on "Deception Genre" What Books are like Project Lawful? · 2024-08-28T21:44:26.008Z · LW · GW

There is an EPUB version at https://www.mikescher.com/blog/29/Project_Lawful_ebook.

(Alternatively, you could open each glowfic thread, select "show flat" or a similar option, then print/parse the resulting page, obtaining a few PDFs that can be merged afterwards.)

Comment by ProgramCrafter (programcrafter) on LessWrong email subscriptions? · 2024-08-28T04:52:02.957Z · LW · GW

Batching of various email types into a digest that comes once a day/week.

I feel like this does not fit LessWrong well; its point is not only to have correct knowledge, but also to think correctly, and that works better with lots of evidence. From a digest, people might believe whatever a post title says, disregarding the caveats described in the post itself.

Comment by ProgramCrafter (programcrafter) on Darwinian Traps and Existential Risks · 2024-08-27T22:08:45.218Z · LW · GW

When defectors thrive in a mixed population due to selection pressures, more individuals will adapt by choosing defection. Over time, this process erodes the number of cooperators until they eventually all become defectors. 

This contradicts the paper "FDT in an evolutionary environment" linked in https://www.lesswrong.com/posts/oR8hdkjuBrvpHmzGF/found-paper-fdt-in-an-evolutionary-environment, which argues that in similar situations defectors may be less competitive than intelligent agents (which, by the way, cooperate among themselves). Also, evolution didn't produce purely selfish entities... to me this suggests that one of the more important causes of defection (and mutual defection) is a limit on cognition (that is, bounded rationality).

Comment by ProgramCrafter (programcrafter) on How do we know dreams aren't real? · 2024-08-24T18:50:07.856Z · LW · GW

Actually, collections of atoms (let's call them structures) can be special.

For instance, there are structures which tend to produce copies of themselves; with a small change (one sign flip), one can obtain a structure which tends to destroy its own instances. They have approximately the same complexity, so they arise randomly at equal rates; however, over time the count of the former increases while the count of the latter decreases. So we shouldn't expect all collections of atoms to appear with equal probability, even in a full universe.
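A toy simulation of that asymmetry (my own illustration; the rates are arbitrary):

```python
import random

random.seed(0)
replicators, destroyers = 0.0, 0.0
for step in range(200):
    if random.random() < 0.05:   # both structures arise randomly...
        replicators += 1
    if random.random() < 0.05:   # ...at the same rate
        destroyers += 1
    replicators *= 1.1           # self-copying: count grows
    destroyers *= 0.9            # self-destroying: count shrinks
print(f"{replicators:.0f} replicators vs {destroyers:.2f} destroyers")
```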

Comment by ProgramCrafter (programcrafter) on How do we know dreams aren't real? · 2024-08-24T18:42:05.551Z · LW · GW

As a basic counterexample, just consider a fully empty infinite universe. It is in equilibrium (and does not violate any known laws of physics)

Your premise violates quantum mechanics, actually. Such a universe's amplitude distribution is a delta function (fully empty with probability 1, any other state with probability 0), which has no second derivative, so its future evolution is undefined.
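The equation behind this argument, made explicit: Schrödinger evolution needs a second spatial derivative of the amplitude, which a delta function lacks as an ordinary function.

```latex
% Free-particle Schroedinger equation:
i\hbar \frac{\partial \psi}{\partial t}
  = -\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial x^2}
% A delta-function amplitude \psi(x) = \delta(x - x_0) has no pointwise
% second derivative, so the right-hand side is undefined as a function.
```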

Comment by ProgramCrafter (programcrafter) on Leaky Delegation: You are not a Commodity · 2024-08-24T12:08:33.141Z · LW · GW

It doesn't support that exact point, but rather "the better you can do something yourself, the less downside there is in doing it yourself instead of outsourcing", which seems to be the basis of the paradox mentioned.

Comment by ProgramCrafter (programcrafter) on Leaky Delegation: You are not a Commodity · 2024-08-24T04:16:34.325Z · LW · GW

There's a simple reason: when you do something yourself, the value is in both having it done and learning how to do it.

There's probably another reason too! If you know how to do something yourself, you save effort when communicating what you actually want to providers, and you spend fewer resources on anticipating failure modes.

Comment by ProgramCrafter (programcrafter) on Freedom of Speech · 2024-08-20T18:06:10.643Z · LW · GW

The post makes a separate claim with each sentence and, instead of going on to reasons, continues with yet another claim. I think this negatively affects its quality: for instance

It [freedom of speech] protects people with minority views from persecution by the state or the mob. <...> Freedom of speech can be limited by the state, corporations, the mob or individuals acting alone. Any use of coercion to suppress ideas is an attack on freedom of speech.

This seems significantly misleading. Ideas should be selected based on their merits, and that requires that some ideas do not survive. Thus, suppression of ideas can sometimes be positive for social rationality.

Comment by ProgramCrafter (programcrafter) on A computational complexity argument for many worlds · 2024-08-20T17:50:17.525Z · LW · GW

But physics is roughly time-symmetric, so branches merge all the time as well. (I believe they merge at a slightly lower rate than they split, so the occupied configuration space expands a bit over time.)

Comment by ProgramCrafter (programcrafter) on Ten arguments that AI is an existential risk · 2024-08-13T20:06:02.432Z · LW · GW

A utilitarian, a deep ecologist, and a Christian might agree on policy in the present world, but given arbitrary power their preferred futures might be a radical loss to the others. <...> People who broadly agree on good outcomes within the current world may, given much more power, choose outcomes that others would consider catastrophic

I think many people do not intend for their preferred policy to be implemented everywhere, so they could at least be satisfied with a small region of the universe. Though AI-as-tool is quite likely to be created under the control of those who want every power source and (an assumption) thus also want to steer most of the world; it's unclear whether AI-as-agent would have strong preferences about the parts of the world it doesn't see.

Comment by ProgramCrafter (programcrafter) on Richard Ngo's Shortform · 2024-08-09T19:41:45.828Z · LW · GW

It seems to me that the agent's strategy in the limit will be either the null action or the evolution-dictated action; I'm not sure which. That is, "in a universe where it's easy to do A, the agent will choose to do A" somewhat implies "actions will be chosen according to how easily an agent doing A gains more optimization power", which is essentially evolution.

Comment by ProgramCrafter (programcrafter) on Mistakes with Conservation of Expected Evidence · 2024-08-06T18:02:24.874Z · LW · GW

it would imply that you should ignore mathematical proofs if the person who came up with the proof only searched for positive proofs and wouldn't have spend time trying to prove the opposite. (This ties in with the very first section -- failing to find a proof is like remaining silent.)

I think this strategy becomes coherent if you update on the claim "fact X is true, here's its proof" being made at all. After all, there's a lower probability that a person publishes such a claim if they failed to find a proof.

(Generalization: it doesn't matter much which arguments you update on; it matters more what you end up believing.)
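A worked form of that update (my own illustration):

```latex
P(X \mid \text{published})
  = \frac{P(\text{published} \mid X)\,P(X)}
         {P(\text{published} \mid X)\,P(X)
          + P(\text{published} \mid \neg X)\,P(\neg X)}
% If people rarely publish "here's a proof of X" when no proof exists,
% P(published | not X) is small and the posterior on X is high.
```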

Comment by ProgramCrafter (programcrafter) on How to avoid death by AI. · 2024-07-29T20:15:17.962Z · LW · GW

This proposal, done without care, might turn out to be a variation on advertisements, which are often delivered to phone-tapping click farms instead of to people who would be interested in the product/concept. If you have a provably better solution, you'd outcompete existing ad systems!

Comment by ProgramCrafter (programcrafter) on Lucius Bushnaq's Shortform · 2024-07-29T08:13:48.333Z · LW · GW

Current LLMs are trivially mesa-optimisers under the original definition of that term.

Do current LLMs produce several options and then compare them according to an objective function?

They do, actually, evaluate each possible output token and then emit one of the most probable ones; but I think the concern is more about an AI comparing larger chunks of text (for instance, evaluating paragraphs of a report by predicted stakeholder reaction).

Comment by ProgramCrafter (programcrafter) on Nathan Young's Shortform · 2024-07-25T12:28:26.808Z · LW · GW

I disagree with "of course". The laws of cognition aren't on any side, but human rationalists presumably share (at least some) human values and intend to advance them; insofar as they are more successful than non-rationalists, this qualifies as Good.

Comment by ProgramCrafter (programcrafter) on Closed Limelike Curves's Shortform · 2024-07-23T08:29:30.628Z · LW · GW

I have created a Discord server: "Decision Articles Treatment", https://discord.gg/P7m63mAP.

@the gears to ascension @Olli Järviniemi @DusanDNesic not sure if your reacts would create notifications, so pinging manually.

Comment by ProgramCrafter (programcrafter) on Closed Limelike Curves's Shortform · 2024-07-20T17:17:21.870Z · LW · GW

What I'd now want to see is more people actually coordinating to do something about it - set up a Telegram or Discord group or something, and start actually working on improving the pages - rather than this just being one of those complaints on how Rationalists Never Actually Tried To Win, which a lot of people upvote and nod along with, and which is quickly forgotten without any actual action.

So mote it be. I can start the group/server and do moderation (though not 24/7, of course). Whoever is reading this: please choose between Telegram and Discord with an inline react.

The moderation style I currently use: "reign of terror". I delete off-topic messages immediately, and after large discussions I delete the messages which do not carry much information (even if someone has replied to them).

I've created a couple of prediction markets:

  • Will I manage group for improvement of Wikipedia-related articles
  • Will LessWrong have book review on some newly-added source to Wikipedia rationality-related article

Comment by ProgramCrafter (programcrafter) on Closed Limelike Curves's Shortform · 2024-07-20T09:49:54.611Z · LW · GW

There is also the article Decision-making.

Importance arguments:

  1. Five WikiProjects rely on this article, but it is only C-class on the Wikipedia quality scale;
  2. The topic seems quite important to people. If someone who doesn't know how to make decisions stumbles upon the article, the first image they see... is a flowchart, which can scare non-programmers away.

Comment by ProgramCrafter (programcrafter) on How To Go From Interpretability To Alignment: Just Retarget The Search · 2024-06-17T19:12:04.790Z · LW · GW

The system might develop several search parts, some of which would be epistemic; for instance, "where is my friend Bob? Volunteering at a camp? Eating out at a cafe? Watching a movie?". An attempt to retarget one of those to select options based on the alignment target instead of truth would make the AI underperform or act on an invalid world model.

Are there better ways to fix this issue than retargeting only the last search (the one nearest to the output)?

Comment by ProgramCrafter (programcrafter) on Open Thread Summer 2024 · 2024-06-15T14:11:52.427Z · LW · GW

A problem is that

  • we don't know the specific goal representation (the actual string in place of "A"),
  • we don't know how to evaluate LLM output (in particular, how to check whether the suggested plan works toward a goal),
  • we have a large (presumably infinite, non-enumerable) set of behaviors B we want to avoid,
  • we have explicit representations for some items in B, mentally understand a bit more, and don't understand or know about the other unwanted behaviors.

Comment by ProgramCrafter (programcrafter) on AI #68: Remarkably Reasonable Reactions · 2024-06-15T10:34:44.136Z · LW · GW

The MVP version is that everyone buys (obviously transferrable) credits, and communications have a credit amount attached. Each person can set a minimum below which communications get filtered out entirely, and the target can see the credit bid when determining whether to open the message. Once they open the message, they can choose to keep the credits, do nothing or tip the person back depending on whether they found the interaction helpful.

By the way, such technology already exists and is called "blockchain": it allows sending public or semi-public (encrypted) messages to anyone but requires paying for that, and it allows authenticating the sender (in particular, for forwarded messages).
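A sketch of the non-blockchain core of that MVP (type and function names are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    body: str
    credits_attached: int

def inbox_filter(messages: list[Message], min_bid: int) -> list[Message]:
    """Drop messages bidding below the recipient's threshold and show the
    rest sorted by bid, highest offers first."""
    kept = [m for m in messages if m.credits_attached >= min_bid]
    return sorted(kept, key=lambda m: m.credits_attached, reverse=True)

def settle(message: Message, helpful: bool, tip: int = 0) -> int:
    """Recipient keeps the credits, or tips the sender back if the
    interaction was helpful; returns the recipient's net credits."""
    return message.credits_attached - (tip if helpful else 0)
```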

Comment by ProgramCrafter (programcrafter) on My AI Model Delta Compared To Yudkowsky · 2024-06-10T20:27:41.803Z · LW · GW

I assume we all agree that the system can understand the human ontology, though?

This, however likely, is not certain. A possible way for the assumption to fail: a system allocates minimal cognitive capacity to its internal ontology and the remaining power to selecting the best actions. This may be a viable strategy if the system's world model is still descriptive enough but has no spare capacity to represent the human ontology fully.

Comment by ProgramCrafter (programcrafter) on minutes from a human-alignment meeting · 2024-05-25T08:34:27.810Z · LW · GW

Just make it in John's self-interest.

That's the first step; the second is to make it more beneficial than the alternatives, preferably by a large margin, so that adversaries can't outbid the norm-following way (as is the case with peer pressure).

Comment by ProgramCrafter (programcrafter) on "Which chains-of-thought was that faster than?" · 2024-05-24T03:27:18.911Z · LW · GW

I'm unsure whether that point should be in the condition, actually; it feels to me like very few chains of thought would then be considered for optimization, so the advice would be useful only for already-self-improving people. Maybe I would replace that point with one that doesn't trigger too often in the same area of life.

Comment by ProgramCrafter (programcrafter) on Why you should learn a musical instrument · 2024-05-18T20:51:43.707Z · LW · GW

What I didn't know is how immediately thought-provoking it would be to learn even the most basic things about playing music. Maybe it's like learning to program, if you used a computer all the time but you never had one thought about how it might work.

That comparison is also thought-provoking! Thinking about it for a minute suggested that programming may be quite similar to playing music, but differs in that in programming you do not need to do most things in any specific order. For example, if I have a dataset of competition participants, it doesn't matter whether I deduplicate names or remove disqualified entries first.

Comment by ProgramCrafter (programcrafter) on Monthly Roundup #18: May 2024 · 2024-05-13T13:28:01.003Z · LW · GW

Reuters: BREAKING: Reuters reports that TikTok’s owner ByteDance would prefer to ‘shut down’ its app in the US rather than sell it if all legal options are exhausted

Eigenrobot: Why would you say this?

It’s odd that a profit maximizing firm would actually pursue this strategy.

 

I'd like to mention the explanation that ByteDance does not consider US dollars valuable enough. Given that China can't use them to lobby for cancelling sanctions, for instance, US dollars aren't equivalent to unspecialized optimization power for ByteDance, and might have little value.

Comment by ProgramCrafter (programcrafter) on Thoughts on the relative economic benefits of polyamorous relationships? · 2024-05-09T19:28:41.784Z · LW · GW

I would guess this is somewhat similar to having a network of friends; a polycule is, if anything, bound to be smaller. And I can totally imagine being emotionally, romantically, and sexually attached to one set of partners while sharing opinions with a slightly different set.

Comment by ProgramCrafter (programcrafter) on Quantum Explanations · 2024-04-29T18:52:05.434Z · LW · GW

I believe the Focus Your Uncertainty essay of the Sequences touches on this topic: at the very least, math is useful for splitting a limited amount of resources.

Comment by ProgramCrafter (programcrafter) on "You're the most beautiful girl in the world" and Wittgensteinian Language Games · 2024-04-28T12:12:41.377Z · LW · GW

Testing status: I've only dated once, because I'm moving to another city to enter university.

The girl I dated was quite pretty but not the most beautiful around. Luckily, I learnt early on that she had read HP:MoR, so I didn't even try to over-hyperbolize and say that she was the most beautiful (both of us would understand that it's false); instead, I smiled at appropriate moments.

Another non-verbal sign is to not dismiss parts of the dialogue. When my girlfriend suggested a few animes to watch and I doubted I would like them, I still visibly wrote them down but avoided promising that I would actually watch them. (I ended up liking one and said so afterwards!)


I have quite a specific perspective on talking, because I notice myself trying to understand others' perspectives and internal belief structures when they don't understand something. Roughly once a month, one of my classmates would ask a strange-looking question, the teacher would answer something related but not the question (like "Why does this approximation work?" - "Here's how you do it..." - "I've understood how to calculate it, but why is it the answer?"), and afterwards I would try to patch the underlying belief structure.

Comment by ProgramCrafter (programcrafter) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-22T14:57:16.349Z · LW · GW

Speaking of next steps, I'd love to see a transformer trained to manipulate those states (given a target state and the interactor's tokens, it would emit its own tokens for interleaving)! I believe this would look even cooler, and may be useful for detecting whether an AI starts to manipulate someone.

Comment by ProgramCrafter (programcrafter) on "You're the most beautiful girl in the world" and Wittgensteinian Language Games · 2024-04-21T15:17:11.192Z · LW · GW

I'd say that it doesn't carve reality in the same places as my understanding does. I neither upvoted nor downvoted the post, but I had to consciously remember that I have that option at all.


I think that language usage can be represented as a vector in the basis of two modes:

  1. "The Fiat": words really have meanings, and the goal of communication is to transmit information (including requests, promises, etc.!),
  2. "Non-Fiat": you simply attempt to say a phrase that makes other people do something that furthers your goal, like identifying with a social group (see Belief as Attire) or making non-genuine promises.

(Note 1: if someone asked me what mode I commonly use, I would think. Think hard.)

(Note 2: I've found a whole tag about the motivations that produce words - https://www.lesswrong.com/tag/simulacrum-levels! I had lost it for some time before writing this comment.)

In life, I try to communicate fewer hyperboles and replace them with non-verbal signs, which do not carry the implication of either "the most beautiful" or "more beautiful than everyone around".

Comment by ProgramCrafter (programcrafter) on hydrogen tube transport · 2024-04-19T05:47:04.441Z · LW · GW

Maybe vehicles would need to carry some shaped charges to cut a hole in the tube in case of emergency.

That would likely create sparks, and once the tube has been cut, the hydrogen is going to explode.

Comment by ProgramCrafter (programcrafter) on When is a mind me? · 2024-04-19T04:13:05.379Z · LW · GW

*preferably not the last state but some where the person felt normal.

I believe that's right! Though, if a person can be reconstructed from N bits of information, and a dead body retains K << N of them, then we need to save the remaining N-K bits (or maybe all N, for robustness) somewhere else.

It's an interesting question how many bits can actually be inferred from a person's social-network trace.

Comment by ProgramCrafter (programcrafter) on ProgramCrafter's Shortform · 2024-04-18T16:29:13.447Z · LW · GW

Continuing to make posts into songs! I believe I'm getting a bit better, mainly at rapid lyrics-writing; I would appreciate pointers on how to improve further.

  1. https://suno.com/song/ef734c80-bce6-4825-9906-fc226c1ea5b4 (based on post Don't teach people how to reach the top of a hill)
  2. https://suno.com/song/c5e21df5-4df7-4481-bbe3-d0b7c1227896 (based on post Effectively Handling Disagreements - Introducing a New Workshop)

Also, if you are against me creating a musical form of your post, please say so! I don't know beforehand which texts will seem easily convertible to me.

Comment by ProgramCrafter (programcrafter) on Should we maximize the Geometric Expectation of Utility? · 2024-04-17T11:39:08.745Z · LW · GW

This is especially concerning if we, as good Bayesians, refuse to assign a zero probability to any event, including zero utility ones.

 

I feel that since people don't ultimately care about money, all-nonzero probabilities will make all events have nonzero utility as well.

Comment by ProgramCrafter (programcrafter) on Non-ultimatum game problem · 2024-04-12T22:15:01.912Z · LW · GW

Let's solve this problem without referring to particular existing cases.

To start, we assume that the utility of A-the-thief monotonically decreases with time served; the utility of B monotonically increases with it, and if A gives up the lollipops, B's utility is increased by another constant.

Let's graph what choices A has when A does not give up the lollipops.

We may notice that in this case B will simply throw A in jail for life. Well, what happens if A is willing to cooperate a bit?

  1. The coordination result will not be below or to the left of the Pareto frontier, since otherwise it would be possible to do better;
  2. The coordination result will not be below or to the left of the no-coordination result, since otherwise one or both parties would be acting irrationally.

We may see that after applying these conditions, only a piece of the curve "A gives up lollipops" remains. The exact bargaining point can then be found using ROSE values, but we can already conclude that A will likely be convicted for a long time, but not for life.
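A small sketch of those two filtering conditions (all utilities illustrative; outcomes are pairs of (utility_A, utility_B)):

```python
no_deal = (-30.0, 6.0)                 # life sentence, no lollipops given up
deals = [(-5.0, 4.0), (-12.0, 6.5), (-18.0, 7.5), (-25.0, 8.5)]

def dominated(p, options):
    """True if some other option is at least as good for both parties."""
    return any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in options)

candidates = [
    p for p in deals
    if not dominated(p, deals)                     # 1. on the Pareto frontier
    and p[0] >= no_deal[0] and p[1] >= no_deal[1]  # 2. no worse than no deal
]
print(candidates)   # long-but-not-life sentences survive both filters
```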

Comment by ProgramCrafter (programcrafter) on ProgramCrafter's Shortform · 2024-04-04T19:10:10.091Z · LW · GW

LessWrong's AI-generated album was surprisingly nice and, even more importantly, pointed me to the song generator! (I had tried to find one a year ago and failed.)

So I've decided to try my hand at the quantum mechanics sequence. Here's what I have so far: https://app.suno.ai/playlist/81b44910-a9df-43ce-9160-b062e5b080f8/. (10 songs generated, 3 selected; unfortunately not the best quality.)

Comment by ProgramCrafter (programcrafter) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-04T19:04:36.486Z · LW · GW

I've decided to try my hand at the quantum mechanics sequence! Here's what I have so far: https://app.suno.ai/playlist/81b44910-a9df-43ce-9160-b062e5b080f8/. (10 songs generated, 3 selected; unfortunately not the best quality.)

Comment by ProgramCrafter (programcrafter) on Open Thread Spring 2024 · 2024-03-23T19:48:24.254Z · LW · GW

I came across a poll about exchanging probability estimates with another rationalist: https://manifold.markets/1941159478/you-think-something-is-30-likely-bu?r=QW5U.

You think something is 30% likely but a friend thinks 70%. To what does that change your opinion?

I feel like there could be specially constructed problems where the resulting probability is 0, but I haven't been able to construct an example. Are there any?
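One standard way to pool the two estimates, assuming a shared prior p_0 and independent evidence (my illustration, not from the poll):

```latex
\frac{p}{1-p}
  = \frac{p_0}{1-p_0}
    \cdot \frac{0.3/0.7}{p_0/(1-p_0)}
    \cdot \frac{0.7/0.3}{p_0/(1-p_0)}
% With p_0 = 1/2 the two likelihood ratios cancel and p = 1/2; reaching
% p = 0 would require someone's likelihood ratio to be exactly zero.
```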

Comment by ProgramCrafter (programcrafter) on Alice and Bob is debating on a technique. Alice says Bob should try it before denying it. Is it a fallacy or something similar? · 2024-03-18T12:37:34.466Z · LW · GW

On what basis can Alice assume

Not actually assume, but it's certainly Bayesian evidence (had Bob tried it, he would likely have responded in another way).

Also, :smile: your own comment is a fairly large bit of evidence that you haven't yet read the Sequences (by the way, I recommend doing that). For instance, you can consider different ways of thinking and answer questions 1-4 from their perspectives; that would be evidence about which way is better, though reality is still the ultimate judge of how each situation turns out.

Comment by ProgramCrafter (programcrafter) on Raising children on the eve of AI · 2024-02-23T14:28:52.226Z · LW · GW

I don't think there is anything I can do educationally to better ensure they thrive as adults other than making sure I teach them practical/physical build and repair skills

I think one more thing could be useful; I'd call it "structural rise": across many different spheres of society, large projects are created by combining small parts, and the ways to combine them and test for robustness (programs), stability (organisations), beauty (music), etc. seem pretty common across most of these areas, so I guess they can be learned separately.

Comment by programcrafter on [deleted post] 2024-01-27T10:06:21.324Z

I suppose a 3D whiteboard could be useful in allowing more relevant subjects to be connected to each node (cf. the four-colour problem: countries on a plane can be coloured with 4 colours so that touching ones get different colours, while in space there is no such limit).

Comment by ProgramCrafter (programcrafter) on Saving the world sucks · 2024-01-11T14:40:37.137Z · LW · GW

the best you can do is what you think is personally good

Only insofar as you're an ideal optimizing agent with consistent values and full knowledge; otherwise, actions based on your own thoughts may end up worse than following social heuristics.

That there are no bugs when it comes to values. That you should care about exactly what you want to care about. That if you want to team up and save the world from AI or poverty or mortality, you can, but you don’t have to.

Locally invalid. Values can be terminal (what you ultimately care about) or instrumental, and for most people saving the world is actually instrumental.

There’s value in giving explicit permission to confused newcomers to not get trapped in moral chains, because it’s really easy to hurt yourself doing that.

I think that's true, since memes can be harmful; but there is also value in the reminder that if more people worked to save/improve the world, it would on average be better, and often the simpler way is to do that yourself instead of pushing the meme+responsibility onto others.

Save the world if you want to, but please don’t if you don’t want to.

I'd continue that with "but please don't destroy the world whichever option you choose, since that will interfere with my terminal goals and I'll care about your non-existence".

Comment by ProgramCrafter (programcrafter) on The shard theory of human values · 2024-01-09T10:31:15.374Z · LW · GW

That's certainly an interesting position in the discussion about what people want!

Namely, that actions and preferences are just conditionally activated, and those contextual activations are balanced against each other. That means a person's preference system may be not only incomplete but architecturally incoherent, and moral systems and goals obtained via reflection are almost certainly not total (they will be silent in some contexts), creating a problem for RLHF.

The first assumption, that part of the neurons is basically randomly initialized, can't really be tested well, because all humans are born in a similar gravity field, see similarly structured images in their first days (all "colorful patches" correspond to objects which are continuous, mostly flat, or uniformly round), etc., and that leaves a generic imprint.