almost any improvement will indirectly help at designing AI
That may be too strong a statement. Say some new tool helps improve AI legislation more than AI design; that might turn out to slow down the wheel instead.
Commit to only use superhuman persuasion when arguing towards a valid conclusion via valid arguments, in a manner that doesn't go against the interests of the person being persuaded.
In this plan, how should the AI define what’s in the interest of the person being persuaded? For example, say you have a North Korean soldier who can be persuaded to defect to the West (at the risk of getting the shitty jobs most migrants have) or who can be persuaded to remain loyal to his bosses (at the risk of raising his children in the shitty country most North Koreans have). What set of rules would you suggest?
the asteroid would likely burn up, but perhaps you have a solution for that
Yes, there’s a well-known solution: just make the asteroid fast enough, and it will burn less in the atmosphere.
My understanding of this framework is probably too raw to be sound (a natural latent is a convolution basis useful for analyzing natural inputs, and it’s powerful because function composition is powerful), but it could fit nicely with Agency is what neurons in the biological movement area detect.
That’s a great analogy. To me the strength of the OP is to pinpoint that LLMs already exhibit the kind of general ability we would expect from AGI, and the weakness is to forget that LLMs do not exhibit some specific abilities most thought easy, such as the agency that even clownfish exhibit.
In a way this sounds like the universe telling us, once again, that we should rethink what intelligence is. Chess is hard and doing the dishes is easy? Nope. Language is hard and agency is central? Nope.
Thanks for the prompt! If we ask Claude 3 to be happy about x, don’t you think that counts as nudging it toward implementing a conscious being?
Perhaps you could identify your important beliefs
That part made me think. If I see bright minds falling into this trap, does blindness go with the importance of the belief for that person? I would say yes, I think. As if that’s where we tend to make more of the « mistakes that can behave as ratchets of the mind ». Thanks for the insight!
that also perhaps are controversial
Same exercise: if I see bright minds falling into this trap, does blindness go with controversial beliefs? Definitely! Almost by definition, actually.
each year write down the most likely story you can think of that would make it be wrong
I don’t feel I get this part as well as the previous ones. Suppose I hold the lab-leak view, then notice it’s both controversial (« these morons can’t update right ») and much more important to me (« they don’t get how important it is for the safety of everyone »). What should I write?
Yup. Thanks for trying, but these beliefs seem to form a local minimum, like a trap for rational minds, even very bright ones. Do you think you understand how an aspiring rationalist could 1) recover and get out of this trap, and 2) avoid falling for it in the first place?
To be clear, my problem is not with the possibility of a lab leak itself, it’s with the evaluation that the present evidence is anything but post-hoc rationalization fueled by unhealthy levels of tunnel vision. If bright minds can fall for that on this topic specifically, how do I know I’m not making the same mistake on something else?
(Spoiler warning)
(Also I didn’t check the previous survey nor the comments there, so expect some level of redundancy)
The score itself (8/18) is not that informative, but checking the « accepted » answers is quite interesting. Here are my « errors » and how happy I am to be making them:
You should be on the outlook for people who are getting bullied, and help defend them against the bullies.
I agree some rationalist leaders are toxic characters who will almost inevitably bully their students and collaborators, and I’m happy to keep strongly disagreeing with that. [actually most rationalists taking the survey agree with the statement, see tailcalled below]
Modern food is more healthy because we have better hygiene than people did in the past.
I strongly agree; LW accepts a weaker position. OK, maybe I don’t have the data for a period of time in the past when food was not that bad. (Slight update on both the content and the need to double-check the data myself.)
It should be easy to own guns.
Seriously, is that even a question? Aren’t rationalists supposed to look at the data at some point?
Statistics show that black people still are far from catching up to white people in society.
I strongly agree; LW accepts a weaker position. But what I had in mind was the situation in Canada, so maybe I just don’t have the relevant data for the USA. Yes, that’s my kind of humour.
It is bad to buy and destroy expensive products in the name of "art".
I thought I disagreed, but OK, I admit I could imagine good examples, like a statue made by melting down guns that were used to murder children at school.
You should believe your friends if they tell you they've seen ghosts.
That’s an interesting one. I would not believe that ghosts caused this perception, but I would not deny the perception itself, nor believe that establishing the truth about ghosts is what should occupy my mind in this situation.
Charity organizations should offer their employees dream vacations in the tropics to make the employment more attractive and enjoyable, thereby attracting more people to the charity.
I thought everyone got the memo that this kind of thing hurt the whole movement. Am I wrong, or is the « correct » answer just outdated, from pre-SBF times?
There is no God. Supernatural claims are never true
Those are the only two questions where I expected from the start to differ, for complicated reasons that, I just found this week, were better expressed by Aella. (In short, it’s more useful to keep a sane dose of agnosticism.)
https://knowingless.com/2018/05/02/so-says-crazybrain/
(I should also credit Scott Aaronson for the best argument that QM allows supernatural claims such as « there’s free will »)
Curiosity is for boring nerds.
Meow.
I am an old person. They may not let you do that in chemistry any more.
Absolutely! In my first chemistry lab, a long time ago, our teacher warned us that she had just lost a colleague to cancer at the age of forty, and she swore that if we didn't take the safety protocols very seriously, she would be our fucking nightmare.
I never heard her swear after that.
Not bad! But I stand by « random before (..) » as a better picture, in the following sense: a neuron doesn’t connect once to an address ending in 3. It connects several thousand times to addresses ending in 3. Some connections are on the door, some on the windows, some on the roof, one has been seen trying to connect to the dog, etc. Then it’s pruned, and the result looks not that far from a crystal. Or a convnet.
(there are also long-lasting silent synapses and a bit of neurogenesis, but those are details for another time)
Hmmm, I disagree with the randomness.
I don’t think you do. Let me rephrase: the weights are picked at random, under a distribution biased by molecular cues, then pruned through activity-dependent mechanisms.
In other words, our disagreement seems to count as an instance of Bertrand’s paradox.
The story went that “Perceptrons proved that the XOR problem is unsolvable by a single perceptron, a result that caused researchers to abandon neural networks”. (…) When I first heard the story, I immediately saw why XOR was unsolvable by one perceptron, then took a few minutes to design a two-layered perceptron network that solved the XOR problem. I then noted that the NAND problem is solvable by a single perceptron, after which I immediately knew that perceptron networks are universal since the NAND gate is.
Exactly the same experience and thoughts in my own freshman years (the nineties), including the « but wasn’t that already known? » moment.
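To make that concrete, here’s a minimal sketch of the construction (my own toy illustration; the weights and thresholds are just one valid choice among many): XOR is not linearly separable, so no single threshold unit computes it, but a two-layer network of perceptrons does, and since one perceptron can compute NAND, networks of perceptrons are universal.

```python
import numpy as np

def perceptron(x, w, b):
    """A single threshold unit: fires iff w.x + b > 0."""
    return int(np.dot(w, x) + b > 0)

def nand(a, b):
    # NAND is linearly separable, so one perceptron suffices,
    # and NAND alone is enough to build any Boolean circuit.
    return perceptron([a, b], w=[-1, -1], b=1.5)

def xor(a, b):
    # Two-layer construction: the hidden layer computes OR and NAND,
    # the output unit ANDs them together.
    h1 = perceptron([a, b], w=[1, 1], b=-0.5)      # OR
    h2 = nand(a, b)                                # NAND
    return perceptron([h1, h2], w=[1, 1], b=-1.5)  # AND

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))  # prints the XOR truth table
```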
Rosenblatt’s solution was mainly just randomization because he mistakenly believed that the retina was randomly wired to the visual cortex, and he believed in emulating nature. Rosenblatt was working with the standard knowledge of neuroscience in his time. He could not have known that neural connections were anything but random – the first of the Hubel and Wiesel papers was published only in 1959.
I’ll push back against this. Wiring is actually largely random before the critical periods that prune most synapses, after which what remains has been selected to fit the visual properties of the training environment. One way to mimic that is to pick the delta weights at random and update iff the error diminishes (annealing-style).
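To illustrate the kind of procedure I mean (a toy sketch, not a model of actual development, and not full simulated annealing since it never accepts a worse move): start from random weights, propose random deltas, and keep a delta only if the error diminishes. The tiny network and the sin-fitting task below are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary toy task: fit y = sin(x) with a one-hidden-layer net.
x = np.linspace(-np.pi, np.pi, 50)
y = np.sin(x)

def predict(w, x):
    h = np.tanh(np.outer(x, w["W1"]) + w["b1"])  # 8 hidden units
    return h @ w["W2"] + w["b2"]

def error(w):
    return float(np.mean((predict(w, x) - y) ** 2))

# Start from random weights ("random before pruning")...
w = {"W1": rng.normal(size=8), "b1": rng.normal(size=8),
     "W2": rng.normal(size=8), "b2": rng.normal(size=1)}
err = error(w)

for step in range(20000):
    # ...then propose a random delta and keep it iff the error diminishes.
    trial = {k: v + 0.05 * rng.normal(size=v.shape) for k, v in w.items()}
    trial_err = error(trial)
    if trial_err < err:
        w, err = trial, trial_err

print(err)  # usually ends far below the starting error: selection on random changes suffices
```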
But that’s one nitpick among plenty of food for thought; thanks for the good read!
He can be rough and on rare occasion has said things that could be considered personally disrespectful, but I didn't think that people were that delicate.
You may wish to update on this. I’ve only exchanged a few words with one of the names, but that was enough to make it clear he doesn’t bother being respectful. That may work in some non-delicate research environment I don’t want to know about, but most bright academics I know like to have fun at work, and would leave any non-delicate work environment (unless they make it their personal duty to clean up the place).
What do you think orthogonality thesis is?
I think that’s the deformation of a fundamental theorem (« there exists a universal Turing machine, i.e. it can run any program ») into a practical belief (« an intelligence can pick its values at random »), with a motte-and-bailey game on the meaning of can, where the motte is the fundamental theorem and the bailey is the orthogonality thesis.
(thanks for the link to your own take, i.e. you think it’s the bailey that is the deformation)
Consider the sense in which humans are not aligned with each other. We can't formulate what "our goals" are. The question of what it even means to secure alignment is fraught with philosophical difficulties.
It’s part of the appeal, isn’t it?
If the oversight AI responsible for such decisions about a slightly stronger AI is not even existentially dangerous, it's likely to do a bad job of solving this problem.
I don’t get the logic here. Typo?
So I'm not claiming sudden changes, only intractability of what we are trying to do
That’s a fair point, but the intractability of a problem usually goes with the tractability of a slightly relaxed problem. In other words, it can be both fundamentally impossible to please everyone and fundamentally easy to control paperclip maximizers.
And also an aligned AI doesn't make the world safe until there is a new equilibrium of power, which is a point they don't address, but is still a major source of existential risk. For example, imagine giving multiple literal humans the power of being superintelligent AIs, with no issues of misalignment between them and their power. This is not a safe world until it settles, at which point humanity might not be there anymore. This is something that should be planned in more detail than what we get by not considering it at all.
Well said.
All significant risks are anthropogenic.
You think all significant risks are known?
Also, it seems clear how to intentionally construct a paperclip maximizer: you search for actions whose expected futures have more paperclips, then perform those actions. So a paperclip maximizer is at least not logically incoherent.
Indeed the inconsistency appears only with superintelligent paperclip maximizers. I can be petty with my wife. I don’t expect a much better me would.
Existentially dangerous paperclip maximizers don't misunderstand human goals.
Of course they do. If they didn’t and picked their goal at random, they wouldn’t make paperclips in the first place.
There's this post from 2013 whose title became a standard refrain on this point
I wouldn’t say that’s the point I was making.
This has been hashed out more than a decade ago and no longer comes up as a point of discussion on what is reasonable to expect. Except in situations where someone new to the arguments imagines that people on LessWrong expect such unbalanced AIs that selectively and unfairly understand some things but not others.
That’s a good description of my current beliefs, thanks!
Would you bet that a significant proportion of LW expects strong AIs to selectively and unfairly understand (and defend, and hide) their own goals, while selectively and unfairly failing to understand (and not defending, and defeating) the goals of both the developers and any previous (and upcoming) versions?
If it doesn't have a motive to do that,[ask the AI itself to monitor its well functioning, including alignement and non deceptiveness] it might do a bad job of doing that. Not because it doesn't have the capability to do a better job, but because it lacks the motive to do a better job, not having alignment and non-deceptiveness as its goals.
You realize that this basically defeats the orthogonality thesis, right?
I agree it might do a bad job. I disagree that an AI doing a bad job on this would be anywhere close to hiding its intent.
One way AI alignment might go well or turn out to be easy is if humans can straightforwardly succeed in building AIs that do monitor such things competently, that will nudge AIs towards not having any critical alignment problems. It's unclear if this is how things work, but they might. It's still a bad idea to try with existentially dangerous AIs at the current level of understanding, because it also might fail, and then there are no second chances.
In my view that’s a very honorable point to make. However, I don’t know how to weigh it against its mirror version: we might also not get a second chance to build an AI that will save us from x-risks. What’s your general method for this kind of puzzle?
Consider two AIs, an oversight AI and a new improved AI. If the oversight AI is already existentially dangerous, but we are still only starting work on aligning an AI, then we are already in trouble.
Can we more or less rule out this scenario based on the observation that all the main players nowadays work on aligning their AIs?
If the oversight AI is not existentially dangerous, then it might indeed fail to understand human values or goals, or fail to notice that the new improved AI doesn't care about them and is instead motivated by something else.
That’s completely alien to me. I can’t see how a digital computer could hide its motivation without having been trained specifically for that. We primates have been specifically trained to play deceptive/collaborative games. To think that a random pick of values would push an AI to adopt this kind of behavior sounds a lot like anthropomorphism. To add that it would do so suddenly, with no warning or sign in previous versions and competitors, I have no good word for that. But I guess Pope & Belrose already did a better job explaining this.
Perhaps the position you disagree with is that a dangerous general AI will misunderstand human goals. That position seems rather silly, and I'm not aware of reasonable arguments for it. It's clearly correct to disagree with it, you are making a valid observation in pointing this out.
Thanks! To be honest I was indeed surprised that was controversial.
But then who are the people that endorse this silly position and would benefit from noticing the error? Who are you disagreeing with, and what do you think they believe, such that you disagree with it?
Well, anyone who still believes in paperclip maximizers. Do you feel like it’s an unlikely belief among rationalists? What would be the best post on LW to debunk this notion?
The AI itself doesn't fail, it pursues its own goals. Not pursuing human goals is not AI's failure in achieving or understanding what it wants, because human goals is not what it wants. Its designers may have intended for human goals to be what it wants, but they failed. And then the AI doesn't fail in pursuing its own goals that are different from human goals. The AI doesn't fail in understanding what human goals are, it just doesn't care to pursue them, because they are not its goals. That is the threat model, not AI failing to understand human goals.
That’s indeed better, but yes, I also find this better scenario unsound. Why wouldn’t the designers ask the AI itself to monitor its own functioning, including alignment and non-deceptiveness? Then either it fails by accident (and we’re back to the idiotic intelligence), or we need an extra assumption, like: the AGI will tell us what problem is coming, it will warn us which slightly inconvenient measures could prevent it, and we will still let it happen for petty political reasons. Oh well. I think I’ve just convinced myself the doomers are right.
All that's required is that we aren't able to coordinate well enough as a species to actually stop it.
Indeed, I would be much more optimistic if we were better at dealing with much simpler challenges, like putting a price on pollution and welcoming refugees with humanity.
Thanks 👍
(noice seems to mean « nice », I assume you meant « noise »)
Assuming "their" refers to the agent and not humans,
It refers to humans, but I agree it doesn’t change the disagreement, i.e. a super AI stupid enough to not see a potential misalignment coming is as problematic as the notion of a super AI incapable of understanding human goals.
(Epistemic status: first thoughts after a first reading)
Most of it is very standard cognitive neuroscience, although with more emphasis on some things (the subdivision of synaptic boutons into silent/modifiable/stable, the notion of complex and simple cells in the visual system) than on others (the critical periods, brain rhythms, iso/allocortices, brain symmetry and circuits, etc.). There are one or two bits wrong, but those are nitpicks or my mistake.
The idea of synapses detecting a frequency code is not exactly novel (it is the usual working hypothesis for some synapses in the cerebellum, although the exact code is not known, I think), but the idea that it’s a general principle that works because the synapse recognizes its own noise is either novel or not well known even within cognitive science (it might be a common idea among specialists of synaptic transmission, or original). I find it promising, a bit like when Hebb had the idea of his law.
Impressively promising work, thanks & good luck! Is there anything a layperson can do to help you reach your goal?
More specifically, if the argument that we should expect a more intelligent AI we build to have a simple global utility function that isn't aligned with our own goals is valid then why won't the very same argument convince a future AI that it can't trust an even more intelligent AI it generates will share it's goals?
For the same reason that one can expect a paperclip maximizer could be both intelligent enough to defeat humans and stupid enough to misinterpret their goals, i.e. you need to believe the ability to select goals is completely separate from the ability to reach them.
(Beware: it’s hard and low-status to challenge that assumption on LW.)
Yes, that’s the crux. In my view, we can reverse…
Inability to distinguish noice and patters is true only for BBs. If we are real humans, we can percieve noice as noice with high probability.
… as « Ability to perceive noise means we’re not BB (high probability). »
Can you tell more about why we can’t use our observation to solve this?
That’s an interesting loophole in my reasoning, thanks! But isn’t that in tension with the observation that we can perceive noise as noise?
(yes humans can find spurious patterns in noise, but they never go as far as mistaking white noise for natural pictures)
Yes, although I see that more as an alternative intuition pump rather than a different point.
A true Boltzmann brain may have an illusion of the order in completely random observations.
Sure, like a random screen may happen to look like a natural picture. It’s just exponentially unlikely with picture size, whereas the scenario you suggest is indeed generic in producing brains that look like they evolved from simpler brains.
In other words, you escape the standard argument by adding an observation, i.e. the observation that random fluctuations should almost never make our universe look like it obeys physical laws.
One alternative way to see this point is the following: if (2) our brains are random fluctuations, then they are exponentially unlikely to have been created long ago, whereas if (1) it is our observable universe itself that comes from random fluctuations, it could equally well have been created 10 billion years or 10 seconds ago. Counting then makes (1) much more likely than (2).
0% that the tool itself will make the situation with the current comment ordering and discourse on platforms such as Twitter, Facebook, YouTube worse.
Thanks for the detailed answer, but I’m more interested in polarization per se than in the value of comment ordering. Indeed, we could imagine that your tool feels like it behaves as well as you wanted, but that it makes the memetic world less diverse and thus more fragile (the way monocultures tend to collapse now and then). What’d be your rough range for this larger question?
This is sort of what is customary to expect, but leaning into my optimism bias, I should plan as if this is not the case. (Otherwise, aren’t we all doomed, anyway?)
In your opinion, what are the odds that your tool would make polarization worse? (What’s wrong with keep looking for better plans?)
Nothing at all. I’m a big fan of this kind of idea and I’d love to present yours to some friends, but I’m afraid they’ll get dismissive if I can’t translate your thoughts into their usual frame of reference. But I get that you didn’t work on this aspect specifically; there are many fields in cognitive science.
As for how much specificity, that’s up to interpretation. A (1k by 1k by frame by cell type by density) tensor representing the cortical columns within the granular cortices is indeed a promising interpretation, although it would probably still be missing an extrapyramidal tensor (and maybe an agranular one).
You mean this: "We're not talking about some specific location or space in the brain; we're talking about a process."
You mean there’s some key difference in meaning between your original formulation and my reformulation? Care to elaborate and formulate some specific prediction?
As an example, I once had a go at interpreting data from the olfactory system for a friend who was wondering if we could find signs of a chaotic attractor. If you ever toy with the Lorenz model, one key feature is: you can either see the attractor by plotting x vs y vs z, or you can see it by plotting one of these variables alone vs itself at t+delta vs itself at t+2*delta (for many values of delta). In other words, that gives a precise feature you can look for (I didn’t find any, and nowadays it seems accepted that odors are location-specific, like every other sense). Do you have a better idea, or is that more or less what you’d have tried?
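For concreteness, here’s roughly what that check looks like in code (a sketch with arbitrary parameters, not the actual analysis I ran back then): integrate the Lorenz system, then compare the attractor seen in full state space with the delay embedding built from a single variable. If a low-dimensional chaotic attractor underlies a recorded signal, the delay plot should show a similar structure for a wide range of deltas.

```python
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

def lorenz(t, state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

# Integrate the Lorenz system on a fine, regular time grid.
t = np.linspace(0, 100, 20000)
sol = solve_ivp(lorenz, (t[0], t[-1]), [1.0, 1.0, 1.0], t_eval=t)
x, y, z = sol.y

fig = plt.figure(figsize=(10, 4))

# View 1: the attractor in full state space (x vs y vs z).
ax1 = fig.add_subplot(121, projection="3d")
ax1.plot(x, y, z, lw=0.3)
ax1.set_title("state space")

# View 2: delay embedding of x alone, x(t) vs x(t+delta) vs x(t+2*delta).
delta = 30  # in samples; many values of delta reveal the same structure
ax2 = fig.add_subplot(122, projection="3d")
ax2.plot(x[:-2 * delta], x[delta:-delta], x[2 * delta:], lw=0.3)
ax2.set_title("delay embedding of x")

plt.show()
```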
Is accessing the visual cartesian theater physically different from accessing the visual cortex? Granted, there's a lot of visual cortex, and different regions seem to have different functions. Is the visual cartesian theater some specific region of visual cortex?
In my view: yes, no. To put some flesh on the bones, my working hypothesis is: what’s conscious is the gamma activity within an isocortex connected to the claustrum (because that’s the information that will get selected for the next conscious frame / that can be considered as being in working memory).
I'm not sure what your question about ordering in sensory areas is about.
You said: what matters is temporal dynamics. I said: why so many maps if what matters is timing?
Why do we seem to have different kinds of information in different layers at all? That's what interests me.
The closer to the input, the more sensory. The closer to the output, the more motor. The closer to the restrictions, the easier to interpret the activity as a latent space. Is there any regularity that you find hard to interpret this way?
Finally, here's an idea I've been playing around with for a long time:
Thanks, I’ll go read. Don’t hesitate to add other links that can help understand your vision.
I'm willing to speculate that [6 Hz to 10 Hz ]that's your 'one-shot' refresh rate.
It’s possible. I don’t think there was relevant human data in Walter Freeman’s time, so I’m willing to speculate that’s indeed the frame rate in mice. But I didn’t check the literature he had access to, so it’s just a wild guess.
the imagery of the stage 'up there' and the seating area 'back here' is not at all helpful
I agree there’s no seating area. I still find the concept of a cartesian theater useful. For example, it tells you where to plant electrodes if you want to access the visual cartesian theater for rehabilitation purposes. I guess you’d agree that can be helpful. 😉
We're not talking about some specific location or space in the brain; we're talking about a process.
I have friends who believe that, but they can’t explain why the brain needs that much ordering in the sensory areas. What’s your own take?
But what is [the distributed way]that?
You know the backprop algorithm? That’s a mathematical model of the distributed way. It was recently shown that it produces networks that explain (statistically speaking) most of the properties of the BOLD cortical response in our visual systems. So, whatever the biological cortices actually do, it turns out equivalent as far as the « distributed memory » aspect goes.
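A minimal sketch of what « the distributed way » means here (toy sizes, arbitrary random patterns, nothing biological about it): a tiny autoencoder trained with backprop stores a handful of binary patterns in its weights, and no single weight holds any one pattern; each memory is spread across all of them.

```python
import numpy as np

rng = np.random.default_rng(1)

# Five random 12-bit patterns to memorize.
patterns = rng.integers(0, 2, size=(5, 12)).astype(float)

n_in, n_hid = 12, 8
W1 = rng.normal(0, 0.5, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.5, (n_hid, n_in)); b2 = np.zeros(n_in)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

lr = 1.0
for _ in range(10000):
    h = sigmoid(patterns @ W1 + b1)        # forward pass
    out = sigmoid(h @ W2 + b2)
    err = out - patterns                   # backprop of the squared error
    d_out = err * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out / len(patterns); b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * patterns.T @ d_h / len(patterns); b1 -= lr * d_h.mean(axis=0)

# The memories now live only in W1, b1, W2, b2, spread across every weight.
recalled = sigmoid(sigmoid(patterns @ W1 + b1) @ W2 + b2)
print(int(np.abs(np.round(recalled) - patterns).sum()))  # typically 0 bit errors
```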
Or that's my speculation.
I wonder if that’s too flattering for connectionism, which mostly stalled until the early breakthroughs in computer vision suddenly attracted every lab. BTW:
A few comments now, more later. 😉
What I meant was that the connectionist alternative didn't really take off until GPUs were used, making massive parallelism possible.
Thanks for the clarification! I guess you already noticed how research centers in cognitive science seem to have a failure mode over a specific value question: do we seek excellence at the risk of overfitting funding-agency criteria, or do we seek fidelity to our interdisciplinary mission at the risk of compromising growth?
I certainly agree that, before the GPUs, the connectionist approach had a very small share of the excellence tokens. But it was already instrumental in providing a common conceptual framework beyond cognitivism. As an example, even the first PCs were enough to run toy examples of double dissociation using networks structured by sensory type rather than by cognitive operation. From a neuropsychological point of view, that was already a key result. And for the neuroscientist in me, toy models like Kohonen maps were already key to making sense of why we need so many short-range inhibitory neurons in grid-like cortical structures.
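For readers who never played with them, here’s a toy Kohonen map (my own minimal sketch, not any specific published model; the grid size, schedules and 2-D input are arbitrary): units on a grid compete for each input, the winner and its neighbors move toward it, and the neighborhood shrinks over time, which is where the short-range inhibition comes into the analogy.

```python
import numpy as np

rng = np.random.default_rng(0)

grid = 10
weights = rng.random((grid, grid, 2))   # one 2-D weight vector per grid unit
coords = np.stack(np.meshgrid(np.arange(grid), np.arange(grid),
                              indexing="ij"), axis=-1)

n_steps = 5000
for step in range(n_steps):
    x = rng.random(2)                                   # random 2-D input sample
    # Competition: find the best-matching unit.
    dists = np.linalg.norm(weights - x, axis=-1)
    winner = np.unravel_index(np.argmin(dists), dists.shape)
    # Cooperation: the winner and its neighbors move toward the input,
    # with a neighborhood radius and learning rate that shrink over time.
    sigma = 3.0 * (1 - step / n_steps) + 0.5
    lr = 0.5 * (1 - step / n_steps) + 0.01
    grid_dist = np.linalg.norm(coords - np.array(winner), axis=-1)
    influence = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))
    weights += lr * influence[..., None] * (x - weights)

# After training, neighboring units respond to neighboring inputs:
# the grid has unfolded into a topographic map of the input space.
print(weights[0, 0], weights[0, -1], weights[-1, 0], weights[-1, -1])
```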
Going back to Yevick, in her 1975 paper she often refers to holographic logic as 'one-shot' logic, meaning that the whole identification process takes place in one operation, the illumination of the hologram (i.e. the holographic memory store) by the reference beam. The whole memory 'surface' is searched in one unitary operation.
Like a refresh rate? That would fit the evidence for a 3-7 Hz refresh rate of our cartesian theater, or the way LLMs go through prompt/answer cycles. Do you see other potential uses for this concept?
We've got to understand how the memory is structured so that that is possible.
What’s wrong with « the distributed way »?
When I hear « conventional, sequential, computational regime », my understanding is « the way everyone was trying before parallel computation revolutionized computer vision ». What’s your definition, such that using GPUs still counts as sequential?
Thanks, I didn’t know this perspective on the history of our science. The stories I heard most were indeed more about the HH model, the Hebb rule, Kohonen maps, RL, and then connectionism becoming deep learning…
If the object tends toward geometrical simplicity – she was using identification of visual objects as her domain – then a conventional, sequential, computational regime was most effective.
…but neural networks did refute that idea! I feel like I’m missing something here, especially since you then mention GPUs. Was « sequential » a typo?
Our daily whims might be a bit inconsistent, but our larger goals aren't.
It’s a key article of faith I used to share, but I’m now agnostic about it. To take a concrete example, everyone knows that blues and reds get more and more polarized. A grey type like old me would have thought there must be an objective truth to extract, with elements from both sides. Now I’m wondering if ethics should end with: no truth can help decide whether future humans should be able to live like bees, or like dolphins, or like the blues, or like the reds, especially when living like the reds means eating the blues, and living like the blues means eating the dolphins and saving the bees. But I’m very open to hearing new heuristics for tackling this kind of question.
And we can get those goals into AI - LLMs largely understand human ethics even at this point.
Very true, unless we nitpick definitions for « largely understand ».
And what we really want, at least in the near term, is an AGI that does what I mean and checks.
Very interesting link, thank you.
Fascinating paper! I wonder how much they would agree that holography means sparse tensors and convolution, or that intuitive versus reflexive thinking basically amounts to the visuo-spatial sketchpad versus the phonological loop. Can’t wait to hear which other ideas you’d like to import from this line of thought.
I have no idea whether or not Hassibis is himself dismissive of that work
Well that’s a problem, don’t you think?
but many are.
Yes, speaking as a cognitive neuroscientist myself, you’re right that many within my generation tend to dismiss symbolic approaches. We were students during a winter that many of us thought was caused by the over-promising and under-delivering of the symbolic approach, with Minsky as the main reason for the slow start of neural networks. I bet you have a different perspective. What are your three best points for changing my generation’s view?
Because I agree, and because « strangely » sounds to me like « with inconsistencies ».
In other words, in my view the orthodox view on orthogonality is problematic because it supposes that we can pick at will within the enormous space of possible functions, whereas the set of intelligent behaviors that we can construct is more likely sparse and, by default, describable using game theory (think tit-for-tat).
This is a sort of positive nihilism. Because value is not inherent in the physical world, you can assign value to whatever you want, with no inconsistency.
Say we construct a strong AI that attributes a lot of value to a specific white noise screenshot. How would you expect it to behave?
Your point is « Good AIs should have a working memory, a concept that comes from psychology ».
DH’s point is « Good AIs should have a working memory, and the way to implement it was based on concepts taken from neuroscience ».
Those are indeed orthogonal notions, if you will.
I’m a bit annoyed that Hassabis is giving neuroscience credit for the idea of episodic memory.
That’s not my understanding. To me he is giving neuroscience credit for the ideas that made it possible to implement a working memory in LLMs. I guess he didn’t want to use words like thalamocortical, but from a neuroscience point of view, transformers indeed look inspired by the isocortex, e.g. by the idea that a general distributed architecture can process any kind of information relevant to a human cognitive architecture.
I’d be happy if you could point out a non-competitive one, or explain why my proposal above does not obey your axioms. But we seem to be getting diminishing returns sorting these questions out, so maybe it’s time to close here and wish you luck. Thanks for the discussion!
Saying « fuck you » is helpful when the aim is to exclude whoever disagrees with your values. This is often instrumental for constructing a social group, or for getting accepted into a social group that includes high-status toxic characters. I take « be nice » as the claim that there are always better objectives.
This is aiming at a different problem than goal agnosticism; it's trying to come up with an agent that is reasonably safe in other ways.
Well, assuming a robust implementation, I still think it obeys your criteria, but now that you mention « restrictive », my understanding is that you want this expression to refer specifically to pure predictors. Correct?
If yes, I’m not sure that’s the best choice for clarity (why not « pure predictors »?), but of course that’s your choice. If not, can you give some examples of goal-agnostic agents other than pure predictors?
You forgot to explain why these arguments only apply to strangers. Is there a reason to think medical research and economic incentives are better when it’s a family member who needs a kidney?
Nope, my social media presence is very, very low. But I’m open to suggestions, since I realized there are a lot of toxic characters with high status here. Did you try the EA Forum? Is it better?
(The actual question is about your best utilitarian model, not your strategy given my model.)
A uniform distribution of kidney donation also sounds like the result when a donor is 10^19 times more likely to set the example. Maybe I should specify that the donor is unlikely to take the 1% risk unless someone else is more critical to the war effort.