Predicting Alignment Award Winners Using ChatGPT 4 2024-02-08T14:38:37.925Z
Discussion Meetup 2024-02-07T10:03:04.958Z
New Years Meetup (Zwolle) 2023-12-30T11:23:33.414Z
Mini-Workshop on Applied Rationality 2023-10-11T09:13:00.325Z
United We Align: Harnessing Collective Human Intelligence for AI Alignment Progress 2023-04-20T23:19:01.229Z
March - Social Meetup 2023-03-04T20:19:30.626Z
Short Notes on Research Process 2023-02-22T23:41:45.279Z
February Online Meetup 2023-02-11T05:45:09.464Z
Reflections on Deception & Generality in Scalable Oversight (Another OpenAI Alignment Review) 2023-01-28T05:26:49.866Z
A Simple Alignment Typology 2023-01-28T05:26:36.660Z
Optimizing Human Collective Intelligence to Align AI 2023-01-07T01:21:25.328Z
Announcing: The Independent AI Safety Registry 2022-12-26T21:22:18.381Z
New Years Social 2022-12-26T01:22:31.930Z
Loose Threads on Intelligence 2022-12-24T00:38:41.689Z
Research Principles for 6 Months of AI Alignment Studies 2022-12-02T22:55:17.165Z
Three Alignment Schemas & Their Problems 2022-11-26T04:25:49.206Z
Winter Solstice - Amsterdam 2022-10-13T12:52:22.337Z
Deprecated: Some humans are fitness maximizers 2022-10-04T19:38:10.506Z
Let's Compare Notes 2022-09-22T20:47:38.553Z
Overton Gymnastics: An Exercise in Discomfort 2022-09-05T19:20:01.642Z
Novelty Generation - The Art of Good Ideas 2022-08-20T00:36:06.479Z
Cultivating Valiance 2022-08-13T18:47:08.628Z
Alignment as Game Design 2022-07-16T22:36:15.741Z
Research Notes: What are we aligning for? 2022-07-08T22:13:59.969Z
Naive Hypotheses on AI Alignment 2022-07-02T19:03:49.458Z
July Meet Up - Utrecht 2022-06-22T21:46:13.752Z


Comment by Shoshannah Tekofsky (DarkSym) on Predicting Alignment Award Winners Using ChatGPT 4 · 2024-02-08T17:36:19.802Z · LW · GW

Oh, that does help to know, thank you!

Comment by Shoshannah Tekofsky (DarkSym) on New Years Meetup (Zwolle) · 2024-01-13T10:37:25.746Z · LW · GW

Hi! Commenting so everyone gets a message about this:

Location is The Refter in Zwolle at Bethlehemkerkplein 35a, on the first floor!

If you have trouble finding it, feel free to ping me here, on the Discord, or in the WhatsApp group. A link to the Discord can be found below!

Comment by Shoshannah Tekofsky (DarkSym) on Mini-Workshop on Applied Rationality · 2023-10-21T11:22:17.395Z · LW · GW

We are moving to Science Park Library

Comment by Shoshannah Tekofsky (DarkSym) on Mini-Workshop on Applied Rationality · 2023-10-20T21:38:34.113Z · LW · GW

The ACX meeting on the same day is unfortunately cancelled. For that reason, we are extending the sign-up deadline:

If you have a confirmation email, then you can definitely get in.

Otherwise, fill out the form and we'll select 3 people for the remaining spots. If people show up without signing up, they can get in if we are below 20. If we are at 20 or more, then no dice :D

(Currently 17)

Comment by Shoshannah Tekofsky (DarkSym) on Mini-Workshop on Applied Rationality · 2023-10-18T11:12:55.981Z · LW · GW

Update: So far 11 people have been confirmed for the event. If you filled out the sign-up form but did not receive a confirmation email, and you think you should have, please DM me here on LW.

The last review cycle will be Friday morning, so if you want to attend, be sure to fill out the form before then.

Looking forward to seeing you there!

Comment by Shoshannah Tekofsky (DarkSym) on Mini-Workshop on Applied Rationality · 2023-10-16T14:43:47.744Z · LW · GW

Here is the sign-up form. Please fill it out before Friday. People who are accepted into the workshop will receive an email to that effect.

Comment by Shoshannah Tekofsky (DarkSym) on Mini-Workshop on Applied Rationality · 2023-10-14T15:32:30.823Z · LW · GW

We have hit 15 signups!

Keep an eye on your inboxes for the signup form.

Comment by Shoshannah Tekofsky (DarkSym) on United We Align: Harnessing Collective Human Intelligence for AI Alignment Progress · 2023-04-22T06:08:00.063Z · LW · GW

Well damn... Well spotted.

I found the full-text version and will dig into this next week to see what's up exactly.

Comment by Shoshannah Tekofsky (DarkSym) on United We Align: Harnessing Collective Human Intelligence for AI Alignment Progress · 2023-04-21T18:12:52.230Z · LW · GW

Thank you! I wholeheartedly agree to be honest. I've added a footnote to the claim, linking and quoting your comment. Are you comfortable with this?

Comment by Shoshannah Tekofsky (DarkSym) on United We Align: Harnessing Collective Human Intelligence for AI Alignment Progress · 2023-04-21T05:28:33.323Z · LW · GW

Oooh gotcha. In that case, we are not remotely any good at avoiding the creation of unaligned humans either! ;)

Comment by Shoshannah Tekofsky (DarkSym) on United We Align: Harnessing Collective Human Intelligence for AI Alignment Progress · 2023-04-21T01:48:09.415Z · LW · GW

Could you paraphrase? I'm not sure I follow your reasoning... Humans cooperate sufficiently to generate collective intelligence, and they cooperate sufficiently due to a range of alignment mechanics between humans, no?

Comment by Shoshannah Tekofsky (DarkSym) on Fucking Goddamn Basics of Rationalist Discourse · 2023-02-04T05:41:32.910Z · LW · GW

Should we have a "rewrite the Basics of Rationalist Discourse" contest?

Not that I think anything is gonna beat this. But still :D

PS: It can be content, style, or both

Comment by Shoshannah Tekofsky (DarkSym) on A Simple Alignment Typology · 2023-01-30T23:50:04.547Z · LW · GW

Thank you! I appreciate the in-depth comment.

Do you think any of these groups hold that all of the alignment problem can be solved without advancing capabilities?

Comment by Shoshannah Tekofsky (DarkSym) on Reflections on Deception & Generality in Scalable Oversight (Another OpenAI Alignment Review) · 2023-01-30T23:29:02.901Z · LW · GW


And I appreciate the correction -- I admit I was confused about this, and may not have done enough of a deep dive to untangle it properly. Originally I wanted to say "empiricists versus theorists", but I'm not sure where I got the term "theorist" from either.

Comment by Shoshannah Tekofsky (DarkSym) on Reflections on Deception & Generality in Scalable Oversight (Another OpenAI Alignment Review) · 2023-01-30T23:27:11.399Z · LW · GW


And for both examples, how are you conceptualizing a "new idea"? Cause I suspect we don't have the same model of what an idea is.

Comment by Shoshannah Tekofsky (DarkSym) on Looking for a specific group of people · 2023-01-20T23:44:37.872Z · LW · GW

Two things that worked for me:

  1. Produce stuff, a lot of stuff, and make it findable online. This makes it possible for people to see your potential and reach out to you.

  2. Send an email to anyone you admire asking if they are interested in going for a coffee (if you have the funds to fly out to them) or doing a video call. Explain why you admire them and why this would be high value to you. I did this for 4 people, without filtering on "how likely are they to answer", and one of them said "yeah sure". I think the emails made the others happy too, cause a reasonable subset of people like learning how they have touched others' lives in a positive way.

Comment by Shoshannah Tekofsky (DarkSym) on Optimizing Human Collective Intelligence to Align AI · 2023-01-08T19:50:21.957Z · LW · GW

Even in experiments, I think most of the value is usually from observing lots of stuff, more than from carefully controlling things.

I think I mostly agree with you but have the "observing lots of stuff" categorized as "exploratory studies" which are badly controlled affairs where you just try to collect more observations to inform your actual eventual experiment. If you want to pin down a fact about reality, you'd still need to devise a well-controlled experiment that actually shows the effect you hypothesize to exist from your observations so far.

If you actually go look at how science is practiced, i.e. the things successful researchers actually pick up during PhD's, there's multiple load-bearing pieces besides just that.


Note that a much simpler first-pass on all these is just "spend a lot more time reading others' work, and writing up and distilling our own".

I agree, but if people were both good at finding necessary info as individuals and we had better tools for coordinating (e.g., finding each other and relevant material faster), then that would speed up research even further. And I'd argue that any gain in research speed is as valuable as the same proportional delay in developing AGI.

Comment by Shoshannah Tekofsky (DarkSym) on Looking for Spanish AI Alignment Researchers · 2023-01-07T22:01:22.622Z · LW · GW

There is an EU telegram group where they are, among other things, collecting data on where people are in Europe. I'll DM an invite.

Comment by Shoshannah Tekofsky (DarkSym) on Optimizing Human Collective Intelligence to Align AI · 2023-01-07T03:31:04.821Z · LW · GW

That makes a lot of sense! And I was indeed also thinking of Elicit.

Comment by Shoshannah Tekofsky (DarkSym) on New Years Social · 2023-01-03T19:18:28.678Z · LW · GW

Note: The meetup this month is Wednesday, Jan 4th, at 15:00. I'm in Berkeley currently, and I couldn't see how times were displayed for you guys cause I have no option to change time zones on LW. I apologize if this has been confusing! I'll get a local person to verify dates and times next time (or even set them).

Comment by Shoshannah Tekofsky (DarkSym) on Loose Threads on Intelligence · 2022-12-28T04:28:10.993Z · LW · GW

Did you accidentally forget to add this post to your research journal sequence?

I thought I added it but apparently hadn't pressed submit. Thank you for pointing that out!


  1. optimization algorithms (finitely terminating)
  2. iterative methods (convergent)

That sounds as if they are always finitely terminating or convergent, which they're not. (I don't think you wanted to say they are.)

I was going by the Wikipedia definition:

To solve problems, researchers may use algorithms that terminate in a finite number of steps, or iterative methods that converge to a solution (on some specified class of problems), or heuristics that may provide approximate solutions to some problems (although their iterates need not converge).
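A toy illustration of the distinction in that definition (my own sketch, with made-up function names, not from the Wikipedia article): exhaustive search over a finite candidate set terminates in a known number of steps, while gradient descent is an iterative method that only converges toward the solution in the limit.

```python
# Finitely terminating: exhaustive search over a finite candidate set.
def argmin_exhaustive(f, candidates):
    best, best_val = None, float("inf")
    for x in candidates:
        v = f(x)
        if v < best_val:
            best, best_val = x, v
    return best  # guaranteed optimal after len(candidates) evaluations

# Iterative method: gradient descent on f(x) = (x - 3)^2.
# It approaches the minimizer x = 3 but never "finishes" exactly.
def gradient_descent(grad, x0, lr=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

f = lambda x: (x - 3) ** 2
grad = lambda x: 2 * (x - 3)
print(argmin_exhaustive(f, range(10)))  # 3, exactly
print(gradient_descent(grad, x0=0.0))   # approximately 3.0
```

A heuristic, in this framing, would be something like hill-climbing with random restarts: it may return a good answer quickly, but its iterates need not converge at all.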

I don't quite understand this. What does the sentence "computational optimization can compute all computable functions" mean? Additionally, in my conception of "computational optimization" (which is admittedly rather vague), learning need not take place. 

I might have overloaded the phrase "computational" here. My intention was to point out what can be encoded by such a system. Maybe "coding" is a better word? E.g., neural coding. These systems can implement Turing machines, so they can potentially have the same properties as Turing machines.

these two options are conceptually quite different and might influence the meaning of the analogy. If intelligence computes only a "target direction", then this corresponds to a heuristic approach in which locally, the correct direction in action space is chosen. However, if you view intelligence as an actual optimization algorithm, then what's chosen is not only a direction but a whole path.

I'm wondering if our disagreement is conceptual or semantic. Optimizing a direction instead of an entire path is just a difference in time horizon in my model. But maybe this is a different use of the word "optimize"?


You write "Learning consists of setting the right weights between all the neurons in all the layers. This is analogous to my understanding of human intelligence as path-finding through reality"

  • Learning is a thing you do once, and then you use the resulting neural network repeatedly. In contrast, if you search for a path, you usually use that path only once. 

If I learn the optimal path to work, then I can use it multiple times. I'm not sure I agree with the distinction you are drawing here ... Some problems in life only need to be solved exactly once, but that's the same as anything you learn only being applicable once. I didn't mean to claim the processes are identical, but that they share an underlying structure. Though indeed, this might be an empty intuitive leap with no useful implementation. Or maybe not a good mapping at all.

I do not know what you mean by "mapping a utility function to world states". Is the following a correct paraphrasing of what you mean?

"An aligned AGI is one that tries to steer toward world states such that the neurally encoded utility function, if queried, would say 'these states are rather optimal' "

Yes, thank you.


I don't quite understand the analogy to hyperparameters here. To me, it seems like childbirth's meaning is in itself a reward that, by credit assignment, leads to a positive evaluation of the actions that led to it, even though in the experience the reward was mostly negative. It is indeed interesting figuring out what exactly is going on here (and the shard theory of human values might be an interesting frame for that, see also this interesting post looking at how the same external events can trigger different value updates), but I don't yet see how it connects to hyperparameters.

A hyperparameter is a parameter over other parameters. So say with childbirth: you have a parameter "pain", which is a direct physical signal, and you have a hyperparameter "satisfaction from hard work" that takes "pain" as input, along with some evaluative cognitive process, and outputs reward accordingly. Does that make sense?
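A minimal sketch of that idea (every function name and weight here is my own hypothetical choice, just to make the structure concrete): one reward channel reads the raw signal directly, while a higher-order channel takes the same raw signal plus a cognitive evaluation as input.

```python
# Hypothetical sketch: a direct signal ("pain") and a higher-order
# signal that takes the direct signal plus a cognitive judgment as input.

def pain_reward(pain):
    # Direct physical channel: pain is pure negative reward.
    return -pain

def satisfaction_reward(pain, judged_meaningful):
    # Higher-order channel: the *same* pain contributes positive
    # reward when a cognitive process judges the effort meaningful.
    return 0.5 * pain if judged_meaningful else 0.0

def total_reward(pain, judged_meaningful):
    return pain_reward(pain) + satisfaction_reward(pain, judged_meaningful)

# Childbirth-like case: high pain, judged deeply meaningful.
print(total_reward(pain=10, judged_meaningful=True))   # -5.0
# Same pain, judged meaningless (e.g., an injury).
print(total_reward(pain=10, judged_meaningful=False))  # -10.0
```

The point of the sketch is only that the higher-order channel modulates how the lower-level signal is credited, not that these particular weights are right.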

What if instead of trying to build an AI that tries to decode our brain's utility function, we build the process that created our values in the first place and expose the AI to this process

Digging in to shard theory is still on my todo list. [bookmarked]

Many models that do not overfit also memorize much of the data set. 

Is this on the sweet spot just before overfitting or should I be thinking of something else?


Thank you for your extensive comment! <3

Comment by Shoshannah Tekofsky (DarkSym) on Announcing: The Independent AI Safety Registry · 2022-12-27T19:52:17.410Z · LW · GW

Oh my, this looks really great. I suspect that between this and the other list of AIS researchers, we're all just taking different cracks at generating a central registry of AIS folk, so we can coordinate at all different levels on knowing what people are doing and who to contact for which kind of connection. However, maintaining such an overarching registry is probably a full-time job for someone with strong organizational and documentation skills.

Comment by Shoshannah Tekofsky (DarkSym) on Announcing: The Independent AI Safety Registry · 2022-12-26T22:16:13.531Z · LW · GW

I'll keep it in mind, thank you!

Comment by Shoshannah Tekofsky (DarkSym) on Announcing: The Independent AI Safety Registry · 2022-12-26T22:14:30.624Z · LW · GW

Great idea!

So my intuition is that letting people edit a publicly linked file invites a high probability of undesirable results (like accidental wipes, unnoticed changes to the file, etc.). I'm open to looking into this if the format gains a lot of traction and people find it very useful. For the moment, I'll leave the file as-is so no one's entry can be accidentally affected by someone else's edits. Thank you for the offer though!

Comment by Shoshannah Tekofsky (DarkSym) on Research Principles for 6 Months of AI Alignment Studies · 2022-12-02T23:40:23.802Z · LW · GW

Thank you for sharing! I actually have a similar response myself but assumed it was not general. I'm going to edit the image out.

Comment by Shoshannah Tekofsky (DarkSym) on A caveat to the Orthogonality Thesis · 2022-11-22T23:18:59.189Z · LW · GW

EDIT: Both points are moot using Stuart Armstrong's narrower definition of the Orthogonality thesis, which he argues in General purpose intelligence: arguing the Orthogonality thesis:

High-intelligence agents can exist having more or less any final goals (as long as these goals are of feasible complexity, and do not refer intrinsically to the agent’s intelligence).

Old post:

I was just working through my own thoughts on the Orthogonality thesis and did a search on LW on existing material and found this essay. I had pretty much the same thoughts on intelligence limiting goal complexity, so yay!

Additional thought I had: learning/intelligence-boosting motivations/goals are positively correlated with intelligence. Thus, given any amount of time, an AI with intelligence-boosting motivations will become smarter than one that does not have that motivation.

It is true that instrumental convergence should lead any sufficiently smart AI to also pursue intelligence-boosting (cognitive enhancement) but:

  • At low levels of intelligence, AI might fail at instrumental convergence strategies.
  • At high levels of intelligence, AI that is not intelligence-boosting will spend some non-zero amount of resources on its actual other goals and thus be less intelligent than an intelligence-boosting AI (assuming parallel universes, and thus no direct competition).

I'm not sure how to integrate this insight into the Orthogonality thesis. It implies that:

"At higher intelligence levels, intelligence-boosting motivations are more likely than other motivations" thus creating a probability distribution across the intelligence-goal space that I'm not sure how to represent. Thoughts?

Comment by Shoshannah Tekofsky (DarkSym) on All AGI safety questions welcome (especially basic ones) [July 2022] · 2022-11-15T10:27:14.583Z · LW · GW

Hmm, that wouldn't explain the different qualia of the rewards, but maybe it doesn't have to. I see your point that they can mathematically still be encoded into one reward signal that we optimize through weighted factors.

I guess my deeper question would be: do the different qualia of different reward signals achieve anything in our behavior that can't be encoded by summing the weighted factors of different reward systems into one reward signal that is optimized?

Another framing here would be homeostasis: if you accept that humans aren't happiness optimizers, then what are we instead? Are the different reward signals more like different "thermostats", where we trade off the values of the different thermostats against each other toward some set point?

Intuitively I think the homeostasis model is true, and would explain our lack of optimizing. But I'm not well versed in this yet and worry that I might be missing how the two are just the same somehow.
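The homeostasis framing could be sketched like this (a toy model of my own, not an established formalism; all names and numbers are illustrative): each drive has a set point, motivation is proportional to the deviation from it, and attention goes to whichever need deviates most, rather than maximizing any single signal without bound.

```python
# Toy homeostasis model: drives pull toward set points instead of
# maximizing a scalar reward.

def drive(level, set_point):
    # Motivation is the deviation from the set point; at the set
    # point, the drive contributes nothing (no urge to "overshoot").
    return set_point - level

def strongest_drive(levels, set_points):
    # Attend to whichever need deviates most from its set point.
    deviations = {k: drive(levels[k], set_points[k]) for k in levels}
    return max(deviations, key=lambda k: abs(deviations[k]))

set_points = {"social": 5, "food": 5, "achievement": 5}
levels = {"social": 5, "food": 2, "achievement": 6}
print(strongest_drive(levels, set_points))  # "food" (largest deviation)
```

Notice the contrast with an optimizer: once "food" is back at its set point, the model simply moves on to the next-largest deviation instead of stockpiling more food reward.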

Comment by Shoshannah Tekofsky (DarkSym) on Estimating the probability that FTX Future Fund grant money gets clawed back · 2022-11-14T08:36:31.434Z · LW · GW

Clawbacks refer to grants that have already been distributed but would need to be returned. You seem to be thinking of grants that haven't been distributed yet. I hope both get resolved but they would require different solutions. The post above is only about clawbacks though.

Comment by Shoshannah Tekofsky (DarkSym) on Estimating the probability that FTX Future Fund grant money gets clawed back · 2022-11-14T05:45:13.968Z · LW · GW

As a grantee, I'd be very interested in hearing what informs your estimate, if you feel comfortable sharing.

Comment by Shoshannah Tekofsky (DarkSym) on Solstice 2022 Roundup · 2022-11-13T06:29:04.735Z · LW · GW


Small celebration in Amsterdam:

Comment by Shoshannah Tekofsky (DarkSym) on All AGI safety questions welcome (especially basic ones) [July 2022] · 2022-11-12T06:28:37.627Z · LW · GW

Sure. For instance, hugging/touch, good food, or finishing a task all deliver a different type of reward signal. You can be saturated on one but not the others and then you'll seek out the other reward signals. Furthermore, I think these rewards are biochemically implemented through different systems (oxytocin, something-sugar-related-unsure-what, and dopamine). What would be the analogue of this in AI?

Comment by Shoshannah Tekofsky (DarkSym) on 7 traps that (we think) new alignment researchers often fall into · 2022-10-08T18:11:30.870Z · LW · GW

Ah, like that. Thank you for explaining. I wouldn't consider that a reversal, cause you're then still converting intuitions into testable hypotheses. But the emphasis on discussion versus experimentation is then reversed indeed.

Comment by Shoshannah Tekofsky (DarkSym) on 7 traps that (we think) new alignment researchers often fall into · 2022-10-08T09:53:21.264Z · LW · GW

What would be the sensible reverse of number 5? I can generate them for 1-4 and 6, but I am unsure what the benefit could be of confusing intuitions with testable hypotheses.

Comment by Shoshannah Tekofsky (DarkSym) on Deprecated: Some humans are fitness maximizers · 2022-10-07T08:00:25.347Z · LW · GW

I really appreciate that thought! I think there were a few things going on:

  • Definitions and degrees: I think in common speech and intuition, failing to pick the optimal option doesn't mean something is not an optimizer. This goes back to the definitional confusion, where "optimizer" in CS or math literally means picking the best option to maximize X, no matter the other concerns, while in daily life, if someone says they optimize on X, then trading off against lower concerns at some value greater than zero is still considered optimizing. E.g., someone might optimize their life for getting the highest grades in school by spending every waking moment studying or doing self-care, while still spending one evening a week with a romantic partner. In regular parlance and intuition, this person is said to be an optimizer because the concept is weighed in degrees (you are optimizing more on X) instead of absolutes (you are disregarding everything except X).
  • Unrepresented internal experience: I do actually experience something like a conscious IGF optimization drive. All the responses and texts I've read so far are from people who say they don't, which made me assume the missing piece was people's awareness of people like myself. I'm not a perfect optimizer (see the definitional considerations above), but there are a lot of experiences and motivations that seemed not to be covered in the original essay or comments. E.g., I experience a strong sense of identity shift where, since having children, I experience myself as a sort of intergenerational organism. My survival- and flourishing-related needs internally feel secondary to those of the aggregate of the bloodline I'm part of. This shift happened to me during my first pregnancy and is quite a disorienting experience. It seemed to point so strongly at IGF optimization that claiming we don't do that felt patently wrong. From the examples, I can now see that it's still a matter of degrees, and I still wouldn't take every possible action to maximize the number of copies of my genes in the next generation.
  • Where we are now versus where we might end up: people did agree we might end up being IGF maximizers eventually. I didn't see this point made in the original article, and I thought the concern was that training can never work to create inner alignment. Apparently that wasn't the point, haha.

Does that make sense? Curious to hear your thoughts.

Comment by Shoshannah Tekofsky (DarkSym) on Deprecated: Some humans are fitness maximizers · 2022-10-06T19:37:25.538Z · LW · GW

good to know, thank you!

Comment by Shoshannah Tekofsky (DarkSym) on Deprecated: Some humans are fitness maximizers · 2022-10-06T19:37:08.007Z · LW · GW

On further reflection, I changed my mind (see title and edit at top of article). Your comment was one of the items that helped me understand the concepts better, so just wanted to add a small thank you note. Thank you!

Comment by Shoshannah Tekofsky (DarkSym) on Deprecated: Some humans are fitness maximizers · 2022-10-06T19:20:18.323Z · LW · GW


On that note, I was wondering if there is any way I could tag the people who engaged me on this (cause it's spread between 2 articles), just so I can say thanks? It seems like the right thing to do to high-five everyone after a lost duel or something. There is some sentiment where a lightweight acknowledgement/update would be a useful thing to deliver in this case, I feel, to signal that people's comments actually had an effect. DM'ing everyone or replying to each comment again would give everyone a notification but generates a lot of clutter and overhead, so tagging seemed like a good route.

Comment by Shoshannah Tekofsky (DarkSym) on Deprecated: Some humans are fitness maximizers · 2022-10-06T18:55:07.394Z · LW · GW

I wasn't sure how I hadn't argued that, but between all the different comments, I've now pieced it together. I appreciate everyone engaging me on this, and I've updated the essay to "deprecated" with an explanation at the top that I no longer endorse these views.

Comment by Shoshannah Tekofsky (DarkSym) on Deprecated: Some humans are fitness maximizers · 2022-10-06T18:53:04.830Z · LW · GW

Thank you. Between all the helpful comments, I've updated my point of view and marked this essay as deprecated, with an explanation + acknowledgement at the top.

Comment by Shoshannah Tekofsky (DarkSym) on Deprecated: Some humans are fitness maximizers · 2022-10-05T20:10:05.660Z · LW · GW

The surrogacy example originally struck me as very unrealistic, cause I presumed it was mostly illegal (it is in Europe, but apparently not in some states of the US) and heavily frowned upon here for ethical reasons (but possibly not in the US?). So my original reasoning was that you'd get in far more trouble for applying for many surrogates than for swapping out sperm at the sperm bank.

I guess if this is not the case, then it might have been a fetish for those doctors? I'm slightly confused now about what internal experience put them up to it, if they'd eschew surrogates while those are legal and socially acceptable in parts of the US.

The other options just seem like relatively risky endeavors that are liable to blow up their successful sperm-swapping projects.

Comment by Shoshannah Tekofsky (DarkSym) on Humans aren't fitness maximizers · 2022-10-05T12:32:29.537Z · LW · GW

Yes, good point. I was looking at those statistics for a bit. Poorer parents do indeed tend to maximize their number of offspring no matter the cost, while richer parents do not. It might be that parents overestimate the IGF payoffs of quality, but that just makes them bad/incorrect optimizers. It wouldn't make them less of an optimizer.

I think there are also some other subtle nuances going on. For instance, I'd consider myself fairly close to an IGF optimizer, but I don't care about all genes/traits equally. There is a multigenerational "strain" I identify strongly with. A bloodline, you could say. But my mediocre eyesight isn't part of that, and I'd be surprised to hear this mechanic working any differently for others. Also, I'm not sure all of the results of quality maximizers are obvious. E.g., Dutch society has a handful of extremely rich people who became rich 400 years ago during the golden age. Their bloodlines have kept the money made back then, and the wealth increases every generation. Such a small segment is impossible to represent in controlled experiments, but maybe richer parents do start moving toward trying to "buy these lottery tickets" of reproduction, hoping to move their 1-2 kids into the stratosphere. It's not like they need 10 kids to be sure they will be represented in the next generation, cause their kids will survive regardless.

Either way, I also realized I'm probably using a slightly different definition of optimizer than Nate is, so that probably explains some of the disagreement as well. I'd consider knowing X is the optimal action but being unable to execute X, cause you feel too much fear, to still be in line with an optimizer's behavior: you are optimizing over the options you have, and a fear response limits your options. I suspect my perspective is not that uncommon and might explain some of the pushback Nate is referring to for a claim that is obvious under his definition.

Comment by Shoshannah Tekofsky (DarkSym) on Humans aren't fitness maximizers · 2022-10-05T12:23:52.041Z · LW · GW

I think the notion that people are adaptation-executors, who like lots of things a little bit in context-relevant situations, predicts our world more than the model of fitness-maximizers, who would jump on this medical technology and aim to have 100,000s of children soon after it was built.

I think this skips the actual social trade-offs of the strategy you outline above:

  1. The likely backlash in society against any woman who tries this is very high. Any given rich woman would have to find surrogate women who are willing to accept the money, and avoid being the target of social condemnation or punitive measures of the law. It's a high-risk/high-reward strategy that also needs to keep paying off long after she is dead, as her children might be shunned or lose massive social capital as well. If you consider people's response to eugenics or gene editing of human babies, then you can imagine the backlash if a woman actually paid surrogates at scale. It's not clear to me that the strategy you outline above is actually all that viable for the vast majority of rich women.
  2. I'd argue some of us are IGF maximizers for the hand we have been dealt, which includes our emotional responses, intelligence, and other traits. Many of us have things like fear responses so heavily hard-wired that no matter what we recognize as the optimal response, we can't actually physically execute it.

I realize item 2 points to a difference in how we might define an optimizer, but it's worth disambiguating this. I suspect claiming that no humans are IGF maximizers, or that some humans are, might come down to the definition of maximizer that one uses. And that might explain the pushback that Nate runs into for a claim he finds self-evident.

Comment by Shoshannah Tekofsky (DarkSym) on Deprecated: Some humans are fitness maximizers · 2022-10-05T12:09:11.843Z · LW · GW

My claim was purely that some people do actually optimize on this. It's just fairly hard, and their success also relies on how their abilities to game the system compares to how strong the system is. E.g. There was that fertility doctor that just used his own sperm all the time, for instance.

Comment by Shoshannah Tekofsky (DarkSym) on Deprecated: Some humans are fitness maximizers · 2022-10-05T12:06:18.836Z · LW · GW

Makes sense. I'm starting to suspect I overestimated the number of people who would take these deals, but I think there would still be more takers for the above than for the original thought experiments.

Comment by Shoshannah Tekofsky (DarkSym) on Deprecated: Some humans are fitness maximizers · 2022-10-05T12:03:39.578Z · LW · GW

That last one killed me hahaha _

Comment by Shoshannah Tekofsky (DarkSym) on Humans aren't fitness maximizers · 2022-10-04T19:47:25.800Z · LW · GW

Here is my best attempt at working out my thoughts on this, but I noticed I reached some confusion at various points. I figured I'd post it anyway in case it either actually makes sense or people have thoughts they feel like sharing that might help my confusion.

Edit: The article is now deprecated. Thanks to everyone commenting here for helping me understand the different definitions of optimizer. I do suspect my misunderstanding of Nate's point might mirror why there is relatively common pushback against his claim? But maybe I'm typical-minding.

Comment by Shoshannah Tekofsky (DarkSym) on Humans aren't fitness maximizers · 2022-10-04T12:06:53.505Z · LW · GW

They are a small minority currently cause the environment is changing so quickly right now. Things have been changing insanely fast in the last century or so, but before the industrial revolution, and especially before the agricultural revolution, humans were much better optimized for IGF, I think. Evolution is still "training" us, and these last 100 years have been a huge change relative to the generation length of humans. Nate is stating that humans genetically are not IGF maximizers, and I think that is false. We are; we are just currently heavily being "retrained".

Re: quantity/quality. I think people nominally say they are optimizing for quality when really they just don't have enough drive to have more kids at the current cost. There is much less cultural punishment for saying you are going for quality over quantity than for saying you just don't want more kids because it's a huge investment. Additionally, children who grow up in bad home environments seem less likely to have kids of their own, and parents having mental breakdowns is one of the more common 'bad' environments. So quality can definitely optimize for quantity in the long run.

PS: I wish I had more time for more nuanced answers; these are rather rushed. I'm considering writing this up in more detail. My apologies.

Comment by Shoshannah Tekofsky (DarkSym) on Humans aren't fitness maximizers · 2022-10-04T08:17:44.506Z · LW · GW

I disagree humans don't optimize IGF:

  1. We seem to have different observational data. I do know some people who make all their major life decisions based on the quality and quantity of their offspring. Most of them are female, but this might be a bias in my sample. Specifically, quality trades off against quantity: waiting to find a fitter partner, and thus losing part of your reproductive window, is a common trade-off. Similarly, making sure your children have much better lives than you did by improving your own material circumstances (or health!) is another. To be fair, these people seem to be a small minority currently, but I think that is due to point 3 and would be rectified in a more constant environment.
  2. A lot of our drives do indirectly help IGF. Your aesthetic sense may be somewhat wired to your ability to recognize and enjoy the visual appearance of healthy mates, and similarly for healthy environments to grow up in, etc. Sure, it gets hijacked for 20 other things, but how big is the loss in IGF from keeping it around? I would argue it's generally not an issue for the subsection of humans who are directly driven to have big families.
  3. Many of us have badly optimized drives because our environments have changed too fast. It will take a few generations of a constant environment (not going to happen at our current level of technological progress) to catch up. The obvious example is birth control: sex drive used to be a great proxy signal to optimize for offspring. Now it no longer is, but we still love sex. In a few generations, though, the only people alive will be the descendants of people who wanted kids regardless of their sex drive. 'Evolution' will now select directly on the desire for kids, but it takes a while to catch up.

I'm not saying evolution optimized us very well, but I don't think it's accurate to say that we are not IGF maximizers. The environment has just changed much too quickly, and selection pressure has been low for the last few generations, but things like birth control actually introduce a new selection pressure on the drive to reproduce. Humans are mediocre IGF maximizers in an environment that is changing unusually fast.
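The selection dynamic in point 3 can be sketched as a toy simulation (my own illustration; the function name, trait encoding, and all parameters are invented for this sketch, not from the original comments): once reproduction tracks desire for children rather than sex drive, selection acts directly on that desire, and its population mean rises within a few generations.

```python
import random

def simulate_selection(generations=10, pop_size=1000):
    """Toy model: each individual has a heritable 'desire for children'
    trait in [0, 1]. Under birth control, reproduction probability is
    proportional to desire, so selection acts on desire directly."""
    pop = [random.random() for _ in range(pop_size)]
    means = []
    for _ in range(generations):
        means.append(sum(pop) / len(pop))
        children = []
        while len(children) < pop_size:
            parent = random.choice(pop)
            # Reproduce with probability equal to the desire trait.
            if random.random() < parent:
                # Child inherits the trait with small mutation, clamped to [0, 1].
                child = min(1.0, max(0.0, parent + random.gauss(0, 0.02)))
                children.append(child)
        pop = children
    means.append(sum(pop) / len(pop))
    return means
```

Running this shows the mean desire climbing from around 0.5 toward 1.0 over a handful of generations, which is the "retraining" the comment describes: the trait being selected on has changed, and the population composition follows with a lag of a few generations.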

Comment by Shoshannah Tekofsky (DarkSym) on Let's Compare Notes · 2022-09-25T17:27:15.740Z · LW · GW

Thank you for the comment!

Possibly such a proof exists. With more assumptions, you can get better information on human values, see here. This obviously doesn't solve all concerns.

Those are great references! I'm going to add them to my reading list, thank you.

Only a few people think about this a lot -- I currently can only think of the Center on Long-Term Risk on the intersection of suffering focus and AI Safety. Given how bad suffering is, I'm glad that there are people thinking about it, and do not think that a simple inefficiency argument is enough.

I'd have to flesh out my thinking here more, which is why it was a very short note. But essentially, I suspect generating suffering as a subgoal for an AGI is something like an anti-convergent goal: it makes almost all other goals harder to achieve. An intuitive example is the bio-industry, which currently generates a lot of suffering. However, as soon as we develop ways to grow meat in labs, that will be vastly more efficient, and thus we will converge to using it. An animal (or human) uses up energy while suffering, and the suffering itself tends to lower productivity and health in so many ways that it is inefficient in both purpose and resources. That said, there can be a transition period (such as we have now with the bio-industry) where the high-suffering state is the optimum for some window of time until a more efficient method is developed (e.g. lab-grown meat). In that window, there would then of course be very much suffering for humanity. I wouldn't expect that window to be particularly big, though, because human suffering achieves very few goals (as in, it covers very little of the possible goal space an AGI might target), and if recursive self-improvement is true, then the window would simply pass fairly quickly.

I hope I don't misrepresent you by putting these two quotes together. Is your position that the ethical dilemmas of "fiddling with human brains" would be solved by, instead, just fiddling with simulated brains? If so, then I disagree: I think simulated brains are also moral patients, to the same degree that physical brains are. I like this fiction a lot.

Hmm, good point. I'm struck that I considered this issue in a conversation with someone else 6 weeks ago, but didn't surface it in my notes now... I feel something may be going on with being in a generative frame versus a critique frame. And I can probably use awareness of that to generate better ideas.

Comment by Shoshannah Tekofsky (DarkSym) on Capability and Agency as Cornerstones of AI risk ­— My current model · 2022-09-16T13:07:04.356Z · LW · GW

What distinguishes capabilities and intelligence to your mind, and what grounds that distinction? I think I'd have to understand that to begin to formulate an answer.