Posts

Non-loss of control AGI-related catastrophes are out of control too 2023-06-12T12:01:26.682Z
How should we think about the decision relevance of models estimating p(doom)? 2023-05-11T04:16:56.211Z

Comments

Comment by Mo Putera (Mo Nastri) on Universal Basic Income and Poverty · 2024-07-27T04:55:02.111Z · LW · GW

I wasn't aware of these options, thank you.

Comment by Mo Putera (Mo Nastri) on Podcast: "How the Smart Money teaches trading with Ricki Heicklen" (Patrick McKenzie interviewing) · 2024-07-12T05:47:33.500Z · LW · GW

There's also Byrne Hobart's article Understanding Jane Street (5.5k words), though maybe pitched at an even lower level. He recommends The Laws of Trading as further reading, alongside:

This interview with Yaron Minsky is a great look at their decision to use Ocaml. If you're interested in more about the mechanics of exchanges and trading, the Hide not Slide Substack is good. As a starting point, here's their writeup on Jane Street. Max Dama on Automated Trading is old, but a very helpful overview of the industry for technical people. If you want to learn Ocaml, Jane Street's Yaron Minsky has coauthored a good book on it.

Comment by Mo Putera (Mo Nastri) on When is a mind me? · 2024-07-11T08:02:43.083Z · LW · GW

Your topline answers to the questions you assume xlr8harder cares about more seem similar to Holden Karnofsky's, and I haven't seen his essay on this mentioned in this thread, so I thought it'd be useful to link it here: What counts as death? An unconventional but simple take on personal identity, that dissolves most paradoxes

My philosophy on "what counts as death" is simple, though unconventional, and it seems to resolve most otherwise mind-bending paradoxical thought experiments about personal identity. It is the same basic idea as the one advanced by Derek Parfit in Reasons and Persons;1 Parfit also claims it is similar to Buddha's view2 (so it's got that going for it).

I haven't been able to find a simple, compact statement of this philosophy, and I think I can lay it out in about a page. So here it is, presented simply and without much in the way of caveats (this is "how things feel to me" rather than "something I'm confident in regardless of others' opinions"):

Constant replacement. In an important sense, I stop existing and am replaced by a new person each moment (second or minute or whatever).

The sense in which it feels like I "continue to exist, as one unified thread through time" is just an illusion, created by the fact that I have memories of my past. The only thing that is truly "me" is this moment; next moment, it will be someone else.

Kinship with past and future selves. My future self is a different person from me, but he has an awful lot in common with me: personality, relationships, ongoing projects, and more. Things like my relationships and projects are most of what give my current moment meaning, so it's very important to me whether my future selves are around to continue them.

So although my future self is a different person, I care about him a lot, for the same sorts of reasons I care about friends and loved ones (and their future selves).3

If I were to "die" in the common-usage (e.g., medical) sense, that would be bad for all those future selves that I care about a lot.4

(I do of course refer to past and future Holdens in the first person. When I refer to someone as "me," that means that they are a past or future self, which generally means that they have an awful lot in common with me. But in a deeper philosophical sense, my past and future selves are other people.)

And that's all. I'm constantly being replaced by other Holdens, and I care about the other Holdens, and that's all that's going on.

  • I don't care how quickly the cells in my body die and get replaced (if it were once per second, that wouldn't bother me). My self is already getting replaced all the time, and replacing my cells wouldn't add anything to that.
  • I don't care about "continuity of consciousness" (if I were constantly losing consciousness while all my cells got replaced, that wouldn't bother me).
  • If you vaporized me and created a copy of me somewhere else, that would just be totally fine. I would think of it as teleporting. It'd be chill.
  • If you made a bunch of copies of me, I would be all of them in one sense (I care about them a lot, in the same way that I normally care about future selves) and none of them in another sense (just as I am not my future selves).
  • If you did something really weird like splitting my brain in half and combining each half with someone else's brain, that would create two people that I care about more than a stranger and less than "Holden an hour from now."

I don't really find any thought experiments on this topic trippy or mind bending. They're all just cases where I get replaced with some other people who have some things in common with me, and that's already happening all the time.

Footnotes

  1. For key quotes from Reasons and Persons, see pages 223-224; 251; 279-282; 284-285; 292; 340-341. For explanations of "psychological continuity" and "psychological connectedness" (which Parfit frequently uses in discussing what matters for what counts as death), see page 206.

    "Psychological connectedness" is a fairly general idea that seems consistent with what I say here; "psychological continuity" is a more specific idea that is less important on my view (though also see pages 288-289, where Parfit appears to equivocate on how much, and how, psychological continuity matters). 

  2. "As Appendix J shows, Buddha would have agreed. The Reductionist View [the view Parfit defends] is not merely part of one cultural tradition. It may be, as I have claimed, the true view about all people at all times." Reasons and Persons page 273. Emphasis in original. 
  3. There's the additional matter that he's held responsible for my actions, which makes sense if only because my actions are predictive of his actions. 
  4. I don't personally care all that much about these future selves' getting to "exist," as an end in itself. I care more about the fact that their disappearance would mean the end of the stories, projects, relationships, etc. that I'm in. But you could easily take my view of personal identity while caring a lot intrinsically about whether your future selves get to exist. 

Comment by Mo Putera (Mo Nastri) on Simulacra Levels Summary · 2024-07-06T06:01:17.828Z · LW · GW

This is great, thank you for making it.

Comment by Mo Putera (Mo Nastri) on In Defense of Lawyers Playing Their Part · 2024-07-03T04:28:52.515Z · LW · GW

So instead, we tell the lawyers to go nuts. Be as biased as possible, and, as long as they're equally skilled and there aren't background factors that favor one position over the other, this ensures that each presented position is equally far from the truth. The jury now has a fair overview of both sides of the case, without a malicious lawyer being able to advantage one over the other.

This reminds me of Peter Watts' classic post about (among others) how science works:

Science doesn’t work despite scientists being asses. Science works, to at least some extent, because scientists are asses. Bickering and backstabbing are essential elements of the process. Haven’t any of these guys ever heard of “peer review”?

There’s this myth in wide circulation: rational, emotionless Vulcans in white coats, plumbing the secrets of the universe, their Scientific Methods unsullied by bias or emotionalism. Most people know it’s a myth, of course; they subscribe to a more nuanced view in which scientists are as petty and vain and human as anyone (and as egotistical as any therapist or financier), people who use scientific methodology to tamp down their human imperfections and manage some approximation of objectivity.

But that’s a myth too. The fact is, we are all humans; and humans come with dogma as standard equipment. We can no more shake off our biases than Liz Cheney could pay a compliment to Barack Obama. The best we can do— the best science can do— is make sure that at least, we get to choose among competing biases.

That’s how science works. It’s not a hippie love-in; it’s rugby. Every time you put out a paper, the guy you pissed off at last year’s Houston conference is gonna be laying in wait. Every time you think you’ve made a breakthrough, that asshole supervisor who told you you needed more data will be standing ready to shoot it down. You want to know how the Human Genome Project finished so far ahead of schedule? Because it was the Human Genome projects, two competing teams locked in bitter rivalry, one led by J. Craig Venter, one by Francis Collins — and from what I hear, those guys did not like each other at all.

This is how it works: you put your model out there in the coliseum, and a bunch of guys in white coats kick the shit out of it. If it’s still alive when the dust clears, your brainchild receives conditional acceptance. It does not get rejected. This time. ...

Science is so powerful that it drags us kicking and screaming towards the truth despite our best efforts to avoid it. And it does that at least partly fueled by our pettiness and our rivalries. Science is alchemy: it turns shit into gold. 

Comment by Mo Putera (Mo Nastri) on My 5-step program for losing weight · 2024-07-02T17:45:52.664Z · LW · GW

It took me about 5 years. Again, I don't think it's a useful approach if you don't like exercising in the first place; for me 5 years of resistance training has felt less like a weight-loss strategy and more like an excuse to have fun chasing goals and make like-minded friends along the way.

Comment by Mo Putera (Mo Nastri) on My 5-step program for losing weight · 2024-07-02T02:52:24.735Z · LW · GW

Lots of people believe they can eat more if they just exercise more. Unfortunately our bodies are highly efficient relative to the density of modern food, so “exercising it away” is not a realistic plan.

'Exercising it away' seems misguided given our bodies' energetic efficiency, as you said. What's instead worked for me is raising my basal metabolic rate substantially by adding muscle, which is very energetically expensive, via ~3 resistance training sessions a week. 

Admittedly I don't know of a way to maintain the required muscle mass for this strategy to work long-term without enjoying physical activity, which I seem to enjoy the way most people enjoy good food, which probably makes this useless as general advice.

Comment by Mo Putera (Mo Nastri) on The Xerox Parc/ARPA version of the intellectual Turing test: Class 1 vs Class 2 disagreement · 2024-07-01T10:31:27.238Z · LW · GW

Thought it would be useful to share this 2017 HN thread.

The Myths of Creativity by David Burkus has this passage on class 1 vs class 2 disagreement: 

In the 1970s at Xerox PARC, regularly scheduled arguments were routine. The company that gave birth to the personal computer staged formal discussions designed to train their people on how to fight properly over ideas and not egos. PARC held weekly meetings they called "Dealer" (from a popular book of the time titled Beat the Dealer). Before each meeting, one person, known as "the dealer," was selected as the speaker. The speaker would present his idea and then try to defend it against a room of engineers and scientists determined to prove him wrong. Such debates helped improve products under development and sometimes resulted in wholly new ideas for future pursuit. The facilitators of the Dealer meetings were careful to make sure that only intellectual criticism of the merit of an idea received attention and consideration. Those in the audience or at the podium were never allowed to personally criticize their colleagues or bring their colleagues' character or personality into play. 

Bob Taylor, a former manager at PARC, said of their meetings, "If someone tried to push their personality rather than their argument, they'd find that it wouldn't work." Inside these debates, Taylor taught his people the difference between what he called Class 1 disagreements, in which neither party understood the other party's true position, and Class 2 disagreements, in which each side could articulate the other's stance. Class 1 disagreements were always discouraged, but Class 2 disagreements were allowed, as they often resulted in a higher quality of ideas. Taylor's model removed the personal friction from debates and taught individuals to use conflict as a means to find common, often higher, ground. 

Alan Kay responded to the above with 

This is one of those stories that has distorted over time. "Dealer" was a weekly meeting for many purposes, the main one was to provide a vehicle for coordination, planning, communication without having to set up a management structure for brilliant researchers who had some "lone wolves" tendencies.

Part of these meetings were presentations by PARC researchers. However, it was not a gantlet to be run, and it was not to train people to argue in a constructive way (most of the computer researchers at PARC were from ARPA community research centers, and learning how to argue reasonably was already part of that culture).

Visitors from Xerox frequently were horrified by the level of argument and the idea that no personal attacks were allowed had to be explained, along with the idea that the aim was not to win an argument but to illuminate. Almost never did the participants have to be reminded about "Class 1" and "Class 2", etc. The audience was -not- determined to prove the speaker wrong. That is not the way things were done.

which I suppose suggests the answer to this comment's question is "probably not".

Comment by Mo Putera (Mo Nastri) on Sci-Fi books micro-reviews · 2024-06-25T11:22:48.239Z · LW · GW

re: The Fractal Prince (really the whole Quantum Thief trilogy), I may be biased, but when I first read it I had 2 reactions: (1) this is the most targeted-at-my-ingroup novel I have ever read (2) nobody outside of my ingroup will get the kajillion references flying around, since Hannu Rajaniemi never bothered footnoting / defending any of them (unlike say what Peter Watts did with Blindsight), so people will think he's just making up technobabble when he's not, which means he'll be generally underappreciated despite the effusive praise (which will be of the generic "he's so smart" variety), which made my heart sink. 

But Gwern not only got it (unsurprisingly), he articulated it better than I ever could, so thanks Gwern:

Hannu makes no concessions to the casual reader, as he mainlines straight into his veins the pre-deep-learning 2010-era transhumanist zeitgeist via Silicon Valley—if it was ever discussed in a late-night bull session after a Singularity University conference, it might pop up here. Hannu stuffs the novels with blink-and-you’ll-miss-it ideas on the level of Olaf Stapledon. A conventional Verne gun is too easy a way of getting to space—how about beating Project Orion by instead using a nuclear space gun (since emulated brains don’t care about high g acceleration)? Or for example, the All-Defector reveals that, since other universes could be rewriting their rules to expand at maximum speed, erasing other universes before they know it, he plans to rewrite our universe’s rule to do so first (ie. he will defect at the multiversal level against all other universes); whereas beginner-level SF like The Three Body Problem would dilate on this for half a book, Hannu’s grand reveal gets all of 2 paragraphs before crashing into the eucatastrophic ending.

For world-building, he drops neologisms left and right, and hard ones at that—few enough American readers will be familiar with the starting premise of “Arsène Lupin in spaaaace!” (probably more are familiar with the anime Lupin The Third these days), but his expectations go far beyond that: the ideal reader of the trilogy is not merely one familiar with the Prisoner’s Dilemma but also with the bizarre zero-determinant PD strategies discovered ~2008, and not just with such basic physics as quantum entanglement or applications like quantum dots, but exotic applications to quantum auctions & game theory (including Prisoner’s Dilemma) & pseudo-telepathy (yes, those are things), and it would definitely be helpful if that reader happened to also be familiar with Eliezer Yudkowsky’s c. 2000s writings on “Coherent Extrapolated Volition”, with a dash of Nikolai Fyodorovich Fyodorov’s Russian Cosmism for seasoning (although only a dash2).

This leads to an irony: I noted while reading Masamune Shirow’s Ghost in the Shell cyberpunk manga that almost everything technical in the GitS manga turned out to be nonsense despite Shirow’s pretensions to in-depth research & meticulous attention to detail; while in QT, most technical things sound like cyberpunk nonsense and Hannu doesn’t insert any editorial notes like Shirow does to defend them, but are actually real and just so arcane you haven’t heard of them.

For example, some readers accuse Hannu of relying on FTL communication via quantum entanglement, which is bad physics; but Hannu does not! If they had read more closely (similar to the standard reader failure to understand the physics of “Story of Your Life”), they would have noticed that at no point is there communication faster-than-light, only coordination faster-than-light—‘spooky action at a distance’. He is instead employing advanced forms of quantum entanglement which enable things like secret auctions or for coordinated strategies of game-playing (quantum coordination, like treating the particle measurements as flipping a coin and one person does the ‘Heads’ strategy and the other person does the ‘Tails’ strategy does not require communication, obviously, but surprisingly, quantum coordination can be superior to all apparently-equivalent communication-free classical strategies). He explains briefly that the zoku use quantum entanglement in these ways, but a reader could easily miss that, given all the other things they are trying to understand and how common ‘quantum woo’ is.⁠3⁠ 

Rajaniemi confirmed that Gwern "got it" like nobody else did:

As a longtime fan of gwern's work -- gwern.net is the best rabbit hole on the Internet -- it's a treat to see this incredibly thoughtful (and slightly spoilery) review of the Quantum Thief trilogy. gwern.net/review/book#quantu… Gwern perfectly nails the emotional core of the trilogy and, true to form, spots a number of easter eggs I thought no one would ever find. This may be my favorite review of all time.

I admire Rajaniemi for pulling it off as you said, but I'm somehow not that surprised. He's bright (mathematical physics PhD) and has been working at writing-as-craft for a while:

But talking to him about his rapid career, it’s quickly apparent he’s no stranger to being compared to other sci-fi rising stars, having first seriously begun writing in 2002 while studying his PhD as part of a writing group called Writers Bloc – which includes authors Charles Stross and Alan Campbell. “It is, and always has been a place with quite a harsh level of criticism,” he says. “But in a healthy and professional way, of course, so it was a good group of people and environment in which to develop.”

I personally got the Quantum Thief trilogy because I'd been blown away by Stross' Accelerando, wanted more, and saw Stross say of Rajaniemi: "Hard to admit, but I think he’s better at this stuff than I am.  The best first SF novel I’ve read in many years." 
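
As an aside, the "quantum coordination beating classical communication-free strategies" point in Gwern's quote can be made concrete with the textbook CHSH game (my illustration, not something from the trilogy or the review): two players who can't communicate each receive a question bit and win a round iff the XOR of their answers equals the AND of their questions. The best classical strategy wins 75% of rounds; sharing an entangled pair lifts that to cos^2(pi/8), about 85%. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000

# CHSH game: referee sends uniform bits x, y; players answer a, b without communicating;
# they win a round iff (a XOR b) == (x AND y).
x = rng.integers(0, 2, N)
y = rng.integers(0, 2, N)

# Best classical strategy: both players always answer 0, winning whenever x AND y == 0.
classical_win = (0 == (x & y))
print("classical:", classical_win.mean())  # ~0.75

# Quantum strategy: share a |Phi+> Bell pair and pick measurement angles from the question bits.
# For |Phi+>, P(the two outcomes agree) = cos^2(theta_A - theta_B).
theta_A = np.where(x == 0, 0.0, np.pi / 4)
theta_B = np.where(y == 0, np.pi / 8, -np.pi / 8)
a = rng.integers(0, 2, N)                               # Alice's outcome is uniform
agree = rng.random(N) < np.cos(theta_A - theta_B) ** 2  # Bob matches her with this probability
b = np.where(agree, a, 1 - a)
quantum_win = ((a ^ b) == (x & y))
print("quantum:  ", quantum_win.mean())  # ~0.854 = cos^2(pi/8), beating every classical strategy
```

No signal passes between the players at game time; the advantage comes purely from the correlations in the shared state, which is the sense in which it's coordination rather than communication.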

Comment by Mo Putera (Mo Nastri) on Suffering Is Not Pain · 2024-06-24T08:57:58.975Z · LW · GW

I see. You may be interested in a contrary(?) take from the Welfare Footprint Project's researchers; in their FAQ they write

4. Why don't you use the term 'suffering', instead of 'pain'?

We prefer not to use the term suffering for various reasons. First, our analyses are concerned with “any” negative affective state (including mild ones), whereas the term suffering is often used to denote more severe states that are accompanied by concurrent negative feelings such as the perception of lack of control, fear, anxiety, the impossibility to enjoy pleasant activities or even a threat to one’s sense of self. Additionally, it is not yet possible to determine objectively when an unpleasant state becomes suffering. This is so far a value judgement, which we leave open to users of our estimates. The term ‘pain’ (both physical and psychological), in turn, is associated with negative affective experiences of a wide range of intensities.

They define their terms further here. To be fair, they focus on non-human animal welfare; I suppose your suffering vs joy distinction is more currently actionable in human-focused contexts e.g. CBT interventions.

Comment by Mo Putera (Mo Nastri) on Suffering Is Not Pain · 2024-06-23T16:25:14.915Z · LW · GW

The motivation of this post is to address the persistent conflation between suffering and pain I have observed from members of the EA community, even amongst those who purport to be “suffering-focused” in their ethical motivations.

I'm pretty suffering-focused in practice for EA-related actions (mostly donations), so I was hoping you'd say more. So:

Having this distinction in mind is critical in order to develop ethical policies and effective interventions. For instance, as previously mentioned, CBT and mindfulness practices have been shown to reduce suffering by altering the mental response to pain rather than addressing the pain itself. If (the alleviation of) suffering is what we care about, this distinction guides us to focus on the root causes of suffering in our ethical considerations, rather than merely alleviating pain. Recognizing that suffering often lies in an aversive mental reaction to pain rather than the pain itself enables more precise scientific research and more effective strategies for reducing overall suffering.

This was probably the first intervention that came to mind for me as well when seeing your claim that distinguishing pain and suffering matters in the EA-action-guiding sense; unfortunately it's already a thing, e.g. HLI recommending StrongMinds. I'd be interested if you have any other ideas for underexplored / underappreciated cause areas / intervention groups that might be worth further investigation when reevaluated via this pain vs suffering distinction? (This is my attempt to make this distinction pay rent, albeit in actions instead of anticipated experiences.)

Comment by Mo Putera (Mo Nastri) on Searching for the Root of the Tree of Evil · 2024-06-09T17:26:57.734Z · LW · GW

That's not the sense I get from skimming his second most recent post, but I don't understand what he's getting at well enough to speak in his place.

Comment by Mo Putera (Mo Nastri) on Just admit that you’ve zoned out · 2024-06-08T12:07:52.961Z · LW · GW

Not an answer to your question, just an extended quote from the late Fields medalist Bill Thurston from his classic essay On proof and progress which seemed relevant:

Mathematicians have developed habits of communication that are often dysfunctional. Organizers of colloquium talks everywhere exhort speakers to explain things in elementary terms. Nonetheless, most of the audience at an average colloquium talk gets little of value from it. Perhaps they are lost within the first 5 minutes, yet sit silently through the remaining 55 minutes. Or perhaps they quickly lose interest because the speaker plunges into technical details without presenting any reason to investigate them. At the end of the talk, the few mathematicians who are close to the field of the speaker ask a question or two to avoid embarrassment.

... Outsiders are amazed at this phenomenon, but within the mathematical community, we dismiss it with shrugs. ...

Mathematical knowledge can be transmitted amazingly fast within a subfield. When a significant theorem is proved, it often (but not always) happens that the solution can be communicated in a matter of minutes from one person to another within the subfield. The same proof would be communicated and generally understood in an hour talk to members of the subfield. It would be the subject of a 15- or 20-page paper, which could be read and understood in a few hours or perhaps days by members of the subfield.

Why is there such a big expansion from the informal discussion to the talk to the paper? One-on-one, people use wide channels of communication that go far beyond formal mathematical language. They use gestures, they draw pictures and diagrams, they make sound effects and use body language. Communication is more likely to be two-way, so that people can concentrate on what needs the most attention. With these channels of communication, they are in a much better position to convey what’s going on, not just in their logical and linguistic facilities, but in their other mental facilities as well.

In talks, people are more inhibited and more formal. Mathematical audiences are often not very good at asking the questions that are on most people’s minds, and speakers often have an unrealistic preset outline that inhibits them from addressing questions even when they are asked. In papers, people are still more formal. Writers translate their ideas into symbols and logic, and readers try to translate back.

Why is there such a discrepancy between communication within a subfield and communication outside of subfields, not to mention communication outside mathematics?

Mathematics in some sense has a common language: a language of symbols, technical definitions, computations, and logic. This language efficiently conveys some, but not all, modes of mathematical thinking. Mathematicians learn to translate certain things almost unconsciously from one mental mode to the other, so that some statements quickly become clear. Different mathematicians study papers in different ways, but when I read a mathematical paper in a field in which I’m conversant, I concentrate on the thoughts that are between the lines. I might look over several paragraphs or strings of equations and think to myself “Oh yeah, they’re putting in enough rigamarole to carry such-and-such idea.” When the idea is clear, the formal setup is usually unnecessary and redundant—I often feel that I could write it out myself more easily than figuring out what the authors actually wrote. It’s like a new toaster that comes with a 16-page manual. If you already understand toasters and if the toaster looks like previous toasters you’ve encountered, you might just plug it in and see if it works, rather than first reading all the details in the manual.

People familiar with ways of doing things in a subfield recognize various patterns of statements or formulas as idioms or circumlocution for certain concepts or mental images. But to people not already familiar with what’s going on the same patterns are not very illuminating; they are often even misleading. The language is not alive except to those who use it. 

Okay, I liked that passage but maybe it wasn't very useful. Ravi Vakil's advice to potential PhD students attending talks seems more useful, especially the last bullet:

  • At the end of the talk, you should try to answer the questions: What question(s) is the speaker trying to answer? Why should we care about them? What flavor of results has the speaker proved? Do I have a small example of the phenomenon under discussion? You can even scribble down these questions at the start of the talk, and jot down answers to them during the talk.
  • Try to extract three words from the talk (no matter how tangentially related to the subject at hand) that you want to know the definition of. Then after the talk, ask me what they mean. ...
  • New version of the previous jot: try the "three things" exercise.
  • See if you can get one lesson from the talk (broadly interpreted). 
  • Try to ask one question at as many seminars as possible, either during the talk, or privately afterwards. The act of trying to formulate an interesting question (for you, not the speaker!) is a worthwhile exercise, and can focus the mind.

Comment by Mo Putera (Mo Nastri) on Response to Aschenbrenner's "Situational Awareness" · 2024-06-08T11:55:56.645Z · LW · GW

I'm guessing Rob is referring to footnote 54 in What do XPT forecasts tell us about AI risk?:

And while capabilities have been increasing very rapidly, research into AI safety does not seem to be keeping pace, even if it has perhaps sped up in the last two years. An isolated, but illustrative, data point of this can be seen in the results of the 2022 section of a Hypermind forecasting tournament: on most benchmarks, forecasters underpredicted progress, but they overpredicted progress on the single benchmark somewhat related to AI safety.

That last link is to Jacob Steinhardt's tweet linking to his 2022 post AI Forecasting: One Year In, on the results of their 2021 forecasting contest. Quote:

Progress on a robustness benchmark was slower than expected, and was the only benchmark to fall short of forecaster predictions. This is somewhat worrying, as it suggests that machine learning capabilities are progressing quickly, while safety properties are progressing slowly. ...

As a reminder, the four benchmarks were:

  • MATH, a mathematics problem-solving dataset;
  • MMLU, a test of specialized subject knowledge using high school, college, and professional multiple choice exams;
  • Something Something v2, a video recognition dataset; and
  • CIFAR-10 robust accuracy, a measure of adversarially robust vision performance.

...

Here are the actual results, as of today:

  • MATH: 50.3% (vs. 12.7% predicted)
  • MMLU: 67.5% (vs. 57.1% predicted)
  • Adversarial CIFAR-10: 66.6% (vs. 70.4% predicted)
  • Something Something v2: 75.3% (vs. 73.0% predicted)

That's all I got, no other predictions.

Comment by Mo Putera (Mo Nastri) on Level up your spreadsheeting · 2024-05-26T15:17:35.010Z · LW · GW

Great post, especially the companion piece :)

I'm tangentially reminded of professional modeler & health economist froolow's refactoring of GiveWell's cost-effectiveness models in his A critical review of GiveWell's 2022 cost-effectiveness model (sections 3 and 4), which I think of as complementary to your post in that it teaches-via-case-study how to level up your spreadsheet modeling. 

Here's GiveWell's model architecture:

And here's froolow's refactoring: 

The difference in micro-level architecture is also quite large:

As someone who's spent a lot of his (short) career building dashboards and models in Google Sheets, and having seen GiveWell's CEAs, I empathized with froolow's remarks here:

After the issue of uncertainty analysis, I’d say the model architecture is the second biggest issue I have with the GiveWell model, and really the closest thing to a genuine ‘error’ rather than a conceptual step which could be improved. Model architecture is how different elements of your model interact with each other, and how they are laid out to a user. 

It is fairly clear that the GiveWell team are not professional modellers, in the same way it would be obvious to a professional programmer that I am not a coder (this will be obvious as soon as you check the code in my Refactored model!). That is to say, there’s a lot of wasted effort in the GiveWell model which is typical when intelligent people are concentrating on making something functional rather than using slick technique. A very common manifestation of the ‘intelligent people thinking very hard about things’ school of model design is extremely cramped and confusing model architecture. This is because you have to be a straight up genius to try and design a model as complex as the GiveWell model without using modern model planning methods, and people at that level of genius don’t need crutches the rest of us rely on like clear and straightforward model layout. However, bad architecture is technical debt that you are eventually going to have to service on your model; when you hand it over to a new member of staff it takes longer to get that member of staff up to speed and increases the probability of someone making an error when they update the model.

Comment by Mo Putera (Mo Nastri) on Fund me please - I Work so Hard that my Feet start Bleeding and I Need to Infiltrate University · 2024-05-21T01:34:19.498Z · LW · GW

I think you're right that I missed their point, thanks for pointing it out.

I have had experiences similar to Johannes' anecdote re: ignoring broken glass to not lose fragile threads of thought; they usually entailed extended deep work periods past healthy thresholds for unclear marginal gain, so the quotes above felt personally relevant as guardrails. But also my experiences don't necessarily generalize (as your hypothetical shows).

I'd be curious to know your model, and how it compares to some of John Wentworth's posts on the same IIRC.

Comment by Mo Putera (Mo Nastri) on Fund me please - I Work so Hard that my Feet start Bleeding and I Need to Infiltrate University · 2024-05-20T09:39:42.405Z · LW · GW

These thoughts remind me of something Scott Alexander once wrote - that sometimes he hears someone say true but low status things - and his automatic thoughts are about how the person must be stupid to say something like that, and he has to consciously remind himself that what was said is actually true.

For anyone who's curious, this is what Scott said, in reference to him getting older – I remember it because I noticed the same in myself as I aged too:

I look back on myself now vs. ten years ago and notice I’ve become more cynical, more mellow, and more prone to believing things are complicated. For example: [list of insights] ...

All these seem like convincing insights. But most of them are in the direction of elite opinion. There’s an innocent explanation for this: intellectual elites are pretty wise, so as I grow wiser I converge to their position. But the non-innocent explanation is that I’m not getting wiser, I’m just getting better socialized. ...

I’m pretty embarassed by Parable On Obsolete Ideologies, which I wrote eight years ago. It’s not just that it’s badly written, or that it uses an ill-advised Nazi analogy. It’s that it’s an impassioned plea to jettison everything about religion immediately, because institutions don’t matter and only raw truth-seeking is important. If I imagine myself entering that debate today, I’d be more likely to take the opposite side. But when I read Parable, there’s…nothing really wrong with it. It’s a good argument for what it argues for. I don’t have much to say against it. Ask me what changed my mind, and I’ll shrug, tell you that I guess my priorities shifted. But I can’t help noticing that eight years ago, New Atheism was really popular, and now it’s really unpopular. Or that eight years ago I was in a place where having Richard Dawkins style hyperrationalism was a useful brand, and now I’m (for some reason) in a place where having James C. Scott style intellectual conservativism is a useful brand. A lot of the “wisdom” I’ve “gained” with age is the kind of wisdom that helps me channel James C. Scott instead of Richard Dawkins; how sure am I that this is the right path?

Sometimes I can almost feel this happening. First I believe something is true, and say so. Then I realize it’s considered low-status and cringeworthy. Then I make a principled decision to avoid saying it – or say it only in a very careful way – in order to protect my reputation and ability to participate in society. Then when other people say it, I start looking down on them for being bad at public relations. Then I start looking down on them just for being low-status or cringeworthy. Finally the idea of “low-status” and “bad and wrong” have merged so fully in my mind that the idea seems terrible and ridiculous to me, and I only remember it’s true if I force myself to explicitly consider the question. And even then, it’s in a condescending way, where I feel like the people who say it’s true deserve low status for not being smart enough to remember not to say it. This is endemic, and I try to quash it when I notice it, but I don’t know how many times it’s slipped my notice all the way to the point where I can no longer remember the truth of the original statement.

This was back in 2017. 

Comment by Mo Putera (Mo Nastri) on Fund me please - I Work so Hard that my Feet start Bleeding and I Need to Infiltrate University · 2024-05-20T01:27:11.932Z · LW · GW

Holden advised against this:

Jog, don’t sprint. Skeptics of the “most important century” hypothesis will sometimes say things like “If you really believe this, why are you working normal amounts of hours instead of extreme amounts? Why do you have hobbies (or children, etc.) at all?” And I’ve seen a number of people with an attitude like: “THIS IS THE MOST IMPORTANT TIME IN HISTORY. I NEED TO WORK 24/7 AND FORGET ABOUT EVERYTHING ELSE. NO VACATIONS."

I think that’s a very bad idea.

Trying to reduce risks from advanced AI is, as of today, a frustrating and disorienting thing to be doing. It’s very hard to tell whether you’re being helpful (and as I’ve mentioned, many will inevitably think you’re being harmful).

I think the difference between “not mattering,” “doing some good” and “doing enormous good” comes down to how you choose the job, how good at it you are, and how good your judgment is (including what risks you’re most focused on and how you model them). Going “all in” on a particular objective seems bad on these fronts: it poses risks to open-mindedness, to mental health and to good decision-making (I am speaking from observations here, not just theory).

That is, I think it’s a bad idea to try to be 100% emotionally bought into the full stakes of the most important century - I think the stakes are just too high for that to make sense for any human being.

Instead, I think the best way to handle “the fate of humanity is at stake” is probably to find a nice job and work about as hard as you’d work at another job, rather than trying to make heroic efforts to work extra hard. (I criticized heroic efforts in general here.)

I think this basic formula (working in some job that is a good fit, while having some amount of balance in your life) is what’s behind a lot of the most important positive events in history to date, and presents possibly historically large opportunities today.

Also relevant are the takeaways from Thomas Kwa's effectiveness as a conjunction of multipliers, in particular (see the toy arithmetic sketch after this list):

  • It's more important to have good judgment than to dedicate 100% of your life to an EA project. If output scales linearly with work hours, then you can hit 60% of your maximum possible impact with 60% of your work hours. But if bad judgment causes you to miss one or two multipliers, you could make less than 10% of your maximum impact. (But note that working really hard can sometimes enable multipliers-- see this comment by Mathieu Putz.)
  • Aiming for the minimum of self-care is dangerous.
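
To make the multiplicative point concrete, here's a toy sketch; the factor names and sizes are made up for illustration and are not Kwa's numbers:

```python
# Toy illustration of "impact as a conjunction of multipliers" vs. linear work hours.
factors = {"cause": 10, "intervention": 4, "organization": 3, "personal fit": 2.5}

def impact(hours_fraction, factors_hit):
    """Hours scale impact linearly; each multiplier you 'hit' multiplies it."""
    result = hours_fraction
    for name, mult in factors.items():
        if name in factors_hit:
            result *= mult
    return result

best = impact(1.0, factors.keys())                      # everything right, full hours
sustainable = impact(0.6, factors.keys())               # everything right, 60% of the hours
sloppy = impact(1.0, {"organization", "personal fit"})  # full hours, but missed two multipliers

print(sustainable / best)  # 0.6   -> 60% of the hours still yields 60% of max impact
print(sloppy / best)       # 0.025 -> missing two big multipliers leaves well under 10%
```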

Comment by Mo Putera (Mo Nastri) on AIs teams will probably be more superintelligent than individual AIs · 2024-05-07T10:15:00.408Z · LW · GW

Amateur hour question, if you don't mind: how does your "future of AI teams" compare/contrast with Drexler's CAIS model?

Comment by Mo Putera (Mo Nastri) on Losing Faith In Contrarianism · 2024-04-26T04:31:07.527Z · LW · GW

You might also be interested in Scott's 2010 post warning of the 'next-level trap' so to speak: Intellectual Hipsters and Meta-Contrarianism 

A person who is somewhat upper-class will conspicuously signal eir wealth by buying difficult-to-obtain goods. A person who is very upper-class will conspicuously signal that ey feels no need to conspicuously signal eir wealth, by deliberately not buying difficult-to-obtain goods.

A person who is somewhat intelligent will conspicuously signal eir intelligence by holding difficult-to-understand opinions. A person who is very intelligent will conspicuously signal that ey feels no need to conspicuously signal eir intelligence, by deliberately not holding difficult-to-understand opinions.

... 

Without meaning to imply anything about whether or not any of these positions are correct or not3, the following triads come to mind as connected to an uneducated/contrarian/meta-contrarian divide:

- KKK-style racist / politically correct liberal / "but there are scientifically proven genetic differences"
- misogyny / women's rights movement / men's rights movement
- conservative / liberal / libertarian4
- herbal-spiritual-alternative medicine / conventional medicine / Robin Hanson
- don't care about Africa / give aid to Africa / don't give aid to Africa
- Obama is Muslim / Obama is obviously not Muslim, you idiot / Patri Friedman5

What is interesting about these triads is not that people hold the positions (which could be expected by chance) but that people get deep personal satisfaction from arguing the positions even when their arguments are unlikely to change policy6 - and that people identify with these positions to the point where arguments about them can become personal.

If meta-contrarianism is a real tendency in over-intelligent people, it doesn't mean they should immediately abandon their beliefs; that would just be meta-meta-contrarianism. It means that they need to recognize the meta-contrarian tendency within themselves and so be extra suspicious and careful about a desire to believe something contrary to the prevailing contrarian wisdom, especially if they really enjoy doing so.

Comment by Mo Putera (Mo Nastri) on Nick Bostrom’s new book, “Deep Utopia”, is out today · 2024-04-01T05:11:49.421Z · LW · GW

In Bostrom's recent interview with Liv Boeree, he said (I'm paraphrasing; you're probably better off listening to what he actually said)

  • p(doom)-related
    • it's actually gone up for him, not down (contra your guess, unless I misinterpreted you), at least when broadening the scope beyond AI (cf. vulnerable world hypothesis, 34:50 in video)
    • re: AI, his prob. dist. has 'narrowed towards the shorter end of the timeline - not a huge surprise, but a bit faster I think' (30:24 in video)
    • also re: AI, 'slow and medium-speed takeoffs have gained credibility compared to fast takeoffs'
    • he wouldn't overstate any of this
  • contrary to people's impression of him, he's always been writing about 'both sides' (doom and utopia) 
  • in the past it just seemed more pressing to him to call attention to 'various things that could go wrong so we could avoid these pitfalls and then we'd have plenty of time to think about what to do with this big future'
    • this reminded me of this illustration from his old paper introducing the idea of x-risk prevention as global priority: 

Comment by Mo Putera (Mo Nastri) on [Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate · 2024-03-30T18:49:54.776Z · LW · GW

What's your take on Scott's post?

Comment by Mo Putera (Mo Nastri) on "How could I have thought that faster?" · 2024-03-12T04:52:01.298Z · LW · GW

If I take this claimed strategy as a hypothesis (that radical introspective speedup is possible and trainable), how might I falsify it? I ask because I can already feel myself wanting to believe it's true and personally useful, which is an epistemic red flag. Bonus points if the falsification test isn't high cost (e.g. I don't have to try it for years).

Comment by Mo Putera (Mo Nastri) on Using axis lines for good or evil · 2024-03-07T11:48:12.186Z · LW · GW

I was wondering about this too. I thought of Eugene Wei writing about Edward Tufte's classic book The Visual Display of Quantitative Information, which he considers "[one of] the most important books I've read". He illustrates with an example, just like dynomight did above: starting from a default chart auto-created in Excel (chart-1.png in his post), he systematically applies Tufte's principles until he ends up with a much cleaner final version (chart-4.png).

Wei adds further commentary:

No issues for color blind users, but we're stretching the limits of line styles past where I'm comfortable. To me, it's somewhat easier with the colored lines above to trace different countries across time versus each other, though this monochrome version isn't terrible. Still, this chart reminds me, in many ways, of the monochromatic look of my old Amazon Analytics Package, though it is missing data labels (wouldn't fit here) and has horizontal gridlines (mine never did).

We're running into some of these tradeoffs because of the sheer number of data series in play. Eight is not just enough, it is probably too many. Past some number of data series, it's often easier and cleaner to display these as a series of small multiples. It all depends on the goal and what you're trying to communicate.

At some point, no set of principles is one size fits all, and as the communicator you have to make some subjective judgments. For example, at Amazon, I knew that Joy wanted to see the data values marked on the graph, whenever they could be displayed. She was that detail-oriented. Once I included data values, gridlines were repetitive, and y-axis labels could be reduced in number as well.

Tufte advocates reducing non-data-ink, within reason, and gridlines are often just that. In some cases, if data values aren't possible to fit onto a line graph, I sometimes include gridlines to allow for easy calculation of the relative ratio of one value to another (simply count gridlines between the values), but that's an edge case.

For sharp changes, like an anomalous reversal in the slope of a line graph, I often inserted a note directly on the graph, to anticipate and head off any viewer questions. For example, in the graph above, if fewer data series were included, but Greece remained, one might wish to explain the decline in health expenditures starting in 2008 by adding a note in the plot area near that data point, noting the beginning of the Greek financial crisis (I don't know if that's the actual cause, but whatever the reason or theory, I'd place it there).

If we had company targets for a specific metric, I'd note those on the chart(s) in question as a labeled asymptote. You can never remind people of goals often enough.

And I thought, okay, sounds persuasive and all, but also this feels like Wei/Tufte is pushing their personal aesthetic on me, and I can't really tell the difference (or whether it matters).
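
For anyone who wants to try telling the difference for themselves, here's a minimal matplotlib sketch of a few of the principles Wei applies (spines and gridlines removed, legend replaced with direct line labels); the country names and numbers are made-up placeholder data:

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up placeholder data standing in for "health expenditure, % of GDP" by country.
rng = np.random.default_rng(1)
years = np.arange(2000, 2013)
countries = {name: 6 + i + rng.normal(0, 0.3, len(years)).cumsum()
             for i, name in enumerate(["Greece", "Germany", "France"])}

fig, ax = plt.subplots(figsize=(6, 4))
for name, series in countries.items():
    ax.plot(years, series, color="0.2", linewidth=1)
    ax.text(years[-1] + 0.3, series[-1], name, va="center")  # direct labels instead of a legend

for side in ("top", "right"):  # drop non-data ink
    ax.spines[side].set_visible(False)
ax.grid(False)                 # no gridlines; add data labels instead if the audience wants them
ax.set_ylabel("Health expenditure (% of GDP)")
fig.tight_layout()
plt.show()
```

Whether the despined, directly labeled version actually communicates better is exactly the kind of judgment call Wei says you end up making case by case.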

Comment by Mo Putera (Mo Nastri) on If you weren't such an idiot... · 2024-03-05T10:12:22.294Z · LW · GW

I'm curious about you not doing these, since I'd unquestioningly accepted them, and would love for you to elaborate:

- save lots of money in a retirement account and buy index funds
- shower daily
- use shampoo
- wear shoes
- walk

Regarding 'diet stuff', I mostly agree and like how Jay Daigle put it:

I’ve decided lately that people regularly get confused, on a number of subjects, by the difference between science and engineering. ... Tl;dr: Science is sensitive and finds facts; engineering is robust and gives praxes. Many problems happen when we confuse science for engineering and completely modify our praxis based on the result of a couple of studies in an unsettled area. ...

This means two things. First is that we need to understand things much better for engineering than for science. In science it’s fine to say “The true effect is between +3 and -7 with 95% probability”. If that’s what we know, then that’s what we know. And an experiment that shrinks the bell curve by half a unit is useful. For engineering, we generally need to have a much better idea of what the true effect is. (Imagine trying to build a device based on the information that acceleration due to gravity is probably between 9 and 13 m/s^2).

Second is that science in general cares about much smaller effects than engineering does. It was a very long time before engineering needed relativistic corrections due to gravity, say. A fact can be true but not (yet) useful or relevant, and then it’s in the domain of science but not engineering. 

Why does this matter?

The distinction is, I think fairly clear when we talk about physics. ... But people get much more confused when we move over to, say, psychology, or sociology, or nutrition. Researchers are doing a lot of science on these subjects, and doing good work. So there’s a ton of papers out there saying that eggs are good, or eggs are bad, or eggs are good for you but only until next Monday or whatever.

And people have, often, one of two reactions to this situation. The first is to read one study and say “See, here’s the scientific study. It says eggs are bad for you. Why are you still eating eggs? Are you denying the science?” And the second reaction is to say that obviously the scientists can’t agree, and so we don’t know anything and maybe the whole scientific approach is flawed.

But the real situation is that we’re struggling to develop a science of nutrition. And that shit is hard. We’ve worked hard, and we know some things. But we don’t really have enough information to do engineering, to say “Okay, to optimize cardiovascular health you need to cut your simple carbs by 7%, eat an extra 10g of monounsaturated fats every day, and eat 200g of protein every Wednesday” or whatever. We just don’t know enough.

And this is where folk traditions come in. Folk traditions are attempts to answer questions that we need decent answers to, that have been developed over time, and that are presumably non-horrible because they haven’t failed obviously and spectacularly yet. A person who ate “Like my grandma” is probably on average at least as healthy as a person who tried to follow every trendy bit of scientistic nutrition advice from the past thirty years.

Comment by Mo Putera (Mo Nastri) on Increasing IQ is trivial · 2024-03-03T07:05:20.200Z · LW · GW

Nitpick that doesn't bear upon the main thrust of the article: 

2021: Here’s a random weightlifter I found coming in at over 400kg, I don’t have his DEXA but let’s say somewhere between 300 and 350kgs of muscle.

More plausibly Josh Silvas weighs 220-ish kg, not 400 kg, and there's no way he has anywhere near 300+ kg of muscle. To contextualize, the heaviest WSM winners ever weighed around 200-210 kg (Hafthor, Brian); Brian in particular had a lean body mass of 156 kg back when he weighed 200 kg peaking for competition ('peaking' implies unsustainability), which is the highest DEXA figure I've ever found in years of following strength-related statistics. 

Comment by Mo Putera (Mo Nastri) on How I build and run behavioral interviews · 2024-02-28T18:44:23.451Z · LW · GW

The two highest mean validity paired procedures for predicting job performance are general mental ability (GMA) plus an integrity test, and GMA + a structured interview (Schmidt et al 2016 meta-analysis of "100 years of research in personnel selection", reviewing 31 procedures, via 80,000 Hours – check out Table 2 on page 71). GMA alone beats all other single procedures; integrity tests not only beat all other non-GMA procedures but also correlate nearly zero with GMA, hence the combination efficacy. 

A bit more on integrity tests, if you (like me) weren't clear on them:

These tests are used in business and industry to hire employees with reduced probability of counterproductive work behaviors on the job, such as fighting, drinking or taking drugs, stealing from the employer, equipment sabotage, or excessive absenteeism. Integrity tests do predict these behaviors, but surprisingly they also predict overall job performance (Ones, Viswesvaran, & Schmidt, 1993).

Behavioral interviews – which Schmidt et al call situational judgment tests – are either middle of the rankings (for knowledge-based tests) or near the bottom (for behavioral tendencies). Given this, I'd be curious what value Ben gets out of investing nontrivial effort into running them, cf. Luke's comment.
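
To see why the near-zero correlation with GMA is what drives the gain, here's a quick sketch using the standard two-predictor multiple-correlation formula; the validity figures are ballpark numbers in the spirit of Schmidt et al's table and should be treated as my assumptions:

```python
from math import sqrt

def multiple_r(r1, r2, r12):
    """Criterion validity of an optimally weighted composite of two predictors,
    given their individual validities r1, r2 and their intercorrelation r12."""
    return sqrt((r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2))

# Illustrative ballpark values (assumptions, not quoted from the paper):
gma, integrity = 0.65, 0.46

print(multiple_r(gma, integrity, 0.0))  # ~0.80: near-zero overlap, so integrity adds a lot
print(multiple_r(gma, integrity, 0.6))  # ~0.66: a highly GMA-correlated test would add little
```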

Comment by Mo Putera (Mo Nastri) on The Pareto Best and the Curse of Doom · 2024-02-24T05:12:03.346Z · LW · GW

I think curse of dimensionality is apt, since the prerequisite reading directly references it:

One problem with this whole GEM-vs-Pareto concept: if chasing a Pareto frontier makes it easier to circumvent GEM and gain a big windfall, then why doesn’t everyone chase a Pareto frontier? Apply GEM to the entire system: why haven’t people already picked up the opportunities lying on all these Pareto frontiers?

Answer: dimensionality. If there’s 100 different specialties, then there’s only 100 people who are the best within their specialty. But there’s 10k pairs of specialties (e.g. statistics/gerontology), 1M triples (e.g. statistics/gerontology/macroeconomics), and something like 10^30 combinations of specialties. And each of those pareto frontiers has room for more than one person, even allowing for elbow room. Even if only a small fraction of those combinations are useful, there’s still a lot of space to stake out a territory.

That said, the way John talks about it there I think 'boon of dimensionality' might be more apt still, but in Screwtape's context 'curse' is right.

Comment by Mo Putera (Mo Nastri) on Noticing Panic · 2024-02-06T17:47:14.106Z · LW · GW

Great comment. I also like Nate Soares' Dive in:

In my experience, the way you end up doing good in the world has very little to do with how good your initial plan was. Most of your outcome will depend on luck, timing, and your ability to actually get out of your own way and start somewhere. The way to end up with a good plan is not to start with a good plan, it's to start with some plan, and then slam that plan against reality until reality hands you a better plan.

It's important to possess a minimal level of ability to update in the face of evidence, and to actually change your mind. But by far the most important thing is to just dive in.

Comment by Mo Putera (Mo Nastri) on POC || GTFO culture as partial antidote to alignment wordcelism · 2024-01-31T17:46:34.075Z · LW · GW

Would the recent Anthropic sleeper agents paper count as an example of bullet #2 or #3? 

Comment by Mo Putera (Mo Nastri) on How do you feel about LessWrong these days? [Open feedback thread] · 2024-01-31T17:35:30.654Z · LW · GW

I've been considering writing a post about this but I think my writing style tends to be a bit ... messy ... to get upvoted here.

Please do. I've been mulling over related half-digested thoughts -- replacing the symbol / brand with the substance, etc.

Comment by Mo Putera (Mo Nastri) on Searching for outliers · 2024-01-30T11:35:44.554Z · LW · GW

Say more? (e.g. illustrative / motivating examples, related reading etc)

Comment by Mo Putera (Mo Nastri) on How to write better? · 2024-01-30T09:51:48.388Z · LW · GW

You might be interested in Scott Alexander's writing advice. In particular, ever since reading that comment a ~decade ago I find myself repeatedly doing what he said here:

The best way to improve the natural flow of ideas, and your writing in general, is to read really good writers so much that you unconsciously pick up their turns of phrase and don't even realize when you're using them. The best time to do that is when you're eight years old; the second best time is now.

Your role models here should be those vampires who hunt down the talented, suck out their souls, and absorb their powers. Which writers' souls you feast upon depends on your own natural style and your goals. I've gained most from reading Eliezer, Mencius Moldbug, Aleister Crowley, and G.K. Chesterton (links go to writing samples from each I consider particularly good); I'm currently making my way through Chesterton's collected works pretty much with the sole aim of imprinting his writing style into my brain.

Stepping from the sublime to the ridiculous, I took a lot from reading Dave Barry when I was a child. He has a very observational sense of humor, the sort where instead of going out looking for jokes, he just writes about a topic and it ends up funny. It's not hard to copy if you're familiar enough with it. And if you can be funny, people will read you whether you have any other redeeming qualities or not.

Comment by Mo Putera (Mo Nastri) on why I'm anti-YIMBY · 2024-01-29T07:33:24.061Z · LW · GW

Yeah, or adversarial collaboration-style. I'd be especially curious about (1) what would change your mind (same for the YIMBY proponent) (2) empirical data

Comment by Mo Putera (Mo Nastri) on AI #48: The Talk of Davos · 2024-01-26T07:39:57.704Z · LW · GW

I do not understand why very smart people are almost intelligence deniers.

This reminded me of Are smart people's personal experiences biased against general intelligence? It's collider bias: 

I think that people who are high in g will tend to see things in their everyday life that suggest to them that there is a tradeoff between being high g and having other valuable traits.

The post's illustrative example is Nassim "IQ is largely a pseudoscientific swindle" Taleb. 
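
To see the mechanism, here's a minimal simulation (my own toy sketch with an arbitrary selection threshold, assuming numpy): even if g and other valuable traits are independent in the population, selecting on their sum makes them look anticorrelated among the people you actually meet.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assume g and "other valuable traits" are independent in the full population.
g = rng.standard_normal(n)
other = rng.standard_normal(n)

# But the people a high-g person meets day to day are selected on something like
# the sum (same universities, same workplaces): conditioning on a collider.
selected = (g + other) > 2.0

print("population correlation:     ", round(float(np.corrcoef(g, other)[0, 1]), 3))
print("correlation among selected: ", round(float(np.corrcoef(g[selected], other[selected])[0, 1]), 3))
# The first is ~0; the second comes out clearly negative, i.e. an apparent
# "tradeoff" between g and other traits that doesn't exist in the population.
```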

Comment by Mo Putera (Mo Nastri) on Humans aren't fleeb. · 2024-01-24T17:22:22.115Z · LW · GW

(Some of your subsections link to a Google document instead of the relevant section in the post you intended.)

Comment by Mo Putera (Mo Nastri) on 60+ Possible Futures · 2024-01-22T11:27:32.852Z · LW · GW

This is great; I've bookmarked it for future reference. Thank you for doing the work of distilling all this.

I think Anders Sandberg's grand futures might fit in under your last subsection. Long quote incoming (apologies in advance, it's hard to summarize Sandberg):

Rob Wiblin: ... What are some futures that you think could plausibly happen that are amazing from various different points of view?

Anders Sandberg: One amazing future is humanity gets its act together. It solves existential risk, develops molecular nanotechnology and atomically precise manufacturing, masters biotechnology, and turns itself sustainable: turns half of the planet into a wilderness preserve that can evolve on its own, keeping to the other half where you have high material standards in a totally sustainable way that can keep on going essentially as long as the biosphere is going. And long before that, of course, people starting to take steps to maintain the biosphere by putting up a solar shield, et cetera. And others, of course, go off — first settling the solar system, then other solar systems, then other galaxies — building this super-civilisation in the nearby part of the universe that can keep together against the expansion of the universe, while others go off to really far corners so you can be totally safe that intelligence and consciousness remains somewhere, and they might even try different social experiments.

That’s one future. That one keeps on going essentially as long as the stars are burning. And at that point, they need to turn to actually taking matter and putting it into the dark black hole accretion disks and extracting the energy and keep on going essentially up until the point where you get proton decay — which might be curtains, but this is something north of 10^36 years. That’s a lot of future, most of it long after the stars had burned out. And most of the beings there are going to be utterly dissimilar to us.

But you could imagine another future: In the near future, we develop ways of doing brain emulation and we turn ourselves into a software species. Maybe not everybody; there are going to be stragglers who are going to maintain the biosphere on the Earth and going to be frowning at those crazies that in some sense committed suicide by becoming software. The software people are, of course, just going to be smiling at them, but thinking, “We’ve got the good deal. We got on this infinite space we can define endlessly.”

And quite soon they realise they need more compute, so they turn a few other planets of the solar system into computing centres. But much of a cultural development happens in the virtual space, and if that doesn’t need to expand too much, you might actually end up with a very small and portable humanity. I did a calculation some years ago that if you actually covered a part of the Sahara Desert with solar panels and use quantum dot cellular automaton computing, you could keep mankind in an uploaded form running there indefinitely, with a rather minimal impact on the biosphere. So in that case, maybe the future of humanity is instead going to be a little black square on a continent, and not making much fuss in the outside universe.

I hold that slightly unlikely, because sooner or later somebody’s going to say, “But what about space? What about just exploring that material world I heard so much about from Grandfather when he was talking? ‘In my youth, we were actually embodied.'” So I’m not certain this is a stable future.

The thing that interests me is that I like open-ended futures. I think it’s kind of worrisome if you come up with an idea of a future that is so perfected, but it requires that everybody do the same thing. That is pretty unlikely, given how we are organised as people right now, and systems that force us to do the same thing are terrifyingly dangerous. It might be a useful thing to have a singleton system that somehow keeps us from committing existential risk suicide, but if that impairs our autonomy, we might actually have lost quite a lot of value. It might still be worth it, but you need to think carefully about the tradeoff. And if its values are bad, even if it’s just subtly bad, that might mean that we lose most of the future.

I also think that there might be really weird futures that we can’t think well about. Right now we have certain things that we value and evaluate as important and good: we think about the good life, we think about pleasure, we think about justice. We have a whole set of things that are very dependent on our kind of brains. Those brains didn’t exist a few million years ago. You could make an argument that some higher apes actually have a bit of a primitive sense of justice. They get very annoyed when there is unfair treatment. But as you go back in time, you find simpler and simpler organisms and there is less and less of these moral values. There might still be pleasure and pain. So it might very well be that the fishes swimming around the oceans during the Silurian already had values and disvalues. But go back another few hundred million years and there might not even have been that. There was still life, which might have some intrinsic value, but much less of it.

Where I’m getting at with this is that value might have emerged in a stepwise way: We started with plasma near the Big Bang, and then eventually got systems that might have intrinsic value because of complex life, and then maybe systems that get intrinsic value because they have consciousness and qualia, and maybe another step where we get justice and thinking about moral stuff. Why does this process stop with us? It might very well be that there are more kinds of value waiting in the wings, so to say, if we get brains and systems that can handle them.

That would suggest that maybe in 100 million years we find the next level of value, and that’s actually way more important than the previous ones all taken together. And it might not end with that mysterious whatever value it is: there might be other things that are even more important waiting to be discovered. So this raises this disturbing question that we actually have no clue how the universe ought to be organised to maximise value or doing the right thing, whatever it is, because we might be too early on. We might be like a primordial slime thinking that photosynthesis is the biggest value there is, and totally unaware that there could be things like awareness.

Rob Wiblin: OK, so the first one there was a very big future, where humanity and its descendants go and grab a lot of matter and energy across the universe and survive for a very long time. So there’s the potential at least, with all of that energy, for a lot of beings to exist for a very long time and do all kinds of interesting stuff.

Then there’s the very modest future, where maybe we just try to keep our present population and we try to shrink our footprint as much as possible so that we’re interfering with nature or the rest of the universe as little as possible.

And then there’s this wildcard, which is maybe we discover that there are values that are totally beyond human comprehension, where we go and do something very strange that we don’t even have a name for at the moment.

Comment by Mo Putera (Mo Nastri) on Four visions of Transformative AI success · 2024-01-22T11:21:35.041Z · LW · GW

the value generators are about as simple and general as we could have gotten

Would you say it's something like empowerment? Quoting Jacob:

Empowerment provides a succinct unifying explanation for much of the apparent complexity of human values: our drives for power, knowledge, self-actualization, social status/influence, curiosity and even fun[4] can all be derived as instrumental subgoals or manifestations of empowerment. Of course empowerment alone can not be the only value or organisms would never mate: sexual attraction is the principle deviation later in life (after sexual maturity), along with the related cooperative empathy/love/altruism mechanisms to align individuals with family and allies (forming loose hierarchical agents which empowerment also serves).

The key central lesson that modern neuroscience gifted machine learning is that the vast apparent complexity of the adult human brain, with all its myriad task specific circuitry, emerges naturally from simple architectures and optimization via simple universal learning algorithms over massive data. Much of the complexity of human values likewise emerges naturally from the simple universal principle of empowerment.

Empowerment-driven learning (including curiosity as an instrumental subgoal of empowerment) is the clear primary driver of human intelligence in particular, and explains the success of video games as empowerment superstimuli and fun more generally.

This is good news for alignment. Much of our values - although seemingly complex - derive from a few simple universal principles. Better yet, regardless of how our specific terminal values/goals vary, our instrumental goals simply converge to empowerment regardless. Of course instrumental convergence is also independently bad news, for it suggests we won't be able to distinguish altruistic and selfish AGI from their words and deeds alone. But for now, let's focus on that good news:

Safe AI does not need to learn a detailed accurate model of our values. It simply needs to empower us.
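
If it helps to make "empowerment" concrete: for deterministic dynamics, n-step empowerment from a state reduces to the log of the number of distinct states reachable in n steps. Here's a toy sketch (my own illustration on a made-up 5x5 gridworld, not anything from Jacob's post):

```python
import itertools
import math

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, up, down
GRID = 5  # hypothetical 5x5 gridworld bounded by walls

def step(state, action):
    """Deterministic dynamics: move unless blocked by a wall (then stay put)."""
    x, y = state
    nx, ny = x + action[0], y + action[1]
    return (nx, ny) if 0 <= nx < GRID and 0 <= ny < GRID else (x, y)

def empowerment(state, horizon=3):
    """For a deterministic environment, the channel capacity between action
    sequences and final states is just log2(#distinct reachable states)."""
    reachable = set()
    for seq in itertools.product(ACTIONS, repeat=horizon):
        s = state
        for a in seq:
            s = step(s, a)
        reachable.add(s)
    return math.log2(len(reachable))

# A corner state is less empowered than the centre: fewer futures stay open,
# which is why "keep your options open" drives fall out of this objective.
print(empowerment((0, 0)), empowerment((2, 2)))
```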

Comment by Mo Putera (Mo Nastri) on On "Geeks, MOPs, and Sociopaths" · 2024-01-22T04:41:13.075Z · LW · GW

Curious what you think of Scott Alexander's Peter Turchin-inspired 'cyclic model' alternative to Chapman's model, which he argues better matches his experience, summarizable as precycle → growth (forward + upward + outward) → involution → postcycle: 

Either through good luck or poor observational skills, I’ve never seen a lot of sociopath takeovers. Instead, I’ve seen a gradual process of declining asabiyyah. Good people start out working together, then work together a little less, then turn on each other, all while staying good people and thinking they alone embody the true spirit of the movement.

Comment by Mo Putera (Mo Nastri) on What rationality failure modes are there? · 2024-01-22T04:26:38.692Z · LW · GW

Curious to see you elaborate on the last point, or just share pointers to further reading. I think I agree in a betting sense (i.e. is Jan's claim true or false?) but don't really have a gears-level understanding.

Comment by Mo Putera (Mo Nastri) on What rationality failure modes are there? · 2024-01-22T04:23:59.036Z · LW · GW

I'm not sure your last sentence is true, mainly because of selection bias: a fair proportion of the more instrumental folks are too busy actually doing work IRL to post frequently here anymore (e.g. Luke Muehlhauser, whom I still sometimes think of as the author of posts like How to Beat Procrastination rather than in terms of his current role). 

Comment by Mo Putera (Mo Nastri) on On "Geeks, MOPs, and Sociopaths" · 2024-01-22T04:16:52.821Z · LW · GW

you can tell who are the sociopaths by their money & unnaturally high h-index, and you can tell who are the geeks by their quality work

Tangential to your comment's main point, but for non-insiders maybe PaperRank, AuthorRank and Citation-Coins are harder to game than the h-index: 

Since different papers and different fields have largely different average number of co-authors and of references we replace citations with individual citations, shared among co-authors. Next, we improve on citation counting applying the PageRank algorithm to citations among papers. Being time-ordered, this reduces to a weighted counting of citation descendants that we call PaperRank. Similarly, we compute an AuthorRank applying the PageRank algorithm to citations among authors. These metrics quantify the impact of an author or paper taking into account the impact of those authors that cite it. Finally, we show how self- and circular- citations can be eliminated by defining a closed market of citation-coins. 

They still can't be compared between subfields though, only within.
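
To give the flavour of how these are computed, here's a rough sketch (a made-up three-paper graph and my own crude author split, assuming networkx; note the paper's actual PaperRank is a time-ordered descendant count rather than vanilla PageRank, and its self-/circular-citation handling isn't shown):

```python
import networkx as nx

# Hypothetical toy citation graph: an edge u -> v means paper u cites paper v.
papers = {
    "A": {"authors": ["alice", "bob"], "cites": ["C"]},
    "B": {"authors": ["carol"],        "cites": ["A", "C"]},
    "C": {"authors": ["dave", "eve"],  "cites": []},
}

G = nx.DiGraph()
G.add_nodes_from(papers)
for pid, p in papers.items():
    for cited in p["cites"]:
        # "individual citations": each citation is shared among the citing
        # paper's co-authors, so it carries weight 1 / (#co-authors).
        G.add_edge(pid, cited, weight=1 / len(p["authors"]))

# PaperRank-like score: PageRank over the weighted citation graph.
paper_rank = nx.pagerank(G, alpha=0.85, weight="weight")

# Crude AuthorRank-like score: split each paper's rank among its authors.
author_rank = {}
for pid, score in paper_rank.items():
    for a in papers[pid]["authors"]:
        author_rank[a] = author_rank.get(a, 0.0) + score / len(papers[pid]["authors"])

print(paper_rank)   # C should come out on top: it's cited by both A and B
print(author_rank)
```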

Comment by Mo Putera (Mo Nastri) on Being nicer than Clippy · 2024-01-18T08:44:23.609Z · LW · GW

I don't have anything to add other than that I really appreciate how you've articulated a morass of vague intuitions I've begun to have re: boundaries-oriented ethics, and that I hope you end up writing this up as a full standalone post sometime.

Comment by Mo Putera (Mo Nastri) on An Introduction To The Mandelbrot Set That Doesn't Mention Complex Numbers · 2024-01-18T08:36:06.658Z · LW · GW

I'm personally very glad you nevertheless decided to go ahead and publish this (pedagogically beautiful) essay; I'm already mentally drawing up a list of friends to share this with :) 

Comment by Mo Putera (Mo Nastri) on What good is G-factor if you're dumped in the woods? A field report from a camp counselor. · 2024-01-13T06:21:54.096Z · LW · GW

Really? I thought it was unsettling.

Comment by Mo Putera (Mo Nastri) on Notes on notes on virtues · 2024-01-06T05:12:58.784Z · LW · GW

I hope you do too. One of my aims this year is to try 'intentional virtue training', and your sequence has been an impetus, although I've only skimmed certain parts, so I intend to read them more thoroughly later. I'm not sure whether I should try Ben Franklin's approach or SotF&E's; the former strikes me as somewhat harsher, but I have a hunch (empirically unsupported, aside from my own confounder-laden upbringing) that the harshness is a feature, not a bug, for a certain sort of person, including me, so I'm leaning towards the former.

Comment by Mo Putera (Mo Nastri) on Notes on notes on virtues · 2024-01-05T07:59:45.059Z · LW · GW

Hi David, is the notes on virtues sequence still ongoing? 

I like the idea of the Society of the Free and Easy, but the fact that the program began to dwindle after a while does give me pause from a 'will it work for me?' perspective.

Comment by Mo Putera (Mo Nastri) on The spiritual benefits of material progress · 2024-01-02T09:13:09.815Z · LW · GW

Is that disagreement enough to change the (predicted) truth value of Jason's claim though? 

I'll admit to being biased here. I live in a rapidly developing middle-income country; the difference in opportunity between my generation and my parents' is nearly as vast as between 1910 and 2009 in Gordon's statistics. So while I agree wholeheartedly that Gordon's categorization doesn't cleave reality at the same joints Jason's does, that's ~irrelevant to me in that it doesn't change my mind on the directionality of Jason's claim.

Comment by Mo Putera (Mo Nastri) on Memory bandwidth constraints imply economies of scale in AI inference · 2023-12-31T18:39:25.418Z · LW · GW

A few years back VCs were fooled by a number of well meaning startups based on the pitch "We can just make a big matmul chip like a GPU but with far more on chip SRAM and thereby avoid the VN bottleneck!"

Including Cerebras?

Comment by Mo Putera (Mo Nastri) on Value systematization: how values become coherent (and misaligned) · 2023-12-28T13:08:14.234Z · LW · GW

Tangentially:

See Friston's predictive-processing framework in neuroscience

Nostalgebraist has argued that Friston's ideas here are either vacuous or a nonstarter, in case you're interested.