Found an annotated version of Vernor Vinge's A Fire Upon The Deep.
The part of Ajeya's comment that stood out to me was this:
On a meta level I now defer heavily to Ryan and people in his reference class (METR and Redwood engineers) on AI timelines, because they have a similarly deep understanding of the conceptual arguments I consider most important while having much more hands-on experience with the frontier of useful AI capabilities (I still don't use AI systems regularly in my work).
I'd also look at Eli Lifland's forecasts:
I don't think you need that footnoted caveat, simply because there isn't $150M/year worth of room for more funding in all of AMF, Malaria Consortium's SMC program, HKI's vitamin A supplementation program, and New Incentives' cash incentives for routine vaccination program all combined; these comprise the full list of GiveWell's top charities.
Another point is that the benefits of eradication keep adding up long after you've stopped paying for the costs, because the counterfactual that people keep suffering and dying of the disease is no longer happening. That's how smallpox eradication's cost-effectiveness can plausibly be less than a dollar per DALY averted so far and dropping (Guesstimate model, analysis). Quoting that analysis:
3.10.) For how many years should you consider benefits?
It is not clear for how long we should continue to consider benefits, since the benefits of vaccines would potentially continue indefinitely for hundreds of years. Perhaps these benefits would eventually be offset by some other future technology, and we could try to model that. Or perhaps we should consider a discount rate into the future, though we don’t find that idea appealing.
Instead, we decided to cap at an arbitrary fixed amount of years set to 20 by default, though adjustable as a variable in our spreadsheet model (or by copying and modifying our Guesstimate models). We picked 20 because it felt like a significant enough amount of time for technology and other dynamics to shift.
It’s important to think through what cap makes the most sense, though, as it can have a large effect on the final model, as seen in this table where we explore the ramifications of smallpox eradication with different benefit thresholds:
Smallpox Eradication Cost-effectiveness
I thought it'd be useful for others to link to your longer writings on this:
They used to consider speedrunning games a guilty pleasure, but after goal-factoring their supposed guilty pleasure they concluded that the guilt doesn't align with their actual goals and values, and that feeling bad about enjoying speedrunning doesn't really serve any productive purpose, so now they enjoy speedrunning unabashedly.
Maybe it's more correct to say that understanding requires specifically compositional compression, which maintains an interface-based structure hence allowing us to reason about parts without decompressing the whole, as well as maintaining roughly constant complexity as systems scale, which parallels local decodability. ZIP achieves high compression but loses compositionality.
I think what explains the relative ease of progress in physics has more to do with its relative compositionality, in contrast to other disciplines like biology or economics or the theory of differential equations, in the sense Jules Hedges meant it. To quote that essay:
For examples of non-compositional systems, we look to nature. Generally speaking, the reductionist methodology of science has difficulty with biology, where an understanding of one scale often does not translate to an understanding on a larger scale. ... For example, the behaviour of neurons is well-understood, but groups of neurons are not. Similarly in genetics, individual genes can interact in complex ways that block understanding of genomes at a larger scale.
Such behaviour is not confined to biology, though. It is also present in economics: two well-understood markets can interact in complex and unexpected ways. Consider a simple but already important example from game theory. The behaviour of an individual player is fully understood: they choose in a way that maximises their utility. Put two such players together, however, and there are already problems with equilibrium selection, where the actual physical behaviour of the system is very hard to predict.
More generally, I claim that the opposite of compositionality is emergent effects. The common definition of emergence is a system being ‘more than the sum of its parts’, and so it is easy to see that such a system cannot be understood only in terms of its parts, i.e. it is not compositional. Moreover I claim that non-compositionality is a barrier to scientific understanding, because it breaks the reductionist methodology of always dividing a system into smaller components and translating explanations into lower levels.
More specifically, I claim that compositionality is strictly necessary for working at scale. In a non-compositional setting, a technique for solving a problem may be of no use whatsoever for solving the problem one order of magnitude larger. To demonstrate that this worst case scenario can actually happen, consider the theory of differential equations: a technique that is known to be effective for some class of equations will usually be of no use for equations removed from that class by even a small modification. In some sense, differential equations is the ultimate non-compositional theory.
That makes more sense, thanks :)
Any good ideas on how to check / falsify this? I've been thinking of checking my own AI-driven job loss predictions but find it harder to specify the details than expected.
Upvoted and up-concreted your take, I really appreciate experiments like this. That said:
This isn't necessarily overwhelming evidence of anything, but it might genuinely make my timelines longer. Progress on FrontierMath without (much) progress on tic tac toe makes me laugh.
I'm confused why you think o1 losing the same way in tic tac toe repeatedly lengthens your timelines, given that it's o3 that pushed the FrontierMath SOTA score from 2% to 25% (and o1 was ~1%). I'd agree if it was o3 that did the repeated same-way losing, since that would make your second sentence make sense to me.
Kim Stanley Robinson seems to fit this too:
In some ways, Robinson’s path as a science fiction writer has followed a strange trajectory. He made his name writing about humanity’s far-flung future, with visionary works about the colonization of Mars (“The Mars Trilogy”), interstellar, intergenerational voyages into deep space (“Aurora”), and humanity’s expansion into the far reaches of the solar system (“2312”). But recently, he’s been circling closer to earth, and to the current crisis of catastrophic warming.
Futuristic stories about space exploration feel irrelevant to him now, Robinson said. He’s grown skeptical that humanity’s future lies in the stars, and dismissive of tech billionaires’ ambitions to explore space, even as he acknowledged, “I’m partially responsible for that fantasy.”
In his more recent novels — works like “New York 2140,” an oddly uplifting climate change novel that takes place after New York City is partly submerged by rising tides, and “Red Moon,” set in a lunar city in 2047 — he has traveled back in time, toward the present. Two years ago, he published “The Ministry for the Future,” which opens in 2025 and unfolds over the next few decades, as the world reels from floods, heat waves, and mounting ecological disasters, and an international ministry is created to save the planet.
The Square/Cube Law makes it physically impossible to build megastructures like space elevators, mass drivers, orbital rings, etc: https://en.wikipedia.org/wiki/Square%E2%80%93cube_law
This was a surprisingly ignorant comment by T. K. Van Allen, given that O'Neill was a physicist and included all his calculations. I suspect Van Allen never actually read the 'Steel structure' math in O'Neill's essay The Colonization of Space. The rest of Van Allen's bullet points also seem ignorant of O'Neill's calculations further down in the essay. I don't disagree with the bottom line that the cost is prohibitive, I just wish Van Allen had engaged with O'Neill's math.
Eric Drexler wrote two essays that seem related, which I really loved.
The first is How to Understand Everything (and why). It's short enough to be quoted essentially whole, so if you don't mind I'll do so:
In science and technology, there is a broad and integrative kind of knowledge that can be learned, but isn’t taught. It’s important, though, because it makes creative work more productive and makes costly blunders less likely.
Formal education in science and engineering centers on teaching facts and problem-solving skills in a series of narrow topics. It is true that a few topics, although narrow in content, have such broad application that they are themselves integrative: These include (at a bare minimum) substantial chunks of mathematics and the basics of classical mechanics and electromagnetism, with the basics of thermodynamics and quantum mechanics close behind.
Most subjects in science and engineering, however, are narrower than these, and advanced education means deeper and narrower education. What this kind of education omits is knowledge of extent and structure of human knowledge on a trans-disciplinary scale. This means understanding — in a particular, limited sense — everything.
To avoid blunders and absurdities, to recognize cross-disciplinary opportunities, and to make sense of new ideas, requires knowledge of at least the outlines of every field that might be relevant to the topics of interest. By knowing the outlines of a field, I mean knowing the answers, to some reasonable approximation, to questions like these:
What are the physical phenomena?
What causes them?
What are their magnitudes?
When might they be important?
How well are they understood?
How well can they be modeled?
What do they make possible?
What do they forbid?
And even more fundamental than these are questions of knowledge about knowledge:
What is known today?
What are the gaps in what I know?
When would I need to know more to solve a problem?
How could I find what I need?
It takes far less knowledge to recognize a problem than to solve it, yet in key respects, that bit of knowledge is more important: With recognition, a problem may be avoided, or solved, or an idea abandoned. Without recognition, a hidden problem may invalidate the labor of an hour, or a lifetime. Lack of a little knowledge can be a dangerous thing.
Looking back over the last few decades, I can see that I’ve invested considerably more than 10,000 hours in learning about the structures, relationships, contents, controversies, open problems, limitations, capabilities, developing an understanding of how the fields covered in the major journals fit together to constitute the current state of science and technology. In some areas, of course, I’ve dug deeper into the contents and tools of a field, driven by the needs of problem solving; in others, I know only the shape of the box and where it sits.
This sort of knowledge is a kind of specialty, really — a limited slice of learning, but oriented crosswise. Because of this orientation, though, it provides leverage in integrating knowledge from diverse sources. I am surprised by the range of fields in which I can converse with scientists and engineers at about the level of a colleague in an adjacent field. I often know what to ask about their research, and sometimes make suggestions that light their eyes.
The follow-up essay is How to Learn About Everything. It's again short enough to quote wholesale:
Note that the title above isn’t “how to learn everything”, but “how to learn about everything”. The distinction I have in mind is between knowing the inside of a topic in deep detail — many facts and problem-solving skills — and knowing the structure and context of a topic: essential facts, what problems can be solved by the skilled, and how the topic fits with others.
This knowledge isn’t superficial in a survey-course sense: It is about both deep structure and practical applications. Knowing about, in this sense, is crucial to understanding a new problem and what must be learned in more depth in order to solve it. The cross-disciplinary reach of nanotechnology almost demands this as a condition of competence.
Studying to learn about everything
To intellectually ambitious students I recommend investing a lot of time in a mode of study that may feel wrong. An implicit lesson of classroom education is that successful study leads to good test scores, but this pattern of study is radically different. It cultivates understanding of a kind that won’t help pass tests — the classroom kind, that is.
- Read and skim journals and textbooks that (at the moment) you only half understand. Include Science and Nature.
- Don’t halt, dig a hole, and study a particular subject as if you had to pass a test on it.
- Don’t avoid a subject because it seems beyond you — instead, read other half-understandable journals and textbooks to absorb more vocabulary, perspective, and context, then circle back.
- Notice that concepts make more sense when you revisit a topic.
- Notice which topics link in all directions, and provide keys to many others. Consider taking a class.
- Continue until almost everything you encounter in Science and Nature makes sense as a contribution to a field you know something about.
Why is this effective?
You learned your native language by immersion, not by swallowing and regurgitating spoonfuls of grammar and vocabulary. With comprehension of words and the unstructured curriculum of life came what we call “common sense”.
The aim of what I’ve described is to learn an expanded language and to develop what amounts to common sense, but about an uncommonly broad slice of the world. Immersion and gradual comprehension work, and I don’t know of any other way.
This process led me to explore the potential of molecular nanotechnology as a basis for high-throughput atomically precise manufacturing. If broad-spectrum common sense were more widespread among scientists, there would be no air of controversy around the subject, milestones like the U.S. National Academies report on molecular manufacturing would have been reached a decade earlier, and today’s research agenda and perception of global problems would be very different.
I think I prefer any of Drexler's approach, Sarah Constantin's / Scott's fact-posting, and Holden Karnofsky's learning by writing, all of which can start with endless breadth but also require (quoting Drexler) deep structure and practical applications as focusing mechanisms, to the sort of learning that I think might be incentivised by budding panologists having to maximise their minimum score across some standardised battery of tests. I also liked Sarah's suggestion at the end:
Ideally, a group of people writing fact posts on related topics, could learn from each other, and share how they think. I have the strong intuition that this is valuable. It's a bit more active than a "journal club", and quite a bit more casual than "research". It's just the activity of learning and showing one's work in public.
What makes a good Royal Navy Officer? Motivation. Motivation matters more for performance evaluations and advancement to leadership than general intelligence or personality traits. Does this mean intelligence is not so important? Perhaps for this particular job it is so, especially in peacetime and until a high level is reached, more than that I would say it is a liability.
I prefer JFA's reinterpretation:
Let me reinterpret the findings: among a specific population of high achieving people, differences in motivation explain more than intelligence. I assume you would find the same thing among top-performing students. You have a truncated sample that's been selected on the outcome of interest (i.e. leadership). That's interesting as far as it goes, but this study seems plagued by survivorship bias if you actually want to learn about what variables predict what kind of people make it into top leadership.
Why not try out leogao's survey yourself to corroborate/falsify your priors?
That's the objective, not the strategy, which is explained in the rest of that writeup.
One frustrating conversation was about persuasion. Somehow there continue to be some people who can at least somewhat feel the AGI, but also genuinely think humans are at or close to the persuasion possibilities frontier – that there is no room to greatly expand one’s ability to convince people of things, or at least of things against their interests.
This is sufficiently absurd to me that I don’t really know where to start, which is one way humans are bad at persuasion. Obviously, to me, if you started with imitations of the best human persuaders (since we have an existence proof for that), and on top of that could correctly observe and interpret all the detailed signals, have limitless time to think, a repository of knowledge, the chance to do Monte Carlo tree search of the conversation against simulated humans, never make a stupid or emotional tactical decision, and so on, you’d be a persuasion monster. It’s a valid question ‘where on the tech tree’ that shows up how much versus other capabilities, but it has to be there. But my attempts to argue this proved, ironically, highly unpersuasive.
Scott tried out an intuition pump in responding to nostalgebraist's skepticism:
Nostalgebraist: ... it’s not at all clear that it is possible to be any better at cult-creation than the best historical cult leaders — to create, for instance, a sort of “super-cult” that would be attractive even to people who are normally very disinclined to join cults. (Insert your preferred Less Wrong joke here.) I could imagine an AI becoming L. Ron Hubbard, but I’m skeptical that an AI could become a super-Hubbard who would convince us all to become its devotees, even if it wanted to.
Scott: A couple of disagreements. First of all, I feel like the burden of proof should be heavily upon somebody who thinks that something stops at the most extreme level observed. Socrates might have theorized that it’s impossible for it to get colder than about 40 F, since that’s probably as low as it ever gets outside in Athens. But when we found the real absolute zero, it was with careful experimentation and theoretical grounding that gave us a good reason to place it at that point. While I agree it’s possible that the best manipulator we know is also the hard upper limit for manipulation ability, I haven’t seen any evidence for that so I default to thinking it’s false.
(lots of fantasy and science fiction does a good job intuition-pumping what a super-manipulator might look like; I especially recommend R. Scott Bakker’s Prince Of Nothing)
But more important, I disagree that L. Ron Hubbard is our upper limit for how successful a cult leader can get. L. Ron Hubbard might be the upper limit for how successful a cult leader can get before we stop calling them a cult leader.
The level above L. Ron Hubbard is Hitler. It’s difficult to overestimate how sudden and surprising Hitler’s rise was. Here was a working-class guy, not especially rich or smart or attractive, rejected from art school, and he went from nothing to dictator of one of the greatest countries in the world in about ten years. If you look into the stories, they’re really creepy. When Hitler joined, the party that would later become the Nazis had a grand total of fifty-five members, and was taken about as seriously as modern Americans take Stormfront. There are records of conversations from Nazi leaders when Hitler joined the party, saying things like “Oh my God, we need to promote this new guy, everybody he talks to starts agreeing with whatever he says, it’s the creepiest thing.” There are stories of people who hated Hitler going to a speech or two just to see what all the fuss was about and ending up pledging their lives to the Nazi cause. Even while he was killing millions and trapping the country in a difficult two-front war, he had what historians estimate as a 90% approval rating among his own people and rampant speculation that he was the Messiah. Yeah, sure, there was lots of preexisting racism and discontent he took advantage of, but there’s been lots of racism and discontent everywhere forever, and there’s only been one Hitler. If he’d been a little bit smarter or more willing to listen to generals who were, he would have had a pretty good shot at conquering the world. 100% with social skills.
The level above Hitler is Mohammed. I’m not saying he was evil or manipulative, just that he was a genius’ genius at creating movements. Again, he wasn’t born rich or powerful, and he wasn’t particularly scholarly. He was a random merchant. He didn’t even get the luxury of joining a group of fifty-five people. He started by converting his own family to Islam, then his friends, got kicked out of his city, converted another city and then came back at the head of an army. By the time of his death at age 62, he had conquered Arabia and was its unquestioned, God-chosen leader. By what would have been his eightieth birthday his followers were in control of the entire Middle East and good chunks of Africa. Fifteen hundred years later, one fifth of the world population still thinks of him as the most perfect human being ever to exist and makes a decent stab at trying to conform to his desires and opinions in all things.
The level above Mohammed is the one we should be worried about.
I like it too, although there are 500+ fiction posts on LW (not including the subreddit), so you probably meant something else.
What about just not pursuing a PhD and instead doing what OP did? With the PhD you potentially lose #1 in
I actually think that you can get great results doing research as a hobby because
- it gives you loads of slack, which is freedom to do things without constraints. In this context, I think slack is valuable because it allows you to research things outside of the publishing mainstream.
- and less pressure.
I think these two things are crucial for success. The slack allows you to look at risky and niche ideas that are more likely to yield better research rewards if they are true, since surprising results will trigger further questions.
Also, since you are more likely to do better at topics you enjoy, getting money from a day job allows you to actually pursue your interests or deviate from your supervisor’s wishes. Conversely, it also allows you to give up when you’re not enjoying something.
which is where much of the impact comes from, especially if you subscribe to a multiplicative view of impact.
Wikipedia says it's a SaaS company "specializing in AI-powered document processing and automation, data capture, process mining and OCR": https://en.wikipedia.org/wiki/ABBYY
To be clear, GiveWell won’t be shocked by anything I’ve said so far. They’ve commissioned work and published reports on this. But as you might expect, these quality of life adjustments wouldn’t feature in GiveWell’s calculations anyway, since the pitch to donors is about the price paid for a life, or a DALY.
Can you clarify what you mean by these quality of life adjustments not featuring in GiveWell's calculations?
To be more concrete, let's take their CEA of HKI's vitamin A supplementation (VAS) program in Burkina Faso. They estimate that a $1M grant would avert 553 under-5 deaths (~80% of total program benefit) and incrementally increase future income for the ~560,000 additional children receiving VAS (~20%). (These figures vary considerably by location, by the way, from 60 deaths averted in Anambra, Nigeria to 1,475 deaths averted in Niger.) They then convert this to 81,811 income-doubling equivalents (their altruistic common denominator; they don't use DALYs in any of their CEAs, so I'm always befuddled when people claim they do), make a lot of leverage- and funging-related adjustments which reduce this to 75,272 income doublings, and compare it with the 3,355 income doublings they estimate would be generated by donating that $1M to GiveDirectly, to get their 22.4x cash multiplier for HKI VAS in Burkina Faso.
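As a quick sanity check on the arithmetic above, here is a minimal sketch that recomputes the cash multiplier from the figures quoted in this comment. The variable and function names are my own, not GiveWell's, and the actual CEA spreadsheet has many more intermediate steps than this.

```python
# Figures as quoted in the comment above (per $1M grant, HKI VAS in Burkina Faso).
# Names are hypothetical labels for illustration, not GiveWell's own terminology.
doublings_before_adjustments = 81_811   # income-doubling equivalents before adjustments
doublings_after_adjustments = 75_272    # after leverage- and funging-related adjustments
givedirectly_doublings = 3_355          # income doublings from $1M to GiveDirectly


def cash_multiplier(program_units: float, cash_units: float) -> float:
    """Ratio of program benefit to cash-transfer benefit for the same grant size."""
    return program_units / cash_units


# Net effect of all the adjustments, expressed as a single factor.
adjustment_factor = doublings_after_adjustments / doublings_before_adjustments
multiplier = cash_multiplier(doublings_after_adjustments, givedirectly_doublings)

print(f"net adjustment factor: {adjustment_factor:.3f}")
print(f"cash multiplier: {multiplier:.1f}x")
```

Note that nothing in this chain is denominated in DALYs: both sides of the final ratio are in income-doubling equivalents.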
So: are you saying that GiveWell should add a "QoL discount" when converting lives saved and income increase, like what Happier Lives Institute suggests for non-Epicurean accounts of badness of death?
I agree; Eichmann in Jerusalem and immoral mazes come to mind.
the obvious thing to happen is that nvidia realizes it can just build AI itself. if Taiwan is Dune, GPUs are the spice, then nvidia is house Atreides
You mention in another comment that your kid reads the encyclopaedia for fun, in which case I don't think The Martian would be too complex, no?
I'm also reminded of how I started perusing the encyclopaedia for fun at age 7. At first I understood basically nothing (English isn't my native language), but I really liked certain pictures and diagrams and kept going back to them wanting to learn more, realising that I'd comprehend say 20% more each time, which taught me to chase exponential growth in comprehension. Might be worth teaching that habit.
That's fair.
Society seems to think pretty highly of arithmetic. It’s one of the first things we learn as children. So I think it’s weird that only a tiny percentage of people seem to know how to actually use arithmetic. Or maybe even understand what arithmetic is for.
I was a bit thrown off by the seeming mismatch between the title ("underrated") and this introduction ("rated highly, but not used or understood as well as dynomight prefers").
The explanation seems straightforward: arithmetic at the fluency you display in the post is not easy, even with training. If you only spend time with STEM-y folks you might not notice, because they're a very numerate bunch. I'd guess I'm about average w.r.t. STEM-y folks and worse than you are, but I do quite a bit of spreadsheet-modeling for work, and I have plenty of bright hardworking colleagues who can't quite do the same at my level even though they want to, which suggests not underratedness but difficulty.
(To be clear I enjoy the post, and am a fan of your blog. :) )
This is great, thank you Sable. Some thoughts:
- Percentage seems logarithmic, like the Richter scale
- I wish there was a tabulated / spreadsheet version with the columns 'percentage', 'description', 'emotionally', 'cognitively', 'personally' or something (might create this myself based on your main text)
- Your "when depressed, I am not (in reality) the person I see myself as. I have a self-image, a conception of what I’m supposed to be like, and my depressed self isn’t it" resonates with how I felt when I was depressed. That said, in my case my self-image was quite grounded in reality (the conception was of "where I was supposed to be in life" with externally observable correlates others could verify for themselves), improving my life condition over the years was key to me reducing my depression, and this matches some (but to my chagrin not all) of the depressed people I have tried to help
Misses my point, but never mind.
Counterpoint: the likes of Kelsey Piper and Dylan Matthews are great. Their business does seem to be to help us understand and better the world.
Setting low prices might mean the few gallons of gas, bottles of water, or flights that are available are allocated to people who get to them first, or who can wait in line the longest, rather than based on who is willing to pay, but it’s not clear that these allocations are more egalitarian.
If that last sentence can be rephrased as "the counterfactual benefit is unclear", then given that the counterfactual cost is clear (beneficiaries need to pay more, potentially a lot more) doesn't this show that price gouging is probably net negative i.e. aren't you arguing against, not for, it here?
(Genuine question, not a gotcha. I'm admittedly biased because I dislike the idea of profiting off others' misery, suspect that purchasing power is moderately anticorrelated with need via wealth/income i.e. fewer DALYs are averted per person helped this way, and find the bounty reframing uncompelling, but I'm willing to change my mind. Another personal bias: I work in the market-shaping for global health space, which is about redressing market failures due to e.g. mismatch between need and ability to pay for it, and I suspect a similar dynamic is at play after hurricanes have devastated an area.)
(Unrelated to the post: I'm glad you're back, ADS. I wasn't the only one who wondered)
Seems way underestimated. While I don't think he's at "the largest supergeniuses" level either, even +3 SD implies just top 1 in ~700 i.e. millions of Eliezer-level people worldwide. I've been part of more quantitatively-selected groups talent-wise (e.g. for national scholarships awarded on academic merit) and I've never met anyone like him.
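For anyone who wants to check the "+3 SD" figure, here is a rough sketch. It assumes the trait is normally distributed and uses a round 8 billion for world population; both are my assumptions, not claims from the thread.

```python
import math


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


WORLD_POPULATION = 8e9  # assumption: rough round figure

p = upper_tail(3.0)                 # fraction of people above +3 SD
one_in = 1 / p                      # "top 1 in N"
people = WORLD_POPULATION * p       # implied head count worldwide

print(f"top 1 in {one_in:.0f}")
print(f"about {people / 1e6:.1f} million people worldwide")
```

Under these assumptions the rarity comes out at roughly 1 in 740, which implies on the order of ten million people worldwide.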
Similar points apply to adding too many unnecessary links. Specifically links where it isn't clear where they lead and what point is made in the link target, as in the previous sentence.
This used to be a recurring failure mode of my own writing, which I've since partially mitigated. Reflecting on why, I think I wanted to do some combination of
- justifying contentious or surprising claims
- preventing being pattern-matched to straw versions of ideas / arguments commonly referenced and adjacent in concept-space
- finding excuses to share cool reads
- making sense of links I'd read by relating them, using publication as a focusing mechanism
I didn't notice the cost of overdoing it until I saw writers who did it worse, and became horrified at the thought that I was slowly becoming them.
(Gwern links a lot but it doesn't feel "worse", on the contrary I enjoy his writing, so "worseness" is as much about adding more value to the reader than the cost of disrupting their flow as it is about volume. His approach is also far more thought-out of course.)
Thanks for writing this. I only wish it was longer.
Ha, that's awesome. Thanks for including the screenshot in yours :) Scott's "invisible fence" argument was the main one I thought of actually.
You might be interested in Scott Aaronson's thoughts on this in section 4: Why Is Proving P != NP Difficult?, which is only 2 pages.
Yeah pretty much. In more detail:
Bezos explained why he chose to only sell books on his website — at least, at first — in a “lost” video interview recorded at a Special Libraries Association conference in June 1997, which resurfaced in 2019 when it was posted online by entrepreneur Brian Roemmele.
Out of all the different products you might be able to sell online, books offered an “incredibly unusual benefit” that set them apart, Bezos said.
“There are more items in the book category than there are items in any other category, by far,” said Bezos. “Music is No. 2 — there are about 200,000 active music CDs at any given time. But in the book space, there are over 3 million different books worldwide active in print at any given time across all languages, [and] more than 1.5 million in English alone.”
When Bezos launched Amazon in 1994, the internet and e-commerce industry were still in their earliest stages. He knew it would take some time before online shopping became ubiquitous, he said, so he wanted to start with a concept that couldn’t be replicated by a seller with only physical locations.
“When you have that many items, you can literally build a store online that couldn’t exist any other way,” he explained. “That’s important right now, because the web is still an infant technology. Basically, right now, if you can do things using a more traditional method, you probably should do them using the traditional method.”
Still, Bezos hinted at the company’s potential for expansion, noting that “we’re moving forward in so many different areas.”
“This is Day 1,” he added. “This is the very beginning. This is the Kittyhawk stage of electronic commerce.”
By this assessment, who in real life do you think has proven above-average intelligence?
you presumably also think that teleportation would only create copies while destroying the originals. You might then be hesitant to use teleportation.
As an aside, Holden's view of identity makes him unconcerned about this question, and I've gradually come round to it as well.
Is 3.1 points small? Well, a 100 IQ is higher than that of 50% of the population, while a 103.1 IQ is higher than 58%. Adding 3.1 IQ points to a kid ranked 13th in a 25-person class would push them up to around 11th. And, personally, if you were going to drop my IQ by 3.1 points, I would not be super stoked about it.
And remember, 3.1 points is still just the impact of a modest increase in breastfeeding intensity. If you ran a trial that compared no breastfeeding to exclusive breastfeeding for 12 months, the impact would surely have been much larger.
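The percentile and class-rank figures in the quoted passage can be checked with a few lines of Python. This is only a rough sketch under assumptions of mine, not the quoted author's method: IQ is modelled as normal with mean 100 and SD 15, the class is treated as a random draw from the general population, and the helper names `percentile` and `shifted_rank` are hypothetical.

```python
from statistics import NormalDist

# Assumption: IQ ~ Normal(mean=100, sd=15), the usual scoring convention.
iq = NormalDist(mu=100, sigma=15)


def percentile(score: float) -> float:
    """Fraction of the population scoring below `score`."""
    return iq.cdf(score)


def shifted_rank(rank: int, class_size: int, gain: float) -> float:
    """Approximate new rank (1 = top) after adding `gain` IQ points.

    Assumption: the class mirrors the general population, so a student's
    rank maps to the midpoint percentile of their slot in the ordering.
    """
    below = 1 - (rank - 0.5) / class_size       # share of class ranked lower
    new_below = iq.cdf(iq.inv_cdf(below) + gain)
    return class_size * (1 - new_below) + 0.5


print(f"{percentile(100):.0%} -> {percentile(103.1):.0%}")
print(f"13th of 25 -> about rank {shifted_rank(13, 25, 3.1):.1f}")
```

Under these assumptions the numbers come out close to the quoted ones: roughly the 58th percentile, and a rank of about 11 in a class of 25.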
For context, in high-income countries lead poisoning is estimated to have lowered IQ by a comparable amount (the paper doesn't explicitly state the IQ drop, but does say that the mean blood lead level in HICs is 1.3 μg/dL and provides the chart below), and lead poisoning is taken pretty seriously.
ARIA feels like it has the same vibe (I might be wrong); I found out about it from davidad's bio (he's a PD).
I thought this review was fine: https://recordcrash.substack.com/p/mad-investor-chaos-woman-asmodeus
How would you falsify this model?
I have a similar experience. Do you know of any LLMs that are less agreeable in a useful way?
Ben Pace
Can you say in slightly more detail how you think the preference synthesizer thing is supposed to work?
zhukeepa
Well, yeah. An idealized version would be like a magic box that's able to take in a bunch of people with conflicting preferences about how they ought to coordinate (for example, how they should govern their society), figure out a synthesis of their preferences, and communicate this synthesis to each person in a way that's agreeable to them.
...
Ben Pace
Okay. So, you want a preference synthesizer, or like a policy-outputter that everyone's down for?
zhukeepa
Yes, with a few caveats, one being that I think preference synthesis is going to be a process that unfolds over time, just like truth-seeking dialogue that bridges different worldviews.
...
zhukeepa
Yeah. I think the thing I'm wanting to say right now is a potentially very relevant detail in my conception of the preference synthesis process, which is that to the extent that individual people in there have deep blind spots that lead them to pursue things that are at odds with the common good, this process would reveal those blind spots while also offering the chance to forgive them if you're willing to accept it and change.
I may be totally off, but whenever I read you (zhukeepa) elaborating on the preference synthesizer idea I kept thinking of democratic fine-tuning (paper: What are human values, and how do we align AI to them?), which felt like it had the same vibe. It's late at night here so I'll butcher their idea if I try to explain it, so instead I'll just dump a long quote and a bunch of pics and hope you find it at least tangentially relevant:
We report on the first run of “Democratic Fine-Tuning” (DFT), funded by OpenAI. DFT is a democratic process that surfaces the “wisest” moral intuitions of a large population, compiled into a structure we call the “moral graph”, which can be used for LLM alignment.
- We show bridging effects of our new democratic process. 500 participants were sampled to represent the US population. We focused on divisive topics, like how and if an LLM chatbot should respond in situations like when a user requests abortion advice. We found that Republicans and Democrats come to agreement on values it should use to respond, despite having different views about abortion itself.
- We present the first moral graph, generated by this sample of Americans, capturing agreement on LLM values despite diverse backgrounds.
- We present good news about their experience: 71% of participants said the process clarified their thinking, and 75% gained substantial respect for those across the political divide.
- Finally, we’ll say why moral graphs are better targets for alignment than constitutions or simple rules like HHH. We’ll suggest advantages of moral graphs in safety, scalability, oversight, interpretability, moral depth, and robustness to conflict and manipulation.
In addition to this report, we're releasing a visual explorer for the moral graph, and open data about our participants, their experience, and their contributions.
...
Our goal with DFT is to make one fine-tuned model that works for Republicans, for Democrats, and in general across ideological groups and across cultures; one model that people all around the world can all consider “wise”, because it's tuned by values we have broad consensus on. We hope this can help avoid a proliferation of models with different tunings and without morality, fighting to race to the bottom in marketing, politics, etc. For more on these motivations, read our introduction post.
To achieve this goal, we use two novel techniques: First, we align towards values rather than preferences, by using a chatbot to elicit what values the model should use when it responds, gathering these values from a large, diverse population. Second, we then combine these values into a “moral graph” to find which values are most broadly considered wise.
Example moral graph, which "charts out how much agreement there is that any one value is wiser than another":
Also, "people endorse the generated cards as representing their values—in fact, as representing what they care about even more than their prior responses. We paid for a representative sample of the US (age, sex, political affiliation) to go through the process, using Prolific. In this sample, we see a lot of convergence. As we report further down, people overwhelmingly felt well-represented with the cards, and say the process helped them clarify their thinking", which is why I paid attention to DFT at all:
Not a substantive response, just wanted to say that I really really like your comment for having so many detailed real-world examples.
Just to check, you're referring to these?
They have in fact been published (it's in your link), at least the ones authors agreed to make publicly available: these are all the case studies, and Moritz von Knebel's write-ups are