LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Why did ChatGPT say that? Prompt engineering and more, with PIZZA.
Jessica Rumbelow (jessica-cooper) · 2024-08-03T12:07:46.302Z · comments (2)

Long-Term Future Fund: May 2023 to March 2024 Payout recommendations
Linch · 2024-06-12T13:46:29.535Z · comments (0)

Manifund Q1 Retro: Learnings from impact certs
Austin Chen (austin-chen) · 2024-05-01T16:48:33.140Z · comments (1)

Californians, tell your reps to vote yes on SB 1047!
Holly_Elmore · 2024-08-12T19:50:09.817Z · comments (24)

[link] Conflict in Posthuman Literature
Martín Soto (martinsq) · 2024-04-06T22:26:04.051Z · comments (1)

Planning to build a cryptographic box with perfect secrecy
Lysandre Terrisse · 2023-12-31T09:31:47.941Z · comments (6)

[link] The Data Wall is Important
JustisMills · 2024-06-09T22:54:20.070Z · comments (20)

[link] Book review: Cuisine and Empire
eukaryote · 2024-01-21T06:15:12.969Z · comments (2)

Choosing My Quest (Part 2 of "The Sense Of Physical Necessity")
LoganStrohl (BrienneYudkowsky) · 2024-02-24T21:31:45.377Z · comments (7)

List your AI X-Risk cruxes!
Aryeh Englander (alenglander) · 2024-04-28T18:26:19.327Z · comments (7)

The Serendipity of Density
jefftk (jkaufman) · 2023-12-17T03:50:04.824Z · comments (4)

[link] Progress Conference 2024: Toward Abundant Futures
jasoncrawford · 2024-06-26T15:39:45.267Z · comments (2)

Neuroscience and Alignment
Garrett Baker (D0TheMath) · 2024-03-18T21:09:52.004Z · comments (25)

"Does your paradigm beget new, good, paradigms?"
Raemon · 2024-01-25T18:23:15.497Z · comments (6)

Monthly Roundup #23: October 2024
Zvi · 2024-10-16T13:50:05.869Z · comments (12)

[Linkpost] Play with SAEs on Llama 3
Tom McGrath · 2024-09-25T22:35:44.824Z · comments (2)

[question] Implications of China's recession on AGI development?
Eric Neyman (UnexpectedValues) · 2024-09-28T01:12:36.443Z · answers+comments (3)

2025 Color Trends
sarahconstantin · 2024-10-07T21:20:03.962Z · comments (7)

D&D Sci Coliseum: Arena of Data
aphyer · 2024-10-18T22:02:54.305Z · comments (22)

instruction tuning and autoregressive distribution shift
nostalgebraist · 2024-09-05T16:53:41.497Z · comments (5)

Intrinsic Drives and Extrinsic Misuse: Two Intertwined Risks of AI
jsteinhardt · 2023-10-31T05:10:02.581Z · comments (0)

Debate, Oracles, and Obfuscated Arguments
Jonah Brown-Cohen (jonah-brown-cohen) · 2024-06-20T23:14:57.340Z · comments (2)

Beware unfinished bridges
Adam Zerner (adamzerner) · 2024-05-12T09:29:07.808Z · comments (9)

[link] Forecasting: the way I think about it
Molly (hickman-santini) · 2024-05-09T00:49:01.768Z · comments (4)

Whiteboard Pen Magazines are Useful
Johannes C. Mayer (johannes-c-mayer) · 2024-07-12T17:15:33.200Z · comments (8)

[link] Dequantifying first-order theories
jessicata (jessica.liu.taylor) · 2024-04-23T19:04:49.000Z · comments (9)

Box inversion revisited
Jan_Kulveit · 2023-11-07T11:09:36.557Z · comments (3)

Technologies and Terminology: AI isn't Software, it's... Deepware?
Davidmanheim · 2024-02-13T13:37:10.364Z · comments (10)

Quantopian contest, but for food intake and weight
Lucent · 2023-11-08T05:41:35.050Z · comments (9)

[link] Progress links digest, 2023-11-24: Bottlenecks of aging, Starship launches, and much more
jasoncrawford · 2023-11-24T15:25:07.721Z · comments (1)

Jobs, Relationships, and Other Cults
Ruby · 2024-03-13T05:58:45.043Z · comments (9)

Applying Force to the Wrong End of a Causal Chain
silentbob · 2024-06-22T18:06:32.364Z · comments (0)

Scaling of AI training runs will slow down after GPT-5
Maxime Riché (maxime-riche) · 2024-04-26T16:05:59.957Z · comments (5)

[link] Queuing theory: Benefits of operating at 60% capacity
ampdot · 2023-12-01T18:48:01.426Z · comments (4)

What's up with all the non-Mormons? Weirdly specific universalities across LLMs
mwatkins · 2024-04-19T13:43:24.568Z · comments (13)

Ronny and Nate discuss what sorts of minds humanity is likely to find by Machine Learning
So8res · 2023-12-19T23:39:59.689Z · comments (30)

Movie posters
KatjaGrace · 2024-03-06T06:20:03.034Z · comments (0)

Extrapolating from Five Words
Gordon Seidoh Worley (gworley) · 2023-11-15T23:21:30.865Z · comments (11)

Individually incentivized safe Pareto improvements in open-source bargaining
Nicolas Macé (NicolasMace) · 2024-07-17T18:26:43.619Z · comments (2)

Simple distribution approximation: When sampled 100 times, can language models yield 80% A and 20% B?
Teun van der Weij (teun-van-der-weij) · 2024-01-29T00:24:27.706Z · comments (5)

Instrumental deception and manipulation in LLMs - a case study
Olli Järviniemi (jarviniemi) · 2024-02-24T02:07:01.769Z · comments (13)

[Interim research report] Evaluating the Goal-Directedness of Language Models
Rauno Arike (rauno-arike) · 2024-07-18T18:19:04.260Z · comments (4)

Prepsgiving, A Convergently Instrumental Human Practice
JenniferRM · 2023-11-23T17:24:56.784Z · comments (0)

Medical Roundup #3
Zvi · 2024-07-09T13:10:06.862Z · comments (4)

Apply to the PIBBSS Summer Research Fellowship
Nora_Ammann · 2024-01-12T04:06:58.328Z · comments (1)

Logical Line-Of-Sight Makes Games Sequential or Loopy
StrivingForLegibility · 2024-01-19T04:05:44.782Z · comments (0)

[link] Linear infra-Bayesian Bandits
Vanessa Kosoy (vanessa-kosoy) · 2024-05-10T06:41:09.206Z · comments (5)

How To Do Patching Fast
Joseph Miller (Josephm) · 2024-05-11T20:13:52.424Z · comments (6)

Anthropic rewrote its RSP
Zach Stein-Perlman · 2024-10-15T14:25:12.518Z · comments (19)

Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations with MDL-SAEs
Kola Ayonrinde (kola-ayonrinde) · 2024-08-23T18:52:31.019Z · comments (5)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

tailcalled on Alexander Gietelink Oldenziel's Shortform

For everyday life, flat earth is more convenient than round earth geocentrism, which in turn is more convenient than heliocentrism. Like we don't constantly change our city maps based on the time of year, for instance.

This is mainly because the sun and the earth are powerful enough to handle heliocentrism for you, e.g. the earth pulls you and the cities towards the earth so you don't have to put effort into staying on it.

The sun and the planetary motion does remain the most important governing factor for predicting activities on earth, though, even given this coordinate change. We just mix them together into ~epicyclic variables like "day"/"night" and "summer"/"autumn"/"winter"/"spring" rather than talking explicitly about the sun, the earth, and their relative positions.

tailcalled on Three Notions of "Power"

Can you explain what this coordination would look like?

khafra on Three Notions of "Power"

Your definition seems like it fits the Emperor of China example--by reputation, they had few competitors for being the most willing and able to pessimize another agent's utility function; e.g. 9 Familial Exterminations.
And that seems to be a key to understanding this type of power, because if they were able to pessimize all other agents' utility functions, that would just be an evil mirror of bargaining power. Being able to choose a sharply limited number of unfortunate agents, and punish them severely pour encourager les autres, seems like it might just stop working when the average agent is smart enough to implicitly coordinate around a shared understanding of payoff matrices.
So I think I might have arrived back to the "all dominance hierarchies will be populated solely by scheming viziers" conclusion.

fread2281 on Alexander Gietelink Oldenziel's Shortform

I guess this is sorta about your 3, which I disbelieve (though algorithms for tasks other than learning are also important). Currently, Bayesian inference vs SGD is a question of how much data you have (where SGD wins except for very little data). For small to medium amounts of data, even without AGI, I expect SGD to lose eventually due to better inference algorithms. For many problems I have the intuition that it's ~always possible to improve performance with more complicated algorithms (eg sat solvers). All that together makes me expect there to be inference algorithms that scale to very large amounts of data (that aren't going to be doing full Bayesian inference but rather some complicated approximation).

bolverk on I got dysentery so you don’t have to

Sequence 1 length:3 

Sequence 2 length:6 

Alignment length: 6 

Identity: 3/6 (50.00%) 

Similarity: 3/6 (50.00%) 

Gaps: 3/6 (50.00%)

---AGC
   |||
AGCAGC

Like this. Difference between lengths is considered non-matching.

https://en.vectorbuilder.com/tool/sequence-alignment.html

inquilinekea on What TMS is like

https://pmc.ncbi.nlm.nih.gov/articles/PMC8122027/

raemon on JargonBot Beta Test

I've reverted the part that automatically generates jargon for drafts until we've figured out a better overall solution.

meme-marine on The Compendium, A full argument about extinction risk from AGI

I am unsurprised but disappointed to read the same Catastrophe arguments rehashed here, based on an outdated Bostromian paradigm of AGI. This is the main section I disagree with.

The underlying principle beneath these hypothetical scenarios is grounded in what we can observe around us: powerful entities control weaker ones, and weaker ones can fight back only to the degree that the more powerful entity isn’t all that powerful after all.

I do not think this is obvious or true at all. Nation-States are often controlled by a small group of people or even a single person, no different physiologically to any other human being. If it really wanted to, there would be nothing at all stopping the US military from launching a coup on its civilian government; in fact, military coups are a commonplace global event. Yet, generally, most countries do not suffer constant coup attempts. We hold far fewer tools to "align" military leaders than we do AI models - we cannot control how generals were raised as children, cannot read their minds, cannot edit their minds.

I think you could also make a similar argument that big things control little things - with much more momentum and potential energy, we observe that large objects are dominant over small objects. Small objects can only push large objects to the extent that the large object is made of a material that is not very dense. Surely, then building vehicles substantially larger than people would result in uncontrollable runaways that would threaten human life and property! But in reality, runaway dump truck incidents are fairly uncommon. A tiny man can control a giant machine. Not all men can - only the one in the cockpit.

My point is that it is not at all obvious that a powerful AI would lack such a cockpit. If its goals are oriented around protecting or giving control to a set of individuals, I see no reason whatsoever why it would do a 180 and kill its commander, especially since the AI systems that we can build in practice are more than capable of understanding the nuances of their commands.

The odds of an average chess player with an ELO of 1200 against a grandmaster with ELO 2500 are 1 to a million. Against the best chess AI today with an ELO of 3600, the odds are essential 0.

Chess is a system that's perfectly predictable. Reality is a chaotic system. Chaotic systems - like a three-body orbital arrangement - are impossible to perfectly predict in all cases even if they're totally deterministic, because even minute inaccuracies in measurement can completely change the result. One example would be the edges of the Mandelbrot set. It's fractal. Therefore, even an extremely powerful AI would be beholden to certain probabilistic barriers, notwithstanding quantum-random factors.

Many assume that an AI is only dangerous if it has hostile intentions, but the danger of godlike AI is not a matter of its intent, but its power and autonomy. As these systems become increasingly agentic and powerful, they will pursue goals that will diverge from our own.

It would not be incorrect to describe someone who pursued their goals irrespective of its externalities to be malevolent. Bank robbers don't want to hurt people, they want money. Yet I don't think anyone would suggest that the North Hollywood shooters were "non-hostile but misaligned". I do not like this common snippet of rhetoric and I think it is dishonest. It attempts to distance these fears of misaligned AI from movie characters such as Skynet, but ultimately, this is the picture that is painted.

Goal divergence is a hallmark of the Bostromian paradigm - the idea that a misspecified utility function, optimized hypercompetently, would lead to disaster. Modern AI systems do not behave like this. They behave in a much more humanlike way. They do not have objective functions that they pursue doggedly. The Orthogonality Thesis states that intelligence is uncorrelated with objectives. The unstated connection here, I think, is that their initial goals must have been misaligned in the first place, but stated like this, it sounds a little like you expect a superintelligent AI to suddenly diverge from its instructions for no reason at all.

Overall, this is a very vague section. I think you would benefit from explaining some of the assumptions being made here.

I'm not going to go into detail on the Alignment section, but I think that many of its issues are similar to the ones listed above. I think that the arguments are not compelling enough for lay people, mostly because I don't think they're correct. I think that the definition of Alignment you have given - "the ability to “steer AI systems toward a person's or group's intended goals, preferences, and ethical principles.”" - does not match the treatment it is given. I think that it is obvious that the scope of Alignment is too vague, broad, and unverifiable for it to be a useful concept. I think that Richard Ngo's post:

https://www.lesswrong.com/posts/67fNBeHrjdrZZNDDK/defining-alignment-research [LW · GW]

is a good summary of the issues I see with the current idea of Alignment as it is often used in Rationalist circles and how it could be adapted to suit the world in which we find ourselves.

Finally, I think that the Governance section could very well be read uncharitably as a manifesto for world domination. Less than a dozen people attend PauseAI protests; you do not have the political ability to make this happen. The ideas contained in this document, which resemble many other documents, such as a similar one created by the PauseAI group, are not compelling enough to sway people who are not already believers in its ideas, and the Rationalist language used in them is anathemic to the largest ideological groups that would otherwise support your cause.

You may receive praise from Rationalist circles, but I do not think you will reach a large audience with this type of work. Leopold Aschenbrenner's essay managed to reach a fairly substantial audience, and it has similar themes to your document, so in principle, people are willing to read this sort of writing. The main flaw is that it doesn't add anything to the conversation, and because of that, it won't change anyone's minds. The reason that the public discourse doesn't involve Alignment talk isn't due to lack of awareness, it's because it isn't at all compelling to most people. Writing it better, with a nicer format, will not change this.

yair-halberstadt on Trading Candy

Counterpoint: when I was about 12, I was too old to collect candy at my Synagogue on Simchat Torah, so I would beg a single candy from someone, then trade it up (Dutch book style) with naive younger kids until I had a decent stash. I was particularly pleased whenever my traded up stash included the original sweet.

denkenberger on Is the Power Grid Sustainable?

It would be helpful to see a calculation with your rates, the installed cost of batteries, cost of the space taken up, losses in the batteries and convertor, any cost of maintenance, lifetime of batteries, and cost (or benefit) of disposal.