LessWrong 2.0 Reader

Dating Roundup #2: If At First You Don’t Succeed
Zvi · 2024-01-02T16:00:04.955Z · comments (29)

[link] Theories of Change for AI Auditing
Lee Sharkey (Lee_Sharkey) · 2023-11-13T19:33:43.928Z · comments (0)

Thiel on AI & Racing with China
Ben Pace (Benito) · 2024-08-20T03:19:18.966Z · comments (10)

Safe Stasis Fallacy
Davidmanheim · 2024-02-05T10:54:44.061Z · comments (2)

AI #44: Copyright Confrontation
Zvi · 2023-12-28T14:30:10.237Z · comments (13)

[Closed] PIBBSS is hiring in a variety of roles (alignment research and incubation program)
Nora_Ammann · 2024-04-09T08:12:59.241Z · comments (0)

Monthly Roundup #17: April 2024
Zvi · 2024-04-15T12:10:03.126Z · comments (4)

[link] Questions are usually too cheap
Nathan Young · 2024-05-11T13:00:54.302Z · comments (19)

Towards a formalization of the agent structure problem
Alex_Altair · 2024-04-29T20:28:15.190Z · comments (5)

Math-to-English Cheat Sheet
nahoj · 2024-04-08T09:19:40.814Z · comments (5)

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
leogao · 2023-12-16T05:39:10.558Z · comments (5)

[link] How Likely Are Various Precursors of Existential Risk?
NunoSempere (Radamantis) · 2024-10-28T13:27:31.620Z · comments (4)

[link] Open Phil releases RFPs on LLM Benchmarks and Forecasting
LawrenceC (LawChan) · 2023-11-11T03:01:09.526Z · comments (0)

Calendar feature geometry in GPT-2 layer 8 residual stream SAEs
Patrick Leask (patrickleask) · 2024-08-17T01:16:53.764Z · comments (0)

[question] Could orcas be (trained to be) smarter than humans? 
Towards_Keeperhood (Simon Skade) · 2024-11-04T23:29:26.677Z · answers+comments (8)

AI #50: The Most Dangerous Thing
Zvi · 2024-02-08T14:30:13.168Z · comments (4)

We are headed into an extreme compute overhang
devrandom · 2024-04-26T21:38:21.694Z · comments (33)

Fat Tails Discourage Compromise
niplav · 2024-06-17T09:39:16.489Z · comments (5)

Per protocol analysis as medical malpractice
braces · 2024-01-31T16:22:21.367Z · comments (8)

Trading off Lives
jefftk (jkaufman) · 2024-01-03T03:40:05.603Z · comments (12)

[link] S-Risks: Fates Worse Than Extinction
aggliu · 2024-05-04T15:30:36.666Z · comments (2)

[link] LLMs seem (relatively) safe
JustisMills · 2024-04-25T22:13:06.221Z · comments (24)

AI #40: A Vision from Vitalik
Zvi · 2023-11-30T17:30:08.350Z · comments (12)

[question] Can we get an AI to "do our alignment homework for us"?
Chris_Leong · 2024-02-26T07:56:22.320Z · answers+comments (33)

Zvi's Manifold Markets House Rules
Zvi · 2023-11-13T00:28:02.147Z · comments (6)

Causal Graphs of GPT-2-Small's Residual Stream
David Udell · 2024-07-09T22:06:55.775Z · comments (7)

[link] Breaking Circuit Breakers
mikes · 2024-07-14T18:57:20.251Z · comments (13)

AI #71: Farewell to Chevron
Zvi · 2024-07-04T13:40:05.905Z · comments (9)

2022 (and All Time) Posts by Pingback Count
Raemon · 2023-12-16T21:17:00.572Z · comments (14)

AI #76: Six Shorts Stories About OpenAI
Zvi · 2024-08-08T13:50:04.659Z · comments (10)

Be More Katja
Nathan Young · 2024-03-11T21:12:14.249Z · comments (0)

AI #37: Moving Too Fast
Zvi · 2023-11-09T17:50:04.324Z · comments (5)

Acting Wholesomely
owencb · 2024-02-26T21:49:16.526Z · comments (64)

A D&D.Sci Dodecalogue
abstractapplic · 2024-04-12T01:10:01.625Z · comments (0)

The case for stopping AI safety research
catubc (cat-1) · 2024-05-23T15:55:18.713Z · comments (38)

Schelling points in the AGI policy space
mesaoptimizer · 2024-06-26T13:19:25.186Z · comments (2)

Was Releasing Claude-3 Net-Negative?
Logan Riggs (elriggs) · 2024-03-27T17:41:56.245Z · comments (5)

Anthropical Paradoxes are Paradoxes of Probability Theory
Ape in the coat · 2023-12-06T08:16:26.846Z · comments (18)

Announcing the Double Crux Bot
sanyer (santeri-koivula) · 2024-01-09T18:54:15.361Z · comments (8)

Reflections on my first year of AI safety research
Jay Bailey · 2024-01-08T07:49:08.147Z · comments (3)

[link] The Mysterious Trump Buyers on Polymarket
Annapurna (jorge-velez) · 2024-10-18T13:26:25.565Z · comments (9)

AI #45: To Be Determined
Zvi · 2024-01-04T15:00:05.936Z · comments (4)

[link] OpenAI Staff (including Sutskever) Threaten to Quit Unless Board Resigns
Seth Herd · 2023-11-20T14:20:33.539Z · comments (28)

BatchTopK: A Simple Improvement for TopK-SAEs
Bart Bussmann (Stuckwork) · 2024-07-20T02:20:51.848Z · comments (0)

Can we build a better Public Doublecrux?
Raemon · 2024-05-11T19:21:53.326Z · comments (6)

Gradient Descent on the Human Brain
Jozdien · 2024-04-01T22:39:24.862Z · comments (5)

Two LessWrong speed friending experiments
mikko (morrel) · 2024-06-15T10:52:26.081Z · comments (3)

Parental Writing Selection Bias
jefftk (jkaufman) · 2024-10-13T14:00:03.225Z · comments (3)

Pseudonymity and Accusations
jefftk (jkaufman) · 2023-12-21T19:20:19.944Z · comments (20)

Recent comments

stephen-fowler on Daniel Kokotajlo's Shortform

My reply definitely missed that you were talking about tunnel densities beyond what has been historically seen.

I'm inclined to agree with your argument that there is a phase shift, but it seems to have less to do with the fact that there are tunnels, and more to do with the geography becoming less tunnel-like and more open.

I have a couple of thoughts on your model that aren't direct refutations of anything you've said here:

I think the single term "density" is too crude a measure to give a good predictive model of how combat would play out. I'd expect there to be many parameters that describe a tunnel system and have a direct tactical impact. From your discussion of mines, I think "density" refers to the number of edges in the network? I'd expect tunnel width, geometric layout, etc. to change how either side behaves.
I'm not sure about your background, but with zero hours of military combat under my belt, I doubt I can predict how modern subterranean combat plays out in tunnel systems with architectures that are beyond anything seen before in history.

l-rudolf-l on Survival without dignity

You have restored my faith in LessWrong! I was getting worried that despite 200+ karma and 20+ comments, no one had actually nitpicked the descriptions of what actually happens.

The zaps of light are diffraction limited.

In practice, if you want the atmospheric nanobots to zap stuff, you'll need to do some complicated mirroring because you need to divert sunlight. And it's not one contiguous mirror but lots of small ones. But I think we can still model this as basic diffraction with some circular mirror / lens.

Intensity $I = \frac{c_{e} E}{π r^{2}}$, where $E$ is the total power of sunlight falling on the mirror disk, $r$ is the radius of the Airy disk, and $c_{e}$ is an efficiency constant I've thrown in to cover things like atmospheric absorption (Claude says, somewhat surprisingly, that this shouldn't be ridiculously large) and the fact that not all the energy in the diffraction pattern is in the Airy disk (about 84% is, says Claude).

Now, $E = π {(\frac{D}{2})}^{2} L$, where $D$ is the diameter of the mirror configuration and $L$ is the solar irradiance. And $r = θ l$, where $l$ is the focal length (distance from mirror to target), and $θ \approx 1.22 λ / D$ is the angular size of the central spot.

So we have $I \approx \frac{c_{e} L D^{4}}{{1.22}^{2} \times 4 λ^{2} l^{2}}$, so the required mirror configuration diameter is $D = \sqrt[4]{\frac{{1.22}^{2} \times 4 I λ^{2} l^{2}}{c_{e} L}}$.

Plugging in some reasonable values like $λ \approx 5 \times 10^{- 7}$ m (average incoming sunlight - yes the concentration suffers a bit because it's not all this wavelength), $I = 10^{7}$ W/m^2 (the level of an industrial laser that can cut metal), $l = 10^{4}$ m (lower stratosphere), $L = 1361$ W/m^2 (solar irradiance), and a conservative guess that 99% of power is wasted so $c_{e} = 0.01$, we get $D \approx 18$ m (and the resulting beam is about 3mm wide).
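For anyone who wants to re-run this arithmetic, here is a minimal Python sketch of the formulas above. The function and parameter names are my own illustrative choices, not anything from the story or a library, and it assumes the same idealised single circular aperture and the same 1.22 Airy factor.

```python
import math

AIRY_FACTOR = 1.22  # coefficient in θ ≈ 1.22 λ / D for a circular aperture


def airy_radius(wavelength, focal_length, diameter):
    """Radius r = θ l of the central spot, with θ ≈ 1.22 λ / D."""
    return AIRY_FACTOR * wavelength / diameter * focal_length


def spot_intensity(diameter, wavelength, focal_length, irradiance, efficiency):
    """I = c_e * E / (π r^2), with collected power E = π (D/2)^2 L."""
    collected_power = math.pi * (diameter / 2) ** 2 * irradiance
    r = airy_radius(wavelength, focal_length, diameter)
    return efficiency * collected_power / (math.pi * r ** 2)


def mirror_diameter(intensity, wavelength, focal_length, irradiance, efficiency):
    """Invert the intensity formula for D (a fourth root)."""
    numerator = AIRY_FACTOR ** 2 * 4 * intensity * wavelength ** 2 * focal_length ** 2
    return (numerator / (efficiency * irradiance)) ** 0.25


if __name__ == "__main__":
    # Hypothetical parameter set mirroring the values quoted in the comment.
    params = dict(wavelength=5e-7, focal_length=1e4, irradiance=1361.0, efficiency=0.01)
    for target in (1e7, 1e10):  # target intensities in W/m^2
        d = mirror_diameter(target, **params)
        r = airy_radius(params["wavelength"], params["focal_length"], d)
        print(f"I = {target:.0e} W/m^2 -> D = {d:.1f} m, spot width = {2 * r * 1e3:.2f} mm")
```

With the inputs exactly as listed, this sketch returns a diameter of about 3.2 m with a central spot a few millimetres across; a target intensity of $10^{10}$ W/m^2 gives about 18 m. Because $D$ scales as the fourth root of $I / c_{e}$, even large changes in those guesses move the answer only modestly.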

So a few dozen metres of upper atmosphere nanobots should actually give you a pretty ridiculous concentration of power!

(I did not know this when I wrote the story; I am quite surprised the required diameter is this ridiculously tiny. But I had heard of the concept of a "weather machine" like this from the book Where is my flying car?, which I've reviewed here, which suggests that this is possible.)

Partly because it's hard to tell between an actual animal and a bunch of nanobots pretending to be an animal. So you can't zap the nanobots on the ground without making the ground uninhabitable for humans.

I don't really buy this. Why is it obvious the nanobots could pretend to be an animal so well that it's indistinguishable? And why would targeted zaps have bad side-effects?

The "California red tape" thing implies some alignment strategy that stuck the AI to obey the law, and didn't go too insanely wrong despite a superintelligence looking for loopholes

Yeah, successful alignment to legal compliance was established without any real justification halfway through. (How to do this is currently an open technical problem, which, alas, I did not manage to solve for my satirical short story.)

Convince humans that dyson sphere are pretty and don't block the view?

This is a good point, especially since high levels of emotional manipulation was an established in-universe AI capability. (The issue described with the Dyson sphere was less that it itself would block the view, and more that building it would require dismantling the planets in a way that ruins the view - though now I'm realising that "if the sun on Earth is blocked, all Earthly views are gone" is a simpler reason and removes the need for building anything on the other planets at all.)

There is also no clear explanation of why someone somewhere doesn't make a non-red-taped AI.

Yep, this is a plot hole.

cstinesublime on Oxidize's Shortform

I can't speak for the community but after having glanced at your entire post I can't be sure just what it is about. The closest you come to explaining it is near the end you promise to present a "high-level theory on the functional realities" that seem to be related to everything from increased military spending to someone accidentally creating a virus in the lab that wipes out humanity to combating cognitive bias. But what is your theory?

Your post also makes a number of generalized assumptions about the reader and human nature and invokes the pronoun "we" far too many times. I'm a hypocrite for pointing that out, because I tend to do it as well - but the problem is that unless you have a very narrow audience in mind, especially a community that you are native to and know intimately, you run the risk of making assumptions or statements that readers will at best be confused by, and at worst get defensive about being included in.

Most of your assumptions aren't backed up by specific examples or citations to research. For example, in your first sentence you say that we subconsciously optimize for there being no major societal changes precipitated by technology. You don't back this up. I would argue that the very existence of gold bugs shows there is a huge contingent of people who invest real money precisely because they can't anticipate what major economic changes future technologies might bring. There are currently billions of dollars being spent by firms like Apple, Google, even JP Morgan Chase on A.I. assistants, in anticipation of a major change.

I could go through all these general assumptions one by one, but there are too many for it to be worth my while. Not only that, most of the footnotes you use don't make reference to any concepts or observations which are particularly new or alien. The Pareto principle, Compound Effect, Rumsfeld's Epistemology... I would expect your average LessWrong reader to be very familiar with these; they present no new insights.

danwil on The Median Researcher Problem

If an outsider's objective is to be taken seriously, they should write papers and submit them to peer review (e.g. conferences and journals).

Yann LeCun has gone so far as to say that independent work only counts as "science" if submitted to peer review:

"Without peer review and reproducibility, chances are your methodology was flawed and you fooled yourself into thinking you did something great." - https://x.com/ylecun/status/1795589846771147018?s=19.

From my experience, professors are very open to discussing ideas and their work with anyone who seems serious, interested, and knowledgeable. Even someone inside academia will face skepticism if their work uses completely different methods. They will have to prove very convincingly that the methods are valid.

cstinesublime on Johannes C. Mayer's Shortform

I'm missing a key piece of context here - when you say "doing something good" are you referring to educational or research reading; or do you mean any type of personal project which may or may not involve background research?

I may have some practical observations about note-taking which may be relevant, if I understand the context.

startattheend on Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?

That sounds about right. And "people sometimes feel that way" is a good explanation for the downvote in my opinion. I was arguing the object-level premises of the post because the "disagree" downvote was factually wrong, and this factual wrongness, I argue, is caused by a faulty understanding of how truth works, and this faulty understanding is most common in the western world and in educated people, and in the ideologies which correlate with western thought and academia.

If you disagree with something which is true, I think the only likely explanations are "Does not understand" and "Has a dislike of", and the bias I pointed out covers both of these possibilities (the former is a "map vs territory" issue and the latter is a "morality vs reality" issue).

I think you figured out what went wrong nicely, but in the end the disagreement remains. I still consider my point likely. If somebody comes along and tells me that they disagreed with it for other reasons, I might even argue that they're lying to themselves, as I'm way too disillusioned to think that a "will to truth" exists. I think social status, moral values and other such things are stronger motivators than people will admit even to themselves.

rob-lucas on Bigger Livers?

One reason is just that eating food is enjoyable. I limit the amount of food I eat to stay within a healthy range, but if I could increase that amount while staying healthy, I could enjoy that excess.

I think there are two aspects to the enjoyment of food. One is related to satiety. I enjoy the feeling of sating my appetite, and failing to sate it leaves me with the negative experience of craving food (negative if I don't satisfy those cravings).

But the other aspect is just the enjoyment of eating each individual bite of food. Not the separate enjoyment of sating my appetite, but just the experience of eating.*

When I was younger and much more physically active I ate very large amounts of food. I miss being able to do that. I'm just as sated now with the much smaller portions I eat, but eating a small breakfast instead of a large one is a different experience.

This probably doesn't justify some sort of risky intervention to increase liver size. Food is enjoyable, but so are a lot of other things in life. But shifting to a higher protein diet seems like the kind of safe intervention, potentially even healthier in other respects, that, if it has the side effect of letting you eat a little more food, could improve quality of life with minimal other costs. Potential costs I see are related to the price of protein relative to other sources of nutrition, the cost of additional food (if the point is being able to eat more, you've got to spend money for that excess), and, depending on one's moral views, something related to the source of the protein being added.

*I think Kahneman's remembering vs. experiencing selves adds some confusion here as well. When we remember a meal we don't necessarily remember the enjoyment we got from every bite, but probably put more weight on the feeling of satiety and the peak experience (how good did it taste at its best?). But the experiencing self experiences every bite. How much you want to weight the remembering vs. experiencing self is a philosophical issue, but I just want to note that it comes up here.

lukehmiles on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

I wonder if anybody has tried to quantify how much it's worth to be a swing voter. I imagine if you are the government contractor up for renewal then it's worth quite a lot, but I wonder how much of the money/benefits the average Joe sees.

I don't know much about swing state benefits except that Milwaukee, Wisconsin got their lead pipes replaced by the fed, the workers were required to be local, and they say they were paid quite well: https://youtube.com/watch?v=4VpwgG0P8VU

lukehmiles on The hostile telepaths problem

Aw man we used the same word for different things again

lukehmiles on The hostile telepaths problem

Your examples fit the definition quite well. Apparently this is in the dictionary now. https://www.merriam-webster.com/dictionary/gaslighting