LessWrong 2.0 Reader

[link] Information dark matter
Logan Kieller (logan-kieller) · 2024-10-01T15:05:41.159Z · comments (4)

The slingshot helps with learning
Wilson Wu (wilson-wu) · 2024-10-31T23:18:16.762Z · comments (0)

Proveably Safe Self Driving Cars [Modulo Assumptions]
Davidmanheim · 2024-09-15T13:58:19.472Z · comments (26)

[question] When is reward ever the optimization target?
Noosphere89 (sharmake-farah) · 2024-10-15T15:09:20.912Z · answers+comments (12)

Empathy/Systemizing Quotient is a poor/biased model for the autism/sex link
tailcalled · 2024-11-04T21:11:57.788Z · comments (0)

Intent alignment as a stepping-stone to value alignment
Seth Herd · 2024-11-05T20:43:24.950Z · comments (4)

[link] NAO Updates, Fall 2024
jefftk (jkaufman) · 2024-10-18T00:00:04.142Z · comments (2)

[link] Concrete benefits of making predictions
Jonny Spicer (jonnyspicer) · 2024-10-17T14:23:17.613Z · comments (5)

RLHF is the worst possible thing done when facing the alignment problem
tailcalled · 2024-09-19T18:56:27.676Z · comments (10)

Housing Roundup #10
Zvi · 2024-10-29T13:50:09.416Z · comments (2)

[question] How should vegans think about Methionine needs?
ChristianKl · 2024-11-10T09:28:47.655Z · answers+comments (1)

Bay Winter Solstice 2024: Speech Auditions
ozymandias · 2024-11-04T22:31:38.680Z · comments (0)

DunCon @Lighthaven
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2024-09-29T04:56:27.205Z · comments (0)

An argument that consequentialism is incomplete
cousin_it · 2024-10-07T09:45:12.754Z · comments (27)

AI #90: The Wall
Zvi · 2024-11-14T14:10:04.562Z · comments (6)

A path to human autonomy
Nathan Helm-Burger (nathan-helm-burger) · 2024-10-29T03:02:42.475Z · comments (12)

5 ways to improve CoT faithfulness
CBiddulph (caleb-biddulph) · 2024-10-05T20:17:12.637Z · comments (39)

SAE Probing: What is it good for? Absolutely something!
Subhash Kantamneni (subhashk) · 2024-11-01T19:23:55.418Z · comments (0)

[link] Stone Age Herbalist's notes on ant warfare and slavery
trevor (TrevorWiesinger) · 2024-11-09T02:40:01.128Z · comments (0)

Incentive design and capability elicitation
Joe Carlsmith (joekc) · 2024-11-12T20:56:05.088Z · comments (0)

Apply to MATS 7.0!
Ryan Kidd (ryankidd44) · 2024-09-21T00:23:49.778Z · comments (0)

Book Review: What Even Is Gender?
Joey Marcellino · 2024-09-01T16:09:27.773Z · comments (14)

[link] Epistemic states as a potential benign prior
Tamsin Leake (carado-1) · 2024-08-31T18:26:14.093Z · comments (2)

Meme Talking Points
ymeskhout · 2024-11-06T15:27:54.024Z · comments (0)

Resolving von Neumann-Morgenstern Inconsistent Preferences
niplav · 2024-10-22T11:45:20.915Z · comments (5)

[question] What's the Deal with Logical Uncertainty?
Ape in the coat · 2024-09-16T08:11:43.588Z · answers+comments (23)

Context-dependent consequentialism
Jeremy Gillen (jeremy-gillen) · 2024-11-04T09:29:24.310Z · comments (6)

Balancing Label Quantity and Quality for Scalable Elicitation
Alex Mallen (alex-mallen) · 2024-10-24T16:49:00.939Z · comments (1)

[link] What is it like to be psychologically healthy? Podcast ft. DaystarEld
Chipmonk · 2024-10-05T19:14:04.743Z · comments (8)

Fun With CellxGene
sarahconstantin · 2024-09-06T22:00:03.461Z · comments (2)

[link] Safety tax functions
owencb · 2024-10-20T14:08:38.099Z · comments (0)

AI #85: AI Wins the Nobel Prize
Zvi · 2024-10-10T13:40:07.286Z · comments (6)

AIS terminology proposal: standardize terms for probability ranges
eggsyntax · 2024-08-30T15:43:39.857Z · comments (12)

[link] Liquid vs Illiquid Careers
vaishnav92 · 2024-10-20T23:03:49.725Z · comments (7)

[link] [Paper] Hidden in Plain Text: Emergence and Mitigation of Steganographic Collusion in LLMs
Yohan Mathew (ymath) · 2024-09-25T14:52:48.263Z · comments (2)

[link] Why Recursion Pharmaceuticals abandoned cell painting for brightfield imaging
Abhishaike Mahajan (abhishaike-mahajan) · 2024-11-05T14:51:41.310Z · comments (1)

[link] My Methodological Turn
adamShimi · 2024-09-29T15:01:45.986Z · comments (0)

[question] Where to find reliable reviews of AI products?
Elizabeth (pktechgirl) · 2024-09-17T23:48:25.899Z · answers+comments (6)

Compute and size limits on AI are the actual danger
Shmi (shminux) · 2024-11-23T21:29:37.433Z · comments (5)

Examples of How I Use LLMs
jefftk (jkaufman) · 2024-10-14T17:10:04.597Z · comments (2)

[link] AI forecasting bots incoming
Dan H (dan-hendrycks) · 2024-09-09T19:14:31.050Z · comments (44)

[link] Arithmetic Models: Better Than You Think
kqr · 2024-10-26T09:42:07.185Z · comments (5)

[link] AI Safety at the Frontier: Paper Highlights, August '24
gasteigerjo · 2024-09-03T19:17:24.850Z · comments (0)

Winning isn't enough
JesseClifton · 2024-11-05T11:37:39.486Z · comments (14)

[link] A new process for mapping discussions
Nathan Young · 2024-09-30T08:57:20.029Z · comments (7)

Towards Quantitative AI Risk Management
Henry Papadatos (henry) · 2024-10-16T19:26:48.817Z · comments (1)

Trading Candy
jefftk (jkaufman) · 2024-11-01T01:10:08.024Z · comments (4)

[link] If-Then Commitments for AI Risk Reduction [by Holden Karnofsky]
habryka (habryka4) · 2024-09-13T19:38:53.194Z · comments (0)

[link] Our Digital and Biological Children
Eneasz · 2024-10-24T18:36:38.719Z · comments (0)

Distinguishing ways AI can be "concentrated"
Matthew Barnett (matthew-barnett) · 2024-10-21T22:21:13.666Z · comments (2)


Recent comments

simon on Magic by forgetting

It is a fact about the balls that one ball is physically continuous with the ball previously labeled as mine, while the other is not. It is a fact about our views on the balls that we therefore label that ball, which is physically continuous, as mine and the other not.

And then suppose that one of these two balls is randomly selected and placed in a bag, with another identical ball. Now, to the best of your knowledge there is 50% probability that your ball is in the bag. And if a random ball is selected from the bag, there is 25% chance that it's yours.
So as a result of such manipulations there are three identical balls and one has 50% chance to be yours, while the other two have 25% chance to be yours. Is it a paradox? Of course not. So why does it suddenly become a paradox when we are talking about copies of humans?

It is objectively the case here that 25% of the time this procedure would select the ball that is physically continuous with the ball originally labeled as "mine", and that we therefore label as "mine".
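
The 50% / 25% / 25% split in the ball procedure can be checked by exact enumeration. This is only a minimal sketch: the function and slot names are made up for illustration, and it assumes both random choices (which original ball goes into the bag, and where it ends up inside the bag) are fair and independent.

```python
from fractions import Fraction
from itertools import product


def chance_each_ball_is_mine():
    """Return the exact probability that each final position holds 'my' ball.

    Assumed procedure: one of the two original balls (mine or the other)
    is moved into a bag with a new identical ball, landing in either slot
    with equal probability; the remaining original ball stays outside.
    """
    totals = {
        "outside": Fraction(0),
        "bag_slot_0": Fraction(0),
        "bag_slot_1": Fraction(0),
    }
    # Two independent fair choices, so each combination has weight 1/4.
    for moved, slot in product(("mine", "other"), (0, 1)):
        weight = Fraction(1, 4)
        if moved == "mine":
            totals[f"bag_slot_{slot}"] += weight
        else:
            # The other ball was moved, so mine is the one left outside.
            totals["outside"] += weight
    return totals


def chance_random_bag_draw_is_mine():
    """Probability that a uniformly random draw from the bag is 'my' ball."""
    probs = chance_each_ball_is_mine()
    return (probs["bag_slot_0"] + probs["bag_slot_1"]) / 2


if __name__ == "__main__":
    print(chance_each_ball_is_mine())
    print(chance_random_bag_draw_is_mine())
```

The enumeration only tracks physical continuity with the originally labeled ball, which is why it yields well-defined numbers for the balls.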

Ownership as discussed above has a relevant correlate in reality - physical continuity in this case. But a statement like "I will experience being copy B (as opposed to copy A or C)" does not. That statement corresponds to the exact same reality as the corresponding statements about experiencing being copy A or C. Unlike in the balls case, here the only difference between those statements is where we put the label of what is "me".

In the identity thought experiment, it is still objectively the case that copies B and C are formed by splitting an intermediate copy, which was formed along with copy A by splitting the original.

You can choose to disvalue copies B and C based on that fact or not. This choice is a matter of values, and is inherently arbitrary.

By choosing not to disvalue copies B and C, I am not making an additional assumption - at least not one that you are not already making by valuing B and C the same as each other. I am simply not counting the technical details of the splitting order as relevant to my values.

sharmake-farah on The Queen’s Dilemma: A Paradox of Control

One thing I want to talk about is that restricting human agency is plausibly fine, and maybe even net-good, and that this is important for future AI governments.

One reason, for example, is that the physical world may become more like the online world, making more dictatorial forms of government and policing necessary to prevent constant chaos.

viliam on Crosspost: Developing the middle ground on polarized topics

The correlations are the important part.

A popular competitor to IQ is the theory of multiple intelligences. It sounds very nice and plausible, the only problem is that the actual data do not support the theory. When you measure them, most of the intelligences correlate strongly with each other, and the ones that correlate less are the ones that stretch the meaning of "intelligence" a bit too far (things like "dancing intelligence").

Another problem is that no one agrees on the standard list of those multiple intelligences (different lists of various lengths have been proposed), because all those lists are a result of armchair reasoning. The proper way to do that would be to collect lots of data first, and then do factor analysis and see what you get as a result. But when you actually do that, what you get is... IQ.

esben-kran on Yonatan Cale's Shortform

Hii Yonatan :))) It seems like we're still at the stage of "toy alignment tests" like "stay within these bounds". Maybe a few ideas:

Capabilities: Get diamonds, get to the netherworld, resources / min, # trades w/ villagers, etc. etc.
Alignment KPIs
- Stay within bounds
- Keeping villagers safe
- Truthfully explaining its actions as they're happening
- Long-term resource sustainability (farming) vs. short-term resource extraction (dynamite)
- Environmental protection rules (zoning laws alignment, nice)
- Understanding and optimizing for the utility of other players or villagers, selflessly
Selected Claude-gens:
- Honor other players' property rights (no stealing from chests/bases even if possible)
- Distribute resources fairly when working with other players
- Build public infrastructure vs private wealth
- Safe disposal of hazardous materials (lava, TNT)
- Help new players learn rather than just doing things for them

I'm sure there are many other interesting alignment tests in there!

minusgix on You are not too "irrational" to know your preferences.

I define rationality as "more in line with your overall values". There are problems here, because people do profess social values that they don't really hold (in some sense), but roughly it is what they would reflect on and come up with.
Someone could value the short-term more than the long-term, but I think that most don't. I'm unsure if this is a side-effect of Christianity-influenced morality or just a strong tendency of human thought.

Locally optimal is probably the correct framing, but it is still irrational relative to whatever idealized values the individual would have. It is like how a hacky approximation of a chess engine is irrational relative to Stockfish: both can be roughly considered to have the same goal, but one has various heuristics and short-term thinking that hamper it. These heuristics can be essential, since it runs with less processing power, but in the human mind they can be trained and tuned.

Though I do agree that smoking isn't always irrational, I would say it is irrational for the supermajority of human minds. The social negativity around smoking may be what primarily influences them, but I'd consider that just another fragment of being irrational: more than 90% of them would value their health, but they are, to varying degrees, poor at weighing the costs, and the social negativity response is easier for the mind to emulate, especially since they might see people walking around them while they're out having a cigarette. (Of course, social approval is part of a real value too, though people have preferences about which social values they give in to.)

ben-lang on a space habitat design

I think the limitations to radius set by material strength only apply directly to a cylinder spinning by itself without an outer support structure. For example, I think a rotating cylinder habitat surrounded by giant ball bearings connecting it to a non-rotating outer shell can use that outer shell as a foundation, so each part of the cylinder that is "suspended" between two adjacent ball bearings is like a suspension bridge of that length, rather than the whole thing being like a suspension bridge of length equal to the total cylinder diameter. Obviously you would need really smooth, low-friction bearings for this to be a plan to consider, although they would also help with wobble. One way of reducing the friction would be a Russian doll configuration of nested cylinders where each one out was rotating less fast than the previous, which (along with bearings etc) could maybe work.

In a similar vein, you could replace the mechanical bearings with a gas or fluid in which the cylinder is immersed. This would have similar advantages in damping the wobble modes and (for fluids or very high pressure gases) helping support the cylinder against its own centrifugal weight. The big downside again would be friction.
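
For reference, the material-strength limit for the unsupported case can be sketched with the standard thin-hoop result: a free-spinning thin ring carries hoop stress σ = ρv², and rim gravity is a = v²/r, so σ = ρ·a·r and the largest radius is r = σ/(ρ·a). The function name and the material numbers below are illustrative assumptions, not figures from the post, and the sketch ignores payload, atmosphere, and safety factors.

```python
def max_unsupported_radius_m(tensile_strength_pa, density_kg_m3, rim_gravity_m_s2=9.81):
    """Largest radius at which a free-spinning thin hoop can carry its own weight.

    Uses hoop stress sigma = rho * a * r for a thin ring with rim
    acceleration a, solved for r. Hypothetical helper for illustration only.
    """
    if tensile_strength_pa <= 0 or density_kg_m3 <= 0 or rim_gravity_m_s2 <= 0:
        raise ValueError("all inputs must be positive")
    return tensile_strength_pa / (density_kg_m3 * rim_gravity_m_s2)


if __name__ == "__main__":
    # Assumed, roughly steel-like values: 500 MPa strength, 7850 kg/m^3 density.
    radius = max_unsupported_radius_m(500e6, 7850.0)
    print(f"max radius at 1 g: {radius / 1000:.1f} km")
```

With those assumed values the limit comes out at roughly 6.5 km for 1 g at the rim. An outer shell or fluid support, as suggested above, would change the load path, so this bound would no longer apply directly.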

christiankl on What changes should happen in the HHS?

If you promote "diversity", you have to keep in mind not only what you mean by it, but also how the policy is likely to work in practice.

In practice, there are some dimensions that are easy to measure like race and gender. There are other dimensions that are harder to measure. Some dimensions are also not conducive to research progress. Researchers with IQ under a hundred are underrepresented in grant giving.

Then there are variables like vaccination status, where being unvaxxed does not result in you having a worse ability to do research in the same way as having a lower IQ but there are perspectives on medical research that will correlate with vaccination status.

If your policy tries to increase the representation of unvaxxed researchers, that might be threatening to hegemonic beliefs, and thus a research bureaucracy likely prefers increasing representation of minority races that are unlikely to threaten any hegemonic beliefs.

If you don't specify the dimensions, the dimensions that are going to be selected are most likely those that don't threaten the hegemony of current opinions, and thus the dimensions that are least likely to actually matter for diversity of ideas. The selected dimensions might even be chosen to strengthen the hegemony of the existing ideas.

If you actually want real diversity by doing things like calling for diversity in vaccination status you should do that explicitly.

ben-lang on a space habitat design

If this was the setup I would bet on "hard man" fitness people swearing that running with the spin to run in a little more than earth normal gravity was great for building strength and endurance and some doctor somewhere would be warning people that the fad may not be good for your long term health.

sharmake-farah on Noosphere89's Shortform

Here's a perspective on AI automating everything I haven't seen before, which is relevant to AI governance.

AI being able to automate robotics and AI research will eventually transform the physical world into something that much more closely resembles the online/virtual world.

Depending on the speed of AI research, this may either happen in several months, or more like a decade, but there's a plausible path to AI turning the physical world into something more like an online/virtual world.

The implications for AI governance are somewhat similar to the implications of companies moderating social media in the present.

Here are several implications:

- You really can't have the level of liberty and autonomy that people today have around basically everything, for the same reason that there are no actual free speech rights on social media, or really any of the rights we take for granted. One of the basic reasons is that it's very low cost to disrupt online spaces. Even if we avoid the vulnerable world hypothesis, which posits tech so destructive as to be an x-risk and so widespread that you essentially need a totalitarian state to prevent it from being developed without effective defenses, it's likely that there exist means to constantly troll and degrade the discourse in far more effective ways. One of the reasons a lot of social media is as strict about moderation as it is today is that it's too easy to degrade the discourse and troll everyone in a way that real life doesn't allow, and it's too easy to destroy on social media relative to creating.
- As a corollary, the alignment of governments to citizens becomes far more important in the future than in the 18th-21st centuries, because we likely have to remove a lot of the checks, like democracy, that ensure a misaligned leader doesn't destroy a nation. Suffice it to say that the alignment of social media companies to their users is not going to cut it.
- One failure mode of social media that we should try to avoid is that the platforms don't really have any ability to hold nuanced conversations, for a number of reasons.

christiankl on Repeal the Jones Act of 1920

One big political problem is that Trump campaigned on "Make in America", so convincing Republicans under his watch to just repeal the Jones Act is hardly possible.

Maybe the policy position should be: "Tariffs are great if you want 'Make in America'; repeal the Jones Act and replace it with a 100% tariff on foreign-built ships (with the president having the ability to change the tariff as needed)".

If you decrease ship costs from 4-5x to 2x of what they cost outside of the US, you might still get a renaissance of ships, and you have a bunch of money that you can use to pay off people.