LessWrong 2.0 Reader

Essaying Other Plans
Screwtape · 2024-03-06T22:59:06.240Z · comments (4)

Singular learning theory and bridging from ML to brain emulations
kave · 2023-11-01T21:31:54.789Z · comments (16)

A list of all the deadlines in Biden's Executive Order on AI
Valentin Baltadzhiev (valentin-baltadzhiev) · 2023-11-01T17:14:31.074Z · comments (2)

Facebook is Paying Me to Post
jefftk (jkaufman) · 2023-11-14T19:10:07.303Z · comments (5)

[link] Let's Design A School, Part 2.1 School as Education - Structure
Sable · 2024-05-02T22:04:30.435Z · comments (2)

[link] Emotional issues often have an immediate payoff
Chipmonk · 2024-06-10T23:39:40.697Z · comments (2)

[question] Seeking AI Alignment Tutor/Advisor: $100–150/hr
MrThink (ViktorThink) · 2024-10-05T21:28:16.491Z · answers+comments (3)

5 ways to improve CoT faithfulness
CBiddulph (caleb-biddulph) · 2024-10-05T20:17:12.637Z · comments (8)

Do Sparse Autoencoders (SAEs) transfer across base and finetuned language models?
Taras Kutsyk · 2024-09-29T19:37:30.465Z · comments (7)

[link] Can a Bayesian Oracle Prevent Harm from an Agent? (Bengio et al. 2024)
mattmacdermott · 2024-09-01T07:46:26.647Z · comments (0)

SAE features for refusal and sycophancy steering vectors
neverix · 2024-10-12T14:54:48.022Z · comments (4)

Sleeping on Stage
jefftk (jkaufman) · 2024-10-22T00:50:07.994Z · comments (3)

[link] Introduction to Super Powers (for kids!)
Shoshannah Tekofsky (DarkSym) · 2024-09-20T17:17:27.070Z · comments (0)

[link] Fictional parasites very different from our own
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-08T14:59:39.080Z · comments (0)

[question] When can I be numerate?
FinalFormal2 · 2024-09-12T04:05:27.710Z · answers+comments (3)

AXRP Episode 36 - Adam Shai and Paul Riechers on Computational Mechanics
DanielFilan · 2024-09-29T05:50:02.531Z · comments (0)

You're Playing a Rough Game
jefftk (jkaufman) · 2024-10-17T19:20:06.251Z · comments (2)

The case for more Alignment Target Analysis (ATA)
Chi Nguyen · 2024-09-20T01:14:41.411Z · comments (13)

A Triple Decker for Elfland
jefftk (jkaufman) · 2024-10-11T01:50:02.332Z · comments (0)

[link] Conventional footnotes considered harmful
dkl9 · 2024-10-01T14:54:01.732Z · comments (16)

Fun With The Tabula Muris (Senis)
sarahconstantin · 2024-09-20T18:20:01.901Z · comments (0)

[link] A primer on the next generation of antibodies
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-01T22:37:59.207Z · comments (0)

[link] SB 1047 gets vetoed
ryan_b · 2024-09-30T15:49:38.609Z · comments (1)

[link] OpenAI Superalignment: Weak-to-strong generalization
Dalmert · 2023-12-14T19:47:24.347Z · comments (3)

[question] What ML gears do you like?
Ulisse Mini (ulisse-mini) · 2023-11-11T19:10:11.964Z · answers+comments (4)

[question] How to Model the Future of Open-Source LLMs?
Joel Burget (joel-burget) · 2024-04-19T14:28:00.175Z · answers+comments (9)

Improving SAE's by Sqrt()-ing L1 & Removing Lowest Activating Features
Logan Riggs (elriggs) · 2024-03-15T16:30:00.744Z · comments (5)

[link] Arrogance and People Pleasing
Jonathan Moregård (JonathanMoregard) · 2024-02-06T18:43:09.120Z · comments (7)

If a little is good, is more better?
DanielFilan · 2023-11-04T07:10:05.943Z · comments (16)

[link] Was a Subway in New York City Inevitable?
Jeffrey Heninger (jeffrey-heninger) · 2024-03-30T00:53:21.314Z · comments (4)

Control Symmetry: why we might want to start investigating asymmetric alignment interventions
domenicrosati · 2023-11-11T17:27:10.636Z · comments (1)

[link] In defence of Helen Toner, Adam D'Angelo, and Tasha McCauley
mrtreasure · 2023-12-06T02:02:32.004Z · comments (3)

Testing for consequence-blindness in LLMs using the HI-ADS unit test.
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2023-11-24T23:35:29.560Z · comments (2)

Decent plan prize winner & highlights
lukehmiles (lcmgcd) · 2024-01-19T23:30:34.242Z · comments (2)

[link] Report: Evaluating an AI Chip Registration Policy
Deric Cheng (deric-cheng) · 2024-04-12T04:39:45.671Z · comments (0)

[question] Impressions from base-GPT-4?
mishka · 2023-11-08T05:43:23.001Z · answers+comments (25)

[link] An Intuitive Explanation of Sparse Autoencoders for Mechanistic Interpretability of LLMs
Adam Karvonen (karvonenadam) · 2024-06-25T15:57:16.872Z · comments (0)

Changing Contra Dialects
jefftk (jkaufman) · 2023-10-26T17:30:10.387Z · comments (2)

$250K in Prizes: SafeBench Competition Announcement
ozhang (oliver-zhang) · 2024-04-03T22:07:41.171Z · comments (0)

A Review of In-Context Learning Hypotheses for Automated AI Alignment Research
alamerton · 2024-04-18T18:29:33.892Z · comments (4)

Economics Roundup #1
Zvi · 2024-03-26T14:00:06.332Z · comments (4)

[link] Executive Dysfunction 101
DaystarEld · 2024-05-23T12:43:13.785Z · comments (1)

A Visual Task that's Hard for GPT-4o, but Doable for Primary Schoolers
Lennart Finke (l-f) · 2024-07-26T17:51:28.202Z · comments (4)

[link] Sticker Shortcut Fallacy — The Real Worst Argument in the World
ymeskhout · 2024-06-12T14:52:41.988Z · comments (15)

Beta Tester Request: Rallypoint Bounties
lukemarks (marc/er) · 2024-05-25T09:11:11.446Z · comments (4)

[link] Announcing Open Philanthropy's AI governance and policy RFP
Julian Hazell (julian-hazell) · 2024-07-17T02:02:39.933Z · comments (0)

Clipboard Filtering
jefftk (jkaufman) · 2024-04-14T20:50:02.256Z · comments (1)

To Boldly Code
StrivingForLegibility · 2024-01-26T18:25:59.525Z · comments (4)

Virtually Rational - VRChat Meetup
Tomás B. (Bjartur Tómas) · 2024-01-28T05:52:36.934Z · comments (3)

Twin Peaks: under the air
KatjaGrace · 2024-05-31T01:20:04.624Z · comments (2)

Recent comments

gunnar_zarncke on AI Safety Camp 10

Hi, is there a way to get people in touch with a project or project lead? For example, I'd like to get in touch with Masaharu Mizumoto because iVAIS sounds related to the aintelope project.

gunnar_zarncke on The Case For Bullying

The post was likely downvoted because it conflicts with principles of empathy, cooperation, and intellectual rigor. Defending bullying, even provocatively, clashes with commonly held beliefs. The zero-sum framing of status is overly simplistic, ignoring positive-sum approaches. The provocative style comes off as antagonistic. Reframing the argument around prosocial accountability might get more positive responses.

lc on Shortform

I'm interested too. I think several of the above are solvable issues. AFAICT:

Solved by simple modifications to markets:

Races to correct naive bidders
Defending the true price from incorrect bidders for $ w/o letting price shift

Seem doable with thought:

Billing for information value
Policy conditionals

Seem hard/idk if it's possible to solve:

Collating information known by different bidders
Preventing tricking other bidders for profit

green_leaf on A Logical Proof for the Emergence and Substrate Independence of Sentience

I think we're spinning on an undefined term. I'd bet there are LOTS of details that affect my perception in subtle and aggregate ways which I don't consciously identify.

You're equivocating between perceiving a collection of details and consciously identifying every separate detail.

If I show you a grid of 100 pixels, then (barring imperfect eyesight) you will consciously perceive all 100 of them. But you will not consciously identify every individual pixel unless your attention is aimed at each pixel in a for loop (that would take longer than consciously perceiving the entire grid at once).

There are lots of details that affect your perception that you don't consciously identify. But there is no detail that affects your perception that wouldn't be contained in your consciousness (otherwise it, by definition, couldn't affect your perception).

green_leaf on A Logical Proof for the Emergence and Substrate Independence of Sentience

Computability shows that you can have a classical computer that has the same input/output behavior

That's what I mean (I'm talking about the input/output behavior of individual neurons).

Input/Output behavior is generally not considered to be enough to guarantee same consciousness

It should be, because it is, in fact, enough. (However, neither the post, nor my comment require that.)

Eliezer himself argued that GLUT isn't conscious.

Yes, and that's false (but since that's not the argument in the OP, I don't think I should get sidetracked).

But nonetheless, if the only formalized proposal for consciousness doesn't have the property that simulations preserve consciousness, then clearly the property is not guaranteed.

That's false. If we assume for a second that IIT really is the only formalized theory of consciousness, it doesn't follow that the property is not, in fact, guaranteed. It could also be that IIT is wrong and that, in actual reality, the property is guaranteed.

denkenberger on Is the Power Grid Sustainable?

But with what reliability? If you don't mind going without power (or dramatically curtailed power) a few weeks a year, then you could dramatically reduce the battery size, but most people in high income countries don't want to make that trade-off.

denkenberger on Is the Power Grid Sustainable?

And so are batteries.

Lithium-ion batteries have gotten a lot cheaper, but batteries in general have not. Lithium-ion batteries are just now starting to become competitive with lead-acid for non-mobile applications. It's not clear that batteries in general will get significantly cheaper.

It's going to make sense for a lot of houses to go over to solar + batteries. And if batteries are too expensive for the longest stretch of cloudy days you might have, at least here a natural gas generator compares favorably.

In your climate, defection from the natural gas and electric grid is very far from being economical, because the peak energy demand for the year is dominated by heating, and solar peaks in the summer, so you would need to have extreme oversizing of the panels to provide sufficient energy in the winter. But if you have a climate that has a good match between solar output and energy demand, it gets better (or if you only defect from the electric grid). Still, even if batteries got 3 times cheaper to say $60 per kilowatt hour, and you needed to store 3 days of electricity, that would be about $4300 per kilowatt capital cost, which is much more expensive than large gas power plants + electrical transmission and distribution. Another big issue is that reliability would not be as high as with the central grid in developed countries (though it very well could be more reliable than the grid in a low income country).
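The storage figure above can be checked with a quick calculation. This is a minimal sketch, assuming the battery bank has to carry 1 kW of average load for the full 3 days with no solar input, and ignoring depth-of-discharge limits, inverter losses, and installation costs. The function name is illustrative only.

```python
def storage_cost_per_kw(price_per_kwh: float, days_of_storage: float) -> float:
    """Battery capital cost (in dollars) per kW of average load.

    Assumes the batteries must supply the full load for the whole
    storage period, so each kW of load needs 24 kWh per day of storage.
    """
    hours_of_storage = days_of_storage * 24
    return price_per_kwh * hours_of_storage


# Figures from the comment: $60 per kWh and 3 days of storage.
cost = storage_cost_per_kw(60.0, 3)
print(f"${cost:,.0f} per kW")  # $4,320 per kW, i.e. "about $4300"
```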

While a power station could be up to 63% efficient, for a home generator maybe I'm looking at something like the 23% efficient Generac 7171, rated for 9kW on natural gas at full load. Or maybe something smaller, since this is probably in addition to batteries and only has to match the house's average consumption. This turns my $0.06/kWh into $0.24/kWh, plus the cost of the generator and maintenance.

Yes, you would only want around 1 kW electrical, especially because the only hope to make this economical when you count the capital cost and maintenance is to utilize a lot of the waste heat (cogeneration), ideally both for heating and for cooling (through an absorption cycle, trigeneration). But though I don't think it works economically for a household (even in your favorable case of low natural gas prices and high electricity prices), you can have an economical cogeneration/trigeneration installation for a large apartment building, and certainly for college campuses.

tsvibt on Overview of strong human intelligence amplification methods

Are you claiming that this would help significantly with conceptual thinking? E.g., doing original higher math research, or solving difficult philosophical problems? If so, how would it help significantly? (Keep in mind that you should be able to explain how it brings something that you can't already basically get. So, something that just regular old Gippity use doesn't get you.)

dmitry-vaintrob on A bird's eye view of ARC's research

Yes, I generally agree with this. I also realized that "interp score" is ambiguous (and I agree that the true end-to-end interp score is negligible), but what's more clearly true is that SAE features tend to be more interpretable. This might be largely explained by "people tend to think of interpretable features as branches of a decision tree, which are sparsely activating". Still, it was surprising to me that the top SAE features are significantly more interpretable than the top PCA features.

sanxiyn on localdeity's Shortform

This is a good idea and it already works; it is just that AI is wholly unnecessary. Have a look at the 2018 post Protecting Applications with Automated Software Diversity.