LessWrong 2.0 Reader

The Cognitive Bootcamp Agreement
Raemon · 2024-10-16T23:24:05.509Z · comments (0)

ARENA4.0 Capstone: Hyperparameter tuning for MELBO + replication on Llama-3.2-1b-Instruct
25Hour (aaron-kaufman) · 2024-10-05T11:30:11.953Z · comments (2)

(Maybe) A Bag of Heuristics is All There Is & A Bag of Heuristics is All You Need
Sodium · 2024-10-03T19:11:58.032Z · comments (17)

[question] If I have some money, whom should I donate it to in order to reduce expected P(doom) the most?
KvmanThinking (avery-liu) · 2024-10-03T11:31:19.974Z · answers+comments (36)

[link] Information dark matter
Logan Kieller (logan-kieller) · 2024-10-01T15:05:41.159Z · comments (4)

What AI companies should do: Some rough ideas
Zach Stein-Perlman · 2024-10-21T14:00:10.412Z · comments (10)

Flipping Out: The Cosmic Coinflip Thought Experiment Is Bad Philosophy
Joe Rogero · 2024-11-12T23:55:46.770Z · comments (9)

Empathy/Systemizing Quotient is a poor/biased model for the autism/sex link
tailcalled · 2024-11-04T21:11:57.788Z · comments (0)

[link] NAO Updates, Fall 2024
jefftk (jkaufman) · 2024-10-18T00:00:04.142Z · comments (2)

A path to human autonomy
Nathan Helm-Burger (nathan-helm-burger) · 2024-10-29T03:02:42.475Z · comments (12)

[link] Concrete benefits of making predictions
Jonny Spicer (jonnyspicer) · 2024-10-17T14:23:17.613Z · comments (5)

An argument that consequentialism is incomplete
cousin_it · 2024-10-07T09:45:12.754Z · comments (27)

DunCon @Lighthaven
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2024-09-29T04:56:27.205Z · comments (0)

Housing Roundup #10
Zvi · 2024-10-29T13:50:09.416Z · comments (2)

Intent alignment as a stepping-stone to value alignment
Seth Herd · 2024-11-05T20:43:24.950Z · comments (4)

The slingshot helps with learning
Wilson Wu (wilson-wu) · 2024-10-31T23:18:16.762Z · comments (0)

Bay Winter Solstice 2024: Speech Auditions
ozymandias · 2024-11-04T22:31:38.680Z · comments (0)

[question] When is reward ever the optimization target?
Noosphere89 (sharmake-farah) · 2024-10-15T15:09:20.912Z · answers+comments (12)

[question] Feedback request: what am I missing?
Nathan Helm-Burger (nathan-helm-burger) · 2024-11-02T17:38:39.625Z · answers+comments (5)

[link] Stone Age Herbalist's notes on ant warfare and slavery
trevor (TrevorWiesinger) · 2024-11-09T02:40:01.128Z · comments (0)

Basics of Handling Disagreements with People
Camille Berger (Camille Berger) · 2024-11-12T17:55:08.143Z · comments (3)

Incentive design and capability elicitation
Joe Carlsmith (joekc) · 2024-11-12T20:56:05.088Z · comments (0)

Context-dependent consequentialism
Jeremy Gillen (jeremy-gillen) · 2024-11-04T09:29:24.310Z · comments (6)

Meme Talking Points
ymeskhout · 2024-11-06T15:27:54.024Z · comments (0)

Balancing Label Quantity and Quality for Scalable Elicitation
Alex Mallen (alex-mallen) · 2024-10-24T16:49:00.939Z · comments (1)

[link] What is it like to be psychologically healthy? Podcast ft. DaystarEld
Chipmonk · 2024-10-05T19:14:04.743Z · comments (8)

SAE Probing: What is it good for? Absolutely something!
Subhash Kantamneni (subhashk) · 2024-11-01T19:23:55.418Z · comments (0)

Resolving von Neumann-Morgenstern Inconsistent Preferences
niplav · 2024-10-22T11:45:20.915Z · comments (5)

AI #85: AI Wins the Nobel Prize
Zvi · 2024-10-10T13:40:07.286Z · comments (6)

[link] Safety tax functions
owencb · 2024-10-20T14:08:38.099Z · comments (0)

Open Thread Fall 2024
habryka (habryka4) · 2024-10-05T22:28:50.398Z · comments (109)

5 ways to improve CoT faithfulness
CBiddulph (caleb-biddulph) · 2024-10-05T20:17:12.637Z · comments (8)

[link] [Paper] Hidden in Plain Text: Emergence and Mitigation of Steganographic Collusion in LLMs
Yohan Mathew (ymath) · 2024-09-25T14:52:48.263Z · comments (2)

Searching for phenomenal consciousness in LLMs: Perceptual reality monitoring and introspective confidence
EuanMcLean (euanmclean) · 2024-10-29T12:16:18.448Z · comments (7)

Examples of How I Use LLMs
jefftk (jkaufman) · 2024-10-14T17:10:04.597Z · comments (2)

[link] My Methodological Turn
adamShimi · 2024-09-29T15:01:45.986Z · comments (0)

[link] Liquid vs Illiquid Careers
vaishnav92 · 2024-10-20T23:03:49.725Z · comments (6)

[link] Why Recursion Pharmaceuticals abandoned cell painting for brightfield imaging
Abhishaike Mahajan (abhishaike-mahajan) · 2024-11-05T14:51:41.310Z · comments (1)

[link] A new process for mapping discussions
Nathan Young · 2024-09-30T08:57:20.029Z · comments (7)

[link] Our Digital and Biological Children
Eneasz · 2024-10-24T18:36:38.719Z · comments (0)

[link] Arithmetic Models: Better Than You Think
kqr · 2024-10-26T09:42:07.185Z · comments (5)

Towards Quantitative AI Risk Management
Henry Papadatos (henry) · 2024-10-16T19:26:48.817Z · comments (1)

Trading Candy
jefftk (jkaufman) · 2024-11-01T01:10:08.024Z · comments (4)

the Daydication technique
chaosmage · 2024-10-18T21:47:46.448Z · comments (0)

Why is there Nothing rather than Something?
Logan Zoellner (logan-zoellner) · 2024-10-26T12:37:50.204Z · comments (3)

[link] Generic advice caveats
Saul Munn (saul-munn) · 2024-10-30T21:03:07.185Z · comments (1)

An AI crash is our best bet for restricting AI
Remmelt (remmelt-ellen) · 2024-10-11T02:12:03.491Z · comments (3)

Domain-specific SAEs
jacob_drori (jacobcd52) · 2024-10-07T20:15:38.584Z · comments (0)

[question] What prevents SB-1047 from triggering on deep fake porn/voice cloning fraud?
ChristianKl · 2024-09-26T09:17:39.088Z · answers+comments (21)

[question] Any real toeholds for making practical decisions regarding AI safety?
lukehmiles (lcmgcd) · 2024-09-29T12:03:08.084Z · answers+comments (6)


Recent comments

camille-berger on Basics of Handling Disagreements with People

Hi! Thank you for writing this comment. I understand it can be a bit worrying to feel like your points might not be understood, but I'll give it a try nonetheless. I really genuinely want to fix any serious flaw in my approach.

However, I find myself in a slightly strange situation. Part of your feedback is very valuable. But I also believe that you misunderstood part of what I was saying. I could apply the skills I described in the post to your comment as a performative example, but I sense that you could see it as a form of implied sarcasm, and it'd be unethical, so I'll refrain from doing that. There is a last part of me that just feels like your point is "part of this post is poorly written". I've made some minor edits in the hope that they accommodate your criticism.

My suggestion would be for you to watch real-life examples of the techniques I promote (say https://www.youtube.com/watch?v=d2WdbXsqj0M and https://www.youtube.com/watch?v=_tdjtFRdbAo ) then comment on those examples instead.

Alternatively, you can just read my answers:

Rephrasing is often terrible;

Agree, I've added the detail on "genuinely asking your interlocutor if this is what they mean, and if not, feel free to offer a correction" (e.g. "If I got you right, and feel free to correct me if I didn't.... "). I think that this form makes it almost always a pleasant experience and I somehow forgot this important detail.

Your suggestion for attacking personal experience [...]

You're referring to point 4, not 5, right?
If yes, I think this is extrapolating beliefs I don't actually have. I admit however I didn't choose a good example, you can refer to the Street Epistemology video above for a better one.

I'll replace the example soonish. In the meantime, please note that I do not suggest "attacking" personal experiences. I suggest asking "What helps us distinguish reliable personal experiences from unreliable ones?". This is a valid question to ask, in my view. For a bunch of reasons, this question has more chances to bounce off, so I prefer to ask "How do you distinguish personal experiences from [delusions]?", where "[delusions]" is a term that has been deliberately imported by the conversation partner. I think most interlocutors will be tempted to answer something along the lines of intersubjectivity, repeatability or empirical experiments. But I agree this is a delicate example and I'd be better off pointing to something else.

Stories need to actually be short, clear, and to the point or they just confuse the matter more.

This was part of the details I was omitting. I'll add it.

Caring about their underlying values is useful, but it needs to be preceded by curiosity about and understanding of them, or it does no good.

Agree. This was implied in several parts of the post, e.g. "Be genuinely truth-seeking" in the ethical caveats. But I don't think it is that hard.

A working definition may or may not be better than a theoretical one.

Please note that I'm talking about conversations that happen between rationalists and non-rationalists on entry-level arguments. E.g. "We can't lose control of AI because it's made of silicon", not "Davidad has a promising alignment plan" (please note that I'm not arguing for applying these techniques to AI Safety Outreach and Advocacy, this is just an example). I think we really should not spend 15 minutes with someone not acquainted with LessWrong or even AI to define "losing control" in a way that is close to mathematically formal. I think that "What do you mean by losing control? Do you mean that, if we ask it to do something specific, then it won't do it? Or do you mean something else?" is a good enough question. I'd rather discuss the details when the said person is more acquainted with the topic.

There will, of course, be situations where this isn't true. The law of equal and opposite advice applies. But in most entry-level arguments, I'd rather have people spend less time problematizing definitions and more time asking their interlocutor what their reasons are.

People don't generally use Bayes rule!

Of course. I'm not suggesting mentioning Bayes' Rule out loud. Nor am I suggesting people actually use Bayes' Rule in their everyday life. I'm noting that the techniques I think are more robust are the ones that lead people to apply Bayes' Rule, or rather, an approximation thereof, usually by contrasting one piece of evidence under two different hypotheses. I appreciate Bayes' Rule performatively.
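As an illustration of what "contrasting one piece of evidence under two different hypotheses" can look like numerically, here is a minimal sketch of the odds form of Bayes' Rule. The function name and all the numbers are hypothetical and chosen only for illustration; they do not come from the post.

```python
def posterior_odds(prior_odds: float,
                   p_evidence_given_h1: float,
                   p_evidence_given_h2: float) -> float:
    """Odds form of Bayes' Rule: posterior odds = prior odds * likelihood ratio."""
    if p_evidence_given_h2 <= 0:
        raise ValueError("p_evidence_given_h2 must be positive")
    # How much more expected is the evidence under hypothesis 1 than under hypothesis 2?
    likelihood_ratio = p_evidence_given_h1 / p_evidence_given_h2
    return prior_odds * likelihood_ratio


# Hypothetical numbers: both hypotheses start out equally likely (odds 1:1),
# and the evidence is four times as likely under hypothesis 1.
odds = posterior_odds(prior_odds=1.0,
                      p_evidence_given_h1=0.8,
                      p_evidence_given_h2=0.2)
probability_h1 = odds / (1 + odds)  # convert odds back to a probability
print(odds, probability_h1)
```

The point of the sketch is only that the update depends on the ratio of how expected the evidence is under each hypothesis, not on either likelihood alone.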

Something said in point 8 seems like the key.

It is the key; I thought I had made it clear with "Yet the mindset itself is the key".
However, I don't want to write a post on it without explaining the ways in which it manifests, because healing myself made no sense until I started analyzing the habits of healed people. Some people who were already healed didn't want to "give the secrets away" or scoffed at my attempts. They came across to me as snobbish and as preventing me from actually learning. I really got a lot out of noting down recurrent patterns in their conversations, if only because it allowed me to do Deliberate Practice.

Finally, please remember that this post is an MVP. It is not meant to be exhaustive and cover all the nuances of the techniques; it's just that I'd rather write a post than nothing at all, and the entire sequence will take time before publication.

If you feel like I completely misunderstood your points, and are open to having my skills applied to our very conversation, feel free to DM me a Calendly link and we can sort it out live. I'd describe myself as a good conversation partner and I would put the probability of the exchange going awry quite low.

PS: It would help me out if you could quote the [first sentence of the] parts you are reacting to, in order to make clear what you are talking about. I hope I'm right in understanding what parts of the post you are reacting to.

quetzal_rainbow on Quick look: applications of chaos theory

There are no properties of the brain that define that brain as "you", except for the program that it runs.

lesswronguser123 on Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?

Honestly, the majority of the points presented here are not new and have already been addressed in

https://www.lesswrong.com/rationality

or https://www.readthesequence.com/

I got into this conversation because I thought I would find something new here. As an egoist I am voluntarily leaving this conversation in disagreement because I have other things to do in life. Thank you for your time.

habryka4 on johnswentworth's Shortform

I think the comment more confirms rather than disconfirms John's comment (though I still think it's too broad for other reasons). OP "funding" something historically has basically always meant recommending a grant to GV. Luke's language to me suggests that indeed the right of center grants are no longer referred to GV (based on a vague vibe of how he refers to funders in plural).

OP has always recommended some grants to other funders (historically OP would probably describe those grants as "rejected but referred to an external funder"). As Luke says, those are usually ignored, and OP's counterfactual effect on those grants is much less, and IMO it would be inaccurate to describe those recommendations as "OP funding something". As I said in the comment I quote below, most OP staff would like to fund things right of center, but GV does not seem to want to, so the only choice OP has is to refer them to other funders (which sometimes works, but mostly doesn't).

As another piece of evidence, when OP defunded all the orgs that GV didn't want to fund anymore, the communication emails that OP sent said that "Open Philanthropy is exiting funding area X" or "exiting organization X". By the same use of language, yes, it seems like OP has exited funding right-of-center policy work.

(I think it would make sense to taboo "OP funding X" in future conversations to avoid confusion, but also, I think historically it was very meaningfully the case that getting funded by GV is much better described as "getting funded by OP" given that you would never talk to anyone at GV and the opinions of anyone at GV would basically have no influence on you getting funded. Things are different now, and in a meaningful sense OP isn't funding anyone anymore, they are just recommending grants to others, and it matters more what those others think than what OP staff thinks)

harfe on johnswentworth's Shortform

A related comment from lukeprog (who works at OP) was posted on the EA Forum. It includes:

However, at present, it remains the case that most of the individuals in the current field of AI governance and policy (whether we fund them or not) are personally left-of-center and have more left-of-center policy networks. Therefore, we think AI policy work that engages conservative audiences is especially urgent and neglected, and we regularly recommend right-of-center funding opportunities in this category to several funders.

jeremy-gillen on Evaluating Stability of Unreflective Alignment

Intelligence/IQ is always good, but not a dealbreaker as long as you can substitute it with a larger population.

IMO this is pretty obviously wrong. There are some kinds of problem solving that scale poorly with population, just as there are some computations that scale poorly with parallelisation.

E.g. Project Euler problems.

When I said "problems we care about", I was referring to a cluster of problems that very strongly appear to not scale well with population. Maybe this is an intuitive picture of the cluster of problems I'm referring to.

habryka4 on Bogdan Ionut Cirstea's Shortform

One of these types of orgs is developing a technology with the potential to kill literally all of humanity. The other type of org is funding research that if it goes badly mostly just wasted their own money. Of course the demands for legibility and transparency should be different.

jeremy-gillen on Context-dependent consequentialism

I buy that such an intervention is possible. But doing it requires understanding the internals at a deep level. You can't expect SGD to implement the patch in a robust way. The patch would need to still be working after 6 months on an impossible problem, in spite of it actively getting in the way of finding the solution!

quasi_quasar on Cryonics is free

To add to bogdanb's comment below, you might want to be careful, because you seem to be 'forcing' people to subscribe to promotional newsletters in order to get a price quote, which, aside from being quite a nasty thing to do, is also a blatant violation of European GDPR regulations for which you could receive a hefty fine.

anthonyc on Heresies in the Shadow of the Sequences

I'm not entirely sure how many of these I agree with, but I don't really think any of them could be considered heretical or even all that uncommon as opinions on LW?

All but #2 seem to me to be pretty well represented ideas, even in the Sequences themselves (to the extent the ideas existed when the Sequences got written).

#2 seems to me to rely on the idea that the process of writing is central or otherwise critical to the process of learning about, and forming a take on, a topic. I have thought about this, and I think for some people it is true, but for me writing is often a process of translating an already-existing conceptual web into a linear approximation of itself. I'm not very good at writing in general, and having an LLM help me wordsmith concepts and workshop ideas as a dialogue partner is pretty helpful. I usually form takes by reading and discussing and then thinking quietly, not so much during writing if I'm writing by myself. Say I read a bunch of things or have some conversations, take notes on these, write an outline of the ideas/structure I want to convey, and share the notes and outline with an LLM. I ask it to write a draft that it and I then work on collaboratively. How is that meaningfully worse than writing alone, or writing with a human partner? Unless you meant literally "Ask an LLM for an essay on a topic and publish it," in which case yes, I agree.