LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

AI #75: Math is Easier
Zvi · 2024-08-01T13:40:05.539Z · comments (25)

[link] Contra Scott on Abolishing the FDA
Maxwell Tabarrok (maxwell-tabarrok) · 2023-12-15T14:00:17.247Z · comments (3)

How to hire somebody better than yourself
lukehmiles (lcmgcd) · 2024-08-28T08:12:53.450Z · comments (5)

Saving the world sucks
Defective Altruism (Elijah Bodden) · 2024-01-10T05:55:46.504Z · comments (29)

[Valence series] 4. Valence & Liking / Admiring
Steven Byrnes (steve2152) · 2024-06-10T14:19:51.194Z · comments (12)

[link] Soft Prompts for Evaluation: Measuring Conditional Distance of Capabilities
porby · 2024-02-02T05:49:11.189Z · comments (1)

Rapid capability gain around supergenius level seems probable even without intelligence needing to improve intelligence
Towards_Keeperhood (Simon Skade) · 2024-05-06T17:09:10.729Z · comments (16)

AI Safety 101 : Capabilities - Human Level AI, What? How? and When?
markov (markovial) · 2024-03-07T17:29:53.260Z · comments (8)

D&D.Sci: The Mad Tyrant's Pet Turtles [Evaluation and Ruleset]
abstractapplic · 2024-04-09T14:01:34.426Z · comments (6)

On the Proposed California SB 1047
Zvi · 2024-02-12T16:40:04.854Z · comments (18)

Some costs of superposition
Linda Linsefors · 2024-03-03T16:08:20.674Z · comments (11)

On OpenAI’s Model Spec
Zvi · 2024-06-21T13:00:03.014Z · comments (3)

I'm open for projects (sort of)
cousin_it · 2024-04-18T18:05:01.395Z · comments (13)

[link] Robin Hanson AI X-Risk Debate — Highlights and Analysis
Liron · 2024-07-12T21:31:02.222Z · comments (7)

[link] The Leeroy Jenkins principle: How faulty AI could guarantee "warning shots"
titotal (lombertini) · 2024-01-14T15:03:21.087Z · comments (6)

Enriched tab is now the default LW Frontpage experience for logged-in users
Ruby · 2024-06-21T00:09:30.441Z · comments (27)

Thoughts on "The Offense-Defense Balance Rarely Changes"
Cullen (Cullen_OKeefe) · 2024-02-12T03:26:50.662Z · comments (4)

AI #68: Remarkably Reasonable Reactions
Zvi · 2024-06-13T16:30:02.969Z · comments (11)

[link] Metascience of the Vesuvius Challenge
Maxwell Tabarrok (maxwell-tabarrok) · 2024-03-30T12:02:38.978Z · comments (2)

Higher-effort summer solstice: What if we used AI (i.e., Angel Island)?
Rachel Shu (wearsshoes) · 2024-06-25T01:35:54.064Z · comments (9)

The predictive power of dissipative adaptation
dr_s · 2023-12-17T14:01:31.568Z · comments (14)

[link] Bayesians Commit the Gambler's Fallacy
Kevin Dorst · 2024-01-07T12:54:59.939Z · comments (28)

AI doing philosophy = AI generating hands?
Wei Dai (Wei_Dai) · 2024-01-15T09:04:39.659Z · comments (22)

Decision Theory in Space
lsusr · 2024-08-18T07:02:11.847Z · comments (18)

[link] For Civilization and Against Niceness
Gabriel Alfour (gabriel-alfour-1) · 2023-11-20T10:56:20.352Z · comments (14)

[link] If Clarity Seems Like Death to Them
Zack_M_Davis · 2023-12-30T17:40:42.622Z · comments (191)

D&D.Sci(-fi): Colonizing the SuperHyperSphere
abstractapplic · 2024-01-12T23:36:54.248Z · comments (23)

AI #41: Bring in the Other Gemini
Zvi · 2023-12-07T15:10:05.552Z · comments (16)

Big Picture AI Safety: Introduction
EuanMcLean (euanmclean) · 2024-05-23T11:15:44.037Z · comments (7)

1. The CAST Strategy
Max Harms (max-harms) · 2024-06-07T22:29:13.005Z · comments (19)

So You Created a Sociopath - New Book Announcement!
Garrett Baker (D0TheMath) · 2024-04-01T18:02:18.010Z · comments (3)

D&D.Sci Coliseum: Arena of Data Evaluation and Ruleset
aphyer · 2024-10-29T01:21:03.075Z · comments (10)

Bounty for Evidence on Some of Palisade Research's Beliefs
benwr · 2024-09-23T20:01:20.917Z · comments (4)

[link] Michael Dickens' Caffeine Tolerance Research
niplav · 2024-09-04T15:41:53.343Z · comments (3)

We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap
johnswentworth · 2024-09-19T22:22:05.307Z · comments (47)

AI #80: Never Have I Ever
Zvi · 2024-09-10T17:50:08.074Z · comments (20)

~80 Interesting Questions about Foundation Model Agent Safety
RohanS · 2024-10-28T16:37:04.713Z · comments (4)

[link] I'd also take $7 trillion
bhauth · 2024-02-19T03:31:45.552Z · comments (12)

[link] Loneliness and suicide mitigation for students using GPT3-enabled chatbots (survey of Replika users in Nature)
Kaj_Sotala · 2024-01-23T14:05:40.986Z · comments (2)

[link] AlphaGeometry: An Olympiad-level AI system for geometry
alyssavance · 2024-01-17T17:17:30.913Z · comments (9)

[link] How people stopped dying from diarrhea so much (& other life-saving decisions)
Writer · 2024-03-16T16:00:47.830Z · comments (0)

[link] Paper: Tell, Don't Show- Declarative facts influence how LLMs generalize
Owain_Evans · 2023-12-19T19:14:26.423Z · comments (4)

Atlantis: Berkeley event venue available for rent
Jonas V (Jonas Vollmer) · 2023-11-22T01:47:12.026Z · comments (0)

Things Solenoid Narrates
Solenoid_Entity · 2024-04-12T23:57:16.169Z · comments (2)

[link] Book review: Deep Utopia
PeterMcCluskey · 2024-04-23T19:55:50.417Z · comments (14)

[link] AI Rights for Human Safety
Simon Goldstein (simon-goldstein) · 2024-08-01T23:01:07.252Z · comments (6)

Monthly Roundup #18: May 2024
Zvi · 2024-05-13T12:30:04.863Z · comments (10)

We ran an AI safety conference in Tokyo. It went really well. Come next year!
Blaine (blaine-rogers) · 2024-07-17T06:55:39.620Z · comments (1)

Principled Satisficing To Avoid Goodhart
JenniferRM · 2024-08-16T19:05:27.204Z · comments (2)

[link] Rational Animations' intro to mechanistic interpretability
Writer · 2024-06-14T16:10:57.015Z · comments (1)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

cubefox on Set Theory Multiverse vs Mathematical Truth - Philosophical Discussion

Arguably, "basic logical principles" are those that are true in natural language. Otherwise nothing stops us from considering absurd logical systems where "true and true" is false, or the like. Likewise, "one plus one is two" seems to be a "basic mathematical principle" in natural language. Any axiomatization which produces "one plus one is three" can be dismissed on grounds of contradicting the meanings of terms like "one" or "plus" in natural language.

The trouble with set theory is that, unlike logic or arithmetic, it often doesn't involve strong intuitions from natural language. Sets are a fairly artificial concept compared to natural language collections (empty sets, for example, can produce arbitrary nestings), especially when it comes to infinite sets.

screwtape on 2024 Unofficial LW Community Census, Request for Comments

So, I think Fight 1 is funny, but it is kind of high context, involving reading two somewhat long stories. (Planecrash in particular is past a million words long!) I'd considered "Who would win in a fight, Eliezer Yudkowsky or Scott Alexander? ["Eliezer", "Scott", "Wait, what's this? It's Aella with a steel chair!"]" and "Who is the rightful caliph? ["Eliezer Yudkowsky","Scott Alexander", "Wait, what's this? It's Robin Hanson with a steel chair!"]" but feel a bit weird about including real people.

I think they're just as funny though, and far more people will understand it, so maybe I should switch. Anyone have convincing thoughts here?

programcrafter on 2024 Unofficial LW Community Census, Request for Comments

It would be nice to see at least three questions which would demonstrate how person extracts evidence from others' words, how much time and emotions could they spend if they needed to communicate a point precisely, etc.

I'll have to sleep on that, actually. Will return tomorrow, presumably with more concrete ideas)

jan-betley on 2024 Unofficial LW Community Census, Request for Comments

I don't think there's a lot of value in distinguishing 3000 and 1,000,000 and probably for any aggregate you'll want to show this will just be "later than 2200" or something like that. But yes this way they can't make a statement that this will be 1,000,000 which is some downside.

I'm not a big fan of looking at the neighbors to decide whether this is a missing answer or high estimate (it's OK to not want to answer this one question). So some N/A or -1 should be ok.

(Just to make it clear, I'm not saying this is an important problem)

programcrafter on What can we learn from insecure domains?

99.9% of all cryptocurrency projects are complete scams (conservative estimate).

On first skim, I agree with the estimate as stated and would post a limit order for either side. I'd also like to note that "crypto in general is terrible" instead of "all crypto is terrible", as there have been applications developed that do not allow you to lose all funds without explicit acknowledgement.

Similarly, Cyber Security is terrible. Basically every computer on the internet is infected with multiple types of malware.

It is presumably terrible (or, 30%, result of availability bias), and I've observed bugs happen because functionality upgrade did not consider its interaction with all other code. However, I disagree that every computer is infected; probably you meant that it is under constant stream of attack attempts?

The insecure domains mainly work because people have charted known paths, and shown that if you follow those paths your loss probability is non-null but small. As a matter of IT, it would be really nice to have systems which don't logically fail at all, but that requires good education and pressure-resistance skills for software developers.

screwtape on 2024 Unofficial LW Community Census, Request for Comments

I have no opinion on the difference and chatgpt agrees with you, so sure, changed to "eighty percent of the benefit."

raemon on JargonBot Beta Test

The most important thing is "There is a small number of individuals who are paying attention, who you can argue with, and if you don't like what they're doing, I encourage you to write blogposts or comments complaining about it. And if your arguments make sense to me/us, we might change our mind. If they don't make sense, but there seems to be some consensus that the arguments are true, we might lose the Mandate of Heaven or something."

I will personally be using my best judgment to guide my decisionmaking. Habryka is the one actually making final calls about what gets shipped to the site, insofar as I update that we're doing a wrong thing, I'll argue about it."

It happening at all already constitutes “going wrong”.

This particular sort of comment doesn't particularly move me. I'm more likely to be moved by "I predict that if AI used in such and such a way it'll have such and such effects, and those effects are bad." Which I won't necessarily automatically believe, but, I might update on if it's argued well or seems intuitively obvious once it's pointed out.

I'll be generally tracking a lot of potential negative effects and if it seems like it's turning out "the effects were more likely" or "the effects were worse than I thought", I'll try to update swiftly.

screwtape on 2024 Unofficial LW Community Census, Request for Comments

Thanks for the year catch.

I could check their expected price of bitcoin, but that feels like more weight than I want to put on bitcoin- it's already a little bit overlapping with the S&P question. What I'd like to replace it with is something that 1. will have a definitive answer by next summer, 2. people have enough context to understand the question, and 3. isn't at obvious.

The questions are not checking for social skills. I am not sure how I'd do that on an online survey that's going to be self reported, and if you have thoughts about that I'm kind of curious? What percentage of the survey being about social skills would be sufficient? (I'm heavily into meetups and in-person gatherings for LessWrong events, so I might be one of the more receptive audiences for this line of argument!)

programcrafter on What TMS is like

I think TMS doesn't rewrite anything, instead activating neural circuits in another pattern? Then, new pattern is not depressed, brain can notice that (on either conscious or subconscious level) and make appropriate changes to neural connections.

Basically, I believe that whatever resulting patterns (including "other parts of you changed into something non-native and alien") you dis-endorse, are "committed" with significantly lower probability.

screwtape on 2024 Unofficial LW Community Census, Request for Comments

I could, but what if someone genuinely thinks it's that high number? Someone put 1,000,000 on the 2022 version of that question.