LessWrong 2.0 Reader


Ultralearning in 80 days
aproteinengine · 2024-11-26T00:01:23.679Z · comments (7)
Germany-wide ACX Meetup
Fernand0 · 2024-11-17T10:08:54.584Z · comments (0)
[link] Entropic strategy in Two Truths and a Lie
dkl9 · 2024-11-21T22:03:28.986Z · comments (2)
[question] Is OpenAI net negative for AI Safety?
Lysandre Terrisse · 2024-11-02T16:18:02.859Z · answers+comments (0)
Visualizing small Attention-only Transformers
WCargo (Wcargo) · 2024-11-19T09:37:42.213Z · comments (0)
Effects of Non-Uniform Sparsity on Superposition in Toy Models
Shreyans Jain (shreyans-jain) · 2024-11-14T16:59:43.234Z · comments (3)
A better “Statement on AI Risk?”
Knight Lee (Max Lee) · 2024-11-25T04:50:29.399Z · comments (4)
[question] What (if anything) made your p(doom) go down in 2024?
Satron · 2024-11-16T16:46:43.865Z · answers+comments (6)
What are Emotions?
Myles H (zarsou9) · 2024-11-15T04:20:27.388Z · comments (13)
Some Comments on Recent AI Safety Developments
testingthewaters · 2024-11-09T16:44:58.936Z · comments (0)
[question] Noticing the World
EvolutionByDesign (bioluminescent-darkness) · 2024-11-04T16:41:44.696Z · answers+comments (1)
On AI Detectors Regarding College Applications
Kaustubh Kislay (kaustubh-kislay) · 2024-11-27T20:25:48.151Z · comments (0)
Towards a Clever Hans Test: Unmasking Sentience Biases in Chatbot Interactions
glykokalyx · 2024-11-10T22:34:58.956Z · comments (0)
[question] What are the primary drivers that caused selection pressure for intelligence in humans?
Towards_Keeperhood (Simon Skade) · 2024-11-07T09:40:20.275Z · answers+comments (15)
Hope to live or fear to die?
Knight Lee (Max Lee) · 2024-11-27T10:42:37.070Z · comments (0)
LDT (and everything else) can be irrational
Christopher King (christopher-king) · 2024-11-06T04:05:36.932Z · comments (6)
Reducing x-risk might be actively harmful
MountainPath · 2024-11-18T14:25:07.127Z · comments (5)
Distributed espionage
margetmagenta · 2024-11-04T19:43:33.316Z · comments (0)
The boat
RomanS · 2024-11-22T12:56:45.050Z · comments (0)
Antonym Heads Predict Semantic Opposites in Language Models
Jake Ward (jake-ward) · 2024-11-15T15:32:14.102Z · comments (0)
[link] Higher Order Signs, Hallucination and Schizophrenia
Nicolas Villarreal (nicolas-villarreal) · 2024-11-02T16:33:10.574Z · comments (0)
[link] Both-Sidesism—When Fair & Balanced Goes Wrong
James Stephen Brown (james-brown) · 2024-11-02T03:04:03.820Z · comments (15)
Beyond Gaussian: Language Model Representations and Distributions
Matt Levinson · 2024-11-24T01:53:38.156Z · comments (0)
[link] Decorated pedestrian tunnels
dkl9 · 2024-11-24T22:16:03.794Z · comments (3)
(draft) Cyborg software should be open (?)
AtillaYasar (atillayasar) · 2024-11-01T07:24:51.966Z · comments (5)
notes on prioritizing tasks & cognition-threads
Emrik (Emrik North) · 2024-11-26T00:28:03.400Z · comments (1)
Should you increase AI alignment funding, or increase AI regulation?
Knight Lee (Max Lee) · 2024-11-26T09:17:01.809Z · comments (1)
[link] AI Safety at the Frontier: Paper Highlights, October '24
gasteigerjo · 2024-10-31T00:09:33.522Z · comments (0)
[question] How might language influence how an AI "thinks"?
bodry (plosique) · 2024-10-30T17:41:04.460Z · answers+comments (0)
[link] When the Scientific Method Doesn't Really Help...
casualphysicsenjoyer (hatta_afiq) · 2024-11-27T19:52:30.023Z · comments (0)
[question] Poll: what’s your impression of altruism?
David Gross (David_Gross) · 2024-11-09T20:28:15.418Z · answers+comments (4)
Root node of my posts
AtillaYasar (atillayasar) · 2024-11-19T20:09:02.973Z · comments (0)
aspirational leadership
dhruvmethi · 2024-11-20T16:07:43.507Z · comments (0)
[link] Sparks of Consciousness
Charlie Sanders (charlie-sanders) · 2024-11-13T04:58:27.222Z · comments (0)
MIT FutureTech are hiring a Product and Data Visualization Designer
peterslattery · 2024-11-13T14:48:06.167Z · comments (0)
[question] Have we seen any "ReLU instead of sigmoid-type improvements" recently
KvmanThinking (avery-liu) · 2024-11-23T03:51:52.984Z · answers+comments (4)
[link] Some Preliminary Notes on the Promise of a Wisdom Explosion
Chris_Leong · 2024-10-31T09:21:11.623Z · comments (0)
Don't want Goodhart? — Specify the variables more
YanLyutnev (YanLutnev) · 2024-11-21T22:43:48.362Z · comments (2)
Agenda Manipulation
Pazzaz · 2024-11-09T14:13:33.729Z · comments (0)
Gothenburg LW/ACX meetup
Stefan (stefan-1) · 2024-10-29T20:40:22.754Z · comments (0)
Which AI Safety Benchmark Do We Need Most in 2025?
Loïc Cabannes (loic-cabannes) · 2024-11-17T23:50:56.337Z · comments (2)
Workshop Report: Why current benchmark approaches are not sufficient for safety?
Tom DAVID (tom-david) · 2024-11-26T17:20:47.453Z · comments (0)
Breaking beliefs about saving the world
Oxidize · 2024-11-15T00:46:03.693Z · comments (3)
Jakarta ACX December 2024 Meetup
Aud (aud) · 2024-11-19T15:01:31.101Z · comments (0)
'Meta', 'mesa', and mountains
Lorec · 2024-10-31T17:25:53.635Z · comments (0)
AI alignment via civilizational cognitive updates
AtillaYasar (atillayasar) · 2024-11-10T09:33:35.023Z · comments (10)
Truth Terminal: A reconstruction of events
crvr.fr (crdevio) · 2024-11-17T23:51:21.279Z · comments (1)
[question] A Coordination Cookbook?
azergante · 2024-11-10T23:20:34.843Z · answers+comments (0)
A Meritocracy of Taste
Daniele De Nuntiis (daniele-de-nuntiis) · 2024-11-28T09:10:10.598Z · comments (0)
Modeling AI-driven occupational change over the next 10 years and beyond
2120eth · 2024-11-12T04:58:26.741Z · comments (0)