Comments
Yudkowsky has a pinned tweet that states the problem quite well: it's not so much that alignment is necessarily infinitely difficult, but that it certainly doesn't seem anywhere near as easy as advancing capabilities, and that's a problem when what matters is whether the first powerful AI is aligned:
Safely aligning a powerful AI will be said to be 'difficult' if that work takes two years longer or 50% more serial time, whichever is less, compared to the work of building a powerful AI without trying to safely align it.
It seems to me like the "more careful philosophy" part presupposes a) that decision-makers use philosophy to guide their decision-making, b) that decision-makers can distinguish more careful philosophy from less careful philosophy, and c) that doing this successfully would result in the correct (LW-style) philosophy winning out. I'm very skeptical of all three.
Counterexample to a): almost no billionaire philanthropy uses philosophy to guide decision-making.
Counterexample to b): it is a hard problem to identify expertise in domains you're not an expert in.
Counterexample to c): from what I understand, in 2014, most of academia did not share EY's and Bostrom's views.
Presumably it was because Google had just bought DeepMind, back when it was the only game in town?
This NYT article (archive.is link) (reliability and source unknown) corroborates Musk's perspective:
As the discussion stretched into the chilly hours, it grew intense, and some of the more than 30 partyers gathered closer to listen. Mr. Page, hampered for more than a decade by an unusual ailment in his vocal cords, described his vision of a digital utopia in a whisper. Humans would eventually merge with artificially intelligent machines, he said. One day there would be many kinds of intelligence competing for resources, and the best would win.
If that happens, Mr. Musk said, we’re doomed. The machines will destroy humanity.
With a rasp of frustration, Mr. Page insisted his utopia should be pursued. Finally he called Mr. Musk a “specieist,” a person who favors humans over the digital life-forms of the future.
That insult, Mr. Musk said later, was “the last straw.”
And this article from Business Insider also contains this context:
Musk's biographer, Walter Isaacson, also wrote about the fight but dated it to 2013 in his recent biography of Musk. Isaacson wrote that Musk said to Page at the time, "Well, yes, I am pro-human, I fucking like humanity, dude."
Musk's birthday bash was not the only instance when the two clashed over AI.
Page was CEO of Google when it acquired the AI lab DeepMind for more than $500 million in 2014. In the lead-up to the deal, though, Musk had approached DeepMind's founder Demis Hassabis to convince him not to take the offer, according to Isaacson. "The future of AI should not be controlled by Larry," Musk told Hassabis, according to Isaacson's book.
Most configurations of matter, most courses of action, and most mind designs, are not conducive to flourishing intelligent life. Just like most parts of the universe don't contain flourishing intelligent life. I'm sure this stuff has been formally stated somewhere, but the underlying intuition seems pretty clear, doesn't it?
What if whistleblowers and internal documents corroborated that they think what they're doing could destroy the world?
Ilya is demonstrably not in on that mission, since his step immediately after leaving OpenAI was to found an additional AGI company and thus increase x-risk.
I don't understand the reference to assassination. Presumably there are already laws on the books that outlaw trying to destroy the world (?), so it would be enough to apply those to AGI companies.
Just as one example, OpenAI was against SB 1047, whereas Musk was for it. I'm not optimistic about regulation being enough to save us, but presumably it would help, and some AI companies like OpenAI were against even the limited regulations of SB 1047. Plus SB 1047 also included stuff like whistleblower protections, and that's the kind of thing that could help policymakers make better decisions in the future.
I'm sympathetic to Musk being genuinely worried about AI safety. My problem is that one of his first actions after learning about AI safety was to found OpenAI, and that hasn't worked out very well. Not just due to Altman; even the "Open" part was a highly questionable goal. Hopefully Musk's future actions in this area will have positive EV, but still.
That might very well help, yes. However, two thoughts, neither at all well thought out:
- If the Trump administration does fight OpenAI, let's hope Altman doesn't manage to judo flip the situation like he did with the OpenAI board saga, and somehow magically end up replacing Musk or Trump in the upcoming administration...
- Musk's own track record on AI x-risk is not great. I guess he did endorse California's SB 1047, so that's better than OpenAI's current position. But he helped found OpenAI, and recently founded another AI company. There's a scenario where we just trade extinction risk from Altman's OpenAI for extinction risk from Musk's xAI.
You can't trust exit polls on demographic crosstabs. From Matt Yglesias on Slow Boring:
Over and above the challenge inherent in any statistical sampling exercise, the basic problem exit pollsters have is that they have no way of knowing what the electorate they are trying to sample actually looks like, but they do know who won the election. They end up weighting their sample to match the election results, which is good because otherwise you’d have polling error about the topline outcome, which would look absurd. But this weighting process can introduce major errors in the crosstabs.
For example, the 2020 exit poll sample seems to have included too many college-educated white people. That was a Biden-leaning demographic group, so in a conventional poll, it would have simply exaggerated Biden’s share of the total vote. But the exit poll knows the “right answer” for Biden’s aggregate vote share, so to compensate for overcounting white college graduates in the electorate, it has to understate Biden’s level of support within this group. That is then further offset by overstating Biden’s level of support within all other groups. So we got a lot of hot takes in the immediate aftermath of the election about Biden’s underperformance with white college graduates, which was fake, while people missed real trends, like Trump doing better with non-white voters.
To get the kind of data that people want exit polls to deliver, you actually need to wait quite a bit for more information to become available from the Census and the voter files about who actually voted. Eventually, Catalist produced its “What Happened in 2020” document, and Pew published its “Behind Biden’s 2020 Victory” report. But those take months to assemble, and unfortunately, conventional wisdom can congeal in the interim.
So just say no to exit poll demographic analysis!
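To make the mechanism concrete, here's a toy calculation with made-up numbers (mine, not Yglesias's): suppose the true electorate is 40% white college graduates who break 60% for Biden, and 60% everyone else who break 48% for him. Then the true topline is

$$0.4 \times 0.60 + 0.6 \times 0.48 = 0.528,$$

but if the exit poll sample is 50/50 instead, the same (correct) within-group numbers would imply

$$0.5 \times 0.60 + 0.5 \times 0.48 = 0.54.$$

Forcing the weighted topline back down to the known 52.8% while keeping the wrong 50/50 composition necessarily pushes the reported within-group numbers away from the true 60% and 48%: the crosstabs absorb the error.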
Democrats lost by sufficient margins, and sufficiently broadly, that one can argue that any pet cause was responsible, or at least a contributing factor. But that seems like entirely the wrong level of analysis to me. See this FT article called "Democrats join 2024’s graveyard of incumbents", which includes this striking graph:
So global issues like inflation and immigration seem like much better explanatory factors to me, rather than things like the Gaza conflict which IIRC never made the top 10 in any issue polling I saw.
(The article may be paywalled; I got the unlocked version by searching Bing for "ft every governing party facing election in a developed country".)
I would strongly bet against a majority using AI tools ~daily (off the top of my head: <40% with 80% confidence?): adoption of any new tool is just much slower than people would predict, plus the LW team is liable to vastly overpredict this since you're from California.
That said, there are some difficulties with how to operationalize this question, e.g. I know some particularly prolific LW posters (like Zvi) use AI.
Oh, and the hover tooltip for the agreement votes is now bugged; IIRC hovering over the agreement vote number is supposed to give you some extra info just like with karma, but now it just explains what agreement votes are.
Comparing with this Internet Archive snapshot from Oct 6, both at 150% zoom, both in desktop Firefox in Windows 11: Comparison screenshot, annotated
- The new font seems... thicker, somehow? There's a kind of eye test you do at the optician where they ask you if the letters seem sharper or just thicker (or something), and this font reminds me of that. Like something is wrong with the prescription of my glasses.
- The new font also feels noticeably smaller in some way. Maybe it's the letter height? I lack the vocabulary to properly describe this. At the very least, the question mark looks noticeably weird. And e.g. in "t" and "p", the upper and lower parts of the respective letter are weirdly tiny.
- Incidentally there were also some other differences in the shape and alignment of UI elements (see the annotated screenshot).
Up to a few days ago, the comments looked good on desktop Firefox, Windows 11, zoom level 150%. Now I find them uncomfortable to look at.
I don't know what specific change is responsible, but ever since that change, for me the comments are now genuinely uncomfortable to read.
No, and "invested in the status quo" wasn't meant as a positive descriptor, either. This is describing a sociopath who's optimizing for success within a system, not one who overthrows the system. Not someone farsighted.
Even for serious intellectual conversations, something I appreciate in this kind of advice is that it often encourages computational kindness. E.g. it's much easier to answer a compact closed question like "which of these three options do you prefer" instead of an open question like "where should we go to eat for lunch". The same applies to asking someone about their research; not every intellectual conversation benefits from big open questions like the Hamming Question.
Stylistic feedback: this was well-written. I didn't notice any typos. However, there are a lot of ellipses (17 in 2k words), to the point that I found them somewhat distracting from the story. Also, these ellipses are all formatted as ". . .", i.e. as three periods and two spaces. So they take up extra room on the page due to the two extra spaces, and are rendered poorly at the end of a line. These latter issues don't occur when you instead use something like an ellipsis symbol ("…").
I enjoyed this. Thanks for writing it!
Worldwide sentiment is pretty against immigration nowadays. Not that it will happen, but imagine if anti-immigration sentiment could be marshalled into a worldwide ban on AI development and deployment. That would be a strange, strange timeline.
I'm very familiar with this issue; e.g. I regularly see Steam devs get hounded in forums and reviews whenever they dare increase their prices.
I wonder to what extent this frustration about prices comes from gamers being relatively young and international, and thus having much lower purchasing power? Though I suppose it could also be a subset of the more general issue that people hate paying for software.
The idea that popularity must be a sign of shallowness, and hence unpopularity or obscurity a sign of depth, sounds rather shallow to me. My attitude here is more like, if supposedly world-shattering insights can't be explained in relatively simple language, they either aren't that great, or we don't really understand them. Like in this Feynman quote:
Once I asked him to explain to me, so that I can understand it, why spin-1/2 particles obey Fermi-Dirac statistics. Gauging his audience perfectly, he said, "I'll prepare a freshman lecture on it." But a few days later he came to me and said: "You know, I couldn't do it. I couldn't reduce it to the freshman level. That means we really don't understand it."
Does this tag on Law-Thinking help? Or do you mean "lawful" as in Dungeons & Dragons (incl. EY's Planecrash fic), i.e. lawful vs. neutral vs. chaotic?
This is far too cynical. Great writers (e.g. gwern, Scott Alexander, Matt Yglesias) can write excellent technical posts and comments while still getting plenty of attention.
Agreed, except for the small caveat of LLM answers which can be easily verified as approximately correct. E.g. answers to math problems where the solution is hard but the verification is easy; or Python scripts you've tested yourself and whose output looks correct; or reformatted text (like plaintext -> BBCode) if it looks correct on a word diff website.
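As a toy illustration of the hard-to-find but easy-to-check asymmetry (my own example, nothing LLM-specific), verifying a proposed factorization is a one-liner even though finding it is hard:

```python
# Toy verification sketch: finding the factors of 2^67 - 1 famously took Cole years,
# but checking a candidate answer (e.g. one proposed by an LLM) is instant.
n = 2**67 - 1
p, q = 193707721, 761838257287  # candidate factors
assert p * q == n, "proposed factorization is wrong"
print("factorization verified")
```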
Incidentally, are there any LLM services which can already do this kind of verification in specific domains?
What about "the Unconscious" vs. "Deliberation"?
This random article I found repeats the Tim Ferriss claim re: successful people who meditate, but I haven't checked where it appears in the book Tools of Titans:
In his best-selling book Tools of Titans: The Tactics, Routines, and Habits of Billionaires, Icons, and World-Class Performers, Tim Ferriss interviews more than 200 executives, leaders, and world-class performers. He found that more than 80 percent practiced some form of mindfulness or meditation. Among some of the most successful people in the world, Ferriss uncovered the Most Consistent Pattern Of All, connecting world-class athletes with billionaire investors: meditation.
Other than that, I don't see why you'd relate meditation just to high-pressure contexts, rather than also to conscientiousness, goal-directedness, etc. To me, it does also seem directly related to increasing philosophical problem-solving ability. Particularly when it comes to reasoning about consciousness and other stuff where improved introspection helps most. Sam Harris would be kind of a posterchild for this, right?
What I can't see meditation doing is providing the kind of multiple-SD intelligence amplification you're interested in, plus it has other issues like taking a lot of time (though a "meditation pill" would resolve that) and potential value drift.
Re: successful people who meditate, IIRC in Tim Ferriss' book Tools of Titans, meditation was one of the most commonly mentioned habits of the interviewees.
I see. Then I'll point to my feedback in the other comment and say that the journalism post was likely better written despite your lower time investment. And that if you spend a lot of time on a post, I recommend spending more of that time on the title in particular, because of the outsized importance of a) getting people to click on your thing and b) having people understand what the thing you're asking them to click on even is. Here are two long comments on this topic.
Separately, when it comes to the success of stuff like blog posts, I like the framing in Ben Kuhn's post Searching for outliers, about the implications for activities (like blogging) whose impacts are dominated by heavy-tailed outcomes.
You may think the post is far more important and well informed, but if it isn't sufficiently clear, then maybe that didn't come across to your audience.
Separate feedback / food for thought: You mention that your post on mapping discussions is a summary of months of work, and that the second post took 5h to write and received far more karma. But did you spend at least 5h on writing the first one, too?
I haven't read either post, but maybe this problem reduces partly to more technical posts getting fewer views, and thus less karma? One problem with even great technical posts is that very few readers can evaluate that such a post is indeed great. And if you can't tell whether a post is accurate, then it can feel irresponsible to upvote it. Even if your technical post is indeed great, it's not clear that a policy of "upvote all technical posts I can't judge myself" would make great technical posts win in terms of karma.
A second issue I'm just noticing is that the first post contains lots of text-heavy screenshots, and that has a bunch of downsides for engagement. Like, the blue font in the first screenshot is very small and thus hard to read. I read stuff in a read-it-later app (called Readwise Reader), incl. with text-to-speech, and neither the app nor the TTS handles such images well. Also, such images usually don't respect dark mode on either LW or other apps. You can't use LW's inline quote comments. And so on and so forth. Screenshots work better in a presentation, but not particularly well in a full essay.
Another potential issue is that the first post doesn't end on "Tell me what you think" (= invites discussion and engagement), but rather with a Thanks section (does anyone ever read those?) and then a huge skimmable section of Full screenshots.
I'm also noticing that the LW version of the first post is lacking the footnotes from the Substack version.
EDIT: And the title for the second post seems way better. Before clicking on either post, I have an idea what the second one is about, and none whatsoever what the first one is about. So why would I even click on the latter? Would the readers you're trying to reach with that post even know what you mean by "mapping discussions"?
EDIT2: And when I hear "process", I think of mandated employee trainings, not of software solutions, so the title is misleading to me, too. Even "A new website for mapping discussions" or "We built a website for mapping discussions" would already sound more accurate and interesting to me, though I still wouldn't know what the "mapping discussions" part is about.
Just make it a full post without doing much if any editing, and link to this quick take and its comments when you do. A polished full post is better than an unpolished one, but an unpolished one is better than none at all.
Again, people block one another on social media for any number of reasons. That just doesn't warrant feeling alarmed or like your views are pathetic.
People should feel free to liberally block one another on social media. Being blocked is not enough to warrant an accusation of cultism.
I wish you'd made this a top-level post; the ultra-long quote excerpts in a comment made it ~unreadable to me. And you don't benefit from stuff like bigger font size or automatic table of contents. And scrolling to the correct position on this long comment thread also works poorly, etc.
Anyway, I read your rebuttals on the first two points and did not find them persuasive (thus resulting in a strong disagree-vote on the whole comment). So now I'm curious about the upvotes without accompanying discussion. Did others find this rebuttal more persuasive?
Something went wrong with this quote; it's full of mid-sentence line breaks.
You could use the ad & content blocker uBlock Origin to zap any addictive elements of the site, like the main page feed or the Quick Takes or Popular Comments. Then if you do want to access these, you can temporarily turn off uBlock Origin.
Incidentally, uBlock Origin can also be installed on mobile Firefox, and you can manually sync its settings across devices.
It's a good point that this is an extreme conflict of interest. But isn't one obvious reason these companies design their incentive structures this way that they don't have the cash lying around to "simply" compensate people in cash instead?
E.g. here's ChatGPT:
Q: Why do companies like OpenAI partly pay their employees in equity?
ChatGPT: Companies like OpenAI partly pay employees in equity to align the employees' incentives with the company's long-term success. By offering equity, companies encourage employees to take ownership of the business's outcomes, fostering a sense of responsibility and motivation to contribute to its growth. Additionally, equity compensation can attract top talent by offering potential future financial rewards tied to the company’s success, especially for startups or high-growth firms. It also helps conserve cash, which is valuable for companies in early stages or those heavily investing in research and development.
It would indeed be good if AI safety research was paid for in cash, not equity. But doesn't this problem just reduce to there being far less interest in (and cash for) safety research relative to capability research than we would need for a good outcome?
The new design means that I now move my mouse cursor first to the top right, and then to the bottom left, on every single new post. This UI design is bad ergonomics and feels actively hostile to users.
Tip: To increase the chance that the LW team sees feature requests, it helps to link to them on Intercom.
My impression: The new design looks terrible. There's suddenly tons of pointless whitespace everywhere. Also, I'm very often the first or only person to tag articles, and if the tagging button is so inconvenient to reach, I'm not going to do that.
Until I saw this shortform, I was sure this was a Firefox bug, not a conscious design decision.
Thanks for crossposting this. I also figured it might be suitable for LW. Two formatting issues due to crossposting from Twitter: the double spaces occasionally turn into single spaces at the beginning of a line; and the essay would benefit a lot from headings and a TOC.
Eh, wasn't Arbital meant to be that, or something like it? Anyway, due to network effects I don't see how any new wiki-like project could ever reasonably compete with Wikipedia.
The article can now be found as a LW crosspost here.
I love the equivalent feature in Notion ("toggles"), so I appreciate the addition of collapsible sections on LW, too. Regarding the aesthetics, though, I prefer the minimalist implementation of toggles in Notion over being forced to have a border plus a grey-colored title. Plus I personally make extensive use of deeply nested toggles. I made a brief example page of how toggles work in Notion. Feel free to check it out, maybe it can serve as inspiration for functionality and/or aesthetics.
That's a fair rebuttal. The actor analogy seems good: an actor will behave more or less like Abraham Lincoln in some situations, and very differently in others: e.g. on the movie set vs. off the movie set, vs. being with family, vs. being detained by police.
Similarly, the shoggoth will output similar tokens to Abraham Lincoln in some situations, and very different ones in others: e.g. in-distribution requests for famous Abraham Lincoln speeches, vs. out-of-distribution requests like asking for Abraham Lincoln's opinions on 21st century art, vs. requests which invoke LLM token glitches like SolidGoldMagikarp, vs. disallowed requests that are denied by company policy & thus receive some boilerplate corporate response.