Yeah, IMO we should just add a bunch of functionality for integrating alignment forum stuff more with academic things. It's been on my to-do list for a long time.
I think "full visibility" seems like the obvious thing to ask for, and something that could maybe improve things. Also, preventing you from selling your products to the public, and basically forcing you to sell your most powerful models only to the government, gives the government more ability to stop things when it comes to it.
I will think more about this, I don't have any immediate great ideas.
If the project were fueled by a desire to beat China, its structure seems unlikely to resemble the parts of the original Manhattan Project's structure that seemed maybe advantageous here, like having a single government-controlled centralized R&D effort.
My guess is if something like this actually happens, it would involve a large number of industry subsidies, and would create strong institutional momentum to push the state of the art forward even when things get dangerous, and, insofar as there is pushback, to continue dangerous development in secret.
In the case of nuclear weapons the U.S. really went very far under the advisement of Edward Teller, so I think the outside view here really doesn't look good:
I don't remember ever adjudicating this, but my current intuition, having not thought about it hard, is that I don't see a super clear line here (like, in a moderation dispute I can imagine judging either way depending on the details).
The Truman Show: Great depiction of crisis of faith and noticing your confusion, and generally it's about figuring out the truth.
Most relevant sequence posts: Crisis of Faith, Lonely Dissent
Going by today's standards, we should have banned Gwern in 2012.
(I don't understand what this is referring to)
Indeed. I fixed it. Let's see whether it repeats itself (we got kind of malformed HTML from the RSS feed).
Update: I have now cross-referenced every single email for accuracy, cleaned up and clarified the thread structure, and added subject lines and date stamps wherever they were available. I now feel comfortable with people quoting anything in here without checking the original source (unless you are trying to understand the exact thread structure of who was CC'd and when, which was a bit harder to compress into a linear format).
(For anyone curious, the AI transcription and compilation made one single error, which is that it fixed a typo in one of Sam's messages from "We did this is a way" to "We did this in a way". Honestly, my guess is any non-AI effort would have had a substantially higher error rate, which was a small update for me on the reliability of AI for something like this, and also makes the handwringing about whether it is OK to post something like this feel kind of dumb. It also accidentally omitted one email with a weird thread structure.)
FWIW, my best guess is the document contains fewer errors than having a human copy-paste things and stitch it together. The errors have a different nature to them, and so it makes sense to flag them, but like, I started out with copy-pasting and OCR, and that did not actually have an overall lower error rate.
If other people have to check it before they quote it, why is it OK for you not to check it before you post it?
Because I said prominently at the top that I used AI assistance for it. Of course, feel free to do the same.
Fixed! That specific response had a very weird thread structure, so makes sense the AI I used got confused. Plausible something else is still missing, though I think I've now read through all the original PDFs and didn't see anything new.
What do you mean by "applied research org"? Like, applied alignment research?
A bunch of very interesting emails between Elon, Sam Altman, Ilya and Greg were released (I think in some legal proceedings, but not sure). It would IMO be cool for someone to gather them all and do some basic analysis of them.
This was a really good analysis of a bunch of election stuff that I hadn't seen presented clearly like this anywhere else. If it wasn't about elections and news I would curate it.
Not sure what you mean. The API continues to exist (and has existed since the beginning of LW 2.0).
I think the comment more confirms than disconfirms John's comment (though I still think it's too broad for other reasons). OP "funding" something historically has basically always meant recommending a grant to GV. Luke's language to me suggests that indeed the right of center grants are no longer referred to GV (based on a vague vibe of how he refers to funders in plural).
OP has always made some grant recommendations to other funders (historically OP would probably describe those grants as "rejected but referred to an external funder"). As Luke says, those are usually ignored, and OP's counterfactual effect on those grants is much less, and IMO it would be inaccurate to describe those recommendations as "OP funding something". As I said in the comment I quote in the thread, most OP staff would like to fund things right of center, but GV does not seem to want to, as such the only choice OP has is to refer them to other funders (which sometimes works, but mostly doesn't).
As another piece of evidence, when OP defunded all the orgs that GV didn't want to fund anymore, the communication emails that OP sent said that "Open Philanthropy is exiting funding area X" or "exiting organization X". By the same use of language, yes, it seems like OP has exited funding right-of-center policy work.
(I think it would make sense to taboo "OP funding X" in future conversations to avoid confusion, but also, I think historically it was very meaningfully the case that getting funded by GV is much better described as "getting funded by OP", given that you would never talk to anyone at GV and the opinions of anyone at GV would basically have no influence on you getting funded. Things are different now, and in a meaningful sense OP isn't funding anyone anymore, they are just recommending grants to others, and it matters more what those others think than what OP staff thinks.)
One of these types of orgs is developing a technology with the potential to kill literally all of humanity. The other type of org is funding research that, if it goes badly, mostly just wastes their own money. Of course the demands for legibility and transparency should be different.
My best guess is this is false. As a quick sanity-check, here are some bipartisan and right-leaning organizations historically funded by OP:
- FAI leans right: https://www.openphilanthropy.org/grants/foundation-for-american-innovation-ai-safety-policy-advocacy/
- Horizon is bipartisan: https://www.openphilanthropy.org/grants/open-philanthropy-technology-policy-fellowship-2022/
- CSET is bipartisan: https://www.openphilanthropy.org/grants/georgetown-university-center-for-security-and-emerging-technology/
- IAPS is bipartisan: https://www.openphilanthropy.org/grants/page/2/?focus-area=potential-risks-advanced-ai&view-list=false, https://www.openphilanthropy.org/grants/institute-for-ai-policy-strategy-general-support/
- RAND is bipartisan: https://www.openphilanthropy.org/grants/rand-corporation-emerging-technology-fellowships-and-research-2024/
- Safe AI Forum: https://www.openphilanthropy.org/grants/safe-ai-forum-operating-expenses/
- AI Safety Communications Centre seems to lean left: https://www.openphilanthropy.org/grants/effective-ventures-foundation-ai-safety-communications-centre/
Of those, I think FAI is the only one at risk of OP being unable to fund them, based on my guess of where things are leaning. I would be quite surprised if they defunded the other ones on bipartisan grounds.
Possibly you meant to say something more narrow like "even if you are trying to be bipartisan, if you lean right, then OP is substantially less likely to fund you" which I do think is likely true, though my guess is you meant the stronger statement, which I think is false.
Curious whether this is a different source than me. My current best model was described in this comment, which is a bit different (and indeed, my sense was that if you are bipartisan, you might be fine, or might not, depending on whether you seem more connected to the political right, and whether people might associate you with the right):
Yep, my model is that OP does fund things that are explicitly bipartisan (like, they are not currently filtering on being actively affiliated with the left). My sense is in-practice it's a fine balance and if there was some high-profile thing where Horizon became more associated with the right (like maybe some alumni becomes prominent in the republican party and very publicly credits Horizon for that, or there is some scandal involving someone on the right who is a Horizon alumni), then I do think their OP funding would have a decent chance of being jeopardized, and the same is not true on the left.
Another part of my model is that one of the key things about Horizon is that they are of a similar school of PR as OP themselves. They don't make public statements. They try to look very professional. They are probably very happy to compromise on messaging and public comms with Open Phil and be responsive to almost any request that OP would have messaging-wise. That makes up for a lot. I think if you had a more communicative and outspoken organization with a similar mission to Horizon, the funding situation would be a bunch dicier (though my guess is that if they were competent, an organization like that could still get funding).
More broadly, I am not saying "OP staff want to only support organizations on the left". My sense is that many individual OP staff would love to fund more organizations on the right, and would hate for polarization to occur, but that organizationally and because of constraints by Dustin, they can't, and so you will see them fund organizations that aim for more engagement with the right, but there will be relatively hard lines and constraints that will mostly prevent that.
If it is true that OP has withdrawn funding from explicitly bipartisan orgs, even if not commonly associated with the right, then that would be an additional update for me, so am curious whether this is mostly downstream of my interpretations or whether you have additional sources.
Huh, o1 and the latest Claude were quite huge advances to me. Basically within the last year, LLMs for coding went from "occasionally helpful, maybe like a 5-10% productivity improvement" to "my job now is basically to instruct LLMs to do things, depending on the task a 30% to 2x productivity improvement".
(and Anthropic has a Usage Policy, with exceptions, which disallows weapons stuff — my guess is this is too strong on weapons).
I think usage policies should not be read as commitments, and so I think it would be reasonable to expect that Anthropic will allow weapon development if it becomes highly profitable (and in contrast to other things Anthropic has promised, to not be interpreted as a broken promise when they do so).
FWIW, as a common critic of Anthropic, I think I agree with this. I am a bit worried about engaging with the DoD being bad for Anthropic's epistemics and ability to be held accountable by the government and public, but I think the basics of engaging on defense issues seems fine to me, and I don't think risks from AI route basically at all through AI being used for building military technology, or intelligence analysis.
Ah, we should maybe font-subset some system font for that (same as what we did for Greek characters). If someone gives me a character range specification I could add it.
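For reference, here's a minimal sketch of the kind of character-range spec I mean, assuming a subsetting tool like fontTools; the Cyrillic range U+0400-04FF and the file names below are just placeholders, since the real range would be whichever characters are failing to render:

```python
# Hypothetical sketch (not the actual LW build step): subset a system font down to
# one Unicode range with fontTools, the same general approach as the Greek subset.
# The range below (Cyrillic, U+0400-04FF) is only an example placeholder.
from fontTools.ttLib import TTFont
from fontTools.subset import Subsetter, Options

font = TTFont("SomeSystemFont.ttf")  # placeholder font file name

subsetter = Subsetter(Options())
subsetter.populate(unicodes=range(0x0400, 0x0500))  # the "character range specification"
subsetter.subset(font)

font.flavor = "woff2"  # needs the brotli package installed
font.save("SomeSystemFont-subset.woff2")
```

The matching @font-face rule would then declare the same unicode-range, so browsers only fetch the subset when those characters actually appear on a page.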
"stop reading here if you don't want to be spoiled."
(I added that sentence based on Jonathan Claybrough's comment, feel free to suggest an alternative one)
I watched the video and didn't see any stats from their own experiment. Do you have a frame or a section?
(Most people in AI Alignment work at scaling labs and are therefore almost exclusively working on LLM alignment. That said, I don't actually know what it means to work on LLM alignment over aligning other systems; it's not like we have a ton of traction on LLM alignment, and most techniques and insights seem general enough to not be conditional specifically on LLMs.)
Note: I added some spoiler warnings (given the one comment complaining). I don't feel strongly, so feel free to revert
It was "advice" just not... "investment advice"? I do admit I do not understand the proper incantations and maybe should study them more.
Who says those things? That doesn't really sound like something that people say. Like, I think there are real arguments about why LLM agents might not be the most likely path to AGI, but "they are still pretty dumb, therefore that's not a path to AGI" seems like obviously a strawman, and I don't think I've ever seen it (or at least not within the last 4 years or so).
Yep, it seems like pretty standard usage to me (and IMO seems conceptually fine, despite the fact that "genetic" means something different, since for some reason using "memetic" in the same way feels very weird or confused to me, like I would almost never say "this has memetic origin")
Great idea!
The "could" here is (in context) about "could not get funding from modern OP". The whole point of my comment was about the changes that OP underwent. Sorry if that wasn't as clear, it might not be as obvious to others that of course OP was very different in the past.
Like, here's a sanity-check: suppose you must convince a specific Creationist that the AGI Risk is real. Do you need to argue them out of Creationism in order to do so?
My guess is no, but also, my guess is we will probably still have better comms if I err on the side of explaining things how they come naturally to me, and entangled with the way I came to adopt a position, and then they can do a bunch of the work of generalizing. Of course, if something is deeply triggering or mindkilly to someone, then it's worth routing, but it's not like any analogy with evolution is invalid from the perspective of someone who believes in Creationism. Yes, some of the force of such an analogy would be lost, but most of it comes from the logical consistency, not the empirical evidence.
In 2023/2024 OP drastically changed its funding process and priorities (in part in response to FTX, in part in response to Dustin's preferences). This whole conversation is about the shift in OP's giving in this recent time period.
See also: https://forum.effectivealtruism.org/posts/foQPogaBeNKdocYvF/linkpost-an-update-from-good-ventures
No, I think this kind of very naive calculation does predictably result in worse arguments propagating, people rightfully dismissing those bad arguments (because they are not entangled with the real reasons why any of the people who have thought about the problem have formed beliefs on an issue themselves), and then ultimately the comms problem getting much harder.
I am in favor of people thinking hard about these issues, but I think exactly this kind of naive argument is in an uncanny valley where your comms get substantially worse.
Yeah, I agree with a lot of this in principle. But I think the specific case of avoiding saying anything that might have something to do with evolution is a pretty wrong take on this dimension of trying to communicate clearly.
Seems like a mistake! Agree it's not uncommon to use them less, though my guess (with like 60% confidence) is that the majority of authors on LW use them daily, or very close to daily.
First of all, even taking what Gwern says there at face value, how many of the posts here that are written “with AI involvement” would you say actually are checked, edited, etc., in the rigorous way which Gwern describes? Realistically?
My guess is very few people are using AI output directly (at least at present it's pretty obvious, as their writing is kind of atrocious). I do think most posts probably involved people talking through their thoughts with an LLM, asking for some editing help, or asking some factual questions. My guess is basically 100% of those went through the kind of process that Gwern was describing here.
Do you not use LLMs daily? I don't currently find them out-of-the-box useful for editing, but find them useful for a huge variety of tasks related to writing things.
I think it would be more of an indictment of LessWrong if people somehow didn't use them; they obviously increase my productivity at a wide variety of tasks, and being an early adopter of powerful AI technologies seems like one of the things that I hope LessWrong authors excel at.
In general, I think Gwern's suggested LLM policy seems roughly right to me. Of course people should use LLMs extensively in their writing, but if they do, they really have to read any LLM writing that makes it into their post and check that what it says is true:
I am also fine with use of AI in general to make us better writers and thinkers, and I am still excited about this. (We unfortunately have not seen much benefit for the highest-quality creative nonfiction/fiction or research, like we aspire to on LW2, but this is in considerable part due to technical choices & historical contingency, which I've discussed many times before, and I still believe in the fundamental possibilities there.) We definitely shouldn't be trying to ban AI use per se.
However, if someone is posting a GPT-4 (or Claude or Llama) sample which is just a response, then they had damn well better have checked it and made sure that the references existed and said what the sample says they said and that the sample makes sense and they fixed any issues in it. If they wrote something and had the LLM edit it, then they should have checked those edits and made sure the edits are in fact improvements, and improved the improvements, instead of letting their essay degrade into ChatGPTese. And so on.
And also, I do not personally want to be running into any writing that AI had a hand in.
(My guess is the majority of posts written daily on LW are now written with some AI involvement. My best guess is most authors on LessWrong use AI models on a daily basis, asking factual questions, and probably also asking for some amount of editing and writing feedback. As such, I don't think this is a coherent ask.)
I don't think this kind of surface-level naive popularity optimization gives rise to a good comms strategy. Evolution is true, and mostly we should focus on making arguments based on true premises.
On mobile we by default use a markdown editor, so you can use markdown to format things.
Interesting, thanks! Checking an older version of Gill Sans probably wouldn't have been something I would have thought to do, so your help is greatly appreciated.
I'll experiment some with getting Gill Sans MT Pro.
Sure, I was just responding to this literal quote:
Couldn't you please just set the comment font to the same as the post font?
(My model of Daniel thinks the AI will likely take over, but probably will give humanity some very small fraction of the universe, for a mixture of "caring a tiny bit" and game-theoretic reasons)
The "Recommended" tab filters out read posts by default. We never had much demand for showing recently-sorted posts while filtering out only ones you've read, but it wouldn't be very hard to build.
Not sure what you mean by "load more at once". We could add a whole user setting to allow users to change the number of posts on the frontpage, but done consistently that would produce a ginormous number of user settings for everything, which would be a pain to maintain (not like, overwhelmingly so, but I would be surprised if it was worth the cost).
We previously had Calibri for Windows (indeed a very popular Windows system font). Gill Sans (which we now ship to all operating systems) is a quite popular MacOS and iOS system font. I currently think there are some weird rendering issues on Windows, but if that's fixed, my guess is you would get used to it quickly enough. Gill Sans is not a rare font on the internet.
Yep, definitely a bug. Should be fixed soon.
We have done lots of user interviews over the years! Fonts are always polarizing, but people have a strong preference for sans serifs at small font sizes (and people prefer denser comment sections, though it's reasonably high variance).
Plausible we might want to revert to Calibri on Windows, but I would like to make Gill Sans work. Having different font metrics on different devices makes a lot of detailed layout work much more annoying.
Curious if you can say more about the nature of the discomfort. Also curious whether fellow font optimizer @Said Achmiz has any takes, since he has been helpful here in the past, especially on the "making things render well on Windows" side.