Posts
Comments
Noting that a nicer name that's just waiting to be had, in this context, is "Future of the Lightcone Institute" :)
Two notes:
- I think the title is a somewhat obscure pun referencing the old saying that Stanford was the "Harvard of the West". If one is not familiar with that saying, I guess some of the nuance is lost in the choice of term. (I personally had never heard that saying before recently, and I'm not even quite sure I'm referencing the right "X of the West" pun)
- habryka did have a call with Nick Bostrom a few weeks back, to discuss his idea for an "FHI of the West", and I'm quite confident he referred to it with that phrase on the call, too. As far as I'm aware, Nick didn't react to it with more than a bit of humor.
See this: https://www.lesswrong.com/posts/CTBta9i8sav7tjC2r/how-to-hopefully-ethically-make-money-off-of-agi
Can you CC me too?
I work from the same office as John, and dozens of LessWrong readers also work there on a regular basis. We could probably set up an experiment here with many willing volunteers, and I'm interested in helping make it happen (if it continues to seem promising after thinking more about it).
[Mod note: I edited out your email from the comment, to save you from getting spam email and similar. If you really want it there, feel free to add it back! :) ]
Mod here: most of the team were away over the weekend so we just didn't get around to processing this for personal vs frontpage yet. (All posts start as personal until approved to frontpage.) About to make a decision in this morning's moderation review session, as we do for all other new posts.
Jake himself has participated in both Zika and Shigella challenge trials.
Your civilisation thanks you 🫡
Cool idea and congrats on shipping! I've installed it and am trying it now. One piece of user feedback: having to wait for full replies felt a bit high-friction. Maybe you could stream responses in chunks? (I did this for a GPT-to-Slack app once. You just can't stream letter-by-letter, because you'll get rate limited.)
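For what it's worth, the chunking I had in mind looks roughly like this. A minimal sketch only: `send_update` stands in for whatever your chat platform's message-edit call is, and the thresholds are made-up defaults, not tuned values.

```python
import time

def stream_in_chunks(tokens, send_update, min_chars=80, min_interval=0.5):
    """Relay a token stream to a chat message in chunked edits.

    `tokens` is any iterable of text fragments (e.g. from an LLM
    streaming API). `send_update` is called with the full text so far,
    but only once at least `min_chars` new characters have accumulated
    and `min_interval` seconds have passed since the last edit --
    editing on every token would trip per-message rate limits.
    """
    text, pending, last = "", 0, 0.0
    for tok in tokens:
        text += tok
        pending += len(tok)
        now = time.monotonic()
        if pending >= min_chars and now - last >= min_interval:
            send_update(text)
            pending, last = 0, now
    if pending:
        send_update(text)  # final flush with any remainder
    return text
```

The key trade-off is that larger `min_chars` / `min_interval` means fewer API calls but choppier-feeling updates.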
If that's your belief, I think you should edit in a disclaimer to your TL;DR section, like "Gemini and GPT-4 authors report results close to or matching human performance at 95%, though I don't trust their methodology".
Also, the numbers aren't "non-provable": anyone could just replicate them with the GPT-4 API! (Modulo dataset contamination considerations.)
Humans achieve over 95% accuracy, while no model surpasses 50% accuracy. (2019)
A series on benchmarks does seem very interesting and useful -- but you really gotta report more recent model results than 2019's! GPT-4 reportedly scores 95.3% on HellaSwag, making that initial claim in the post very misleading.
Ah! I investigated and realised what the bug is. (Currently, only a dialogue's single main author can archive it, not the other authors.) Will fix!
You can go to your profile page and press the "Archive" icon, which appears when you hover to the right of a dialogue.
Yeah, I'm interested in features in this space!
Another idea is to implement a similar algorithm to Twitter's community votes: identify comments that have gotten upvotes by people who usually disagree with each other, and highlight those.
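A minimal sketch of the idea, under assumed data shapes: `vote_history` maps each user to their +1/-1 votes per comment, which is hypothetical, and the real Community Notes algorithm uses matrix factorization rather than this naive pairwise version.

```python
from itertools import combinations

def disagreement(vote_history, u, v):
    """Fraction of co-voted comments where users u and v voted differently.

    vote_history: {user: {comment_id: +1 or -1}} (hypothetical format).
    """
    shared = set(vote_history[u]) & set(vote_history[v])
    if not shared:
        return 0.0
    return sum(vote_history[u][c] != vote_history[v][c] for c in shared) / len(shared)

def bridging_score(upvoters, vote_history):
    """Mean pairwise disagreement among a comment's upvoters.

    A high score means the comment was upvoted by people who usually
    disagree with each other -- the comments worth highlighting.
    """
    pairs = list(combinations(sorted(upvoters), 2))
    if not pairs:
        return 0.0
    return sum(disagreement(vote_history, u, v) for u, v in pairs) / len(pairs)
```

You'd then surface the comments with the highest bridging scores, perhaps above some minimum number of upvoters.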
Oops, somehow didn't see there was actually a market baked into your question
I'd also be interested in "Will there be a publicly revealed instance of a pause in either deployment or development, as a result of a model scoring High or Critical on a scorecard, by Date X?"
Made a Manifold market
Might make more later, and would welcome others to do the same! (I think one could ask more interesting questions than the one I asked above.)
Heads up, we support latex :)
Use Ctrl-4 to open the LaTeX prompt (or Cmd-4 if you're on a Mac). Open a centred LaTeX popup using Ctrl-M (or Cmd-M). If you've written some maths in normal writing and want to convert it, highlight the text and hit the LaTeX editor button, and it will turn straight into LaTeX.
https://www.lesswrong.com/posts/xWrihbjp2a46KBTDe/editor-mini-guide
I feel pretty frustrated at how rarely people actually bet or make quantitative predictions about existential risk from AI.
Without commenting on how often people do or don't bet, I think overall betting is great and I'd love to see more of it!
I'm also excited by how much of it I've seen since Manifold started gaining traction. So I'd like to give a shout out to LessWrong users who are active on Manifold, in particular on AI questions. Some I've seen are:
Good job everyone for betting on your beliefs :)
There are definitely more folks than this: feel free to mention others in the comments who you want to give kudos to (though please don't dox anyone whose name on either platform is pseudonymous and doesn't match the other).
LLM summaries aren't yet non-hallucinatory enough that we've felt comfortable putting them on the site, but we have run some internal experiments on this.
Yep. Will set myself a reminder for 6 months from now!
They get a list of topics I've written/commented on, but so far as I can see I don't have any way to see that list
Yeah, users can't currently see that list for themselves (unless of course you create a new account, upvote yourself, and then look at the matching page through that account!).
However, the SQL for this is actually open source, in the function getUserTopTags: https://github.com/ForumMagnum/ForumMagnum/blob/master/packages/lesswrong/server/repos/TagsRepo.ts
What we show is "The tags a user commented on in the last 3 years, sorted by comment count, and excluding a set of tags that I deemed as less interesting to show to other users, for example because they were too general (World Modeling, ...), too niche (Has Diagram, ...) or too political (Drama, LW Moderation, ...)."
(Sidenote, but you probably want to fix it: https://bristolaisafety.org/ appears to be down, as of the posting of this message)
I use Cursor, Copilot, sometimes GPT-4 in the chat, and also Hex.tech's built-in SQL shoggoth.
I would say the combination of all those helps a huge amount, and I think it has been key in allowing me to go from pre-junior to junior dev in the last few months. (That is, from not being able to make any site changes without painstaking handholding, to leading and building much of the Dialogue Matching feature and associated work. I also had a lot of help from teammates, but less in a "they need to carry things over the finish line for me" way, and more "I'm able to build features of this complexity, and they help out as collaborators".)
But PR review and advice from senior devs on the team has also been key, and much appreciated.
Yeah, that reminds me of this thread https://www.lesswrong.com/posts/P32AuYu9MqM2ejKKY/so-geez-there-s-a-lot-of-ai-content-these-days
In the poll most people (31) disagreed with the claim John is defending here, but I'm tagging the additional few (3) who agreed with it @Charlie Steiner @Oliver Sourbut @Thane Ruthenis
Interested to hear your guys' reasons, in addition to John's above!
One of my takeaways of how the negotiations went is that it seems sama is extremely concerned with securing access to lots of compute, and that the person who ultimately got their way was the person who sat on the compute.
The "sama running Microsoft" idea seems a bit magical to me. Surely the realpolitik update here should be: power lies in the hands of those with legal voting power, and those controlling the compute. Sama has neither of those things at Microsoft. If he can be fired by a board most people have never heard of, then for sure he can get fired by the CEO of Microsoft.
People seem to think he is somehow a linchpin of building AGI. Remind me... how many of OpenAI's key papers did he coauthor? Paul Graham says if you dropped him into an island of cannibals he would be king in 5 years. Seems plausible. Paul Graham did not say he would've figured out how to engineer a raft good enough to get him out of there. If there were any Manifold markets on "Sama is the linchpin to building AGI", I would short them for sure.
We already have strong suspicion from the open letter vote counts there's a personality cult around Sama at OpenAI (no democratic election ever ends with a vote of 97% in favor). It also makes sense people in the LessWrong sphere would view AGI as the central thing to the future of the world and on everyone's minds, and thus fall in the trap of also viewing Sama as the most important thing at Microsoft. (Question to ask yourself about such a belief: who does it benefit? And is that beneficiary also a powerful agent deliberately attempting to shape narratives to their own benefit?)
Satya Nadella might have a very different perspective than that, on what's important for Microsoft and who's running it.
It would be a promising move, to reduce existential risk, for Anthropic to take over what will remain of OpenAI and consolidate efforts into a single project.
EAs need to aggressively recruit and fund additional ambitious Sams, to ensure there's one to sacrifice for Samsgiving November 2024.
New leadership should shut down OpenAI.
If there was actually a spooky capabilities advance that convinced the board that drastic action was needed, then the board's actions were on net justified, regardless of what other dynamics were at play and whether cooperative principles were followed.
Open-ended: A dialogue between an OpenAI employee who signed the open letter, and someone outside opposed to the open letter, about their reasoning and the options.
(Up/down-vote if you're interested in reading discussion of this. React paperclip if you have an opinion and would be up for dialoguing)
If the board did not abide by cooperative principles in the firing nor acted on substantial evidence to warrant the firing in line with the charter, and nonetheless were largely EA motivated, then EA should be disavowed and dismantled.
The events of the OpenAI board CEO-ousting on net reduced existential risk from AGI.
Open-ended: If >50% of employees end up staying at OpenAI: how, if at all, should OpenAI change its structure and direction going forwards?
(Up/down-vote if you're interested in reading discussion of this. React paperclip if you have an opinion and would be up for discussing)
Open-ended: If >90% of employees leave OpenAI: what plan should Emmett Shear set for OpenAI going forwards?
(Up/down-vote if you're interested in reading discussion of this. React paperclip if you have an opinion and would be up for discussing)
It is important that the board release another public statement explaining their actions, and providing any key pieces of evidence.
Yeah I'm gonna ship a fix to that now. No more monologues!
(If others want this too, upvote @faul_sname's comment as a vote! It would be easy to build, most of my uncertainty is in how it would change the experience)
Those are some interesting papers, thanks for linking.
In the case at hand, I do disagree with your conclusion though.
In this situation, the most a user could find out is who checked them in dialogues. They wouldn't be able to find any data about checks not concerning themselves.
If they happened to be a capable enough dev and were willing to go through the schleps to obtain that information, then, well... we're a small team and the world is on fire, and I don't think we should really be prioritising making Dialogue Matching robust to this kind of adversarial cyber threat for information of comparable scope and sensitivity! Folks with those resources could probably uncover all kinds of private vote data already, if they wanted to.
On data privacy
Here's some quick notes on how I think of LessWrong user data.
Any data that's already public -- reacts, tags, comments, etc -- is fair game. It just seems nice to do some data science and help folks uncover interesting patterns here.
At the other end of the spectrum, the team and I generally never look at users' up and downvotes, except in cases where there's strong enough suspicion of malicious voting behavior (like targeted mass downvoting).
Then there's stuff in the middle. Like, what if we tell a user "you and this user frequently upvote each other"? That particular example currently feels like it reveals too much private data. As another example, the other day a teammate and I discussed whether, on the matchmaking page, we could show people recently active users who already checked you, to make it more likely you'd find a match. We tentatively postulated it would be fine to do this as long as seeing a name on your match page gave no more than like a 5:1 update about those people having checked you. We sketched out some algorithms to implement this, that would also be stable under repeated refreshing and similar. (We haven't implemented the algorithm nor the feature yet.)
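To illustrate the kind of algorithm we sketched (purely hypothetical, not anything the site does: the names and parameters are made up), the deterministic hash makes the shown list stable under refreshes, and the ratio of inclusion probabilities caps the likelihood update at 5:1:

```python
import hashlib

def stable_prob(viewer_id, candidate_id):
    """Deterministic pseudo-random number in [0, 1) for a (viewer,
    candidate) pair, so the shown list doesn't change on page refresh."""
    h = hashlib.sha256(f"{viewer_id}:{candidate_id}".encode()).hexdigest()
    return int(h[:8], 16) / 0xFFFFFFFF

def suggestions(viewer_id, checkers, others, p_checker=0.5, max_ratio=5.0):
    """Pick users to show on a viewer's match page such that appearing
    on the list is at most a max_ratio:1 likelihood update that the
    person checked the viewer.

    Checkers appear with probability p_checker; non-checkers with
    probability p_checker / max_ratio, so the likelihood ratio
    P(shown | checked you) / P(shown | didn't) never exceeds max_ratio.
    """
    p_other = p_checker / max_ratio
    shown = [u for u in checkers if stable_prob(viewer_id, u) < p_checker]
    shown += [u for u in others if stable_prob(viewer_id, u) < p_other]
    return sorted(shown)
```

The "bits leaked" framing falls out directly: a 5:1 update is log2(5) ≈ 2.3 bits per name shown.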
So my general take on features "in the middle" is for now to treat them on a case by case basis, with some principles like "try hard to avoid revealing anything that's not already public, and if doing so, try to leave plausible deniability bounded by some number of leaked bits, only reveal metadata or aggregate data, reveal it only to one other or a smaller set of users, think about whether this is actually a piece of info that seems high or low stakes, and see if you can get away with just using data from people who opted in to revealing it".
I can't quite tell how that's different from embeddedness. (Also if you have links to other places it's explained feel free to share them.)
bounded, embedded, enactive, nested.
I know about boundedness and embeddedness, and I guess nestedness is about hierarchical agents.
But what's enactive?
Space flight doesn't involve a 100 percent chance of physical death
I think historically folks have gone to war or on other kinds of missions that had death rates of, like, at least 50%. And folks, I dunno, climb Mount Everest, or figured out how to fly planes before they could figure out how to make them safe.
Some of them were for sure fanatics or lunatics. But I also think there are just great, sane, and in many ways whole people, who care about things greater than their own personal life and death, and are psychologically constituted to be willing to pursue those greater things.
Hm, here's a test case:
GPT-4 can't solve IMO problems. Now take an IMO gold medalist about to walk into their exam, and upload them at that state into an Em without synaptic plasticity. Would the resulting upload still be able to solve the exam at a similar level to the full human?
I don't have a strong belief, but my intuition is that they would. I recall once chatting to @Neel Nanda about how he solved problems (as he is in fact an IMO gold winner), and recall him describing something that to me sounded like "introspecting really hard and having the answers just suddenly 'appear'..." (though hopefully he can correct that butchered impression)
Do you think such a student Em would or would not perform similarly well in the exam?
I have an important appointment this weekend that will take up most of my time, and hope to come back to this after that, but wanted to quickly note:
but definitely are not back propagation.
Why?
Last time I looked into this, 6 years ago, it seemed like an open question, and it could plausibly be backprop or at least close enough: https://www.lesswrong.com/posts/QWyYcjrXASQuRHqC5/brains-and-backprop-a-key-timeline-crux
3 years ago, Daniel Kokotajlo shared some further updates in that direction: https://www.lesswrong.com/posts/QWyYcjrXASQuRHqC5/brains-and-backprop-a-key-timeline-crux?commentId=RvZAPmy6KStmzidPF
Separately, I'm kind of awed by the idea of an "uploadonaut": the best and brightest of this young civilisation, undergoing extensive mental and research training to have their minds able to deal with what they might experience post upload, and then courageously setting out on a dangerous mission of crucial importance for humanity.
(I tried generating some Dall-E 1960's style NASA recruitment posters for this, but they didn't come out great. Might try more later)
Noting that I gave this a weak downvote, as I found this comment to state many strong claims without correspondingly strong (or sometimes any) arguments. I am still interested in the reasons you believe these things, though (for example, a Fermi estimate of inference cost at runtime).
I don't think you're going to get a lot of volunteers for destructive uploading (or actually even for nondestructive uploading). Especially not if the upload is going to be run with limited fidelity. Anybody who does volunteer is probably deeply atypical and potentially a dangerous fanatic.
Seems falsified by the existence of astronauts?
https://manifold.markets/ZviMowshowitz/will-google-have-the-best-llm-by-eo?r=SmFjb2JMYWdlcnJvcw
Reference class: I'm old enough to remember the founding of the Partnership on AI. My sense from back in the day was that some (innocently misguided) folks wanted in their hearts for it to be an alignment collaboration vehicle. But I think it's decayed into some kind of epiphenomenal social justice thingy. (And for some reason they have 30 staff. I wonder what they all do all day.)
I hope Frontier Model Forum can be something better, but my hopes ain't my betting odds.