I suggest making bets for insignificant amounts of money, like a dollar, or half a dollar, if the money causes you to be nervous, as the information about your odds is still really useful, especially for yourself (e.g. to update on if you're wrong).
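To illustrate why even tiny-stakes bets carry information: the odds you're willing to accept imply a probability estimate, which you can later compare against outcomes. A minimal sketch (the function name and numbers are my own, purely illustrative):

```python
def implied_probability(stake: float, payout: float) -> float:
    """Probability at which betting `stake` to win `payout` is break-even."""
    return stake / (stake + payout)

# Betting your $1 against a friend's $3 implies you think
# the event is at least 25% likely.
p = implied_probability(1.0, 3.0)
print(p)  # 0.25
```

Even at a dollar of stakes, agreeing to those odds pins down a number you can be held to, and update on, later.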
As a site admin, I don't think I have any concerns on the legal front. I obviously feel philosophically good about sci-hub; I don't know the others so well (though I'd be a bit surprised if one of them was bad enough to remove).
I find the "A / B" to be fairly ugly in tag naming, and think that even "A (and B)" is more attractive.
My guess is that we should just go with Bayesianism, because it feels more general? Like, if it's a wiki page later, the page on Bayesianism would naturally have a section on Bayes' Theorem that explains it.
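For reference, Bayes' Theorem itself is the narrower piece: a single update rule, P(H|E) = P(E|H)·P(H) / P(E). A hypothetical worked example (all numbers invented for illustration):

```python
def bayes_posterior(prior: float, likelihood: float, marginal: float) -> float:
    """P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior / marginal

# Invented numbers: prior P(H) = 0.1, P(E|H) = 0.8, P(E|not H) = 0.2
prior = 0.1
p_e = 0.8 * prior + 0.2 * (1 - prior)  # marginal P(E) = 0.26
posterior = bayes_posterior(prior, 0.8, p_e)
print(round(posterior, 4))  # 0.3077
```

Bayesianism as a broader page would cover the epistemology built around repeated application of this rule, which is why it's the more natural parent topic.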
I think I understand why you think the term is misleading, though I still think it's helpfully concrete and not inaccurate. I have a bunch of work to get back to, not planning to follow up on this more right now. Welcome to ping me via PM if you'd like me to follow up another day.
Can you say more about the distinction between enemy AIs and misaligned mesa-optimizers? I feel like I don't have a concrete grasp of what the difference would look like in, say, an AI system in charge of a company.
I guess it's an odd boundary. Insofar as it's an accident, the accident is "we created agents that had different goals and stole all of our resources". In the world Paul describes, there'll be lots of powerful agents around, and we'll be cooperating and working with them (and plausibly talking with them via GPT-style tech), and at some point the agents we've been cooperating with will have lots of power and start defecting on us.
I called it that because in Paul's post, he gave examples of agents that can do damage to us – "organisms, corrupt bureaucrats, companies obsessed with growth" – and then argued that ML systems will be added to that list, things like "an automated corporation may just take the money and run". This is a world where we have built other agents who work for us and help us, and then they suddenly take adversarial action (or alternatively the adversarial action happens gradually and off-screen and we only notice when it's too late). The agency feels like a core part of it.
So I feel like accident and adversarial action are sort of the same thing in this case.
Then there are more interesting proposals that require being able to fully inspect the cognition of an ML system, have it be fully introspectively clear, and then use it as a building block to build stronger, competitive, corrigible, and aligned ML systems. I think this is an accurate description of Iterated Amplification + Debate, as Zhu says in section 1.1.4 of his FAQ, and I think something very similar to this is what Chris Olah is excited about re: microscopes – reverse-engineering the entire codebase/cognition of an ML system.
I don't deny that there are a lot of substantive and fascinating details to many of these proposals, and that if this is possible we might indeed solve the alignment problem, but I think that is a large step which, from some initial perspectives, sounds kind of magical. And don't forget that at the same time we have to be able to combine it in a way that is competitive and corrigible and aligned.
I feel like it's one reasonable position to call such proposals non-starters until a possibility proof is shown, and instead work on basic theory that will eventually be able to give more plausible basic building blocks for designing an intelligent system. I feel confident that certain sorts of basic theories are definitely there to be discovered, that there are strong intuitions about where to look, they haven't been worked on much, and that there is low-hanging fruit to be plucked. I think Jessica Taylor wrote about a similar intuition about why she moved away from ML to do basic theory work.
Pretty sure OpenPhil and OpenAI currently try to fund plans that claim to look like this (e.g. all the ones orthonormal linked in the OP), though I agree that they could try increasing the financial reward by 100x (e.g. a prize) and see what that inspires.
As I understand it, the high-level summary (naturally Eliezer can correct me) is that (a) corrigible behaviour is very unnatural and hard to find (most nearby things in mindspace are not in equilibrium and will move away from corrigibility as they reflect / improve), and (b) using complicated recursive setups with gradient descent to do supervised learning is incredibly chaotic and hard to manage, and shouldn't be counted on working without major testing and delays (i.e. could not be competitive).
There's also some more subtle and implicit disagreement that's not been quite worked out but feeds into the above, where a lot of the ML-focused alignment strategies contain this idea that we will be able to expose an ML system's thought processes to humans in a transparent and inspectable way, check whether it has corrigibility, alignment, and intelligence, then add them up together like building blocks. My read is that Eliezer finds this to be an incredible claim that would be a truly dramatic update if there were a workable proposal for it, whereas many of the proposals above take it more as a starting assumption that this is feasible, move on from there to use it in a recursive setup, then alter the details of the recursive setup in order to patch any subsequent problems.
For more hashed out details on that subtle disagreement, see the response post linked above which has several concrete examples.
I think I want John to feel able to have this kind of conversation when it feels fruitful to him, and not feel obligated to do so otherwise. I expect this is the case, but just wanted to make it common knowledge.
Note that I have no idea what math to do here. The actual thing I'd do is try to figure out the reference class of 'things that could be major disasters', look how well the situation around them was handled (carefully, coordinated, sloppily, clumsily, etc) and then after getting close to the territory in that way, reflect loads on anthropics and wtf to update about it. I don't know how to really do math on either.
(The LW team hired a professional editor to make line edits to the above post. Buck went through and accepted/rejected the edits, and we updated the post before curating, so that the many people on the curated email list got the edited version. This was an experiment, we may do more of this.)
Overall I continue to think it's very healthy to write down your major updates and mistakes, and I'm curating this for similar reasons I curated Buck's last piece. It feels to me like records of these updates allow one to do something like actually update on the highest level. Relatedly, something about the insights in this piece feels very 'hard-earned', in a way where I think a person can only write this kind of post after a lot of time and effort and cycles of thought have passed. (Ray's Sunset at Noon feels similar to me, where I think that Ray will only be able to write that specific kind of post every 5 years or so, and shouldn't try to write another one much faster.)
And I learned a ton from this post. Everything is explained very simply and concisely, no section is written in a way that signalled to me "You should be afraid of math and experts" which happens to me often for stuff about econ, and I think I basically understood all your updates. I mean, I need to read Friedman's essay in full and think on it, and I also need to actually use the insight about there being many dimensions on which markets compete other than price (e.g. quality, employee work standards, etc) to feel like it's truly-part-of-me, but they were at least very effective pointers to ideas that I want to use when thinking about these questions.
I also really like your predictions at the end and meta-level updates.
Something about this feels compelling... I need to do some empiricism to understand what my counterfactuals are. By the time a real human gets to the 5-and-10 problem they've done enough, but if you just appear in a universe and it's your first experience, I'm not too surprised you need to actually check these fundamentals.
(I’m not sure if this actually matches up philosophically with the logical inductors.)
I guess I feel like we're at an event for the physics institute and someone's being nerdy/awkward in the corner, and there's a question of whether or not we should let that person be or whether we should publicly tell them off / kick them out. I feel like the best people there are a bit nerdy and overly analytical, and that's fine, and deciding to publicly tell them off is over the top and will make all the physicists more uptight and self-aware.
To pick a very concrete problem we've worked on: the AI alignment problem is totally taken seriously by very important people who are also aware that LW is weird, but Eliezer goes on the Sam Harris podcast, Bostrom is invited by the UK government to advise, and so on, and Karnofsky's got a billion dollars and is focusing in large part on the AI problem. We're not being defined by this odd stuff, and I think we don't need to feel like we are. I expect as we find similar concrete problems or proposals, we'll continue to be taken very seriously and have major success.
Curated. I resonate with many of the examples in this, and have made a lot of similar mistakes (including before I met the rationalist and EA communities). This essay describes those thinking patterns and their pathologies pretty starkly, and helps me look at them directly. I expect to reference this post in future conversations when people I know are making big decisions, especially where I feel they're not understanding how much they're sacrificing for this one decision, with so many decisions still ahead (i.e. your framing about policies vs one-shot).
One hesitation I have is that, while I strongly inside-view connect to this post, perhaps I am typical-minding on how much other people share these thinking patterns, and people might find it a bit uncomfortable to read. But I do think a lot of people I respect have thoughts like this, so I expect it will strongly help a lot of people to read it. (Also it's a well-written essay and quite readable.)
Lol, I wasn't aware it used to talk about a 'mating plan' everywhere, which I think is amusing and I agree sounds kind of socially oblivious.
I really don't think we should optimise around preventing people from writing us off over weak, negative low-level associations. I think the way that you attract good people is by strong wins, not by ensuring they never hear any bad associations. Nassim Taleb is an example I go to here, where the majority of times I hear about him I think he's being obnoxious or aggressive, and often just disagree with what he says, but I don't care too much about reading that because occasionally he's saying something important that few others are.
Elon Musk is another example, where the majority of coverage I see of him is negative, and sometimes he writes kinda dumb tweets, but he gives me hope for humanity and I don't care about the rest of the stuff. Had I seen the news coverage first, I'd still have been mindblown by seeing the rockets land and changed my attitude towards him. I could keep going with examples... new friends occasionally come to me saying they read a review of HPMOR saying Harry's rude and obnoxious, and I respond "you need to learn that's not the most important aspect of a person's character". Harry is determined and takes responsibility and is curious and is one of the few people who has everyone's back in that book, so I think you should definitely read and learn from him. And then the friend is like "Huh, wow, okay, I think I'll read it then. That was very high and specific praise."
A lot of this comes down to the graphs in Lukeprog's post on romance (lol, another dating post, I'm so sorry).
I think that LessWrong is home to some of the most honest and truth-seeking convo on the internet. We have amazing thinkers who come here like Zvi and Paul and Anna and Scott and more and the people who care about the conversations they can have will come here even if we have weird associations and some people hate us and call us names.
(Sarah also wrote the forces of blandness post that I think is great and I think about a lot in this context.)
I guess I didn't address the specific example of your friend. (Btw I am also a person who was heavily involved with EA at Oxford, I ran the 80k student group while I was there and an EAGx!) I'm sorry your friend decided to write-off LessWrong because they heard it was sexist. I know you think that's a massive cost that we're paying in terms of thousands of good people avoiding us for that reason too.
I think that negative low-level associations really matter if you're trying to be a mass movement and scale, like a political movement. Republicans/Democrats kind of primarily work to manage whether the default association is positive or negative, which is why they spend so much time on image-management. I don't think LW should grow 100x users in the next 4 years. That would be terrible for our mission of refining the art of human rationality and our culture. I think that the strong positive hits are the most important, as I said already.
Suppose you personally get really valuable insights from LW, and that people's writing here helps you understand yourself as a person and become more virtuous in your action. If you tell your EA friend that LessWrong was a key causal factor in you levelling up as a person, and they reply "well that's net bad because I once heard they're sexist" I'm not that impressed by them. And I hope that a self-identified EA would see the epistemic and personal value there as primary rather than the image-management thing as primary. And I think that if we all think everybody knows everyone else thinks the image-management is primary... then I think it's healthy to take the step of saying out loud "No, actually, the actual intellectual progress on rationality is more important" and following through.
oops sry i write a lot when i don't have time to make it short
I haven't read the OP, am not that interested in it, though Geoffrey Miller is quite thoughtful.
I think that the main things building up what LW is about right now are the core tags, the tagging page, and the upcoming LW books based on the LW review vote. If you look at the core tags, there's nothing about dating there ("AI" and "World Modeling" etc). If you look at the vote, it's about epistemology and coordination and AI, not dating. The OP also hasn't got much karma, so I'm a bit confused that you're arguing this shouldn't be discussed on LW, and weak-downvoted this comment. (If you want to argue that a dating post has too much attention, maybe pick something that was better received like Jacobian's recent piece, which I think embodies a lot of the LW spirit and is quite healthy.)
I'm not much worried about dating posts like this being what we're known for. Given that it's a very small part of the site, if it still became one of the 'attack vectors', I'm pretty pro just fighting those fights, rather than giving in and letting people on the internet who use the representativeness heuristic to attack people decide what we get to talk about. (Once you open yourself to giving in on those fights, they just start popping up everywhere, and then like 50% of your cognition is controlled by whether or not you're stepping over those lines.)
As a very quick answer, I'd basically be interested to hear about any part of it that you're thinking a lot about, and for you to explain at least two different perspectives on the question.
It also helps if we can get the basic picture in 5 mins, so that in the Q&A and convo after we can provide thoughts that would be actually useful to you, where even if we're wrong you can explain why to us. Basically, the idea is for you to optimise for getting to explain things to us in the way you find most useful.
Major props for writing down your understanding in such a readable, clear, and relatively short way. I expect this will be a benefit to you in 6, 12, 18 months, when you look back and see how your big picture thinking has changed.
I also think it's substantially lessened certain things that felt aesthetically pleasing. That said I had permanently changed my habits to instead use our staging server (lessestwrong.com) because for me the difference in eye-strain was so dramatic. I expect we'll find ways to build a strong aesthetic with this theme, so I'm not too worried about the local change on that dimension being fairly negative.
This is tricky because not letting someone join your conversation is often seen as a sign that they're unwelcome/you-don't-like-them, and I wish this could change as an overall rationalist-culture-norm specifically so that it _wouldn't_ send that signal. Instead it's just understood that "having a high context conversation" is an activity you can't really interrupt once it's started.
I want to express a feeling I had reading this paragraph, though not necessarily because I think we have a disagreement about the concrete policy recommendation. (You’re right, this norm seems good, and I’ve been actively trying to follow it myself at recent events.)
My feeling, reading this paragraph, is that it's making a naive mistake: assuming that the reason a certain behaviour signals a certain underlying fact about social reality is arbitrary, and could simply be changed.
Whereas in actuality, there’s a reason that this behaviour normally has this signal. There’s something different in this community than in many/most other communities, and I think for this post to change people’s feelings it would be better to name the true fact about other social situations that people are accurately modelling, and then make a simple argument about what’s different here.
I can't answer your question properly, in part because I am not BERI. I'll just share some of my thoughts that seems relevant for this question:
I expect everything BERI supports and funds to always be justified in terms of x-risk. It will try to support all the parts of EA that are focused on x-risk, and not the rest. For example, their grant to EA Sweden is described as "Effective Altruism Sweden will support Markus Stoor’s project to coordinate two follow-up lunch-to-lunch meetings in Sweden for x-risk-focused individuals."
I think it would be correct to classify it entirely as an x-risk org and not as an EA org. I don't think it does any EA-style analysis of what it should work on that is not captured under x-risk analysis, and I think that people working to do things like, say, fight factory farming, should never expect support from BERI (via the direct work BERI does).
I think it will have indirect effects on other EA work. For example BERI supports FHI and this will give FHI a lot more freedom to take actions in the world, and FHI does some support of other areas of EA (e.g. Owen Cotton-Barratt advises the CEA, and probably that trades off against his management time on the RSP programme). I expect BERI does not count this in their calculations on whether to help out with some work, but I'm not confident.
I would call it an x-risk org and not an EA-aligned org in its work, though I expect its staff all care about EA more broadly.
Pardon for being so challenging, you know I’m always happy to talk with you and answer your questions Evan :) Am just a bit irritated, and let that out here.
I do think that “identity” and “brand” mustn’t become decoupled from what actually gets done - if you want to talk meaningfully about ‘EA’ and what’s true about it, it shouldn’t all be level 3/4 simulacra.
Identity without substance or action is meaningless, and sort of not something you get to decide for yourself. If you decide to identify as 'an EA' but this causes no changes in your career or your donations, has the average EA donation suddenly gone down? Has EA actually grown? It's good to be clear on the object level and whether the proxy actually measures anything, and I'm not sure I should call that person an 'EA' despite their speech acts to the contrary.