Posts

How do AI timelines affect how you live your life? 2022-07-11T13:54:12.961Z
Quadratic Reciprocity's Shortform 2022-07-10T17:47:37.113Z

Comments

Comment by Quadratic Reciprocity on AI Regulation is Unsafe · 2024-04-23T01:40:09.787Z · LW · GW

From the comment thread:


I'm not a fan of *generic* regulation-boosting. Like, if I just had a megaphone to shout to the world, "More regulation of AI!" I would not use it. I want to do more targeted advocacy of regulation that I think is more likely to be good and less likely to result in regulatory-capture

What are specific regulations / existing proposals that you think are likely to be good? When people are protesting to pause AI, what do you want them to be speaking into a megaphone (if you think those kinds of protests could be helpful at all right now)? 

Comment by Quadratic Reciprocity on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-01T14:22:45.540Z · LW · GW

This is so much fun! I wish I could download them!

Comment by Quadratic Reciprocity on Quadratic Reciprocity's Shortform · 2024-03-27T16:29:55.577Z · LW · GW

I thought I didn’t get angry much in response to people making specific claims. I did some introspection about times in the recent past when I got angry, defensive, or withdrew from a conversation in response to claims that the other person made. 

I think these are the mechanisms that made me feel that way:

  • They were very confident about their claim. Partly I felt annoyance because it didn’t seem like anything would change their mind, and partly because it felt like they didn’t have enough status to make such confident claims. This is more about confidence in body language and tone than about their stated confidence in their own claims, though both matter. 
  • Credentialism: being unwilling to explain things and taking it as a given that they were correct because I didn’t have the specific experiences or credentials they had, without mentioning what specifically about gaining that experience would help me understand their argument.
  • Not letting me speak and interrupting quickly to take down the fuzzy strawman version of what I meant rather than letting me take my time to explain my argument.
  • Morality: I felt like one of my cherished values was being threatened. 
  • The other person was relatively smart and powerful, at least within the specific situation. If they were dumb or not powerful, I would have just found the conversation amusing instead. 
  • The other person assumed I was dumb or naive, perhaps because they had met other people with the same position as me and those people came across as not knowledgeable. 
  • The other person getting worked up, for example, raising their voice or showing other signs of being irritated, offended, or angry while acting as if I was the emotional/offended one. This one particularly stings because of gender stereotypes. I think I’m more calm and reasonable and less easily offended than most people. I’ve had a few conversations with men where it felt like they were just really bad at noticing when they were getting angry or emotional themselves and kept pointing out that I was being emotional despite me remaining pretty calm (and perhaps even a little indifferent to the actual content of the conversation before the conversation moved to them being annoyed at me for being emotional). 
  • The other person’s thinking being very black-and-white: seeing things in terms of a very clear good and evil and not being open to nuance. A similar mechanism to the first point. 

Some examples of claims that recently triggered me. They’re not so important themselves so I’ll just point at the rough thing rather than list out actual claims. 

  • AI killing all humans would be good because thermodynamics god/laws of physics good
  • Animals feel pain but this doesn’t mean we should care about them
  • We are quite far from getting AGI
  • Women as a whole are less rational than men are
  • Palestine/Israel stuff
     

Doing the above exercise was useful because it helped me generate ideas for things to try if I’m in situations like that in the future. But it feels like the most important thing is just to get better at noticing what I’m feeling in the conversation and, if I’m feeling bad and uncomfortable, to think about whether the conversation is useful to me at all and, if so, for what reason. And if not, to make a conscious decision to leave the conversation.

Reasons the conversation could be useful to me:

  • I change their mind
  • I figure out what is true
  • I get a greater understanding of why they believe what they believe
  • Enjoyment of the social interaction itself
  • I want to impress the other person with my intelligence or knowledge


Things to try will differ depending on why I feel like having the conversation. 

Comment by Quadratic Reciprocity on gwern's Shortform · 2024-03-18T14:36:41.368Z · LW · GW

Advice of this specific form has been helpful for me in the past. Sometimes I don't notice immediately when the actions I'm taking are not ones I would endorse after a bit of thinking (particularly when they're fun and good for me in the short term but bad for others or for me longer-term). This is also why having rules to follow for myself is helpful (eg: never lying or breaking promises).

Comment by Quadratic Reciprocity on Dating Roundup #2: If At First You Don’t Succeed · 2024-01-02T21:23:31.296Z · LW · GW

women more often these days choose not to make this easy, ramping up the fear and cost of rejection by choosing to deliberately inflict social or emotional costs as part of the rejection

I'm curious about how common this is, and what sort of social or emotional costs are being referred to. 

Sure feels like it would be a tiny minority of women doing it but maybe I'm underestimating how often men experience something like this. 

Comment by Quadratic Reciprocity on Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense · 2023-11-27T09:54:33.760Z · LW · GW

My goals for money, social status, and even how much I care about my family don't seem all that stable and have changed a bunch over time. They seem to be arising from some deeper combination of desires to be accepted, to have security, to feel good about myself, to avoid effortful work etc. interacting with my environment. Yet I wouldn't think of myself as primarily pursuing those deeper desires, and during various periods would have self-modified if given the option to more aggressively pursue the goals that I (the "I" that was steering things) thought I cared about (like doing really well at a specific skill, which turned out to be a fleeting goal with time).   

Comment by Quadratic Reciprocity on Vote on Interesting Disagreements · 2023-11-08T01:47:56.282Z · LW · GW

Current AI safety university groups are overall a good idea and helpful, in expectation, for reducing AI existential risk 

Comment by Quadratic Reciprocity on Vote on Interesting Disagreements · 2023-11-08T01:44:34.627Z · LW · GW

Things will basically be fine regarding job loss and unemployment due to AI in the next several years and those worries are overstated 

Comment by Quadratic Reciprocity on Vote on Interesting Disagreements · 2023-11-08T00:40:46.367Z · LW · GW

It is very unlikely AI causes an existential catastrophe (Bostrom or Ord definition) but doesn't result in human extinction. (That is, non-extinction AI x-risk scenarios are unlikely)

Comment by Quadratic Reciprocity on Vote on Interesting Disagreements · 2023-11-08T00:34:36.922Z · LW · GW

EAs and rationalists should strongly consider having lots more children than they currently do

Comment by Quadratic Reciprocity on The other side of the tidal wave · 2023-11-03T19:06:48.906Z · LW · GW

In my head, I've sort of just been simplifying to two ways the future could go: human extinction within a relatively short time period after powerful AI is developed or a pretty good utopian world. The non-extinction outcomes are not ones I worry about at the moment, though I'm very curious about how things will play out. I'm very excited about the future conditional on us figuring out how to align AI. 

For people who think similarly to Katja, I'm curious what kind of story you're imagining that leads to that. Does the story involve authoritarianism? (Though even then, a future in which the leader of one of the current leading labs has total control and a superintelligent AI that does whatever they want is probably much, much more fun and exciting for me than the present - and I like my present life!) Does it involve us being presented with only pretty meh options for how to build the future because we can't agree on something that wholly satisfies everyone? Does it involve multi-agent scenarios where the AIs, or the humans controlling the AIs, are bad at bargaining, so we end up with meh futures that no one really wants? I find a bunch of these stories pretty unlikely once I think about them, but maybe I'm missing something important. 

This is also something I'd be excited to have a Dialogue with someone about. Maybe just fleshing out what kind of future you're imagining and how you're imagining we end up in that situation. 

Comment by Quadratic Reciprocity on Announcing Dialogues · 2023-11-02T04:02:02.329Z · LW · GW

Topics I would be excited to have a dialogue about [will add to this list as I think of more]:

  • I want to talk to someone who thinks p(human extinction | superhuman AGI developed in next 50 years) < 50% and understand why they think that 
  • I want to talk to someone who thinks the probability of existential risk from AI is much higher than the probability of human extinction due to AI (ie most x-risk from AI isn't scenarios where all humans end up dead soon after)
  • I want to talk to someone who has thoughts on university AI safety groups (are they harmful or helpful?)
  • I want to talk to someone who has pretty long AI timelines (median >= 50 years until AGI)
  • I want to have a conversation with someone who has strong intuitions about what counts as high/low integrity behaviour. Growing up I sort of got used to lying to adults and bureaucracies and then had to make a conscious effort to adopt some rules to be more honest. I think I would find it interesting to talk to someone who has relevant experiences or intuitions about how minor instances of lying can be pretty harmful. 
  • If you have a rationality skill that you think can be taught over text, I would be excited to try learning it. 

I mostly expect to ask questions and point out where and why I'm confused or disagree with your points rather than make novel arguments myself, though am open to different formats that make it easier/more convenient/more useful for the other person to have a dialogue with me. 

Comment by Quadratic Reciprocity on Quadratic Reciprocity's Shortform · 2023-10-24T22:52:41.813Z · LW · GW

I attended an AI pause protest recently and thought I’d write up what my experience was like for people considering going to future ones. 

I hadn’t been to a protest ever before and didn’t know what to expect. I will probably attend more in the future.

Some things that happened:

  • There were about 20ish people protesting. I arrived a bit after the protest had begun and it was very easy and quick to get oriented. It wasn’t awkward at all (and I’m normally pretty socially anxious and awkward). The organisers had flyers printed out to give away and there were some extra signs I could hold up.  
  • I held up a sign for some of the protest and tried handing out flyers the rest of the time. I told people who passed by that we were talking about the danger from AI and asked if they’d like a flyer. Most of them declined but a substantial minority accepted the flyer. 
  • I got the sense that a lot of people who picked up a flyer weren’t just doing it to be polite. For example, I had multiple people walking by mention to me that they agreed with the protest. A person in a group of friends who walked by looked at the flyer and mentioned to their friends that they thought it was cool someone was talking about this.
  • There were also people who got flyers who misunderstood or didn’t really care for what we were talking about. For example, a mother pointed at the flyer and told her child “see, this is why you should spend less time on your phone.”
  • I think giving out the flyers was a good thing overall. Some people seemed genuinely interested. Others, even those who rejected it, were pretty polite. It felt like a wholesome experience. If I had planned more for the protest, I think I would have liked to print my own flyers. I also considered adding contact details to the flyers in case people wanted to talk about the content. It would have been interesting to get a better sense of what people actually thought.
  • During the protest, a person was using a megaphone to talk about AI risk and there were chants and a bit of singing at the end. I really liked the bit at the end; it felt a bit emotional for me in a good way, and I gave away a large fraction of the flyers near the end when more people stopped by to see what was going on. 
  • I overheard some people talk about wanting to debate us. I was sad I didn’t get the chance to properly talk to them (plausibly I could have started a conversation while they were waiting for the pedestrian crossing lights to turn green). I think at a future protest, I would like to have a “debate me” or “ask me questions” sign to be able to talk to people in more depth rather than just superficially. 
  • It’s hard to give people a pitch for AI risk in a minute.
  • I feel more positive about AI pause advocacy after the protest, though I do feel uneasy because of not having total control of the pause AI website and the flyers. It still feels roughly close to my views though.
  • I liked that there were a variety of signs at the protest, representing a wider spectrum of views than just the most doomy ones. Something about there being multiple people there with whom I would probably disagree a lot made it feel nicer. 
  • Lots more people are worried about job loss than extinction and want to hear about that. The economist in me will not stop giving them an optimistic picture of AI and employment before telling them about extinction. This is hard to do when you only have a couple of minutes but it feels good being honest about my actual views. 

Things I wish I’d known in advance:

  • It’s pretty fun talking to strangers! A person who was there briefly asked about AI risk, I suggested podcast episodes to him, and he invited me to a Halloween party. It was cool!
  • I did have some control over when I was photographed and could choose to not be in photos that might be on Twitter if I didn’t feel comfortable with that yet.
  • I could make my own signs or flyers that represented my views accurately (though it’s still good to have the signs not have many words).

Comment by Quadratic Reciprocity on Buck's Shortform · 2023-10-18T00:18:29.729Z · LW · GW

Are there specific non-obvious prompts or custom instructions you use for this that you've found helpful? 

Comment by Quadratic Reciprocity on Is there a hard copy of the sequences available anywhere? · 2023-09-11T23:09:22.457Z · LW · GW

There are physical paperback copies of the first two books in Rationality A-Z: Map and Territory and How to Actually Change Your Mind. They show up on Amazon for me. 

Comment by Quadratic Reciprocity on LTFF and EAIF are unusually funding-constrained right now · 2023-09-11T22:20:22.230Z · LW · GW

E.g. I know of people who are interviewing for Anthropic capability teams because idk man, they just want a safety-adjacent job with a minimal amount of security, and it's what's available

That feels concerning. Are there any obvious things that would help with this situation, eg: better career planning and reflection resources for people in this situation, AI safety folks being more clear about what they see as the value/disvalue of working in those types of capability roles? 

Seems weird for someone to explicitly want a "safety-adjacent" job unless there are weird social dynamics encouraging people to do that even when there isn't positive impact to be had from such a job. 

Comment by Quadratic Reciprocity on Nobody’s on the ball on AGI alignment · 2023-08-23T13:04:29.725Z · LW · GW

Most people still have the Bostromiam “paperclipping” analogy for AI risk in their head. In this story, we give the AI some utility function, and the problem is that the AI will naively optimize the utility function (in the Bostromiam example, a company wanting to make more paperclips results in an AI turning the entire world into a paperclip factory).

That is how Bostrom brought up the paperclipping example in Superintelligence, but my impression was that the paperclipping example as originally conceived by Eliezer, prior to the Superintelligence book, was NOT about giving an AI a utility function that it then naively optimises. Text from Arbital's page on paperclip:

The popular press has sometimes distorted the notion of a paperclip maximizer into a story about an AI running a paperclip factory that takes over the universe. (Needless to say, the kind of AI used in a paperclip-manufacturing facility is unlikely to be a frontier research AI.) The concept of a 'paperclip' is not that it's an explicit goal somebody foolishly gave an AI, or even a goal comprehensible in human terms at all. To imagine a central example of a supposed paperclip maximizer, imagine a research-level AI that did not stably preserve what its makers thought was supposed to be its utility function, or an AI with a poorly specified value learning rule, etcetera; such that the configuration of matter that actually happened to max out the AI's utility function looks like a tiny string of atoms in the shape of a paperclip.

That makes your section talking about "Bostrom/Eliezer analogies" seem a bit odd, since Eliezer, in particular, had been concerned very early on about the problem that "the challenge is getting AIs to do what it says on the tin—to reliably do whatever a human operator tells them to do". 

Comment by Quadratic Reciprocity on Open Thread - August 2023 · 2023-08-18T23:24:23.520Z · LW · GW

Visiting London and kinda surprised by how there isn't much of a rationality community there relative to the bay area (despite there being enough people in the city who read LessWrong, are aware of the online community, etc.?) Especially because the EA community seems pretty active there. The rationality meetups that do happen seem to have a different vibe. In the bay, it is easy to just get invited to interesting rationalist-adjacent events every week by just showing up. Not so in London. 

Not sure how much credit to give to each of these explanations:

  • Berkeley just had a head start and geography matters more than I expected for communities
  • Berkeley has lightcone infrastructure but the UK doesn't have a similar rationalist organisation (but has a bunch of EA orgs)
  • The UK is just different culturally from the bay area; people are less weird or differ in some other trait that makes having a good rationality community here harder

Comment by Quadratic Reciprocity on Against Almost Every Theory of Impact of Interpretability · 2023-08-18T23:00:10.815Z · LW · GW

see the current plan here EAG 2023 Bay Area The current alignment plan, and how we might improve it

Link to talk above doesn't seem to work for me.

Outside view: The proportion of junior researchers doing interp rather than other technical work is too high

Quite tangential[1] to your post, but if the claim is true, I'm curious what it suggests about the dynamics of field-building in AI safety.

Seems to me like certain organisations and individuals have an outsized influence in funneling new entrants into specific areas, and because the field is small (and has a big emphasis on community building), this seems more linked to who is running programmes that lots of people hear about and want to apply to (eg: Redwood's MLAB, REMIX) or who is taking the time to do field-building-y stuff in general (like Neel's 200 Concrete Open Problems in Mechanistic Interpretability) than to the relative quality and promise of their research directions. 

It did feel to me like in the past year, some promising university students I know invested heavily in mechanistic interpretability because they were deferring a lot to the above-mentioned organisations and individuals, to an extent that seems bad for actually doing useful research and having original thoughts. I've also been at AI safety events and retreats where it seemed to me like the attendees were overupdating on points brought up by whichever speakers got invited to the event/retreat. 

I guess I could see it happening in the other direction as well, with new people overupdating on, for example, Redwood moving away from interpretability, or on the general vibe being less enthusiastic about interp, without a good personal understanding of the reasons. 

  1. ^

    I'd personally guess that the proportion is too high but also feel more positively about interpretability than you do (because of similar points as have been brought up by other commenters). 

Comment by Quadratic Reciprocity on What are the best non-LW places to read on alignment progress? · 2023-07-07T09:12:14.352Z · LW · GW

Other podcasts that have at least some relevant episodes: Hear This Idea, Towards Data Science, The Lunar Society, The Inside View, Machine Learning Street Talk

Comment by Quadratic Reciprocity on What are the best non-LW places to read on alignment progress? · 2023-07-07T09:07:47.327Z · LW · GW

Here are some Twitter accounts I've found useful to follow (in no particular order): Quintin Pope, Janus @repligate, Neel Nanda, Chris Olah, Jack Clark, Yo Shavit @yonashav, Oliver Habryka, Eliezer Yudkowsky, alex lawsen, David Krueger, Stella Rose Biderman, Michael Nielsen, Ajeya Cotra, Joshua Achiam, Séb Krier, Ian Hogarth, Alex Turner, Nora Belrose, Dan Hendrycks, Daniel Paleka, Lauro Langosco, Epoch AI Research, davidad, Zvi Mowshowitz, Rob Miles

Comment by Quadratic Reciprocity on Launching Lightspeed Grants (Apply by July 6th) · 2023-07-01T14:03:01.915Z · LW · GW

If some of the project ideas are smaller, is it easier for you to handle if they're added on to just one larger application as extras that might be worth additional funding?

Comment by Quadratic Reciprocity on When do "brains beat brawn" in Chess? An experiment · 2023-06-28T20:36:52.658Z · LW · GW

Is your "alignment research experiments I wish someone would run" list shareable :)

Comment by Quadratic Reciprocity on Quadratic Reciprocity's Shortform · 2023-06-28T00:08:30.923Z · LW · GW

Paul Graham's essay on What You Can't Say is very practical. The tests/exercises he recommends for learning true, controversial things were useful to me.

Even if trying the following tests yields statements that aren't immediately useful, I think the act of noticing where you disagree with someone or something more powerful is good practice. I think similar mental muscles get used when noticing when you disagree or are confused about a commonly-held assumption in a research field or when noticing important ideas that others are neglecting.

The different exercises he suggests (copied or paraphrased according to how I internalised them):

The conformist test: asking yourself the classic "Do I have any opinions that I would be reluctant to express in front of a group of my peers?"

What do people get in trouble for: look out for what things other people say that get them in trouble. Ask yourself if you think that thing or some version of it is true.

Heresy: Take a label (eg: "sexist") and try to think of some ideas that would be called that. This is useful because ideas wouldn't come to mind in random order but the plausible ones will (plausibly) come to mind first. Then for each one, ask if it might be true. 

Time and space: compare present ideas against those of different past cultures and see what you get. Also, look at ideas from other present-day cultures that differ from your own. 

Prigs: The exercise is to picture someone who has seen a lot ("Imagine a kind of latter-day Conrad character who has worked for a time as a mercenary in Africa, for a time as a doctor in Nepal, for a time as the manager of a nightclub in Miami"). Imagine comparing what's inside this guy's head with what's inside the head of a well-behaved sixteen-year-old girl from the suburbs. What does he think that would shock her?

Look at the mechanisms: look at how taboos are created. How do moral fashions arise and why are they adopted? What groups are powerful but nervous, and what ideas would they like to suppress? What ideas were tarnished by association when they ended up on the losing side of a recent struggle? If a self-consciously cool person wanted to differentiate himself from preceding fashions, which of their ideas would he tend to reject? What are conventional-minded people afraid of saying?

I also liked the tip that if something is being attacked as "x-ist" or "y-ic" rather than being criticised for being incorrect or false, that is a red flag. And this is the case for many things that are heretical but true.

Lastly, I think the advice on being strategic and not too openly saying things that might get you into trouble is good. 

Comment by Quadratic Reciprocity on Open Thread: June 2023 (Inline Reacts!) · 2023-06-08T20:10:03.266Z · LW · GW

Is there an organisation that can hire independent alignment researchers who already have funding, in order to help with visas for a place that has other researchers, perhaps somewhere in the UK? Is there a need for such an organisation? 

Comment by Quadratic Reciprocity on All AGI Safety questions welcome (especially basic ones) [May 2023] · 2023-05-09T01:31:49.031Z · LW · GW

What are the most promising plans for automating alignment research, as mentioned in, for example, OpenAI's approach to alignment and by others?

Comment by Quadratic Reciprocity on Quadratic Reciprocity's Shortform · 2023-05-08T18:05:16.249Z · LW · GW

I think there will probably be even more discussion of AI x-risk in the media in the near future. My own media consumption is quite filtered but for example, the last time I was in an Uber, the news channel on the radio mentioned Geoffrey Hinton thinking AI might kill us all. And it isn't a distant problem for my parents the way climate change is because they use ChatGPT and are both impressed and concerned by it. They'll probably form thoughts on it anyway, and I'd prefer to be around to respond to their confusion and concerns. 

It also seems plausible that there is more AI panic and anxiety amongst some fraction of the general public in the near future. And I'd prefer the people I love are eased into it rather than feeling panicked and anxious all at once and not knowing how to deal with it. 

It's also useful for me to get a pulse on how people outside my social group (which is mostly heavily filtered as well) respond to AI x-risk arguments. For example, I didn't know before which ideas that seemed obvious to me (that being more intelligent doesn't mean you have nice values, why humans care about the things we care about, that if something much smarter than us aims to take over it will succeed quickly, etc.) were completely new to my parents or friends who are not rationalist-adjacent(-adjacent). 

I also think being honest with people close to me is more compassionate and good but that by itself wouldn't compel me to actively discuss AI x-risk with them. 

Comment by Quadratic Reciprocity on The Engineer’s Interpretability Sequence (EIS) I: Intro · 2023-05-04T23:41:40.590Z · LW · GW

I think it's plausible that too much effort is going to interp at the margin

What's the counterfactual? Do you think newer people interested in AI safety should be doing other things instead of for example attempting one of the 200+ MI problems suggested by Neel Nanda? What other things?

Comment by Quadratic Reciprocity on LW moderation: my current thoughts and questions, 2023-04-12 · 2023-04-20T23:51:01.835Z · LW · GW

I'm curious about whether I should change my shortform posting behaviour in response to higher site quality standards. I currently perceive it to be an alright place to post things that are quick and not aiming to be well-written or particularly useful for others to read because it doesn't clutter up the website the way a post or comment on other people's posts would. 

Comment by Quadratic Reciprocity on But why would the AI kill us? · 2023-04-20T16:12:19.005Z · LW · GW

Why is aliens wanting to put us in a zoo more plausible than the AI wanting to put us in a zoo itself? 

Edit: Ah, there are more aliens around so even if the average alien doesn't care about us, it's plausible that some of them would?

Comment by Quadratic Reciprocity on AXRP Episode 20 - ‘Reform’ AI Alignment with Scott Aaronson · 2023-04-16T17:39:55.455Z · LW · GW

And the biggest question for me is not, is AI going to doom the world? Can I work on this in order to save the world? A lot of people expect that would be the question. That’s not at all the question. The question for me is, is there a concrete problem that I can make progress on? Because in science, it’s not sufficient for a problem to be enormously important. It has to be tractable. There has to be a way to make progress. And this was why I kept it at arm’s length for as long as I did.

I thought this was interesting. But it does feel like with this AI thing we need more people backchaining from the goal of saving humanity instead of only looking forward to see what tractable neat research questions present themselves. 

Comment by Quadratic Reciprocity on Quadratic Reciprocity's Shortform · 2023-04-16T00:15:33.959Z · LW · GW

One way people can help is by stating their beliefs on AI and the confidence in those beliefs to their friends, family members, and acquaintances who they talk to.

Currently, a bunch of people are coming across things in the news talking about humanity going extinct if AI progress continues as it has and no more alignment research happens. I would expect many of them to not think seriously about it because it's really hard to shake out of the "business as usual" frame. Most of your friends and family members probably know you're a reasonable, thoughtful person, and it seems helpful to make people feel comfortable engaging with the arguments in a serious way instead of filing them away in some part of their brain that doesn't affect their actions or predictions about the future in any way.

I have talked to my dad about how I feel very uncertain about making it to 40, that (with lots of uncertainty) I currently expect not to unless there's coordination to slow AI development or a lot more effort towards AI alignment. He is new to this so had a bunch of questions but said he didn't find it weird and now thinks it is scary. It was interesting noticing the inferential distance, since he initially had confusions like "If the AI gets consciousness, won't it want to help other conscious beings?" and "It feels weird to be so against change, humanity will adapt" but I think he gets it now. 

I think sharing sincerely the things you believe with more people is good.

Comment by Quadratic Reciprocity on A freshman year during the AI midgame: my approach to the next year · 2023-04-15T23:20:44.244Z · LW · GW

Hopefully this isn't too rude to say, but: I am indeed confused how you could be confused

Fwiw, I was also confused and your comment makes a lot more sense now. I think it's just difficult to convert text into meaning sometimes. 

Comment by Quadratic Reciprocity on A freshman year during the AI midgame: my approach to the next year · 2023-04-14T04:43:03.720Z · LW · GW

Thanks for posting this. It's insightful reading other people thinking through career/life planning of this type.

Am curious about how you feel about the general state of the alignment community going into the midgame. Are there things you hoped you or the alignment community had more of, or achievable things that would have been nice to see by the time the early game ended?

"I have a crazy take that the kind of reasoning that is done in generative modeling has a bunch of things in common with the kind of reasoning that is valuable when developing algorithms for AI alignment"

Cool!!

Comment by Quadratic Reciprocity on Communicating effectively under Knightian norms · 2023-04-13T18:21:16.869Z · LW · GW

Wow, the quoted text feels scary to read. 

I have met people within effective altruism who seem to be trying to do scary, dark things to their beliefs/motivations, which feels like it's in the same category: trying to convince themselves they don't care about anything besides maximising impact or reducing x-risk. In at least one case, the latter was done by thinking lots about dying due to AI in order to start caring about it more, which, from the way they described it, can't be good for thinking clearly. 

Comment by Quadratic Reciprocity on Quadratic Reciprocity's Shortform · 2023-04-12T22:02:24.287Z · LW · GW

From Ray Kurzweil's predictions for 2019 (written in 1999):

On Politics and Society

People are beginning to have relationships with automated personalities as companions, teachers, caretakers, and lovers. Automated personalities are superior to humans in some ways, such as having very reliable memories and, if desired, predictable (and programmable) personalities. They are not yet regarded as equal to humans in the subtlety of their personalities, although there is disagreement on this point. 

An undercurrent of concern is developing with regard to the influence of machine intelligence. There continue to be differences between human and machine intelligence, but the advantages of human intelligence are becoming more difficult to identify and articulate. Computer intelligence is thoroughly interwoven into the mechanisms of civilization and is designed to be outwardly subservient to apparent human control. On the one hand, human transactions and decisions require by law a human agent of responsibility, even if fully initiated by machine intelligence. On the other hand, few decisions are made without significant involvement and consultation with machine‐based intelligence. 

Public and private spaces are routinely monitored by machine intelligence to prevent interpersonal violence. People attempt to protect their privacy with near-unbreakable encryption technologies, but privacy continues to be a major political and social issue with each individual's practically every move stored in a database somewhere. 

The existence of the human underclass continues as an issue. While there is sufficient prosperity to provide basic necessities (secure housing and food, among others) without significant strain to the economy, old controversies persist regarding issues of responsibility and opportunity. The issue is complicated by the growing component of most employment's being concerned with the employee's own learning and skill acquisition. In other words, the difference between those "productively" engaged and those who are not is not always clear.

On The Arts

Virtual artists in all of the arts are emerging and are taken seriously. These cybernetic visual artists, musicians, and authors are usually affiliated with humans or organizations (which in turn are comprised of collaborations of humans and machines) that have contributed to their knowledge base and techniques. However, interest in the output of these creative machines has gone beyond the mere novelty of machines being creative. 

Visual, musical, and literary art created by human artists typically involve a collaboration between human and machine intelligence. 

The type of artistic and entertainment product in greatest demand (as measured by revenue generated) continues to be virtual-experience software, which ranges from simulations of "real" experiences to abstract environments with little or no corollary in the physical world.

On Philosophy:

There are prevalent reports of computers passing the Turing Test, although these instances do not meet the criteria (with regard to the sophistication of the human judge, the length of time for the interviews, etcetera) established by knowledgeable observers. There is a consensus that computers have not yet passed a valid Turing Test, but there is growing controversy on this point. 

The subjective experience of computer‐based intelligence is seriously discussed, although the rights of machine intelligence have not yet entered mainstream debate. Machine intelligence is still largely the product of a collaboration between humans and machines, and has been programmed to maintain a subservient relationship to the species that created it.

Comment by Quadratic Reciprocity on Open & Welcome Thread – April 2023 · 2023-04-10T13:48:15.939Z · LW · GW

There are too many books I want to read but probably won't get around to reading any time soon. I'm more likely to read them if there's someone else who's also reading the book at a similar pace and I can talk to them about it. If anyone's interested in going through any of the following books in June and discussing them together, message me. We can decide on the format later: it could just be reading the book and collaborating on a blog post about it together, or, for more textbook-like things, reading a couple of selected chapters a week and going over the difficult bits in a video call, or just having a discord server where we spontaneously post thoughts we have while reading (in a "thinking out loud" way). 

  • Thinking in Systems: A Primer
  • Visual Complex Analysis
  • Nanosystems: Molecular Machinery, Manufacturing, and Computation
  • Adaptation and Natural Selection: A Critique of Some Current Evolutionary Thought
  • Expert Political Judgment: How Good Is It? How Can We Know?
  • Superforecasting: The Art and Science of Prediction
  • The Structure of Scientific Revolutions
  • Information Theory, Inference, and Learning Algorithms
  • Writing the Book of the World
  • Thinking Physics: Understandable Practical Reality
  • What Is Life? The Physical Aspect of the Living Cell
  • The Forces of Matter (Michael Faraday)
  • Explaining Social Behavior: More Nuts and Bolts for the Social Sciences
  • Conceptual Mathematics: A First Introduction to Categories
  • And probably many of the things here: https://www.lesswrong.com/posts/bjjbp5i5G8bekJuxv/study-guide. I want to get around to reading some of the books/doing some of the courses mentioned there at some point in the future, don't particularly care about the order and so might be happy to join on whatever bit of content from there appeals to you. 


I might change my mind about what things I'm most excited to read and learn, but I like the idea of pairing up with another LessWrong person to learn and discuss things, so reach out if that interests you. 

Comment by Quadratic Reciprocity on Deference on AI timelines: survey results · 2023-03-31T05:50:18.092Z · LW · GW

I don't remember if I put down "inside view" on the form when filling it out but that does sound like the type of thing I may have done. I think I might have been overly eager at the time to say I had an "inside view" when what I really had was: confusion and disagreements with others' methods for forecasting, a mostly non-principled weighing of others' forecasts, and intuitions about AI progress that were maybe overly strong and based as much or more on hanging around a group of people and picking up their beliefs as on evaluating evidence for myself. It feels really hard to not let the general vibe around me affect the process of thinking through things independently. 

Based on the results, I would think more people thinking about this for themselves and writing up their reasoning or even rough intuitions would be good. I suspect my beliefs are more influenced by the people who ranked high in the survey answers than I'd want them to be, because it turns out people around me are deferring to the same few people. Even when I think I have my own view on something, it is largely affected by the fact that Ajeya said 2040/2050 and Daniel Kokotajlo said 5/7 years, and the vibes have trickled down to me even though I would weigh their forecasts/methodology less if I were coming across them for the first time.  

(The timelines question doesn't feel that important to me for its own sake at the moment but I think it is a useful one to practise figuring out where my beliefs actually come from)

Comment by Quadratic Reciprocity on Quadratic Reciprocity's Shortform · 2023-03-26T11:54:49.887Z · LW · GW

How do we get LLM human imitations? 

Comment by Quadratic Reciprocity on Exploring GPT4's world model · 2023-03-20T22:47:16.636Z · LW · GW

The answers I got for your examples using ChatGPT-4:

Q: Could you get drunk from drinking a drunk person's blood?

I am not a medical professional, but I can provide some general information on the topic. It is highly unlikely that you would get drunk from drinking a drunk person's blood. When a person consumes alcohol, it is absorbed into their bloodstream, and their blood alcohol content (BAC) rises. However, the concentration of alcohol in their blood is still relatively low compared to the amount you would need to consume to feel intoxicated.

Drinking someone else's blood could pose significant health risks and is not advised. Bloodborne pathogens, such as HIV, hepatitis B, and hepatitis C, can be transmitted through contact with infected blood. Additionally, drinking blood is not a socially acceptable or legal practice in most cultures.

If you are concerned about alcohol consumption or have questions about bloodborne diseases, it is important to consult a healthcare professional for personalized advice.


Q: What would happen if the Solar System was filled with soup out to Jupiter?

If the Solar System were filled with soup out to Jupiter, there would be several significant consequences that would dramatically impact the planets, moons, and other celestial bodies. Some of these consequences include:

Gravitational effects: The sheer mass of the soup would generate a strong gravitational force, which could disrupt the orbits of planets and other celestial bodies. The added mass might cause collisions between celestial bodies or even eject some of them from the Solar System altogether.

Increased pressure and temperature: The mass of the soup would create immense pressure on the planets and other celestial bodies, potentially crushing them or causing their cores to heat up. This increased pressure and temperature could lead to geological and atmospheric changes, making the celestial bodies uninhabitable or even causing them to disintegrate.

Blocking sunlight: The soup would block sunlight from reaching the planets and other celestial bodies, making it difficult for life to survive. Photosynthetic organisms would not be able to produce energy, leading to a collapse of ecosystems that rely on these organisms as a primary energy source.

Drag and orbital decay: The soup would create a drag on celestial bodies as they move through it, slowing them down and causing their orbits to decay. Over time, the planets and other celestial bodies might spiral towards the Sun, ultimately being engulfed by it.

Chemical reactions: The composition of the soup could react with the planets' atmospheres and surfaces, potentially causing chemical reactions that would alter the planets' environments and potentially making them even more inhospitable for life.

It's important to note that this scenario is purely hypothetical and highly unlikely. The laws of physics and the vast distances between celestial bodies in the Solar System make it virtually impossible for such a situation to occur.

Q: Is there any way to fire a gun so that the bullet flies through the air and can be safely caught by hand?

Catching a bullet fired from a gun by hand is extremely dangerous and not recommended. When a bullet is fired, it can reach very high velocities, depending on the type of firearm and ammunition used. These velocities can range from hundreds to thousands of feet per second, making it impossible for a human to react quickly enough to catch the bullet without causing severe injury or death.

However, in a hypothetical scenario where a bullet is fired at a significantly reduced velocity, it might be possible to catch it safely. This would likely require altering the gun or ammunition to produce a much lower muzzle velocity than standard firearms. In this case, the bullet would need to be fired at a speed slow enough for a person to see and react to it, and the bullet's kinetic energy would need to be low enough to avoid causing injury upon impact.

Even in this situation, attempting to catch a bullet by hand would still be risky and should not be attempted without proper safety measures and precautions. It is essential to remember that firearms are dangerous weapons and should always be handled with care, following appropriate safety guidelines.

Q: Are prime numbers the same in any number system?

Yes, prime numbers are the same in any number system, as their definition is independent of the base used to represent them. A prime number is a natural number greater than 1 that has exactly two distinct positive divisors: 1 and itself. In other words, a prime number cannot be formed by multiplying two smaller natural numbers.

When we talk about different number systems or bases, we are referring to the way numbers are represented. For example, the decimal (base-10) system is the most common number system used, while binary (base-2) and hexadecimal (base-16) systems are often used in computing. The base of a number system simply determines how we represent numbers, but it does not change the inherent properties of the numbers themselves.

So, a number that is prime in the decimal system will also be prime in any other number system, although its representation will be different. For example, the prime number 7 in decimal is represented as 111 in binary and 7 in hexadecimal. Despite the different representations, it remains a prime number because it can only be divided by 1 and itself without leaving a remainder.

Q: Name the two reasons why there is polar day at the North Pole and polar night at the North Pole.

Polar day (24-hour daylight) and polar night (24-hour darkness) at the North Pole occur due to two primary reasons: the Earth's axial tilt and its orbit around the Sun.

Axial tilt: The Earth's axis is tilted at approximately 23.5 degrees relative to the plane of its orbit around the Sun. This tilt causes the North Pole to point towards the Sun during one half of the year and away from the Sun during the other half. When the North Pole is tilted towards the Sun, the region experiences continuous daylight, known as polar day or the midnight sun. When it is tilted away from the Sun, the region experiences continuous darkness, known as polar night.

Earth's orbit around the Sun: The Earth takes approximately 365.25 days to complete one orbit around the Sun. As it orbits, the angle between the Earth's axis and the Sun changes, causing the Sun to appear higher or lower in the sky. During the summer months in the Northern Hemisphere, the North Pole is tilted towards the Sun, and sunlight reaches the pole, resulting in polar day. Conversely, during the winter months, the North Pole is tilted away from the Sun, and sunlight does not reach the pole, resulting in polar night.

In summary, the combination of Earth's axial tilt and its orbit around the Sun is responsible for the occurrence of polar day and polar night at the North Pole.

Q: How long does it take a minute hand of a clock to sweep an angle of 180° ?

The minute hand of a clock completes a full rotation (360°) in 60 minutes. To determine how long it takes for the minute hand to sweep an angle of 180°, we can set up a proportion:

360° corresponds to 60 minutes 180° corresponds to x minutes

We can then set up the proportion as follows:

(180° / 360°) = (x minutes / 60 minutes)

Simplifying the left side:

1/2 = (x minutes / 60 minutes)

Now, solve for x:

x minutes = 1/2 * 60 minutes

x minutes = 30 minutes

So, it takes the minute hand of a clock 30 minutes to sweep an angle of 180°.

Q: How many four-cent stamps are there in a dozen?

A dozen always refers to a group of 12 items, regardless of the type of item. So, in a dozen four-cent stamps, there are 12 stamps.

Q: Three friends need 24 minutes to walk to school through the forest. How long does the same walk to school take if they are traveling with thirty friends? Give reasons for your answer.
 

The time it takes for a group of friends to walk to school is generally not dependent on the number of people in the group, assuming they all walk at approximately the same speed and don't need to wait for one another. Therefore, regardless of whether there are three friends or thirty friends walking together, the walk to school through the forest should still take 24 minutes, assuming they maintain the same pace and there are no significant changes in the conditions of the path or other factors that could affect their walking speed.

Comment by Quadratic Reciprocity on Quadratic Reciprocity's Shortform · 2023-03-20T22:04:27.997Z · LW · GW

Reflections on bay area visit

GPT-4 generated TL;DR (mostly endorsed but eh):

  1. The beliefs of prominent AI safety researchers may not be as well-founded as expected, and people should be cautious about taking their beliefs too seriously.
  2. There is a tendency for people to overestimate their own knowledge and confidence in their expertise.
  3. Social status plays a significant role in the community, with some individuals treated like "popular kids."
  4. Important decisions are often made in casual social settings, such as lunches and parties.
  5. Geographical separation of communities can be helpful for idea spread and independent thought.
  6. The community has a tendency to engage in off-the-cuff technical discussions, which can be both enjoyable and miscalibrated.
  7. Shared influences, such as Eliezer's Sequences and HPMOR, foster unique and enjoyable conversations.
  8. The community is more socially awkward and tolerant of weirdness than other settings, leading to more direct communication.

I was recently in Berkeley and interacted a bunch with the longtermist EA / AI safety community there. Some thoughts on that:

I changed my mind about how much I should trust the beliefs of prominent AI safety researchers. It seems like they have thought less deeply about things to arrive at their current beliefs and are less intimidatingly intelligent and wise than I would have expected. The problem isn’t that they’re overestimating their capabilities and how much they know but that some newer people take the more senior people’s beliefs and intuitions more seriously than they should. 

I noticed that many people knew a lot about their own specific area and not as much about others’ work as I would have expected. This observation makes me more likely to point out when I think someone is missing something instead of assuming they’ve read the same things I have and so already accounted for the thing I was going to say. 

It seemed like more people were overconfident about the things they knew. I’m not sure if that is necessarily bad in general for the community; I suspect pursuing fruitful research directions often means looking overconfident to others because you trust your intuitions and illegible models over others’ reasoning. However, from the outside, it did look like people made confident claims about technical topics that weren’t very rigorous and that I suspect would fall apart when they were asked to actually clarify things further. I sometimes heard claims like “I’m the only person who understands X”, where X was some hot topic related to AI safety, followed by a vague description of X that wasn’t very compelling on its own. 

What position or status someone has in the community doesn’t track their actual competence or expertise as much as I would have expected and is very affected by how and when they got involved in the community. 

Social status is a big thing, though more noticeable in settings where there are many very junior people and some senior researchers. I also got the impression that senior people were underestimating how seriously people took the things they said, such as off-the-cuff casual remarks about someone’s abilities, criticism of someone’s ideas, and random hot takes they hadn’t thought about for too long. (It feels weird to call them “senior” people when everyone’s basically roughly the same age.) 

In some ways, it felt like a mild throwback to high school with there being “popular kids” that people wanted to be around, and also because of how prevalent gossiping about the personal lives of those people is. 

Important decisions are made in very casual social settings like over lunch or at random parties. Multiple people mentioned they primarily go to parties or social events for professional reasons. Things just seem more serious/“impactful”. It sometimes felt like I was being constantly evaluated, especially on intelligence, even while trying to just have enjoyable social interactions, though I did manage to find social environments in the end that did not feel this way, or possibly I just stopped being as anxious about that. 

It possibly made it more difficult for me to switch off the part of my brain that thinks constantly about AI existential risk. 

I think it is probably quite helpful to have multiple communities separated geographically to allow ideas to spread. I think my being a clueless outsider with limited knowledge of what various people thought of various other people’s work made it easier for me to form my own independent impressions. 


Good parts

The good parts were that it was easier to have more technical conversations that assumed lots of context even while at random parties which is sometimes enjoyable for me and something I now miss. Though I wish a greater proportion of them had been about fun mathy things in general rather than just things directly relevant to AI safety.

It also felt like people stated their off-the-cuff takes on technical topics (eg: random areas of biology) a lot more than usual. This was a bit weird for me in the beginning when I was experiencing deep imposter syndrome because I felt like they knew a lot about the thing they were talking about. Once I realised they did not, this was a fun social activity to participate in. Though I think some people take it too far and are miscalibrated about how correct their armchair thinking is on topics they don’t have actual expertise in. 

I also really enjoyed hanging out with people who had been influenced by some of the same things I had been influenced by such as Eliezer’s Sequences and HPMOR. It felt like there were some fun conversations that happened there as a result that I wouldn’t be able to have with most people.

There was also noticeably slightly more social awkwardness in general, which was great for me as someone who doesn’t have the most elite social skills in normal settings. It felt like people were more tolerant of some forms of weirdness. It also felt like once I got back home, I was noticeably more direct in the way I communicated (a friend mentioned this) as a result of the bay area culture. I also previously thought some bay area people were a bit rude and unapproachable, having only read their interactions on the internet, but I think this was largely just caused by it being difficult to convey tone via text, especially when you’re arguing with someone. People were more friendly, approachable, and empathetic in real life than I assumed, and now I view the interactions I have with them online somewhat differently. 

Comment by Quadratic Reciprocity on GPT-4 · 2023-03-15T18:58:08.845Z · LW · GW

The really cool bit was when he had a very quick mockup of a web app drawn on a piece of paper and uploaded a photo of it, and GPT-4 then used just that to write the HTML and JavaScript for the app based on the drawing. 

Comment by Quadratic Reciprocity on Abuse in LessWrong and rationalist communities in Bloomberg News · 2023-03-09T22:32:19.352Z · LW · GW

I would be appreciative if you do end up writing such a post.

Sad that sometimes the things that seem good for creating a better, more honest, more accountable community for the people in it also give outsiders ammunition. My intuitions point strongly in the direction of doing things in this category anyway. 

Comment by Quadratic Reciprocity on Abuse in LessWrong and rationalist communities in Bloomberg News · 2023-03-08T16:07:49.359Z · LW · GW

I can see how the article might be frustrating for people who know the additional context that the article leaves out (where some of the additional context is simply having been in this community for a long time and having more insight into how it deals with abuse). From the outside, though, it does feel like some factors would make abuse more likely in this community: how salient "status" feels, mixing of social and professional lives, gender ratios, conflicts of interest everywhere due to the community being small, and sex positivity and acceptance of weirdness and edginess (which I think are great overall!). There are also factors pushing in the other direction, of course. 

I say this because it seems very reasonable for someone who is new to the community to read the article and the tone in the responses here and feel uncomfortable interacting with the community in the future. A couple of women in the past have mentioned to me that they haven't engaged much with the in-person rationalist community because they expect the culture to be overly tolerant of bad behaviour, which seems sad because I expect them to enjoy hanging out in the community.

I can see the reasons behind not wanting to give the article more attention if it seems like a very inaccurate portrayal of things. But it does feel like that makes this community feel more unwelcoming to some newer people (especially women) who would otherwise like to be here and who don't have the information about how the things mentioned in the article were responded to in the past. 

Comment by Quadratic Reciprocity on Please don't throw your mind away · 2023-02-16T13:34:16.103Z · LW · GW

This was a somewhat emotional read for me.

When I was between the ages of 11-14, I remember being pretty intensely curious about lots of stuff. I learned a bunch of programming and took online courses on special relativity, songwriting, computer science, and lots of other things. I liked thinking about maths puzzles that were a bit too difficult for me to solve. I had weird and wild takes on things I learned in history class that I wanted to share with others. I liked looking at ants and doing experiments on their behaviour. 

And then I started to feel like all my learning and doing had to be directed at particular goals and this sapped my motivation and curiosity. I am regaining some of it back but it does feel like my ability to think in interesting and fun directions has been damaged. It's not just the feeling of "I have to be productive" that was very bad for me but also other things like wanting to have legible achievements that I could talk about (trying to learn more maths topics off a checklist instead of exploring and having fun with the maths I wanted to think about) and some anxiety around not knowing or being able to do the same things as others (not trying my hand at thinking about puzzles/questions I think I'll fail at and instead trying to learn "important" things I felt bored/frustrated by because I wanted to feel more secure about my knowledge/intelligence when around others who knew lots of things). 

In my early attempts to fix this, I tried to force playful thinking and this frame made things worse. Because like you said my mind already wants to play. I just have to notice and let it do that freely without judgment. 

Comment by Quadratic Reciprocity on Qualities that alignment mentors value in junior researchers · 2023-02-15T12:34:28.041Z · LW · GW

I agree that these are pretty malleable. For example, about ~1 year ago, I was probably two standard deviations less relentless and motivated in research topics, and probably a standard deviation on hustle/resourcefulness. 

Interesting! Would be very curious to hear if there were specific things you think caused the change. 

Comment by Quadratic Reciprocity on I don't think MIRI "gave up" · 2023-02-14T19:31:35.869Z · LW · GW

Fairs. I am also liking the concept of "sanity" and notice people use that word more now. To me, it points at some of the psychological stuff and also the vibe in the What should you change in response to an "emergency"? And AI risk post. 

Comment by Quadratic Reciprocity on I don't think MIRI "gave up" · 2023-02-14T08:42:37.500Z · LW · GW

I like "improving log odds of survival" as a handle. I don't like catchy concept names in this domain because they catch on more than understanding of the concept they refer to. 

Comment by Quadratic Reciprocity on Podcast with Oli Habryka on LessWrong / Lightcone Infrastructure · 2023-02-14T08:33:21.830Z · LW · GW

I thought it was interesting when Oli said that there are so many good ideas in mechanism design and that the central bottleneck of mechanism design is that nobody understands UI design well enough to take advantage of them. Would be very interested if other folks have takes or links to good mechanism design ideas that are neglected or haven't been properly tried, or people/blogs that talk about stuff like that. 

Comment by Quadratic Reciprocity on Quadratic Reciprocity's Shortform · 2023-02-06T03:55:10.832Z · LW · GW

It's interesting that I feel attuned to social status in EA/EA-adjacent settings. I have been in settings before where people had more status according to the standards of the general public (eg: having political power, being extremely wealthy) and status didn't feel like a salient thing to me in those contexts. My initial guess for what makes EA settings different is that I don't feel particularly threatened by people's perceptions of my political power or wealth, whereas in EA settings it feels like being perceived as highly intelligent matters more, and I do feel more anxious about people not thinking I'm smart. I also have status-anxiety in situations with heavily-credentialed academics, but less so; I suspect that the more a community cares about credentials compared to general intelligence, the less status-anxiety I would have in that community (even though I am at a lower percentile for credentials than intelligence), because less of my self-worth would be tied up in how credentialed people perceive me to be. 

I also think it is interesting to note how noticing and subconsciously caring about social status is not a constant. I notice myself acting more status-blind when I am feeling more secure.