Infohazard Discussion with Anders Sandberg 2021-03-30T10:12:45.901Z
AI Safety Beginners Meetup (Pacific Time) 2021-03-04T01:44:33.856Z
AI Safety Beginners Meetup (European Time) 2021-02-20T13:20:42.748Z
AISU 2021 2021-01-30T17:40:38.292Z
Online AI Safety Discussion Day 2020-10-08T12:11:56.934Z
AI Safety Discussion Day 2020-09-15T14:40:18.777Z
Online LessWrong Community Weekend 2020-08-31T23:35:11.670Z
Online LessWrong Community Weekend, September 11th-13th 2020-08-01T14:55:38.986Z
AI Safety Discussion Days 2020-05-27T16:54:47.875Z
Announcing Web-TAISU, May 13-17 2020-04-04T11:48:14.128Z
Requesting examples of successful remote research collaborations, and information on what made it work? 2020-03-31T23:31:23.249Z
Coronavirus Tech Handbook 2020-03-21T23:27:48.134Z
[Meta] Do you want AIS Webinars? 2020-03-21T16:01:02.814Z
TAISU - Technical AI Safety Unconference 2020-01-29T13:31:36.431Z
Linda Linsefors's Shortform 2020-01-24T13:08:26.059Z
1st Athena Rationality Workshop - Retrospective 2019-07-17T16:51:36.754Z
Learning-by-doing AI Safety Research workshop 2019-05-24T09:42:49.996Z
TAISU - Technical AI Safety Unconference 2019-05-21T18:34:34.051Z
The Athena Rationality Workshop - June 7th-10th at EA Hotel 2019-05-11T01:01:01.973Z
The Athena Rationality Workshop - June 7th-10th at EA Hotel 2019-05-10T22:08:03.600Z
The Game Theory of Blackmail 2019-03-22T17:44:36.545Z
Optimization Regularization through Time Penalty 2019-01-01T13:05:33.131Z
Generalized Kelly betting 2018-07-19T01:38:21.311Z
Non-resolve as Resolve 2018-07-10T23:31:15.932Z
Repeated (and improved) Sleeping Beauty problem 2018-07-10T22:32:56.191Z
Probability is fake, frequency is real 2018-07-10T22:32:29.692Z
The Mad Scientist Decision Problem 2017-11-29T11:41:33.640Z
Extensive and Reflexive Personhood Definition 2017-09-29T21:50:35.324Z
Call for cognitive science in AI safety 2017-09-29T20:35:16.738Z
The Virtue of Numbering ALL your Equations 2017-09-28T18:41:35.631Z
Suggested solution to The Naturalized Induction Problem 2016-12-24T16:03:03.000Z
Suggested solution to The Naturalized Induction Problem 2016-12-24T15:55:16.000Z


Comment by Linda Linsefors on Should I advocate for people to not buy Ivermectin as a treatment for COVID-19 in the Philippines for now? · 2021-06-05T12:17:11.433Z · LW · GW

Matthew, what's your current best guess about Ivermectin as Covid treatment?

Comment by Linda Linsefors on Should I advocate for people to not buy Ivermectin as a treatment for COVID-19 in the Philippines for now? · 2021-06-05T12:13:37.242Z · LW · GW

I just listened to a podcast talking about how great Ivermectin is, and found this post because I was trying to find out what LWers thought about this.

Here's a bunch of links from that podcast, which seems legit, but I have mostly not looked into them. If you do follow up on this, let me know what you think.


British Ivermectin Recommendation Development group:

The BIRD Recommendation on the Use of Ivermectin for Covid-19:

Executive Summary:

Carvallo et al 2020. Study of the efficacy and safety of topical ivermectin+ iota-carrageenan in the prophylaxis against COVID-19 in health personnel. J. Biomed. Res. Clin. Investig., 2.

Cobos-Campos et al 2021. Potential use of ivermectin for the treatment and prophylaxis of SARS-CoV-2 infection: Efficacy of ivermectin for SARS-CoV-2. Clin Res Trials, 7: 1-5.

Database of all ivermectin COVID-19 studies. 93 studies, 55 peer reviewed, 56 with results comparing treatment and control groups:

Karale et al 2021. A Meta-analysis of Mortality, Need for ICU admission, Use of Mechanical Ventilation and Adverse Effects with Ivermectin Use in COVID-19 Patients. medRxiv.

Kory et al 2021. Review of the Emerging Evidence Demonstrating the Efficacy of Ivermectin in the Prophylaxis and Treatment of COVID-19. American Journal of Therapeutics, 28(3): e299:

Nardelli et al 2021. Crying wolf in time of Corona: the strange case of ivermectin and hydroxychloroquine. Is the fear of failure withholding potential life-saving treatment from clinical use?. Signa Vitae, 1: 2.

Yagisawa et al 2021. Global trends in clinical studies of ivermectin in COVID-19. The Japanese Journal of Antibiotics, 74: 1.

Comment by Linda Linsefors on "You and Your Research" – Hamming Watch/Discuss Party · 2021-03-19T11:41:09.911Z · LW · GW

I will miss the beginning of this meeting. Can you share a video link so I can watch it in advance?

Comment by Linda Linsefors on AI Safety Beginners Meetup (European Time) · 2021-02-20T23:29:21.614Z · LW · GW

I think this happened because I unselected "Alignment Forum" for this event. To my best understanding, events are not supposed to be Alignment Forum content, and it is a bug that this is even possible. Therefore, I decided that the cooperative thing to do would be not to use this bug. Though I'm not sure what is better, since I think events should be allowed on the Alignment Forum.

> I assume that when the event is updated that the additional information will include how to join the meetup?
Yes. We'll probably be in Zoom, but I have not decided. 

> I am interested in attending.
Great, see you there.

Comment by Linda Linsefors on The "Commitment Races" problem · 2021-02-08T15:55:43.456Z · LW · GW

Imagine your life as a tree (as in data structure). Every observation which (from your point of view of prior knowledge) could have been different, and every decision which (from your point of view) could have been different, is a node in this tree. 

Ideally you would want to pre-analyse the entire tree, and decide the optimal pre-commitment for each situation. This is too much work. 

So instead you wait and see which branch you find yourself in, only then make the calculations needed to figure out what you would do in that situation, given a complete analysis of the tree (including logical constraints, e.g. people predicting what you would have done, etc). This is UDT. In theory, I see no drawbacks with UDT. Except in practice UDT is also too much work. 

What you actually do, as you say, is to rely on experience based heuristics. Experience based heuristics are far superior in computational efficiency, and will give you a leg up in raw power. But you will slide away from optimal DT, which will give you a negotiating disadvantage. Given that I think raw power is more important than negotiating advantage, I think this is a good trade-off. 

The only situation where you want to rely more on DT principles, is in super important one-off situations, and you basically only get those in weird acausal trade situations. Like, you could frame us building a friendly AI as acausal trade, like Critch said, but that framing does not add anything useful. 

And then there are things like this and this and this, which I don't know how to think of. I suspect it breaks somehow, but I'm not sure how. And if I'm wrong, getting DT right might be the most important thing.

But in any normal situation, you will either have repeated games among several equals, where some coordination mechanism is just uncomplicatedly in everyone's interest, or you're in a situation where one person just has much more power than the other.

Comment by Linda Linsefors on The "Commitment Races" problem · 2021-02-08T12:57:32.207Z · LW · GW

(This is some of what I tried to say yesterday, but I was very tired and not sure I said it well)

Hm, the way I understand UDT, is that you give yourself the power to travel back in logical time. This means that you don't need to actually make commitments early in your life when you are less smart.

If you are faced with blackmail or transparent Newcomb's problem, or something like that, where you realise that if you had thought of the possibility of this sort of situation before it happened (but with your current intelligence), you would have pre-committed to something, then you should now do as you would have pre-committed to.

This means that a UDT agent doesn't have to make tons of pre-commitments. It can figure things out as it goes, and still get the benefit of early pre-committing. Though as I said when we talked, it does lose some transparency, which might be very costly in some situations. Though I do think that you lose transparency in general by being smart, and that it is generally worth it.

(Now something I did not say)

However, there is one commitment that you (maybe?[1]) have to make to get the benefit of UDT if you are not already UDT, which is to commit to becoming UDT. And I get that you are wary of commitments. 

Though more concretely, I don't see how UDT can lead to worse behaviours. Can you give an example? Or do you just mean that UDT gets into commitment races at all, which is bad? But I don't know any DT that avoids this, other than always giving in to blackmail and bullies, which I already know you don't, given one of the stories in the blogpost.

[1] Or maybe not. Is there a principled difference between never giving in to blackmail because you pre-committed to something, and just never giving in to blackmail without any binding pre-commitment? I suspect not really, which means you are UDT as long as you act UDT, and no pre-commitment is needed, other than for your own sake.

Comment by Linda Linsefors on The "Commitment Races" problem · 2021-02-07T20:07:34.309Z · LW · GW

I mostly agree with this post, except I'm not convinced it is very important. (I wrote some similar thoughts here.)

Raw power (including intelligence) will always be more important than having the upper hand in negotiation. Because I can only shift you up to the amount I can threaten you.

Let's say I can cause you up to X utility of harm, according to your utility function. If I'm maximally skilled at blackmail negotiation, then I can pick your action from the set of actions whose utility to you is within (max-X, max].

If X utility is a lot, then I can influence you a lot. If X is not so much then I don't have much power over you. If I'm strong then X will be large, and influencing your action will probably be of little importance to me. 
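To make the interval (max-X, max] concrete, here is a minimal sketch in Python. The function name, the example actions and their utilities are all made up for illustration, and it assumes X > 0 and a finite set of actions; it is just one way to write down the claim, not a general model of blackmail.

```python
def blackmailable_actions(utilities, harm_capacity):
    """Return the actions a maximally skilled blackmailer could enforce.

    utilities: dict mapping each of the victim's actions to the victim's utility.
    harm_capacity: X, the most harm the blackmailer can threaten (assumed > 0).
    An action is enforceable if its utility lies in the half-open
    interval (max - X, max], where max is the victim's best utility.
    """
    best = max(utilities.values())
    return {action for action, u in utilities.items() if u > best - harm_capacity}


# Hypothetical victim with three possible actions.
victim = {"ignore": 10, "small concession": 8, "large concession": 3}

weak_threat = blackmailable_actions(victim, 1)    # small X: little influence
strong_threat = blackmailable_actions(victim, 10)  # large X: a lot of influence
```

With the small X only the victim's preferred action is in the set, and with the large X every action is, which is the "if X is a lot, I can influence you a lot" point.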

Blackmail is only important when players are of similar strength, which is probably unlikely, or if the power to destroy is much greater than the power to create, which I also find unlikely. 

The main scenario where I expect blackmail to seriously matter (among super intelligences) is in acausal trade between different universes. I'm sceptical of this being a real thing, but admit I don't have strong arguments on this point.

Comment by Linda Linsefors on The "Commitment Races" problem · 2021-02-07T19:57:30.394Z · LW · GW

Meanwhile, a few years ago when I first learned about the concept of updatelessness, I resolved to be updateless from that point onwards. I am now glad that I couldn't actually commit to anything then.


Why is that?

Comment by Linda Linsefors on AISU 2021 · 2021-01-30T19:48:07.361Z · LW · GW

AISU needs a logo. If you are interested in making one for us, let me know.

Comment by Linda Linsefors on AISU 2021 · 2021-01-30T17:53:41.142Z · LW · GW

A note about previous events, and name changes

This is indeed the third AI Safety Unconference I'm involved in organising. The previous two were TAISU (short for Technical AI Safety Unconference), and Web-TAISU.

The first event was an in person event which took place at EA Hotel (CEEALAR). I chose to give that event a narrower focus due to lack of space, and Web-TAISU was mainly just a quick adaptation to there suddenly being a plague about.

Having a bit more time to reflect this time, Aaron Roth and I have decided that there is less reason to put restrictions on an online event, so this time we are inviting everything AI Safety related.

By the way, another thing that is new for this year is that I have a co-organiser. Thanks to Aaron for joining and also reminding me that it was time for another AI Safety Unconference.

Comment by Linda Linsefors on Reflections on Larks’ 2020 AI alignment literature review · 2021-01-08T18:52:22.548Z · LW · GW

Ok, that makes sense. Seems like we are mostly on the same page then. 

I don't have strong opinions on whether drawing in people via prestige is good or bad. I expect it is probably complicated. For example, there might be people who want to work on AI Safety for the right reason, but are too agreeable to do it unless it reaches some level of acceptability. So I don't know what the effects will be on net. But I think it is an effect we will have to handle, since prestige will be important for other reasons. 

On the other hand, there are lots of people who really do want to help, for the right reason. So if growth is the goal, helping these people out seems like just an obvious thing to do. I expect there are ways funders can help out here too. 

I would not update much on the fact that currently most research is produced by existing institutions. It is hard to do good research, and even harder without colleagues, salary and other support that comes with being part of an org. So I think there is a lot of room for growth, by just helping the people who are already involved and trying.

Comment by Linda Linsefors on Reflections on Larks’ 2020 AI alignment literature review · 2021-01-06T00:46:29.522Z · LW · GW

There are two basic ways to increase the number of AI Safety researchers.
1) Take mission aligned people (usually EA undergraduates) and help them gain the skills.
2) Take a skilled AI researcher and convince them to join the mission.

I think these two types of growth may have very different effects. 

A type 1 new person might take some time to get any good, but will be mission aligned. If that person loses sight of the real problem, I am very optimistic about just reminding them what AI Safety is really about, and they will get back on track. Furthermore, these people already exist, and are already trying to become AI Safety researchers. We can help them, ignore them, or tell them to stop. Ignoring them will produce more noise compared to helping them, since the normal pressure of building academic prestige is currently not very aligned with the mission. So do we support them or tell them to stop? Actively telling people not to try to help with AI Safety seems very bad; it is something I would expect to have bad cultural effects beyond just regulating how many people are doing AI Safety research.

A type 2 new person who is converted to AI Safety research because they actually care about the mission is not too dissimilar from a type 1 new person, so I will not write more about that.

However, there is another kind of type 2 person who will be attracted to AI Safety as a side effect of AI Safety being cool and interesting. I think there is a risk that these people take over the field and divert the focus completely. I'm not sure how to stop this though, since this is a direct side effect of gaining respectability, and AI Safety will need respectability. And we can't just work in the shadows until it is the right time, because we don't know the timelines. The best plan I have for keeping global AI Safety research on course is to put as many of "our" people into the field as we can. We have a founder effect advantage, and I expect this to get stronger the more truly mission aligned people we can put into academia. 

I agree with alexflint that there are bad growth trajectories and good growth trajectories. But I don't think the good one is as hard to hit as they do. I think partly what is wrong is the model of AI Safety as a single company. I don't think this is a good intuition pump. Noise is a thing, but it is much less intrusive than this metaphor suggests. Someone at MIRI told me that to a first approximation he doesn't read other people's work, so at least for this person, it doesn't matter how much noise is published, and I think this is a normal situation, especially for people interested in deep work.

What mostly keeps people in academia from doing deep work is the pressure to constantly publish. 

I think focusing on growth vs. not growth is the wrong question. But I think focusing on deep work is the right question. So let's help people do deep work. Or, at least that's what I aim to do. And I'm also happy to discuss with anyone.

Comment by Linda Linsefors on On Destroying the World · 2020-09-29T23:00:55.960Z · LW · GW

I think it's great that you did this. It made the game more real, and hopefully the rest of us learned something.

Comment by Linda Linsefors on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-29T22:30:52.512Z · LW · GW

I know that it is designed to guide decisions made in the real world. This does not force me to agree with the conclusions in all circumstances. Lots of models are not up to the task they are designed to deal with. 

But I should have said "not in that game theory situation", because there is probably a way to construct some game theory game that applies here. That was my bad.

However, I stand by the claim that the full information game is too far from reality to be a good guide in this case. With stakes this high even small uncertainty becomes important.

Comment by Linda Linsefors on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-29T22:25:38.028Z · LW · GW

What possible reason could Petrov or those in similar situations have had for not pushing the button? Maybe he believed that the US would retaliate and kill his family at home, and that deterred him. In other words, he believed his enemy would push the button.

Or maybe he just did not want to kill millions of people?

Comment by Linda Linsefors on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-29T22:23:46.161Z · LW · GW

I should probably have said "we are not in that game theory situation". 
(Though I do think that the real world is more complex than current game theory can handle. E.g. I don't think current game theory can fully handle unknown unknowns, but I could be wrong on this point)

The game of mutually assured destruction is very different even when just including known unknowns.

Comment by Linda Linsefors on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-27T10:29:05.962Z · LW · GW

But we are not in a game theory situation. We are in an imperfect world with imperfect information. There are malfunctioning warning systems and liars. And we are humans and not programs that get to read each other's source code. There are no perfect commitments and if there were, there would be no way of verifying them.

So I think the lesson is that, whatever your public stance, and whether or not you think there are counterfactual situations where you should nuke, in practice you should not nuke.

Do you see what I'm getting at?

Comment by Linda Linsefors on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-26T12:10:12.579Z · LW · GW

From this we learn that you should not launch nukes, even if someone tells you to do it.  

Comment by Linda Linsefors on Online LessWrong Community Weekend · 2020-09-05T13:23:49.269Z · LW · GW

Applications are now closed. We'll be ~120 participants!

Me and the Helpers are finishing up the final preparations in Discord, and other places, so that we are ready to invite you in on Monday.

I've just sent out the last acceptance emails. If you have applied and not heard from me or anyone else on the organising team, then let me know asap, so we can find out what went wrong.

I may accept late applications in exchange for a bribe. (I'm actually serious about this. A *few* late applications is not a problem, and the bribe is so that you don't make a habit of it.)

Comment by Linda Linsefors on Online LessWrong Community Weekend, September 11th-13th · 2020-09-05T13:22:57.563Z · LW · GW

Applications are now closed. We'll be ~120 participants!

Me and the Helpers are finishing up the final preparations in Discord, and other places, so that we are ready to invite you in on Monday.

I've just sent out the last acceptance emails. If you have applied and not heard from me or anyone else on the organising team, then let me know asap, so we can find out what went wrong.

I may accept late applications in exchange for a bribe. (I'm actually serious about this. A *few* late applications is not a problem, and the bribe is so that you don't make a habit of it.)

Comment by Linda Linsefors on Online LessWrong Community Weekend · 2020-09-02T13:58:02.703Z · LW · GW

Good point. I also got this question elsewhere.

Sofia Gallego suggests:

If your country allows it, there are plenty of platforms (e.g. TransferWise) with which you can do international transfers easier and with lower fees than banks.

If that doesn't work, you can also send money to me on paypal, with a message saying what it is for, and I'll transfer it to LessWrong Deutschland for you. Bank transfer within Europe is super easy.

Comment by Linda Linsefors on The Curse Of The Counterfactual · 2020-08-18T20:57:38.975Z · LW · GW
he admits that she did not actually do any of the things she thinks she should have. But her brain persists in arguing that reality is wrong.

This is interesting. We use the word "should" both to command ourselves and others ("You should eat vegetables") and to make predictions ("This should work"). Both uses have a similar type of uncertainty: we do not know if the suggestion will be obeyed or if our prediction will be right.

I'm not sure how much one should read into linguistic quirks like this.

Comment by Linda Linsefors on Raising funds to establish a new AI Safety charity · 2020-08-08T22:40:08.344Z · LW · GW

That makes sense. Thanks for taking the time to answer.

Comment by Linda Linsefors on Solving Key Alignment Problems Group · 2020-08-08T18:06:29.905Z · LW · GW

Yes, that one.

Comment by Linda Linsefors on Raising funds to establish a new AI Safety charity · 2020-08-08T12:30:15.725Z · LW · GW

Is this rule still in place?

Why do you have this rule? It seems to me like banning organizational announcements will make it much harder to get new initiatives off the ground.

Comment by Linda Linsefors on Solving Key Alignment Problems Group · 2020-08-08T11:24:03.268Z · LW · GW

This seems great. Would it be ok to join one or two times to see if your group and method, and whatever topic you decide to zoom into, is a good fit for me?

I've started to collect various AI Safety initiatives here, so that we are aware of each other and hopefully can support each other. Let me know if you want to be listed there too.

Also, people who are interested in joining elriggs's group might also be interested in the AI Safety discussion days that JJ and I are organising. Same topic, different format.

FLI has made a map of all AI Safety research (or all that they could find at the time). Would this be a useful resource for you? I'm not linking it directly, because you might want to think for yourself first, before becoming too biased by others' ideas. But it seems that at least it would be a useful tool at the literature review stage.

Comment by Linda Linsefors on Online LessWrong Community Weekend, September 11th-13th · 2020-08-02T12:47:46.457Z · LW · GW

Good point. I don't think LessWrong Deutschland has paypal. But if you send it to me, then I can forward the money.

My paypal is my normal email:

Comment by Linda Linsefors on Was a PhD necessary to solve outstanding math problems? · 2020-07-14T08:06:33.081Z · LW · GW
A PhD is an opportunity to do focused, original research. People should only choose that path if that’s what they really want.

I completely agree. Doing a PhD for credentials is not a good strategy. Doing a PhD for money makes no sense whatsoever.

Comment by Linda Linsefors on AI Safety Discussion Days · 2020-07-13T21:57:29.553Z · LW · GW

The next discussion day is on July 18th.

More info here:

Comment by Linda Linsefors on Was a PhD necessary to solve outstanding math problems? · 2020-07-11T11:54:26.918Z · LW · GW

There is also the fact that there are far fewer academic post-doc jobs compared to PhD positions. This is probably different in different fields, but my math friend says this is definitely the case in math. Sure, the more successful are more likely to get the next job, but it is more about relative success compared to your competition, than absolute success. I don't know if the bar to keep going happens to be reasonable in absolute terms.

The way I view a PhD is that it is an entry level research job. If you want to have a research career, you start with an entry level research job, more or less similar to other career paths.

I wonder, if you want to do maths research, and don't do a PhD, what is the alternative? The best thing about a PhD is that you get paid to do research, which is very uncommon everywhere else, unless you do something very applied.

Do you know of any reasonable alternatives to working in academia for less applied research? Or maybe this is what you mean by gate-keeping, that academia has monopolised funding?

Comment by Linda Linsefors on Was a PhD necessary to solve outstanding math problems? · 2020-07-10T23:29:34.982Z · LW · GW

Somewhat related:

  • This trailer for the documentary "Death (& Rebirth) Of A PhD" claims that getting a PhD used to be great, but is now crap.
  • And here's an almost finished blogpost I'm working on: "Should you do a PhD?" Where I try to sort out some misconceptions I've seen, and give some very general advice.

However, neither of these exactly address the question of the post.

However again, I think it is probably more useful to ask the question: If I want to solve outstanding maths problems, is a PhD my best choice?

Comment by Linda Linsefors on Was a PhD necessary to solve outstanding math problems? · 2020-07-10T23:16:51.560Z · LW · GW

I've read this after I wrote my own reply. This seems like a reasonable hypothesis too. One thing a PhD supervisor is great for, is telling you what has already been done, and what papers you should read to learn more about some particular thing.

Comment by Linda Linsefors on Was a PhD necessary to solve outstanding math problems? · 2020-07-10T23:14:53.374Z · LW · GW

I don't think a PhD is necessary for groundbreaking math. A more plausible explanation (or so I think) is that academia is a preferable work environment, compared to being by yourself. Even for an introvert, being part of academia will be more convenient. Therefore, everyone who wants to do math research will try to find a job in academia, and everyone who is smart/competent enough to do groundbreaking research is also more than smart/competent enough to get a PhD.

I have to say that I also expected some of the work to be done by non-PhDs. But given the result I think that the correlation has at least as much to do with common cause, as with causality from PhD -> research.

On the other hand, it could be the other way around? Did you check if they got their PhD before or after that result? If you do groundbreaking research, you can just write it up as a thesis and get a PhD.

Comment by Linda Linsefors on Rationality: From AI to Zombies · 2020-07-05T06:14:39.219Z · LW · GW

I'm leaving this comment so that I can find my way back here in the future.

Comment by Linda Linsefors on Announcing Web-TAISU, May 13-17 · 2020-05-15T12:25:40.648Z · LW · GW

So... apparently I underestimated the need to send out event reminders, but better late than never. Today is the 2nd day (out of 4) of Web-TAISU, and it is not too late to join.

General information about the event:

Collaborative Schedule:

Let me know if you have any questions.

Comment by Linda Linsefors on Using vector fields to visualise preferences and make them consistent · 2020-04-23T13:14:04.478Z · LW · GW

As mentioned, I did think of this model before, and I also disagree with Justin/Convergence on how to use it.

Let's say that the underlying space for the vector field is the state of the world. Should we really remove curl? I'd say no. It is completely valid to want to move along some particular path, even a circle, or more likely, a spiral.

Alternatively, let's say that the underlying space for the vector field is world histories. Now we should remove curl, because any circular preference in this space is inconsistent. But what even is the vector field in this picture?
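For anyone who wants to see what "curl" means here, below is a rough Python sketch on a toy 2D space. Everything in it is an illustrative assumption rather than part of the original model: the two example fields, the made-up utility U(x, y) = -(x² + y²), and the finite-difference step. It only shows that a "go around in a circle" preference field has non-zero curl, while a field that is the gradient of some utility function has none.

```python
def curl_2d(field, x, y, h=1e-5):
    """Approximate the scalar curl dFy/dx - dFx/dy of a 2D field at (x, y).

    field: function (x, y) -> (Fx, Fy).
    Uses central finite differences with step h.
    """
    dfy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dfx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dfy_dx - dfx_dy


def circular_preference(x, y):
    # "I want to keep moving counter-clockwise around the origin."
    return (-y, x)


def gradient_preference(x, y):
    # Gradient of the hypothetical utility U(x, y) = -(x**2 + y**2):
    # "I want to move towards the origin."
    return (-2 * x, -2 * y)


circular_curl = curl_2d(circular_preference, 0.3, -0.7)   # close to 2
gradient_curl = curl_2d(gradient_preference, 0.3, -0.7)   # close to 0
```

The circular field is the kind of thing I would keep if the space is world states, and the kind of thing that would have to go if the space is world histories.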


My reason for considering values as a vector is that this is sort of how it feels to me on the inside. I have noticed that my own values are very different depending on my current mood and situation.

  • When I'm sad/depressed, I become a selfish hedonist. All I care about is for me to be happy again.
  • When I'm happy I have more complex and more altruistic values. I care about truth and the well-being of others.

It's like these wants are not tracking my global values at all, but just pointing out a direction in which I want to move. I doubt that I even have global values, because that would be very complicated, and also what would be the use of that? (Except when building a super intelligent AI, but that did not happen much in our ancestral environment.)

Comment by Linda Linsefors on I'm leaving AI alignment – you better stay · 2020-03-22T23:28:22.919Z · LW · GW

Ok, thanks. I have changed it ...

... I mean that is what I wrote all along, can't you see? :P

Comment by Linda Linsefors on I'm leaving AI alignment – you better stay · 2020-03-22T14:45:16.702Z · LW · GW

Hm, I did not think about the tax part.

What country do you live in?

Maybe BERI would be willing to act as a middleman. They have non-profit status in the US.

Comment by Linda Linsefors on [Meta] Do you want AIS Webinars? · 2020-03-22T02:24:31.001Z · LW · GW

Would you be interested in just participating? I read your post about leaving AIS. Seems like you have enough experience to be able to contribute to the discussion.

Comment by Linda Linsefors on I'm leaving AI alignment – you better stay · 2020-03-22T02:22:34.501Z · LW · GW

Nice diagram.

I'm currently doing interviews with early career and aspiring AIS researchers to learn how to better support this group, since I know a lot of us are struggling. Even though you left, I think there is valuable information in your experience. You can answer here publicly or contact me via your preferred method.


What could have been different about the world for you to succeed in getting a sustainable AI Safety research career?

What if you got more funding?

What if you got some sort of productivity coaching?

What if you had a collaboration partner?


Random suggestion

Would you be interested in being a research sponsor? I'm guessing wildly here, but maybe you can earn enough to live the fun life you want while also supporting an AI Safety researcher? Given that you have been in the field, you have some capability to evaluate their work. You can give someone not just money but also a discussion partner and some amount of feedback.

If you can help someone else succeed, that creates as much good as doing the work yourself.

I just started doing these interviews with people, so I don't know for sure. But if my current model is more or less right, there will be lots of people who are in the situation you just left behind. And if I were to make some wild guesses again, I would say that most of them will quit after a few years, like you, unless we can create better support.

This is just something that came to my mind late at night. I have not thought long and hard about this idea. But maybe check if something like this feels right for you?

Comment by Linda Linsefors on [Meta] Do you want AIS Webinars? · 2020-03-21T21:37:45.732Z · LW · GW

Let's do it!

If you pick a time and date and write up an abstract, then I will sort out the logistics. Worst case it's just you and me having a conversation, but most likely some more people will show up.

Comment by Linda Linsefors on TAISU - Technical AI Safety Unconference · 2020-03-19T13:49:52.321Z · LW · GW

COVID-19 Update!

TAISU in its planned form is cancelled. But there will be a Web-TAISU around the same theme, and around the same time. I will make an announcement and probably open up for more applications when this thing is a bit more planned out.

Comment by Linda Linsefors on TAISU - Technical AI Safety Unconference · 2020-03-19T13:45:31.869Z · LW · GW

Hi Jarson.

Due to the current pandemic, TAISU will take a very different form than originally planned. I will organize some sort of online event on the same theme around the same time, but I don't know much more yet. I don't want to take on board more participants until I know what I'm organising. But as soon as I know a bit more, I will make a new announcement and open up applications again. I expect this will happen within a week or two.

Regarding your project, I'd be happy to take a look at your Google Doc. Please share it.

Comment by Linda Linsefors on TAISU - Technical AI Safety Unconference · 2020-02-24T12:26:55.925Z · LW · GW

Official application deadline has now passed. Those of you who have applied to participate will soon get an email.

However, since TAISU is not full yet, I will now accept anyone who is qualified on a first come, first served basis.

Comment by Linda Linsefors on Linda Linsefors's Shortform · 2020-01-24T13:08:26.165Z · LW · GW

I'm basically ready to announce the next Technical AI Safety Unconference (TAISU). But I have hit a bit of decision paralysis as to what dates it should be.

If you are reasonably interested in attending, please help me by filling in this Doodle.

If you don't know what this is about, have a look at the information for the last one.

The venue will be EA Hotel in Blackpool UK again.

Comment by Linda Linsefors on “embedded self-justification,” or something like that · 2019-11-13T19:11:32.891Z · LW · GW

The way I understand your division of floors and ceilings, the ceiling is simply the highest level of meta there is, and the agent *typically* has no way of questioning it. The ceiling is just "what the algorithm is programmed to do". AlphaGo is hard-coded to update the network weights in a certain way in response to the training data.

What you call the floor for AlphaGo, i.e. the move evaluations, is not even a boundary (in the sense nostalgebraist defines it); that would just be the object-level (no meta at all) policy.

I think this structure will be the same for any known agent algorithm, where by "known" I mean "we know how it works", rather than "we know that it exists". However, humans seem to be different? When I try to introspect, it all seems to be mixed up, with object-level heuristics influencing meta-level updates. The ceiling and the floor are all mixed together. Or maybe not? Maybe we are just the same, i.e. having a definite, hard-coded, highest level of meta. Some evidence of this is that sometimes I just notice emotional shifts and/or decisions being made in my brain, and I just know that no normal reasoning I can do will have any effect on this shift/decision.

Comment by Linda Linsefors on Vanessa Kosoy's Shortform · 2019-11-13T14:09:50.217Z · LW · GW

I agree that, in decision problems where Omega is assumed to be a perfect predictor, you can assign whatever belief you want (e.g. whatever is useful for the agent's decision-making process) for what happens in the counterfactual where Omega is wrong. However, if you want to generalise to cases where Omega is an imperfect predictor (as you do mention), then I think you will (in general) have to put in the correct reward for Omega being wrong, because this is something that might actually be observed.
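To make this concrete, here is a minimal sketch in Python. It assumes a Newcomb-style problem with the usual textbook payoffs and a single accuracy parameter for Omega; the payoff table, the function names and the value -12345 are all my own illustrative choices, not something from the post. The expected reward of an action is the accuracy-weighted mix of the "Omega is right" and "Omega is wrong" entries, so the "wrong" entry only drops out when the accuracy is exactly 1.

```python
# Hypothetical Newcomb-style payoffs, keyed by (action, prediction).
# These are the usual textbook numbers, assumed here for illustration only.
NEWCOMB = {
    ("one_box", "one_box"): 1_000_000,
    ("one_box", "two_box"): 0,
    ("two_box", "one_box"): 1_001_000,
    ("two_box", "two_box"): 1_000,
}


def other(action):
    """Return the action Omega predicts when it mispredicts `action`."""
    return "two_box" if action == "one_box" else "one_box"


def expected_reward(action, accuracy, reward):
    """Expected reward when Omega predicts `action` with probability `accuracy`."""
    right = reward[(action, action)]
    wrong = reward[(action, other(action))]
    return accuracy * right + (1 - accuracy) * wrong


def with_wrong_branch(reward, action, value):
    """Copy of `reward` with the 'Omega is wrong' entry for `action` replaced."""
    changed = dict(reward)
    changed[(action, other(action))] = value
    return changed


# An arbitrary belief about what happens when Omega mispredicts a one-boxer.
arbitrary = with_wrong_branch(NEWCOMB, "one_box", -12345)

for accuracy in (1.0, 0.9):
    print(
        accuracy,
        expected_reward("one_box", accuracy, NEWCOMB),
        expected_reward("one_box", accuracy, arbitrary),
    )
```

With accuracy 1.0 the two tables give the same expected reward for one-boxing, so the arbitrary entry is harmless. With accuracy 0.9 they differ, which is why the entry has to be the real reward in that case.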

Comment by Linda Linsefors on All I know is Goodhart · 2019-10-24T23:03:03.460Z · LW · GW

Whether this works or not is going to depend heavily on what looks like.

Given , i.e. , what does this say about ?

The answer depends on the amount of mutual information between , and . Unfortunately, the more generic is (i.e. any function is possible), the less mutual information there will be. Therefore, unless we know some structure about , the restriction to is not going to do much. The agent will just find a very different policy that also achieves very high in some very Goodharty way, but does not get penalized, because low value for on is not correlated with low value on .

This could possibly be fixed by adding assumptions of the type for any that does too well on . That might yield something interesting, or it might just be a very complicated way of specifying a satisficer, I don't know.

Comment by Linda Linsefors on TAISU 2019 Field Report · 2019-10-16T07:18:42.515Z · LW · GW

Mainly that we had two scheduling sessions, one on the morning of the first day and one on the morning of the third day. At each scheduling session, it was only possible to add activities for the upcoming two days.

At the start of the unconference, I encouraged people to think of it as a two-day event and to try to put everything they really wanted to do into the first two days. On the morning of day three, the schedule was cleared to let people add sessions about topics that were alive to them at that time.

The main reason for this design choice was to allow continued/deeper conversation. If ideas were created during the first half, I wanted there to be space to keep talking about those ideas.

Also, some people only attended the last two days, and this set up guaranteed they would get a chance to add things to the schedule too. But that could also have been solved in other ways, so that was not a crux for my design choice.

Comment by Linda Linsefors on Conceptual Problems with UDT and Policy Selection · 2019-10-15T12:59:34.447Z · LW · GW

I think UDT1.1 has two fundamentally wrong assumptions built in.

1) Complete prior: UDT1.1 follows the policy that is optimal according to its prior. This is incomputable in general settings and will have to be approximated somehow. But even an approximation of UDT1.1 assumes that UDT1.1 is at least well defined. However, in some multi-agent settings, or when the agent is being fully simulated by the environment, or in any other setting where the environment is necessarily bigger than the agent, UDT1.1 is ill defined.

2) Free will: In the problem Agent Simulates Predictor, the environment is smaller than the agent, so it falls outside the above point. Here I instead think the problem is that the agent assumes that it has free will, when in fact it behaves in a deterministic manner.

The problem of free will in Decision Problems is even clearer in the smoking lesion problem:

You want to smoke and you don't want cancer. You know that people who smoke are more likely to get cancer, but you also know that smoking does not cause cancer. Instead, there is a common cause, some gene, that happens to both increase the risk of cancer and make a person with this gene more likely to choose to smoke. You cannot test if you have the gene.

Say that you decide to smoke, because either you have the gene or you don't, so you might as well enjoy smoking. But what if everyone thought like this? Then there would be no correlation between the cancer gene and smoking. So where did the statistics about smokers getting cancer come from (in this made-up version of reality)?
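Here is a small sketch of that argument in Python. All the probabilities are made up for illustration, and the function name is my own. Cancer depends only on the gene, never on smoking, and the cancer rate among smokers and non-smokers is computed exactly using Bayes' rule.

```python
def cancer_rate_by_choice(p_gene, p_smoke_gene, p_smoke_no_gene,
                          p_cancer_gene, p_cancer_no_gene):
    """Return (P(cancer | smoke), P(cancer | no smoke)).

    Cancer depends only on the gene. Smoking is only ever a symptom of the
    gene, through the two p_smoke_* probabilities.
    """
    def rate(p_choice_gene, p_choice_no_gene):
        # Fraction of the whole population that makes this choice.
        total = p_gene * p_choice_gene + (1 - p_gene) * p_choice_no_gene
        if total == 0:
            return None  # nobody makes this choice, so there is no statistic
        # Bayes: how common the gene is among people making this choice.
        p_gene_given_choice = p_gene * p_choice_gene / total
        return (p_gene_given_choice * p_cancer_gene
                + (1 - p_gene_given_choice) * p_cancer_no_gene)

    smokers = rate(p_smoke_gene, p_smoke_no_gene)
    non_smokers = rate(1 - p_smoke_gene, 1 - p_smoke_no_gene)
    return smokers, non_smokers


# The gene makes smoking more likely: smokers show a higher cancer rate.
gene_driven = cancer_rate_by_choice(0.2, 0.8, 0.3, 0.5, 0.1)

# Everyone reasons "I might as well smoke", gene or no gene.
same_reasoning = cancer_rate_by_choice(0.2, 1.0, 1.0, 0.5, 0.1)

print(gene_driven)
print(same_reasoning)
```

In the second population the smokers' cancer rate is just the population base rate, and there are no non-smokers to compare against, so the statistic that motivated the problem could never have been collected.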

If you are the sort of person who smokes no matter what, then either:

a) You are sufficiently different from most people such that the statistics do not apply to you.


b) The cancer gene is correlated with being the sort of person that has a decision process that leads to smoking.

If b is correct, then maybe you should be the sort of algorithm that decides not to smoke, so as to increase the chance of being implemented in a brain that lives in a body with less risk of cancer. But if you start thinking like that, then you are also giving up your hope of affecting the universe, and resigning yourself to just choosing where you might find yourself, and I don't think that is what we want from a decision theory.

But there also seems to be no good way of thinking about how to steer the universe without pretending to have free will. And since that is actually a false assumption, there will be weird edge cases where your reasoning breaks down.