Posts

AXRP Episode 12 - AI Existential Risk with Paul Christiano 2021-12-02T02:20:17.041Z
Even if you're right, you're wrong 2021-11-22T05:40:00.747Z
The Meta-Puzzle 2021-11-22T05:30:01.031Z
Everything Studies on Cynical Theories 2021-10-27T01:31:20.608Z
AXRP Episode 11 - Attainable Utility and Power with Alex Turner 2021-09-25T21:10:26.995Z
Announcing the Vitalik Buterin Fellowships in AI Existential Safety! 2021-09-21T00:33:08.074Z
AXRP Episode 10 - AI’s Future and Impacts with Katja Grace 2021-07-23T22:10:14.624Z
Handicapping competitive games 2021-07-22T03:00:00.498Z
CGP Grey on the difficulty of knowing what's true [audio] 2021-07-13T20:40:13.506Z
A second example of conditional orthogonality in finite factored sets 2021-07-07T01:40:01.504Z
A simple example of conditional orthogonality in finite factored sets 2021-07-06T00:36:40.264Z
AXRP Episode 9 - Finite Factored Sets with Scott Garrabrant 2021-06-24T22:10:12.645Z
Up-to-date advice about what to do upon getting COVID? 2021-06-19T02:37:10.940Z
AXRP Episode 8 - Assistance Games with Dylan Hadfield-Menell 2021-06-08T23:20:11.985Z
AXRP Episode 7.5 - Forecasting Transformative AI from Biological Anchors with Ajeya Cotra 2021-05-28T00:20:10.801Z
AXRP Episode 7 - Side Effects with Victoria Krakovna 2021-05-14T03:50:11.757Z
Challenge: know everything that the best go bot knows about go 2021-05-11T05:10:01.163Z
AXRP Episode 6 - Debate and Imitative Generalization with Beth Barnes 2021-04-08T21:20:12.891Z
AXRP Episode 5 - Infra-Bayesianism with Vanessa Kosoy 2021-03-10T04:30:10.304Z
Privacy vs proof of character 2021-02-28T02:03:31.009Z
AXRP Episode 4 - Risks from Learned Optimization with Evan Hubinger 2021-02-18T00:03:17.572Z
AXRP Episode 3 - Negotiable Reinforcement Learning with Andrew Critch 2020-12-29T20:45:23.435Z
AXRP Episode 2 - Learning Human Biases with Rohin Shah 2020-12-29T20:43:28.190Z
AXRP Episode 1 - Adversarial Policies with Adam Gleave 2020-12-29T20:41:51.578Z
Cognitive mistakes I've made about COVID-19 2020-12-27T00:50:05.212Z
Announcing AXRP, the AI X-risk Research Podcast 2020-12-23T20:00:00.841Z
Security Mindset and Takeoff Speeds 2020-10-27T03:20:02.014Z
Robin Hanson on whether governments can squash COVID-19 2020-03-19T18:23:57.574Z
Should we all be more hygienic in normal times? 2020-03-17T06:14:23.093Z
Did any US politician react appropriately to COVID-19 early on? 2020-03-17T06:12:31.523Z
An Analytic Perspective on AI Alignment 2020-03-01T04:10:02.546Z
How has the cost of clothing insulation changed since 1970 in the USA? 2020-01-12T23:31:56.430Z
Do you get value out of contentless comments? 2019-11-21T21:57:36.359Z
What empirical work has been done that bears on the 'freebit picture' of free will? 2019-10-04T23:11:27.328Z
A Personal Rationality Wishlist 2019-08-27T03:40:00.669Z
Verification and Transparency 2019-08-08T01:50:00.935Z
DanielFilan's Shortform Feed 2019-03-25T23:32:38.314Z
Robin Hanson on Lumpiness of AI Services 2019-02-17T23:08:36.165Z
Test Cases for Impact Regularisation Methods 2019-02-06T21:50:00.760Z
Does freeze-dried mussel powder have good stuff that vegan diets don't? 2019-01-12T03:39:19.047Z
In what ways are holidays good? 2018-12-28T00:42:06.849Z
Kelly bettors 2018-11-13T00:40:01.074Z
Bottle Caps Aren't Optimisers 2018-08-31T18:30:01.108Z
Mechanistic Transparency for Machine Learning 2018-07-11T00:34:46.846Z
Research internship position at CHAI 2018-01-16T06:25:49.922Z
Insights from 'The Strategy of Conflict' 2018-01-04T05:05:43.091Z
Meetup : Canberra: Guilt 2015-07-27T09:39:18.923Z
Meetup : Canberra: The Efficient Market Hypothesis 2015-07-13T04:01:59.618Z
Meetup : Canberra: More Zendo! 2015-05-27T13:13:50.539Z
Meetup : Canberra: Deep Learning 2015-05-17T21:34:09.597Z

Comments

Comment by DanielFilan on Even if you're right, you're wrong · 2021-11-25T19:00:58.156Z · LW · GW

Indeed! The post would be boring if none of the bullet points were legit.

Comment by DanielFilan on Even if you're right, you're wrong · 2021-11-22T09:02:22.923Z · LW · GW

You're not wrong. This post wasn't really meant literally.

Speaking in the language of the post:

Well, look. Let's put to the side whether sin^2x + cos^2x is actually 1 or not. In today's culture, the obvious and natural interpretation of what you said is that sin^2x - cos^2x = 0. But that's a damaging belief for the future engineers of America to have, one that could seriously harm their faith in the math education they received at our upstanding public schools, and so it's irresponsible for you to go around saying "sin^2x + cos^2x = 1" without clarifying exactly what you do or don't mean.

Comment by DanielFilan on The Meta-Puzzle · 2021-11-22T08:58:17.370Z · LW · GW

That was my solution.

Comment by DanielFilan on The Meta-Puzzle · 2021-11-22T08:52:24.518Z · LW · GW

I didn't think of that one!

Comment by DanielFilan on The Meta-Puzzle · 2021-11-22T06:52:36.993Z · LW · GW

I actually thought about including that: similarly, in classical logic, you can't take an arbitrary claim P and come up with a sentence S such that you can derive P from "S is true" and also from "S is false".

Comment by DanielFilan on The Meta-Puzzle · 2021-11-22T05:54:27.824Z · LW · GW

Depending on how the 'always lie' part is defined, a liar could say something 'impossible' like 'I am neither single nor married'.

I mean that everything they say is false.

both the honest and the liars will say 'I speak the truth'.

And they also both say "I worship God".

Comment by DanielFilan on Everything Studies on Cynical Theories · 2021-11-11T05:13:42.647Z · LW · GW

Regarding the authors' attempts to get papers published in these journals, the review doesn't make it seem like the book relies on that experiment being valid (and the review itself does not) - it just talks about various features of these fields and theorizes about their causes and effects. I also don't think that their experiment was 'bad science' in the sense of being uninformative. If 'grievance studies' journals are willing to publish bad papers, that does tell you something about those journals, even if 'hard science' journals are also willing to publish bad papers (which, thanks to the replication crisis and bloggers like Andrew Gelman, we know they are).

Also, Wikipedia says "By the time of the reveal, 4 of their 20 papers had been published; 3 had been accepted but not yet published; 6 had been rejected; and 7 were still under review." It seems unfair to include the papers that were under review in the denominator, since their efforts ended early, so I'd evaluate their success rate at 1 in 2, rather than 1 in 3, which isn't so bad.
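
For concreteness, here's a quick check of that arithmetic (a minimal sketch; the counts are just the ones quoted from Wikipedia above):

```python
# Counts quoted from Wikipedia above.
published = 4
accepted_unpublished = 3
rejected = 6
under_review = 7
total = published + accepted_unpublished + rejected + under_review  # 20
successes = published + accepted_unpublished  # 7

# Including papers still under review in the denominator:
print(round(successes / total, 2))                   # 0.35, roughly 1 in 3

# Counting only papers whose review process actually finished:
print(round(successes / (total - under_review), 2))  # 0.54, roughly 1 in 2
```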

The line of arguments that critical studies are illiberal because they remove the focus from the individuals, their personal choices and responsibilities to structural and systemic tendencies seems to prove too much.

This is really not what the review portrays the arguments to be, so I'm having difficulty engaging with this paragraph. Could you perhaps quote an example of that argument in the book or in the review that you think is invalid?

But a more important point is that in this metaphor wokeness and social justice are our immune system against fascism.

I'd say that liberalism is a sufficient immune system - altho I'm obviously interested in ways that it isn't.

I don't see how creating a rallying flag for liberals to stand against social justice in a culture war is going to help. On the contrary, this leads to evaporation of group beliefs and more radicalisation of the left due to toxoplasma of rage.

I think the idea is to give a certain strain of thinking a name, and analyse what it's like, to make it easier for people to figure out if that strain of thinking is somehow bad and avoid it if it is. Presumably you're sometimes in favour of this kind of thing, so I'd like to know what you think makes this effort different.

I don't think you can make a case against default ideology in science from the liberal position.

I think liberalism allows the idea that many people can have a wrong worldview!

Regarding how the 'second secularism' deals with issues like de facto segregation in the US: I agree that that's the sort of thing that critical social justice cares about, but it's also something that liberalism can discuss and grapple with. As you mention, in order to understand the problem you probably also need to understand racism, but that doesn't automatically mean that things other than critical social justice can't deal with the problem, or that critical social justice frames are going to be successful (e.g. it might make you think that people only bring up house values as a pretext for racism, when it seems pretty intuitive to me that people actually do care about how much money they have).

A final note: I get the sense that you maybe think I wrote this review. I actually didn't, but I mostly liked it, somewhat mooting the point.

Comment by DanielFilan on What exactly is GPT-3's base objective? · 2021-11-11T00:17:05.114Z · LW · GW

Expected return in a particular environment/distribution? Or not? If not, then you may be in a deployment context where you aren't updating the weights anymore and so there is no expected return

I think you might be misunderstanding this? My take is that "return" is just the discounted sum of future rewards, which you can (in an idealized setting) think of as a mathematical function of the future trajectory of the system. So it's still well-defined even when you aren't updating weights.
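
As a minimal sketch of what I mean (the function name, discount factor, and reward values here are made-up illustrative choices, not anything from the post):

```python
def discounted_return(rewards, gamma=0.99):
    """G_t = sum_k gamma**k * r_{t+k}: a function of the future trajectory
    only, so it is defined whether or not any weights are being updated."""
    return sum(gamma**k * r for k, r in enumerate(rewards))

# Illustrative (made-up) rewards from one rollout at deployment time.
print(discounted_return([1.0, 0.0, 0.5, 2.0]))
```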

Comment by DanielFilan on What exactly is GPT-3's base objective? · 2021-11-11T00:13:50.502Z · LW · GW

I continue to think that the Risks from Learned Optimization terminology is really good, for the specific case that it's talking about. The problem is just that it's not general enough to handle all possible ways of training a model using machine learning.

GPT-3 was trained using self-supervised learning, which I would have thought was a pretty standard way of training a model using machine learning. What training scenarios do you think the Risks from Learned Optimization terminology can handle, and what's the difference between those and the way GPT-3 was trained?

Comment by DanielFilan on What is the most evil AI that we could build, today? · 2021-11-02T04:53:54.548Z · LW · GW

Nice try FBI

Comment by DanielFilan on AMA: Paul Christiano, alignment researcher · 2021-10-27T17:41:40.422Z · LW · GW

What changed your mind about Chaitin's constant?

Comment by DanielFilan on Everything Studies on Cynical Theories · 2021-10-27T01:33:11.021Z · LW · GW

One amusing part:

If we combine this with the idea above that contemporary wokeness in many ways stand directly opposed to the ideals of early postmodernism — and much of what e.g. Foucault said about the production of what was considered “truth” and “knowledge” in his society applies very well to the woke-industrial complex of today — we can say with a not-quite-straight face that the problem with wokeism (or Critical Social Justice) lies in it being insufficiently postmodernist and too reliant on reason.

Comment by DanielFilan on Emergent modularity and safety · 2021-10-22T17:59:39.092Z · LW · GW

It's true! Altho I think of putting something up on arXiv as a somewhat lower bar than 'publication' - that paper has a bit of work left.

Comment by DanielFilan on Common knowledge about Leverage Research 1.0 · 2021-10-14T20:32:20.110Z · LW · GW

Isn't the thing Rob is calling crazy that someone "believed he was learning from Kant himself live across time", rather than believing that e.g. Geoff Anders is a better philosopher than Kant?

Comment by DanielFilan on Book Review: Open Borders · 2021-10-13T19:36:29.484Z · LW · GW

You can if the other 20 million pay more in taxes than the 30 million use on net - especially if the whole 50 million then have citizen kids that are net fiscally positive (and who would presumably not be counted as "non-citizen households" in these statistics). The balance here is a quantitative problem that Caplan and Weinersmith try to estimate in their book.
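
To illustrate why it's a quantitative problem, here's a minimal sketch with deliberately made-up per-person figures (not Caplan and Weinersmith's estimates; only the 30 million / 20 million headcounts come from the thread):

```python
# Hypothetical average net fiscal contribution (taxes paid minus services
# used) per person per year; the headcounts come from the thread above.
net_negative_people = 30_000_000
net_positive_people = 20_000_000
avg_net_negative = -2_000   # hypothetical: costs $2k/year on net
avg_net_positive = 4_000    # hypothetical: contributes $4k/year on net

total = (net_negative_people * avg_net_negative
         + net_positive_people * avg_net_positive)
print(f"net fiscal balance: ${total / 1e9:+.0f}B per year")  # +$20B here
```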

Comment by DanielFilan on How to think about and deal with OpenAI · 2021-10-13T19:13:03.420Z · LW · GW

I don't know what 'we' think, but as a person somewhat familiar with OpenAI employees and research output, I can say they are definitely willing to pursue safety and transparency research that's relevant to existential risk, and I don't really know how one could do that without opening oneself up to producing research that provides evidence of AI danger.

Comment by DanielFilan on Common knowledge about Leverage Research 1.0 · 2021-10-13T19:03:13.926Z · LW · GW

This strikes me as an obviously good question and I'm surprised it hasn't been answered.

Comment by DanielFilan on Petrov Day 2021: Mutually Assured Destruction? · 2021-09-27T07:22:27.852Z · LW · GW

An attempted non-mystical justification for Petrov day sensitivity, for those who think it's ridiculous:

If the LW home page were 'nuked', my day would be slightly worse: there would be interesting posts and comments I wouldn't as easily find out about (e.g. this one by Katja about literal and metaphorical fire alarms). So it makes sense for me to feel a bit bummed if it gets taken down. In addition, if someone else takes the page down, I should feel more bummed: not only did this slightly bad thing happen, but I just learned that someone will make my life a bit worse for no good reason. Maybe they'd make my life significantly worse for no good reason!

Are some people taking things a bit too seriously? Maybe, or maybe they derive much more value from LessWrong than I do, I don't know. But I think the basic stance of "hoping people don't 'nuke' the sites, and being upset at people who do" makes sense.

Comment by DanielFilan on Beyond fire alarms: freeing the groupstruck · 2021-09-26T19:08:07.604Z · LW · GW

AI: It seems like there has been nothing like a ‘fire alarm’ for this, and yet for instance most random ML authors agree that there is a serious risk.

"most ML authors agree risk of extinction-level bad >= 5%" seems not the same as "most ML authors agree risk of extinction-level stuff is serious".

Comment by DanielFilan on CGP Grey on the difficulty of knowing what's true [audio] · 2021-09-21T22:06:47.805Z · LW · GW

Another related CGP Grey video: Someone Dead Ruined My Life… Again.

Comment by DanielFilan on Announcing the Vitalik Buterin Fellowships in AI Existential Safety! · 2021-09-21T17:23:27.388Z · LW · GW

I suppose that types of s-risks that did drastically curtail humanity's potential would count, but s-risks that don't have that issue (e.g. humanity decides to suffer massively, but still has the potential to do lots of other things) would not.

Comment by DanielFilan on The Schelling Choice is "Rabbit", not "Stag" · 2021-09-18T22:05:09.517Z · LW · GW

This Aumann paper is about (a variant of?) the stag hunt game. In this version, it's great for everyone if we both hunt stag, it's somewhat worse for everyone if we hunt rabbit, and if I hunt stag and you hunt rabbit, it's terrible for me, and you're better off than in the world in which we both hunted rabbit, but worse off than in the world in which we both hunted stag.

He makes the point that in this game, even if we agree to hunt stag, if we make our decisions alone and without further accountability, I might think to myself "Well, you would want that agreement if you wanted to hunt stag, but you would also want that agreement if you wanted to hunt rabbit - either way, it's better for you if I hunt stag. So the agreement doesn't really change my mind as to whether you want to hunt rabbit or stag. Since I was presumably uncertain before, I should probably still be uncertain, and that means rabbit is the safer bet."

I'm not sure how realistic the setup is, but I thought it was an interesting take - a case where an agreement to both choose an outcome that's a Nash equilibrium doesn't really persuade me to keep the agreement.
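
To make the payoff structure concrete, here's a minimal sketch with made-up numbers that satisfy the orderings above (these aren't the payoffs from Aumann's paper):

```python
# Payoffs (mine, yours) when I play the row action and you play the column
# action, chosen to satisfy the orderings described above.
payoffs = {
    ("stag", "stag"): (9, 9),
    ("stag", "rabbit"): (0, 7),
    ("rabbit", "stag"): (7, 0),
    ("rabbit", "rabbit"): (6, 6),
}

def my_expected_payoff(my_action, p_you_stag):
    """My expected payoff given my credence that you hunt stag."""
    return (p_you_stag * payoffs[(my_action, "stag")][0]
            + (1 - p_you_stag) * payoffs[(my_action, "rabbit")][0])

# If the agreement leaves me as uncertain about you as before (say 50/50),
# rabbit is still the safer bet: 6.5 vs 4.5.
for action in ("stag", "rabbit"):
    print(action, my_expected_payoff(action, 0.5))
```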

Comment by DanielFilan on The Schelling Choice is "Rabbit", not "Stag" · 2021-09-18T21:57:31.865Z · LW · GW

Two of these asterisks still exist.

Comment by DanielFilan on DanielFilan's Shortform Feed · 2021-09-17T03:10:28.034Z · LW · GW

You might be better at writing than I am.

Comment by DanielFilan on DanielFilan's Shortform Feed · 2021-09-14T20:24:15.458Z · LW · GW

Some puzzles:

  • rubber ducking is really effective
  • it's very difficult to write things clearly, even if you understand them clearly

These seem like they should be related, but I don't quite know how. Maybe if someone thought about it for an hour they could figure it out.

Comment by DanielFilan on Welcome & FAQ! · 2021-08-26T02:50:50.359Z · LW · GW

I really like the art!

Comment by DanielFilan on Delta Strain: Fact Dump and Some Policy Takeaways · 2021-07-28T04:25:24.934Z · LW · GW

if interval is 4 instead of 5.5 days, this would mean that reported R of 7 would turn into R of 9^(4 / 5.5) = 4.

I think this 9 should be a 7?
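
A quick check of the arithmetic, just plugging the two candidate bases into the quoted formula:

```python
# R_new = R_old ** (4 / 5.5), per the formula quoted above.
print(9 ** (4 / 5.5))  # ~4.9: doesn't match the stated result of 4
print(7 ** (4 / 5.5))  # ~4.1: consistent with "reported R of 7 turns into ~4"
```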

Comment by DanielFilan on Black ravens and red herrings · 2021-07-27T20:24:21.216Z · LW · GW

Note that if you are a Solomonoff inductor, seeing a black raven doesn't always increase your credence that all ravens are black: see this paper.

Comment by DanielFilan on Handicapping competitive games · 2021-07-22T07:24:38.038Z · LW · GW

Imo this is better modelled as splitting players into a team in the taxonomy of this post, giving the weaker side a computational advantage. But it points to an awkwardness in the formalism.

Comment by DanielFilan on Handicapping competitive games · 2021-07-22T06:50:11.085Z · LW · GW

This argument would be more compelling to me if komi weren't already used - given that you already have to factor that number in, it doesn't seem like such a big deal to use a different number instead.

Comment by DanielFilan on CGP Grey on the difficulty of knowing what's true [audio] · 2021-07-13T22:25:04.064Z · LW · GW

The post now has correct timestamps for the non-ad version; see those for what I mean.

Comment by DanielFilan on CGP Grey on the difficulty of knowing what's true [audio] · 2021-07-13T22:22:13.235Z · LW · GW

oops, forgot that I didn't have to deal with ads

Comment by DanielFilan on Finite Factored Sets: Orthogonality and Time · 2021-07-07T23:40:55.820Z · LW · GW

OK I think this is a typo, from the proof of prop 10 where you deal with condition 5:

Thus .

I think this should be .

Comment by DanielFilan on Finite Factored Sets: Orthogonality and Time · 2021-07-07T23:09:02.249Z · LW · GW

From def 16:

... if for all

Should I take this to mean "if for all and "?

[EDIT: no, I shouldn't, since and are both subsets of ]

Comment by DanielFilan on A simple example of conditional orthogonality in finite factored sets · 2021-07-06T19:38:37.511Z · LW · GW

Seems right. I still think it's funky that X_1 and X_2 are conditionally non-orthogonal even when the range of the variables is unbounded.

Comment by DanielFilan on AXRP Episode 9 - Finite Factored Sets with Scott Garrabrant · 2021-07-05T22:53:00.146Z · LW · GW

I'm glad to hear that the podcast is useful for people :)

Comment by DanielFilan on rohinmshah's Shortform · 2021-06-29T07:53:21.418Z · LW · GW

My best guess is that rationalists aren't that sane, especially when they've been locked up for a while and are scared and socially rewarding others being scared.

Comment by DanielFilan on rohinmshah's Shortform · 2021-06-29T07:51:27.605Z · LW · GW

TBH I think what made the uCOVID tax work was that once you did some math, it was super hard to justify levels that would imply anything like the existing risk-avoidance behaviour. So the "active ingredient" was probably just getting people to put numbers on the cost-benefit analysis.

[context note: I proposed the EH uCOVID tax]
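
For anyone unfamiliar with the mechanism, here's a minimal sketch of the kind of cost-benefit math involved; the tax rate, risk estimate, and activity value are all hypothetical numbers, not the ones Event Horizon actually used:

```python
# A uCOVID is a one-in-a-million chance of catching COVID. Under the tax,
# you pay the house for the risk your activity imports, then decide whether
# the activity is still worth it.
tax_per_ucovid = 1.0     # hypothetical rate: $1 per uCOVID
activity_ucovids = 300   # hypothetical risk estimate for some outing
activity_value = 100     # hypothetical dollar value you put on the outing

tax_owed = tax_per_ucovid * activity_ucovids
print(f"tax owed: ${tax_owed:.0f}; worth it: {activity_value > tax_owed}")
```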

Comment by DanielFilan on rohinmshah's Shortform · 2021-06-29T07:48:21.407Z · LW · GW

I feel like Noah's argument implies that states won't incur any costs to reduce CO2 emissions, which is wrong. IMO, the argument for a Pigouvian tax in this context is that for a given amount of CO2 reduction that you want, the tax is a cheaper way of getting it than e.g. regulating which technologies people can or can't use.

Comment by DanielFilan on rohinmshah's Shortform · 2021-06-29T07:44:14.017Z · LW · GW

Another way costs are nonlinear in uCOVIDs is if you think you'll probably get COVID.
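
A minimal sketch of that nonlinearity (made-up numbers, and a simplifying independence assumption): the chance of catching COVID at least once grows roughly linearly in accumulated uCOVIDs when small, but saturates as it approaches 1.

```python
def p_infected(total_ucovids):
    """Chance of at least one infection after accumulating independent
    one-in-a-million risks (a simplifying independence assumption)."""
    return 1 - (1 - 1e-6) ** total_ucovids

for n in (10_000, 100_000, 1_000_000, 3_000_000):
    print(n, round(p_infected(n), 3))  # 0.01, 0.095, 0.632, 0.95
```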

Comment by DanielFilan on Knowledge is not just mutual information · 2021-06-10T18:26:36.929Z · LW · GW

Seems like maybe the solution is that you should only take 'the system' to be the 'controllable' physical variables, or those variables that are relevant for 'consequential' behaviour? Hopefully, if one can provide good definitions for these, it will provide a foundation for saying what the abstractions should be that let us distinguish between 'high-level' and 'low-level' behaviour.

Comment by DanielFilan on Survey on AI existential risk scenarios · 2021-06-08T21:31:38.314Z · LW · GW

As a respondent, I remember being unsure whether I should include those catastrophes.

Comment by DanielFilan on Challenge: know everything that the best go bot knows about go · 2021-06-03T18:25:56.446Z · LW · GW

Ah, understood. I think this is basically covered by talking about what the go bot knows at various points in time, a la this comment - it seems pretty sensible to me to talk about knowledge as a property of the actual computation rather than the algorithm as a whole. But from your response there it seems that you think that this sense isn't really well-defined.

Comment by DanielFilan on Challenge: know everything that the best go bot knows about go · 2021-06-03T18:09:05.196Z · LW · GW

This is correct, altho I'm specifically interested in the case of go AI because I think it's important to understand neural networks that 'plan', as well as those that merely 'perceive' (the latter being the main focus of most interpretability work, with some notable exceptions).

Comment by DanielFilan on Challenge: know everything that the best go bot knows about go · 2021-06-03T18:06:36.858Z · LW · GW

OP is a fine way to refer to me, I was just confused since I didn't think my post indicated that my desire was to efficiently program a go bot.

Comment by DanielFilan on Challenge: know everything that the best go bot knows about go · 2021-06-03T18:04:10.916Z · LW · GW

I guess by 'learner' you mean the human, rather than the learned model? If so, then I guess your transparency/explanation/knowledge-extraction method could be learner-specific, and still succeed at the above challenge.

Comment by DanielFilan on Feed the spinoff heuristic! · 2021-06-02T00:55:05.875Z · LW · GW

This is no longer true.

Comment by DanielFilan on Curated conversations with brilliant rationalists · 2021-06-01T16:36:20.523Z · LW · GW

FWIW I find it takes more than 1x for native speakers, but I think never longer than 2.5x for anybody.

Comment by DanielFilan on AXRP Episode 7 - Side Effects with Victoria Krakovna · 2021-05-14T19:18:09.095Z · LW · GW

And also thanks for your kind words :)

Comment by DanielFilan on Challenge: know everything that the best go bot knows about go · 2021-05-14T19:07:29.904Z · LW · GW

I suppose this gets back to OP's desire to program a Go Bot in the most efficient manner possible.

If by "OP" you mean me, that's not really my desire (altho that would be nice).