Posts

How has the cost of clothing insulation changed since 1970 in the USA? 2020-01-12T23:31:56.430Z · score: 14 (3 votes)
Do you get value out of contentless comments? 2019-11-21T21:57:36.359Z · score: 28 (12 votes)
What empirical work has been done that bears on the 'freebit picture' of free will? 2019-10-04T23:11:27.328Z · score: 9 (4 votes)
A Personal Rationality Wishlist 2019-08-27T03:40:00.669Z · score: 43 (26 votes)
Verification and Transparency 2019-08-08T01:50:00.935Z · score: 37 (17 votes)
DanielFilan's Shortform Feed 2019-03-25T23:32:38.314Z · score: 19 (5 votes)
Robin Hanson on Lumpiness of AI Services 2019-02-17T23:08:36.165Z · score: 16 (6 votes)
Test Cases for Impact Regularisation Methods 2019-02-06T21:50:00.760Z · score: 65 (19 votes)
Does freeze-dried mussel powder have good stuff that vegan diets don't? 2019-01-12T03:39:19.047Z · score: 18 (5 votes)
In what ways are holidays good? 2018-12-28T00:42:06.849Z · score: 22 (6 votes)
Kelly bettors 2018-11-13T00:40:01.074Z · score: 23 (7 votes)
Bottle Caps Aren't Optimisers 2018-08-31T18:30:01.108Z · score: 66 (26 votes)
Mechanistic Transparency for Machine Learning 2018-07-11T00:34:46.846Z · score: 55 (21 votes)
Research internship position at CHAI 2018-01-16T06:25:49.922Z · score: 25 (8 votes)
Insights from 'The Strategy of Conflict' 2018-01-04T05:05:43.091Z · score: 73 (27 votes)
Meetup : Canberra: Guilt 2015-07-27T09:39:18.923Z · score: 1 (2 votes)
Meetup : Canberra: The Efficient Market Hypothesis 2015-07-13T04:01:59.618Z · score: 1 (2 votes)
Meetup : Canberra: More Zendo! 2015-05-27T13:13:50.539Z · score: 1 (2 votes)
Meetup : Canberra: Deep Learning 2015-05-17T21:34:09.597Z · score: 1 (2 votes)
Meetup : Canberra: Putting Induction Into Practice 2015-04-28T14:40:55.876Z · score: 1 (2 votes)
Meetup : Canberra: Intro to Solomonoff induction 2015-04-19T10:58:17.933Z · score: 1 (2 votes)
Meetup : Canberra: A Sequence Post You Disagreed With + Discussion 2015-04-06T10:38:21.824Z · score: 1 (2 votes)
Meetup : Canberra HPMOR Wrap Party! 2015-03-08T22:56:53.578Z · score: 1 (2 votes)
Meetup : Canberra: Technology to help achieve goals 2015-02-17T09:37:41.334Z · score: 1 (2 votes)
Meetup : Canberra Less Wrong Meet Up - Favourite Sequence Post + Discussion 2015-02-05T05:49:29.620Z · score: 1 (2 votes)
Meetup : Canberra: the Hedonic Treadmill 2015-01-15T04:02:44.807Z · score: 1 (2 votes)
Meetup : Canberra: End of year party 2014-12-03T11:49:07.022Z · score: 1 (2 votes)
Meetup : Canberra: Liar's Dice! 2014-11-13T12:36:06.912Z · score: 1 (2 votes)
Meetup : Canberra: Econ 101 and its Discontents 2014-10-29T12:11:42.638Z · score: 1 (2 votes)
Meetup : Canberra: Would I Lie To You? 2014-10-15T13:44:23.453Z · score: 1 (2 votes)
Meetup : Canberra: Contrarianism 2014-10-02T11:53:37.350Z · score: 1 (2 votes)
Meetup : Canberra: More rationalist fun and games! 2014-09-15T01:47:58.425Z · score: 1 (2 votes)
Meetup : Canberra: Akrasia-busters! 2014-08-27T02:47:14.264Z · score: 1 (2 votes)
Meetup : Canberra: Cooking for LessWrongers 2014-08-13T14:12:54.548Z · score: 1 (2 votes)
Meetup : Canberra: Effective Altruism 2014-08-01T03:39:53.433Z · score: 1 (2 votes)
Meetup : Canberra: Intro to Anthropic Reasoning 2014-07-16T13:10:40.109Z · score: 1 (2 votes)
Meetup : Canberra: Paranoid Debating 2014-07-01T09:52:26.939Z · score: 1 (2 votes)
Meetup : Canberra: Many Worlds + Paranoid Debating 2014-06-17T13:44:22.361Z · score: 1 (2 votes)
Meetup : Canberra: Decision Theory 2014-05-26T14:44:31.621Z · score: 1 (2 votes)
[LINK] Scott Aaronson on Integrated Information Theory 2014-05-22T08:40:40.065Z · score: 22 (23 votes)
Meetup : Canberra: Rationalist Fun and Games! 2014-05-01T12:44:58.481Z · score: 0 (3 votes)
Meetup : Canberra: Life Hacks Part 2 2014-04-14T01:11:27.419Z · score: 0 (1 votes)
Meetup : Canberra Meetup: Life hacks part 1 2014-03-31T07:28:32.358Z · score: 0 (1 votes)
Meetup : Canberra: Meta-meetup + meditation 2014-03-07T01:04:58.151Z · score: 3 (4 votes)
Meetup : Second Canberra Meetup - Paranoid Debating 2014-02-19T04:00:42.751Z · score: 1 (2 votes)

Comments

Comment by danielfilan on Realism about rationality · 2020-01-14T00:58:17.575Z · score: 2 (1 votes) · LW · GW

Meta/summary: I think we're talking past each other, and hope that this comment clarifies things.

How critical is it that rationality is as real as electromagnetism, rather than as real as reproductive fitness? I think the latter seems much more plausible, but I also don't see why the distinction should be so cruxy...

Reproductive fitness implies something that's quite mathematizable, but with relatively "fake" models

I was thinking of the difference between the theory of electromagnetism vs the idea that there's a reproductive fitness function, but that it's very hard to realistically mathematise or actually determine what it is. The difference between the theory of electromagnetism and mathematical theories of population genetics (which are quite mathematisable but again deal with 'fake' models and inputs, and which I guess are more like what you mean?) is smaller, and if pressed I'm unsure which theory rationality will end up closer to.

Separately, I feel weird having people ask me about why things are 'cruxy' when I didn't initially say that they were and without the context of an underlying disagreement that we're hashing out. Like, either there's some misunderstanding going on, or you're asking me to check all the consequences of a belief that I have compared to a different belief that I could have, which is hard for me to do.

I am curious why you expect electromagnetism-esque levels of mathematical modeling. Even AIXI describes a heavy dependence on programming language. Any theory of bounded rationality which doesn't ignore poly-time differences (ie, anything "closer to the ground" than logical induction) has to be hardware-dependent as well.

I confess to being quite troubled by AIXI's language-dependence and the difficulty in getting around it. I do hope that there are ways of mathematically specifying the amount of computation available to a system more precisely than "polynomial in some input", which should be some input to a good theory of bounded rationality.

If I didn't believe the above,

What alternative world are you imagining, though?

I think I was imagining an alternative world where useful theories of rationality could only be about as precise as theories of liberalism, or current theories about why England, and no other country, had an industrial revolution when it did.

Comment by danielfilan on Realism about rationality · 2020-01-13T02:49:29.608Z · score: 2 (1 votes) · LW · GW

I think the mathematical theory of natural selection + the theory of DNA / genes were probably very influential in both medicine and biology, because they make very precise predictions and the real world is a very good fit for the models they propose. (That is, they are "real", in the sense that "real" is meant in the OP.)

In contrast, I think the general insight of "each part of these organisms has been designed by a local hill-climbing process to maximise reproduction" would not have been very influential in either medicine or biology, had it not been accompanied by the math.

But surely you wouldn't get the mathematics of natural selection without the general insight, and so I think the general insight deserves to get a bunch of the credit. And both the mathematics of natural selection and the general insight seem pretty tied up to the notion of 'reproductive fitness'.

Comment by danielfilan on Realism about rationality · 2020-01-13T02:30:48.568Z · score: 2 (1 votes) · LW · GW

Ah, I didn't quite realise you meant to talk about "human understanding of the theory of evolution" rather than evolution itself. I still suspect that the theory of evolution is so fundamental to our understanding of biology, and our understanding of biology so useful to humanity, that if human understanding of evolution doesn't contribute much to human welfare it's just because most applications deal with pretty long time-scales.

(Also I don't get why this discussion is treating evolution as 'non-real': stuff like the Price equation seems pretty formal to me. To me it seems like a pretty mathematisable theory with some hard-to-specify inputs like fitness.)

Comment by danielfilan on Realism about rationality · 2020-01-13T02:10:02.917Z · score: 2 (1 votes) · LW · GW

Sorry, how is this not saying "people who don't know evo-psych don't get anything out of knowing evo-psych"?

Comment by danielfilan on Realism about rationality · 2020-01-13T00:41:58.599Z · score: 8 (4 votes) · LW · GW
  • I believe in some form of rationality realism: that is, that there's a neat mathematical theory of ideal rationality that's in practice relevant for how to build rational agents and be rational. I expect there to be a theory of bounded rationality about as mathematically specifiable and neat as electromagnetism (which after all in the real world requires a bunch of materials science to tell you about the permittivity of things).
  • If I didn't believe the above, I'd be less interested in things like AIXI and reflective oracles. In general, the above tells you quite a bit about my 'worldview' related to AI.
  • Searching for beliefs I hold for which 'rationality realism' is crucial by imagining what I'd conclude if I learned that 'rationality irrealism' was more right:
    • I'd be more interested in empirical understanding of deep learning and less interested in an understanding of learning theory.
    • I'd be less interested in probabilistic forecasting of things.
    • I'd want to find some higher-level thing that was more 'real'/mathematically characterisable, and study that instead.
    • I'd be less optimistic about the prospects for an 'ideal' decision and reasoning theory.
  • My research depends on the belief that rational agents in the real world are likely to have some kind of ordered internal structure that is comprehensible to people. This belief is informed by rationality realism but distinct from it.
Comment by danielfilan on Realism about rationality · 2020-01-13T00:25:59.231Z · score: 4 (2 votes) · LW · GW

In contrast, I struggle to name a way that evolution affects an everyday person

I'm not sure what exactly you mean, but examples that come to mind:

  • Crops and domestic animals that have been artificially selected for various qualities.
  • The medical community encouraging people to not use antibiotics unnecessarily.
  • [Inheritance but not selection] The fact that your kids will probably turn out like you without specific intervention on your part to make that happen.
Comment by danielfilan on How has the cost of clothing insulation changed since 1970 in the USA? · 2020-01-12T23:36:42.258Z · score: 4 (2 votes) · LW · GW

On Facebook, Stefan Schubert linked to this CPI chart showing that the price of clothing has declined relative to the prices of other things over the past 20 years.

Comment by danielfilan on Unrolling social metacognition: Three levels of meta are not enough. · 2020-01-12T04:00:41.333Z · score: 9 (4 votes) · LW · GW

It would help me to evaluate this post if people who have attempted this technique would report on how it went, and whether or not they found that they reached more than three levels of meta.

Comment by danielfilan on Please Critique Things for the Review! · 2020-01-12T02:58:15.763Z · score: 4 (2 votes) · LW · GW

FWIW from a karma perspective I've found writing reviews to be significantly more profitable than most comments. IDK how this translates into social prestige though.

Comment by danielfilan on Realism about rationality · 2020-01-10T06:51:04.792Z · score: 4 (2 votes) · LW · GW

To answer the easy part of this question/remark, I don't work at MIRI and don't research agent foundations, so I think I shouldn't count as a "MIRI person", despite having good friends at MIRI and having interned there.

(On a related note, it seems to me that the terminology "MIRI person"/"MIRI cluster" obscures intellectual positions and highlights social connections, which makes me wish that it was less prominent.)

Comment by danielfilan on Voting Phase of 2018 LW Review (Deadline: Sun 19th Jan) · 2020-01-08T15:41:03.846Z · score: 10 (3 votes) · LW · GW

One wish I have related to the voting is that either there were far fewer posts that got to the voting stage, or that I could choose to vote on just a subset of posts, rather than all of them. As it is, I'm somewhat intimidated by the prospect of reading 75 posts to decide how to vote on each one.

Comment by danielfilan on Raemon's Scratchpad · 2019-12-31T23:52:55.347Z · score: 2 (1 votes) · LW · GW

I like the idea of this song existing. Any progress?

Comment by danielfilan on Good goals for leveling up? · 2019-12-29T06:56:16.239Z · score: 7 (4 votes) · LW · GW

I don't know how general this is, but I think that the online probabilistic forecasting community is joinable, and has affordances to improve (e.g. gather relevant data and make a linear model, make 50 predictions about your personal life and see how calibrated you are). Relevant websites include Metaculus, Foretold, and the Good Judgement Project. That being said, after a certain point I think this has limited practical benefits.
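As a minimal sketch of what "see how calibrated you are" might look like in practice (the data and probability buckets below are made up purely for illustration), one could group predictions by stated probability and compare each group's stated probability with the fraction that actually came true:

```python
from collections import defaultdict

# Hypothetical example data: (probability you assigned, whether it came true)
predictions = [
    (0.9, True), (0.9, True), (0.9, False),
    (0.7, True), (0.7, False), (0.7, True),
    (0.3, False), (0.3, False), (0.3, True),
]

# Group predictions into buckets by the stated probability.
buckets = defaultdict(list)
for prob, outcome in predictions:
    buckets[round(prob, 1)].append(outcome)

# For each bucket, compare the stated probability with the observed frequency.
for prob in sorted(buckets):
    outcomes = buckets[prob]
    observed = sum(outcomes) / len(outcomes)
    print(f"stated {prob:.0%}: happened {observed:.0%} of the time ({len(outcomes)} predictions)")
```

With 50 or so real predictions, large gaps between stated probability and observed frequency in a bucket suggest over- or under-confidence at that level.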

Comment by danielfilan on The LessWrong 2018 Review · 2019-12-28T19:24:50.031Z · score: 2 (1 votes) · LW · GW

Hopefully it's different if you explicitly say "vote for helpful reviews, not just reviews that you agree with", or if you have one button for "I agree with this review" and a different button for "This review was helpful for my assessment of the post" (and it's possible to select both buttons).

Comment by danielfilan on The LessWrong 2018 Review · 2019-12-28T17:26:55.268Z · score: 2 (1 votes) · LW · GW

For what it's worth, I bid for the review prizes to be based off of people voting for which reviews were useful. The alternatives, and why I think they're worse:

  • Karma mixes "I found this review useful" and "I already agree with this review but am glad somebody said it", which can reward things which everybody already knew (but I guess both components are important).
  • Moderator's picks have the problem that moderators suffer from the curse of knowledge, and may not be in touch with what's useful for the average voter.
Comment by danielfilan on Towards a New Impact Measure · 2019-12-28T16:47:47.627Z · score: 7 (3 votes) · LW · GW

I'm curious whether these are applications I've started to gesture at in Reframing Impact

I confess that it's been a bit since I've read that sequence, and it's not obvious to me how to go from the beginnings of gestures to their referents. Basically what I mean is 'when trying to be cooperative in a group, preserve generalised ability to achieve goals', nothing more specific than that.

Comment by danielfilan on Towards a New Impact Measure · 2019-12-28T05:34:59.189Z · score: 8 (4 votes) · LW · GW

Note: this is on balance a negative review of the post, at least regarding the question of whether it should be included in a "Best of LessWrong 2018" compilation. I feel somewhat bad about writing it given that the author has already written a review that I regard as negative. That being said, I think that reviews of posts by people other than the author are important for readers looking to judge posts, since authors may well have distorted views of their own works.

  • The idea behind AUP, that ‘side effect avoidance’ should mean minimising changes in one’s ability to achieve arbitrary goals, seems very promising to me. I think the idea and its formulation in this post substantially moved forward the ‘impact regularisation’ line of research. This represents a change in opinion since I wrote this comment.
  • I think that this idea behind AUP has fairly obvious applications to human rationality and cooperation, although they aren’t spelled out in this post. This seems like a good candidate for follow-up work.
  • This post is very long, confusing to me in some sections, and contains a couple of English and mathematical typos.
  • I still believe that the formalism presented in this post has some flaws that make it not suitable for canonisation. For more detail, see my exchange in the descendants of this comment - I still mostly agree with my claims about the technical aspects of AUP as presented in this post. Fleshing out these details is also, in my opinion, a good candidate for follow-up work.
  • I think that the ideas behind AUP that I’m excited about are better communicated in other posts by TurnTrout.
Comment by danielfilan on What spiritual experiences have you had? · 2019-12-27T22:39:07.554Z · score: 2 (1 votes) · LW · GW

Note that I would not usually describe this as a spiritual experience.

Comment by danielfilan on Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think · 2019-12-27T05:48:44.277Z · score: 6 (4 votes) · LW · GW

I think people quite frequently tell unambiguous lies of the form "I have read these terms and conditions".

Comment by danielfilan on Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think · 2019-12-27T05:45:05.577Z · score: 2 (1 votes) · LW · GW

'IIRC' because I remember being asked this question multiple times and lying once as an answer, but don't remember exactly who was around or who asked the time I remember lying, and am not certain that I actually lied as opposed to being very evasive or murmuring nonsensical syllables or something.

Comment by danielfilan on Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think · 2019-12-27T05:43:09.005Z · score: 11 (5 votes) · LW · GW

When's the last time you needed to consciously tell a bald-faced, unambiguous lie?—something that could realistically be outright proven false in front of your peers, rather than dismissed with a "reasonable" amount of language-lawyering.

I don't know about the last time I needed to do so, but the last time I did so was two days ago (Christmas Eve), when (IIRC) one of my grandparents asked me if I had brought board games to my aunt and uncle's house while in the presence of my aunt, uncle, and/or cousins. In fact I had, but didn't want to say that, because I had brought them as Christmas gifts for my aunt and uncle's family, and didn't want to reveal that fact, and didn't think I could get away with being evasive, so (again, IIRC) I lied about bringing them.

I have a pretty strong preference against literal/unambiguous lying, and usually I can get away with evasion when I want to conceal things. I don't remember unambiguously lying before this, but I'm bad at remembering things and wouldn't be all that surprised if somebody showed me a recording of me telling a bald-faced lie at some other point during December.

Comment by danielfilan on What spiritual experiences have you had? · 2019-12-27T05:32:25.494Z · score: 8 (5 votes) · LW · GW

I once lay down on the floor of an empty bedroom, went through thinking of every thing and/or person and/or group of people I could think of, and thought about how excellent/beautiful/fitting they were, for something like an hour (not on purpose, it just sort of happened).

Comment by danielfilan on We run the Center for Applied Rationality, AMA · 2019-12-22T20:55:20.954Z · score: 26 (8 votes) · LW · GW

To reduce sampling error you could ask everyone again.

Comment by danielfilan on We run the Center for Applied Rationality, AMA · 2019-12-22T07:33:08.784Z · score: 10 (7 votes) · LW · GW

For what it's worth, I think that saying "Person X tends to err in Y direction" does not mean "Person X endorses or believes Y".

Comment by danielfilan on We run the Center for Applied Rationality, AMA · 2019-12-19T21:28:02.471Z · score: 43 (19 votes) · LW · GW

What organisation, if it existed and ran independently of CFAR, would be the most useful to CFAR?

Comment by danielfilan on We run the Center for Applied Rationality, AMA · 2019-12-19T20:28:05.676Z · score: 47 (17 votes) · LW · GW

My impression is that CFAR has moved towards a kind of instruction whose goal is personal growth and increasing one's ability to think clearly about very personal/intuition-based matters, and puts significantly less emphasis on things like explicit probabilistic forecasting, which are probably less important but have objective benchmarks for success.

  1. Do you think that this is a fair characterisation?
  2. How do you think these styles of rationality should interact?
  3. How do you expect CFAR's relative emphasis on these styles of rationality to evolve over time?
Comment by danielfilan on Daniel Kokotajlo's Shortform · 2019-12-18T00:24:36.858Z · score: 4 (2 votes) · LW · GW

I'm thinking about making a version in Foretold.io, since that's where people who are excited about making predictions live.

Well, many of them live on Metaculus.

Comment by danielfilan on When would an agent do something different as a result of believing the many worlds theory? · 2019-12-16T19:20:25.922Z · score: 2 (1 votes) · LW · GW

I think it's more natural to ask "how might an agent behave differently as a result of believing an objective collapse theory?" One answer that comes to mind is that they will be less likely to invest in quantum computers, which will need to rely on entanglement between a large number of quantum systems, entanglement that might not be maintained under objective collapse theories (depending on the exact collapse theory). Similarly, different physical theories of quantum mechanics will result in different predictions about what will happen in various somewhat-arcane situations.

More flippantly, an agent might answer the question 'What do you think the right theory of quantum mechanics is?' differently.

[Edited to put the serious answer where people will see it in the preview]

Comment by danielfilan on My attempt to explain Looking, insight meditation, and enlightenment in non-mysterious terms · 2019-12-12T23:06:54.572Z · score: 8 (4 votes) · LW · GW

Well, I'm significantly more confident that at least one is wrong than about any particular one being wrong. That being said:

  • It seems wrong to claim that meditation tells people the causes of mental processes. You can often learn causal models from observations, but it's tricky, and my guess is that people don't do it automatically.
  • I don't think that most people implicitly act like they need to avoid mental experiences.
  • I don't know if 'suffering' is the right word for what painful experiences cause, but it sure seems like they are bad and worth avoiding.
  • My guess is that unsatisfactoriness is not a fundamental aspect of existence.

That being said, there's enough wiggle room in these claims that the intended meanings would be things that I'd agree with, and I also think that there's a significant shot that I'm wrong about all of the above.

Comment by danielfilan on Bottle Caps Aren't Optimisers · 2019-12-12T22:47:28.141Z · score: 15 (4 votes) · LW · GW

Review by the author:

I continue to endorse the contents of this post.

I don't really think about the post that much, but the post expresses a worldview that shapes how I do my research - that agency is a mechanical fact about the workings of a system.

To me, the main contribution of the post is setting up a question: what's a good definition of optimisation that avoids the counterexamples of the post? Ideally, this definition would refer to or correspond to the mechanistic properties of the system, so that people could somehow statically determine whether a given controller was an optimiser. To the best of my knowledge, no such definition has been developed. As such, I see the post as not having kicked off a fruitful public conversation, and its value, if any, lies in how it has changed the way other people think about optimisation.

Comment by danielfilan on My attempt to explain Looking, insight meditation, and enlightenment in non-mysterious terms · 2019-12-12T22:28:44.641Z · score: 29 (8 votes) · LW · GW

As far as I can tell, this post successfully communicates a cluster of claims relating to "Looking, insight meditation, and enlightenment". It's written in a quite readable style that uses a minimum of metaphorical language or Buddhist jargon. That being said, likely because its focus is exposition rather than persuasion, it contains and relies on several claims that are not supported in the text, such as:

  • Many forms of meditation successfully train cognitive defusion.
  • Meditation trains the ability to have true insights into the mental causes of mental processes.
  • "Usually, most of us are - on some implicit level - operating off a belief that we need to experience pleasant feelings and need to avoid experiencing unpleasant feelings."
  • Flinching away from thoughts of painful experiences is what causes suffering, not the thoughts of painful experiences themselves, nor the actual painful experiences.
  • Impermanence, unsatisfactoriness, and no-self are fundamental aspects of existence that "deep parts of our minds" are wrong about.

I think that all of these are worth doubting without further evidence, and I think that some of them are in fact wrong.

If this post were coupled with others that substantiated the models that it explains, I think that that would be worthy of inclusion in a 'Best of LW 2018' collection. However, my tentative guess is that Buddhist psychology is not an important enough set of claims that a clear explanation of it deserves to be signal-boosted in such a collection. That being said, I could see myself being wrong about that.

Comment by danielfilan on Concretize Multiple Ways - Los Angeles LW/SSC Meetup #139 (Wednesday, December 11th) · 2019-12-12T04:09:33.210Z · score: 4 (2 votes) · LW · GW

Please let me know how this goes!

Comment by danielfilan on DanielFilan's Shortform Feed · 2019-12-11T18:15:15.631Z · score: 5 (2 votes) · LW · GW

Better to concretise 3 ways than 1 if you have the time.

Here's a tale I've heard but not verified: in the good old days, Intrade had a prediction market on whether Obamacare would become law, which resolved negative, due to the market's definition of Obamacare.

Sometimes you're interested in answering a vague question, like 'Did Donald Trump enact a Muslim ban in his first term?' or 'Will I be single next Valentine's day?'. Standard advice is to make the question more specific and concrete, so that it can be evaluated more objectively. I think that this is good advice. However, any concretisation will inevitably miss some aspects of the original vague question that you cared about. As such, it's probably better to concretise the question in multiple ways that have different failure modes. This is sort of obvious when evaluating questions about things that have already happened, like whether a Muslim ban was enacted, but seems to be less obvious or standard in the forecasting setting. That being said, sometimes it is done - OpenPhil's animal welfare series of questions seems to me to basically be an example - to good effect.

This procedure does have real costs. Firstly, it's hard to concretise vague questions, and concretising multiple times is harder than concretising once. It's also hard to predict multiple questions, especially if they're somewhat independent as is necessary to get the benefits, meaning that each question will be predicted less well. In a prediction market context, this may well manifest in having multiple thin, unreliable markets instead of one thick and reliable one.

Comment by danielfilan on Coherence arguments do not imply goal-directed behavior · 2019-12-08T05:34:33.225Z · score: 6 (3 votes) · LW · GW

it was written without any input from me

Well, I didn't consult you in the process of writing the review, but we've had many conversations on the topic which presumably have influenced how I think about the topic and what I ended up writing in the review.

Comment by danielfilan on Coherence arguments do not imply goal-directed behavior · 2019-12-08T05:31:05.006Z · score: 4 (2 votes) · LW · GW

I think of 'coherence arguments' as including things like 'it's not possible for you to agree to give me a limitless number of dollars in return for nothing', which does imply some degree of 'goal-direction'.

Yeah, maybe I should say "coherence theorems" to be clearer about this?

Sorry, I meant theorems taking 'no limitless dollar sink' as an axiom and deriving something interesting from that.

Comment by danielfilan on Coherence arguments do not imply goal-directed behavior · 2019-12-08T01:07:44.588Z · score: 8 (4 votes) · LW · GW

Putting my cards on the table, this is my guess at the answers to the questions that I raise:

  1. I don't know.
  2. Low.
  3. Frequent if it's an 'intelligent' one.
  4. Relatively. You probably don't end up with systems that resist literally all changes to their goals, but you probably do end up with systems that resist most changes to their goals, barring specific effort to prevent that.
  5. Probably.

That being said, I think that a better definition of 'goal-directedness' would go a long way in making me less confused by the topic.

Comment by danielfilan on Coherence arguments do not imply goal-directed behavior · 2019-12-08T01:06:15.958Z · score: 40 (10 votes) · LW · GW

I think that strictly speaking this post (or at least the main thrust) is true, and proven in the first section. The title is arguably less true: I think of 'coherence arguments' as including things like 'it's not possible for you to agree to give me a limitless number of dollars in return for nothing', which does imply some degree of 'goal-direction'.

I think the post is important, because it constrains the types of valid arguments that can be given for 'freaking out about goal-directedness', for lack of a better term. In my mind, it provokes various follow-up questions:

  1. What arguments would imply 'goal-directed' behaviour?
  2. With what probability will a random utility maximiser be 'goal-directed'?
  3. How often should I think of a system as a utility maximiser in resources, perhaps with a slowly-changing utility function?
  4. How 'goal-directed' are humans likely to make systems, given that we are making them in order to accomplish certain tasks that don't look like random utility functions?
  5. Is there some kind of 'basin of goal-directedness' that systems fall in if they're even a little goal-directed, causing them to behave poorly?

Off the top of my head, I'm not familiar with compelling responses from the 'freak out about goal-directedness' camp on points 1 through 5, even though as a member of that camp I think that such responses exist. Responses from outside this camp include Rohin's post 'Will humans build goal-directed agents?'. Another response is Brangus' comment post, although I find its theory of goal-directedness uncompelling.

I think that it's notable that Brangus' post was released soon after this was announced as a contender for Best of LW 2018. I think that if this post were added to the Best of LW 2018 Collection, the 'freak out' camp might produce more of these responses and move the dialogue forward. As such, I think it should be added, both because of the clear argumentation and because of the response it is likely to provoke.

Comment by danielfilan on Bottle Caps Aren't Optimisers · 2019-12-08T00:35:35.240Z · score: 4 (2 votes) · LW · GW

I'm surprised nobody has yet replied that the two examples are both products of significant optimizers with relevant optimization targets.

Yes, this seems pretty important and relevant.

That being said, I think that that definition suggests that natural selection and/or the earth's crust are downstream from an optimiser of the number of Holiday Inns, or that my liver is downstream from an optimiser of my income, both of which aren't right.

Probably it's important to relate 'natural subgoals' to some ideal definition. That offers some hope: since 'subgoal' is really a computational notion, investigation along these lines might yield a more computational characterisation of optimisation.

[EDIT: I made this comment longer and more contentful]

Comment by danielfilan on Open & Welcome Thread - December 2019 · 2019-12-05T21:43:33.147Z · score: 4 (2 votes) · LW · GW

Also, I think there's an important difference between books and podcasts in how carefully constructed and checked they are.

Comment by danielfilan on Open & Welcome Thread - December 2019 · 2019-12-05T21:12:58.929Z · score: 5 (3 votes) · LW · GW

I listen to econtalk, but "every econtalk episode" doesn't have enough density of what I'm interested in here.

Comment by danielfilan on Open & Welcome Thread - December 2019 · 2019-12-05T20:45:16.137Z · score: 4 (2 votes) · LW · GW

What audiobooks should I listen to? Things I'd like to learn more about:

  • economic history
  • the causes of productivity of researchers and research fields
  • the evolution of intelligence
Comment by danielfilan on Birth order effect found in Nobel Laureates in Physics · 2019-12-01T01:25:14.694Z · score: 4 (2 votes) · LW · GW

This is now a hypothesis I look out for and see in many places, thanks in part to this post.

Comment by danielfilan on Historical mathematicians exhibit a birth order effect too · 2019-12-01T01:24:18.793Z · score: 2 (1 votes) · LW · GW

This is now a hypothesis I look out for and see in many places, thanks in part to this post.

Comment by danielfilan on Realism about rationality · 2019-11-22T18:26:00.208Z · score: 2 (1 votes) · LW · GW

I think that rationality realism is to Bayesianism is to rationality anti-realism as theism is to Christianity is to atheism. Just like it's feasible and natural to write a post advocating and mainly talking about atheism, despite that position being based on default skepticism and in some sense defined by theism, I think it would be feasible and natural to write a post titled 'rationality anti-realism' that focussed on that proposition and described why it was true.

Comment by danielfilan on Bottle Caps Aren't Optimisers · 2019-11-22T09:01:56.979Z · score: 14 (4 votes) · LW · GW

Daniel Filan's bottle cap example

Note that Abram Demski deserves a large part of the credit for that specific example (somewhere between 'half' and 'all'), as noted in the final sentence of the post.

Comment by danielfilan on Realism about rationality · 2019-11-22T05:49:09.450Z · score: 4 (2 votes) · LW · GW

Note that the linked technical report by Salamon, Rayhawk, and Kramar does a good job of looking at the evidence for and against 'rationality realism', or as they call it, 'the intelligibility of intelligence'.

Comment by danielfilan on Realism about rationality · 2019-11-22T05:43:05.269Z · score: 4 (2 votes) · LW · GW

I do think that it was an interesting choice for the post to be about 'realism about rationality' rather than its converse, which the author seems to subscribe to. This probably can be chalked up to it being easier to clearly see a thinking pattern that you don't frequently use, I guess?

Comment by danielfilan on Realism about rationality · 2019-11-22T05:41:22.729Z · score: 7 (4 votes) · LW · GW

This post gave a short name for a way of thinking that I naturally fall into, and implicitly pointed to the possibility that that way of thinking is mistaken. This makes a variety of discussions in the AI alignment space more tractable. I do wish that the post were more precise in characterising the position of 'realism about rationality' and its converse, or (even better) that it gave arguments for or against 'realism about rationality' (even a priors-based one, as in this closely related Robin Hanson post), but pointing to a type of proposition and giving it a name seems very valuable.

Comment by danielfilan on Open question: are minimal circuits daemon-free? · 2019-11-22T05:33:40.385Z · score: 6 (3 votes) · LW · GW

This post formulated a concrete open problem about what are now called 'inner optimisers'. For me, it added 'surface area' to the concept of inner optimisers in a way that I think was healthy and important for their study. It also spurred research that resulted in this post giving a promising framework for a negative answer.

Comment by danielfilan on The LessWrong 2018 Review · 2019-11-21T21:57:59.596Z · score: 2 (1 votes) · LW · GW

Have thrown up the question post.