AXRP Episode 6 - Debate and Imitative Generalization with Beth Barnes 2021-04-08T21:20:12.891Z
AXRP Episode 5 - Infra-Bayesianism with Vanessa Kosoy 2021-03-10T04:30:10.304Z
Privacy vs proof of character 2021-02-28T02:03:31.009Z
AXRP Episode 4 - Risks from Learned Optimization with Evan Hubinger 2021-02-18T00:03:17.572Z
AXRP Episode 3 - Negotiable Reinforcement Learning with Andrew Critch 2020-12-29T20:45:23.435Z
AXRP Episode 2 - Learning Human Biases with Rohin Shah 2020-12-29T20:43:28.190Z
AXRP Episode 1 - Adversarial Policies with Adam Gleave 2020-12-29T20:41:51.578Z
Cognitive mistakes I've made about COVID-19 2020-12-27T00:50:05.212Z
Announcing AXRP, the AI X-risk Research Podcast 2020-12-23T20:00:00.841Z
Security Mindset and Takeoff Speeds 2020-10-27T03:20:02.014Z
Robin Hanson on whether governments can squash COVID-19 2020-03-19T18:23:57.574Z
Should we all be more hygenic in normal times? 2020-03-17T06:14:23.093Z
Did any US politician react appropriately to COVID-19 early on? 2020-03-17T06:12:31.523Z
An Analytic Perspective on AI Alignment 2020-03-01T04:10:02.546Z
How has the cost of clothing insulation changed since 1970 in the USA? 2020-01-12T23:31:56.430Z
Do you get value out of contentless comments? 2019-11-21T21:57:36.359Z
What empirical work has been done that bears on the 'freebit picture' of free will? 2019-10-04T23:11:27.328Z
A Personal Rationality Wishlist 2019-08-27T03:40:00.669Z
Verification and Transparency 2019-08-08T01:50:00.935Z
DanielFilan's Shortform Feed 2019-03-25T23:32:38.314Z
Robin Hanson on Lumpiness of AI Services 2019-02-17T23:08:36.165Z
Test Cases for Impact Regularisation Methods 2019-02-06T21:50:00.760Z
Does freeze-dried mussel powder have good stuff that vegan diets don't? 2019-01-12T03:39:19.047Z
In what ways are holidays good? 2018-12-28T00:42:06.849Z
Kelly bettors 2018-11-13T00:40:01.074Z
Bottle Caps Aren't Optimisers 2018-08-31T18:30:01.108Z
Mechanistic Transparency for Machine Learning 2018-07-11T00:34:46.846Z
Research internship position at CHAI 2018-01-16T06:25:49.922Z
Insights from 'The Strategy of Conflict' 2018-01-04T05:05:43.091Z
Meetup : Canberra: Guilt 2015-07-27T09:39:18.923Z
Meetup : Canberra: The Efficient Market Hypothesis 2015-07-13T04:01:59.618Z
Meetup : Canberra: More Zendo! 2015-05-27T13:13:50.539Z
Meetup : Canberra: Deep Learning 2015-05-17T21:34:09.597Z
Meetup : Canberra: Putting Induction Into Practice 2015-04-28T14:40:55.876Z
Meetup : Canberra: Intro to Solomonoff induction 2015-04-19T10:58:17.933Z
Meetup : Canberra: A Sequence Post You Disagreed With + Discussion 2015-04-06T10:38:21.824Z
Meetup : Canberra HPMOR Wrap Party! 2015-03-08T22:56:53.578Z
Meetup : Canberra: Technology to help achieve goals 2015-02-17T09:37:41.334Z
Meetup : Canberra Less Wrong Meet Up - Favourite Sequence Post + Discussion 2015-02-05T05:49:29.620Z
Meetup : Canberra: the Hedonic Treadmill 2015-01-15T04:02:44.807Z
Meetup : Canberra: End of year party 2014-12-03T11:49:07.022Z
Meetup : Canberra: Liar's Dice! 2014-11-13T12:36:06.912Z
Meetup : Canberra: Econ 101 and its Discontents 2014-10-29T12:11:42.638Z
Meetup : Canberra: Would I Lie To You? 2014-10-15T13:44:23.453Z
Meetup : Canberra: Contrarianism 2014-10-02T11:53:37.350Z
Meetup : Canberra: More rationalist fun and games! 2014-09-15T01:47:58.425Z
Meetup : Canberra: Akrasia-busters! 2014-08-27T02:47:14.264Z
Meetup : Canberra: Cooking for LessWrongers 2014-08-13T14:12:54.548Z
Meetup : Canberra: Effective Altruism 2014-08-01T03:39:53.433Z
Meetup : Canberra: Intro to Anthropic Reasoning 2014-07-16T13:10:40.109Z


Comment by DanielFilan on The Scout Mindset - read-along · 2021-04-22T18:14:10.381Z · LW · GW

OK there's something important here I think. To some degree, I 'identify' as being an Australian, due in part to the fact that I am an Australian (but also in part to the fact that I don't live in Australia). But I don't think of Australians as an embattled group, and I also don't think this identity hinders my ability to reason about Australian affairs. So maybe there's a thing where there are different ways people can have identities that have different impacts on rationality.

Comment by DanielFilan on Does the lottery ticket hypothesis suggest the scaling hypothesis? · 2021-04-22T16:59:05.614Z · LW · GW

None of those quotes claim that training just reinforces the 'winning tickets'. Also those are referred to as the "strong" or "multi-ticket" LTH.

Comment by DanielFilan on Does the lottery ticket hypothesis suggest the scaling hypothesis? · 2021-04-22T01:01:02.525Z · LW · GW

Yep, I agree that this question does not accurately describe the lottery ticket hypothesis.

Comment by DanielFilan on [AN #147]: An overview of the interpretability landscape · 2021-04-21T18:04:50.390Z · LW · GW

Additionally, in an intuitive sense, pruning a network seems as though it could be defined in terms of clusterability notions, which limits my enthusiasm for that result.

I see what you mean, but there exist things called expander graphs which are very sparse (i.e. very pruned) but minimally clusterable. Now, these don't have a topology compatible with being a neural network, but are proofs of concept that you can prune without being clusterable. For more evidence, note that our pruned networks are more clusterable than if you permuted the weights randomly - that is, than random pruned networks.
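
A minimal sketch of the "compare against randomly permuted weights" idea, under loud assumptions: the statistic here (`within_block_mass`, the share of absolute weight on connections whose endpoints share a label in a fixed partition) is a hypothetical stand-in chosen for brevity, not the clusterability measure used in the paper, and all function names are illustrative.

```python
import random


def within_block_mass(weights, row_labels, col_labels):
    """Fraction of total |weight| on connections whose endpoints share a label.

    Stand-in statistic (an assumption for illustration), not the paper's measure.
    """
    total = 0.0
    inside = 0.0
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            total += abs(w)
            if row_labels[i] == col_labels[j]:
                inside += abs(w)
    return inside / total if total > 0 else 0.0


def shuffled(weights, rng):
    """Randomly permute the entries of a weight matrix, keeping its shape."""
    flat = [w for row in weights for w in row]
    rng.shuffle(flat)
    n_cols = len(weights[0])
    return [flat[k:k + n_cols] for k in range(0, len(flat), n_cols)]


def relative_score(weights, row_labels, col_labels, n_shuffles=200, seed=0):
    """Return (actual statistic, fraction of shuffles that score strictly lower)."""
    rng = random.Random(seed)
    actual = within_block_mass(weights, row_labels, col_labels)
    beaten = sum(
        within_block_mass(shuffled(weights, rng), row_labels, col_labels) < actual
        for _ in range(n_shuffles)
    )
    return actual, beaten / n_shuffles
```

The second return value plays the role of a 'relative' number: it says how unusual the actual weights are compared with shuffles of the same weights, which controls for sparsity and layer shape.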

Comment by DanielFilan on [AN #147]: An overview of the interpretability landscape · 2021-04-21T18:03:15.847Z · LW · GW


  • I see our results on paired images as less conclusive than the summary implies. From the paper:

Networks trained on halves-diff [i.e. paired-image] datasets are more relatively clusterable than those trained on halves-same ['pairs' of the same thing, which we used as a control] datasets, but not more absolutely clusterable... networks trained on stack-diff [paired] datasets are somewhat more clusterable, both in absolute and relative terms, than those trained on stack-same [control] datasets.

  • Appendix A.5 gives results for training MLPs on noise images with random labels. When the MLP is able to memorize the data, it's as clusterable as when trained on MNIST, but when it can't memorize, it's not particularly clusterable.
  • "Another challenge is whether networks are more modular just because in a bigger model there are more chances to find good cuts. (In other words, what's the default to which we should be comparing?)" - the answer is basically "it depends". In the paper, we present 'absolute' clusterability numbers, as well as 'relative' statistics that indicate how clusterable the network is relative to versions of the network where the weights are randomly shuffled. So relative clusterability answers this question. The ResNets are much more relatively clusterable than normal nets, meaning that their clusterability advantage isn't just because of the architecture. Other big CNNs are relatively clusterable compared to their shuffles, but not more so than most small CNNs, meaning their clusterability advantage over small nets is likely due to their aspect ratio (if nets are deep, you can 'cluster' by layer and be very clusterable). But TBC, the clusterability of non-ResNet big CNNs is not totally due to their architecture: their weights are still more clusterable than if you randomly permute them. (I think I explained this terribly, so do please ask clarifying questions.)

Comment by DanielFilan on Updating the Lottery Ticket Hypothesis · 2021-04-20T23:44:06.639Z · LW · GW

Yup, I agree that that quote says something which is probably true, given current evidence.

I don't know what the referent of 'that quote' is. If you mean the passage I quoted from the original lottery ticket hypothesis ("LTH") paper, then I highly recommend reading a follow-up paper which describes how and why it's wrong for large networks. The abstract of the paper I'm citing here:

We study whether a neural network optimizes to the same, linearly connected minimum under different samples of SGD noise (e.g., random data order and augmentation). We find that standard vision models become stable to SGD noise in this way early in training. From then on, the outcome of optimization is determined to a linearly connected region. We use this technique to study iterative magnitude pruning (IMP), the procedure used by work on the lottery ticket hypothesis to identify subnetworks that could have trained in isolation to full accuracy. We find that these subnetworks only reach full accuracy when they are stable to SGD noise, which either occurs at initialization for small-scale settings (MNIST) or early in training for large-scale settings (ResNet-50 and Inception-v3 on ImageNet).

I don't think "picking a winning lottery ticket" is a good analogy for what that implies

Again, assuming "that" refers to the claim in the original LTH paper, I also don't think it's a good analogy. But by default I think that claim is what "the lottery ticket hypothesis" refers to, given that it's a widely cited paper that has spawned a large number of follow-up works.

Comment by DanielFilan on Updating the Lottery Ticket Hypothesis · 2021-04-20T22:29:40.023Z · LW · GW

So in particular I basically disagree with the opening summary of the content of the "lottery ticket hypothesis". I think a better summary is found in the abstract of the original paper:

dense, randomly-initialized, feed-forward networks contain subnetworks (winning tickets) that—when trained in isolation—reach test accuracy comparable to the original network in a similar number of iterations

Comment by DanielFilan on The Scout Mindset - read-along · 2021-04-20T07:42:31.711Z · LW · GW

Another possible counterexample: being a gym bro is sort of an identity, but being a weak man isn't really. I imagine gym bros don't feel embattled?

Comment by DanielFilan on The Scout Mindset - read-along · 2021-04-20T07:12:09.442Z · LW · GW

A possible counterexample: I gather that many people identify as being part of the USA, which is the most powerful country on the planet. Do they think of themselves as beset by iniquity from all sides?

Comment by DanielFilan on The Scout Mindset - read-along · 2021-04-20T06:55:35.923Z · LW · GW

FWIW that sentence ends with a citation of a forum post written in 2014, so unless you're saratiara2 on WeddingBee, you can probably be confident that it isn't you.

Comment by DanielFilan on The Scout Mindset - read-along · 2021-04-20T06:51:22.670Z · LW · GW

Like, identities often feel 'morally powerful'. As the book quotes Megan McArdle saying: "The messages that make you feel great about yourself... are the ones that suggest you're a moral giant striding boldly across the landscape, wielding your inescapable ethical logic". What's so different about feeling like you're a literal giant striding boldly across the landscape, wielding your inescapable power?

Comment by DanielFilan on The Scout Mindset - read-along · 2021-04-20T06:49:00.145Z · LW · GW

Chapter 13: How Beliefs Become Identities

This chapter claims that identities tend to be things that people feel embattled about. Is this just a fact about contemporary English-speaking rich Western culture, or universal? You'd think that people could derive an identity about being in a group of powerful elites or something.

Comment by DanielFilan on The Scout Mindset - read-along · 2021-04-19T17:18:35.519Z · LW · GW

Chapter 6: How Sure Are You?

You should never see a well-calibrated person say something's impossible for it to then happen. But you also shouldn't expect Spock's predictions in the episodes to be calibrated: when he predicts well and things go normally, that's less interesting and therefore not likely to be the sort of thing an episode gets made about! (assuming that Star Trek doesn't purport to show everything that happens to those people, which may or may not be right)

Comment by DanielFilan on Updating the Lottery Ticket Hypothesis · 2021-04-19T17:12:50.243Z · LW · GW

One important thing to note here is that the LTH paper doesn't demonstrate that SGD "finds" a ticket: just that the subnetwork you get by training and pruning could be trained in isolation to higher accuracy. That doesn't mean that the weights reached in the original training are the same as those reached when the subnetwork is trained in isolation!
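
A minimal sketch of the train-then-prune step, assuming one-shot magnitude pruning on a flat list of weights. The function names are hypothetical and training itself is elided. The point it illustrates is that the "ticket" is a mask applied to the initial weights, so nothing forces retraining under that mask to reproduce the weights from the dense run.

```python
def magnitude_mask(trained_weights, sparsity):
    """0/1 mask that drops the smallest-magnitude fraction of trained weights."""
    n_prune = int(len(trained_weights) * sparsity)
    order = sorted(range(len(trained_weights)), key=lambda i: abs(trained_weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(trained_weights))]


def apply_mask(weights, mask):
    """Zero out masked weights; surviving weights are left unchanged."""
    return [w * m for w, m in zip(weights, mask)]


# The mask comes from the *trained* weights, but it is applied to the *initial* ones.
init_weights = [1.0, 2.0, 3.0, 4.0]        # illustrative initialization
trained_weights = [0.1, -2.0, 0.5, 0.05]   # illustrative result of dense training
mask = magnitude_mask(trained_weights, sparsity=0.5)
ticket_init = apply_mask(init_weights, mask)  # starting point for training in isolation
```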

Comment by DanielFilan on Intermittent Distillations #2 · 2021-04-15T19:25:26.853Z · LW · GW

Actually in the podcast he chalks it up to no common knowledge of disagreement due to no common knowledge on what exactly high-level words mean.

Comment by DanielFilan on Test Cases for Impact Regularisation Methods · 2021-04-15T00:27:27.636Z · LW · GW

OK, I now think the above comment is wrong, because proposals using stepwise inaction baselines often compare what would happen if you didn't take the current action and were inactive to what would happen if you took the current action but were inactive from then on - at least that's how it's represented in this paper.

Comment by DanielFilan on Intermittent Distillations #2 · 2021-04-14T18:26:03.175Z · LW · GW

I have a slight lingering confusion about how the assumption that agents have knowledge about other agents' beliefs interacts with Aumann's Agreement theorem, but I think it works because they don't have common knowledge about each other's rationality? I suspect I might also be misunderstanding the assumption or the theorem here.

From what I could gather from talking to Critch, this is based off thinking that Aumann Agreement relies on assumptions that don't actually hold about human reasoning, but that the probability theory in this paper is pretty similar to how human reasoning works. Technically, you could get it from uncommon priors.

Comment by DanielFilan on AXRP Episode 6 - Debate and Imitative Generalization with Beth Barnes · 2021-04-09T14:52:12.762Z · LW · GW

Yeah the initial title was not good

Comment by DanielFilan on Reflective Bayesianism · 2021-04-07T19:00:28.435Z · LW · GW

Seems to me that time has been satisfactorily mathematized.

Comment by DanielFilan on Don't Sell Your Soul · 2021-04-07T15:05:40.009Z · LW · GW

FWIW I'm pretty sure people have historically used words that get translated to 'soul' and not believed that it was immortal or went to heaven. I don't have time to read this at the moment but I guess this SEP article is relevant.

Comment by DanielFilan on Reflective Bayesianism · 2021-04-07T15:01:44.756Z · LW · GW

In my head, the argument goes roughly like this, with 'surely' to be read as 'c'mon I would be so confused if not':

  1. Surely there's some precise way the universe is.
  2. If there's some precise way the universe is, surely one could describe that way using a precise system that supports logical inference.

I guess it could fail if the system isn't 'mathematical', or something? Like I just realized that I needed to add 'supports logical inference' to make the argument support the conclusion.

Comment by DanielFilan on Don't Sell Your Soul · 2021-04-07T02:54:36.259Z · LW · GW

Out of all the possible metaphysical constructs which could 'exist', why believe that souls are particularly likely?

Because there are good candidates for what a soul might be. E.g. the algorithm that's running in your head.

Comment by DanielFilan on Reflective Bayesianism · 2021-04-07T02:51:56.947Z · LW · GW

Eg. time (particularly passing-time), consciousness (particularly qualia). If you want to know what the potentially non-mathematical features are, look at how people argue against physicalism.

I don't get why these wouldn't be mathematizable.

Formally, some formal systems can fail to describe themselves.

Sure, but for every formal system, there's some formal system that describes it (right?)

Comment by DanielFilan on Don't Sell Your Soul · 2021-04-07T01:27:56.473Z · LW · GW

One protestant friend of mine thinks that the standard Christian view is that an em would not be or have a soul.

Comment by DanielFilan on Don't Sell Your Soul · 2021-04-07T01:26:52.117Z · LW · GW

EMs? Would a religious person really think that an EM is/has a soul?

I declare replies to this comment to be devoted to getting data on this question.

Comment by DanielFilan on Don't Sell Your Soul · 2021-04-07T01:18:53.127Z · LW · GW

I mean "soul" is clearly much closer to having a meaning than "florepti xor bobble". You can tell that an em is pretty similar to being a soul but hand sanitizer is not really. You know some properties that souls are supposed to have. There are various secular accounts of what a soul is that basically match the intuition (e.g. your personality).

Comment by DanielFilan on Open & Welcome Thread – March 2021 · 2021-04-06T21:53:37.269Z · LW · GW

I think there should be a tag along the lines of commerce/trading, but I'm not sure what exactly it should be. I was sort-of-humorously going to apply it to this post. Thoughts on what tag in this realm should exist?

Comment by DanielFilan on Reflective Bayesianism · 2021-04-06T21:25:25.600Z · LW · GW

This reminds me strongly of Robin Hanson's pre-priors work. I guess the pre-prior has to do with the reflective belief, and replacing your prior over the average prior you could have been born with must be a tempting non-Bayesian update (assuming the framework makes any sense which I'm not sure it does).

Comment by DanielFilan on Reflective Bayesianism · 2021-04-06T21:12:08.966Z · LW · GW

Just referring to the primary thing.

Comment by DanielFilan on Reflective Bayesianism · 2021-04-06T21:11:43.701Z · LW · GW

But there is no necessary law saying the universe must be mathematical, any more than there's a necessary law saying the universe has to be computational.

What would a non-mathematical universe look like that's remotely compatible with ours? I guess it would have to be that there are indescribable 'features' of the universe that are real and maybe even relevant to the describable features?

I guess I'm confused because in my head "mathematical" means "describable by a formal system", and I don't know how a thing could fail to be so describable.

Comment by DanielFilan on Reflective Bayesianism · 2021-04-06T20:58:31.256Z · LW · GW

But footnote 3 refers to footnote 2 for the discussion of Maudlin's book where it should refer to footnote 1.

Comment by DanielFilan on Reflective Bayesianism · 2021-04-06T20:46:42.829Z · LW · GW

I think this way of organizing footnotes is better than most that I've seen on LW.

Comment by DanielFilan on How do we prepare for final crunch time? · 2021-03-31T23:58:28.833Z · LW · GW

Also if you count work done by people not publicly identified as motivated by existential risk, I think the concrete:abstract ratio will increase.

Comment by DanielFilan on How do we prepare for final crunch time? · 2021-03-31T23:53:28.376Z · LW · GW

Most current AI alignment work is pretty abstract and theoretical, for two reasons.

FWIW, this is not obvious to me (or at least depends a lot on what you mean by 'AI alignment'). Work at places like OpenAI, CHAI, and DeepMind tends to be relatively concrete.

Comment by DanielFilan on Logan Strohl on exercise norms · 2021-03-30T22:37:27.200Z · LW · GW

For what it's worth, to me "strength/dexterity privilege" sounds like a stat that makes you more successful at picking up heavy things or moving in a tricky manner. Similarly to how I imagine the "intelligence" stat makes you better at figuring out the solutions to problems, rather than more likely to take a nootropic when considering whether or not to do so.

Comment by DanielFilan on "Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party · 2021-03-28T22:41:50.186Z · LW · GW

I think you are wrong to think that it's overwhelmingly likely that Solomonoff will predict the even bits well.

Comment by DanielFilan on "Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party · 2021-03-28T21:00:03.310Z · LW · GW

Specifically, I'm not sure if there's some specific way that infra-Bayesianism learns this hypothesis

Well you had the misfortune to listen to a podcast where I was asking the questions, and I didn't understand infra-Bayesian learning theory and was too afraid to ask.

Comment by DanielFilan on "Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party · 2021-03-28T20:57:45.329Z · LW · GW

But to me it's not obvious how we could do better than that, given that this is inherently computationally expensive.

If the even bits are computable and the odd bits aren't, the whole sequence isn't computable so Solomonoff (plausibly) fails. You might hope that even if you can't succeed at predicting the odd bits, you could still succeed at predicting the even bits (which on their own are eminently predictable).

Comment by DanielFilan on "Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party · 2021-03-28T20:55:49.450Z · LW · GW

Notably, these are equivalent in the context of our 'expectations' being infima - if we were doing a mixture rather than taking worst-case bounds, these would not be equivalent (or rather, I don't know what it would mean to take expectations over a circumstance that didn't have any possible worlds)
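
A minimal sketch of the difference between an infimum-style 'expectation' and a mixture, assuming a heavy simplification: the set is just a finite list of ordinary probability distributions over a finite outcome space, not the full infra-distribution formalism. All names are illustrative. Because expectation is linear in the distribution, the minimum over the listed distributions equals the infimum over their convex hull.

```python
def expectation(dist, f):
    """Ordinary expectation of f under a distribution given as {outcome: prob}."""
    return sum(p * f(x) for x, p in dist.items())


def worst_case_expectation(dists, f):
    """Infimum-style 'expectation': the lowest expectation over the set."""
    return min(expectation(d, f) for d in dists)


def mixture_expectation(dists, mix_weights, f):
    """Expectation under a fixed mixture of the distributions, for contrast."""
    return sum(w * expectation(d, f) for d, w in zip(dists, mix_weights))
```

With `dists = [{"H": 0.5, "T": 0.5}, {"H": 0.2, "T": 0.8}]` and `f` paying 1 on heads, the worst case is the second distribution's value, while an even mixture lands in between the two.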

Comment by DanielFilan on "Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party · 2021-03-28T20:53:47.974Z · LW · GW

So just like an interval says "any probability in this interval might pan out", the sets are saying "I want to be able to deal with any probability distribution in this set". And the sets happen to be convex. I don't think you need to know what 'convex' means to understand the podcast episode, but I tried to give a good explanation here:

Comment by DanielFilan on "Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party · 2021-03-28T20:47:58.312Z · LW · GW

A convex set is like a generalization of an interval.

Comment by DanielFilan on "Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party · 2021-03-28T20:47:16.932Z · LW · GW

There are a few equivalent ways to view infra-distributions:

  • single infra-distribution
  • mixture of infra-distributions
  • concave functional

So far, only the 'mixture of infra-distributions' view really makes sense to me in my head. Like, I don't know how else I'd design/learn an infra-distribution. So that's a limitation of my understanding.

Comment by DanielFilan on "Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party · 2021-03-28T20:44:43.686Z · LW · GW

FWIW the first post in the infra-Bayes sequence has an example that I think gives you a clue why you need to include off-history terms into your belief update rule.

Comment by DanielFilan on Toward A Bayesian Theory Of Willpower · 2021-03-26T18:31:31.873Z · LW · GW

Adding to the metaphor here: suppose every day I, a Bayesian, am deciding what to do. I have some prior on what to do, which I update based on info I hear from a couple of sources, including my friend and the blogosphere. It seems that I should have some uncertainty over how reliable these sources are, such that if my friend keeps giving advice that in hindsight looks better than the advice I'm getting from the blogosphere, I update to thinking that my friend is more reliable than the blogosphere, and in future update more on my friend's advice than on the blogosphere's.

This means that if we take this sort of Bayesian theory of willpower seriously, it seems like you're going to have 'more willpower' if in the past the stuff that your willpower advised you to do seemed 'good'. Which sounds like the standard theory of "if being diligent pays off you'll be more diligent" but isn't: if your 'willpower/explicit reasoning module' says that X is a good idea and Y is a terrible idea, but other evidence comes in saying that Y will be great such that you end up doing Y anyway, and it sucks, you should have more willpower in the future. I guess the way this ends up not being what the Bayesian framework predicts is if what the evidence is actually for is the proposition "I will end up taking so-and-so action" - but that's loopy enough that I at most want to call it quasi-Bayesian. Or I guess you could have an uninformative prior over evidence reliability, such that you don't think past performance predicts future performance.
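
A minimal sketch of the friend-versus-blogosphere setup above, assuming each source's reliability is a fixed unknown probability of giving advice that looks good in hindsight, with a uniform Beta(1, 1) prior. The class name and the prior are assumptions for illustration.

```python
class SourceReliability:
    """Beta-Bernoulli belief about how often a source's advice turns out well."""

    def __init__(self, prior_good=1.0, prior_bad=1.0):
        # Pseudo-counts of a Beta prior; (1, 1) is uniform (an assumption).
        self.good = prior_good
        self.bad = prior_bad

    def observe(self, advice_was_good):
        """Update on one piece of advice, judged in hindsight."""
        if advice_was_good:
            self.good += 1
        else:
            self.bad += 1

    def mean(self):
        """Posterior mean reliability."""
        return self.good / (self.good + self.bad)
```

Note that `observe` updates on how the advice looks in hindsight whether or not it was followed, which is the case the comment distinguishes from "being diligent pays off".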

Comment by DanielFilan on Toward A Bayesian Theory Of Willpower · 2021-03-26T18:14:27.350Z · LW · GW

Maybe this is what you meant by "What you can have is some parameter controlling the prior itself (so that the prior can be less or more confident about certain things)."

Comment by DanielFilan on Toward A Bayesian Theory Of Willpower · 2021-03-26T18:13:41.253Z · LW · GW

I mean, there is something of a free parameter, which is how strong your prior is over 'hypotheses' vs how much of a likelihood ratio you get from observing 'evidence' (if there's a difference between hypotheses and evidence), and you can set your prior joint distribution over hypotheses and evidence however you want.

Comment by DanielFilan on Toward A Bayesian Theory Of Willpower · 2021-03-26T18:11:07.736Z · LW · GW

See also the correspondence between prediction markets of Kelly bettors and Bayesian updating.
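
A minimal sketch of one simple version of this correspondence, assuming a single binary event, bettors who maximize expected log wealth, and beliefs strictly between 0 and 1. Function names are illustrative, and the setup may differ from the one in the linked discussion. Such a bettor with belief p puts fraction p of wealth on 'yes' and 1 - p on 'no', the clearing price is the wealth-weighted average belief, and wealth shares after resolution match a Bayesian posterior whose prior is the initial wealth shares.

```python
def market_price(wealth, beliefs):
    """Wealth-weighted average belief: the price at which Kelly demands clear."""
    total = sum(wealth)
    return sum(w * p for w, p in zip(wealth, beliefs)) / total


def settle(wealth, beliefs, outcome):
    """Each bettor's wealth after the event resolves.

    A bettor with belief p spends w * p on 'yes' at price q and
    w * (1 - p) on 'no' at price 1 - q.
    """
    q = market_price(wealth, beliefs)
    if outcome:
        return [w * p / q for w, p in zip(wealth, beliefs)]
    return [w * (1 - p) / (1 - q) for w, p in zip(wealth, beliefs)]


def bayes_posterior(prior, likelihoods):
    """Posterior over hypotheses given each hypothesis's likelihood of the data."""
    joint = [pr * lk for pr, lk in zip(prior, likelihoods)]
    norm = sum(joint)
    return [j / norm for j in joint]
```

Treating each bettor as a 'hypothesis' and their belief in the observed outcome as the likelihood, `settle` and `bayes_posterior` give the same shares, and total wealth is conserved.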

Comment by DanielFilan on Introduction To The Infra-Bayesianism Sequence · 2021-03-25T16:42:16.151Z · LW · GW

Thanks for the offer, but I don't think I have room for that right now.

Comment by DanielFilan on Introduction To The Infra-Bayesianism Sequence · 2021-03-24T20:43:11.955Z · LW · GW

I agree that infra-Bayesianism isn't just thinking about sampling properties, and maybe 'statistics' is a bad word for that. But to me, the failure on transparent Newcomb without kind-of-hacky changes suggests a focus on "what actions look good throughout the probability distribution" rather than on "what logically-causes this program to succeed".

Comment by DanielFilan on Introduction To The Infra-Bayesianism Sequence · 2021-03-23T17:07:24.423Z · LW · GW

One thing I realized after the podcast is that because the decision theory you get can only handle pseudo-causal environments, it's basically trying to think about the statistics of environments rather than their internals. So my guess is that further progress on transparent Newcomb is going to have to look like adding in the right kind of logical uncertainty or something. But basically, it unsurprisingly has more of a statistical nature than what you'd imagine you want from reading the FDT paper.