## Posts

## Comments

**MrMind**on Fusion and Equivocation in Korzybski's General Semantics · 2020-12-21T10:57:32.863Z · LW · GW

One thing to remember when talking about distinction/defusion is that it's not a free operation: if you distinguish two things that you previously considered the same, you need to store at least a bit of information more than before. That is something that demands effort and energy. Sometimes, you need to store a lot more bits. You cannot simply become superintelligent by defusing everything in sight.

Sometimes, making a distinction is important, but some other times, erasing distinctions is more important. Rationality is about creating and erasing distinctions to achieve a more truthful or more useful model.

This is also why I vowed to never object that something is "more complicated" if I cannot offer a better model: it's always very easy to inject distinctions; the harder part is to make those distinctions matter.

**MrMind**on [deleted post] 2020-10-15T14:30:08.576Z

I don't think you need the concept of evidence. In Bayesian probability, the concept of evidence is equivalent to the concept of truth; both in the sense that P(X|X) = 1, whatever you consider evidence is true, but also P(X) = 1 --> P(A /\ X) = P(A|X), you can consider true sentences as evidence without changing anything else.
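The second identity follows directly from the definition of conditional probability. This is only a sketch of the standard derivation, under the stated assumption that P(X) = 1:

```latex
P(A \mid X) = \frac{P(A \land X)}{P(X)} = \frac{P(A \land X)}{1} = P(A \land X),
\qquad
P(A) = P(A \land X) + P(A \land \lnot X) = P(A \land X) + 0 .
```

The last step uses P(A /\ not-X) <= P(not-X) = 0, so conditioning on a sentence of probability 1 leaves every other probability unchanged.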

Add to this that good rationalist practice is to never assign P(A) = 1 to anything, so that nothing is actually true or actually evidence. You can do epistemology exclusively in the hypothetical: what happens if I consider this true? And then derive consequences.

**MrMind**on Rationality and Climate Change · 2020-10-05T15:47:11.386Z · LW · GW

Well, I share the majority of your points. I think that in 30 years millions of people will try to relocate to more fertile areas. And I think that not even the firing of the clathrate gun will force humans to coordinate globally. Although I am a bit more optimistic about technology, the current status quo is broken beyond repair.

**MrMind**on What am I missing? (quantum physics) · 2020-08-24T12:38:01.393Z · LW · GW

The fact is surprising when coupled with the fact that particles do not have a definite spin direction before you measure them. The anti-correlation is maintained non-locally, but the directions are decided by the experiment.

A better example is: take two spheres, send them far away, then make one sphere spin in any orientation that you want. How surprised would you be to learn that the other sphere spins about the same axis in the opposite direction?

**MrMind**on Optimized Propaganda with Bayesian Networks: Comment on "Articulating Lay Theories Through Graphical Models" · 2020-06-30T08:46:27.676Z · LW · GW

How probable is it that someone knows their internal belief structure? How probable is it that someone who knows their internal belief structure tells you about it truthfully instead of using a self-serving lie?

**MrMind**on Is Altruism Selfish? · 2020-06-15T09:21:21.066Z · LW · GW

The causation order in the scenario is important. If the mother is instantly killed by the truck, then she cannot feel any sense of pleasure after the fact. But if you want to say that the mother feels the pleasure during the attempt or before, then I would say that the word "pleasure" here is taking on the meaning of "motivation", and the points raised by Viliam in another comment are valid: it becomes just a play on words, devoid of intrinsic content.

**MrMind**on Introduction to Introduction to Category Theory · 2020-06-11T14:34:56.559Z · LW · GW

So far, Bayesian probability has been extended to infinite sets only as a limit of continuous transfinite functions. So I'm not quite sure of the official answer to that question.

On the other hand, what I know is that even common measure theory cannot talk about the probability of a singleton if the support is continuous: no sigma-algebra on supports the atomic elements.

And if you're willing to bite the bullet, and define such an algebra through the use of a measurable cardinal, you end up with an ultrafilter that allows you to define infinitesimal quantities

**MrMind**on Introduction to Introduction to Category Theory · 2020-06-10T07:30:38.924Z · LW · GW

Under the paradigm of probability as extended logic, it is wrong to distinguish between empirical and demonstrative reasoning, since classical logic is just the limit of Bayesian probability with probabilities 0 and 1.

Besides that, category theory was born more than 70 years ago! Sure, very young compared to other disciplines, but not *so* young. Also, the work of Lawvere (the first to connect categories and logic) began in the 70's, so it dates at least forty years back.

That said, I'm not saying that category theory cannot in principle be used to reason about reasoning (the effective topos is a wonderful piece of machinery), it just cannot say that much right now about Bayesian reasoning

**MrMind**on Why Rationalists Shouldn't be Interested in Topos Theory · 2020-05-26T09:17:00.099Z · LW · GW

Yeah, my point is that they aren't truth values per se, not intuitionistic or linear or MVs or anything else

**MrMind**on Why Rationalists Shouldn't be Interested in Topos Theory · 2020-05-25T10:50:52.460Z · LW · GW

I've also dabbled in the matter, and I have two observations:

- I'm not sure that probabilities should be understood as truth values. I cannot prove it, but my gut feeling is telling me that they are two different things altogether. Sure, operations on truth values should turn into operations on probabilities, but their underlying logic is different (probabilities after all should be measures, while truth values are algebras)
- While 0 and 1 are not (good) epistemic probabilities, they are of paramount importance in any model of probability. For example, P(X|X) = 1, so 0/1 should be included in any model of probability

**MrMind**on Utility need not be bounded · 2020-05-19T13:28:08.743Z · LW · GW

The way it's used in the set theory textbooks I've read is usually this:

- define a function *successor* on a set S:
- *assume* the existence of an *inductive* set that contains a set and all its successors. This is a weak and very limited form of infinite induction.
- Use Replacement on the inductive set to define a *general* form of transfinite recursion.
- Use transfinite recursion and the union operation to define the step "taking the limit of a sequence".

So, there is indeed the assumption of a kind of infinite process before the assumption of the existence of an infinite set, but it's not (necessarily) the ordinal ω. You also can't use it to deduce anything else; you still need Replacement. The same can be said for the existence and uniqueness of the empty set, which can be deduced from the axioms of Separation.

This approach is neither equivalent to nor weaker than having transfinite recursion by fiat; it's the only correct way if you want to make the least amount of new assumptions.

Anyway, as far as I can tell, having a well defined theory of sets is crucial to the definitions of surreals, since they are based on set operations and ontology, and use infinite sets of every kind.

On the other hand, I don't understand your problem with the impredicativity of the definitions of the surreals. These are often resolved into recursive definitions and since ZF-sets are well-founded, you never run into any problem.

**MrMind**on Utility need not be bounded · 2020-05-19T07:39:16.003Z · LW · GW

> Transfinite induction does feel a bit icky in that finite prooflines you outline a process that has infinitely many steps. But as limits have a similar kind of thing going on I don't know whether it is any ickier.

Well, transfinite induction / recursion reduces (at least in ZF set theory) to the existence of an infinite set and the Replacement axioms (the image of a set under a class function is a set). I suspect you don't trust the latter.

**MrMind**on Problems with p-values · 2020-04-08T07:49:07.956Z · LW · GW

The first link in the article is broken...

**MrMind**on What Surprised Me About Entrepreneurship · 2020-04-06T07:25:58.863Z · LW · GW

Obviously, only the wolves that survive.

**MrMind**on Are veterans more self-disciplined than non-veterans? · 2020-03-23T08:19:47.170Z · LW · GW

Beware of the selection bias: even if veterans show more productivity, it could just be because the military training has selected those with higher discipline

**MrMind**on I'm leaving AI alignment – you better stay · 2020-03-12T10:23:54.667Z · LW · GW

The diagram at the beginning is very interesting. I'm curious about the arrow from relationship to results... care to explain? Does it refer to joint work or collaborations?

On the other hand, it's not surprising to me that AI alignment is a field that requires much more research and math than software writing skills... the field is completely new and not very well formalized yet, so your skill set is probably misaligned with the needs of the market.

**MrMind**on Training Regime Day 6: Seeking Sense · 2020-02-21T10:05:44.899Z · LW · GW

> The first thing that you must accept in order to seek sense properly is the claim that minds actually make sense

This is somewhat weird to me. Since Kahneman & Tversky, we know that system 2 is mostly good at rationalizing the actions taken by system 1, to create a self-coherent narrative. So not only do minds generally not make any sense, my own mind in general lacks any sense. I'm here just because my system 1 is well adjusted to this modern environment; I don't *need* to make any sense.

From this perspective, "making sense" appears to be a tiring and pointless exercise...

**MrMind**on The Bus Ticket Theory of Genius · 2019-11-25T13:36:02.341Z · LW · GW

Isn't "just the right kind of obsession" a natural ability? It's not that you can orient your 'obsessions' at will...

**MrMind**on Examples of Categories · 2019-10-10T09:59:26.986Z · LW · GW

Two of my favorite categories show that they really are everywhere: the free category on any graph and the presheaves of gamma.

The first: take any directed graph, unfocus your eyes and instead of arrows consider paths. That is a category!
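To make the "paths instead of arrows" idea concrete, here is a minimal sketch. The three-edge graph and the names `EDGES`, `Path`, `edge`, `identity` and `compose` are hypothetical, chosen only for illustration. Morphisms are paths, identities are empty paths, and composition is concatenation:

```python
from dataclasses import dataclass
from typing import Tuple

# A hypothetical toy graph: edge name -> (source vertex, target vertex).
EDGES = {
    "f": ("a", "b"),
    "g": ("b", "c"),
    "h": ("c", "a"),
}


@dataclass(frozen=True)
class Path:
    """A morphism of the free category: a composable chain of edges."""
    source: str
    target: str
    edges: Tuple[str, ...] = ()


def identity(vertex: str) -> Path:
    """The empty path at a vertex plays the role of the identity morphism."""
    return Path(vertex, vertex, ())


def edge(name: str) -> Path:
    """Every edge of the graph is a path of length one."""
    source, target = EDGES[name]
    return Path(source, target, (name,))


def compose(p: Path, q: Path) -> Path:
    """Follow p, then q; only defined when p ends where q starts."""
    if p.target != q.source:
        raise ValueError("paths are not composable")
    return Path(p.source, q.target, p.edges + q.edges)


fg = compose(edge("f"), edge("g"))  # the path a -> b -> c
```

Associativity and the identity laws hold because tuple concatenation is associative and the empty tuple is its unit.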

The second: take any finite graph. Take sets and functions that realize this graph. This is a category; moreover, you can make it dagger-compact, so you can do quantum mechanics with it. Take as the finite graph gamma, which is just two vertices with two arrows between them. Sets and functions that realize this graph are... any graph! So, CT allows you to do quantum mechanics with graphs.

Amazing!

**MrMind**on What is category theory? · 2019-10-09T13:18:34.245Z · LW · GW

Lambda calculus is, though, the internal language of a very common kind of category, so, in a sense, category theory allows lambda calculus to do computations not only with functions, but also with sets, topological spaces, manifolds, etc.

**MrMind**on Introduction to Introduction to Category Theory · 2019-10-09T12:48:09.140Z · LW · GW

While I share your enthusiasm toward categories, I find suspicious the claim that CT is the correct framework from which to understand rationality. Around here, it's mainly equated with Bayesian Probability, and the categorial grasp of probability or even measure is less than impressive. The most interesting fact I've been able to dig up is that the Giry monad is the codensity monad of the inclusion of convex spaces into measure spaces, hardly an illuminating fact (basically a convoluted way of saying that probabilities are the most general ways of forming convex combinations out of measures).

I've searched and searched for categorial answers or hints about the problem of extending probabilities to other kinds of logic (or even simply extending it to classical predicate logic), but so far I've had no luck.

**MrMind**on Odds are not easier · 2019-08-22T10:25:12.415Z · LW · GW

The difference between the two is literally a single summation, so... yeah?

**MrMind**on Occam's Razor: In need of sharpening? · 2019-08-09T13:15:36.555Z · LW · GW

I'd like to point out a source of confusion around Occam's Razor that I see you're falling for; dispelling it will make things clearer: "you should not multiply entities **without necessity**!". This means that Occam's Razor helps decide between competing theories if and only if they have the same explanatory and predictive power. But in the history of science, it was almost *never* the case that competing theories had the same power. Maybe it happened a couple of times (epicycles, the Copenhagen interpretation), but in all other instances a theory was selected not because it was simpler, but because it was much more powerful.

Contrary to popular misconception, Occam's razor gets to be used very, very rarely.

We do have, anyway, a formalization of that principle in algorithmic information theory: Solomonoff induction. An agent that, to predict the outcome of a sequence, places the highest probabilities on the shortest compatible programs, will eventually outperform every other class of predictor. The catch here is the word 'eventually': in every measure of complexity, there's a constant that offsets the values due to the definition of the reference universal Turing machine. Different references will indicate different complexities for the same first programs, but all measures will converge after a finite amount.
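The constant mentioned here is the one from the invariance theorem of Kolmogorov complexity. As a sketch in standard notation (the symbols K_U, K_V and c_{U,V} are not in the comment itself): for any two universal machines U and V there is a constant, independent of the string x, that bounds how much their complexity measures can disagree:

```latex
\exists\, c_{U,V} \;\; \forall x : \quad \lvert K_U(x) - K_V(x) \rvert \le c_{U,V}
```

The bound is uniform in x, which is why the choice of reference machine matters a lot for short strings and less and less as the data grows.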

This is also why I think that the problem of explaining thunder with "Thor vs clouds" is such a poor example of Occam's razor: Solomonoff induction is a formalization of Occam's razor for *theories*, not *explanations*. Due to the aforementioned constant, you cannot have an absolutely simpler model of a finite sequence of events. There's no such thing; it will always depend on the complexity of the starting Turing machine. However, you can have **eventually simpler** models of **infinite** sequences of events (infinite sequence predictors are equivalent to programs). In that case, the natural event program will prevail because it will allow better control of the outcomes.

**MrMind**on Philosophy as low-energy approximation · 2019-08-01T13:12:15.408Z · LW · GW

I arrived at the same conclusion when I tried to make sense of the Metaethics Sequence. My summary of Eliezer's writings is: "morality is a bunch of mental computations shared between most human beings". Morality thus grew out of our evolutionary history, and it should not be surprising that in extreme situations it might be incoherent or maladaptive.

Only if you believe that morality should be systematic, universal and coherent can you say that extreme examples are uncovering something interesting about people's morality.

Otherwise, extreme situations are as interesting as saying that people cannot mentally factor long numbers.

**MrMind**on [deleted post] 2019-08-01T10:39:52.254Z

First of all, the community around LW2.0 can only be loosely associated with a movement: I don't think there's anyone that explicitly endorses *every* technique or theory that has appeared here. LW is not CFAR, is not the Alignment forum, etc. So I would caution against enticing someone into LW by saying that the community supports this or that technique.

The main advantage of rationality, in its present stage, is defensive: if you're aspiring to be rational, you wouldn't waste time attending religious gatherings that you despise; you wouldn't waste money buying ineffective treatments (sugar pills, crystals, etc.); you wouldn't waste resources following people that mistake fiction for facts. At the moment, rationality is just a very good filter for every product, knowledge and praxis that society presents to you (hint: 99% of those things is crap).

On the other hand, what you can or should do with all the resources you're not wasting, is something rationality cannot answer in full today. Metaethics and akrasia are, after all, the greatest unsolved problems of our community.

There were notorious attempts (e.g. Torture vs Dust specks or the Basilisk), but nothing has emerged with the clarity and effectiveness of Bayesian reasoning. Effective Altruism and MIRI are perhaps the most famous examples of trying to solve the most pressing problems. A definitive framework though still eludes us.

**MrMind**on 1960: The Year The Singularity Was Cancelled · 2019-05-08T10:33:15.858Z · LW · GW

In Foerster's paper, he links the increase in productivity linearly with the increase in population. But Scott has also proposed that the rate of innovation is slowing down, due to a *logarithmic* increase of productivity from population. So maybe Foerster's model is still valid, and 1960 is only the year where we exhausted the almost linear part of progress (the "low-hanging fruit").

Perhaps nowadays we combine the exponential growth of population with the logarithmic increase in productivity from population, to get the linear economic growth we see.
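As a back-of-the-envelope sketch of why the two effects would combine into linear growth (the symbols N_0, r and a are hypothetical parameters, not taken from Foerster's paper): if population grows exponentially and productivity is proportional to the logarithm of population, then

```latex
N(t) = N_0\, e^{r t}
\quad\Longrightarrow\quad
Y(t) = a \log N(t) = a \log N_0 + a\, r\, t ,
```

which is linear in t with slope a r.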

**MrMind**on Why does category theory exist? · 2019-05-07T13:56:05.403Z · LW · GW

Algebraic topology is the discipline that studies geometries by associating them with algebraic objects (usually, groups or vector spaces) and observing how changing the underlying space affects the related algebras. In 1941, two mathematicians working in that field sought to generalize a theorem that they discovered, and needed to show that their solution was still valid for a larger class of spaces, obtained by "natural" transformations. Natural, at that point, was a term lacking a precise definition, and only meant something like "avoiding arbitrary choices", in the same way a vector space is naturally isomorphic to its double dual, while it's isomorphic to its dual only through the choice of a basis.

The need to make precise the notion of naturality for algebraic topology led them to the definition of natural transformation, which in turn required the notion of functor which in turn required the notion of category.

This answers questions 1 and 2: category theory was born to give a precise definition of naturality, and was sought to generalize the "universal coefficient theorem" to a larger class of spaces.

This story is told with a lot of details in the first paragraphs of Riehl's wonderful "Category theory in context".

To answer n° 3, though, even if category theory was rapidly expanding during the '50s and the '60s, it was only with the work of Lawvere (who I consider a genius on par with Gödel) in the '70s that it became a foundational discipline: guided by his intuitions, category theory became the unifying language for every branch of mathematics, from geometry to computation to logic to algebras. Basically, it showed how the variety of mathematical disciplines are just different ways to say the same thing.

**MrMind**on Highlights from "Integral Spirituality" · 2019-04-16T08:05:41.239Z · LW · GW

Is it really quite different, besides halo effect? It strongly depends on the detail, though if the two say the exact same thing, how are things different?

**MrMind**on Highlights from "Integral Spirituality" · 2019-04-15T13:45:27.897Z · LW · GW

The concept of "fake framework", elucidated in the original post, seems to me to be that of a model of reality that hides some complexity, sometimes even to the point of being very wrong, but that is nonetheless useful because it makes some other complex area manageable.

On the other hand, when I read the quotes you presented, I see a rich tapestry of metaphors and jargon, of which the proponent himself says that they can be wrong... but I fail completely to see what part of reality they make manageable. These frameworks seem to just add complexity to complexity, without any real leverage over reality. This brings those frameworks closer to fiction than to useful but simplified models.

For example, if there's no post-rational stage of development, what use is the advice of not confusing it with a pre-rational stage of development? If Enlightenment is not a thing, what use is the exhortation to come up with a chronologically robust definition of the same?

This to me is the most striking difference between "Integral spirituality" and say a road map. With the road map, you know exactly what is hidden and why, and it's evident how to use it. With Wilber's framework, it seems exactly the opposite.

Maybe this is due to my unfamiliarity with that material... so someone who has actually gotten something useful out of that model can chime in and share their experience, and I will stand corrected.

**MrMind**on What I've Learned From My Parents' Arranged Marriage · 2019-03-27T17:04:57.984Z · LW · GW

I'm sorry, but you cannot really learn anything from one example. I'm happy that your parents are faring well in their marriage, but if they weren't, would you have learned the same thing?

I've consulted a few statistics on arranged marriage, and they all are:

- underpowered
- showing no significant difference between autonomous and arranged marriages

The latter part is somewhat surprising for a Westerner, but given what you say, the same should be said for an Indian coming from your background.

The only conclusion I can draw fairly conclusively is that, for a long term relationship, the way or the why it started doesn't really matter.

**MrMind**on Plans are Recursive & Why This is Important · 2019-03-12T17:12:13.843Z · LW · GW

Are you familiar with the concept of fold/unfold? Folds are functions that consume structures and produce values, while unfolds do the opposite. The composition of an unfold plus a fold is called a hylomorphism, of which the factorial is a perfect example: the unfold creates a list from 1 to *n*, the fold multiplies together the entire list. Your section on the "two-fold recursion" is a perfect description of a hylomorphism: you take a goal, unfold it into a plan composed of a list of micro-steps, then you fold it by executing each one of the micro-steps in order.
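Here is a minimal sketch of the factorial as a hylomorphism. The helper names `unfold`, `fold` and `hylo` are illustrative, not taken from any particular library: the unfold grows the list 1..n from a seed, and the fold multiplies it back down to a single value.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

S = TypeVar("S")  # seed type
A = TypeVar("A")  # element type
R = TypeVar("R")  # result type


def unfold(step: Callable[[S], Optional[Tuple[A, S]]], seed: S) -> List[A]:
    """Produce a structure: call step until it returns None."""
    items: List[A] = []
    while True:
        produced = step(seed)
        if produced is None:
            return items
        item, seed = produced
        items.append(item)


def fold(combine: Callable[[R, A], R], initial: R, items: List[A]) -> R:
    """Consume a structure: combine the elements into one value, left to right."""
    result = initial
    for item in items:
        result = combine(result, item)
    return result


def hylo(step, combine, initial, seed):
    """A hylomorphism: an unfold followed by a fold."""
    return fold(combine, initial, unfold(step, seed))


def factorial(n: int) -> int:
    # Unfold the seed 1 into [1, 2, ..., n], then fold the list with multiplication.
    def count_up(k: int) -> Optional[Tuple[int, int]]:
        return None if k > n else (k, k + 1)

    return hylo(count_up, lambda acc, k: acc * k, 1, 1)
```

In the planning analogy, `count_up` plays the role of breaking a goal into micro-steps and the multiplication plays the role of executing them in order.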

**MrMind**on On Doing the Improbable · 2018-10-29T15:40:28.709Z · LW · GW

Luke already wrote that there are at least four factors that feed motivation, and the expectation of success is only one of them. No amount of expectancy can increase drive if other factors are lacking, and as Eliezer notices, it's not sane to expect only one factor to be 10x the others so that it alone powers the engine.

What Eliezer is asking is basically whether anyone has solved the basic coordination problem of mankind, and I think he knows very well that the answer to his question is no. Also, because we are operating in a relatively small mindspace (humans' system 1), the fact that no one has solved that problem in hundreds of thousands of years of cooperation points strongly toward the fact that such a solution doesn't exist.

**MrMind**on (A -> B) -> A · 2018-10-05T10:41:27.824Z · LW · GW

Re: the third point, I think it's important to differentiate between and , where is the true prediction, that is what actually happens when an agent performs the action .

is simply the outcome the agent is aiming at, while is the outcome the agent eventually gets. So maybe it's more interesting a measure of similarity in , from which you can compare the two.

**MrMind**on (A -> B) -> A · 2018-10-04T16:18:43.244Z · LW · GW

Let's say that A is the set of available actions and B is the set of consequences. A -> B is then the set of predictions, where a single prediction associates to every possible action a consequence. (A -> B) -> A is then a choice operator, that selects for each prediction an action to take.

What we have seen so far:

- There's no 'general' or 'natural' choice operator, that is, every choice operator must be based on at least a partial knowledge of the domain or the codomain;
- Unless the possible consequences are trivial, a choice operator will choose the same action for many different predictions, that is a choice operator only uses certain feature of the predictions' space and is indifferent to anything else [1];
- A choice operator defines naturally a 'preferred outcome' operator, which is simply the predicted outcome of the chosen action, and is defined by 'sandwiching' the choice operator between two predictions. I just thought *interleave* is a better name than *sandwich*. It's of type (A -> B) -> B.

[1] To show this, let be a partition of and let be the equivalence relation uniquely generated by the partition. Then

**MrMind**on (A -> B) -> A · 2018-09-13T08:42:50.932Z · LW · GW

I wonder if there are any plausible examples of this type where the constraints don't look like ordering on B and search on A.

Yes, as I have shown in my post, such operators must know at least an element of one of the domains of the function. If it knows at least an element of A, a constant function on that element has the right type. Unfortunately, it's not very interesting.

**MrMind**on (A -> B) -> A · 2018-09-13T08:12:23.446Z · LW · GW

It's interesting to notice that there's nothing with that type on hoogle (Haskell language search engine), so it's not the type of any common utility.

On the other hand, you can still say quite a bit on functions of that type, drawing from type and set theory.

First, let's name a generic function with that type k : (A -> B) -> A. It's possible to show that k cannot be parametric in both types. If it were, k : (0 -> B) -> 0 would be valid, which is absurd (0 -> B has an element!). It's also possible to show that if k is not parametric in one type, it must have access to at least an element of that type (think about and ).

A simple cardinality argument also shows that k must be many-to-one (that is, non-injective): unless B is 1 (the one element type), |A -> B| > |A|.

There is an interesting operator that uses k, which I call interleave: interleave : ((A -> B) -> A) -> (A -> B) -> B

Trivially, interleave k f = f (k f)

It's interesting because partially applying interleave to some k has the type (A -> B) -> B, which is the type of continuations, and I suspect that this is what underlies the common usage of such operators.
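The operators in this comment can be sketched on a tiny finite example. This assumes that interleave applies a prediction to the action chosen for it, that is, interleave k f = f (k f). The sets `ACTIONS` and `OUTCOMES`, and the helpers `const_choice` and `argmax_choice`, are hypothetical, picked only for illustration:

```python
from itertools import product
from typing import Callable, Dict, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")

ACTIONS: List[str] = ["wait", "act"]  # a hypothetical set A of actions
OUTCOMES: List[int] = [0, 1]          # a hypothetical set B of consequences


def interleave(k: Callable[[Callable[[A], B]], A], f: Callable[[A], B]) -> B:
    """Assumed definition: the predicted outcome of the chosen action, f(k(f))."""
    return f(k(f))


def const_choice(a0: A) -> Callable[[Callable[[A], B]], A]:
    """A choice operator that knows one element of A and ignores the prediction."""
    return lambda f: a0


def argmax_choice(f: Callable[[str], int]) -> str:
    """A choice operator that uses an ordering on B and a search over A."""
    return max(ACTIONS, key=f)


def all_predictions() -> List[Dict[str, int]]:
    """Enumerate every function A -> B as a dictionary."""
    return [dict(zip(ACTIONS, values))
            for values in product(OUTCOMES, repeat=len(ACTIONS))]


predictions = all_predictions()
chosen = [argmax_choice(p.__getitem__) for p in predictions]
preferred = [interleave(argmax_choice, p.__getitem__) for p in predictions]
```

With two actions and two outcomes there are four predictions but only two possible choices, so any choice operator here has to send several predictions to the same action, which is the cardinality argument in miniature.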

**MrMind**on Youtube channel devoted to the art of rationality · 2017-12-18T09:09:04.246Z · LW · GW

The difference would be that I'm doing it more for myself than for those out there, because I don't expect my youtube video to get out much.

I also don't know if I'll get some attention, I'm doing that entirely for myself: to leave a legacy, to look back and say that I too did something to raise the sanity waterline.

My biggest hurdle currently is video editing.

My motto: "think big, act small, move quickly". I know that my first videos will suck, I've prepared to embrace suckiness and plunge forward anyway.

**MrMind**on Youtube channel devoted to the art of rationality · 2017-12-18T09:03:01.541Z · LW · GW

Honestly, I'm not sure how explaining Bayesian thinking will help people with understanding media claims.

Sometimes important news is based entirely on the availability bias or the base rate fallacy: knowing them is important to cultivate a critical view of media. To understand why they are wrong you need probabilistic reasoning. But media awareness is just an excuse, a hook to introduce Bayesian thinking, which will allow me to also talk about how to construct a critical view of science.

**MrMind**on Youtube channel devoted to the art of rationality · 2017-12-15T13:53:22.075Z · LW · GW

These are all excellent tips, thank you!

**MrMind**on Bayes and Paradigm Shifts - or being wrong af · 2017-12-15T11:20:53.813Z · LW · GW

A much, much easier thing that still works is P(sunrise) = 1, which I expect is how ancient astronomers felt about it.

**MrMind**on Bayes and Paradigm Shifts - or being wrong af · 2017-12-14T10:48:30.744Z · LW · GW

That entirely depends on your cosmological model, and in all cosmological models I know, the sun is a definite and fixed object, so usually

**MrMind**on Will IOTA work as promized? · 2017-12-13T11:07:56.387Z · LW · GW

From what I've understood of the white paper, there's no transaction fee because, instead of rewarding active nodes like in the blockchain, the Tangle punishes inactive nodes. So when a node performs few transactions, other nodes tend to disconnect from it and in the long run an inactive node will be dropped entirely.

On the other hand, a node has only a partial copy of the entire Tangle at each time, so it is possible to keep it small even when the total volume is large.

Economically, I don't know if switching from incentives to participate to punishments for leaving makes sense.

**MrMind**on The list · 2017-12-13T10:42:12.653Z · LW · GW

With the magic of probability theory, you can convert one into the other. By the way, you yourself should search for evidence that you're wrong, as any honest intellectual would do.

**MrMind**on Bayes and Paradigm Shifts - or being wrong af · 2017-12-13T09:59:44.303Z · LW · GW

This might be a minor or a major nitpick, depending on your point of view: Laplace's rule works only if the repeated trials are thought to be independent of one another. That is why you cannot use it to predict sunrise: even without an accurate cosmological model, it's quite clear that the ball of fire rising up in the sky every morning is always the same object. But what prior you use after that information is another story...
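For reference, here is a small sketch of the rule itself, assuming the usual setting of a uniform prior over an unknown success probability and trials that are independent given that probability. The function name `rule_of_succession` is just illustrative:

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Laplace's estimate that the next trial succeeds: (s + 1) / (n + 2).

    Assumes a uniform prior over the unknown success probability and trials
    that are independent given that probability.
    """
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)
```

With no data at all this gives 1/2, and after ten successes in ten trials it gives 11/12. The estimate never reaches 1, and it is only meaningful when the independence assumption is reasonable.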

**MrMind**on The list · 2017-12-11T15:53:24.716Z · LW · GW

This has been a standard prediction since the unconscious was theorized more than a century ago, so unfortunately it's not good evidence that the model is correct. If what you've written is the only thing that the list has to say, then I would say that no, this is not worth pursuing.

**MrMind**on The list · 2017-12-11T13:12:54.568Z · LW · GW

In a vein similar to Erfeyah's comment, I think that your model needs to be developed much more. For example, what predictions does it make that are notably different from other psychological models? It's just an explanation that feels too "overfitted".

**MrMind**on Security Mindset and Ordinary Paranoia · 2017-11-28T13:19:29.189Z · LW · GW

I feel that Eliezer's dialogues are optimized for "one-pass reading", when someone reads an article once and moves along to other content. To convey certain ideas, or better yet, certain modes of thinking, they necessarily need to be very long and very repetitive, grasping the same concept from different directions.

On the other hand, I much prefer direct and concise articles that one can re-read at will, grasping a smidge of the concept at every pass. This is, though, a very unpopular format for consumption on social media, so I guess that, as long as the format is intentional, this is the reason.

**MrMind**on Arbitrary Math Questions · 2017-11-21T17:00:50.020Z · LW · GW

Probably those questions need to be polished and stated more clearly to receive a precise answer. I'll try to add something regarding the second point (the first I'm not sure I understand): from the point of view of VNM-rationality, which is the only guarantee that an agent has a utility function, you can only deduce that the utility order-type is isomorphic to R, the set of reals. So in full generality, you cannot deduce anything about the dimensionality of the utility function before stating which one it actually is.

**MrMind**on Simple refutation of the ‘Bayesian’ philosophy of science · 2017-11-10T11:32:40.562Z · LW · GW

To explain, e.g. to describe "why" something happened, is to talk about causes and effects.

I would still say that cause and effect is a subset of the kind of models that are used in statistics. A case in point is Bayesian networks, which can accommodate both probabilistic and causal relations.

I'm aware that Judea Pearl and probably others reverse the picture, and think that C&E are the real relations, which are only approximated in our mind as probabilistic relations. On that, I would say that quantum mechanics seems to point out that there is something fundamentally undetermined about our relations with cause and effect. Also, causal relations are very useful in physics, but one may want to use other models where physics is not especially relevant.

From what one may call an "instrumentalist" point of view, time is a dimension so universal that any model can compress information by incorporating it, but it is not *necessarily* so, as relativity shows us: indeed, general relativity shows us you can compress a lot of information by not explicitly talking about time, and thus by sidestepping clean causal relations (what is cause in one reference frame is effect in another).

Prediction and explanation are very very different.

I'm not aware of a theory or a model that uses vastly different entities to explain and to predict. The typical case of a physical law posits an ontology governed by a stable relation, thus using precisely the same pieces to explain the past and predict the future. Besides, such a model would be very difficult to tune: any set of data can be partitioned in any way you like between training and test, and it seems odd that a model would be so dependent on the experimenter's intent.

**MrMind**on Simple refutation of the ‘Bayesian’ philosophy of science · 2017-11-09T13:02:29.793Z · LW · GW

By ‘Bayesian’ philosophy of science I mean the position that (1) the objective of science is, or should be, to increase our ‘credence’ for true theories [...]

Phew, I thought for a moment he was about to refute the *actual* Bayesian philosophy of science...

Snark aside, as others have noticed, point 1 is highly problematic. From a broader perspective, if Bayesian probability has to inform the practice of science, then a scientist should be wary of the concept of truth. Once a model has reached probability 1, it becomes an unwieldy object: it cannot be swayed by further, contrary evidence, and if we ever encounter an impossible piece of data (impossible for that model), the whole system breaks down. It is then considered good practice to always hedge models with a small probability for 'unknown unknowns', even with our most certain beliefs. After all, humans are finite and the universe is much, much bigger.
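A tiny numerical sketch of why probability 1 is unwieldy, using a two-hypothesis Bayes update. The function name `posterior` and the numbers in the usage note are hypothetical:

```python
def posterior(prior: float, likelihood_if_true: float,
              likelihood_if_false: float) -> float:
    """Bayes' rule for a model against its negation, given one piece of data."""
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    if evidence == 0:
        # The data has probability zero under everything we believe possible.
        raise ZeroDivisionError("the data is impossible under the model")
    return prior * likelihood_if_true / evidence
```

With a prior of exactly 1, data that is a hundred times likelier if the model is false still leaves the posterior at 1: `posterior(1.0, 0.01, 0.99)` returns `1.0`. Data that the model deems impossible makes the update undefined. A hedged prior such as 0.999 moves down as expected under the same data.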

On the other hand, I don't think it's fair to say that the objective of science is either to "just explain" or "just predict". Both views are unified and expanded by the Bayesian perspective: "explanation", as far as the concept can be modelled mathematically, is fitness to data and low complexity. On the other hand, predictive power is fitness to future data, which can only be checked once the future data has been acquired. What is one man's prediction can be another man's explanation.