## Posts

## Comments

**kim0** on Critiquing Gary Taubes, Part 1: Mainstream Nutrition Science on Obesity · 2013-12-25T20:54:51.741Z · LW · GW

Taubes understands the value of falsification, as in the example of mice starving to death while getting fat despite eating little. This falsifies the "calories in, calories out" hypothesis, which was quite ambiguous anyway. His talks and writings contain many falsifications like this. The main post here has none.

**kim0** on The Savage theorem and the Ellsberg paradox · 2012-01-22T17:00:05.181Z · LW · GW

And yet again I am reminded why I do not frequent this supposedly rational forum more. Rationality goes over most people's heads here, except for a few really smart ones. You people make it too complicated. You write too much. Many of these supposedly deep intellectual problems have quite simple answers, such as this Ellsberg paradox. You just have to look and think a little outside their boxes to solve them, or see that they are unsolvable, or that they are the wrong questions.

I will yet again go away, to solve more useful and interesting problems on my own.

Oh, and Orthonormal, here is my correct final answer to you: You do not understand me, and this is your fault.

**kim0** on The Savage theorem and the Ellsberg paradox · 2012-01-19T07:55:52.434Z · LW · GW

Bayesian reasoning is for maximizing the probability of being right. Kelly's criterion is for maximizing aggregated value.

And yet again, the distributions of the probabilities are **different**, because they have different variances, and a difference in variance gives a different aggregated value, which is what people tend to try to optimize.

Aggregating value in this case means getting more pies and fewer boots to the head. Pies are of no value to you when you are dead from boots to the head, and this is the root cause of the preference for lower variance.

This isn't much of a discussion when you just ignore and deny my argument instead of trying to understand it.

**kim0** on The Savage theorem and the Ellsberg paradox · 2012-01-18T13:55:28.477Z · LW · GW

No, because expected value is not the same thing as variance.

Betting on red gives 1/3 winnings, exactly.

Betting on green gives 1/3 ± x winnings, and that spread is variance, which is bad.
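A minimal Monte Carlo sketch of this point, under assumptions not in the original comment: a 90-ball urn with 30 red balls, a uniform prior over the unknown green count (0 to 60), 1 unit paid per winning draw, and 20 repeated draws from the same urn. Both bets have the same mean winnings, but the ambiguous bet has much higher variance:

```python
import random
import statistics

random.seed(0)

N_TRIALS = 20000   # simulated experiments
N_DRAWS = 20       # repeated draws from the same urn per experiment

def total_winnings(p_win: float, n: int) -> int:
    """Number of wins in n draws, 1 unit paid per win."""
    return sum(1 for _ in range(n) if random.random() < p_win)

# Bet on red: the urn has 30 red balls out of 90, so p = 1/3 exactly.
red = [total_winnings(1/3, N_DRAWS) for _ in range(N_TRIALS)]

# Bet on green: the green count is unknown, uniform on 0..60 out of 90 balls.
# The count is fixed per urn, so draws within one experiment are correlated.
green = []
for _ in range(N_TRIALS):
    p = random.randint(0, 60) / 90
    green.append(total_winnings(p, N_DRAWS))

print(statistics.mean(red), statistics.mean(green))          # both near 20/3
print(statistics.variance(red), statistics.variance(green))  # green is much larger
```

The single-draw variances are actually equal; the gap appears under repeated play against the same urn, which matches the Kelly-style argument in the comments above.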

**kim0** on The Savage theorem and the Ellsberg paradox · 2012-01-17T09:23:59.176Z · LW · GW

Preferring red is rational, because it carries a known amount of risk, while each of the other two colours has an unknown risk.

This is in line with Kelly's criterion and Darwinian evolution. Negative outcomes outweigh positive ones, because negative outcomes lead to sickness and death through starvation, poverty, and kicks to the head.

This is only valid in the beginning, because when the experiment is repeated, the probabilities of blue and green become clearer.

**kim0** on Procedural Knowledge Gaps · 2011-02-08T07:59:19.534Z · LW · GW

There is often no difference at all between flirting and friendliness. People vary greatly in their ways. And yet we are supposed to tell the difference easily, under threat of imprisonment for failing.

The main effects I have seen and experienced are that flirting typically involves more eye contact, and that a lot of people flirt while denying they do it, refusing to tell what they would do if they really were flirting, and disparaging others for not knowing the difference.

My experience is also that ordinary people are much more direct and clear in the difference between flirting and friendship, while academic people muddle it.

**kim0** on Less Wrong: Open Thread, September 2010 · 2010-09-03T08:19:18.852Z · LW · GW

Most places I have worked, the reputation of the job has been quite different from the actual job. I have compared my experiences with those of friends and colleagues, and they are relatively similar. Having an M.Sc. in physics and lots of programming experience made it possible for me to hold many different kinds of engineering jobs, and thus gain more varied experience.

My conclusion is that the anthropic principle holds for me in the work place, so that each time I experience Dilbertesque situations, they are representative of typical work situations. So yes, I do think my work situation is typical.

My current job doing statistical analysis for stock analysts pays $73,000, while the average pay elsewhere is $120,000.

**kim0** on Less Wrong: Open Thread, September 2010 · 2010-09-01T09:08:23.192Z · LW · GW

I am, and I am planning to leave it to get a higher, more average pay. From my viewpoint, it is terribly overrated and undervalued.

**kim0** on Applying Behavioral Psychology on Myself · 2010-06-22T07:34:49.460Z · LW · GW

That was a damn good article!

It was short, to the point, and based on real data, and useful as well. So unlike the polite verbiage of karma whores. Even William of Ockham would have been proud of you.

Kim0+

**kim0** on Open Thread: May 2010 · 2010-05-02T12:52:36.824Z · LW · GW

I wondered how humans are grouped, so I got some gene data from around the world and did an eigenvalue analysis, and this is what I found:

http://kim.oyhus.no/EigenGenes.html

As you can see, humans are indeed clustered in subspecies.

**kim0** on Open Thread: February 2010 · 2010-04-02T06:59:34.726Z · LW · GW

**Many-Worlds explained, with pretty pictures.**

http://kim.oyhus.no/QM_explaining_many-worlds.html

The story about how I deduced the Many-Worlds interpretation, with pictures instead of formulas.

Enjoy!

**kim0** on Newcomb's problem happened to me · 2010-03-30T21:24:52.957Z · LW · GW

You all are quite good at picking up the implications, which means my post worked.

**kim0** on Even if you have a nail, not all hammers are the same · 2010-03-30T09:54:54.127Z · LW · GW

Yes, quadratic regression is often better. The problem is that the number of coefficients to adjust in the model gets squared, which goes against Ockham's razor. This is precisely the problem I am working on these days, though in the context of the oil industry.
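A small sketch of the coefficient blow-up: a full quadratic model over d features needs an intercept, d linear terms, and all d(d+1)/2 degree-2 monomials, so the parameter count grows quadratically in d:

```python
from itertools import combinations_with_replacement

def n_quadratic_terms(d: int) -> int:
    """Number of coefficients in a full quadratic model with d features:
    intercept + d linear terms + all degree-2 monomials."""
    quad = sum(1 for _ in combinations_with_replacement(range(d), 2))
    return 1 + d + quad  # quad == d * (d + 1) // 2

for d in (2, 5, 10, 50):
    print(d, n_quadratic_terms(d))  # 2→6, 5→21, 10→66, 50→1326
```

At 50 features the quadratic model already has 1326 free coefficients, which is the Ockham's-razor problem the comment points at.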

**kim0** on Even if you have a nail, not all hammers are the same · 2010-03-30T09:33:56.199Z · LW · GW

Thank you for a very nice article.

**kim0** on Newcomb's problem happened to me · 2010-03-30T09:19:07.592Z · LW · GW

More realistic:

Kate wants to increase her status by getting Joe to commit. She then becomes complacent, and then fat, lazy, and bitchy, and loses her respect for Joe, since she was able to manipulate him. She never understands any of this, and accuses Joe of her own shortcomings.

Joe never understands it either, and gets more remote, stressed, tired, workaholic, lonely, and subdued. They are both miserable.

And if anyone describes anything resembling this typical story to them, they immediately attack with accusations of political incorrectness, misogyny, discrimination, etc.

This bears on why I do not find Newcomb's problem and similar problems of much interest:

Reality is so dominated by myriads of effects that models containing beings like Omega are far too unrealistic. Just collecting data and making simple models gets one much further.

**kim0** on Advice for AI makers · 2010-01-19T08:29:24.232Z · LW · GW

I guess you down-voters of me felt quite rational when doing so.

And this is precisely the reason I seldom post here, and only read a few posters that I know are rational from their own work on the net, not from what they write here:

There are too many fake rationalists here. The absence of any real arguments either way in response to my article above is evidence of this.

My Othello/Reversi example above was easy to understand, and it concerns a very central problem in AI systems, so it should be of interest to real rationalists interested in AI. Instead there is only negative reaction, from people who, I guess, have not even made a decent game-playing AI, but nevertheless have strong opinions on how such AIs must be.

So, for getting intelligent rational arguments on AI, this community is useless, as opposed to Yudkowsky, Schmidhuber, Hansen, Tyler, etc., who have shown on their own sites that they have something to contribute.

To get real results in AI and rationality, I do my own math and science.

**kim0** on Advice for AI makers · 2010-01-18T23:39:30.434Z · LW · GW

The real dichotomy here is "maximising the evaluation function" versus "maximising the probability of a positive evaluation function".

In paperclip making, or better, the game of Othello/Reversi, there are choices like this:

80% chance of winning 60-0, versus 90% chance of winning 33-31.

The first maximises the size of the win, and is similar to a paperclip maker consuming the entire universe. The second maximises the probability of succeeding, and is similar to a paperclip maker avoiding annihilation by aliens or other unknown forces.

Mathematically, the first is similar to finding the shortest program in Kolmogorov Complexity, while the second is similar to integrating over programs.

So, friendly AI is surely of the second kind, while insane AI is of the first kind.
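A toy sketch of the dichotomy, using the Othello numbers above and the added assumption that a loss counts as margin 0: the two objectives rank the same two strategies in opposite order:

```python
# (margin if you win, probability of winning); losses counted as margin 0.
strategies = {
    "crush": (60, 0.80),  # 80% chance of winning 60-0
    "safe":  ( 2, 0.90),  # 90% chance of winning 33-31
}

def expected_margin(margin: float, p_win: float) -> float:
    """Expected value of the evaluation function (winning margin)."""
    return p_win * margin

# Objective 1: maximise the evaluation function.
best_margin = max(strategies, key=lambda s: expected_margin(*strategies[s]))
# Objective 2: maximise the probability of a positive outcome.
best_winner = max(strategies, key=lambda s: strategies[s][1])

print(best_margin)  # → crush
print(best_winner)  # → safe
```

The numbers are the ones from the comment; the point is only that the argmax flips depending on which objective the agent optimises.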

**kim0** on Advice for AI makers · 2010-01-18T20:39:06.960Z · LW · GW

You got voted down because you were rational. You went over some people's heads.

These are popularity points, not rationality points.

**kim0** on Advice for AI makers · 2010-01-18T20:28:22.971Z · LW · GW

Yes. To me it seems like all arguments for the importance of friendly AI are based on the assumption that its moral evaluation function must be correct, or it will necessarily become evil or insane, due to over-optimization of some weird aspect.

However, with uncertainty in the system, such as limited knowledge of the past, or uncertainty about what the evaluation function is, optimization should take this into account and adopt strategies that keep its options open. In the paperclip example, this would mean avoiding making people into paperclips, because the AI suspects that the paperclips might be for people.

Mathematically, an AI going evil or insane corresponds to it seeking the single most probable optimization, while pursuing multiple strategies corresponds to integrating the probabilities over different outcomes.

**kim0** on The Anthropic Trilemma · 2009-09-28T07:04:36.478Z · LW · GW

I have an Othello/Reversi playing program.

I tried making it better by applying probabilistic statistics to the game tree, quite like anthropic reasoning. It then became quite bad at playing.

Ordinary minimax with alpha-beta pruning did very well.

Game algorithms that ignore the density of states in the game tree, and only focus on minimaxing, do much better. This is a close analogy to the experience trees of Eliezer, and therefore a hint that the anthropic reasoning here contains some kind of error.
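For reference, a minimal sketch of the plain minimax-with-alpha-beta scheme being described (a generic version over an arbitrary game tree, not the author's Othello program):

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Plain minimax with alpha-beta pruning: no probabilistic weighting
    of the game tree, only the max/min of child values matters."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        value = -math.inf
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # prune: the minimizing opponent will never allow this branch
        return value
    else:
        value = math.inf
        for child in kids:
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True, children, evaluate))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value

# Toy tree: internal nodes are lists of children, leaves hold scores.
tree = [[3, 5], [2, 9], [0, 7]]
children = lambda n: n if isinstance(n, list) else []
evaluate = lambda n: n
print(alphabeta(tree, 2, -math.inf, math.inf, True, children, evaluate))  # → 3
```

Note that the branch [2, 9] is cut off after seeing the 2; how many sibling states exist never enters the computation, which is the "ignore density of states" property the comment contrasts with anthropic-style weighting.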

Kim0

**kim0** on "What Is Wrong With Our Thoughts" · 2009-05-18T20:55:07.157Z · LW · GW

What exactly makes it difficult to use Russian? I know Russian, so I will understand the explanation.

I find my native Norwegian better than English for expressing concepts. If I am programming something especially difficult, or doing some difficult math, physics, or logic, I also find Norwegian better.

However, if I do some easier task, where I have studied it in English, I find it easy to write in English, due to a "cut and paste" effect. I just remember stuff, combine it, and write it down.

**kim0** on "What Is Wrong With Our Thoughts" · 2009-05-17T21:14:28.100Z · LW · GW

Interesting, but too verbose.

The author is clearly not aware of the value of the K.I.S.S. principle, or Ockham's razor, in this context.

**kim0** on The First Koan: Drinking the Hot Iron Ball · 2009-05-12T06:00:51.902Z · LW · GW

You are wrong. Here are some links showing that Go is **not** perfectly clear:

**kim0** on The First Koan: Drinking the Hot Iron Ball · 2009-05-11T03:58:53.112Z · LW · GW

Giving it up is rational thinking, because there is no "it" there when the label is too broad.

In Bayesian inference, it is equivalent to P(A | B ∨ C ∨ D ∨ ...), which is somewhat like underfitting. The space of possibilities becomes too large for it to be possible to find a good move. In games, it is precisely the unclear parts of the game space that interest the losing side, because that is where better moves are most likely to be found. But when it is not even possible to analyze those parts, then true optimal play regresses to quarreling about it, which is precisely what the Japanese tradition has done for at least a few hundred years.

I have played enough Go to know that the concrete rules can make the endgame very different. The usual practice is to pretend it is not so, and stop the game before the endgame starts.

So Go is riddled with quarrels and pretense. Not a game in practice. More like politics, or Zen.

Optimal playing strategies in games can be very different from what people believe them to be, as exemplified by the program Eurisko, which won the Traveller TCS championships with very unconventional fleets. I strongly suspect a similar thing will happen for true Go games.

I might have found a variation of minimax that can tackle Go, but to use it, it MUST be possible to evaluate a Go position, at least in principle. So I will probably go for the Tromp-Taylor rules, if I get the time to do this. And perhaps the Japanese rules of Robert Jasiek.

**kim0** on A Request for Open Problems · 2009-05-10T11:02:24.479Z · LW · GW

I agree. We seem to have the same goal, so my first advice stands, not my second.

I am currently trying to develop a language that is both simple and expressive, and making some progress. The overall design is finished, and I am now down to what instructions it should have. It is a general bi-graph, but with a sequential program structure, and no separation of program and data.

It is somewhat different from what you want, as I also need something that has measurable use of time and memory, and is provably able to run fast.

**kim0** on The First Koan: Drinking the Hot Iron Ball · 2009-05-10T09:16:32.546Z · LW · GW

My experience is that neither the Go players nor the philosophers have deep understanding of Go. The Go players have practice in the culture of Go, while philosophers often know philosophy.

The problem is that Go is actually not a game, while people believe that it is. Go is a collection of belief systems and cultures, and in that respect it is similar to Zen.

The reason that Go is not a game is that it does not have clear rules. The only people I know who have a practical and deep understanding of Go are John Tromp, who has made some nice Go rules, and Robert Jasiek, who claims to have formalized the Japanese practice into rules.

As for me, I gave up the "game" when I realized it had no true core, and the same goes for Zen.

**kim0** on A Request for Open Problems · 2009-05-10T07:54:33.295Z · LW · GW

Then I would go for Turing machines, Lambda calculus, or similar. These languages are very simple, and can easily handle input and output.

Even simpler languages, like cellular automaton rule 110 or Combinatory Logic, might be better, but those are quite difficult to get to handle input and output correctly.

The reason simple languages, or universal machines, should be better, is that the upper bound for error in estimating probabilities is 2 to the power of the complexity of the program simulating one language in another, according to algorithmic information theory.

So, the simpler the language is, the more correct relative probabilities it will give.

The answer I gave before this one was for the question: maximize the likelihood that the language will produce the output corresponding to the observed events.

You, however, want to compare 2 hypotheses, getting 2 probabilities, and compare them. In this case the absolute size of the probabilities does not matter, just their relative size, so you can just go for the simplest language that is practical to use.

What do you want to use this for?

**kim0** on A Request for Open Problems · 2009-05-09T09:17:45.928Z · LW · GW

That depends on what you mean by "best".

Is speed of calculation important? What about suitability for humans? I guess you want one where complexities are as small as possible.

Given 2 languages, L1 & L2, and their complexity measures, K1 & K2.

If K1(L2) < K2(L1), then I take that as a sign that L1 is better for use in the context of Ockham's razor. It is also a sign that L1 is more complex than L2, but that effect can be removed by doing lots of comparisons like this, so the unnecessarily complex languages lose against those that are actually good. Or one can use 3 languages in a comparison: K1(L3) < K2(L3) is a sign that L1 is better.

The idea here is that a good language should more easily represent systems, such as other languages.

What sort of languages are best in this measure? Not the simplest ones, but rather the more expressive ones. Programs for Turing machines use some complexity for tape traversing and storage systems, so I guess they are not so good. Cellular automata have the same problems. Combinatory Logic is very simple, with complex operators even for simple stuff, but its system for storage and access can be simpler, since it is a tree. I guess Lambda Calculus is one of the better languages, as it is both simple and very expressive, because of its more direct tree structure.

I guess the best language would use an even more expressive structure, such as graphs, perhaps bi-graphs, since those are maximally expressive. The Scheme language can be used as a general graph language thanks to the set-car! and set-cdr! procedures.

**kim0** on No Universal Probability Space · 2009-05-08T22:15:09.235Z · LW · GW

Events, and the universe itself, are encodable as sequences.

This means that events, and possible universes, are a subset of the sequences generated from the universal computer.

Algorithmic information theory can then be used to find probabilities for events and universes.

This is one of the CENTRAL POINTS of Algorithmic Information Theory.

What I am doing now, is teaching you A.I.T., while you wrongly claim you understand it, and wrongly claim I do not, despite an amount of evidence to the contrary. I therefore conclude that you are not very rational.

**kim0** on No Universal Probability Space · 2009-05-08T20:03:34.988Z · LW · GW

It is universal, because every possible sequence is generated.

It is universal, because it is based on universally recursive functions.

It is universal, because it uses a universal computer.

People knowing algorithmic complexity know that it is about probability measures, spaces, universality, etc. You apparently did not, while nitpicking instead.

**kim0** on No Universal Probability Space · 2009-05-08T09:51:52.827Z · LW · GW

You are wrong because I did specify a probability space.

The probability space I specified was one where the sample space was the set of all outputs of all programs for some universal computer, and the measure was one from the book I mentioned. One could for instance choose the Solomonoff measure, from 4.4.3.

From your writings I conclude that it is quite likely that you are neither quite aware of the concept nor understanding what I write, while believing that you do.

**kim0** on No Universal Probability Space · 2009-05-07T14:43:37.826Z · LW · GW

All that will be answered if you study "algorithmic information theory", or "Kolmogorov complexity" as it is also called. You can find some of it on the net, of course, or you can read the book "An Introduction to Kolmogorov Complexity and Its Applications" by Ming Li & Paul Vitányi. That's the book I read, some years after I invented the ideas myself.

**kim0** on Hardened Problems Make Brittle Models · 2009-05-06T20:11:37.759Z · LW · GW

I guess the point is to model artificial intelligences, of which we know almost nothing, so the models and problems need the robustness of logic and simplicity.

That's why they are brittle when used for modeling people.

**kim0** on No Universal Probability Space · 2009-05-06T19:53:25.022Z · LW · GW

O.K.

One wants a universal probability space in which one can find the probability of any event. This is possible:

One way of making such a space is to take all recursive functions of some universal computer, run them, and store the outputs, resulting in a universal probability space, because every possible set of events will be there, as the results of infinitely many recursive functions, or programs as they are called. The probabilities correspond to the densities of these outputs, these events.

A counterargument is that this is too dependent on the actual universal computer chosen. However, theorems in algorithmic information theory show that this dependence converges asymptotically as information increases, because the densities of a given output on two different universal computers can differ by at most a factor of 2 to the power of the length of the shortest program simulating one universal computer on the other.
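The convergence argument above is the invariance theorem of algorithmic information theory. In one standard textbook form (not the author's exact formulation): for universal machines $U_1, U_2$ with algorithmic probabilities $m_1, m_2$,

```latex
m_1(x) \;\ge\; 2^{-c_{12}}\, m_2(x) \qquad \text{for all } x,
```

where $c_{12}$ is the length of the shortest $U_1$-program simulating $U_2$. Together with the symmetric bound via $c_{21}$, the log-probabilities the two machines assign to any output differ by at most a constant that is independent of $x$.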

Kim Øyhus

**kim0** on No Universal Probability Space · 2009-05-06T08:42:32.838Z · LW · GW

The technically precise reference was this part:

"This is algorithmic information theory,.."

But if you claim my first line was too obfuscated, I can agree.

Kim Øyhus

**kim0** on No Universal Probability Space · 2009-05-06T08:20:40.423Z · LW · GW

All recursive probability spaces converge to the same probabilities as the information increases.

Not that those people making up probabilities know anything about that.

If you want a universal probability space, just take some universal computer, run all programs on it, and keep those that output event A. Then you can see how many of those also output event B, and thus you can get p(B|A), whatever A and B are.

This is algorithmic information theory, and should be known by any black-belt Bayesian.
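A toy sketch of this enumeration, with loud caveats: the "computer" below is a made-up two-operation machine, not a universal one, and the events are string-prefix predicates. It only illustrates the shape of the construction (weight each program by 2^-length, pool the outputs, condition by restriction):

```python
from itertools import product

def run(program: str) -> str:
    """Toy 'computer' (NOT universal, purely for illustration):
    the first bit selects an operation applied to the rest."""
    if not program:
        return ""
    op, data = program[0], program[1:]
    return data if op == "0" else data[::-1] + data  # '1': mirror-and-append

MAX_LEN = 12
weight = {}  # output -> total prior weight, 2^-len per program
for n in range(1, MAX_LEN + 1):
    for bits in product("01", repeat=n):
        out = run("".join(bits))
        weight[out] = weight.get(out, 0.0) + 2.0 ** -n

def prob(event) -> float:
    """Total (unnormalised) weight of outputs satisfying `event`."""
    return sum(w for out, w in weight.items() if event(out))

A = lambda s: s.startswith("1")   # event A: output begins with 1
B = lambda s: s.startswith("11")  # event B: output begins with 11
p_B_given_A = prob(lambda s: A(s) and B(s)) / prob(A)
print(p_B_given_A)
```

With a genuinely universal machine and all programs, the same ratio construction gives the p(B|A) described in the comment; the toy version only shows the mechanics.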

Kim Øyhus

**kim0** on Without models · 2009-05-05T21:08:37.512Z · LW · GW

Very interesting article that.

However, evolution is able to test and spread many genes at the same time, thus achieving higher efficiency than the article suggests. Sort of like spread spectrum radio.

I am quite certain its speed is lower than that of some statistical methods, but not by that much. My guess is something like a constant factor slower: doubling a gene's concentration, compared to reaching one standard deviation of certainty about the goodness of the gene with Gaussian statistics.

Random binary natural testing of a gene is less accurate than statistics, but it avoids putting counters in the cells for each gene, thus shrinking the cellular machinery necessary for this sort of inference, and thus increasing the statistical power per base pair. And I know there are more complicated methods in use for some genes, such as antibodies, methylation, etc.

And then there is sexual selection, where the organisms use their brains to choose partners. This is even closer to evolution assisted by a Bayesian superintelligence.

So I guess that evolution is not so slow after all.

Kim Øyhus

**kim0** on Without models · 2009-05-05T09:44:31.318Z · LW · GW

What is your evidence for this assertion?

In my analysis, evolution by sexual reproduction can be very good at rationality, collecting about 1 bit of information per generation per individual, because an individual can be naturally selected, or die, only once.

The factors limiting the learning speed of evolution are the high cost of this information, namely death, and the fact that this is the only kind of data going into the system. And the value to be optimized is avoidance of death, which also avoids data gathering. And this optimization function is almost impossible to change.

If the genes were ideal Bayesian intelligences, they would still be limited by this high cost of data gathering. It would be something like this:

Consider yourself in a room. On the wall there is a lot of random data. You can do whatever you want with it, but whatever you do, it will change your chance of dying or being copied with no memory. The problem is that you do not know when you die or are copied. Your task is to decrease your chance of dying. This is tractable mathematically, but I find it somewhat tricky.

Kim Øyhus

**kim0** on Without models · 2009-05-05T07:04:04.215Z · LW · GW

All control systems DO have models of what they are controlling. However, the models are typically VERY simple.

A good principle for constructing control systems is: given that I have a very simple model, how do I optimize it?

The models I learned about in cybernetics were all linear, implemented as matrices, resistors and capacitors, or discrete time step filters. The most important thing was to show that the models and reality together did not result in amplification of oscillations. Then one made sure that the system actually did some controlling, and then one could fine tune it to reality to make it faster, more stable, etc.

One big advantage of linear models is that they can be inverted, and eigenvectors found. Doing equivalent stuff for other kinds of models is often very difficult, requiring lots of computation, or is simply impossible.

As someone has written here before: it is mathematically justified to consider linear control systems as having statistical models of reality, typically involving Gaussian distributions.
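A minimal sketch of the oscillation-amplification check described above, with hypothetical plant numbers: a discrete-time linear system x[k+1] = M x[k] does not amplify oscillations exactly when every eigenvalue of M has magnitude below 1:

```python
import cmath

def eig2(m):
    """Eigenvalues of a 2x2 matrix [[a, b], [c, d]] via the
    characteristic polynomial λ² - tr·λ + det = 0."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def is_stable(m):
    """x[k+1] = M x[k] does not amplify oscillations
    iff every eigenvalue of M has magnitude < 1."""
    return all(abs(lam) < 1 for lam in eig2(m))

def closed_loop(k):
    """Lightly anti-damped toy oscillator with velocity feedback gain k
    (hypothetical numbers, one Euler time step of 0.1)."""
    return [[1.0, 0.1],
            [-0.1, 1.0 - 0.1 * k]]

print(is_stable(closed_loop(0.0)))  # False: uncontrolled, oscillations grow
print(is_stable(closed_loop(2.0)))  # True: feedback damps the oscillation
```

This is the 2x2 discrete-time case only; the same eigenvalue criterion is what the analog matrix/resistor/capacitor designs mentioned above have to satisfy, in continuous time via the real parts of the poles.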

Kim Øyhus

**kim0** on How to come up with verbal probabilities · 2009-05-02T10:01:55.799Z · LW · GW

Verbal probabilities are typically impossible, because the priors are unknown and important.

However, relative probabilities and similar quantities can often be given useful estimates, or limits.

For instance: Seeing a cat is more likely than seeing a black cat because black cats are a subset of cats.

Stuff like this is the reason that pure probability calculations are not sufficient for general intelligence.

Probability distributions, however, seem to me to be sufficient. This cat example cuts the distribution in two.
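The bound in the cat example holds for any distribution; a tiny sketch with made-up sighting counts (hypothetical numbers, purely to illustrate the subset rule):

```python
from fractions import Fraction

# Hypothetical sighting counts: every black cat is a cat,
# so P(black cat) can never exceed P(cat), whatever the counts are.
sightings = {"black cat": 3, "tabby cat": 9, "dog": 8}
total = sum(sightings.values())

p_cat = Fraction(sightings["black cat"] + sightings["tabby cat"], total)
p_black_cat = Fraction(sightings["black cat"], total)

print(p_cat, p_black_cat)  # → 3/5 3/20
assert p_black_cat <= p_cat  # the subset bound needs no prior
```

This is the kind of limit the comment means: a relative ordering that survives even when the absolute priors are unknown.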

Kim Øyhus