Learn Bayes Nets!

post by abramdemski · 2018-03-27T22:00:11.632Z · score: 84 (24 votes) · LW · GW · 6 comments

It recently occurred to me that there are a lot of people in the rationalist community who want to deeply absorb intuitions about how Bayes' theorem works and how to think with it in practice, who have not been specifically told that learning inference algorithms for Bayesian networks is one of the best ways forward.

Well, I'm telling you now.

Bayesian networks were the innovation which made probabilistic reasoning really practical and interesting for artificial intelligence -- and none of the reasons for that are special to trying to squeeze intelligence into a computer. They're also, more or less, describing the way people have to think in order to do probabilistic reasoning in practice. There have been many innovations in probabilistic reasoning since Bayes nets, but those are arguably more about how to get good results on a computer and less about fundamental conceptual issues that you'll get a lot from.

I would argue that the most important inference algorithm to learn about to get practical intuitions is belief propagation. There are others who would argue for Monte Carlo algorithms, like MCMC (Markov chain Monte Carlo). You may want to learn both, to form your own opinion (and of course, there are many more algorithms beyond these which you may want to learn, in order to gain more connections so your knowledge of the field sticks, and to gain more insights). Belief prop and MCMC are more or less the first two algorithms people thought of; there are a lot of newer developments, but they're largely elaborations.
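To give a flavor of what belief propagation looks like in the simplest case, here is a minimal sketch on a three-node chain A -> B -> C with binary variables. The network and all the probabilities are made up for illustration, and the function names are hypothetical. Evidence about C is passed backwards as a "message" to B and then to A, and the result is checked against brute-force summation over the joint distribution.

```python
from itertools import product

# Hypothetical numbers for a three-node chain A -> B -> C (all binary).
p_a = [0.7, 0.3]                        # P(A)
p_b_given_a = [[0.9, 0.1], [0.2, 0.8]]  # P(B | A), rows indexed by A
p_c_given_b = [[0.8, 0.2], [0.3, 0.7]]  # P(C | B), rows indexed by B


def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def posterior_a_by_messages(c_obs):
    """P(A | C = c_obs), computed by passing messages back along the chain."""
    # Message from C to B: how well each value of B explains the evidence.
    msg_c_to_b = [p_c_given_b[b][c_obs] for b in range(2)]
    # Message from B to A: sum out B, weighting by the incoming message.
    msg_b_to_a = [
        sum(p_b_given_a[a][b] * msg_c_to_b[b] for b in range(2))
        for a in range(2)
    ]
    # Belief at A: prior times incoming message, then normalize.
    return normalize([p_a[a] * msg_b_to_a[a] for a in range(2)])


def posterior_a_by_enumeration(c_obs):
    """The same posterior, computed by summing the full joint over B."""
    weights = [0.0, 0.0]
    for a, b in product(range(2), repeat=2):
        weights[a] += p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c_obs]
    return normalize(weights)


if __name__ == "__main__":
    print(posterior_a_by_messages(1))
    print(posterior_a_by_enumeration(1))
```

On a chain this small, both functions do about the same amount of work. The point of the message-passing version is that each node only ever talks to its neighbors, so the cost grows with the size of the network rather than with the size of the joint distribution.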

Here is what I claim you can get out of it, through careful study:

I think the best thing to read, to get up to speed, is the first four chapters of Pearl's Probabilistic Reasoning in Intelligent Systems. It's the original source; Pearl didn't invent everything, but he invented a lot, and he was the first to put it all together. There are better modern introductions for people who want to apply Bayesian networks in machine learning, but because Pearl was writing at a time when the use of probability theory was not widely accepted in artificial intelligence, he goes into the philosophy of the subject in a way newer sources don't. I think this is good for the LessWrong audience.

It would be even better, of course, if someone were to write a sequence explaining everything from a more specifically LessWrong perspective, drawing out the implications I mentioned above. Alas, I don't have that much time to spend on writing (which is to say, I have other higher-value things to do, in my current estimation).

One might also derive a more general lesson on the relevance of algorithms to rationality, and go read Artificial Intelligence: A Modern Approach as a rationality textbook.

6 comments

comment by Qiaochu_Yuan · 2018-03-28T16:17:07.725Z · score: 26 (6 votes) · LW · GW
It would be even better, of course, if someone were to write a sequence explaining everything from a more specifically LessWrong perspective, drawing out the implications I mentioned above. Alas, I don't have that much time to spend on writing (which is to say, I have other higher-value things to do, in my current estimation).

+1; generally in favor of people who have interesting ideas but too many other competing ideas to execute them to just share those ideas and see if anyone else wants to pick them up. This strikes me as a particularly good "homework assignment" for someone who just really, really wants to grok Bayes nets.

comment by kinrany · 2018-03-29T17:19:05.614Z · score: 2 (1 votes) · LW · GW

Idea: Google + Reddit for ideas. Suggest and find previously suggested ideas using the search bar, then vote on ideas and discuss them in the comments.

comment by TurnTrout · 2018-03-28T01:06:47.597Z · score: 13 (4 votes) · LW · GW

Very much agree with this post. In my opinion, ch. 14 (Bayes nets) was the most important chapter in AIMA; I actually did every single non-programming exercise for that chapter, coming back later to ensure I could redo those I got wrong. I'd "learned" about probability by reading, but it wasn't until I put my nose to the grindstone and did the math that I started being able to see probability flowing through the networks, and that I got S1 intuitions for the difference between independence and conditional independence.

Of course, that was just a chapter - hardly "careful study"; I'm very much looking forward to Pearl.

comment by Gram Stone · 2018-03-27T23:45:38.105Z · score: 13 (4 votes) · LW · GW

It's also a useful analogy for aspects of group epistemics, like avoiding double counting as messages pass through the social network.

Fake Causality [LW · GW] contains an intuitive explanation of double-counting of evidence.

comment by abramdemski · 2018-03-28T01:12:05.446Z · score: 18 (5 votes) · LW · GW

Yeah, and it uses the same analogy for understanding belief propagation as Pearl himself uses, and a reference to Pearl, and a bit more discussion of Bayes nets as a good way to understand things. But, I think, a lot of people didn't derive the directive "Learn Bayes nets!" from that example of insight derived from Bayes nets (and would benefit from going and doing that).

I do think there are some other intuitions lurking in Bayes net algorithms which could benefit from a similar write-up to Fake Causality, but which went "all the way" in terms of describing Bayes nets, rather than partially summarizing.

comment by Viktor Riabtsev (viktor-riabtsev) · 2018-03-28T16:18:40.991Z · score: 11 (3 votes) · LW · GW

Ordered Probabilistic Reasoning in Intelligent Systems. Looking forward to reading it.