Thanks. Obviously this claim needs some interpretation, but a UTM still seems a better model of the Universe than, say, any lower automaton in the Chomsky hierarchy. For the purposes of defining entropy, it's important that we can use a small base machine, plus a memory tape that we may think of as expanding in an online fashion.
It is, when dealing with sequences that go on to infinity. In that case you get the "KM complexity", from Definition 4.5.8 of Li & Vitanyi (2019). For online sequence prediction, Solomonoff's prior needs to sum the weights from every program.
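To make "sum the weights from every program" concrete, here is the standard form of Solomonoff's prior for online sequence prediction (notation follows Li & Vitányi; the monotone machine $U$ is the reference machine):

```latex
% Solomonoff's a priori probability that a sequence begins with x:
% sum over all (minimal) programs p that cause the monotone
% machine U to output some extension of x.
M(x) \;=\; \sum_{p \,:\, U(p) = x\ast} 2^{-|p|}
```

The sum over all contributing programs is what distinguishes this from the single-shortest-program quantity $K(x)$, and it's why the infinite-sequence case introduces the complications mentioned above.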
No such complications appear in the entropy of a bounded system at a fixed precision. And ultimately, it appears that for entropy to increase, you need some kind of coarse-graining, leading us to finite strings. I discuss this in the Background section and around Corollary 1.
Were those recorded!?
For direct implications, I'd like to speak with the alignment researchers who use ideas from thermodynamics. While Shannon's probabilistic information theory is suited to settings where the law of large numbers holds, algorithmic information theory should bring more clarity in messier settings that are relevant for AGI.
Less directly, I used physics as a testing ground to develop some intuitions on how to apply algorithmic information theory. The follow-up agenda is to develop a theory of generalization (i.e., inductive biases) using algorithmic information theory. A lot of AI safety concerns depend on the specific ways that AIs (mis)generalize beliefs and objectives, so I'd like us to have more precise ideas about which generalizations are likely to occur.
They are equal! As discussed in the comments to that post, the difference is at most a constant; it's possible to make that constant vanish by an appropriate choice of reference universal Turing machine.
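The "at most a constant" claim is the invariance theorem; stated for two universal machines $U$ and $V$:

```latex
% Invariance theorem: for any two universal machines U and V
% there is a constant c_{U,V}, independent of x, such that
\forall x:\quad \lvert K_U(x) - K_V(x) \rvert \;\le\; c_{U,V}
```

Since $c_{U,V}$ depends only on the pair of machines, fixing the reference machine appropriately is what lets the constant be made to vanish for the comparison in question.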