Comments

Comment by bargo on Infra-Bayesian physicalism: a formal theory of naturalized induction · 2023-06-12T19:28:00.219Z

A theory of physics is mathematically quite similar to a cellular automaton. Such a theory will usually be incomplete, which we can represent in infra-Bayesianism with Knightian uncertainty. So, the "cellular automaton" has underspecified time evolution.
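
To make "underspecified time evolution" concrete, here is a minimal sketch (my own toy construction, not anything from the infra-Bayesian physicalism paper) in which the "law of physics" is not a single update rule but a set of candidate rules, and the dynamics are whatever is consistent with some rule in the set:

```python
# 1D cellular automaton with Knightian uncertainty over its update rule.
# Instead of one transition function, we keep a set of candidate rules;
# evolving a state yields the set of trajectories consistent with any rule.

def rule_to_fn(rule_number):
    """Wolfram-style rule number -> local update function on 3-cell windows."""
    bits = [(rule_number >> i) & 1 for i in range(8)]
    return lambda l, c, r: bits[(l << 2) | (c << 1) | r]

def step(state, fn):
    """One synchronous update with periodic boundary conditions."""
    n = len(state)
    return tuple(fn(state[(i - 1) % n], state[i], state[(i + 1) % n])
                 for i in range(n))

def reachable(state, rules, steps):
    """All states reachable in `steps` steps under some rule in the set."""
    frontier = {tuple(state)}
    for _ in range(steps):
        frontier = {step(s, rule_to_fn(r)) for s in frontier for r in rules}
    return frontier

# Knightian uncertainty: we only know the true rule is rule 30 or rule 110.
print(len(reachable([0, 0, 1, 0, 0], rules=[30, 110], steps=3)))
```

The infra-Bayesian move, as I understand it, is then to evaluate a hypothesis like this by its worst case over the rule set rather than by averaging over a prior on rules.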

What evidence is there that incomplete models with Knightian uncertainty are a way to turn rough models of physics into loss functions? Can the ideas behind this approach be applied to regular Bayesianism?

Comment by bargo on Unifying Bargaining Notions (2/2) · 2023-05-19T17:55:15.426Z

Typo: "bolution" should be "solution".

Comment by bargo on Logical Probability of Goldbach’s Conjecture: Provable Rule or Coincidence? · 2023-05-14T08:33:10.941Z

https://arxiv.org/abs/2211.06738 is related

Comment by bargo on Self-Reference Breaks the Orthogonality Thesis · 2023-03-02T09:12:48.692Z

Sure, I mean that it is an implementation of what you mentioned in the third-to-last paragraph.

Comment by bargo on Self-Reference Breaks the Orthogonality Thesis · 2023-03-01T18:31:25.057Z

Congratulations, you discovered Active Inference!

Comment by bargo on Curiosity as a Solution to AGI Alignment · 2023-02-28T02:42:38.883Z

By SI and ML I mean Solomonoff Induction and Machine Learning. How would you formulate this in terms of a machine that can only predict future observations?

Comment by bargo on Curiosity as a Solution to AGI Alignment · 2023-02-27T12:30:25.121Z

I want to provide feedback, but can't see the actual definition of the objective function in either case. Can you write down a sketch of how this would be implemented using existing primitives (SI, ML) so I can argue against what you're really intending?

Some preliminary thoughts:

  1. Curiosity (obtaining information) is an instrumental goal, so I'm not sure whether making it more important will produce more or less aligned systems. How will you trade off curiosity against satisfaction of human values?
  2. It's difficult to specify correctly - depending on what you mean by curiosity, the AI can start showing itself random unpredictable data (the Noisy TV problem for RL curiosity) or manipulating humans into giving it self-affirming instructions (if the goal is to minimize prediction error). Every solution I've seen so far is a hack that won't generalize to superintelligent agents; see the sketch after this list.
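
As a toy illustration of point 2 (my own construction, with made-up names): reward an agent for prediction error under a naive predictor, and a pure-noise "TV" dominates a fully learnable environment, because noise keeps prediction error high forever.

```python
import random

random.seed(0)

def prediction_error_reward(history, observation):
    """Toy curiosity reward: 1 if the agent's naive prediction was wrong.
    The "model" just predicts the most frequent past observation."""
    if not history:
        return 1.0
    prediction = max(set(history), key=history.count)
    return float(observation != prediction)

def total_reward(source, steps=1000):
    """Total curiosity reward collected while watching a given source."""
    history, total = [], 0.0
    for _ in range(steps):
        obs = source()
        total += prediction_error_reward(history, obs)
        history.append(obs)
    return total

learnable = lambda: 0                     # constant, fully predictable
noisy_tv = lambda: random.randint(0, 9)   # unlearnable noise

print("learnable:", total_reward(learnable))  # ~1: only the first step surprises
print("noisy TV:", total_reward(noisy_tv))    # ~900: stays surprising forever
```

A prediction-error maximizer therefore parks itself in front of the noise source; that is the behavior I'd want any proposed curiosity objective to provably avoid.
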
Comment by bargo on [deleted post] 2023-02-04T19:48:00.645Z

Hmm, can you elaborate on what you mean in the last sentence?