Posts

Can we learn much by studying the behaviour of RL policies? 2023-05-15T12:56:25.769Z
Forecasting extreme outcomes 2023-01-09T16:34:40.244Z

Comments

Comment by AidanGoth on All AGI Safety questions welcome (especially basic ones) [May 2023] · 2023-05-23T20:41:11.697Z · LW · GW

I thought it dealt with these ok -- could you be more specific?

It's linear because it's an expectation. It is under-specified in that it needs us to assume or prove the marginal distributions for the  and I guess that's problematic if an algorithm for doing that is a big part of what the authors are looking for. But if we do have marginal distributions for each , then  are well-defined and .

Comment by AidanGoth on All AGI Safety questions welcome (especially basic ones) [May 2023] · 2023-05-23T16:43:46.671Z · LW · GW

This question is in the spirit of "I think I'm doing something dumb / obviously wrong -- help me see why" but it's maybe too niche for this thread. (Answers that redirect me to a better place to ask are welcome.)

I recently read Paul Christiano, Eric Neyman and Mark Xu's "Formalizing the presumption of independence" (https://arxiv.org/pdf/2211.06738.pdf). My understanding is that they aim to formalise some types of reasonable (but defeasible) “hand-waving” in otherwise formal proofs, in a way that maintains the underlying deductive structure of a formal proof and responds appropriately to new information / arguments. They're particularly interested in heuristic estimators that presume the independence of random variables so long as we have no reason to think the variables aren't independent and so long as we can adjust the estimate appropriately if we learn about their dependencies.

To that end, suppose we want to estimate , where  is a set of real-valued random variables, , and we have a collection of deductively proved (in)equalities  about . Then a natural heuristic estimator could be:

where each  has the same marginal distributions as  (i.e.  is equal to  but with each instance of  replaced by ), and where the  are conditionally independent given . This formalises the idea that we assume we've thought of all the dependencies between the variables of interest and that they're independent, conditional on everything we've thought of so far -- but we can revise this estimate by conditioning on new information and dependencies later.
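A sketch of how an estimator of this shape might be written in symbols. All notation here is assumed for illustration: $X_1, \dots, X_n$ for the variables, $Y = f(X_1, \dots, X_n)$ for the quantity being estimated, $\pi$ for the collection of proved facts, and primes for the copies that are presumed independent.

```latex
\[
  \tilde{\mathbb{E}}[\,Y \mid \pi\,] \;:=\; \mathbb{E}[\,Y' \mid \pi\,],
  \qquad Y' = f(X_1', \dots, X_n'),
\]
\[
  X_i' \overset{d}{=} X_i \quad \text{for each } i,
  \qquad X_1', \dots, X_n' \ \text{conditionally independent given } \pi .
\]
```

With no facts in $\pi$ this reduces to treating the variables as unconditionally independent; each fact added to $\pi$ changes the conditional expectation on the right-hand side.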

Before considering any information relating the variables to each other, the estimator assumes that they are unconditionally independent. As we condition on information about them, we update the estimate to account for this and maintain that the variables are conditionally independent, given the information considered so far. E.g. in the twin primes example, we can initially assume that "n is prime" and "n + 2 is prime" are independent, and then condition on the fact that if n is prime, then n + 2 is odd (this can be operationalised by considering the appropriate indicator function and conditioning on it taking value 1) to adjust the estimate and assume (for now) that there are no further dependencies.
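The twin primes adjustment can be sketched numerically. This is a minimal illustration under stated assumptions, not the paper's algorithm: it models "n is prime" as having probability 1/ln(n), starts from the presumption that primality of n and n + 2 are independent, and then conditions on divisibility facts one small prime at a time. The function names (`prime_sieve`, `correction_factor`, `twin_prime_estimates`) and the choice to also condition on divisibility by 3 are hypothetical additions for illustration.

```python
import math


def prime_sieve(n):
    """Return a bytearray where sieve[k] == 1 iff k is prime, for 0 <= k <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return sieve


def correction_factor(p):
    """Ratio between the true and the presumed-independent probability
    that neither n nor n + 2 is divisible by the prime p."""
    # True fraction of residues r mod p with p dividing neither r nor r + 2.
    joint = sum(1 for r in range(p) if r % p != 0 and (r + 2) % p != 0) / p
    # What independence would predict for the same event.
    independent = (1 - 1 / p) ** 2
    return joint / independent


def twin_prime_estimates(limit, conditioning_primes=(2, 3)):
    """Return (actual_count, estimates), where estimates[0] presumes full
    independence and estimates[k] has conditioned on the first k primes."""
    sieve = prime_sieve(limit + 2)
    actual = sum(1 for n in range(3, limit + 1) if sieve[n] and sieve[n + 2])
    # Presumption of independence: P(n prime) * P(n + 2 prime).
    estimate = sum(
        1.0 / (math.log(n) * math.log(n + 2)) for n in range(3, limit + 1)
    )
    estimates = [estimate]
    for p in conditioning_primes:
        estimate *= correction_factor(p)  # revise after learning a dependency
        estimates.append(estimate)
    return actual, estimates


if __name__ == "__main__":
    actual, estimates = twin_prime_estimates(100_000)
    print("actual twin prime pairs:", actual)
    for k, value in enumerate(estimates):
        print(f"estimate after conditioning on {k} prime(s): {value:.1f}")
```

Conditioning on parity alone multiplies the independent estimate by exactly 2, and conditioning on divisibility by 3 then multiplies it by 3/4. Each intermediate value is provisional in the sense described above: it presumes there are no further dependencies until another one is conditioned on.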

We always have . In fact, we always have . If we further have that  doesn't relate  and  (i.e. doesn't include a formula containing both  and ), then I think we have  and , giving  (i.e. without the primes).

My suggested heuristic estimator apparently has lots of nice properties thanks to being an expectation, including some of the informal properties listed in the paper, which can be stated formally (e.g. if a statement doesn't have an instance of any of the variables, then conditioning on it won't change the heuristic estimate).

My suggested estimator jumped out at me pretty quickly as capturing (to my understanding) what the authors want, but I'd expect myself to be much worse at this than the authors, who will have spent a while longer thinking about it. So my estimator seems "too good to be true" and I think it's likely I'm pretty confused or missing something obvious and/or important. Please help me see what I'm missing! A couple of hypotheses:

  • There's something wrong / incoherent about my suggested heuristic estimator
  • My suggested heuristic estimator is too general to be useful
    • The paper mainly considers very specific special cases with specific algorithms for heuristic estimators rather than something as general as this, which might be difficult to implement in practice