Comments

Comment by gsastry on Thoughts on the impact of RLHF research · 2023-01-26T22:51:06.095Z · LW · GW

(While I appreciate many of the investigations in this paper and think it is good to improve our understanding, I don’t think they let us tell what’s up with risk.) This could be the subject of a much longer post and maybe will be discussed in the comments.

Do you mean they don't tell us what's up with the difference in risks of the measured techniques, or that they don't tell us much about AI risk in general? (I'd at least benefit from learning more about your views here)

Comment by gsastry on Geometric Rationality is Not VNM Rational · 2022-11-30T19:57:40.525Z · LW · GW

See also: https://www.lesswrong.com/posts/qij9v3YqPfyur2PbX/indexical-uncertainty-and-the-axiom-of-independence for an argument against independence

Comment by gsastry on Perform Tractable Research While Avoiding Capabilities Externalities [Pragmatic AI Safety #4] · 2022-06-02T21:45:27.124Z · LW · GW

Agreed on (1) and (2). I'm still interested in the counterfactual value of theoretical research in security. One reason is that the "reasoning style" of ELK seems quite similar to that of cryptography – and at least we have some track record with the development of computer security.

Comment by gsastry on Perform Tractable Research While Avoiding Capabilities Externalities [Pragmatic AI Safety #4] · 2022-06-01T20:47:14.078Z · LW · GW

The military and information assurance communities, which are used to dealing with highly adversarial environments, do not search for solutions that render all failures an impossibility.

In information security, practitioners do not look for airtight guarantees of security, but instead try to increase security iteratively as much as possible. Even RSA, the centerpiece of internet encryption, is not provably completely unbreakable (perhaps a superintelligence could find a way to efficiently factor large numbers).
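To make the quoted RSA point concrete, here is a toy sketch in Python (my own illustration with textbook-sized primes, not an example from the post): the scheme works end to end, yet its security is exactly the assumption that factoring n is hard, and nothing more.

```python
# Toy RSA with tiny primes, illustrating that the scheme's security rests on
# the assumed hardness of factoring n = p * q, not on any proof of
# unbreakability. Real RSA uses primes hundreds of digits long.

p, q = 61, 53            # secret primes (far too small to be secure)
n = p * q                # public modulus
phi = (p - 1) * (q - 1)  # Euler's totient of n
e = 17                   # public exponent, coprime to phi
d = pow(e, -1, phi)      # private exponent: modular inverse of e (Python 3.8+)

message = 42
ciphertext = pow(message, e, n)    # encrypt: m^e mod n
recovered = pow(ciphertext, d, n)  # decrypt: c^d mod n
assert recovered == message

# An attacker who can factor n recovers phi, and hence d, immediately:
for candidate in range(2, n):
    if n % candidate == 0:
        p_found, q_found = candidate, n // candidate
        break
phi_found = (p_found - 1) * (q_found - 1)
d_found = pow(e, -1, phi_found)
assert pow(ciphertext, d_found, n) == message  # encryption broken
```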

I take your point, and I like the analogy to computer security. But cryptography does seem to have a good record of producing innovations by aiming at rigorous guarantees under explicit assumptions, and those innovations have been extremely valuable in improving the state of computer security.

Do you think that this claim is overstated, or just that we should additionally rely on other approaches?

Comment by gsastry on Argument, intuition, and recursion · 2020-12-13T01:35:27.164Z · LW · GW

Can you recommend some other posts in that reference class?

Comment by gsastry on Comment on decision theory · 2018-10-02T04:56:04.115Z · LW · GW

I agree with both your claims, but maybe with less confidence than you (I also agree with DanielFilan's point below).

Here are two places I can imagine MIRI's intuitions here coming from, and I'm interested in your thoughts on them:

(1) The "idealized reasoner is analogous to a Carnot engine" argument. It seems like you think advanced AI systems will be importantly disanalogous to this idea, and that's not obvious to me.

(2) 'We might care about expected utility maximization / theoretical rationality because there is an important sense in which you are less capable / dumber / irrational if e.g. you are susceptible to money pumps. So advanced agents, since they are advanced, will act closer to ideal agents.'
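As a concrete version of (2), here is a minimal money-pump sketch (my own toy formalization, not MIRI's): an agent with cyclic preferences can be traded around the cycle for a small fee each step, ending up strictly poorer while holding exactly what it started with.

```python
# A minimal money-pump sketch: an agent with cyclic preferences A < B < C < A
# will pay a small fee for each "upgrade", and a trader who cycles it through
# A -> B -> C -> A drains its money without changing its holdings.

preferences = {("A", "B"): "B",   # offered A vs B, the agent picks B
               ("B", "C"): "C",   # offered B vs C, the agent picks C
               ("C", "A"): "A"}   # offered C vs A, the agent picks A (cycle!)

holding, money, fee = "A", 100.0, 1.0

for offer in [("A", "B"), ("B", "C"), ("C", "A")] * 3:  # three full cycles
    if holding == offer[0] and preferences[offer] == offer[1]:
        holding = offer[1]   # the agent happily trades up...
        money -= fee         # ...paying a fee each time

print(holding, money)  # back to "A", but 9.0 poorer after 9 trades
```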

(I don't have much time to comment so sorry if the above is confusing)

Comment by gsastry on Comment on decision theory · 2018-09-11T02:51:33.679Z · LW · GW

I'm not sure what it means for this work to "not apply" to particular systems. It seems like the claim is that decision theory is a way to understand AI systems in general and reason about what they will do, just as we use other theoretical tools to understand current ML systems. Can you spell this out a bit more? (Note that I'm also not really sure what it means for decision theory to apply to all AI systems: I can imagine kludgy systems where it seems really hard in some sense to understand their behavior with decision theory, but I'm not confident at all)

Comment by gsastry on Mechanistic Transparency for Machine Learning · 2018-07-16T23:40:41.728Z · LW · GW

I'm not sure if this will be helpful or if you've already explored this connection, but the field of abstract interpretation tries to understand the semantics of a computer program without fully executing it. The theme of "trying to understand what a program will do by just examining its source code" is also present in program analysis more broadly. If we can understand neural networks as typed functional programs, maybe there's something worth thinking about here.
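To illustrate the core idea, here is a minimal sketch of abstract interpretation over the interval domain (a toy example of mine, not a real analyzer): the "program" is evaluated on ranges of values rather than concrete inputs, giving sound but possibly loose bounds on every concrete execution.

```python
# Abstract interpretation over the interval domain: we "run" an arithmetic
# expression on ranges of values instead of concrete values, soundly bounding
# every possible concrete result without executing the program on any input.

from dataclasses import dataclass

@dataclass
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        # [a, b] + [c, d] = [a + c, b + d]
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # [a, b] * [c, d]: take the min and max over all endpoint products
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

# Abstractly evaluate f(x, y) = x * x + y with x in [-2, 3], y in [0, 1].
x = Interval(-2.0, 3.0)
y = Interval(0.0, 1.0)
result = x * x + y
print(result)  # Interval(lo=-6.0, hi=10.0): sound but not tight, since the
               # true range of x*x is [0, 9]; interval multiplication cannot
               # see that both factors are the same variable, which is a
               # classic source of over-approximation
```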

Comment by gsastry on Mythic Mode · 2018-02-25T05:12:17.678Z · LW · GW

Like some other commenters, I also highly recommend Impro if this post resonates with you.

Readers who are very interested in a more conceptual analysis of what decision making "is" in the narrative framework may want to check out Tempo (by Venkatesh Rao, who writes at Ribbonfarm). Rao takes as axiomatic the memetically derived idea that all our choices are between life scripts that end in our death, and looks at how to make these choices. It's more of an analytical book on strategy (with exercises) than a poetic exemplar of Mythic Mode, but it seems very related to me. In particular, I think it helps with a core question of Mythic Mode: how do you get useful work out of this narrative way of thinking without being led astray? I don't claim to have an answer, but reading Tempo has certainly been useful for this question.

Comment by gsastry on Idea: Monthly Community Thread · 2018-01-24T23:19:39.076Z · LW · GW

I'm still confused about where to post the kind of thing I would have posted in the old LW's Open Threads. For example, "What are the best pieces of writing/advice on dealing with 'shoulds'?" is one thing I'd want to post in an Open Thread. I have various other little questions/requests like this.

Comment by gsastry on A model I use when making plans to reduce AI x-risk · 2018-01-19T17:43:40.871Z · LW · GW

I don't understand the point about avoiding government involvement in the long run. It seems like your argument is that governments are incompetent at managing tech projects (maybe for structural reasons). This seems like a very strong claim to me, and accurate only when incentives are poorly aligned. For example, are you excluding things like the Manhattan Project?

Comment by gsastry on Idea: Monthly Community Thread · 2017-12-29T23:31:32.096Z · LW · GW

Are there plans for recurring Open Threads like the old LW? Or is there a substitute now where it's recommended to post comments that used to go in the Open Threads?

Comment by gsastry on Open thread, 21-27 April 2014 · 2014-04-25T06:38:21.842Z · LW · GW

Has either one been fully specified/formalized?

Here's one attempt to further formalize the different decision procedures: http://commonsenseatheism.com/wp-content/uploads/2014/04/Hintze-Problem-class-dominance-in-predictive-dilemmas.pdf (H/T Luke, who linked it)

Comment by gsastry on April 2014 Media Thread · 2014-04-23T06:33:46.824Z · LW · GW

Related: http://www.ribbonfarm.com/2013/11/07/ux-and-the-civilizing-process/

Comment by gsastry on AALWA: Ask any LessWronger anything · 2014-03-24T18:53:51.682Z · LW · GW

Thanks. I have some follow-up questions :)

  1. What projects are you currently working on? What confusing questions are you attempting to answer?
  2. Do you think that most people should be very uncertain about their values, e.g. altruism?
  3. Do you think that your views about the path to FAI are contrarian among people working on FAI/AGI (e.g., your belief that most of the problems are philosophical in nature)? If so, why?
  4. Where do you hang out online these days? Anywhere other than LW?

Please correct me if I've misrepresented your views.

Comment by gsastry on AALWA: Ask any LessWronger anything · 2014-03-16T01:21:39.358Z · LW · GW

  1. What do you think are the most interesting philosophical problems that are within our grasp to solve?
  2. Do you think that normative ethics won't be solved until we have FAI? If so, why?
  3. You argued previously that metaphilosophy and singularity strategies are fields with low-hanging fruit. Do you have any examples of progress in metaphilosophy?
  4. Do you have any role models?