Posts

Some Comments on Recent AI Safety Developments 2024-11-09T16:44:58.936Z
Changing the Mind of an LLM 2024-10-11T22:25:37.464Z
The Existential Dread of Being a Powerful AI System 2024-09-26T10:56:32.904Z
Turning 22 in the Pre-Apocalypse 2024-08-22T20:28:25.794Z
How AI Fails Us: A non-technical view of the Alignment Problem 2022-11-18T19:02:42.056Z

Comments

Comment by testingthewaters on The o1 System Card Is Not About o1 · 2024-12-14T00:21:59.653Z · LW · GW

This is imo quite epistemically important.

Comment by testingthewaters on Turning 22 in the Pre-Apocalypse · 2024-08-24T10:40:52.860Z · LW · GW

It's definitely something I hadn't read before, so thank you. On a skim, I would say that article has clarified my thinking somewhat. I therefore question the law/toolbox dichotomy, since it seems to me that usefulness and accuracy-to-perceived-reality are in fact two different axes. Thus you could imagine:

  • A useful-and-inaccurate belief (e.g. what we call old wives' tales, "red sky in morning, sailors take warning", herbal remedies that have medical properties but not because of what the "theory" dictates)
  • A not-useful-but-accurate belief (when I pitch this baseball, the velocity is dependent on the space-time distortion created by earth's gravity well)
  • A not-useful-and-not-accurate belief (bloodletting as a medical "treatment")
  • And finally a useful-and-accurate belief (when I set up GPS satellites I should take into account time dilation)

And, of course, all of these are context dependent (sometimes you may be thinking about baseballs going at lightspeed)! I guess then my position is refined into: "category 4 is great if we can get it, but for most cases category 1 is probably easier/better", which seems neither pure toolbox nor pure law.

Comment by testingthewaters on Turning 22 in the Pre-Apocalypse · 2024-08-23T23:19:40.887Z · LW · GW

Hey, thanks for responding! Re the physics analogy, I agree that improvements in our heuristics are a good thing:

However, perhaps you have already begun to anticipate what I will say: the benefit of heuristics is that they acknowledge (and are indeed dependent on) the presence of context. Unlike a “hard” theory, which must be applicable to all cases equally and fails in the event a single counter-example can be found, a “soft” heuristic is triggered only when the conditions are right: we do not use our “judge popular songs” heuristic when staring at a dinner menu.

It is precisely this contextual awareness that allows heuristics to evade the problems of naive probabilistic world-modelling, which lead to such inductive conclusions as the Turkey Illusion. This means that we avoid the pitfalls of treating spaghetti like a Taylor Swift song, and it also means (slightly more seriously) that we do not treat discussions with our parents like bargaining games to extract maximum expected value. Engineers and physicists employ Newton’s laws of motion not because they are universal laws, but because they are useful heuristics about how things move in our daily lives (i.e. when they are not moving at near light speed). Heuristics are what Chris Haufe called “techniques” in the last section: what we worry about is not their truthfulness, but their usefulness.

However, I disagree in that I don't think we're really moving towards some endpoint of "the underlying reality will end up agreeing with this model in many places while substantially improving our understanding in many others", both because of the chaotic nature of the universe (which I strongly believe puts an upper bound on how well we can model systems without just doing atom-by-atom simulation to arbitrary precision) and because that's not how physics works in practice today. We have a pretty strong model for how macroscale physics works (General Relativity), but we willingly "drop it" for less accurate heuristics like Newtonian mechanics when it's more convenient/useful. Similarly, even if we understand the fundamentals of neuroscience completely, we may "drop it" for more heuristics-driven approaches that are less accurate in absolute terms.

Because of this, I continue to question a general epistemic (and the attached instrumental) project for "rational living" and the like. It seems to me a better model of how we deal with things is like collecting tools for a toolbox, swapping them out for better ones as better ones come in, rather than moving towards some ideal perfect system of thinking. Perhaps that too is a form of rationalism, but at that point it's a pretty loose thing and most life philosophies can be called rationalisms of a sort...

(Note: On the other hand it seems pretty true that better heuristics are linked to better understandings of the world however they arise, so I remain strongly in support of the scientific community and the scientific endeavour. Maybe this is a self-contradiction!)

Comment by testingthewaters on Turning 22 in the Pre-Apocalypse · 2024-08-23T23:10:12.597Z · LW · GW

And as for the specific implications of "moral worth", here are a few:

  • You take someone's opinions more seriously
  • You treat them with more respect
  • When you disagree, you take time to outline why and take time to pre-emptively "check yourself"
  • When someone with higher moral worth is at risk you think this is a bigger problem, compared against the problem of a random person on earth being at risk

Comment by testingthewaters on Turning 22 in the Pre-Apocalypse · 2024-08-23T11:22:49.682Z · LW · GW

Thank you for the feedback! I am of course happy for people to copy over the essay.

> Is this saying that human's goals and options (including options that come to mind) change depending on the environment, so rational choice theory doesn't apply?

More or less, yes, or at least that it becomes very hard to apply it in a way that isn't either highly subjective or essentially post-hoc arguing about what you ought to have done (hidden information/hindsight being 20/20).

> This is currently all I have time for; however, my current understanding is that there is a common interpretation of Yudkowsky's writings/The Sequences/LW/etc. that leads to an over-reliance on formal systems that will inevitably fail people. I think you had this interpretation (do correct me if I'm wrong!), and this is your "attempt to renegotiate rationalism".

I've definitely met people who take the more humble, heuristics-driven approach which I outline in the essay and still call themselves rationalists. On the other hand, I have also seen a whole lot of people take it as some kind of mystic formula to organise their lives around. I guess my general argument is that rationalism should not be constructed on top of such a formal basis (cf. the section about heuristics not theories in the essay) and then "watered down" to reintroduce ideas of humility or nuance or path-dependence. And in part 2 I argue that the core principles of rationalism as I see them (without the "watering down" of time and life experience) make it easy to fall down certain dangerous pathways.

Comment by testingthewaters on Turning 22 in the Pre-Apocalypse · 2024-08-23T11:19:21.798Z · LW · GW

Yeah, of course