Posts

Seattle: The physics of dynamism (and AI alignment) 2022-01-19T17:10:45.248Z
Life, struggle, and the psychological fallout from COVID 2021-12-06T16:59:39.611Z
Comments on Allan Dafoe on AI Governance 2021-11-29T16:16:03.482Z
Stuart Russell and Melanie Mitchell on Munk Debates 2021-10-29T19:13:58.244Z
Three enigmas at the heart of our reasoning 2021-09-21T16:52:52.089Z
David Wolpert on Knowledge 2021-09-21T01:54:58.095Z
Comments on Jacob Falkovich on loneliness 2021-09-16T22:04:58.773Z
The Blackwell order as a formalization of knowledge 2021-09-10T02:51:16.498Z
AI Risk for Epistemic Minimalists 2021-08-22T15:39:15.658Z
The inescapability of knowledge 2021-07-11T22:59:15.148Z
The accumulation of knowledge: literature review 2021-07-10T18:36:17.838Z
Agency and the unreliable autonomous car 2021-07-07T14:58:26.510Z
Musings on general systems alignment 2021-06-30T18:16:27.113Z
Knowledge is not just precipitation of action 2021-06-18T23:26:17.460Z
Knowledge is not just digital abstraction layers 2021-06-15T03:49:55.020Z
Knowledge is not just mutual information 2021-06-10T01:01:32.300Z
Knowledge is not just map/territory resemblance 2021-05-25T17:58:08.565Z
Problems facing a correspondence theory of knowledge 2021-05-24T16:02:37.859Z
Concerning not getting lost 2021-05-14T19:38:09.466Z
Understanding the Lottery Ticket Hypothesis 2021-05-14T00:25:21.210Z
Agency in Conway’s Game of Life 2021-05-13T01:07:19.125Z
Life and expanding steerable consequences 2021-05-07T18:33:39.830Z
Parsing Chris Mingard on Neural Networks 2021-05-06T22:16:14.610Z
Parsing Abram on Gradations of Inner Alignment Obstacles 2021-05-04T17:44:16.858Z
Follow-up to Julia Wise on "Don’t Shoot The Dog" 2021-05-01T19:07:45.468Z
Pitfalls of the agent model 2021-04-27T22:19:30.031Z
Beware over-use of the agent model 2021-04-25T22:19:06.132Z
Probability theory and logical induction as lenses 2021-04-23T02:41:25.414Z
Where are intentions to be found? 2021-04-21T00:51:50.957Z
My take on Michael Littman on "The HCI of HAI" 2021-04-02T19:51:44.327Z
Thoughts on Iason Gabriel’s Artificial Intelligence, Values, and Alignment 2021-01-14T12:58:37.256Z
Reflections on Larks’ 2020 AI alignment literature review 2021-01-01T22:53:36.120Z
Search versus design 2020-08-16T16:53:18.923Z
The ground of optimization 2020-06-20T00:38:15.521Z
Set image dimensions using markdown 2020-06-17T12:37:54.198Z
Our take on CHAI’s research agenda in under 1500 words 2020-06-17T12:24:32.620Z
How does one authenticate with the lesswrong API? 2020-06-15T23:46:39.296Z
Reply to Paul Christiano on Inaccessible Information 2020-06-05T09:10:07.997Z
Feedback is central to agency 2020-06-01T12:56:51.587Z
The simple picture on AI safety 2018-05-27T19:43:27.025Z
Opportunities for individual donors in AI safety 2018-03-31T18:37:21.875Z
Superrationality and network flow control 2013-07-22T01:49:46.093Z
Personality tests? 2012-02-29T09:33:00.489Z
What independence between ZFC and P vs NP would imply 2011-12-08T14:30:44.714Z
Weight training 2011-08-26T15:25:42.166Z
Derek Parfit, "On What Matters" 2011-07-07T16:52:51.007Z
[link] Bruce Schneier on Cognitive Biases in Risk Analysis 2011-05-03T18:37:42.698Z
What would you do with a solution to 3-SAT? 2011-04-27T18:19:51.186Z
[link] flowchart for rational discussions 2011-04-05T09:14:40.772Z
The AI-box for hunter-gatherers 2011-04-02T12:09:42.602Z

Comments

Comment by alexflint on Parable - Soryu destroyer of maps · 2021-12-04T21:04:36.000Z · LW · GW

I thought this was brilliant, actually. My favorite line is:

Of course, B wasn't in analysis paralysis, that would be irrational

In seriousness though, I don't actually see the monastic academy's culture as naturally contrary to the rationalist culture. Both are fundamentally concerned with how to cultivate the kind of mind that can reduce existential risk. Compared to mainstream culture, these two cultures are really very similar. There are some methodological differences, of course, and these details are important, but they are not that deep.

Comment by alexflint on Knowledge is not just mutual information · 2021-11-10T20:02:36.199Z · LW · GW

First, an ontology is just an agents way of organizing information about the world...

Second, a third-person perspective is a "view from nowhere" which has the capacity to be rooted at specific locations...

Yep I'm with you here

Well, what's a 3rd-person perspective good for? Why do we invent such things in the first place? It's good for communication.

Yeah I very much agree with justifying the use of 3rd person perspectives on practical grounds.

we should be able to consider the [first person] viewpoint of any physical object.

Well, if we are choosing to work with third-person perspectives, then maybe we don't need first-person perspectives at all. We can describe gravity and entropy, for example, without invoking any first-person perspective.

I'm not against first person perspectives, but if we're working with third person perspectives then we might start by sticking to third person perspectives exclusively.

Let's look at a different type of knowledge, which I will call tacit knowledge -- stuff like being able to ride a bike (aka "know-how"). I think this can be defined (following my "very basic" theme) from an object's ability to participate successfully in patterns.

Yeah right. A screw that fits into a hole does have mutual information with the hole. I like the idea that knowledge is about the capacity to harmonize within a particular environment because it might avoid the need to define goal-directedness.

Now we can start to think about measuring the extent to which mutual information contributes to learning of tacit knowledge. Something happens to our object. It gains some mutual information w/ external stuff. If this mutual information increases its ability to pursue some goal predicate, we can say that the information is accessible wrt that goal predicate. We can imagine the goal predicate being "active" in the agent, and having a "translation system" whereby it unpacks the mutual information into what it needs.

The only problem is that now we have to say what a goal predicate is. Do you have a sense of how to do that? I have also come to the conclusion that knowledge has a lot to do with being useful in service of a goal, and that then requires some way to talk about goals and usefulness.
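To make sure I'm following, here is a minimal sketch of how I could imagine operationalizing your proposal (all of the specifics below -- the hidden bit, the 0.9 correlation, the two agents -- are mine, invented for illustration, not something you said): treat the goal predicate as a boolean function of the outcome, and measure the "accessibility" of a piece of mutual information as the improvement in how often that predicate is satisfied when the agent actually acts on the information.

    import random

    # Toy world: a hidden bit; the goal predicate is "action matches the hidden bit".
    # The observation carries mutual information with the hidden bit (90% correlated).
    def success_rate(agent, trials=10_000):
        hits = 0
        for _ in range(trials):
            hidden = random.randint(0, 1)
            obs = hidden if random.random() < 0.9 else 1 - hidden
            hits += (agent(obs) == hidden)  # was the goal predicate satisfied?
        return hits / trials

    blind = lambda obs: random.randint(0, 1)  # has the information but never "unpacks" it
    informed = lambda obs: obs                # translates the information into action

    # Accessibility of the observation's information w.r.t. this goal predicate:
    print(success_rate(informed) - success_rate(blind))  # roughly 0.4

But of course this just pushes the question back a level, since I had to hand-pick the goal predicate.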

The hope is to eventually be able to build up to complicated types of knowledge (such as the definition you seek here), but starting with really basic forms.

I very much resonate with keeping it as simple as possible, especially when doing this kind of conceptual engineering, where it is so easy to get lost. I have been grounding my thinking in wanting to know whether or not a certain entity in the world has an understanding of a certain phenomenon, in order to use that to overcome the deceptive misalignment problem. Do you also have go-to practical problems against which to test these kinds of definitions?

Comment by alexflint on [Event] Weekly Alignment Research Coffee Time (01/24) · 2021-11-01T19:39:00.131Z · LW · GW

Today this link does not seem to be working for me, I see:

Our apologies, your invite link has now expired (actually several hours ago, but we hate to rush people).

I also notice that the date is still 10/25 so perhaps the event is not happening today?

Comment by alexflint on An Unexpected Victory: Container Stacking at the Port of Long Beach · 2021-10-30T13:26:36.703Z · LW · GW

Thank you so much for writing this up, Zvi!

It's hard to actually be correct about the nature of the bottleneck in such a scenario, and harder still to find a workable solution. I suspect that a good part of the success of this effort was just that Ryan was actually correct about the nature of the problem and the nature of the solution. Beyond that, Ryan being head of Flexport probably helped a lot in convincing the initial signal boosters to trust his diagnosis and prescription, and then for the government folks to take the whole thing seriously. It's not just that he had a general-purpose platform, but that he had credibility in that particular industry.

Comment by alexflint on Self-Integrity and the Drowning Child · 2021-10-25T14:29:58.170Z · LW · GW

But how exactly do you do this without hammering down on the part that hammers down on parts? Because the part that hammers down on parts really has a lot to offer, too, especially when it notices that one part is way out of control and hogging the microphone, or when it sees that one part is operating outside of the domain in which its wisdom is applicable.

(Your last paragraph seems to read "and now, dear audience, please see that the REAL problem is such-and-such a part, namely the part that hammers down on parts, and you may now proceed to hammer down on this part at will!")

Comment by alexflint on The LessWrong Team is now Lightcone Infrastructure, come work with us! · 2021-10-06T04:29:05.586Z · LW · GW

Thank you for the work you are doing!

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-10-01T20:10:53.457Z · LW · GW

Thank you!

Well, I would just say that the significance of it for me comes from the connection between the conclusion "I am" and practical life. I like to remind myself that there is something that really matters, and that my actions really seem to affect it, and so I take "I am" to be a reminder of that.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-10-01T20:02:41.816Z · LW · GW

It's just that you end up in circular reasoning in that case: you have to start with the view that things that have worked in the past will continue to work in the future, you then observe that this principle has itself worked in the past, and then, on the basis of the very view you started with as a premise, you conclude that this view (that things that have worked in the past will continue to work in the future) will continue to work in the future.

It's like if I would claim to you that things that have never worked in the past will tend to work in the future, and you ask why, and I say, well, because this view has never worked in the past, therefore it will work in the future. In order to reach that conclusion I had to start out by assuming the thing itself.

Interested in your thoughts.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-10-01T19:57:34.349Z · LW · GW

Yeah thank you for sharing these thoughts.

I have not really resolved these questions to my own satisfaction, but the thing that seems clearest to me is to really notice when these doubts become a drag on energy levels and confidence, and, if they are, to carve out a block of time to really turn towards them in earnest.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-10-01T19:50:02.408Z · LW · GW

Yeah, these are definitely instances of the problem of the criterion. I actually had a link to your post in the original version of this post but somehow it got edited out as I was moving things around before publishing.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-24T16:19:24.989Z · LW · GW

Thank you for sharing this.

In my own experience, there are moments where I see something that I haven't seen before, such as what is really going on in a certain relationship in my life, or how I have been unwittingly applying a single heuristic over and over, or how I have been holding tension in my body, and it feels like a big gong has just rung with truth. But I think what's really going on is that I was seeing things in one particular way for a long time, and then upon seeing things in just a slightly different way, I let go of some unconscious tightness around the previous way of seeing things, and that letting go frees up my mind to actually think, and that's such a big relief that I feel this gong ringing with truth. It seems that letting go of seeing things one particular way is what the energetic release is about, rather than the particular new way of seeing things.

I mention this just because it's the thing that seems closest in my own experience to the direct experience of self-evident truth. It seems that when I see that I have been holding to one particular way of seeing things, it is self-evident that it's better to make a conscious choice about how to see things rather than just being unconsciously stuck. But it does not seem to me that there is any self-evident truth in any particular replacement way of seeing things.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-24T15:57:16.371Z · LW · GW

Right. But it's notable that almost no-one in the world is stuck in an actual infinite why-regress, in that there don't seem to be many people sitting around asking themselves "why" until they die, or sitting with a partner asking "why" until one person dies. (I also don't think this is what is happening for monks or other contemplative folks.) I guess in practice people escape by shifting attention elsewhere. But sometimes that is a helpful thing to do, such as when stuck in a rut, and sometimes it is an unhelpful thing to do, such as when already overwhelmed with information. Furthermore, some people are very good at shifting their attention around in a way that leads to understanding. Chaitin strikes me as exactly such a person and discusses allocation of attention in that talk (thank you for the lovely link btw - really delightful read!). So what actually is our attentional mechanism and in what way can we trust it?

Interested in any thoughts you may have.

Hope you are well.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-24T04:00:02.724Z · LW · GW

Ah good point. OK yeah I believe that (2) doesn't require the graph to be finite, and I also agree that it's not tenable to believe all three of your statements.

If, hypothetically, we were to stop here, then you might look at our short dialog up to this point as, roughly, a path through a justification graph. But if we do stop, it seems that it will be because we reached some shared understanding, or ran out of energy, or moved on to other tasks. I guess that if we kept going, we would reach a node with no justifications, or a cycle, or an infinite chain as you say. Now:

  • A node with no justification would be quite a strange thing to experience. I would write something, and you would question me, and I would have literally nothing that I could say
  • A cycle would be quite a normal experience to go a few loops around -- plenty of conversations go in loops for some finite time -- but it would be strange for there to be absolutely no way out of the cycle. We would just go and go and go until we lost all energy, and neither of us would notice that we're in a cycle?
  • An infinite chain would be perhaps the most "normal" of the three experiences. We would just have some length of conversation and then, what, give up? Since we have finite minds, there must be a finite program that generates the infinite graph, so wouldn't we eventually notice that and say "huh, it looks like we are on a path with the following generator functions"? What then? Would we not go some place else in the justification graph other than the infinite chain we were previously on?

So it's hard for me to imagine really experiencing any of the three possibilities you point out. Yet they would seem to be not just possible but actually guaranteed (in aggregate).

Interested in what you make of this.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-24T03:30:36.811Z · LW · GW

But then are you saying that it's impossible to experience profound doubt? Or are you saying that it's possible to experience profound doubt, but noting perception as belief is a reliable way out of it? If the latter then how do you go from noting perception as belief to making decisions?

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-23T17:53:32.707Z · LW · GW

Thank you for the kind words. If you have time and inclination, I'd be interested to hear anything at all about what the raw justification in your own experience is like.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-23T17:48:27.390Z · LW · GW

I disbelieve 2 because it assumes that there are a finite number of nodes in the graph. (We don't have to hold an infinite graph in our finite brains; we might instead have a finite algorithm for lazily expanding an infinite graph.)
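As a concrete illustration of that parenthetical (the claims and the expansion rule below are made up, just to show the shape of the thing), a finite program can lazily expand an unbounded chain of justifications without ever holding the whole graph in memory:

    from itertools import islice

    def justifications(claim):
        # A finite rule that yields an unbounded chain of justification nodes on demand.
        depth = 0
        while True:
            depth += 1
            yield f"{claim}, because of supporting reason #{depth}"

    # We only ever materialize as much of the infinite graph as we actually visit.
    for node in islice(justifications("the sun will rise tomorrow"), 3):
        print(node)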

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-23T17:36:52.820Z · LW · GW

I think it's mostly incoherent as a principle

What is it that you are saying is incoherent as a principle?

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-23T17:35:06.173Z · LW · GW

Plenty of people are perfectly well satisfied with various answers to this question within all sorts of systems

Yeah, I'm interested in this. If you have time, what are some of the answers that you see people being satisfied by?

While there's nothing fundamentally wrong with supposing that things may well change right now when they don't appear to have changed in at least the past few billion occasions of right now, it does seem to privilege the observer almost to the point of solipsism.

Right, yeah, it seems like empiricism follows from a certain kind of humility. It's like this: if I don't see the world as having any nature of its own, then it basically makes no sense to pay attention to anything beyond myself, because how could I possibly look carefully at a tree or a bird or a waterfall without some kind of background view that there is something out there to look at?

And it does seem that when I look at some system for a while, like rain falling on a lake or a rabbit hopping around on the grass, I just naturally start to discern some cause and effect. It's almost as if a kind of intuitive empiricism is the default, and it would take some effortful resistance to say "no, no, none of this is justifiable".

But now we are trusting something about our own nature that has this tendency towards an intuitive empiricism, and it really comes down to the question of what it is about our own nature that is trustworthy, because it sure isn't the case that everything we do intuitively has beneficial consequences.

Interested in your thoughts on this.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-23T17:11:56.327Z · LW · GW

Do you not want to strongly believe in something without a strong and non-cyclic conceptual justification for it?

It's not that I don't want to strongly believe in something without a strong and non-cyclic conceptual justification for it. It's that I want my actions to help reduce existential risk, and in order to do that I use reasoning, and so it's important to me that I use the kind of reasoning that actually helps me to reduce existential risk, so I am interested in what aspects of my reasoning are trustworthy or not.

Now you have linked to many compelling impossibility arguments. Hume's is-ought gap, the problem of induction, and many of Eliezer's writings rule out whole regions of the space of possible resolutions to this problem, just as the relativization barrier in computational complexity theory rules out whole regions of the space of possible resolutions to the P versus NP problem. So, good, let's not look in the places that we can definitively rule out (and I do agree that the arguments you have linked to in fact soundly rule out their respective regions of the resolution space).

Given all that, how do you determine whether your reasoning is trustworthy?

Comment by alexflint on David Wolpert on Knowledge · 2021-09-23T03:53:15.487Z · LW · GW

Yeah that resonates with me. I'd be interested in any more thoughts you have on this. Particularly anything about how we might recognize knowing in another entity or in a physical system.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-22T14:10:31.319Z · LW · GW

Yes it's true, there are people who have spent time at the Monastic Academy and have experienced psychological challenges after leaving.

For me, I enjoyed the simplicity and the living in community and the meditation practice as you say. The training style at the Monastic Academy seemed to really really really work for me. There were tons of difficult moments, but underneath that I felt safe, actually, in a way that I don't think I ever had before. That safety was really critical for me to face some deep doubts that I'd been carrying for a really long time.

But there are also people who have left Monastic Academy feeling very hurt.

I guess sometimes it's good to push through doubt and pain in order to get to the other side, while other times it's better to listen to doubt and pain because it's telling you that something isn't working for you.

I do think it's worth finding somewhere that can provide deep sanctuary. I definitely did find that at the Monastic Academy. There are others who seem not to have.

Instead they spend their days performing mental gymnastics and writing about enigmas without getting anywhere, grasping at a sense of meaning, purpose, and connection that perpetually alludes them because they cannot meet with the suffering that is right in front of them and inside of them.

Hmm. Since you refer to "writing about enigmas without getting anywhere" and the title of the post is "three enigmas at the heart of our reasoning", I understand this to be a critique of my character (?) or perhaps just a personal attack. Was that your intention?

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-22T13:36:03.712Z · LW · GW

Regarding the first enigma, the expectation that what has worked in the past will work in the future is not a feature of the world, it's a feature of our brains. That's just how neural networks work, they predict the future based on past data.

Yeah right, we are definitely hard-wired to predict the future based on the past, and in general the phenomenon of predicting the future based on the past is a phenomenon of the mind, not of the world. But it sure would be nice to know whether that aspect of our minds is helping us to see things clearly or not. For me personally, I found it very difficult to get to work with full conviction without spending some real time investigating this.

Another way to say this is that we are born hard-wired to do all kinds of things, and we can look at our various hard-wirings and reflect on whether they are helping us to see things clearly and decide what to do about them. Now you might say that neural networks predict the future based on the past in a way that is a level more ingrained than any one particular heuristic or bias. But to me that just makes it all the more pressing to investigate whether this deep aspect of our brains is helping or hurting our capacity to see things clearly. I just found that I could put this question aside for only so long.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-22T13:14:04.224Z · LW · GW

Yeah I agree the three enigmas are instances of the same thing. And it does seem to me that no system of reasoning can provide its own ultimate justification. But whether it's possible to find ultimate justification for anything is a different question, I think. It seems to me that it's worth making some real time and space to look into whether anything can be ultimately justified. For me it was important to make such time and space because I just couldn't fully dive into my work without first having looked into this.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-22T02:44:57.217Z · LW · GW

Yeah thank you for that connection Ben. It seems like a true connection to me.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-22T02:43:19.718Z · LW · GW

Thank you for the kind words and encouragement Gunnar

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-22T00:02:54.708Z · LW · GW

Gosh yes, having doubts is ultra important as you say, but I don't think they'll do you much good unless you act on them, and a lot of the time acting on doubts means picking an appropriate time and place to investigate them.

Look let's say you're working for a space elevator company and your job is to design the cars that will climb the elevator cable, but you have some doubts about whether the basic science behind the cable design has been done well. Now it's completely fine to just do your job and not worry much about these underlying doubts, but eventually that's going to be a pretty demotivating way to exist, particularly if you are doing the work you are doing because you really care about it. So from a purely practical perspective, it's going to help you do your work to spend a little time investigating the basic assumptions underneath the thing you're working on.

It's a similar situation, I think, with investigating the foundations of reasoning itself. If this underlying thing isn't quite right, then everything we do on top is going to be skewed. We know this, and our doubts correctly implore us to make some time and space to investigate. The demotivating thing, it seems to me, is when we do have the doubts but do not make time and space to investigate.

But yes the goal is not to paper over all doubts forever, but to actually resolve the thing that is unresolved.

Comment by alexflint on Three enigmas at the heart of our reasoning · 2021-09-21T23:30:52.020Z · LW · GW

Right, yeah, I agree that we can evaluate empiricism on empirical grounds. That is a thing we can do. And yes, as you say, we can come to different conclusions about empiricism when we evaluate it on empirical grounds. Very interesting point re object-level and meta-level conclusions. But why would we start with empiricism at all? Why should we begin with empiricism, and then conclude on those grounds that empiricism is either trustworthy or untrustworthy?

When I say "empiricism cannot justify empiricism", I mean that empiricism cannot explain why we trust empiricism, because the decision to start with empiricism as the framework for evaluating empiricism is not itself accounted for by empiricism. And when I say "accounted for in a way that resolves doubt", not merely argued for.

Maybe a clearer way to say it is that I actually agree with everything you've said, but I don't think what you've said is yet sufficient to resolve the question of whether our reasoning is based on something trustworthy.

(Also, yes I do think the first enigma is close to or identical to Hume's problem of induction.)

Comment by alexflint on Comments on Jacob Falkovich on loneliness · 2021-09-20T01:18:09.262Z · LW · GW

Well thank you. I'm glad you enjoyed it. I enjoyed reading this comment.

Out of interest, what parts did you see as ironic or anti-ironic?

Comment by alexflint on Comments on Jacob Falkovich on loneliness · 2021-09-18T00:32:38.630Z · LW · GW

So, what exactly prevents us from creating and maintaining emotionally satisfying non-sexual friendships?

It’s a good question. I don’t know the answer. But it does seem to me that there is a closeness present in some romantic relationships that is very rare in friendships, even healthy, loving, long-term friendships. So I do think there is something real that is actually difficult here.

Comment by alexflint on All Possible Views About Humanity's Future Are Wild · 2021-09-17T00:36:42.783Z · LW · GW

I very much appreciated this write-up Holden.

Why do you believe that things will eventually stabilize? Perhaps we will always be on the verge of the next frontier, though it may not always be a spatial one. Yes there may be periods of lock-in, but even if we are locked-in to a certain religion or something for 100,000 years at a time, that still may look pretty dynamic over a long time horizon.

It seems that your claim ought to be that we will soon lock ourselves into something forever. This is a very interesting claim!

Comment by alexflint on Comments on Jacob Falkovich on loneliness · 2021-09-17T00:16:06.442Z · LW · GW

:D Yes!

Comment by alexflint on The Blackwell order as a formalization of knowledge · 2021-09-13T02:12:05.779Z · LW · GW

Yes I believe everything you have said here is consistent with the way the Blackwell order is defined.

Comment by alexflint on The Blackwell order as a formalization of knowledge · 2021-09-13T02:10:22.377Z · LW · GW

Thank you for the kind words and feedback.

I wonder if the last section could be viewed as a post-garbling of the prior sections...

Comment by alexflint on The Blackwell order as a formalization of knowledge · 2021-09-13T02:09:41.854Z · LW · GW

Yes. Thank you. Fixed.

Comment by alexflint on The Blackwell order as a formalization of knowledge · 2021-09-13T02:09:15.636Z · LW · GW

Yes. Thank you. Fixed.

Comment by alexflint on LessWrong is providing feedback and proofreading on drafts as a service · 2021-09-08T15:39:13.328Z · LW · GW

Thank you for doing this. I just requested a proofread on a draft I'm working on.

When I first clicked "Get Feedback" it didn't do anything. I think this was because I had "Hide Intercom" turned on in my settings. When that setting was on, I saw the following error in my console:

PostSubmit.tsx:93 Uncaught TypeError: window.Intercom is not a function
    at onClick (PostSubmit.tsx:93)
    at Object.ein (react-dom.production.min.js:14)
    at rin (react-dom.production.min.js:14)
    at nin (react-dom.production.min.js:14)
    at cgt (react-dom.production.min.js:15)
    at min (react-dom.production.min.js:52)
    at GAe (react-dom.production.min.js:51)
    at ooe (react-dom.production.min.js:52)
    at Jgt (react-dom.production.min.js:56)
    at ygt (react-dom.production.min.js:287)
    at bgt (react-dom.production.min.js:19)
    at ZAe (react-dom.production.min.js:70)
    at loe (react-dom.production.min.js:69)
    at fc.unstable_runWithPriority (scheduler.production.min.js:19)
    at wM (react-dom.production.min.js:122)
    at vgt (react-dom.production.min.js:287)
    at Cin (react-dom.production.min.js:68)
    at HTMLDocument.n (helpers.ts:87)

When I turned off "Hide Intercom", the "Get Feedback" button worked and I requested feedback.

Comment by alexflint on Provide feedback on Open Philanthropy’s AI alignment RFP · 2021-08-29T19:01:19.349Z · LW · GW

Thank you for posting this Asya and Nick. After I read it I realized that it connected to something that I've been thinking about for a while that seems like it might actually be a fit for this RFP under research direction 3 or 4 (interpretability, truthful AI). I drafted a very rough 1.5-pager this morning in a way that hopefully connects fairly obviously to what you've written above:

https://docs.google.com/document/d/1pEOXIIjEvG8EARHgoxxI54hfII2qfJpKxCqUeqNvb3Q/edit?usp=sharing

Interested in your thoughts.

Feedback from everyone is most welcome, too, of course.

Comment by alexflint on AI Risk for Epistemic Minimalists · 2021-08-27T22:23:52.090Z · LW · GW

Yes. Thank you. Would love to hear more about your work on goal-directedness. Let me know if you're up for chatting.

Comment by alexflint on AI Risk for Epistemic Minimalists · 2021-08-25T17:47:30.005Z · LW · GW

How then would you evaluate the level of existential risk at time X? Is it that you would ask whether people at time X believed that there was existential risk?

Comment by alexflint on We need a new philosophy of progress · 2021-08-25T16:01:30.968Z · LW · GW

Jason, wouldn't you say that what we need is an understanding of how to make progress, not optimism about progress?

I mean, we do have an understanding of how to make material progress, and we've made a great deal of material progress over the past few millennia, but surely material progress is not where the marginal action is at just now, right?

Comment by alexflint on AI Risk for Epistemic Minimalists · 2021-08-25T15:51:50.428Z · LW · GW

Well all existential risk is about a possible existential catastrophe in the future, and there are zero existential catastrophes in our past, because if there were then we wouldn't be here. Bioweapons, for example, have never yet produced an existential catastrophe, so how is it that we conclude that there is any existential risk due to bioweapons?

So when we evaluate existential risk over time, we are looking at how closely humanity is flirting with danger at various times, and how dis-coordinated that flirtation is.

Comment by alexflint on AI Risk for Epistemic Minimalists · 2021-08-24T23:20:08.396Z · LW · GW

Hey- Look, existential risk doesn't arise from risky technologies alone, but from the combination of risky technologies and a dis-coordinated humanity. And existential risk increases not just when a dis-coordinated humanity develops, say, bioweapons, but also when a dis-coordinated humanity develops the precursors to bioweapons, and we can propagate that backwards.

Now the conclusion I am arguing for in the post is that developing powerful AI is likely to increase existential risk, and the evidence I am leaning on is that rapid technological development has landed us where we are now, and where we are now is that we have a great deal of power over the future of life on the planet, but we are not using that power very reliably due to our dis-coordinated state. The clearest illustration of us not using our power very reliably seems to me to be the fact that the level of existential risk is high, and most of that risk is due to humans.

Most technological developments reduce existential risk, since they provide more ways of dealing with the consequences of something like a meteor impact

Well that is definitely a benefit of technological development, but you should consider ways that most technological developments could increase existential risk before concluding that most technological developments overall reduce existential risk. Generally speaking, it really seems to me that most technological developments give humanity more power, and giving a dis-coordinated humanity more power beyond its current level seems very dangerous. A well-coordinated humanity, on the other hand, could certainly take up more power safely.

Comment by alexflint on AI Risk for Epistemic Minimalists · 2021-08-23T22:24:28.791Z · LW · GW

Seems excellent to me. Thank you as always for your work on the newsletter Rohin.

Comment by alexflint on Agency in Conway’s Game of Life · 2021-08-16T22:20:58.880Z · LW · GW

Yeah, so if every configuration has a unique predecessor then we have conservation of information, because you can take some future state and evolve it backwards in time to find any past state, so any information present in the state of the universe at time T can be recovered from any later state. In that sense, information is never lost from the universe as a whole.

This means that if I know only that the universe is one of N possible states at some time T, then if I evolve the universe forwards, there are still exactly N possible states that the world could be in, since by time-reversibility I could rewind each of those states and expect to get back to the N original states from time T. This is what Eliezer refers to as "phase space volume is preserved under time evolution".

This in turn implies the second law of thermodynamics: among all possible configurations of the whole universe, there are only a small number of configurations with short descriptions but many configurations with long descriptions (since there are fewer short descriptions than long descriptions). Because two configurations can never evolve to the same future configuration, a randomly-selected long-description configuration cannot be likely to evolve over time to a short-description configuration: there are too few short-description configurations to be shared among the astronomically more numerous long-description configurations.
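Here is a tiny numerical sketch of that counting argument (the 1024-state toy phase space and the two maps are arbitrary choices of mine, just for illustration): reversible dynamics is a bijection on the state space, so the number of states the system might be in never shrinks, whereas a many-to-one dynamics collapses possibilities and loses information.

    import random

    N = 1024                                   # a toy "phase space"
    states = set(range(N))                     # all we know: the system is in one of these states

    perm = list(range(N))
    random.shuffle(perm)
    reversible = lambda s: perm[s]             # bijection: every state has a unique predecessor
    lossy = lambda s: s // 2                   # many-to-one: distinct states merge

    print(len({reversible(s) for s in states}))  # 1024 -- phase-space volume preserved
    print(len({lossy(s) for s in states}))       # 512  -- possibilities collapse, information lost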

Our universe, remarkably, does have time-reversibility. It is called unitarity in quantum physics, but even in ordinary Newtonian mechanics you can imagine a bunch of billiard balls bouncing around on a frictionless table and see that if you knew the exact velocity of each ball then you could reverse all the velocities and play the whole thing backwards in time.

The black hole information paradox is called a paradox because general relativity says that information is lost in a black hole, but quantum mechanics says that information is never lost under any circumstances.

Comment by alexflint on Agency in Conway’s Game of Life · 2021-08-16T17:22:25.337Z · LW · GW

Yup, Life does not have time-reversibility, so it does not preserve the phase space volume under time evolution, so it does not obey the laws of thermodynamics that exist under our physics.

But one could still investigate whether there is some analog of thermodynamics in Life.

There is also a cellular automaton called Critters that does have time-reversibility.
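If it's useful, here is a minimal check of the irreversibility point (the 3x3 toroidal grid is my own toy setup, not anything from the post): enumerating all 512 states of a tiny Life universe shows that many configurations share the same successor, so there is no way to run the rule backwards.

    from itertools import product

    def life_step(state):
        # One Game of Life step on a 3x3 toroidal grid; state is a tuple of 9 cells (0 or 1).
        nxt = []
        for i in range(9):
            r, c = divmod(i, 3)
            neighbours = sum(
                state[((r + dr) % 3) * 3 + (c + dc) % 3]
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            alive = state[i]
            nxt.append(1 if (alive and neighbours in (2, 3)) or (not alive and neighbours == 3) else 0)
        return tuple(nxt)

    all_states = list(product((0, 1), repeat=9))
    successors = {life_step(s) for s in all_states}
    # Far fewer distinct successors than states: configurations merge, so no unique
    # predecessor exists and the dynamics cannot be reversed.
    print(len(all_states), len(successors))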

Comment by alexflint on The ground of optimization · 2021-08-16T17:17:43.072Z · LW · GW

Thank you for this comment Chantiel. Yes, a container that is engineered to evaporate water poured anywhere into it and condense it into a central area would be an optimizing system by my definition. That is a bit like a ball rolling down a hill, which is also an optimizing system and also has nothing resembling agency. I am

The bottle cap example was actually about putting a bottle cap onto a bottle and asking whether, since the water now stays inside the bottle, it should be considered an optimizer. I pointed out that this would not qualify as an optimizing system because if you moved a water molecule from the bottle and placed it outside the bottle, the bottle cap would not act to put it back inside.

Comment by alexflint on The accumulation of knowledge: literature review · 2021-07-31T01:57:58.070Z · LW · GW

Yeah nice, thank you for thinking about this and writing this comment, Lorenzo.

an extension of this definition is enforcing a maximum effort E required to extract K

I think this is really spot on. Suppose that I compare the knowledge in (1) a Chemistry textbook, (2) a set of journal papers from which one could, in principle, work out everything from the textbook, (3) the raw experimental data from which one could, in principle, work out everything from the journal papers, (4) the physical apparatus and materials from which one could, in principle, extract all the raw experimental data by actually performing experiments. I think that the number of yes/no questions that one can answer given access to (4) is greater than the number of yes/no questions that one can answer given access to (3), and so on for (2) and (1) also. But answering questions based on (4) requires more effort than (3), which requires more effort than (2), which requires more effort than (1).
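One very rough way to make the effort idea concrete (the question set, the effort numbers, and the dictionaries below are all hypothetical, chosen only to illustrate the shape of the definition): treat a knowledge source as a map from questions to the effort required to extract an answer, and count how many questions it can answer within a given effort budget.

    # Hypothetical sketch: knowledge as effort-bounded question answering.
    def answerable_within(source, effort_budget, questions):
        # source maps each question to the effort needed to extract its answer (absent = unanswerable)
        return sum(1 for q in questions if source.get(q, float("inf")) <= effort_budget)

    questions = ["is NaCl soluble in water?", "what was the yield of experiment 17?"]
    textbook = {"is NaCl soluble in water?": 1}
    raw_data = {"is NaCl soluble in water?": 50, "what was the yield of experiment 17?": 5}

    print(answerable_within(textbook, 10, questions))   # 1 -- cheap to extract, but less is in there
    print(answerable_within(raw_data, 10, questions))   # 1 -- more is in there, but extraction is costly
    print(answerable_within(raw_data, 100, questions))  # 2 -- with enough effort, the raw data answers more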

We must also somehow quantify the usefulness or generality of the questions that we are answering. There are many yes/no questions that we can answer easily with access to (4), such as "what is the distance between this particular object and this other particular object?", or "how much does this particular object weigh?". But if we are attempting to make decisions in service of a goal, the kind of questions we want to answer are more like "what series of chemical reactions must I perform to create this particular molecule?" and here the textbook can give answers with much lower effort than the raw experimental data or the raw materials.

Would be very interested in your thoughts on how to define effort, and how to define this generality/usefulness thing.

Comment by alexflint on Agency in Conway’s Game of Life · 2021-07-31T01:43:28.085Z · LW · GW

If you can create a video of any of your constructions in Life, or put the constructions up in a format that I can load into a simulator at my end, I would be fascinated to take a look at what you've put together!

Comment by alexflint on The inescapability of knowledge · 2021-07-15T15:26:12.379Z · LW · GW

Gunnar- yes I think this is true, but it's really surprisingly difficult to operationalize this. Here is how I think this plays out:

Suppose that we are recording videos of some meerkats running around in a certain area. One might think that the raw video data is not very predictive of the future, but that if we used the video data to infer the position and velocity of each meerkat, then we could predict the future position of the meerkats, which would indicate an increase in knowledge compared to just storing the raw data. And I do think that this is what knowledge means, but if we try to operationalize this "predictive" quality in terms of a correspondence between the present configuration of our computer and the future configuration of the meerkats then the raw data will actually have higher mutual information with future configurations than the position-and-velocity representation will.
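Here is a small numerical illustration of that last point (the toy state variables, the noise model, and the lossy summary standing in for the position-and-velocity representation are all invented for the example): a lossy summary of the present data can have at most as much mutual information with the future as the raw data itself, which is the data processing inequality at work.

    import numpy as np
    from collections import Counter

    def mutual_information(xs, ys):
        # Empirical I(X;Y) in bits from paired samples.
        n = len(xs)
        pxy = Counter(zip(xs, ys))
        px, py = Counter(xs), Counter(ys)
        return sum((c / n) * np.log2((c / n) / ((px[x] / n) * (py[y] / n)))
                   for (x, y), c in pxy.items())

    rng = np.random.default_rng(0)
    present = rng.integers(0, 8, size=100_000)                       # "raw video data" at time T
    future = (present + rng.integers(0, 2, size=present.size)) % 8   # noisy future configuration
    summary = present // 2                                           # a lossy summary of the raw data

    print(mutual_information(present.tolist(), future.tolist()))    # MI(raw data, future)
    print(mutual_information(summary.tolist(), future.tolist()))    # MI(summary, future): no larger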

Comment by alexflint on Problems facing a correspondence theory of knowledge · 2021-07-15T14:09:54.332Z · LW · GW

Well if I learn that my robot vacuum is unexpectedly building a model of human psychology then I'm concerned whether or not it in fact acts on that model, which means that I really want to define "knowledge" in a way that does not depend on whether a certain agent acts upon it.

For the same reason I think it would be natural to say that the sailing ship had knowledge, and that knowledge was lost when it sank. But if we define knowledge in terms of the actions that follow then the sailing ship never had knowledge in the first place.

Now you might say that it was possible that the sailing ship would have survived and acted upon its knowledge of the coastline, but imagine a sailing ship that, unbeknownst to it, is sailing into a storm in which it will certainly be destroyed, and along the way is building an accurate map of the coastline. I would say that the sailing ship is accumulating knowledge and that the knowledge is lost when the sailing ship sinks. But the attempted definition from this post would say that the sailing ship is not accumulating knowledge at all, which seems strange.

It's of course important to ground out these investigations in practical goals or else we end up in an endless maze of philosophical examples and counter-examples, but I do think this particular concern grounds out in the practical goal of overcoming deception in policies derived from machine learning.