Posts

Reward Hacking from a Causal Perspective 2023-07-21T18:27:39.759Z
Incentives from a causal perspective 2023-07-10T17:16:28.373Z
Causality: A Brief Introduction 2023-06-20T15:01:39.377Z
Introduction to Towards Causal Foundations of Safe AGI 2023-06-12T17:55:24.406Z
Don't Fear the Reaper: Refuting Bostrom's Superintelligence Argument 2017-03-01T14:28:56.559Z
Autonomy, utility, and desire; against consequentialism in AI design 2014-12-03T17:34:57.989Z
more on predicting agents 2014-11-08T06:43:33.036Z
prediction and capacity to represent 2014-11-04T06:09:57.357Z
AI Tao 2014-10-21T01:15:28.539Z
What is optimization power, formally? 2014-10-18T18:37:10.161Z
Depth-based supercontroller objectives, take 2 2014-09-24T01:25:03.198Z
Everybody's talking about machine ethics 2014-09-17T17:20:57.516Z
Proposal: Use logical depth relative to human history as objective function for superintelligence 2014-09-14T20:00:09.849Z
Intelligence explosion in organizations, or why I'm not worried about the singularity 2012-12-27T04:32:32.918Z

Comments

Comment by sbenthall on Ukraine Situation Report 2022/03/01 · 2022-10-11T18:07:51.938Z · LW · GW

This point about Ukrainian neo-Nazis is very misunderstood by the West.

During the Maidan revolution in Ukraine in 2014, neo-Nazi groups occupied government buildings and brought about a transition of government.

Why are there neo-Nazis in Ukraine? Because during WWII, the Nazis and the USSR were fighting over Ukraine. Ukraine today is quite ethnically diverse, and some of the 'western' Ukrainians who were resentful of USSR rule and, later, Russian influence have reclaimed Nazi ideas as part of a far-right Ukrainian nationalism. Some of these Nazi groups, originally militias, have been incorporated into the Ukrainian military.

This is all quite well documented:

https://en.wikipedia.org/wiki/2014_Euromaidan_regional_state_administration_occupations

https://jacobin.com/2022/02/maidan-protests-neo-nazis-russia-nato-crimea

One of the regiments best known to have Nazi ties was defeated at the Siege of Mariupol:

https://en.wikipedia.org/wiki/Azov_Regiment

Naturally, this history is downplayed in presentations of Ukrainian nationalism targeted at the West, and emphasized in Russian depictions of Ukraine.

Comment by sbenthall on Ukraine Post #12 · 2022-10-11T17:54:39.278Z · LW · GW

Thanks for writing this. I have been fretting for some time and realized that what I needed was a rational take on the war. I appreciate the time you've taken to write this out, and I'll check out your other posts on this.

Comment by sbenthall on prediction and capacity to represent · 2014-11-08T06:14:48.333Z · LW · GW

This seems correct to me. Thank you.

Comment by sbenthall on prediction and capacity to represent · 2014-11-08T06:13:24.935Z · LW · GW

You don't know anything about how cars work?

Comment by sbenthall on prediction and capacity to represent · 2014-11-08T06:11:27.757Z · LW · GW

> It's possible to predict the behavior of black boxes without knowing anything about their internal structure.

Elaborate?

> That says a lot more about your personal values than the general human condition.

I suppose you are right.

> The models of worms might be a bit better at predicting worm behavior but they are not perfect.

They are significantly closer to being perfect than our models of humans. I think you are right in pointing out that where you draw the line is somewhat arbitrary. But the point is the variation along the continuum.

Comment by sbenthall on prediction and capacity to represent · 2014-11-08T06:07:34.473Z · LW · GW

Do you think it is something external to the birds that makes them migrate?

Comment by sbenthall on What is optimization power, formally? · 2014-10-21T00:16:02.920Z · LW · GW

Norbert Wiener is where it all starts. This book has a lot of essays. It's interesting--he's talking about learning machines before "machine learning" was a household word, but envisioning them as electrical circuits.

http://www.amazon.com/Cybernetics-Second-Edition-Control-Communication/dp/026273009X

I think that it's important to look inside the boxes. We know a lot about the mathematical limits of those boxes, which could help us understand whether and how they might go foom.

Thank you for introducing me to that Concrete Mathematics book. That looks cool.

I would be really interested to see how you model this problem. I'm afraid that op-amps are not something I'm familiar with but it sounds like you are onto something.

Comment by sbenthall on Four things every community should do · 2014-10-21T00:10:24.791Z · LW · GW

Do you think that rationalism is becoming a religion, or should become one?

Comment by sbenthall on What is optimization power, formally? · 2014-10-21T00:08:24.898Z · LW · GW

Thanks. That criticism makes sense to me. You put the point very concretely.

What do you think of the use of optimization power in arguments about takeoff speed and x-risk?

Or do you have a different research agenda altogether?

Comment by sbenthall on What is optimization power, formally? · 2014-10-21T00:05:22.049Z · LW · GW

That makes sense. I'm surprised that I haven't found any explicit reference to that in the literature I've been looking at. Is that because it is considered to be implicitly understood?

One way to talk about optimization power, maybe, would be to consider a spectrum between unbounded, Laplacean rationality and the dumbest things around. There seems to be a move away from this, though, because it's too tied to notions of intelligence and doesn't look enough at outcomes?

It's this move that I find confusing.

Comment by sbenthall on Fixing Moral Hazards In Business Science · 2014-10-20T23:56:56.005Z · LW · GW

There are people in my department who do work in this area. I can reach out and ask them.

I think Mechanical Turk gets used a lot for survey experiments because it has a built-in compensation mechanism, and there are ways to ask screening questions that filter people down to precisely the population you want.

I wouldn't dismiss Facebook ads so quickly. I bet there is a way to target mobile app developers on that.

My hunch is that like survey questions, sampling methods are going to need to be tuned case-by-case and patterns extracted inductively from that. Good social scientific experiment design is very hard. Standardizing it is a noble but difficult task.

Comment by sbenthall on What is optimization power, formally? · 2014-10-20T05:23:50.043Z · LW · GW

Thanks. That's very helpful.

I've been thinking about Stuart Russell lately, which reminds me...bounded rationality. Isn't there a bunch of literature on that?

http://en.wikipedia.org/wiki/Bounded_rationality

Have you ever looked into any connections there? Any luck with that?

Comment by sbenthall on What is optimization power, formally? · 2014-10-20T05:20:32.263Z · LW · GW

1) This is an interesting approach. It looks very similar to the approach taken by the mid-20th century cybernetics movement--namely, modeling social and cognitive feedback processes with the metaphors of electrical engineering. Based on this response, you in particular might be interested in the history of that intellectual movement.

My problem with this approach is that it considers the optimization process as a black box. That seems particularly unhelpful when we are talking about the optimization process acting on itself as a cognitive process. It's easy to imagine that such a thing could just turn itself into a superoptimizer, but that would not be taking into account what we know about computational complexity.

I think that it's this kind of metaphor that is responsible for "foom" intuitions, but I think those are misplaced.

2) Partial differential equations assume continuous functions, no? But in computation, we are almost always dealing with discrete math. What do you think about using concepts from combinatorial optimization theory, since those are already designed to deal with things like optimization resources and optimization efficiency?

Comment by sbenthall on Fixing Moral Hazards In Business Science · 2014-10-20T05:04:50.515Z · LW · GW

Could you please link to examples of the kind of marketing studies that you are talking about? I'd especially like to see examples of those that you consider good vs. those you consider bad.

Comment by sbenthall on Fixing Moral Hazards In Business Science · 2014-10-20T05:03:36.688Z · LW · GW

I am confused. Shouldn't the questions depend on the content of the study being performed? Which would depend (very specifically) on the users/clients? Or am I missing something?

Comment by sbenthall on Fixing Moral Hazards In Business Science · 2014-10-20T04:55:33.264Z · LW · GW

I would worry about sampling bias due to selection based on, say, enjoying points.

Comment by sbenthall on Fixing Moral Hazards In Business Science · 2014-10-20T04:52:41.216Z · LW · GW

The privacy issue here is interesting.

It makes sense to guarantee anonymity. Participants recruited personally by company founders may otherwise be unwilling to report honestly (for example). For health-related studies, privacy is an issue for insurance reasons, etc.

However, for follow-up studies, it seems important to keep earlier records including personally identifiable information so as to prevent repeatedly sampling from the same population.

That would imply that your organization/system needs to have a data management system for securely storing the personal data while making it available in an anonymized form.

However, there are privacy risks associated with 'anonymized' data as well, since this data can sometimes be linked with other data sources to make inferences about participants. (For example, if participants provide a zip code and certain demographic information, that may be enough to narrow it down to a very few people.) You may want to consider differential privacy solutions or other kinds of data perturbation.

http://en.wikipedia.org/wiki/Differential_privacy
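As a minimal illustration of the kind of perturbation involved (a sketch only; the epsilon value, query, and data are made up for the example, and a real deployment would need careful privacy budgeting across queries):

```python
import numpy as np

def private_count(records, predicate, epsilon=0.5):
    """Release a count query under the Laplace mechanism.

    A count changes by at most 1 when one participant's record is added or
    removed (sensitivity 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Example: how many participants rated the product 4 or higher?
ratings = [5, 3, 4, 4, 2, 5, 1, 4]
print(private_count(ratings, lambda r: r >= 4, epsilon=0.5))
```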

Comment by sbenthall on Fixing Moral Hazards In Business Science · 2014-10-20T04:39:17.152Z · LW · GW

> He then takes whatever steps we decide on to locate participants.

Even if the group assignments are random, the prior step of participant sampling could lead to distorted effects. For example, the participants could be just the friends of the person who created the study who are willing to shill for it.

The studies would be more robust if your organization took on the responsibility of sampling itself. There is non-trivial scientific literature on the benefits and problems of using, for example, Mechanical Turk and Facebook ads for this kind of work. There is also extra value for the user/client here: the participant sampling itself becomes a form of advertising.

Comment by sbenthall on Depth-based supercontroller objectives, take 2 · 2014-09-24T22:53:00.594Z · LW · GW

Thanks for your thoughtful response. I'm glad that I've been more comprehensible this time. Let me see if I can address the problems you raise:

1) Point taken that human freedom is important. In the background of my argument is a theory that human freedom has to do with the endogeneity of our own computational process. So, my intuitions about the role of efficiency and freedom are different from yours. One way of describing what I'm doing is trying to come up with a function that a supercontroller would use if it were to try to maximize human freedom. The idea is that choices humans make are some of the most computationally complex things they do, and so the representations created by choices are deeper than others. I realize now I haven't said any of that explicitly let alone argued for it. Perhaps that's something I should try to bring up in another post.

2) I also disagree with the morality of this outcome. But I suppose that would be taken as beside the point. Let me see if I understand the argument correctly: if the most ethical outcome is in fact something very simple or low-depth, then this supercontroller wouldn't be able to hit that mark? I think this is a problem whenever morality (CEV, say) is a process that halts.

I wonder if there is a way to modify what I've proposed to select for moral processes as opposed to other generic computational processes.

3) A couple responses:

  • Oh, if you can just program in "keep humanity alive" then that's pretty simple and maybe this whole derivation is unnecessary. But I'm concerned about the feasibility of formally specifying what is essential about humanity. VAuroch has commented that he thinks that coming up with the specification is the hard part. I'm trying to defer the problem to a simpler one of just describing everything we can think of that might be relevant. So, it's meant to be an improvement over programming in "keep humanity alive" in terms of its feasibility, since it doesn't require solving perhaps impossible problems of understanding human essence.

  • Is it the consensus of this community that finding an objective function in E is an easy problem? I got the sense from Bostrom's book talk that existential catastrophe was on the table as a real possibility.

I encourage you to read the original Bennett paper if this interests you. I think your intuitions are on point and appreciate your feedback.

Comment by sbenthall on Depth-based supercontroller objectives, take 2 · 2014-09-24T22:32:43.586Z · LW · GW

I see, that's interesting. So you are saying that while the problem as scoped in §2 may take a function of arbitrary complexity, there is a constraint in the superintelligence problem I have missed, which is that the complexity of the objective function has certain computational limits.

I think this is only as extreme a problem as you say in a hard takeoff situation. In a slower takeoff situation, inaccuracies due to missing information could be corrected on-line as computational capacity grows. This is roughly business-as-usual for humanity---powerful entities direct the world according to their current best theories; these are sometimes corrected.

It's interesting that you are arguing that if we knew what information to include in a full specification of humanity, we'd be making substantial progress towards the value problem. In §3.2 I argued that the value problem need only be solved with a subset of the full specification of humanity. The fullness of that specification was desirable just because it makes it less likely that we'll be missing the parts that are important to value.

If, on the other hand, you are right and the full specification of humanity is important to solving the value problem--something I'm secretly very sympathetic to--then

(a) we need a supercomputer capable of processing the full specification in order to solve the value problem, so unless there is an iterative solution here the problem is futile and we should just accept that The End Is Nigh, or else try, as I've done, to get something Close Enough and hope for slow takeoff, and

(b) the solution to the value problem is going to be somewhere down the computational path from h and is exactly the sort of thing that would be covered in the scope of g*.

It would be a very nice result, I think, if the indirect normativity problem or CEV or whatever could be expressed in terms of the depth of computational paths from the present state of humanity, for precisely this reason. I don't think I've hit that yet exactly, but it's roughly what I'm going for. I think it may hinge on whether the solution to the value problem is something that involves a halting process, or whether really it's just to ensure the continuation of human life (i.e., as a computational process). In the latter case, I think the solution is very close to what I've been proposing.

Comment by sbenthall on Depth-based supercontroller objectives, take 2 · 2014-09-24T22:09:34.832Z · LW · GW

Could you flesh this out? I'm not familiar with key-stretching.

A pretty critical point is whether or not the hashed value is algorithmically random. The depth measure has the advantage of picking over all permissible starting conditions without having to run through each one. So it's not exactly analogous to a brute-force attack. So for the moment I'm not convinced by this argument.
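For readers who haven't seen Bennett's measure, here is a rough statement of the two quantities I keep using (my paraphrase; Bennett's formal version carries a significance-level parameter, written c below, and U is the reference universal machine):

```latex
% Logical depth: the running time T(p) of the fastest program p that is
% within c bits of being a shortest (incompressible) description of x.
D(x)   = \min \{\, T(p) : U(p) = x,\ |p| \le K(x) + c \,\}

% Conditional depth: the same, but programs may consult y as an input.
D(x/y) = \min \{\, T(p) : U(p, y) = x,\ |p| \le K(x/y) + c \,\}
```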

Comment by sbenthall on Depth-based supercontroller objectives, take 2 · 2014-09-24T01:54:04.570Z · LW · GW

Thanks for your encouraging comments. They are much appreciated! I was concerned that following the last post with an improvement on it would be seen as redundant, so I'm glad that this process has your approval.

Regarding your first point:

  • Entropy is not depth. If you do something that increases entropy, then you actually reduce depth, because it is easier to get to what you have from an incompressible starting representation--in particular, from the incompressible representation that matches the high-entropy representation you have created (see the note after these two points). So if you hold humanity steady and superheat the moon, you more or less just keep things at D(u) = D(h), with low D(u/h).

  • You can do better if you freeze humanity and then create fractal grey goo, which is still in the spirit of your objection. Then you have high D(u), and D(u/h) is something like D(u) - D(h), except when the fractal starts to reproduce human patterns out of the sheer vigor of its complexity, in which case I guess D(u/h) would begin to drop...though I'm not sure. This may require a more thorough look at the mathematics. What do you think?
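To spell out the step behind the first point, under the reading of depth as the running time of the fastest near-incompressible program (my gloss, not Bennett verbatim):

```latex
% A (nearly) incompressible string r has no program much shorter than
% "print r", which runs in time roughly |r|. So
r \text{ incompressible} \;\Rightarrow\; D(r) \approx O(|r|),
% i.e. maximal-entropy states are about as shallow as states get, and a move
% that merely randomizes structure pushes depth down toward that floor.
```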

Regarding your second point...

Strictly speaking, I'm not requiring that h abstract away the fleshy bits and capture what is essentially human or transhuman. I am trying to make the objective function agnostic to these questions. Rather, h can include fleshy bits and all. What's important is that it includes at least what is valuable, and that can be done by including anything that might be valuable. The needle in the haystack can be discovered later, if it's there at all. Personally, I'm not a transhumanist. I'm an existentialist; I believe our existence precedes our essence.

That said I think this is a clever point with substance to it. I am, in fact, trying to shift our problem-solving attention to other problems. However, I am trying to turn attention to more tractable and practical questions.

One simple one is: how can we make better libraries for capturing human existence, so that a supercontroller could make use of as much data as possible as it proceeds?

Another is: given that the proposed objective function is in fact impossible to compute, but (if the argument is ultimately successful) also given that it points in the right direction, what kinds of processes/architectures/algorithms would approximate a g-maximizing supercontroller? Since we have time to steer in the right direction now, how should we go about it?

My real agenda is that I think that there are a lot of pressing practical questions regarding machine intelligence and its role in the world, and that the "superintelligence" problem is a distraction except that it can provide clearer guidelines of how we should be acting now.

Comment by sbenthall on Depth-based supercontroller objectives, take 2 · 2014-09-24T01:29:21.841Z · LW · GW

Maybe. Can you provide an argument for that?

As stated, that wouldn't maximize g, since applying the hash function once and tiling would cap the universe at finite depth. Tiling doesn't make any sense.

Comment by sbenthall on Depth-based supercontroller objectives, take 2 · 2014-09-23T17:57:35.803Z · LW · GW

Your point about physical entropy is noted and a good one.

One reason to think that something like D(u/h) would pick out higher level features of reality is that h encodes those higher-level features. It may be possible to run a simulation of humanity on more efficient physical architecture. But unless that simulation is very close to what we've already got, it won't be selected by g.

You make an interesting point about the inefficiency of physics. I'm not sure what you mean by that exactly, and am not in a position of expertise to say otherwise. However, I think there is a way to get around this problem. Like Kolmogorov complexity, depth has another hidden term in it: the specification of the universal Turing machine that is used, concretely, to measure the depth and size of strings. If depth is defined in terms of a universal machine that is a physics simulator, then there wouldn't be a way to "bypass" physics computationally. That would entail being able to build a computer, within our physics, that is more efficient than our physics. Tell me if that's not impossible.

Re: brains, I'm suggesting that we encode whatever we think is important about brains in the h term. If brains execute a computational process, then that process will be preserved somehow. It need not be preserved on grey matter exactly. Those brains could be uploaded onto more efficient architecture.

I appreciate your intuitions on this but this function is designed rather specifically to challenge them.

Comment by sbenthall on Depth-based supercontroller objectives, take 2 · 2014-09-23T17:47:13.112Z · LW · GW

So, the key issue is whether or not the representations produced by the paperclip optimizer could have been produced by other processes. If there is another process that produces the paperclip-optimized representations more efficiently than going through the process of humanity, then that process dominates the calculation of D(r).

In other words, for this objection to make sense, it's not enough for humanity to have been sufficient for the R scenario. It must be necessary for producing R, or at least necessary for producing it in the most efficient possible way.

What are your criteria for a more concrete model than what has been provided?

Comment by sbenthall on CEV-tropes · 2014-09-23T00:07:44.360Z · LW · GW

Do you know if there are literally entries for these outcomes on tvtropes.org? Should there be?

Comment by sbenthall on Unpopular ideas attract poor advocates: Be charitable · 2014-09-23T00:05:01.456Z · LW · GW

I think what the idea in the post does is get at the curvature of the space, so to speak.

Comment by sbenthall on Everybody's talking about machine ethics · 2014-09-22T21:31:48.963Z · LW · GW

Thanks--I've been misspelling that for a while now. I stand corrected.

Comment by sbenthall on Everybody's talking about machine ethics · 2014-09-22T21:30:55.590Z · LW · GW

That is of course one of the questions on the table: who has the power to implement and promote different platforms.

Comment by sbenthall on Proposal: Use logical depth relative to human history as objective function for superintelligence · 2014-09-22T21:28:31.382Z · LW · GW

I guess I disagree with this assessment of which problem is easier.

Humanity continues to exist while people stub their toes all the time. I.e., the humanity-existing problem is currently close to solved, and the toe-stubbing problem by and large has not been.

Comment by sbenthall on Everybody's talking about machine ethics · 2014-09-18T02:43:38.074Z · LW · GW

This is the sort of thing that gets assigned in seminars. Maybe 80% correct, but ultimately weak sauce IMO:

http://purl.tue.nl/605170089298249.pdf

Comment by sbenthall on Everybody's talking about machine ethics · 2014-09-18T02:39:39.128Z · LW · GW

So there are some big problems with picking the right audience here. I've tried to make some headway with the community complaining about newsfeed algorithm curation (which interests me a lot, but may be more "political" than would interest you) here:

https://github.com/sbenthall/tweetserve/blob/master/DesigningNetworkedPublicsforCommunicativeAction.docx

which is currently under review. It's a lot softer than would be ideal, but since I'm trying to convince these people to go from "algorithms, how complicated! Must be evil" to "oh, they could be designed to be constructive", it's a first step. More or less it's just opening up the idea that Twitter is an interesting testbed for ethically motivated algorithmic curation.

I've been concerned more generally with the problem of computational asymmetry in economic situations. I've written up something that's an attempt at a modeling framework here. It's been accepted only as a poster, because its results are very slim. It was like a quarter of a semester's work. I'd be interested in following through on it.

http://arxiv.org/abs/1206.2878

The main problem I ran into was not knowing a good way to model relative computational capacity; the best tools I had were big-O and other basic computational theory stuff. I did a little sort of remote apprenticeship with David Wolpert at Los Alamos; he's got some really interesting stuff on level-K reasoning and what he calls predictive game theory.

http://arxiv.org/abs/nlin/0512015

(That's not his most recent version.) It's really great work, but hard math to tackle on one's own. In general my problem is that there isn't much of a community around this at Berkeley, as far as I can tell. Tell me if you know differently. There's some demand from some of the policy people--the lawyers are quite open-minded and rigorous about this sort of thing. And there's currently a ton of formal work on privacy, which is important but not quite as interesting to me personally.

My blog is a mess and doesn't get into formal stuff at all, at least not recently.

Comment by sbenthall on Unpopular ideas attract poor advocates: Be charitable · 2014-09-17T17:45:21.575Z · LW · GW

I think this is a super post!

Comment by sbenthall on Proposal: Use logical depth relative to human history as objective function for superintelligence · 2014-09-16T22:20:09.834Z · LW · GW

Re: Generality.

Yes, I agree a toy setup and a proof are needed here. In case it wasn't clear, my intention with this post was to suss out whether there was other related work already done (looks like there isn't) and then do some intuition pumping in preparation for a deeper formal effort, in which you are instrumental and for which I am grateful. If you would be interested in working with me on this in a more formal way, I'm very open to collaboration.

Regarding your specific case, I think we may both be confused about the math. I think you are right that there's something seriously wrong with the formulas I've proposed.

If the string y is incompressible and shallow, then whatever x is, D(x) ~ D(x/y), because D(x) (at least in the version I'm using for this argument) is the minimum computational time of producing x from an incompressible program. If there is a minimum running time program P that produces x, then appending y as noise at the end isn't going to change the running time.

I think this case with incompressible y is like your Ongoing Tricky Procession.

On the other hand, say w is a string with high depth. Which is to say, whether or not it is compressible in space, it is compressible in time: you get it by starting with something incompressible and shallow and letting it run in time. Then there are going to be some strings x such that D(x/w) + D(w) ~ D(x). There will also be a lot of strings x such that D(x/w) ~ D(x), because D(w) is finite and there are tons of deep things the universe can compute that are deeper. So for a given x, D(x) > D(x/w) > D(x) - D(w), roughly speaking.

I'm saying the h, the humanity data, is logically deep, like w, not incompressible and shallow, like y or the ongoing tricky procession.

Hmm, it looks like I messed up the formula yet again.

What I'm trying to figure out is how to select for universes u such that h is responsible for a maximal amount of the total depth. Maybe that's a matter of minimizing D(u/h). Only that would perhaps lead to globe-flattening shallowness.

What if we tried to maximize D(u) - D(u/h)? That's like the opposite of what I originally proposed.

Comment by sbenthall on Proposal: Use logical depth relative to human history as objective function for superintelligence · 2014-09-16T21:00:48.720Z · LW · GW

Non-catastrophic with respect to existence, not with respect to "human values." I'm leaving values out of the equation for now, focusing only on the problem of existence. If species suicide is on the table as something that might be what our morality ultimately points to, then this whole formulation of the problem has way deeper issues.

My point is that by starting anew without taking into account the computational gains, you are increasing D(u) efficiently and D(u/h) inefficiently, which is not favored by the objective function.

If there's something that makes humanity very resilient to exogenous shocks until some later time, that seems roughly analogous to cryogenic freezing of the ill until future cures are developed. I think that still qualifies as maintaining human existence.

Doing the opposite of what humans would have done is interesting. I hadn't thought of that.

Comment by sbenthall on Proposal: Use logical depth relative to human history as objective function for superintelligence · 2014-09-16T20:50:52.708Z · LW · GW

I am trying to develop a function that preserves human existence. You are objecting that you might stub your toe. I think we may be speaking past each other.

Comment by sbenthall on Proposal: Use logical depth relative to human history as objective function for superintelligence · 2014-09-15T22:20:00.668Z · LW · GW

Maybe this will be more helpful:

If the universe computes things that are not computational continuations of the human condition (which might include resolution to our moral quandaries, if that is in the cards), then it is, with respect to optimizing function g, wasting the perfectly good computational depth achieved by humanity so far. So, driving computation that is not somehow reflective of where humanity was already going is undesirable. The computational work that is favored is work that makes the most of what humanity was up to anyway.

To the extent that human moral progress in a complex society is a difficult computational problem, and there's lots of evidence that it is, then that is the sort of thing that would be favored by objective g.

If moral progress of humanity is something that has a stable conclusion (i.e., humanity at some point halts or goes into a harmonious infinite loop that does not increase in depth) then objective g will not help us hit that mark. But in that case, it should be computationally feasible to derive a better objective function.

To those who are unsatisfied with objective g as a solution to Problem 2, I pose the problem: is there a way to modify objective g so that it prioritizes morally better futures? If not, I maintain the objective g is still pretty good.

Comment by sbenthall on Proposal: Use logical depth relative to human history as objective function for superintelligence · 2014-09-15T22:07:21.578Z · LW · GW

Re: your first point:

As I see it, there are two separate problems. One is preventing catastrophic destruction of humanity (Problem 1). The other is creating utopia (Problem 2). Objective functions that are satisficing with respect to Problem 1 may not be solutions to Problem 2. While the Yudkowsky post you linked to argues, as I read it, for prioritizing Problem 2, my sense of the thrust of Bostrom's argument is that it's critical to solve Problem 1. Maybe you can tell me if I've misunderstood.

Without implicating human values, I'm claiming that the function f(u) = D(u/ht) / D(u) satisfies Problem 1 (the existential problem). I'm just going to refer to that function as f now.

You seem to have conceded this point. Maybe I've misinterpreted you.

As for solving Problem 2, I think we'd agree that any solutions to the utopia problem will also be solutions to the existence problem (Problem 1). The nice thing about f is that its range is (0,1), so it's easy to compose it with other functions that could weight it more towards a solution to Problem 2.

Re: your second point:

I'm not sure I entirely follow what you're saying here, so I'm having a hard time understanding exactly the point of disagreement.

Is the point you're making about the unpredictability of the outcome of optimizing for f? Because the abstract patterns favored by f will look like noise relative to physics?

I think there are a couple elaborations worth making.

First, like Kolmogorov complexity, logical depth depends on a universal computer specification. I gather that you are assuming that the universal computer in question is something that simulates fundamental physics. This need not be the case. Depth is computed as the least running time of incompressible programs on the universal computer.

Suppose we were to try to evolve through a computational process a program that outputs a string that represented the ultimate, flourishing potential of humanity. One way to get that is to run the Earth as a physical process for a period of time and get a description of it at the end, selecting only those timelines in its stochastic unfolding in which life on Earth successfully computes itself indefinitely.

If you stop somewhere along the way, like timestep t, then you are going to get a representation that encodes some of the progress towards that teleological end.

(I think there's a rough conceptual analogy to continuations in functional programming here, if that helps)

An important property of logical depth is the Slow Growth Law, which is proved by Bennett. It says that deep objects cannot be produced quickly from shallow ones, and incompressible programs are the shallowest strings of all. It's not exactly that depth stacks additively, but I'm going to pretend it does for the intuitive argument here (which may be wrong):

If you have the depth of human progress D(h) and the depth of the universe at some future time D(u), then always D(u/h) < D(u) assuming h is deep at all and the computational products of humanity exist at all. But...

Ah, I think I've messed up the formula. Let's see... let's have h' be a human slice taken after the time of h.

D(u) > D(u/h') > D(u/h) > D(h) assuming humanity's computational process continues. The more that h' encodes the total computational progress of u, i.e., the higher D(u/h') is relative to D(u)...

Ok, I think I need to modify the formula some. Here's function g:

g(u) = (D(h) + D(u/h)) / D(u)

Does maximizing this function produce better results? Or have I missed your point?
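For reference, here are the two candidate objective functions from this exchange written out together, using the notation of the post (D for depth, D(·/·) for conditional depth, h for humanity's record, u for the future state of the universe); f is the function from my earlier reply, g is the modification proposed just above:

```latex
f(u) = \frac{D(u/h_t)}{D(u)}
\qquad
g(u) = \frac{D(h) + D(u/h)}{D(u)}
```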

Comment by sbenthall on Proposal: Use logical depth relative to human history as objective function for superintelligence · 2014-09-15T17:05:44.283Z · LW · GW

First, I'm grateful for this thoughtful engagement and pushback.

Let's call your second dystopia the Universal Chinese Turing Factory, since it's sort of a mash-up of the factory variant of Searle's Chinese Room argument and a universal Turing Machine.

I claim that the Universal Chinese Turing Factory, if put to some generic task like solving satisfiability puzzles, will not be favored by a supercontroller with the function I've specified.

Why? Because if we look at the representations computed by the Universal Chinese Turing Factory, they may be very logically deep. But their depth will not be especially due to the fact that humanity is mechanically involved in the machine. In terms of the ratio of depth-relative-to-humanity to absolute depth, there are going to be hardly any gains there.

(You could argue, by the way, that if you employed all of humanity in the task of solving math puzzles, that would be much like the Universal Chinese Turing Factory you describe.)

Let's consider an alternative world, the Montessori World, where all of humanity is employed creating ever more elaborate artistic representations of the human condition. Arguably, this is the condition of those who dream up a post-scarcity economy where everybody gets to be a quasi-professional creative doing improv comedy, interpretative dance, and abstract expressionist basket-weaving. A utopia, I'm sure you'd agree.

The thing is that these representations will be making use of humanity's condition h as an integral part of the computing apparatus that produces the representations. So, humanity is not just a cog in a universal machine implementing other algorithms. It is the algorithm, which the supercontroller then has an interest in facilitating.

That's the vision anyway. Now that I'm writing it out, I'm thinking maybe I got my math wrong. Does the function I've proposed really capture these intuitions?

For example, maybe a simpler way to get at this would be to look at the Kolmogorov complexity of the universe relative to humanity, K(u/h). That could be a better Montessori world. Then again, maybe Montessori world isn't as much of a utopia as I've been saying it is.

If I sell the idea of these kinds of complexity-relative-to-humanity functions as a way of designing supercontrollers that are not existential threats, I'll consider this post a success. The design of the particular function is I think an interesting ethical problem or choice.
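To make the "complexity-relative-to-humanity" idea slightly more concrete, here is a toy sketch that scores a candidate description of a future against a record of humanity, using compressed length as a crude stand-in for the uncomputable quantities. This is only an illustration of the shape of such a ratio: zlib is a poor proxy for Kolmogorov complexity and no proxy at all for logical depth, and the names and data here are mine, not part of the proposal.

```python
import zlib

def c(data: bytes) -> int:
    """Compressed length, used as a very crude stand-in for K."""
    return len(zlib.compress(data, 9))

def relative_complexity(universe: bytes, humanity: bytes) -> float:
    """Toy proxy for a ratio like K(u/h) / K(u).

    K(u/h) is approximated by C(h + u) - C(h): the extra compressed bits
    needed to describe the candidate future once humanity's record is in
    hand. A small ratio means the future reuses a lot of humanity's
    structure; a ratio near 1 means the record contributes almost nothing.
    """
    extra = max(c(humanity + universe) - c(humanity), 1)
    return extra / c(universe)

# Made-up stand-in data: future_a builds on humanity's record,
# future_b ignores it, so future_a should score lower.
humanity_record = b"alice bob carol " * 200
future_a = humanity_record + b"alice bob carol dave " * 50
future_b = bytes(range(256)) * 60
print(relative_complexity(future_a, humanity_record))
print(relative_complexity(future_b, humanity_record))
```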

Comment by sbenthall on Proposal: Use logical depth relative to human history as objective function for superintelligence · 2014-09-15T03:13:05.078Z · LW · GW

I can try. This is new thinking for me, so tell me if this isn't convincing.

If a future is deep with respect to human progress so far, but not as deep with respect to all possible incompressible origins, then we are selecting for futures that in a sense make use of the computational gains of humanity.

These computational gains include such unique things as:

  • human DNA, which encodes our biological interests relative to the global ecosystem.

  • details, at unspecified depth, about the psychologies of human beings

  • political structures, sociological structures, etc.

I've left very unspecified what aspects of humanity should constitute the h term but my point is that by including them, to the extent that they represent the computationally costly process of biological and cultural evolution, they will be a precious endowment of high D(u/ht) / D(u) futures. So at the very least they will be preserved in the ongoing computational dynamism.

Further, the kinds of computations that would increase that ratio are the sorts of things that would be like the continuation of human history in a non-catastrophic way. To be concrete, consider the implementation that runs a lot of Monte Carlo simulations of human history from now on, with differences in the starting conditions based on the granularity of the h term and with simulations of exogenous shocks. Cases where large sections of humanity have been wiped out or had no impact would be less desirable than those in which the full complexity of human experience was taken up and expanded on.

A third argument is that something like coherent extrapolated volition or indirect normativity is exactly the kind of thing that is favored by depth with respect to humanity but not absolute depth. That's a fairly weak claim but one that I think could motivate friendly amendments to the original function.

Lastly, I am drawing on some other ethical theory here which is out of scope of this post. My own view is shaped heavily by Simone de Beauvoir's The Ethics of Ambiguity, whose text can be found here:

http://www.marxists.org/reference/subject/ethics/de-beauvoir/ambiguity/

I think the function I've proposed is a better expression of existentialist ethics than consequentialist ethics.

Comment by sbenthall on Proposal: Use logical depth relative to human history as objective function for superintelligence · 2014-09-15T02:52:46.850Z · LW · GW

1) Thanks, that's encouraging feedback! I love logical depth as a complexity measure. I've been obsessed with it for years and it's nice to have company.

2) Yes, my claim is that Manfred's doomsday cases would have very high D(u) and would be penalized. That is the purpose of having that term in the formula.

I agree with your suspicion that our favorite futures have relatively high D(u/h) / D(u) but not the highest value of D(u/h) / D(u). I suppose I'd defend a weaker claim: that a D(u/h) / D(u) supercontroller would not be an existential threat. One reason for this is that D(u) is so difficult to compute that it would be pretty bogged down....

One reason for making a concrete proposal of an objective function is that if it is pretty good, then maybe it's a starting point for further refinement.

Comment by sbenthall on Proposal: Use logical depth relative to human history as objective function for superintelligence · 2014-09-15T02:47:36.128Z · LW · GW

A good prediction :)

Logical depth is not entropy.

The function I've proposed is to maximize depth-of-universe-relative-to-humanity-divided-by-depth-of-universe.

Consider the decision to kill off people and overwrite them with a very fast SAT solver. That would surely increase depth-of-universe, which is in the denominator. I.e. increasing that value decreases the favorability of this outcome.

What increases the favorability of the outcome, in light of that function, are the computation of representations that take humanity as an input. You could imagine the supercontroller doing a lot to, say, accelerate human history. I think that would likely either involve humans or lots of simulations of humans.

Do you follow this argument?

Comment by sbenthall on Intelligence explosion in organizations, or why I'm not worried about the singularity · 2012-12-28T17:08:29.597Z · LW · GW

> Corporations exist, if they have any purpose at all, to maximize profit. So this presents a sort of dilemma: their diminishing returns and fragile existence suggest that either they do intend to maximize profit but just aren't that great at it; or they don't have even that purpose which is evolutionarily fit and which they are intended to by law, culture, and by their owners, in which case how can we consider them powerful at all or remotely similar to potential AIs etc?

Ok, let's recognize some diversity between corporations. There are lots of different kinds.

Some corporations fail. Others are enormously successful, commanding power at a global scale, with thousands and thousands of employees.

It's the latter kind of organization that I'm considering as a candidate for organizational superintelligence. These seem pretty robust and good at what they do (making shareholders profit).

As HalMorris suggests, that there are diminishing returns to profit with number of employees doesn't make the organization unsuccessful in reaching its goals. It's just that they face diminishing returns on a certain kind of resource. An AI could face similar diminishing returns.

> I bet organizations would work a lot better if they could only brainwash employees into valuing nothing but the good of the organization - and that's just one nugatory difference between AIs (uploads or de novo) and organizations.

I agree completely. I worry that in some cases this is going on. I've heard rumors of this sort of thing happening in the dormitories of Chinese factory workers, for example.

But more mundane ways of doing this involve giving employees bonuses based on company performance, or stock options. Or, for a different kind of organization, by providing citizens with a national identity. Organizations encourage loyalty in all kinds of ways.

Comment by sbenthall on Intelligence explosion in organizations, or why I'm not worried about the singularity · 2012-12-28T16:55:46.831Z · LW · GW

> Yes, but Git has a bottleneck: there are humans in the loop, and there are no plans to remove or significantly modify those humans. By "in the loop", I mean humans are modifying Git, while Git is not modifying humans or itself.

I think I see what you mean, but I disagree.

First, I think timtyler makes a great point.

Second, the level of abstraction I'm talking about is that of the total organization. So, does the organization modify its human components, as it modifies its software component?

I'd say: yes. Suppose Git adds a new feature. Then the human components need to communicate with each other about that new feature, train themselves on it. Somebody in the community needs to self-modify to maintain mastery of that piece of the code base.

More generally, humans within organizations self-modify using communication and training.

At this very moment, by participating in the LessWrong organization focused around this bulletin board, I am participating in an organizational self-modification of LessWrong's human components.

The bottlenecks that have been pointed out to me so far are those related to wetware as a computing platform. But since AGI, as far as I can tell, can't directly change its hardware through recursive self-modification, I don't see how that bottleneck puts AGI at an immediate, FOOMy advantage.

Comment by sbenthall on Intelligence explosion in organizations, or why I'm not worried about the singularity · 2012-12-28T16:38:32.279Z · LW · GW

Ok, thanks for explaining that.

I think we agree that organizations recursively self-improve.

The remaining question is whether organizational cognitive enhancement is bounded significantly below that of an AI.

So far, most of the arguments I've encountered for why the bound on machine intelligence is much higher than human intelligence have to do with the physical differences between hardware and wetware.

I don't disagree with those arguments. What I've been trying to argue is that the cognitive processes of an organization are based on both hardware and wetware substrates. So, organizational cognition can take advantage of the physical properties of computers, and so is not bounded by wetware limits.

I guess I'd add here that wetware has some nice computational properties as well. It's possible that the ideal cognitive structure would efficiently use both hardware and wetware.

Comment by sbenthall on Intelligence explosion in organizations, or why I'm not worried about the singularity · 2012-12-28T16:28:59.891Z · LW · GW

> They can't use one improvement to fuel another, they would have to come up with the next one independently

I disagree.

Suppose an organization has developers who work in-house on their issue tracking system (there are several that do--mostly software companies).

An issue tracking system is essentially a way for an organization to manage information flow about bugs, features, and patches to its own software. The issue tracker (as a running application) coordinates between developers and the source code itself (sometimes, its own source code).

Taken as a whole, the developers, issue tracker implementation, and issue tracker source code are part of the distributed cognition of the organization.

I think that in this case, an organization's self-improvement to the issue tracker source code recursively 'fuels' other improvements to the organization's cognition.

> The point isn't that an AGI has or does not have certain skills. It's that it has the ability to learn those skills. Deep Blue doesn't have the capacity to learn anything other than playing chess, while humans, despite never running into a flute in the ancestral environment, can learn to play the flute.

Fair enough. But then we should hold organizations to the same standard. Suppose, for whatever reason, an organization needs better-than-median-human flute-playing for some purpose. What then?

Then they hire a skilled flute-player, right?

I think we may be arguing over an issue of semantics. I agree with you substantively that general intelligence is about adaptability, gaining and losing skills as needed.

My point in the OP was that organizations and the hypothetical AGI have comparable kinds of intelligence, so we can think about them as comparable superintelligences.

Comment by sbenthall on Intelligence explosion in organizations, or why I'm not worried about the singularity · 2012-12-28T16:07:54.563Z · LW · GW

> should the word "corporation" in the first sentence be "[organization]"?

Yes, at least to be consistent with my attempt at de-politicizing the post :) I've corrected it. Thanks.

I wasn't sure what sort of posts were considered acceptable. I'm glad that particular examples have come up in the comments.

Do you think I should use particular examples in future posts? I could.

Comment by sbenthall on ...Recursion, Magic · 2012-12-28T02:46:50.079Z · LW · GW

I find this difficult to follow. Is there a concrete mathematical definition of 'recursion' in this sense available anywhere?

Comment by sbenthall on Intelligence explosion in organizations, or why I'm not worried about the singularity · 2012-12-28T02:32:45.622Z · LW · GW

I've realized I didn't address your direct query:

> (Aside: Is the theory of "communicative rationality" specified well enough that we can measure degrees of it, as we can with Bayesian rationality?)

Not yet. It's a qualitatively described theory. I think it's probably possible to render it into quantitative terms, but as far as I know it has not yet been done.

Comment by sbenthall on Intelligence explosion in organizations, or why I'm not worried about the singularity · 2012-12-28T02:23:15.092Z · LW · GW

> There are many reasons why the intelligence of AI+ greatly dwarfs that of human organizations; see Section 3.1 of the linked paper.

Since an organization's optimization power includes optimization power gained from information technology, I think that the "AI Advantages" in section 3.1 mostly apply just as well to organizations. Do you see an exception?

> This sounds similar to a position of Robin Hanson addressed in Footnote 25 of the linked paper.

Ah, thanks for that. I think I see your point: rogue AI could kill everybody, whereas a dominant organization would still preserve some people and so is less 'interesting'.

Two responses:

First, a dominant organization seems like the perfect vehicle for a rogue AI, since it would already have all resources centralized and ready for AI hijacking. So, a study of the present dynamics between superintelligent organizations is important to the prediction of hard takeoff machine superintelligence.

Second, while I once again risk getting political at this point, I'd argue that an overriding concern for the total existence of humanity only makes sense if one doesn't have any skin in the game of any of the other power dynamics going on. I believe there are ethical reasons for being concerned with some of these other games. That is well beyond the scope of this post.

> The Singularity Institute is completely aware that there are other existential risks to humanity; its purpose is to deal with one of them.

That's clear.

> This sounds awfully suspicious. Are you sure you don't have the bottom line precomputed?

Honestly, I don't follow the line of reasoning in the post you've linked to. Could you summarize in your own terms?

My reason for not providing arguments up front is that I think excessive verbiage impairs readability. I would rather present justifications that are relevant to my interlocutor's objections than try to predict everything up front. Indeed, I can't predict all objections up front, since this audience has more information than I have available.

However, since I have faith that we are all in the same game of legitimate truth-seeking, I'm willing to pursue dialectical argumentation until it converges.

> How long did it take you to come up with this line of reasoning?

I guess over 27 years. But I stand on the shoulders of giants.