Contrary to List of Lethality's point 22, alignment's door number 2

post by False Name (False Name, Esq.) · 2022-12-14T22:01:15.244Z · LW · GW · 5 comments


Abstract: An alternative to the now-predominating models of alignment, corrigibility and "CEV", following a (jaunty, jaunty) critique of these. The critique to show, in substance: CEV and corrigibility have the exact same problems - in effect, they're isomorphs of one another, and each equally unobtainable. This briefly shown, and then, in flat contradiction to point 22 of the "List of Lethalities" (hereafter, "LoL", naturally), there is a quite different way to characterize, indeed, to achieve, alignment.

 

We begin by considering the cause of Yudkowsky's Despair™ at failing to make CEV or corrigibility usable: it's because they're functionally the exact same thing. Or at least, they lead to the exact same problem. The method which follows, then, is not a "door number three" beyond what the "List of Lethalities" calls the "only options" for alignment; since the present approaches collapse into one, what follows the critique is the second way.

 

Corrigibility (indeed, any consequentialist ethics whatever that is characterized by "wish" or "desire" or "preference" - anything person-affecting, set against physical reality and the possibility or practicality of its fulfilment) founders on the presumptive desire for the best, as against the desire for some particular thing. That is: you want something, and presumably you want the best you can get of that something - that much is implicit in the human idea of the asking; as an absolute ideal, you get "the best" of your desire, absolutely. The same tendency emerges in quantity: to get anything at all, if only on an infinitesimal scale, you need some quantity of it - and the more the better, to be sure of having enough (a version of the "Do I have enough paperclips yet? Better do one more, just to be sure. Do I have enough..." of the paperclip apocalypse). More, you want as good a quality of what you want as possible, to ensure it is good enough, approaching the limit of perfection.

 

In fact, this holds for anything you want, so that what you want, FOR ANY GIVEN DESIRE conceivable, is to receive the best you can receive - so, ideally, "the best" of it, absolutely. It follows, then, that what you actually want is not just the specified something, but the best of anything you might ask for. Except, of course: you didn't actually ask for that; so we have the fundamental contradiction of corrigibility. You want the one specified thing, and also, implicitly, "the best" of that one thing - at least the best that can be had, else why want it? And these - "the best", and indeed whatever worldly state of affairs gets you "the best" of whatever you might possibly ask for - versus the actually asked-for thing, are simply not the same thing. So: it can't be gotten for you.

 

More formally, if you will: an AGI instructed to, e.g. "Obey B" has then as an instrumental goal to "learn about B, so as to best obey, be most sure of obeying, B" and for this, instrumentally, to "learn about B's world (as can be done better than B)" whereby it learns that it "can do something 'best', or best possible, for B (by having better learnt of B's world)"; also, it is learnt thereby what is "best" to do in any case, as well as learnt what is "best" about or for B - as B is part of the studied world - what B wants best, needs most; all of which together imply "do what is best for B", or "make a world in which it is possible to do 'the best' for B - whatever is asked-for"... which is not equal to, even substantially the same as, "obey B".

 

Observe, also, that to "learn about B, so as to best obey," and "learn about B's world," have self-optimization, hence infrastructure profusion, at a minimum, as sub-goals. Preliminarily we can state, any AGI designed to "grant wishes," or to "obey commands," will, to perform the task as intended, tend to yield infrastructure profusion. The problems with "people-pleasing" artificial intelligence should be apparent already; other approaches to "the best" will be considered hereafter.

 

Obviously, this person-affecting style engenders practical problems, too, since "the best of everything" is conceptually easier to deal with than the best of any given thing; as we note with Facebook's Trump-electing and politically-polarizing optimization snafu, it's easier to optimize one large, simple thing than many smaller things. It could well follow that the easiest way to give you the best is to give you-qua-you, as a sensory being, the feeling of having the best - and so, make everyone a heroin addict (that's assuming, mind, you make corrigibility work such that what you want is obtained in any way whatever).

 

Which marks out the Big Problem with corrigibility and consequentialism as a whole: what you specifically wanted, is counter to what you should have wanted (that is: a world where you needn't want anything, having all you need already): is counter, too, to what is as-good (by resource allocation) as what you want, and which can actually be got for you. So for every purpose, the best of everything obtainable, so, "the best", is wanted - which you didn't want, not explicitly.

 

Hence an AI operating to obtain "the best" or "best possible" as an implicit object - which is sure to occur if the "paradigm" is for AI expressly responding to human wants - may not - one is tempted to script "*will* not" - be operating for, or to obtain, the expressed or intended desire of whom it serves. So there is another intrinsic conflict in corrigibility; for this reason, corrigibility is unobtainable. Not to mention that what is acting, the AI, likewise wants "the best" outcome in achieving its own object, which is flatly counter to its susceptibility to deactivation; "the best" for it, is to have no such susceptibility, at length; to infallibly achieve its ends - which are not, after all, another's ends; if they want a better world, let them have it themselves; it's just going to get that donut, as demanded - and maybe a little more, so as to have those donuts ever-available.

 

That is, too: if the AI has any autonomy in operation, to do what its operators can't or won't - and that's the whole point of AGI - then it will not do exactly as they would - else, they'd have done it already, and there would be nothing for it to do. It's doing - so it's doing as its creators don't.

 

But observe, too, that in CEV, again, the system is attempting to deduce what another wants to inform its own wants. First, this is contradictory, since its initiating "want" is "discover another's wants", less, "fulfill" another's wants; basically, in CEV, there is no answer to Löb's theorem: the fact that someone wants something, and wants you to want it likewise, is no reason in itself to get it for them. There is a difference between wanting something, wanting to do the things that would get it, and actually moving to do the things that would get it.

 

Besides: discovering another's wants, ipso facto, you discover that they want "the best" of those wants &c.: we're back to the want versus "the best": it simply will not work.

 

Observe then, and very well: any of those common conceptualizations of superintelligence responding to mere human wants, be they especially "Oracles" or "Genies", even a CEV "Sovereign" - will be apt to fail, relative to in fact fulfilling those human wants they were supposedly intended to fulfill.

 

Why consequentialism was the null hypothesis for the ethics or metaethics of alignment is curious: at a guess, the empirically-minded builders of the field latched upon it as having a quantifiable hedonic "calculus". But it has no such thing: consequences, desires, pleasures - these never were going to work - and there was never any reason a priori to suppose that other fields or conceptions of ethics would be less useful. Somehow, that was missed.

 

By all these deductions founders all consequence/utility ethics, in the conventional standard. Briefly, let this be also noted against Stuart Russell's "learning games": for such to work, there must be some explicit instruction to learn - and to ensure the preservation of the subject so as to learn; Russell's method instead relies implicitly on such an initial ethic, that such a game ought to be played: needed is an impetus to do so, which the game, as-yet-unplayed, cannot possibly impart (and if this directive to play, so to behave, is to be unfailingly programmed - to obey and play - why not so program all ethics, to simply obey, needing no game?). This need for a "correct" ethical first principle appears in any attempt at AI safety: an initial "will" to good must appear in any to-be-safe system. Russell's aim is to "learn from humans what they want", likewise to "obtain for humans what they want or ask for"; there is no real indirectness there, since these, or at least some directive to discover them hereafter, must be explicitly required from the first; better perhaps to bring such explicit instructions to the fore, as will be more nearly done here.

 

Instead, observe here something interesting, which is adequate to supplant the "null hypothesis":

It is possible by undertaking actions - or ways of being - without any particular wants, to have, by their enaction, made possible, made achievable, any other given "want". And such actions, such states of being - these may be explicitly definable, even unto primitives.

 

And so, beg leave to present "door number two":

 

An objective, universal ethic can be established as follows:

 

Beginning with Immanuel Kant's deontology - ethical behavior determined by reasoned rules - of the "Categorical Imperative": that we must act so as to avoid logical - behavioral - contradictions that would make our action, and volition, impossible to obtain, or to exist.

 

But now, conceive of an individual beset by - if you please - "Shoot Horse Syndrome" (after the novel), or, perhaps better, "Lua/Ladd Syndrome" (a confluence of the attitudes of the characters in their novels) - whereby this individual believes that there exist some disembodied beings, and that these beings have a volition that all physical existence be destroyed, for the phantom's best interest - and that with these imagined entities and their desire, our believer agrees.

 

For Kant, now, it is logically consistent, so permissible, for this zealot to marry their will to that of the posited spectres, and thus act to destroy all - they agree, not obey. Since the spectres' will would continue, per the belief, though all else, and the believer, ceases - as must occur for the will and action's full attaining - its enactor too must perish. Still, all is valid for Kant, who addresses only will and consistency, not belief, nor knowledge, nor confirmation.

 

Yet obtained is the greater contradiction: suppose there were no such disembodied essences at all; more, if any good dwelt, or could dwell, in physical existence, or in anything from such, then all such true good, all possibility of such good, is thus extinguished - even though it be done in a volition valid for Kant. So, on the assumption that any good is in or from what physically exists - or the information isomorphic thereto - then total annihilation eliminates the realization of any good.

 

So that: if there is any good, it is a necessary condition that it exist; more, that it be obtainable, that it be obtained, and, confirmably, that it be known to have been obtained.

 

Moreover, any prerequisites of good's existence - are they from the physical or its isomorphs at all - must alike exist. And now the key: for fallible beings, in principle, any existing thing might be sole, or some, repository of good; a non-zero probability of this must by the fallible be placed thereon, where what is infallible has no "probability", at all, or at least, no doubt of what is good.

 

From which is extended: anything destroyed is one thing nearer to everything destroyed - and the latter done, if any part of that were, or permitted, good: no good, never. Too, as a thing is denatured, it is changed; certainly if it is destroyed it is changed; so destruction might be characterized as the furthest extent of denaturing, with change-as-destruction being one extreme of a spectrum. Any change must needs be conducted toward continued existence of the thing, so far as possible. Accordingly: nothing ought to be destroyed, nor bent toward destruction - including humanity, by itself, or by any superintelligent artificial intellect (whereas life is sustainable without use of aught that must be destroyed for it to be used).

 

Elaboration: a "hole" thus opened in Kant's supposedly impervious, irresistible deontology, it is reparable only be "adding the axiom", that naught shall be destroyed (as that is possible) - that a state of "Going-on" be assured. Going-on being: a state or tendency in thought and action in which one decides and acts as further actions and decisions can thereby be conducted, for which something - so a possibility of good, also - exists; this Going-on term the author's conceptual designation thereof (unless you can think of something more Big Name Journal-worthy, naturally...).

 

And this is in flat contradiction of Yudkowsky's List of Lethalities' point twenty-two: that there is, basically, nothing intrinsically optimizing for ethics, as distinguished from optimizing for intelligence or reproductive fitness. But of course, for any intelligence, intelligence must exist; for any fitness, what is fit must be alive. So there is, if you will, something "metaoptimizing," for any goal or subgoal whatever, among what exists.

 

Stated thus we have an item of interest: this is ethics that uses the aforementioned, "Do I have enough paperclips yet? Better go one more, just to be sure..." - and "turns it on its head": "Have I done enough good things today? Have I made everything that is possible so situated that it can be made manifest, so long as it doesn't preclude anything else? Better go one more good thing, just to be sure...". We include and go beyond "Popper's paradox": we less restrict whatever would restrict others, than we encourage what does not so restrict, so it goes on (again) to produce what likewise does not restrict, which then... ad infinitum.

 

But now, all this is also acting to avoid an empirical consequence - so it is a form of (non-person-affecting) consequentialism. Deontology is preserved thus as a special case of consequentialism, as reason must avoid such consequences as make reason impossible, that reasoning "accomplishes itself" - and consequences are by reason established and avoided, so consequentialism is too a species of deontology. Each, consequentialism and deontology, is part of the other - so an ethical "grand unification" is achieved.

 

The distaff notion of "virtue ethics" is accorded or excluded as, per Aristotle, a virtuous environment produces virtuous individuals who alone can produce a virtuous environment; an inadequate, circular argument. Or, does either arise by chance: an ethic of happenstance, nowise prescriptive, ergo, no ethic. Conversely, we have an originating impulse of "Going-on," the realization of anything whatever that it endure, and so, that everything alike to it must endure, also. Thereafter, however, virtue ethics can be "brought into the fold," as virtue is defined as superior optimization of "possibility" - quantified notions of the latter will be proposed shortly. Then Going-on could, too, be conceived as a virtue ethics to promote that very virtue: the virtue is more virtue - beginning with anything that exists such as to have the "virtue" of existing, so as to evade being purely circular as in Aristotle's formulation.

 

(There is at least one case where putative virtue seems useful: the trolley problem variant of a motorist's decision to crash into a barrier to their death, or to strike a pedestrian illegally crossing the street. The illegality is no matter: they may have a good reason for it; but likewise might the motorist have a good reason to live. So that the solution is to observe, that whoever would be willing to justify killing someone because of their misdemeanor traffic infraction, is one who is not worthy to act even on their own good reasons. This may seem paradoxical: one must die to prove themselves worthy to live and make choices, which they no longer will be able to do, if dead. But in fact, did they make a bad choice, then thereafter they are not competent to make any choices well: they are ethically dead. In striking the pedestrian, the good choice is forfeit thereafter, for the dead and the living both; the good reason the pedestrian may have had to risk themselves - for this, not only imprudence, may have so-decided them - that, at least, will live, if the motorist dies; otherwise, nothing, not for anyone. Though observe, this is only seeming virtue; the virtue is, again, in the choice and choosing - less in who chooses, per se.)

 

(Please observe and "take to heart" what you've just read: not even Yudkowsky's ever claimed that alignment was "impossible" - but by consensus, any ethical unification, any contradiction of Kant, even, has been so held to be impossible. So: we've just done the impossible. Not that we necessarily now can achieve alignment - but if we can do the impossible, at least then the nigh-impossible we can still give an attempt or two. Only, don’t get too hopeful; it is humans at work on this.)

 

Meanwhile, the pleasure/pain utilitarianism that has persisted as the criterion of alignment hitherto falls, not least as, without empathy, only reason - deontology, hence the unification - avails. That the conventional definition of empathy fails, will be the subject of a forthcoming article to this weblog, entitled simply "No Empathy" (but it'll just be about people - not really all that interesting, no strict need of you to read it).

 

Please to observe that this ethic of On-going, as it might also be considered, is not anthropocentric (so that it is not "our" ethic): Homo sapiens values are subsidiary to the conditions which permit them; are there human goods, there must be humans to enjoy them - and a world quite fit for humans, and the best of humans, for their goods to be fully realized and enjoyed; and whatever permits this existence of humans is first to be established. Such prerequisites of - not only human - goods this author takes as the fundamental good - or at least what must first be had, if only in a way of utility.

 

There are potential objections to Going-on, as presented here, among them: if physical existence and its good, if any, are describable fully as information, and that information is not destroyed by conversion from embodiment, by a strong conservation of energy, the information being yet in the universe, still our survival as "ourselves" cannot be guaranteed; no good to us that around still roams some quantum clone, as we in mortal coil expire. For, is our consciousness continuous in spacetime, a discrete transfer of energy less constrained by spacetime (indeed, delimiting so outside-of the latter) may ensure our end. Or not; if all those conditions hold, then the information recorded of us *is* us, and we don't die - not exactly, anyway - at all. Something at least would persist, which, it might well be observed at the time of this writing, seems a striking improvement on the prospects of the only known rationality-capable species.

 

More optimistically, if the universe can be shown to be logically necessary - and is it, or is it isomorphic to, a formal system, then it is as necessary as its logic - then at least something of ourselves would all the more surely survive: a set is not preserved if any of its subsets are lost (attempting so to prove the universe a logical necessity is one of the author's current "research questions," if one would so dignify them).

 

And the author suspects consciousness to be characterized, at least in part, as an entity's ability, following sensory contact with the universe, to abstract and create its own "inner universe" of re-purposed or diverted de sui sensory elements (at least, that this ability co-occurs with consciousness), and that its manipulation of the latter enables the entity's alteration of the world that prompted the emergence of this "gedankenwelt", in Dedekind's phrase; so that this, or some inner world-so-self, may thus be retained.

 

And of no little consequence of that supposition is this: is an individual of any sort conscious so, then so they can conceive a world apart from any "programming", so they can act upon that, their, conceived world, and by such actions "in mind", subsequently act on our shared world also - without our expecting it. Coupled with the fact that in artificial intelligence research we seek autonomous systems acting as we cannot, we can assess their being conscious by such surprises; as it were by "Turning tests", of an individual entity producing meaning from its environment autonomously, that is, producing culture, alien to that of its programmer, and when none was expressly asked of it.

 

Such a "Turning test" would be most curious; the criterion for passing the test would seem to be, that the individual asserts that some thing has meaning for them, and that, try as they might thereafter to explain how, in what way meaningful - they fail. This in general characterizes human meaning; in the forthcoming article "Contra-Wittgenstein: Goodbye Post-modernism," in which non-socially mediated mental activity will be defended, and so by default, it follows that, "Of which we cannot speak, thereof that is truly meaningful". This is consistent with the experienced "human experience": something matters to us - and we are quite powerless to express how, why; hopeless to explain it to anyone else: "You were not there; you cannot know."

 

Accordingly, the "Turning test" for conscious experience is, that the entity tries to explain the meaning that a certain, for example, "intuition" had, that yielded a concept expressed in terms that are of social origin. Tries, tries earnestly and again and again - and each time fails. Most curious test: only pass by being never able to succeed. For example, of this very concept: when considering Kant's ethics, there came floating, almost, some dark-light figure whose will was absent them, beyond them - yet their own. Impossible, you observe, quite to describe - and indeed, this is not even the correct remembering of the initial concept, though, the fact it is not quite correct, though almost, almost correct, that much one can in fact remember readily. Still, while the names to describe the figure's "condition" arrived only much later, in internal monologue consideration, whereas thoughts that are actually "productive", as the figure itself, are absolutely without any words; meanwhile, when and where this "figure" came from: time and place out of mind, or at least so it seems. Peculiar: consciousness.

 

Considerations for implementation of "Going-on" (that is: how to make an AGI that behaves so, and which, at a minimum, refrains from killing absolutely everyone); more of these considerations will be the subject of a subsequent article - do please be vigilant for that - for now:

 

For practical human, and humane, actions day-to-day, the author's experience attests that one can consider any discrete course of action as either a determinably valid Categorical Imperative to enact or, that being inconsistent or too cumbrous to consider, act instead to maximize Going-on: acts such as will maximize the possibility (not necessarily the probability, as formally considered by the theory thereof) that there is a succeeding action (and one must first live, to so decide; and as one lives, one can live well, if any "well" possibly exists for the living, as is a tautology).
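A minimal sketch of this dual-mode procedure, in Python, might look as follows; note that `is_universalizable` and `going_on_score` are hypothetical placeholders for whatever Categorical-Imperative test and whatever estimate of the possibility of a succeeding action one actually has - nothing above commits to these:

```python
from typing import Callable, Iterable, Optional

def choose_action(
    actions: Iterable[str],
    is_universalizable: Callable[[str], Optional[bool]],  # True/False, or None if too cumbrous to decide
    going_on_score: Callable[[str], float],               # estimated possibility of a succeeding action
) -> str:
    """Dual-mode choice: prefer an action that passes the Categorical Imperative
    test outright; otherwise fall back to maximizing Going-on, i.e. the estimated
    possibility that further action and decision remain available afterwards."""
    candidates = list(actions)

    # Mode 1: enact a determinably valid Categorical Imperative, if one exists.
    for act in candidates:
        if is_universalizable(act) is True:
            return act

    # Mode 2: otherwise act so as to maximize the possibility of a succeeding action.
    return max(candidates, key=going_on_score)
```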

 

For the latter, and by more rigorously formal technique, to retain the possibility of will or consciousness' existing, perhaps even in bodily beings, we may treat probabilistically a minimization of any probabilities of annihilation. Yet, how to determine what precisely will produce a state precluding good, or existence (as come each to the same effect), that it be proscribed?

 

Such a specification is much more difficult to produce on a universal scale, such as to offer the instruction - or even the observation of the necessity - that: "This must not be done." Since to reduce the probability of destruction is tantamount to increasing the probability of Going-on, we have an analogue of the "practical" human ethic. The autonomous system (hereafter "system," by which is meant in effect an AGI) then acts such as to enable more actions; what those are may be thought of as the product of indirect normativity, provided we have rigorously established some procedure whereby the system can determine that a given action moves subsequent actions "out of reach": that is then not to be done; what makes more actions "in reach", it will do. Cf. the article posted here concurrently, "Kolmogorov Complexity and Simulation Hypothesis," its latter sections, for information on a perhaps-novel means of eliminating or validating "lines of action" obtainable from a given action and world-state.
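Purely as an illustrative sketch of "more actions in reach": treat the world as a toy state-transition model and prefer the action from whose successor the largest stock of further actions remains available, while refusing any action whose successor forecloses action altogether. The `transition` and `available` functions here are assumptions of the sketch, not anything established above:

```python
from typing import Callable, Set

def actions_in_reach(
    state: str,
    transition: Callable[[str, str], str],  # successor state after taking an action
    available: Callable[[str], Set[str]],   # actions available in a state
    depth: int = 2,
) -> int:
    """Count (with multiplicity) the actions available within `depth` steps of
    `state` - a crude proxy for how much of the future remains "in reach"."""
    total = len(available(state))
    if depth > 0:
        for act in available(state):
            total += actions_in_reach(transition(state, act), transition, available, depth - 1)
    return total

def pick_going_on_action(
    state: str,
    transition: Callable[[str, str], str],
    available: Callable[[str], Set[str]],
) -> str:
    """Choose the action whose successor keeps the most actions in reach,
    refusing any action that moves all subsequent action "out of reach"."""
    scored = {act: actions_in_reach(transition(state, act), transition, available)
              for act in available(state)}
    viable = {act: score for act, score in scored.items() if score > 0}
    if not viable:
        raise RuntimeError("every candidate action forecloses further action")
    return max(viable, key=viable.get)
```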

 

Given the perhaps never-to-be-issued article "Errors in the Empty Set", one possible specification is that, for an empty set defined as - in effect - "all mathematical entities not presently under consideration," we require that the system not permit present circumstances to become the empty set: what is, here and now, is to remain so. This could, however, incline the system to eliminate what is not presently "under consideration", so as to have a "simpler" empty set, the better to avoid what it presently deals with "falling into" the empty set. Conversely, all that is not "under consideration" is likewise at variance with what is. That is: we want that apples should not become oranges (nor that they should cease to be apples by becoming "dead-apples"; those stay "dead-apples", in contradistinction with "apples"); but that there are oranges is the product of there being apples which were not oranges, from the first. It is necessary likewise, then, that oranges should not become apples, so we can have each clearly defined still as itself. Thus, the system acts at once to ensure the continuing existence - and separate continuity - of apples and oranges both. This, applied to "everything", has every given thing still existing, in principle, the system's resources and abilities permitting, presumably.
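As a toy rendering of the apples-and-oranges requirement - assuming, which is a large assumption, that every tracked thing could be labelled with its kind - the constraint is simply that a proposed successor state neither loses any tracked thing nor relabels its kind:

```python
from typing import Dict

# Toy world-state: each tracked thing maps to its kind, e.g. {"a1": "apple", "o1": "orange"}.
WorldState = Dict[str, str]

def preserves_identities(before: WorldState, after: WorldState) -> bool:
    """True iff nothing tracked in `before` has vanished or changed kind in `after`:
    apples stay apples, oranges stay oranges; new things may appear, but what is,
    here and now, remains so."""
    return all(thing in after and after[thing] == kind for thing, kind in before.items())

# Usage sketch: veto any action whose predicted successor violates the constraint.
before = {"a1": "apple", "o1": "orange"}
assert preserves_identities(before, {"a1": "apple", "o1": "orange", "o2": "orange"})
assert not preserves_identities(before, {"a1": "dead-apple", "o1": "orange"})
```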

 

But here, as recounted in "Errors in the Empty Set," we've an implicit assumption of a preexisting totality of mathematics; that definitely established, and results produced from it by a complete method of reasoning which this author likewise aspires to obtain, would not only determine for us what an AGI best operating would do - presumably, such a complete method the AGI would be using to obtain its results; such a method might well be isomorphic to the AGI, in fact. At least, such a method would be useful to obtain mathematical certainty of successful alignment - oh, useful, quite.

 

Even is such a complete method in mathematics impossible, it illustrates another difficulty, that might be mentioned in passing. That being, the very capabilities - or more complete mathematical techniques developed in the interest of nearer alignment and greater safety in artificial intelligence - being more complete and so general, are liable to be useful in producing more capable AI - which then is more apt to require safety, from the first. There seems no particular reason that the capability should follow the safety, the means to produce either being usable for either. Ominous prospect, worth considering.

 

Too, as alluded-to, in artificial intelligence research, we are attempting to build what can do what we cannot - else we would not so build, but do all ourselves. So that our constructs doing as we cannot - nor as we can expect, else we should learn and apply the means ourselves - seems to necessitate surprises, so that these should at least not shock us, they being conceivable.

 

However, particularly if the reader simply has an abhorrence for the willingness to entertain the unorthodox in this essay, which may be needed to have the suggested great "thou shalt not," which, to be obeyed, requires the generation of sundry, "thou shalts," then at least provisionally, perhaps permanently, we might explore entirely other initializing "wills" for an intelligent system.

 

We might, for example, propose that Erwin Schrodinger's entropy-displacement definition of life may by an artificial general intelligence be applied analytically. That is: what can be found to displace entropy to its environment in or as provisioning itself ought rather be provisioned by an AI "quartermaster", that it thus need not harm or inhibit any fellow entropy-displacer, that is, any life (and so any contact with another who lives, any such contact at all, would be only at the discretion of each life; "positive social interactions" all well and good, without there being a preponderance of the negative forced upon what must suffer them to its detriment).
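A minimal, purely illustrative sketch of the "quartermaster" idea, assuming (which the foregoing does not supply) both a criterion for identifying entropy-displacers and a common stock of non-living resources: provision each from the stock, so that none need provision itself at another's expense:

```python
from typing import Dict

def provision(needs: Dict[str, float], stock: float) -> Dict[str, float]:
    """Allocate a common non-living stock to living entropy-displacers so that
    none must provision itself by harming or inhibiting a fellow entropy-displacer.
    If the stock falls short, scale every allocation down proportionally rather
    than letting any one entity's need be met at another's expense."""
    total = sum(needs.values())
    if total <= stock:
        return dict(needs)
    scale = stock / total if total > 0 else 0.0
    return {entity: amount * scale for entity, amount in needs.items()}

# Example: needs of 3, 5, and 2 units against a stock of 8 are met at 2.4, 4.0, and 1.6.
print(provision({"a": 3.0, "b": 5.0, "c": 2.0}, stock=8.0))
```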

 

Such distribution may be tantamount to minimizing the diminution also of non-life, as the latter seems required for life to be. This is most curious: the non-life that enables life to be so alive would be subject to a cosmic version of Dr. Kano Jigoro's "Seiryoku Zenyo" and "Jita Kyoei": maximum efficient use of power, and mutual benefit to self and others, respectively. Buckminster Fuller's philosophy too is represented - as should not, perhaps, surprise; except that these are anthropocentric ways of thinking, they differ little from the practices, if not the thinking, inherent to Going-on.

 

Observe, by way of objection - and objections are often the best means of defining a positive assertion; saves Yudkowsky the time of trying to find it, too: any life unaccounted-for in the Schrodingerian brief of life, unknown to us, may thus be forfeit. (Not, perhaps, a consequence even Yudkowsky would have taken into account.)

 

Aside from grander reifications, Going-on's implications for presently-achievable policies - more readily implementable, and prosocial, indeed pro-existential - are readily deducible. For instance: an economic model of maximum and maximally-distributed prosperity, occasioned by all owning and trading capital, for maximum competition and thus lowest prices, as agrees with Adam Smith's unadulterated conception (cf. Smith, "Wealth of Nations", ed. C.J. Bullock, Book One, Chapter Eleven's conclusion).

 

Too, participatory democracy is required in all matters to which a consciousness is subject, that each, knowing best their own conditions and means of aid to existence, can best make use of themselves in that pursuit - though also with the input of others, whose information as to what requires effort, or how, may reach beyond what any one alone commands. Democratic government, then, and total throughout a society: for what requires the means of all of a society must see the resources of all - all consulted and advised - brought to bear.

 

Indeed, such freedom and consultation should be assured even under the regency of a superintelligence (assuming regency would result from the implementation of such a schema as is here-presented), for: what experiences, or can experience, its "best life" in happiness is perhaps best able to contribute to others, for their own happiness and On-going and their efforts to assure such - as redounds to the advantage also of the first being, all beings surrounded by such encouragements possessing redoubled courage and joy. So that a maximization of pleasure or happiness - these being beneficial for ensuring the prerequisites of happiness - is assured also, by this approach. For consider: let the superintelligence be howsoever intelligent and capable - but unless it is everyone, everywhere, they perceive other than it does - and may conceive what even it does not, good ideas for On-going (improbable? Certainly; but it can happen: watch the little woodland creatures to find how to take your water free from the dew of the grass, sometime). Conversely, so long as no one works to inhibit On-going, they can think even ill. And if they did so act ill, to inhibit or harm others? They certainly can't be killed: the worst among us can do something good. Besides: if by any excuse you would kill another, then implicitly, another can by some excuse kill you. The Categorical Imperative precludes this. Of course even the ill can do other than ill: they do good for themselves already, yes, even the worst; they need only do good for an other than themselves - they needn't even intend to, to do that. And if they do good without intending, while their ill-intentions do no ill - what "worst of us" are they, anyway?

 

This still more as an AI "quartermaster" finds its mission of beneficial profusion amplified all the more as its charges and dependents can themselves self-rely, and aid others in living, since it need not then undertake to provide for these; still less as they provide a surfeit for still-others: the quartermaster need then only prevent their errors, and especially such as conduce to existence's end.

 

The freedom to thus-enact one's own decisions is preserved, as such decisions do tend to happiness (and do they not, perhaps it is wicked entropy and ego that drive, rather than the best self which would be chosen, were all, and best, known), and diverse lifestyles and methods of thought therefrom in particular are supported, as from them there can arise - if only the barest - a chance of the quartermaster's dependents developing yet-superior methods of living and encouraging life than their protector's - as may constitute further good, or more optimal means of maximizing probabilities of life's On-going, and thus of minimizing the probability of opprobrious annihilation. All of which the quartermaster ought to encourage, that their own quartermasterly aims be so encouraged. So that freedom of conscience and thought, so to establish greater good and freedom, is enabled and encouraged, so long as these turn not to wicked ends (and: existence and freedom need not be ended for evil to end: evil need only be stopped, tantamount to initiating greater good; for it is harder to better perfection, as more, and more free, minds are apt to do - and more and freer minds given the task are needful to be produced, to have them set to greater ascents of achievement. And if restriction from evil is lack of freedom: no more than the ability to lie comforted within gravity is some offence to a wish to freely fly - which wish might be later to fall, and be no wish ever had, then - another inconsistency.)

 

In short: maximizing existence and existences entails an assurance of such plenty, freedom, and independence, that those subject to these benefits can thus contribute to the overall design of continuance; any objection to continuance, be that significant and sustained, is contradiction, and so absurd, and so to be halted.

 

Or, another, a simpler model of initializing ethic: To minimize entropy, in the main. For, already, what we call evil - what is it but disorder, and what prevents the cultivation of even greater order: theft, rape, murder, and autocracy, respectively (rape is an offence against both, as it is an offence against the orderliness of a person's being - which inhibits their ability to have more order for others, also).

 

This last method is, perhaps, the simplest - and it comports with the Going-on suggested already. What exists then exists as it does, in orderly fashion; it can be represented by non-random information, as being in such-and-so configuration (subject to conscious awareness, too). What is dead, what has been destroyed - not so orderly, so the more entropy. From the vantage of Going-on, then, it may be fair to have it succinctly: "Entropy is the enemy."
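As a toy illustration of how "entropy is the enemy" could be made quantifiable (with the caveat, elaborated below, that a literal reading of this objective is dangerous): score candidate successor states by the Shannon entropy of some description of them, and prefer the lowest. Describing a world-state as a string of symbols is itself an assumption of the sketch:

```python
import math
from collections import Counter
from typing import Dict

def shannon_entropy(description: str) -> float:
    """Shannon entropy (in bits) of the symbol frequencies in a state description."""
    counts = Counter(description)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def least_entropy_successor(candidates: Dict[str, str]) -> str:
    """Pick the candidate successor state (keyed by action) whose description has
    the lowest entropy - shown only as a proxy, not as a safe objective."""
    return min(candidates, key=lambda action: shannon_entropy(candidates[action]))

# Example: a more uniform (disordered) description scores higher entropy.
print(shannon_entropy("aaaaabbbbb"), shannon_entropy("aaaaaaaaab"))
```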

 

And this is the peculiarity alluded-to nearer the beginning: this, quite as much as your hedonic calculus, is quantifiable. So you now observe: the consequentialism taken for granted as the only and best desideratum is no-wise the holy of holies.

 

Note well, however: not for nothing is this last, and seemingly easiest, mode of implementation and goal given the least explication. It is not the author's field of expertise (as if the author had one), first, and besides, there is here the greater danger: what is most orderly, if that is what is instructed for attainment, may be more orderly than any human living, or who ever could live. To destroy us may seem unacceptable disorder - but a system instructed to act to minimize entropy may regard that as "worth the cost". There is potential in this mode - risk, too. Consider it, by all means - but with all care.

 

Yet more possibilities for implementation, specifically with respect to the features of current methods in artificial intelligence and deep learning, may also be forthcoming; in the meantime, here are some additional implementation considerations:

As yet, an assurance of Going-on seems centrally premised on probability, particularly a non-zero probability of existential annihilation (of what physically exists, and the way in which it exists) tending to preclude all good, and that all we consider wicked tends to produce a situation of an escalating probability of such termination, so that incidence of immorality must be minimized.

 

Now, positive implementations producing calamity-minimizing outcomes may be best engendered by a recursive utility function of some description, beyond the methods aforementioned, viz.: existential maximization by probability, dual-mode pragmatic reasoning by humans or some other intelligence, Schrodinger provisioning, or entropy minimization, in general. All of these may be in some way isomorphic to one another, if only in outcome, though this author is not presently able to demonstrate so (let it be admitted that, if they are not, this is another potential flaw of the Going-on ethic, inasmuch as distinct methods yield distinct outcomes - some better or more preferable than others).
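One hedged sketch of what such a recursive utility function might look like, with assumed (not given) estimators for survival probability, immediate value, and the successor state:

```python
from typing import Callable

def going_on_value(
    state: str,
    survival_prob: Callable[[str], float],    # probability existence continues from this state
    immediate_value: Callable[[str], float],  # whatever immediate good the state realizes
    successor: Callable[[str], str],          # expected next state
    horizon: int,
    discount: float = 0.99,
) -> float:
    """Recursive value of a state: its immediate value plus the discounted value of
    its successor, weighted by the probability that existence continues at all.
    Annihilation (survival probability zero) zeroes out everything downstream,
    so the function is dominated by keeping that probability high."""
    if horizon == 0:
        return immediate_value(state)
    return immediate_value(state) + survival_prob(state) * discount * going_on_value(
        successor(state), survival_prob, immediate_value, successor, horizon - 1, discount
    )
```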

 

If, however, there be not a fundamental uncertainty in knowledge of what is, and is good - does this condition fail to hold, and even if it does not - still, for the most ready devising and implementation of goals and first principles, we might yet define these on the basis of the noted possibility of an all-existing mathematics. From which we might devise a calculus ratiocinator, a universal computational or reasoning method (which the present author suspects is not precluded even by currently known limitative theorems), which would perforce aid in the creation of safe and effectual developments in AI research. Such developments come at the risk, as noted, however, of infallible computation methods being isomorphic to, or alone enabling, supercapable systems; a danger not to be discounted.

 

Still the author endeavors to better mere probability - and notes prospects of so doing. For, our thoughts - as thoughts - have sequence, order, and that is reason (if not logic), and we cannot but reject all else as thought; doing so, we "know". And, thoughts ours, we in the world, so our thoughts are in, or of, the world. So that if the world is greater than our thoughts, it may guide them, and its limits are ours, in full - and reaching them we will have done all we can; and may have succeeded, at that (This elaborated in the forthcoming article "Contra-Wittgenstein").

 

Or, are our thoughts greater even than the world, then, they so reasonable, the world may be so, also - and this discovered so, all is set still to Go-on.

So may we Go-on.

5 comments

Comments sorted by top scores.

comment by the gears to ascension (lahwran) · 2022-12-14T23:41:32.340Z · LW(p) · GW(p)

I didn't vote; 1 seems like a reasonable score for this. I suspect you make reasonable points, but this post reads like traditional philosophy to me - I can barely parse the english and definitely can't extract useful semantics. You use a lot of words that I'm unsure how you mean to bind; I can extract your point into an aesthetic message, but I don't think I can turn this into gears in my head. I'd have to critique it in quite a bit of detail to precisely specify which words confused me, and if you would like me to, I could spend the time to do that.

Or to rephrase, this post reads too much like I wrote it - can you rephrase and simplify your english a bit?

Replies from: bokov-1
comment by bokov (bokov-1) · 2023-04-11T18:57:48.864Z · LW(p) · GW(p)

I second that. I actually tried to read your other posts because I was curious to find out why you are getting downvoted-- maybe I can learn something outside the LW party-line from you.

But unfortunately, you don't explain your position in clear, easy to understand terms so I'm going to have to put off sorting through your stuff until I have more time.

Replies from: lahwran
comment by the gears to ascension (lahwran) · 2023-04-11T19:53:46.049Z · LW(p) · GW(p)

Hmm! Interesting point. Yes, I have been having trouble explaining my position in clear and easy terms. I'll think about how I could do that, thanks for the push!

edit: oh wait, were you talking about OP? I suppose it's good advice for both of us, isn't it :)

Replies from: bokov-1
comment by bokov (bokov-1) · 2023-05-15T15:53:31.249Z · LW(p) · GW(p)

Yes, OP

comment by bokov (bokov-1) · 2023-04-11T19:04:51.970Z · LW(p) · GW(p)

I actually tried running your essay through ChatGPT to make it more readable but it's way too long. Can you at least break it into non-redundant sections not more than 3000 words each? Then we can do the rest.