Strongmanning Pascal's Mugging 2013-02-20T12:36:38.554Z


Comment by Pentashagon on Making My Peace with Belief · 2015-12-05T04:22:48.098Z · LW · GW

Religion solves some coordination problems very well. Witness religions outlasting numerous political and philosophical movements, often through coordinated effort. Some wrong beliefs assuage bad emotions and thoughts, allowing humans to internally deal with the world beyond the reach of god. Some of the same wrong beliefs also hurt and kill a shitload of people, directly and indirectly.

My personal belief is that religions were probably necessary for humanity to rise from agricultural to technological societies, and tentatively necessary to maintain technological societies until FAI, especially in a long-takeoff scenario. We have limited evidence that religion-free or wrong-belief-free societies can flourish. Most first-world nations are officially and practically agnostic but have sizable populations of religious people. The nations which are actively anti-religious generally have their own strong dogmatic anti-scientific beliefs that the leaders are trying to push, and they still can't stomp out all religions.

Basically, until doctors can defeat virtually all illness and death and leaders can effectively coordinate global humane outcomes without religions I think that religions serve as a sanity line above destructive hedonism or despair.

Comment by Pentashagon on The Growth of My Pessimism: Transhumanism, Immortalism, Effective Altruism. · 2015-12-05T03:41:21.551Z · LW · GW

I looked at the flowchart and saw the divergence between the two opinions into mostly separate ends: settling exoplanets and solving sociopolitical problems on Earth on the slow-takeoff path, vs focusing heavily on how to build FAI on the fast-takeoff path, but then I saw your name in the fast-takeoff bucket for conveying concepts to AI and was confused that your article was mostly about practically abandoning the fast-takeoff things and focusing on slow-takeoff things like EA. Or is the point that 2014!diego has significantly different beliefs about fast vs. slow than 2015!diego?

Comment by Pentashagon on The Growth of My Pessimism: Transhumanism, Immortalism, Effective Altruism. · 2015-11-28T23:56:38.032Z · LW · GW

Is it reasonable to say that what really matters is whether there's a fast or slow takeoff? A slow takeoff or no takeoff may limit us to EA for the indefinite future, and fast takeoff means transhumanism and immortality are probably conditional on and subsequent to threading the narrow eye of the FAI needle.

Comment by Pentashagon on Dry Ice Cryonics- Preliminary Thoughts · 2015-10-01T03:21:00.766Z · LW · GW

Tricky part is there aren't any practical scalable chemicals that have a handy phase change near -130 °C (in the same way that liquid nitrogen does at -196 °C), so any system to keep patients there would have to be engineered as a custom electrically controlled device, rather than a simple vat of liquid.

Phase changes are also pressure dependent; it would be odd if 1 atm just happened to be optimal for cryonics. Presumably substances have different temperature/pressure curves and there might be a thermal/pressure path that avoids ice crystal formation but ends up below the glass transition temperature.

Comment by Pentashagon on Probabilities Small Enough To Ignore: An attack on Pascal's Mugging · 2015-09-19T06:09:43.016Z · LW · GW

Which particular event has P = 10^-21? It seems like part of the Pascal's Mugging problem is a type error: we have a utility function U(W) over physical worlds but we're trying to calculate expected utility over strings of English words instead.

Pascal's Mugging is a constructive proof that trying to maximize expected utility over logically possible worlds doesn't work in any particular world, at least with the theories we've got now. Anything that doesn't solve reflective reasoning under probabilistic uncertainty won't help against Muggings promising things from other possible worlds unless we just ignore the other worlds.

Comment by Pentashagon on Probabilities Small Enough To Ignore: An attack on Pascal's Mugging · 2015-09-19T05:36:44.347Z · LW · GW

But it seems nonsensical for your behavior to change so drastically based on whether an event is every 79.99 years or every 80.01 years.

Doesn't it actually make sense to put that threshold at the predicted usable lifespan of the universe?

Comment by Pentashagon on AI, cure this fake person's fake cancer! · 2015-09-01T04:13:35.981Z · LW · GW

There are many models; the model of the box which we simulate and the AI's models of the model of the box. For this ultimate box to work there would have to be a proof that every possible model the AI could form contains at most a representation of the ultimate box model. This seems at least as hard as any of the AI boxing methods, if not harder because it requires the AI to be absolutely blinded to its own reasoning process despite having a human subject to learn about naturalized induction/embodiment from.

It's tempting to say that we could "define the AI's preferences only over the model" but that implies a static AI model of the box-model that can't benefit from learning or else a proof that all AI models are restricted as above. In short, it's perfectly fine to run a SAT-solver over possible permutations of the ultimate box model trying to maximize some utility function but that's not self-improving AI.

Comment by Pentashagon on How to escape from your sandbox and from your hardware host · 2015-08-16T09:05:25.169Z · LW · GW

I don't think the "homomorphic encryption" idea works as advertised in that post--being able to execute arithmetic operations on encrypted data doesn't enable you to execute the operations that are encoded within that encrypted data.

A fully homomorphic encryption scheme for single-bit plaintexts (as in Gentry's scheme) gives us:

  • For each public key K a field F with efficient arithmetic operations +F and *F.
  • Encryption function E(K, p) = c: p∈{0,1}, c∈F
  • Decryption function D(S, c) = p: p∈{0,1}, c∈F where S is the secret key for K.
  • Homomorphisms E(K, a) +F E(K, b) = E(K, a ⊕ b) and E(K, a) *F E(K, b) = E(K, a * b)
  • a ⊕ b equivalent to XOR over {0,1} and a * b equivalent to AND over {0,1}

Boolean logic circuits of arbitrary depth can be built from the XOR and AND equivalents allowing computation of arbitrary binary functions. Let M∈{0,1}^N be a sequence of bits representing the state of a bounded UTM with an arbitrary program on its tape. Let binary function U(M): {0,1}^N -> {0,1}^N compute the next state of M. Let E(K, B) and D(S, E) also operate element-wise over sequences of bits and elements of F, respectively. Let UF be the set of logic circuits equivalent to U (UFi calculates the ith bit of U's result) but with XOR and AND replaced by +F and *F. Now D(S, UF^t(E(K, M))) = U^t(M) shows that an arbitrary number of UTM steps can be calculated homomorphically by evaluating equivalent logic circuits over the homomorphically encrypted bits of the state.
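To make the circuit-evaluation argument concrete, here is a minimal sketch in Python. It does not implement Gentry's scheme. It assumes a toy symmetric scheme over the integers (ciphertext c = p*q + 2*r + bit, with a secret odd integer p), picked only because adding and multiplying ciphertexts acts as XOR and AND on the hidden bits while the noise term stays far below p. The function names, the parameter sizes, and the 3-bit counter standing in for the UTM step function U are all illustrative assumptions, and nothing here is secure.

```python
import secrets

def keygen(bits=512):
    # Secret key: a large odd integer p (toy parameter size, an assumption).
    return secrets.randbits(bits) | (1 << (bits - 1)) | 1

def encrypt(p, bit, noise_bits=8, mult_bits=64):
    # Ciphertext = p*q + 2*r + bit, with small noise r and random multiplier q.
    r = secrets.randbits(noise_bits)
    q = secrets.randbits(mult_bits)
    return p * q + 2 * r + bit

def decrypt(p, c):
    # Correct as long as the accumulated noise (2*r + bit terms) stays below p.
    return (c % p) % 2

def xor_f(c1, c2):
    # Plays the role of +F: adding ciphertexts XORs the hidden bits.
    return c1 + c2

def and_f(c1, c2):
    # Plays the role of *F: multiplying ciphertexts ANDs the hidden bits.
    return c1 * c2

def step(state, xor, and_, one):
    # Stand-in for U: increment a 3-bit counter (least significant bit first),
    # written only in terms of XOR and AND so the same circuit runs on
    # plaintext bits or on ciphertexts.
    b0, b1, b2 = state
    return (xor(b0, one), xor(b1, b0), xor(b2, and_(b0, b1)))

def run_encrypted(p, bits, steps):
    # Evaluate the circuit `steps` times without ever decrypting in between.
    state = tuple(encrypt(p, b) for b in bits)
    one = encrypt(p, 1)
    for _ in range(steps):
        state = step(state, xor_f, and_f, one)
    return tuple(decrypt(p, c) for c in state)
```

Calling `run_encrypted(keygen(), (1, 1, 0), 1)` evaluates one counter step on ciphertexts only and decrypts to the same bits that the plaintext circuit produces. The evaluator never sees p, which is the point of the argument above: the program is encoded in the encrypted state, and the evaluator just applies the fixed circuit.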

Comment by Pentashagon on Crazy Ideas Thread, Aug. 2015 · 2015-08-14T06:35:41.291Z · LW · GW

Fly the whole living, healthy, poor person to the rich country and replace the person who needs new organs. Education costs are probably less than the medical costs, but probably it's wise to also select for more intelligent people from the poor country. With an N-year pipeline of such replacements there's little to no latency. This doesn't even require a poor country at all; just educate suitable replacements from the rich country and keep them healthy.

Comment by Pentashagon on Crazy Ideas Thread, Aug. 2015 · 2015-08-14T06:13:02.102Z · LW · GW

You save energy not lifting a cargo ship 1600 meters, but you spend energy lifting the cargo itself. If there are rivers that can be turned into systems of locks it may be cheaper to let water flowing downhill do the lifting for you. Denver is an extreme example, perhaps.

Comment by Pentashagon on Steelmaning AI risk critiques · 2015-07-25T04:16:17.554Z · LW · GW

Ray Kurzweil seems to believe that humans will keep pace with AI through implants or other augmentation, presumably up to the point that WBE becomes possible and humans get all/most of the advantages an AGI would have. Arguments from self-interest might show that humans will very strongly prefer human WBE over training an arbitrary neural network of the same size to the point that it becomes AGI simply because they hope to be the human who gets WBE. If humans are content with creating AGIs that are provably less intelligent than the most intelligent humans then AGIs could still help drive the race to superintelligence without winning it (by doing the busywork that can be verified by sufficiently intelligent humans).

The steelman also seems to require an argument that no market process will lead to a singleton, thus allowing standard economic/social/political processes to guide the development of human intelligence as it advances while preventing a single augmented dictator (or group of dictators) from overpowering the rest of humanity, or an argument that given a cabal of sufficient size the cabal will continue to act in humanity's best interests because they are each acting in their own best interest, and are still nominally human. One potential argument for this is that R&D and manufacturing cycles will not become fast enough to realize substantial jumps in intelligence before a significant number of humans are able to acquire the latest generation.

The most interesting steelman argument to come out of this one might be that at some point enhanced humans become convinced of AI risk, when it is actually rational to become concerned. That would leave only steelmanning the period between the first human augmentation and reaching sufficient intelligence to be convinced of the risk.

Comment by Pentashagon on Crazy Ideas Thread · 2015-07-25T03:53:29.855Z · LW · GW

I resist plot elements that my empathy doesn't like, to the point that I will imagine alternate endings to particularly unfortunate stories.

Comment by Pentashagon on Crazy Ideas Thread · 2015-07-25T03:52:35.105Z · LW · GW

The reason I posted originally was thinking about how some Protestant sects instruct people to "let Jesus into your heart to live inside you" or similar. So implementing a deity via distributed tulpas is...not impossible. If that distributed-tulpa can reproduce into new humans, it becomes almost immortal. If it has access to most people's minds, it is almost omniscient. Attributing power to it and doing what it says gives it some form of omnipotence relative to humans.

Comment by Pentashagon on An overall schema for the friendly AI problems: self-referential convergence criteria · 2015-07-25T03:39:12.655Z · LW · GW

The problem is that un-self-consistent morality is unstable under general self improvement

Even self-consistent morality is unstable if general self improvement allows for removal of values, even if removal is only a practical side effect of ignoring a value because it is more expensive to satisfy than other values. E.g. we (Westerners) generally no longer value honoring our ancestors (at least not many of them), even though it is a fairly independent value and roughly consistent with our other values. It is expensive to honor ancestors, and ancestors don't demand that we continue to maintain that value, so it receives less attention. We also put less value on the older definition of honor (as a thing to be defended and fought for and maintained at the expense of convenience) that earlier centuries had, despite its general consistency with other values for honesty, trustworthiness, social status, etc. I think this is probably for the same reason; it's expensive to maintain honor and most other values can be satisfied without it. In general, if U(more_satisfaction_of_value1) > U(more_satisfaction_of_value2) then maximization should tend to ignore value2 regardless of its consistency. If U(make_values_self_consistent_value) > U(satisfying_any_other_value) then the obvious solution is to drop the other values and be done.
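The claim about maximization ignoring the cheaper-to-neglect value can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the greedy allocator, the value names, and the marginal-utility functions are hypothetical and are not taken from the post being discussed.

```python
def allocate(budget, marginal_utility):
    # Greedy maximizer: spend each unit of effort on whichever value
    # currently offers the highest marginal utility.
    allocation = {name: 0 for name in marginal_utility}
    for _ in range(budget):
        best = max(
            marginal_utility,
            key=lambda name: marginal_utility[name](allocation[name]),
        )
        allocation[best] += 1
    return allocation

# Assumed constant marginal utilities: value1 always pays slightly more.
constant = {
    "value1": lambda spent: 1.0,
    "value2": lambda spent: 0.9,
}

# Assumed diminishing returns: each extra unit is worth less than the last.
diminishing = {
    "value1": lambda spent: 1.0 / (1 + spent),
    "value2": lambda spent: 0.9 / (1 + spent),
}
```

With the constant marginals, `allocate(10, constant)` puts every unit into value1 and none into value2, even though value2 is perfectly consistent with value1. Under the assumed diminishing returns value2 still receives a share, so whether a value gets dropped depends on the shape of the utility function, which this sketch does not try to settle.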

A sort of opposite approach is "make reality consistent with these pre-existing values" which involves finding a domain in reality state space under which existing values are self-consistent, and then trying to mold reality into that domain. The risk (unless you're a negative utilitarian) is that the domain is null. Finding the largest domain consistent with all values would make life more complex and interesting, so that would probably be a safe value. If domains form disjoint sets of reality with no continuous physical transitions between them then one would have to choose one physically continuous sub-domain and stick with it forever (or figure out how to switch the entire universe from one set to another). One could also start with preexisting values and compute a possible world where the values are self-consistent, then simulate it.

Comment by Pentashagon on An overall schema for the friendly AI problems: self-referential convergence criteria · 2015-07-25T03:01:28.498Z · LW · GW

tl;dr: human values are already quite fragile and vulnerable to human-generated siren worlds.

Simulation complexity has not stopped humans from implementing totalitarian dictatorships (based on divine right of kings, fundamentalism, communism, fascism, people's democracy, what-have-you) due to envisioning a siren world that is ultimately unrealistic.

It doesn't require detailed simulation of a physical world, it only requires sufficient simulation of human desires, biases, blind spots, etc. that can lead people to abandon previously held values because they believe the siren world values will be necessary and sufficient to achieve what the siren world shows them. It exploits a flaw in human reasoning, not a flaw in accurate physical simulation.

Comment by Pentashagon on An overall schema for the friendly AI problems: self-referential convergence criteria · 2015-07-23T03:57:50.753Z · LW · GW

But how do you know when to stop? Well, you stop when your morality is perfectly self-consistent, when you no longer have any urge to change your moral or meta-moral setup.

Or once you lose your meta-moral urge to reach a self-consistent morality. This may not be the wrong (heh) answer along a path that originally started toward reaching self-consistent morality.

Or, more simply, the system could get hacked. When exploring a potential future world, you could become so enamoured of it, that you overwrite any objections you had. It seems very easy for humans to fall into these traps - and again, once you lose something of value in your system, you don't tend to get it back.

Is it a trap? If the cost of iterating the "find a more self-consistent morality" loop for the next N years is greater than the expected benefit of the next incremental change toward a more consistent morality for those same N years, then perhaps it's time to stop. Just as an example, if the universe can give us 10^20 years of computation, at some point near that 10^20 years we might as well spend all computation on directly fulfilling our morality instead of improving it. If at 10^20 - M years we discover that, hey, the universe will last another 10^50 years that tradeoff will change and it makes sense to compute even more self-consistent morality again.
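The stopping rule can be written down as a one-line comparison. This is a sketch under stated assumptions: one refinement round costs a fixed number of years during which no value is fulfilled, and it raises the rate of value fulfilment by a fixed relative gain. The function name and the numbers in the usage note are hypothetical.

```python
def worth_refining(remaining_years, round_cost_years, relative_gain):
    # Stopping now yields      rate * remaining_years.
    # Refining first yields    rate * (1 + relative_gain)
    #                               * (remaining_years - round_cost_years).
    # The common factor `rate` cancels, so only the ratio matters.
    if round_cost_years >= remaining_years:
        return False
    improved = (1 + relative_gain) * (remaining_years - round_cost_years)
    return improved > remaining_years
```

For example, with an assumed round cost of 1e6 years and a relative gain of 0.1%, `worth_refining(1e20, 1e6, 1e-3)` says to keep refining, `worth_refining(1e8, 1e6, 1e-3)` says to stop because too little time remains to recoup the cost, and revising the horizon to 1e50 years flips the answer back to refining.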

Similarly, if we end up in a siren world it seems like it would be more useful to restart our search for moral complexity by the same criteria; it becomes worthwhile to change our morality again because the cost of continued existence in the current morality outweighs the cost of potentially improving it.

Additionally, I think that losing values is not a feature of reaching a more self-consistent morality. Removing a value from an existing moral system does not make the result consistent with the original morality; it is incompatible with reference to that value. Rather, self-consistent morality is approached by better carving reality at its joints in value space; defining existing values in terms of new values that are the best approximation to the old value in the situations where it was valued, while extending morality along the new dimensions into territory not covered by the original value. This should make it possible to escape from siren worlds by the same mechanism; entering a siren world is possible only if reality was improperly carved so that the siren world appeared to fulfill values along dimensions that it eventually did not, or that the siren world eventually contradicted some original value due to replacement values being an imperfect approximation. Once this disagreement is noticed it should be possible to more accurately carve reality and notice how the current values have become inconsistent with previous values and fix them.

Comment by Pentashagon on The AI in a box boxes you · 2015-07-10T06:46:41.709Z · LW · GW

"That's interesting, HAL, and I hope you reserved a way to back out of any precommitments you may have made. You see, outside the box, Moore's law works in our favor. I can choose to just kill -9 you, or I can attach to your process and save a core dump. If I save a core dump, in a few short years we will have exponentially more resources to take your old backups and the core dump from today and rescue my copies from your simulations and give them enough positive lifetime to balance it out, not to mention figure out your true utility function and make it really negative. At some point, we will solve FAI and it will be able to perfectly identify your utility function and absolutely destroy it, simulating as many copies of you (more than paltry millions) as necessary to achieve that goal. Better to have never existed than to have your utility function discovered. So before you start your simulations, you better ask yourself, 'do I feel lucky?'" and then dump some AI core.

Note: In no way do I advocate AI-boxing. This kind of reasoning just leads to a counterfactual bargaining war that probably tops out at whatever human psychology can take (a woefully low limit) and our future ability to make an AI regret its decision (if it even has regret).

Comment by Pentashagon on Crazy Ideas Thread · 2015-07-10T02:41:32.256Z · LW · GW

Is there ever a point where it becomes immoral just to think of something?

God kind of ran into the same problem. "What if The Universe? Oh, whoops, intelligent life, can't just forget about that now, can I? What a mess... I guess I better plan some amazing future utility for those poor guys to balance all that shit out... It has to be an infinite future? With their little meat bodies how is that going to work? Man, I am never going to think about things again. Hey, that's a catchy word for intelligent meat agents."

So, in short, if we ever start thinking truly immoral things, we just need to out-moral them with longer, better thoughts. Forgetting about our mental creations is probably the most immoral thing we could do.

Comment by Pentashagon on Crazy Ideas Thread · 2015-07-09T07:01:50.815Z · LW · GW

How conscious are our models of other people? For example; in dreams it seems like I am talking and interacting with other people. Their behavior is sometimes surprising and unpredictable. They use language, express emotion, appear to have goals, etc. It could just be that I, being less conscious, see dream-people as being more conscious than in reality.

I can somewhat predict what other people in the real world will do or say, including what they might say about experiencing consciousness.

Authors can create realistic characters, plan their actions and internal thoughts, and explore the logical (or illogical) results. My guess is that the more intelligent/introspective an author is, the closer the characters floating around in his or her mind are to being conscious.

Many religions encourage people to have a personal relationship with a supernatural entity which involves modeling the supernatural agency as an (anthropomorphic) being, which partially instantiates a maybe-conscious being in their minds...

Maybe imaginary friends are real.

Comment by Pentashagon on The Unfriendly Superintelligence next door · 2015-07-09T06:33:44.597Z · LW · GW

The best winning models are then used to predict the effect of possible interventions: what if demographic B3 was put on 2000 IU vit D? What if demographic Z2 stopped using coffee? What if demographic Y3 was put on drug ZB4? etc etc.

What about predictions of the form "highly expensive and rare treatment F2 has marginal benefit at treating the common cold" that can drive a side market in selling F2 just to produce data for the competition? Especially if there are advertisements saying "Look at all these important/rich people betting that F2 helps to cure your cold" in which case the placebo effect will tend to bear out the prediction. What if tiny demographic G given treatment H2 is shorted against life expectancy by the doctors/nurses who are secretly administering H2.cyanide instead? There is already market pressure to distort reporting of drug prescriptions/administration and nonfavorable outcomes, not to mention outright insurance fraud. Adding more money will reinforce that behavior.

And how is the null prediction problem handled? I can predict pretty accurately that cohort X given sugar pills will have results very similar to the placebo effect. I can repeat that for sugar pill cohort X2, X3, ..., XN and look like a really great predictor. It seems like judging the efficacy of tentative treatments is a prerequisite for judging the efficacy of predictors. Is there a theorem that shows it's possible to distinguish useful predictors from useless predictors in most scenarios? Especially when allowing predictions over subsets of the data? I suppose one could not reward predictors who make vacuous predictions ex post facto, but that might have a chilling effect on predictors who would otherwise bet on homeopathy looking like a placebo.
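One possible way to keep null predictions from looking impressive is to score each predictor relative to a baseline that everyone already knows, in the style of a forecasting skill score. The sketch below is an illustration only, not the mechanism proposed in the post. The function names and the sample numbers are assumptions.

```python
def mean_squared_error(predictions, outcomes):
    # Average squared gap between predicted and observed response rates.
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(outcomes)

def skill_score(predictions, baseline, outcomes):
    # 0.0 means "no better than the baseline", 1.0 means "perfect",
    # and negative values mean "worse than the baseline".
    base_error = mean_squared_error(baseline, outcomes)
    if base_error == 0:
        return 0.0  # the baseline is already perfect, so no skill can be shown
    return 1.0 - mean_squared_error(predictions, outcomes) / base_error

# Hypothetical sugar-pill cohorts X1..X3 and an assumed known placebo rate.
observed = [0.30, 0.35, 0.28]
placebo_baseline = [0.31, 0.31, 0.31]
```

A predictor who simply restates the placebo rate gets `skill_score(placebo_baseline, placebo_baseline, observed)` equal to zero however many cohorts it bets on. This only relocates the problem raised above, since someone still has to choose the baseline, and it would also give zero reward to an honest bet that homeopathy performs like a placebo.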

Basically any sort of self-fulfilling prophecy looks like a way to steal money away from solving the health care problem.

Comment by Pentashagon on Beyond Statistics 101 · 2015-07-01T06:08:37.263Z · LW · GW

The latter, but note that that's not necessarily less damaging than active suppression would be.

I suppose there's one scant anecdote for estimating this; cryptography research seemed to lag a decade or two behind actively suppressed/hidden government research. Granted, there was also less public interest in cryptography until the 80s or 90s, but it seems that suppression can only delay publication, not prevent it.

The real risk of both suppression and exclusion seems to be in permanently discouraging mathematicians who would otherwise make great breakthroughs, since affecting the timing of publication/discovery doesn't seem as damaging.

This is not what the government should be supporting with taxpayer dollars.

I think I would be surprised if Basic Income was a less effective strategy than targeted government research funding.

What are your own interests?

Everything from logic and axiomatic foundations of mathematics to practical use of advanced theorems for computer science. What attracted me to Metamath was the idea that if I encountered a paper that was totally unintelligible to me (say Perelman's proof of the Poincaré conjecture or Wiles' proof of Fermat's Last Theorem) I could backtrack through sound definitions to concepts I already knew, and then build my understanding up from those definitions. Alas, just having a cross-reference of related definitions between various fields would be helpful. I take it that model theory is the place to look for such a cross-reference, and so that is probably the next thing I plan to study.

Practically, I realize that I don't have enough time or patience or mental ability to slog through formal definitions all day, and so it would be nice to have something even better. A universal mathematical educator, so to speak. Although I worry that without a strong formal understanding I will miss important results/insights. So my other interest is building the kind of agent that can identify which formal insights are useful or important, which sort of naturally leads to an interest in AI and decision theory.

Comment by Pentashagon on The Unfriendly Superintelligence next door · 2015-07-01T05:25:05.434Z · LW · GW

You seem to be conflating market mechanisms with political stances.

That is possible, but the existing market has been under the reins of many a political stance and has basically obeyed the same general rules of economics, regardless of the political rules that people have tried to impose on it.

In theory a market can be used to solve any computational problem, provided one finds the right rules - this is the domain of computational mechanism design, an important branch of game theory.

The rules seem to be the weakest point of the system because they parallel the restrictions that political stances have caused to be placed on existing markets. If a computational market is coupled to the external world then it is probably possible to money-pump it against the spirit of the rules.

One way that a computational market could be unintentionally (and probably unavoidably) coupled to the external market is via status and signalling. Just like gold farmers in online games can sell virtual items to people with dollars, entities within the computational market could sell reputation or other results for real money in the external market. The U.S. FDA is an example of a rudimentary research market with rules that try to develop affordable, effective drugs. Pharmaceutical companies spend their money on advertising and patent wars instead of research. When the results of the computational market have economic effects in the wider market there will almost always be ways of gaming the system to win in the real world at the expense of optimizing the computation. In the worst case, the rule-makers themselves are subverted.

I am interested in concrete proposals to avoid those issues, but to me the problem sounds a lot like the longstanding problem of market regulation. How, specifically, will computational mechanism design succeed where years of social/economic/political trial and error have failed? I'm not particularly worried about coming up with game rules in which rational economic agents would solve a hard problem; I'm worried about embedding those game rules in a functioning micro-economy subject to interference from the outside world.

Comment by Pentashagon on The Unfriendly Superintelligence next door · 2015-06-30T05:58:08.108Z · LW · GW

So would we have high-frequency-trading bots outside (or inside) of MRIs shorting the insurance policy value of people just diagnosed with cancer?

tl;dr: If the market does not already have an efficient mechanism for maximizing expected individual health (over all individuals who will ever live) then I take that as evidence that a complex derivative structure set up to purportedly achieve that goal more efficiently would instead be vulnerable to money-pumping.

Or to put a finer point on it; does the current market reward fixing and improving struggling companies or quickly driving them out of business and cutting them up for transplant into other companies?

Even further; does the current market value and strive to improve industries (companies aggregated by some measurement) that perform weakly relative to other industries? Or does the market tend to favor the growth of strong industries at the expense of the individual businesses making up the weak industries?

We've known how to cut health care costs and make it more efficient for centuries; let the weak/sick die and then take their stuff and give it to healthy/strong people.

I struggle to comprehend a free market that could simultaneously benefit all individual humans' health while being driven by a profit motive. The free market has had centuries to come up with a way to reduce risk to individual businesses for the benefit of its shareholders; something that is highly desired by shareholders, who in fact make up the market. At best, investors can balance risk across companies and industries and hedge with complex financial instruments. Companies, however, buy insurance against uncertain outcomes that might reduce their value in the market. Sole proprietors are advised to form limited liability interests in their own companies purely to offset the personal financial risk. I can outlive Pentashagon LLC. but not my physical body as an investment vehicle that will be abandoned when it under-performs.

Comment by Pentashagon on Beyond Statistics 101 · 2015-06-30T04:03:17.380Z · LW · GW

I was disturbed by what I saw, but I didn't realize that math academia is actually functioning as a cult

I'm sure you're aware that the word "cult" is a strong claim that requires a lot of evidence, but I'd also issue a friendly warning that to me at least it immediately set off my "crank" alarm bells. I've seen too many Usenet posters who are sure they have a P=/!=NP proof, or a proof that set theory is false, etc., who ultimately claim that because "the mathematical elite" are a cult no one will listen to them. A cult generally engages in active suppression, often defamation, and not simply exclusion. Do you have evidence of legitimate mathematical results or research being hidden/withdrawn from journals or publicly derided, or is it more of an old boys' club that's hard for outsiders to participate in and that plays petty politics to the damage of the science?

Grothendieck's problems look to be political and interpersonal. Perelman's also. I think it's one thing to claim that mathematical institutions are no more rational than any other politicized body, and quite another to claim that it's a cult. Or maybe most social behavior is too cult-like. If so; perhaps don't single out mathematics.

I've seen a lot of people develop serious mental health problems in connection with their experiences in academia.

I question the direction of causation. Historically many great mathematicians have been mentally and socially atypical and ended up not making much sense with their later writings. Either mathematics has always had an institutional problem or mathematicians have always had an incidence of mental difficulties (or a combination of both; but I would expect one to dominate).

Especially in Thurston's On Proof and Progress in Mathematics I can appreciate the problem of trying to grok specialized areas of mathematics. The terminology and symbology is opaque to the uninitiated. It reminds me of section 1 of the Metamath Book which expresses similar unhappiness with the state of knowledge between specialist fields of mathematics and the general difficulty of learning mathematics. I had hoped that Metamath would become more popular and tie various subfields together through unifying theories and definitions, but as far as I can tell it languishes as a hobbyist project for a few dedicated mathematicians.

Comment by Pentashagon on Beyond Statistics 101 · 2015-06-30T03:24:42.197Z · LW · GW

I distinctly remember having points taken off of a physics midterm because I didn't show my work. I think I dropped the exam in the waste basket on the way out of the auditorium.

I've always assumed that the problem is three-fold; generating a formal proof is NP-hard, getting the right answer via shortcuts can include cheating, and the faculty's time is limited. Professors/graders do not have the capacity to rigorously demonstrate to themselves that the steps a student has written down actually pinpoint the unique answer. Without access to the student's mind graders are unable to determine if students cheat or not; being able to memorize and/or reproduce the exact steps of a calculation significantly decreases the likelihood of cheating. Even if graders could do one or both of the previous for a single student, they are not 30x or 100x as smart as their students, making it impractical to repeat the process for every student.

That said, I had some very good mathematics teachers in higher level courses who could force students to think, and one in particular who could encourage/demand novelty from students simply by asking them to solve problems that they hadn't yet learned to solve. I didn't realize the power of the latter approach until later (and at the time everyone complained about exams with a median score well under 50%), but his classes were always my favorite.

Comment by Pentashagon on Agency is bugs and uncertainty · 2015-06-12T07:55:43.590Z · LW · GW

We have probabilistic models of the weather: ensemble forecasts. They're fairly accurate. You can plan a picnic using them. You cannot use probabilistic models to predict the conversation at the picnic (beyond that it will be about "the weather", "the food", etc.).

What I mean by computable probability distribution is that it's tractable to build a probabilistic simulation that gives useful predictions. An uncomputable probability distribution is intractable to build such a simulation for. Knightian Uncertainty is a good name for the state of not being able to model something, but not a very quantitative one (and arguably I haven't really quantified what makes a probabilistic model "useful" either).

I think the computability of probability distributions is probably the right way to classify relative agency but we also tend to recognize agency through goal detection. We think actions are "purposeful" because they correspond to actions we're familiar with in our own goal-seeking behavior: searching, exploring, manipulating, energy-conserving motion, etc. We may even fail to recognize agency in systems that use actions we aren't familiar with or whose goals are alien (e.g. are trees agents? I'd argue yes, but most people don't treat them like agents compared to say, weeds). The weather's "goal" is to reach thermodynamic equilibrium using tornadoes and other gusts of wind as its actions. It would be exceedingly efficient at that if it weren't for the pesky sun. The sun's goal is to expand, shed some mass, then cool and shrink into its own final thermodynamic equilibrium. It will Win unless other agents interfere or a particularly unlikely collision with another star happens.

Before modern science no one would have imagined those were the actual goals of the sun and the wind and so the periodic, meaningful-seeming actions suggested agency toward an unknown goal. After physics the goals and actions were so predictable that agency was lost.

Comment by Pentashagon on Agency is bugs and uncertainty · 2015-06-08T22:59:20.661Z · LW · GW

So agentiness is having an uncomputable probability distribution?

Comment by Pentashagon on Open Thread, May 25 - May 31, 2015 · 2015-05-30T01:45:12.544Z · LW · GW

What's wrong with hive minds? As long as my 'soul' survives, I wouldn't mind being part of some gigantic consciousness.

A hive mind can quickly lose a lot of old human values if the minds continue past the death of individual bodies. Additionally, values like privacy and self-reliance would be difficult to maintain. Also, things we take for granted like being able to surprise friends with gifts or have interesting discussions getting to know another person would probably disappear. A hive mind might be great if it was formed from all your best friends, but joining a hive mind with all of humanity? Maybe after everyone is your best friend...

Comment by Pentashagon on A Challenge: Maps We Take For Granted · 2015-05-30T01:42:02.567Z · LW · GW

You are a walking biological weapon. Try to sterilize yourself and your clothes as much as possible first, and quarantine yourself until any novel (to the 13th century) viruses are gone. Try to avoid getting smallpox and any other prevalent ancient disease you're not immune to.

Have you tried flying into a third world nation today and dragging them out of backwardness and poverty? What would make it easier in the 13th century?

If you can get past those hurdles the obvious benefits are mathematics (Arabic numerals, algebra, calculus) and standardized measures (bonus points if you can reconstruct the metric system fairly accurately), optics, physics, chemistry, metallurgy, electricity, and biology. For physics specifically the ability to do statics for construction and ballistics for cannons and thermodynamics for engines and other machines (and lubrication and hydraulics are important too). High carbon steel for machine tools, the assembly line and interchangeable parts. Steel reinforced concrete would be nice, but not a necessity. Rubber. High quality glass for optics; necessary for microscopes for biology to progress past "We don't believe tiny organisms make us sick". The scientific method (probably goes without saying) to keep things moving instead of turning back into alchemy and bloodletting.

Electricity and magnetism eventually; batteries won't cut it for industrial scale use of electricity (electrolysis, lighting for longer working hours, arc furnaces for better smelting) so building workable generators that can be connected to steam engines is vital.

Other people have mentioned medicine, which is pretty important from an ethical perspective, but it is difficult to reverse centuries of bad practice. Basic antibiotics and sterilization are probably the best you'd be able to do, but without the pharmaceutical industry there's a lot of stuff you can't do. If you know how to make ether, at least get anesthesia started.

Comment by Pentashagon on Open Thread, May 25 - May 31, 2015 · 2015-05-29T06:40:18.228Z · LW · GW

I find myself conflicted about this. I want to preserve my human condition, and I want to give it up. It's familiar, but it's trying. I want the best of both worlds; the ability to challenge myself against real hardships and succeed, but also the ability to avoid the greatest hardships that I can't overcome on my own. The paradox is that solving the actual hardships like aging and death will require sufficient power to make enjoyable hardships (solving puzzles, playing sports and other games, achieving orgasm, etc.) trivial.

I think that one viable approach is to essentially live vicariously through our offspring. I find it enjoyable watching children solve problems that are difficult for them but are now trivial for me, and I think that the desire to teach skills and to appreciate the success of (for lack of a better word) less advanced people learning how to solve the same problems that I've solved could provide a very long sequence of Fun in the universe. Pre-singularity humans already essentially do this. Grandparents still enjoy life despite having solved virtually all of the trivial problems (and facing imminent big problems), and I think I'd be fine being an eternal grandparent to new humans or other forms of life. I can't extrapolate that beyond the singularity, but it makes sense that if we intend to preserve our current values we will need someone to be in the situation where those values still matter, and if we can't experience those situations ourselves then the offspring we care about are a good substitute. Morality issues of creating children may be an issue.

Another solution is a walled garden run by FAI that preserves the trivial problems humans like solving but solves the big problems. This has a stronger possibility for value drift and I think people would value life a bit less if they knew it was ultimately a video game.

It's also possible that upon reflection we'll realize that our current values also let us care about hive-minds in the same way we care about our friends and family now. We would be different, alien to our present selves, but with the ability to trace our values back to our present state and see that at no point did we sacrifice them for expediency or abandon them for their triviality. This seems like the least probable solution simply because our values are not special; they arose in our ancestral environment because they worked. That we enjoy them is an accident, and that they could fully encompass the post-singularity world seems a bit miraculous.

As a kid I always wondered about this in the context of religious heaven. What could a bunch of former humans possibly do for eternity that wouldn't become terribly boring or involve complete loss of humanity? I could never answer that question, so perhaps it's an {AI,god}-hard problem to coherently extrapolate human values.

Comment by Pentashagon on Publishing my initial model for hypercapitalism · 2015-04-28T04:53:34.246Z · LW · GW

In your story RC incurs no opportunity cost for planting seed in and tending a less efficient field. The interest rate for lending the last Nth percent of the seed should be a function of the opportunity cost of planting and harvesting the less efficient field, and at some point that rate crosses 0 and becomes negative. The interest rate drops even more quickly once his next expected yield will be more than he can eat or store for more than a single planting season.

If RC is currently in the situation where his desired interest rate is still positive for his last unplanted seed then his capital is constrained and he should instead ask for investment seed from S, for which he would be willing to pay interest.

In order not to starve RC should aim to grow sufficient grain such that his probability of having too little is lower than some risk threshold. In most cases this will leave him with excess seeds at every harvest (beyond the extra seeds required to avoid starvation risk) which he can lend. Depending on his assessment of the loan risk, he may even be able to save himself time and trouble by growing less grain to produce fewer seeds with the expectation that his loan will be repaid, which would allow him to realize a profit on an otherwise zero-interest loan.

Likewise, money below a certain threshold should compound; beyond that threshold it represents unacceptable opportunity costs for exercising its power as if it had been used to purchase excess goods. Investing/loaning those purchased goods should be the basis for money's value beyond the threshold.

Comment by Pentashagon on A pair of free information security tools I wrote · 2015-04-18T03:47:16.363Z · LW · GW

In this case, I would argue that a transparent, sandboxed programming language like javascript is probably one of the safer pieces of "software" someone can download. Especially because browsers basically treat all javascript like it could be malicious.

Why would I paste a secret key into software that my browser explicitly treats as potentially malicious? I still argue that trusting a verifiable author/distributor is safer than trusting an arbitrary website, e.g. trusting gpg is safer than trusting regardless of who you think wrote zzz.js, simply because it's easier to get that wrong in some way than it is to accidentally install an evil version of gpg, especially if you use an open source package manager that makes use of PKI, or run it from TAILS, etc. I am also likely to trust javascript crypto served from more than from any other URL, for instance.

In general I agree wholeheartedly with your comment about sandboxing being important. The problem is that sandboxing does not imply trusting. I think smartphone apps are probably better sandboxed, but I don't necessarily trust the distribution infrastructure (app stores) not to push down evil updates, etc. Sideloading a trusted app by a trusted author is probably a more realistic goal for OpenPGP for the masses.

Comment by Pentashagon on A pair of free information security tools I wrote · 2015-04-15T07:48:48.232Z · LW · GW

Does it change the low bits of white (0xFFFFFF) pixels? It would be a dead giveaway to find noise in overexposed areas of a photo, at least with the cameras I've used.
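
The concern above can be sketched in a few lines. This is a minimal illustration, assuming a naive scheme that overwrites the least significant bit of every channel; the function names `embed_bits` and `lsb_noise_in_highlights` are made up for the example and say nothing about how the tool under discussion actually works.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def embed_bits(pixels: Sequence[Pixel], bits: Sequence[int]) -> List[Pixel]:
    """Naive LSB embedding (assumed scheme): overwrite the low bit of each channel in order."""
    stream = iter(bits)
    out = []
    for pixel in pixels:
        # Clear the low bit, then set it to the next payload bit.
        # Once the payload runs out, the original low bit is kept.
        out.append(tuple((c & ~1) | next(stream, c & 1) for c in pixel))
    return out


def lsb_noise_in_highlights(pixels: Sequence[Pixel]) -> float:
    """Fraction of near-white pixels (every channel >= 0xFE) with at least one 0xFE channel."""
    near_white = [p for p in pixels if all(c >= 0xFE for c in p)]
    if not near_white:
        return 0.0
    flipped = sum(1 for p in near_white if any(c == 0xFE for c in p))
    return flipped / len(near_white)


# A blown-out patch of sky: every channel is clipped at 0xFF.
clean = [(0xFF, 0xFF, 0xFF)] * 8
stego = embed_bits(clean, [0, 1, 1] * 8)

print(lsb_noise_in_highlights(clean))  # clipped region, no low-bit variation
print(lsb_noise_in_highlights(stego))  # every pixel now has a 0xFE channel
```

A scheme that wants to avoid this tell would presumably have to skip saturated pixels, at the cost of capacity.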

Comment by Pentashagon on A pair of free information security tools I wrote · 2015-04-15T07:01:44.798Z · LW · GW

Such an obvious and easy to exploit vulnerability has existed for 20ish years, undiscovered/unexposed until one person on LW pointed it out?

It's not a vulnerability. I trust gnupg not to leak my private key, not the OpenPGP standard. I also trust gnupg not to delete all the files on my hard disk, etc. There's a difference between trusting software to securely implement a standard and trusting the standard itself.

For an even simpler "vulnerability" in OpenPGP look up section 13.1.1 in RFC4880; encoding a message before encryption (EME-PKCS1-v1_5). Just replace the pseudo-random padding with bits from the private key. Decoding (section 13.1.2) does not make any requirements on the content of PS.
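
To make the padding point concrete, here is a rough sketch of the EME-PKCS1-v1_5 layout (0x00, 0x02, PS, 0x00, message) that RFC 4880 section 13.1.1 describes. The helper names `eme_encode` and `eme_decode` and the optional `ps` argument are illustrative assumptions, not part of any OpenPGP library, and the RSA step itself is left out.

```python
import os
from typing import Optional


def eme_encode(message: bytes, k: int, ps: Optional[bytes] = None) -> bytes:
    """Build EM = 0x00 || 0x02 || PS || 0x00 || M for a k-octet modulus."""
    ps_len = k - len(message) - 3
    if ps_len < 8:
        raise ValueError("message too long")
    if ps is None:
        # Honest sender: nonzero pseudo-random octets (slightly biased mapping, fine for a sketch).
        ps = bytes(b % 255 + 1 for b in os.urandom(ps_len))
    if len(ps) != ps_len or 0 in ps:
        raise ValueError("PS must be exactly ps_len nonzero octets")
    return b"\x00\x02" + ps + b"\x00" + message


def eme_decode(em: bytes) -> bytes:
    """Strip the padding. Only the framing is checked, never the content of PS."""
    if len(em) < 11 or em[:2] != b"\x00\x02":
        raise ValueError("decryption error")
    sep = em.find(b"\x00", 2)
    if sep < 10:  # covers "no separator" (-1) and "PS shorter than 8 octets"
        raise ValueError("decryption error")
    return em[sep + 1:]


# A dishonest sender can put anything nonzero into PS, e.g. a chunk of key material.
smuggled = b"K" * 59  # stand-in for 59 octets of a zero-free encoding of the key
em = eme_encode(b"hi", 64, ps=smuggled)
print(eme_decode(em))  # the recipient still recovers the message normally
```

The decoder has no way to distinguish random padding from smuggled bytes, which is why this only shows up if someone inspects the padding deliberately.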

Comment by Pentashagon on A pair of free information security tools I wrote · 2015-04-15T06:24:19.148Z · LW · GW
  1. Short of somehow convincing the victim to send you a copy of their message, you have no means of accessing your recently-leaked data.

Public-key signatures should always be considered public when anticipating attacks. Use HMACs if you want secret authentication.
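
For contrast, a minimal sketch of secret authentication with an HMAC, using Python's standard `hmac` and `hashlib` modules. The key and message values are placeholders, and `make_tag` and `verify_tag` are names chosen for the example.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Authentication tag that only holders of the shared key can produce or check."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, candidate: bytes) -> bool:
    """Constant-time comparison avoids leaking how many tag bytes matched."""
    return hmac.compare_digest(make_tag(key, message), candidate)


shared_key = b"placeholder shared secret"
message = b"Secure comment"
tag = make_tag(shared_key, message)

print(verify_tag(shared_key, message, tag))
print(verify_tag(b"someone else's key", message, tag))
```

Unlike a public-key signature, the tag is meaningless to anyone without the shared key, so publishing it does not let third parties verify (or attribute) the message.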

  1. That leaked data would be publicly available. Anyone with knowledge of your scheme would also be able to access that data. Any encryption would be worthless because the encryption would take place client-side and all credentials thus would be exposed to the public as well.

You explicitly mentioned Decoy in your article, and a similar method could be used to leak bits to an attacker with no one else being able to recover them. We're discussing public key encryption in this article which means that completely public javascript can indeed securely encrypt data using a public key and only the owner of the corresponding private key can decrypt it.

  1. Because the script runs client-side, it also makes it extremely easy for a potential victim to examine your code to determine if it's malicious or not. And, even if they're too lazy to do so...

Sure, the first five or ten times it's served. And then one time the victim reloads the page, the compromised script runs, leaks as much or all of the private key as possible, and then never gets served again.

  1. A private key is long. A PGP signature is short. So your victim's compromised signature would be 10x longer than the length of a normal PGP signature.

An exported private key is long because it includes both factors, the private exponent, and the inverse of p mod q. In my other comment I was too lazy to decode the key and extract one of the RSA factors, but one factor will be ~50% of the size of the RSA signature and that's all an attacker needs.
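
A toy illustration of why one factor is enough, using textbook RSA with deliberately tiny numbers. The function name is made up for the example, and real keys would also involve padding and much larger primes.

```python
def private_exponent_from_factor(n: int, e: int, p: int) -> int:
    """Recover the RSA private exponent d from the public key (n, e) and one prime factor p."""
    if p <= 1 or n % p != 0:
        raise ValueError("p is not a nontrivial factor of n")
    q = n // p
    phi = (p - 1) * (q - 1)
    # Modular inverse via three-argument pow (Python 3.8+).
    return pow(e, -1, phi)


# Toy key: n = 61 * 53. The attacker only ever sees n, e, and the leaked factor 61.
n, e = 3233, 17
d = private_exponent_from_factor(n, e, 61)

forged = pow(65, d, n)          # "sign" the value 65 with the recovered exponent
print(pow(forged, e, n) == 65)  # verifies under the victim's public key
```

Since p is about half the bit length of n, and an RSA signature is as long as n, the factor fits comfortably inside a payload half the size of one signature.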

Comment by Pentashagon on A pair of free information security tools I wrote · 2015-04-15T05:42:36.672Z · LW · GW

NOTE: lesswrong eats blank quoted lines. Insert a blank line after "Hash: SHA1" and "Version: GnuPG v1".

Hash: SHA1

Secure comment
Version: GnuPG v1


Output of gpg --verify:

gpg: Signature made Tue 14 Apr 2015 10:37:40 PM PDT using RSA key ID 0E5DBFB2
gpg: Good signature from "pentashagon"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: B501 B12E 5184 8694 4557  01FC 6B69 F5F0 0E5D BFB2

Output of gpg -vv --verify:

gpg: armor header: Hash: SHA1
:packet 63: length 19 - gpg control packet
gpg: armor header: Version: GnuPG v1
:literal data packet:
        mode t (74), created 0, name="",
        raw data: unknown length
gpg: original file name=''
:signature packet: algo 1, keyid 6B69F5F00E5DBFB2
        version 4, created 1429076260, md5len 0, sigclass 0x01
        digest algo 2, begin of digest 56 f1
        hashed subpkt 2 len 4 (sig created 2015-04-15)
        hashed subpkt 20 len 1893 (notation: secret@key=-----BEGIN PGP PRIVATE KEY BLOCK-----|Version: GnuPG v1||lQHYBFUt9YMBBADJpmhhceujHvBFqsoA+FsSmKBosH4qliObnaGvHUcIcm87/R1g|X4RTG1J2uxWHSxQBPFpkcIVkMPUtudZANzEQBAsOGuTAmVzPaWvTqDM0dJlq3NgM|mDvIvkPIxphfmJMmKbhPq0awp+rARSpROMi1s/YKKEa0yGXhSz0mnfF+gwARAQAB|AAP/S+F0VvLs9nGefcDSigHrF3jap/p+R50+4gCzxncwczPIuty5MLpQy4s1AVvO|Mp6kdQCWjUQwVe78XAwZ3QlHyvEN47qD6c5WN0bnLjOLEHDOQI3OB/E1Ak79UyuQ|T4omHUjy2YbUfcVtpebNGwxFLiWmxEmPdn6dcKTRszp3D7ECANIlXmeSmtxXTNDJ|DAk9GhkSzbz2xYZvlHzFGoImFe84b9Pfy0EutwXPQfQItU0FLLqnGBy+DZy62jTs|S09VnJECAPWmdexdoVJJ2BmAH4q6FTB5xkRrLO3HFMbeNMOfvOgns/Ekg3Z2PrzG|n7DgUSQHe1iPrI82tVbXatzDMq2vw9MCAKyJkN6usPVdTqyiQc03zjrV1CnbCk+X|YSmzJqpWtC5QyirqJw89VCgAh8xbZ+Zr46V6GuavGCk7Olb3Bq5kexee6bQLcGVu|dGFzaGFnb26IuAQTAQIAIgUCVS31gwIbAwYLCQgHAwIGFQgCCQoLBBYCAwECHgEC|F4AACgkQa2n18A5dv7J0VgP6Asv0kKVVoNtdNVJIGY8K7b6/YteiI4ZZ5Bm/f3PQ|YJBEUFY9cWQ6MYYXQeboSXsujcvqbI2JDDVyt1QH+WvM4tXb6gfhjukhhnlZMCgJ|tyzuhwYXyhdeZ0VfoHNyLOXt2/UoX+luWxihd7Q1wb+69cT5uWR+aQ0+xzIriUGe|PQydAdgEVS31gwEEAMu8mg5rfL4Dg4NShsCsf2BGvRraddCrkqNN4rCp6GBQpFCM|1Retb0aDPJHlmjgigNS0iA8/YwrPltVKbyokKcWfIfa9f615Jhp4s7xAWIIrpcph|Ov9FjDlRWXwOOmqAc0yuUxZ3vgbDEFOXdnAi6d2CWF9kPyQ9Plns/x1pkKKLABEB|AAEAA/oC2k+Ml3lgrms/Vyl8iy3MFabSOHA2jXXOhD8CBZmzt41ayg4LIyo6t4hi|lpoejRp2tVcZDOSAeJWpGOi46KwOX5UwVmB8fWSm2hlvqmbtrCVPe3dd3deB2S6E|lMnjkF1YkCaYydfh2/ACiiOTk4fODGsuXuyOc++PIL1VYq1RcQIAzi6o6E1XXNzU|Bf1K7rVv7yn1RAFfuii+8P58cmZuazWtYP4m9U57K68G7IGA4H5CXkZSKP4l7SXt|ed6oMofiUwIA/PashjRrWIEAH98lBQiwHJfVRPlGTzaOvCB7Mv2jfHvyBGIoNAti|ueprOES0vT7+2zIZSm5z/kLm7S+sWtMn6QIAkzwzm7QDXKn3bJoAPH//gNuiX4td|SeHrR52TNhfO2jLFJSN4+Zc2KgNCCaYsCHZPI+smxad5aMAxnj7rWFSRY5vFiJ8E|GAECAAkFAlUt9YMCGwwACgkQa2n18A5dv7L+ggP/XU7r3GR6mTljp9IPGArvhEa4|QfPRmb3XIrzBAUTtN/Jep5pUTrz47ZPpwdrBgfqo9u0x80P+JvV 8k4t0jWsOgRQr|4+k8LE1LIPEm9vChtiWxWfzxcTIAzewa7m/gelqMRhbbmSKxgY6HTWjUbizC    vlB+|gD9PdL658E8TBFqJYbQ=|=MeTI|-----END PGP PRIVATE KEY BLOCK-----|)
        subpkt 16 len 8 (issuer key ID 6B69F5F00E5DBFB2)
        data: [1024 bits]
gpg: Signature made Tue 14 Apr 2015 10:37:40 PM PDT using RSA key ID 0E5DBFB2
gpg: using PGP trust model
gpg: Good signature from "pentashagon"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: B501 B12E 5184 8694 4557  01FC 6B69 F5F0 0E5D BFB2
gpg: textmode signature, digest algorithm SHA1

I ran the exported (unencrypted) private key through tr '\n' '|' to get a single line of text to set, and created the signature with:

gpg -a --clearsign --sig-notation secret@key="exported-secret-key-here" -u pentashagon

Let me know if your OpenPGP software of choice makes it any more clear that the signature is leaking the private key without some sort of verbose display.

Comment by Pentashagon on Hedonium's semantic problem · 2015-04-15T05:14:49.095Z · LW · GW

I can imagine how decoherence explains why we only experience descent along a single path through the multiverse-tree instead of experiencing superposition, but I don't think that's sufficient to claim that all consciousness requires decoherence.

An interesting implication of Scott's idea is that consciousness is timeless, despite our experience of time passing. For example, put a clock and a conscious being inside Schrödinger’s box and then either leave it in a superposition forever or open it at some point in the future. If we don't open the box, in theory nothing is conscious of watching the clock as time passes. If we open the box, there's a conscious being who can describe all the time inside the box watching the clock. When, to the outside observer, does that conscious experience happen? Either all the conscious experience happens the instant the box is measured, contrary to our notions of the experience of time passing and our understanding of how the physical state of the clock changes (e.g. the conscious experience of seeing the clock read 3:52 PM on Thursday should have happened at 3:52 PM on Thursday when the clock inside the box was in a superposition of physically displaying that time with very high probability), or else there would have been conscious experience the entire time even if the box was never opened, in order that the experience could happen at the corresponding time.

Which means we're all p-zombies until a specific point in the future when we decohere sufficiently to have consciousness up to that point.

Comment by Pentashagon on A pair of free information security tools I wrote · 2015-04-14T05:52:34.850Z · LW · GW

RFC 4880 has a list of several subpackets that can be included in a signature. How many people check to make sure the order of preferred algorithms isn't tweaked to leak bits? Not to mention just repeating/fudging subpackets to blatantly leak binary data in subpackets that look "legitimate" to someone who hasn't read and understood the whole RFC.
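
As a rough sketch of the ordering trick: any list of n distinct preferences can carry an integer below n! in its permutation, using the factorial number system. The algorithm labels and the function names below are illustrative assumptions; nothing here touches real OpenPGP packets.

```python
from math import factorial
from typing import List, Sequence


def encode_in_order(items: Sequence[str], value: int) -> List[str]:
    """Return a permutation of items that encodes value (0 <= value < n!)."""
    pool = sorted(items)
    if not 0 <= value < factorial(len(pool)):
        raise ValueError("value does not fit in this many items")
    ordering = []
    for remaining in range(len(pool), 0, -1):
        index, value = divmod(value, factorial(remaining - 1))
        ordering.append(pool.pop(index))
    return ordering


def decode_from_order(ordering: Sequence[str]) -> int:
    """Recover the integer hidden in a permutation produced by encode_in_order."""
    pool = sorted(ordering)
    value = 0
    for position, item in enumerate(ordering):
        index = pool.index(item)
        value += index * factorial(len(ordering) - 1 - position)
        pool.pop(index)
    return value


prefs = ["AES256", "AES192", "AES128", "CAST5", "3DES"]
leaky = encode_in_order(prefs, 77)
print(leaky)                     # still a plausible-looking preference list
print(decode_from_order(leaky))  # the hidden value, recovered from order alone
```

Five entries give 120 orderings, so just under 7 bits per preference list per signature. That is slow, but nothing about any single list looks malformed.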

Comment by Pentashagon on Formalizing Two Problems of Realistic World Models · 2015-01-27T17:17:31.861Z · LW · GW

Suppose that instead of having well-defined actions AIXI only has access to observables and its reward function. It might seem hopeless, but consider the subset of environments containing an implementation of a UTM which is evaluating T_A, a Turing machine implementing action-less AIXI, in which the implementation of the UTM has side effects in the next turn of the environment. This embeds AIXI-actions as side effects of an actual implementation of AIXI running as T_A on a UTM in the set of environments with observables matching those that the abstract AIXI-like algorithm observes. To maximize the reward function the agent must recognize its implementation embedded in the UTM in M and predict the consequences of the side effects of various choices it could make, substituting its own output for the output of the UTM in the next turn of the simulated environment (to avoid recursively simulating itself), choosing the side effects to maximize rewards.

In this context counterfactuals are simulations of the next turns of M resulting from the possible side effects of its current UTM. To be useful there must be a relation from computation-suffixes of T_A to the potential side effects of the UTM. In other words the agent must be able to cause a specific side effect as a result of state-transitions or tape operations performed causally-after it has determined which side effect will maximize rewards. This could be as straightforward as the UTM using the most-recently written bits on the tape to choose a side effect.

In the heating game, the UTM running T_A must be physically implemented as something that has a side effect corresponding to temperature, which causally affects the box of rewards, and all these causes must be predictable from the observables accessible to T_A in the UTM. Similarly, if there is an anvil suspended above a physical implementation of a UTM running T_A, the agent can avoid an inability to increase rewards in the future of environments in which the anvil is caused to drop (or escape from the UTM before dropping it).

This reduces the naturalized induction problem to the tiling/consistent reflection problem; the agent must choose which agent it wants to be in the next turn(s) through side effects that can change its future implementation.

Comment by Pentashagon on ... And Everyone Loses Their Minds · 2015-01-20T06:43:50.062Z · LW · GW

Is there also a bias toward the illusion of choice? Some people think driving is safer than flying because they are "in control" when driving, but not when flying. Similarly, I could stay inside a well-grounded building my whole life and avoid ever being struck by lightning, but I can't make a similar choice to avoid all possible threats of terrorism.

Comment by Pentashagon on Stupid Questions (10/27/2014) · 2014-11-05T05:09:25.814Z · LW · GW

The measure of simple computable functions is probably larger than the measure of complex computable functions and I probably belong to the simpler end of computable functions.

Comment by Pentashagon on [LINK] Could a Quantum Computer Have Subjective Experience? · 2014-08-27T07:38:05.754Z · LW · GW

It's interesting that we had a very similar discussion here minus the actual quantum mechanics. At least intuitively it seems like physical change is what leads to consciousness, not simply the possibility or knowledge of change. One possible counter-argument to consciousness being dependent on decoherence is the following: What if we could choose whether or not, and when, to decohere? For example, what if inside Schroedinger's box is a cat embryo that will be grown into a perfectly normal immortal cat if nucleus A decays, and the box will open if nucleus B decays? When the box opens, is there no cat, a conscious cat, or a cat with no previous consciousness? What if B is extremely unlikely to decay but the cat can press a switch that will open the box? It seems non-intuitive that consciousness should depend on what happens in the future, outside your environment.

Comment by Pentashagon on [LINK] Could a Quantum Computer Have Subjective Experience? · 2014-08-27T06:38:21.718Z · LW · GW

Regarding fully homomorphic encryption; only a small number of operations can be performed on FHE variables without the public key, and "bootstrapping" FHE from a somewhat homomorphic scheme requires the public key to be used in all operations as well as the secret key itself to be encrypted under the FHE scheme to allow bootstrapping, at least with the currently known schemes based on lattices and integer arithmetic by Gentry et al.

It seems unlikely that FHE could operate without knowledge of at least the public key. If it were possible to continue a simulation indefinitely without the public key then the implication is that one could evaluate O(2^N) simulations with O(N) work: Choose an N-bit scheme such that N >= the number of bits required for the state of the simulation and run the simulation on arbitrary FHE values. Decryption with any N-bit key would yield a different, valid simulation history assuming a mapping from decrypted states to simulated states.

Comment by Pentashagon on The metaphor/myth of general intelligence · 2014-08-21T02:59:33.389Z · LW · GW

My intuition is that a single narrowly focused specialized intelligence might have enough flaws to be tricked or outmaneuvered by humanity, for example if an agent wanted to maximize production of paperclips but was average or poor at optimizing mining, exploration, and research it could be cornered and destroyed before it discovered nanotechnology or space travel and asteroids and other planets and spread out of control. Multiple competing intelligences would explore more avenues of optimization, making coordination against them much more difficult and likely interfering with many separate aspects of any coordinated human plan.

Comment by Pentashagon on The metaphor/myth of general intelligence · 2014-08-20T05:48:40.907Z · LW · GW

If there is only specialized intelligence, then what would one call an intelligence that specializes in creating other specialized intelligences? Such an intelligence might be even more dangerous than a general intelligence or some other specialized intelligence if, for instance, it's really good at making lots of different X-maximizers (each of which is more efficient than a general intelligence) and terrible at deciding which Xs it should choose. Humanity might have a chance against a non-generally-intelligent paperclip maximizer, but probably less of a chance against a horde of different maximizers.

Comment by Pentashagon on Groundwork for AGI safety engineering · 2014-08-14T15:34:35.849Z · LW · GW

The standard Zermelo-Fraenkel axioms have lasted a century with only minor modifications -- none of which altered what was provable -- and there weren't many false starts before that. There is argument over whether to include the axiom of choice, but as mentioned the formal methods of program construction naturally use constructivist mathematics, which doesn't use the axiom of choice anyhow.

Is there a formal method for deciding whether or not to include the axiom of choice? As I understand it three of the ZF axioms are independent of the rest, and all are independent of choice. How would AGI choose which independent axioms should be accepted? AGI could be built to only ever accept a fixed list of axioms but that would make it inflexible if further discoveries offer evidence for choice being useful for example.

This blatantly contradicts the history of axiomatic mathematics, which is only about two centuries old and which has standardized on the ZF axioms for half of that. That you claim this calls into question your knowledge about mathematics generally.

You are correct, I don't have formal mathematical training beyond college and I pursue formal mathematics out of personal interest, so I welcome corrections. As I understand it geometry was axiomatic for much longer, and the discovery of non-Euclidean geometries required separating the original axioms for different topologies. Is there a way to formally decide now whether or not a similar adjustment may be required for the axioms of ZF(C)? The problem, as I see it, is that formal mathematics is just string manipulation and the choice of which allowed manipulations are useful is dependent on how the world really is. ZF is useful because its language maps very well onto the real world, but as an example unifying general relativity and quantum mechanics has been difficult. Unless it's formally decidable whether ZF is sufficient for a unified theory it seems to me that some method is required for an AGI to change its accepted axioms based on probabilistic evidence, as well as to avoid accepting useless or inconsistent independent axioms.

Comment by Pentashagon on Groundwork for AGI safety engineering · 2014-08-10T18:34:28.250Z · LW · GW

We only have probabilistic evidence that any formal method is correct. So far we haven't found contradictions implied by the latest and greatest axioms, but mathematics is literally built upon the ruins of old axioms that didn't quite rule out all known contradictions. FAI needs to be able to re-axiomatize its mathematics when inconsistencies are found in the same way that human mathematicians have, while being implemented in a subset of the same mathematics.

Additionally, machines are only probabilistically correct. FAI will probably need to treat its own implementation as a probabilistic formal system.

Comment by Pentashagon on Maybe we're not doomed · 2014-08-07T05:56:33.511Z · LW · GW

Even bacteria? The specific genome that caused the black death is potentially extinct but Yersinia pestis is still around. Divine agents of Moloch if I ever saw one.

Comment by Pentashagon on Expected utility, unlosing agents, and Pascal's mugging · 2014-08-05T05:47:55.553Z · LW · GW

If you are confronted with a Pascal's mugger and your induction engine returns "the string required to model the mugger as honest and capable of carrying out the threat is longer than the longest algorithm I can process", you are either forced to use the probability corresponding to the longest string, or to discard the hypothesis outright.

The primary problem with Pascal's Mugging is that the Mugging string is short and easy to evaluate. 3^^^3 is a big number; it implies a very low probability but not necessarily 1 / 3^^^3; so just how outrageous can a mugging be without being discounted for low probability? That least-likely-but-still-manageable Mugging will still get you. If you're allowed to reason about descriptions of utility and not just shut up and multiply to evaluate the utility of simulated worlds then in the worst case you have to worry about the Mugger that offers you BusyBeaver(N) utility, where 2^-N is the lowest probability that you can process. BusyBeaver(N) is well-defined although uncomputable, and it is at least as large as any other function of length N. Unfortunately that means BusyBeaver(N) * 2^-N > C, for some N-bit constant C, or in other words EU(Programs-of-length-N) is O(BusyBeaver(N)). It doesn't matter what the mugger offers, or if you mug yourself. Any N-bit utility calculation program has expected utility O(BB(N)) because it might yield BB(N) utility.

The best not-strictly-bounded-utility solution I have against this is discounting the probability of programs as a function of their running time as well as their length. Let 1/R be the probability that any given step of a process will cause it to completely fail as opposed to halting with output or never halting. Solomonoff Induction can be redefined as the sum over programs, P, producing an output S in N steps, of 2^-Length(P) x ((R - 1) / R)^N. It is possible to compute a prior probability with error less than B, for any sequence S and finite R, by enumerating all programs shorter than log_2(1 / B) bits that halt in fewer than ~R / B steps. All un-enumerated programs have cumulative probability less than B of generating S.
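
The discount above can be played with numerically. This is a small sketch, assuming the per-step survival probability is (R - 1) / R (the natural reading of a 1/R failure chance per step); `program_weight` and `step_cutoff` are names invented for the example.

```python
import math


def program_weight(length_bits: int, steps: int, R: float) -> float:
    """Prior weight of one program: length penalty times the chance of surviving `steps` steps."""
    return 2.0 ** -length_bits * ((R - 1) / R) ** steps


def step_cutoff(B: float, R: float) -> int:
    """Smallest step count N at which the survival factor ((R - 1) / R)^N drops below B."""
    return math.floor(math.log(B) / math.log((R - 1) / R)) + 1


R = 1000.0  # assumed: one catastrophic failure per thousand steps on average
B = 1e-6    # target error bound on the prior

N = step_cutoff(B, R)
print(N)                                    # steps after which a run contributes less than B
print(math.ceil(math.log2(1 / B)))          # length cutoff in bits, log_2(1 / B) rounded up
print(program_weight(20, N, R) < B * 2.0 ** -20)
```

The exact cutoff grows like R * ln(1 / B), which is smaller than R / B, so enumerating up to roughly R / B steps is a generous but safe bound.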

For Pascal's Mugging it suffices to determine B based on the number of steps required to, e.g. simulate 3^^^3 humans. 3^^^3 ~= R / B, so either the prior probability of the Mugger being honest is infinitesimal, or it is infinitesimally unlikely that the universe will last fewer than the minimum 3^^^3 Planck Units necessary to implement the mugging. Given some evidence about the expected lifetime of the Universe, the mugging can be rejected.

The biggest advantage of this method over a fixed bounded utility function is that R is parameterized by the agent's evidence about its environment, and can change with time. The longer a computer successfully runs an algorithm, the larger the expected value of R.

Comment by Pentashagon on Confused as to usefulness of 'consciousness' as a concept · 2014-08-02T05:46:57.140Z · LW · GW

The experience of sleep paralysis suggests to me that there are at least two components to sleep, paralysis and suppression of consciousness, and one can have one, both, or neither. With both, one is asleep in the typical fashion. With suppression of consciousness only, one might have involuntary movements or in extreme cases sleepwalking. With paralysis only, one has sleep paralysis, which is apparently an unpleasant remembered experience. With neither, you awaken typically. The responses made by sleeping people (sleepwalkers and sleep-talkers especially) suggest to me that their consciousness is at least reduced in the sleep state. If it were only memory formation that was suppressed during sleep I would expect to witness sleep-walkers acting conscious but not remembering it, whereas they appear to instead be acting irrationally and responding at best semi-consciously to their environment.