Comment by vadim-kosoy on More realistic tales of doom · 2019-04-12T14:24:34.832Z · score: 4 (2 votes) · LW · GW

I agree that robot armies are an important aspect of part II.

Why? I can easily imagine an AI takeover that works mostly through persuasion/manipulation, with physical elimination of humans coming only as an "afterthought" when the AI is already effectively in control (and has produced adequate replacements for humans for the purpose of physically manipulating the world). This elimination doesn't even require an "army"; it can look like everyone agreeing to voluntary "euthanasia" (possibly without understanding its true meaning). To the extent physical force is involved, most of it might be humans against humans.

Comment by vadim-kosoy on Two Neglected Problems in Human-AI Safety · 2019-04-09T15:40:12.107Z · score: 5 (2 votes) · LW · GW

I certainly agree that humans might have critical failures of judgement in situations that are outside of some space of what is "comprehensible". This is a special case of what I called "corrupt states" when talking about DRL, so I don't feel like I have been ignoring the issue. Of course, there is a lot more work to be done there (and I have some concrete research directions for how to understand this better).

Reinforcement learning with imperceptible rewards

2019-04-07T10:27:34.127Z · score: 18 (6 votes)
Comment by vadim-kosoy on Rule Thinkers In, Not Out · 2019-03-18T21:27:33.841Z · score: 4 (2 votes) · LW · GW

Oh, I just use the pronoun "ey" for everyone. IMO the entire concept of gendered pronouns is net harmful.

Comment by vadim-kosoy on Blegg Mode · 2019-03-15T21:16:07.055Z · score: 2 (2 votes) · LW · GW

Hmm. Why would the entity feel disrespected by how many clusters the workers use? I actually am aware that this is an allegory for something else. Moreover, I think that I disagree with you about the something else (although I am not sure, since I am not entirely sure what your position about the something else is). Which is to say, I think that this allegory misses crucial aspects of the original situation and loses the crux of the debate.

Comment by vadim-kosoy on Blegg Mode · 2019-03-14T20:52:53.921Z · score: 8 (4 votes) · LW · GW

Alright, but then you need some (at least informal) model of why computationally bounded agents need categories. Instead, your argument seems to rely purely on the intuition of your fictional character ("you notice that... they seem to occupy a third category in your ontology of sortable objects").

Also, you seem to assume that categories are non-overlapping. You write "you don't really put them in the same mental category as bleggs". What does it even mean, to put two objects in the same or not the same category? Consider a horse and a cow. Are they in the same mental category? Both are in the categories "living organisms", "animals", "mammals", "domesticated mammals". But, they are different species. So, sometimes you put them in the same category, sometimes you put them in different categories. Are "raven" and "F16 aircraft" in the same category? They are if your categories are "flying objects" vs. "non-flying objects", but they aren't if your categories are "animate" vs. "non-animate".

Moreover, you seem to assume that categories are crisp rather than fuzzy, which is almost never the case for categories that people actually use. How many coins does it take to make a "pile" of coins? Is there an exact number? Is there an exact age when a person gets to be called "old"? If you take a table made out of a block of wood, and start to gradually deform its shape until it becomes perfectly spherical, is there an exact point when it is no longer called a "table"? So, "rubes" and "bleggs" can be fuzzy categories, and the anomalous objects are in the gray area that defies categorization. There's nothing wrong with that.

If we take this rube/blegg factory thought experiment seriously, then what we need to imagine is the algorithm (instructions) that the worker in the factory executes. Then you can say that the relevant "categories" (in the context of the factory, and in that context only) are the vertices in the flow graph of the algorithm. For example, the algorithm might be a table that specifies how to score each object (blue +5 points, egg-shaped +10 points, furry +1 point...) and a threshold that says what the score should be to put it in a given bin. Then there are essentially only two categories. Another algorithm might be "if the object passes test X, put it in the rube bin; if the object passes test Y, put it in the blegg bin; if the object passes neither test, put it in the palladium scanner and sort according to that". Then you have approximately seven categories: "regular rube" (passed test X), "regular blegg" (passed test Y), "irregular object" (failed both tests), "irregular rube" (failed both tests and found to contain enough palladium), "irregular blegg" (failed both tests and found to contain not enough palladium), "rube" (anything put in the rube bin) and "blegg" (anything put in the blegg bin). But in any case, the categorization would depend on the particular trade-offs that the designers of the production line made (depending on things like how expensive it is to run the palladium scanner), rather than on immutable Platonic truths about the nature of the objects themselves.
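To make this concrete, here is a minimal sketch of the two algorithms described above. The features, scores, tests and thresholds are all made up for illustration; the point is only that the "categories" are the branch labels of whichever algorithm the factory happens to run.

```python
def sort_by_score(obj):
    """Algorithm 1: additive scoring table plus a threshold -- effectively two categories."""
    score = 0
    if obj.get("blue"):
        score += 5
    if obj.get("egg_shaped"):
        score += 10
    if obj.get("furry"):
        score += 1
    return "blegg bin" if score >= 10 else "rube bin"

def passes_test_x(obj):
    # Placeholder cheap test for a "regular rube".
    return obj.get("red", False) and obj.get("cube_shaped", False)

def passes_test_y(obj):
    # Placeholder cheap test for a "regular blegg".
    return obj.get("blue", False) and obj.get("egg_shaped", False)

def sort_by_tests(obj, palladium_threshold=0.5):
    """Algorithm 2: two cheap tests, then the expensive palladium scan for the rest."""
    if passes_test_x(obj):
        return "rube bin"                      # "regular rube"
    if passes_test_y(obj):
        return "blegg bin"                     # "regular blegg"
    # "irregular object": fall back to the palladium scanner
    if obj.get("palladium", 0.0) >= palladium_threshold:
        return "rube bin"                      # "irregular rube"
    return "blegg bin"                         # "irregular blegg"
```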

Then again, I'm not entirely sure whether we are really disagreeing or just formulating the same thing in different ways?

Comment by vadim-kosoy on Blegg Mode · 2019-03-13T19:02:44.101Z · score: 3 (2 votes) · LW · GW

I don't understand what point you are trying to make.

Presumably, each object has observable properties $x$ and unobservable properties $y$. The utility of putting an object into bin A is $u_A(x,y)$ and the utility of putting it into bin B is $u_B(x,y)$. Therefore, your worker should put an object into bin A if and only if $\mathbb{E}[u_A(x,y) \mid x] \geq \mathbb{E}[u_B(x,y) \mid x]$.

That's it. Any "categories" you introduce here are at best helpful heuristics, with no deep philosophical significance.

Comment by vadim-kosoy on So You Want to Colonize The Universe Part 4: Velocity Changes and Energy · 2019-03-02T18:47:30.377Z · score: 1 (1 votes) · LW · GW

Makes perfect sense, forget I asked.

Comment by vadim-kosoy on So You Want to Colonize The Universe Part 4: Velocity Changes and Energy · 2019-03-02T17:27:29.660Z · score: 1 (1 votes) · LW · GW

I'm confused. Wouldn't that mean that, even without this trick, a laser sail is only good for nearby missions?

Comment by vadim-kosoy on 'This Waifu Does Not Exist': 100,000 StyleGAN & GPT-2 samples · 2019-03-01T21:45:02.052Z · score: 4 (3 votes) · LW · GW

Amusingly, one of the sample texts contained the Japanese "一生える山の図の彽をふるほゥていしまうもようざないかった" which Google Translate renders as "I had no choice but to wear a grueling of a mountain picture that would last me" (no, it doesn't make sense in context).

Comment by vadim-kosoy on So You Want to Colonize The Universe Part 4: Velocity Changes and Energy · 2019-03-01T21:38:03.281Z · score: 2 (2 votes) · LW · GW

Some options you didn't mention (maybe on purpose because they are less efficient?):

  • Cheating the rocket equation using pulse propulsion
  • Braking a laser-sail spaceship by having a mirror that detaches and reflects the laser back at the spaceship from the opposite direction (I don't remember whose idea that is)

Also, your rocket equation is non-relativistic, although IIRC the relativistic equation is the same just with change in rapidity instead of change in velocity.
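For reference, here are the standard forms (not quoted from the post; $v_e$ is the exhaust velocity in the rocket frame and $m_0/m_1$ is the initial-to-final mass ratio):

$$\Delta v = v_e \ln\frac{m_0}{m_1}\ \ \text{(classical)}, \qquad \Delta w = v_e \ln\frac{m_0}{m_1},\quad w \equiv c\,\operatorname{artanh}\!\left(\frac{v}{c}\right)\ \ \text{(relativistic)}.$$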

Comment by vadim-kosoy on So You Want To Colonize The Universe Part 3: Dust · 2019-03-01T19:04:06.866Z · score: 3 (3 votes) · LW · GW

Ordinary neutrinos or dark matter won't work, but if we go into the extremely speculative realm, there might be some "hidden sector" of matter that doesn't normally interact with ordinary matter but allows complex structure. Producing it and doing anything with it would be very hard, but not necessarily impossible.

Comment by vadim-kosoy on So You Want To Colonize The Universe Part 3: Dust · 2019-03-01T19:00:54.515Z · score: 2 (2 votes) · LW · GW

This is extremely speculative, but one way it might be possible to build very sturdy probes is if there were a phase of matter whose binding energies were typical of the nuclear forces (or of some other, hitherto unknown strong force), rather than of the electromagnetic force as in ordinary matter. Strangelets are one candidate.

Comment by vadim-kosoy on So You Want to Colonize the Universe Part 2: Deep Time Engineering · 2019-03-01T17:15:43.056Z · score: 4 (4 votes) · LW · GW

Instead of delivering a vessel that can support earth-based life for hundreds of millions of years, we just have to deliver about 100 kg of Von Neumann probes and stored people, which build more of themselves.

We don't necessarily need stored people. The probe can unfold into basic infrastructure + receiver, and the people can be transmitted by some communication channel (radio, laser or something more exotic).

Comment by vadim-kosoy on So You Want to Colonize The Universe · 2019-03-01T17:06:42.890Z · score: 6 (3 votes) · LW · GW

I think that the Landauer limit argument was debunked.

Comment by vadim-kosoy on Rule Thinkers In, Not Out · 2019-02-28T21:02:44.642Z · score: 9 (2 votes) · LW · GW

It seems that Einstein was just factually wrong, since ey did not expect the EPR paradox to be empirically confirmed (which only happened after eir death), but intended it as a reductio ad absurdum. Of course, thinking of the paradox did contribute to our understanding of QM, in which sense Einstein played a positive role here, paradoxically.

Comment by vadim-kosoy on Rule Thinkers In, Not Out · 2019-02-27T14:54:28.311Z · score: 9 (6 votes) · LW · GW

Einstein seems to have batted a perfect 1000

Did ey? As far as I know, ey continued to resist quantum mechanics (in its ultimate form) for eir entire life, and eir attempts to create a unified field theory led to nothing (or almost nothing).

Comment by vadim-kosoy on Some disjunctive reasons for urgency on AI risk · 2019-02-17T21:46:01.851Z · score: 12 (4 votes) · LW · GW

I think that this problem is in the same broad category as "invent general relativity" or "prove the Poincare conjecture". That is, for one thing quantity doesn't easily replace talent (you couldn't invent GR just as easily with 50 mediocre physicists instead of one Einstein), and, for another thing, the work is often hard to parallelize (50 Einsteins wouldn't invent GR 50 times as fast). So, you can't solve it just by spending lots of resources in a short time frame.

Comment by vadim-kosoy on Some disjunctive reasons for urgency on AI risk · 2019-02-16T22:40:55.314Z · score: 14 (5 votes) · LW · GW

Where do you draw the line between "the people in that industry will have the time and skill to notice the problems and start working on them" and what is happening now, which is: some people in the industry (at least, you can't argue DeepMind and OpenAI are not in the industry) noticed there is a problem and started working on it? Is it an accurate representation of the no-foom position to say that we should only start worrying when we literally observe a superhuman AI trying to take over the world? What if AI takes years to gradually push humans to the sidelines, but the process is unstoppable, because this time is not enough to solve alignment from scratch and the economic incentives to keep employing and developing AI are too strong to fight against?

Comment by vadim-kosoy on How does OpenAI's language model affect our AI timeline estimates? · 2019-02-15T17:29:48.433Z · score: 19 (6 votes) · LW · GW

They claim to beat records on a range of standard tests (such as the Winograd schema challenge), which is not something you can cheat at by cherry-picking, assuming they are honest about the results.

Comment by vadim-kosoy on The Hamming Question · 2019-02-13T20:19:38.961Z · score: 1 (1 votes) · LW · GW

I think we probably use the phrase "love of knowledge" differently. The way I see it, if you love knowledge then you must engage with questions addressable with the current tools in a way that brings the field forward, otherwise you are not gaining any knowledge, you are just wasting your time or fooling yourself and others. If certain scientists get spurious results because of poor methodology, there is no love of knowledge in it. I also don't think they use poor methodology because of desire for knowledge at all: rather, they probably do it because of the pressure to publish and because of the osmosis of some unhealthy culture in their field.

Comment by vadim-kosoy on The Hamming Question · 2019-02-13T14:54:15.632Z · score: 1 (1 votes) · LW · GW

What is the difference between love of knowledge and "advancing the field"? Most researchers seem to focus on questions that are some combination of (i) interesting personally to them, (ii) likely to bring them fame and (iii) likely to bring them grants. It would be awfully convenient for them if that were literally the best estimate you could make of what research will ultimately be useful, but I doubt it is the case. Some research that "advances the field" is actively harmful (e.g. advancing AI capabilities without advancing understanding, improving the ability to create synthetic pandemics, creating other technologies that are easy for bad actors to weaponize, creating technology that shifts economic incentives towards massive environmental damage...)

Comment by vadim-kosoy on The Hamming Question · 2019-02-12T20:13:38.415Z · score: 6 (4 votes) · LW · GW

For one thing, this observation is strongly confounded by other characteristics that differ between those fields. For another, yes, I know that often something that was studied just for the love of knowledge has tremendous applications later. And yet, I feel that if your goal is improving the world, then there is room for more analysis than "does it seem interesting to study that?". Also, what I consider "practical" is not necessarily what is normally viewed as "practical". For example, I consider it "practical" to research a question because it may have some profound consequences many decades down the line, even if that's only backed by broad considerations rather than some concrete application.

Comment by vadim-kosoy on The Hamming Question · 2019-02-08T23:59:43.998Z · score: 1 (1 votes) · LW · GW

I agree with the general principle; it's just that my impression is that most scientists have asked themselves this question and made more or less reasonable decisions about it, with respect to the scale of importance prevalent in academia. From my (moderate amount of) experience, most scientists would love to crack the biggest problem in their field if they think they have a good shot at it.

Comment by vadim-kosoy on The Hamming Question · 2019-02-08T22:19:39.353Z · score: 8 (6 votes) · LW · GW

By the way, I never understood why it's supposed to be such a "trick" question. "Why aren't you working on them?" The obvious answer is diminishing returns. If a lot of people (or a lot of "total IQ") already go into problem X, then adding more to problem X might be less useful than adding more to problem Y, which is less important but also more neglected.

In the context of our community, people might interpret it as something like "why aren't more people working on mitigating X-risk instead of studying academic questions with no known applications", which is a good question, but it's not the same one. The key here is the meaning of "important". For most academics, "important" means "acknowledged as important in academia", or at best "intrinsically interesting". On the other hand, for EA-minded people, "important" means "has an actual positive influence on the world". This difference in the meaning of "important" seems much more important than blaming people for not choosing the most important question on a scale they already accept.

Comment by vadim-kosoy on How does Gradient Descent Interact with Goodhart? · 2019-02-02T13:29:31.022Z · score: 1 (1 votes) · LW · GW

To clarify, my answer addresses specifically the case where the true objective is the true risk and the proxy is the empirical risk of some ANN, for some offline learning task. The question is more general, but I think this special case is important and instructional.

Comment by vadim-kosoy on How does Gradient Descent Interact with Goodhart? · 2019-02-02T13:06:32.528Z · score: 22 (10 votes) · LW · GW

My answer will address the special case where the true objective is the true risk and the proxy is the empirical risk of an ANN, for some offline learning task. The question is more general, but I think that this special case is important and instructional.

The questions you ask seem to be closely related to statistical learning theory. Statistical learning theory studies questions such as (a) "how many samples do I need in my training set, to be sure that the model my algorithm learned generalizes well outside the training set?" and (b) "is the given training set sufficiently representative to warrant generalization?" In the case of AI alignment, we are worried that a model of human values trained on a limited data set will not generalize well outside this data set. Indeed, many classical failure modes that have been speculated about can be regarded as: human values coincide with a different function inside the training set (e.g. with human approval, or with the electronic reward signal fed into the AI) but not on the entire space.

The classical answer to questions (a) and (b) for offline learning is Vapnik-Chervonenkis theory and its various offshoots. Specifically, VC dimension determines the answer to question (a) and Rademacher complexity determines the answer to question (b). The problem is, VC theory is known to be inadequate for deep learning. Specifically, the VC dimension of a typical ANN architecture is too large to explain the empirical sample complexity. Moreover, risk minimization for ANNs is known to be NP-complete (even learning the intersection of two half-spaces is already NP-complete), yet gradient descent obviously does a "good enough" job in some sense, even though most known guarantees for gradient descent only apply to convex cost functions.
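For concreteness, here is the flavor of guarantee VC theory provides (a standard uniform-convergence bound, quoted from memory, so treat the exact constants as indicative): for a hypothesis class of VC dimension $d$, true risk $R$, and empirical risk $\hat{R}_n$ over $n$ i.i.d. samples, with probability at least $1-\delta$ every hypothesis $h$ in the class satisfies

$$\big|R(h) - \hat{R}_n(h)\big| \;\le\; \sqrt{\frac{d\left(\ln\frac{2n}{d}+1\right)+\ln\frac{4}{\delta}}{n}}.$$

This becomes vacuous once $d \gtrsim n$, which is exactly the sense in which the huge VC dimension of typical ANN architectures fails to explain their empirical sample complexity.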

Explaining these paradoxes is an active field of research, and the problem is not solved. That said, there have been some results in recent years that IMO provide some very important insights. One type of result is algorithms different from gradient descent that provably learn ANNs under some assumptions. Goel and Klivans 2017 show how to learn an ANN with two non-linear layers. They bypass the no-go results by assuming either that the target function is sufficiently smooth, or that the task is binary classification with a sufficient margin between the positive examples and the negative examples. Zhang et al 2017 learn ANNs with any number of layers, assuming a margin and regularization. Another type of result is Allen-Zhu, Li and Liang 2018 (although it hasn't been peer reviewed yet, AFAIK, and I certainly haven't verified all the proofs). They examine a rather realistic algorithm (ReLU response, stochastic gradient descent with something similar to dropout) but limited to three layers. Intriguingly, they prove that the algorithm successfully achieves improper learning of a different class of functions, which have the form of somewhat smaller ANNs with smooth response. The "smoothness" serves the same role as in the previous results, to avoid the NP-completeness results. The "improperness" resolves the apparent paradox of high VC dimension: it is indeed infeasible to learn a function that looks like the actual neural network; instead, the neural network is just a means to learn a somewhat simpler function.

I am especially excited about the improper learning result. Assuming this result can be generalized to any number of layers (and I don't see any theoretical reason why it can't), the next question is understanding the expressiveness of the class that is learned. That is, it is well known that ANNs can approximate any Boolean circuit, but the "smooth" ANNs that are actually learned are somewhat weaker. Understanding this class of functions better might provide deep insights regarding the power and limitation of deep learning.

Comment by vadim-kosoy on 2018 AI Alignment Literature Review and Charity Comparison · 2019-02-01T18:07:40.047Z · score: 3 (2 votes) · LW · GW

I would like to emphasise that there is a lot of research I didn't have time to review, especially in this section, as I focused on reading organisation-donation-relevant pieces. For example, Kosoy's The Learning-Theoretic AI Alignment Research Agenda seems like a worthy contribution.

I would like to note that my research is funded by MIRI, so it is somewhat organisation-donation-relevant.

Comment by vadim-kosoy on "AlphaStar: Mastering the Real-Time Strategy Game StarCraft II", DeepMind [won 10 of 11 games against human pros] · 2019-01-26T21:59:45.743Z · score: 5 (3 votes) · LW · GW

Number 3 is an interesting claim, but I would assume that, if this is true and DeepMind are aware of this, they would just find a way to erase the spam clicks from the human play database.

Comment by vadim-kosoy on For what do we need Superintelligent AI? · 2019-01-26T12:59:38.909Z · score: 7 (5 votes) · LW · GW

The difference between "slightly above human" and "very high level of superintelligence" is difficult to grasp, because we don't have a good way to quantify intelligence and don't have a good way to predict how much intelligence you need to achieve something. That said, some plausible candidates (in addition to the two you mentioned, which are reasonable) are:

  1. Solving all other X-risks
  2. Constructing a Dyson sphere or something else that will allow much more efficient and massive conversion of physical resources to human flourishing
  3. Solving all problems of society/government/economics, except to the extent we want to solve them ourselves
  4. Creating a way of life for everyone which is neither oppressive (like having to work in a boring and/or unpleasant job) nor dull or meaningless
  5. Finding the optimal way to avert a Malthusian catastrophe while satisfying the human preferences for reproduction and immortality
  6. Allowing us to modify/improve the minds of ourselves and our descendants, and/or create entirely new kinds of minds, while protecting us from losing our values and identities, or unintentionally triggering a moral catastrophe
  7. Solving all moral conundrums involving animals, wild nature and other non-human minds, if such exist
  8. Negotiating with aliens, if such exist (but that is probably very non-urgent)

Regarding near-light-speed space travel (and space colonization), it does seem necessary if you want to make the best use of the universe.

Also, I think Gurkenglas has a very good point regarding acausal trade.

Comment by vadim-kosoy on Cooperative Oracles · 2019-01-25T15:08:49.691Z · score: 1 (1 votes) · LW · GW

"the combination of conditions 2 and 3 still has some changes being labeled as improvements that wouldn't be improvements under the old concept of Pareto Optimality."

Why? Condition 3 implies that $U_{RO,j} \leq U_{RO',j}$. So, together with condition 2, we get that $U_{RO,j} \leq U_{RO',j}$ for any $j$. That precisely means that this is a Pareto improvement in the usual sense.

Comment by vadim-kosoy on Safely and usefully spectating on AIs optimizing over toy worlds · 2019-01-25T15:00:25.901Z · score: 1 (1 votes) · LW · GW

My point is, I don't think it's possible to implement a strong computationally feasible agent which doesn't search through possible hypotheses, because solving the optimization problem for the hard-coded ontology is intractable. In other words, what gives intelligence its power is precisely the search through possible hypotheses.

Comment by vadim-kosoy on Announcement: AI alignment prize round 4 winners · 2019-01-25T14:32:28.561Z · score: 3 (2 votes) · LW · GW

I can probably spend some time (perhaps around 4 hours / week) on mentoring, especially for new researchers that want to contribute to the learning-theoretic research agenda or its vicinity. However, I am not sure how to make this known to the relevant people. Should I write a post that says "hey, who wants a mentor?" Is there a better way?

Comment by vadim-kosoy on Announcement: AI alignment prize round 4 winners · 2019-01-25T14:19:55.936Z · score: 9 (4 votes) · LW · GW

As an anecdata point, it seems probable that I would not have written the essay about the learning-theoretic research agenda without the prize, or at least it would have been significantly delayed. This is because I am usually reluctant to publish anything that doesn't contain non-trivial theorems, but it felt like for this prize it would be suitable (this preference is partially for objective reasons, but partially due to entirely subjective motivation issues). In hindsight, I think that spending the time to write that essay was the right decision regardless of the prize.

Comment by vadim-kosoy on Announcement: AI alignment prize round 4 winners · 2019-01-20T18:31:04.984Z · score: 13 (6 votes) · LW · GW

Congratulations to all the winners!

I hadn't completed my intended submission in time for the last round, and was looking forward to competing in the next one, so it's slightly disappointing. Oh, well. IMHO it could work in the long run if it were substantially less frequent: say, once every 2 years.

In any case, I think it was a great initiative and I sincerely thank the organizing team for making it happen!

Comment by vadim-kosoy on Book Review: The Structure Of Scientific Revolutions · 2019-01-09T14:45:48.595Z · score: 11 (6 votes) · LW · GW

Crossposted from SSC comments section

That problem— What must the world be like in order that man may know it?— was not, however, created by this essay. On the contrary, it is as old as science itself, and it remains unanswered. But it need not be answered in this place.

At this point I lose patience. Kuhn is no longer being thought-provoking, he’s being disingenuous. IT’S BECAUSE THERE’S AN OBJECTIVE REALITY, TOM.

I haven’t read Kuhn and I don’t know whether I’m interpreting em correctly, but to me it seems not that simple at all.

Saying there is an objective reality doesn’t explain why this reality is comprehensible. In statistical learning theory there are various analyses of what mathematical conditions must hold for it to be possible to learn a model from observations (i.e. so that you can avoid the no-free-lunch theorems) and how difficult it is to learn it, and when you add computational complexity considerations into the mix it becomes even more complicated. Our understanding of these questions is far from complete.

In particular, our ability to understand physics seems to rely on the hierarchical nature of physical phenomena. You can discover classical mechanics without knowing anything about molecules or quantum physics, you can discover atomic and molecular physics while knowing little about nuclear physics, and you can discover nuclear and particle physics without understanding quantum gravity (i.e. what happens to spacetime at the Planck scale). If the universe were such that it was impossible to compute the trajectory of a tennis ball without string theory, we might never have discovered any physics.

Comment by vadim-kosoy on Player vs. Character: A Two-Level Model of Ethics · 2018-12-15T11:57:52.320Z · score: 5 (4 votes) · LW · GW

I strongly agree with these comments regarding is-ought. To add a little, talking about winning/losing, effective strategies or game theory assumes a specific utility function. To say Maria Teresa "lost" we need to first agree that death and pain are bad. And even the concept of "survival" is not really well-defined. What does it mean to survive? If humanity is replaced by "descendants" which are completely alien or even monstrous from our point of view, did humanity "survive"? Surviving means little without thriving and both concepts are subjective and require already having some kind of value system to specify.

Comment by vadim-kosoy on Player vs. Character: A Two-Level Model of Ethics · 2018-12-15T11:49:08.479Z · score: 5 (3 votes) · LW · GW

It also seems worth pointing out that the referent of the metaphor indeed has more than two levels. For example, we can try to break it down as genetic evolution -> memetic evolution -> unconscious mind -> conscious mind. Each level is a "character" to the "player" of the previous level. Or, in computer science terms, we have a program writing a program writing a program writing a program.

Comment by vadim-kosoy on Towards a New Impact Measure · 2018-11-22T15:45:00.206Z · score: 3 (2 votes) · LW · GW

The proof of Theorem 1 is rather unclear: "high scoring" is ill-defined, and increasing the probability of some favorable outcome doesn't imply that the action is good for the utility function in question, since it can also increase the probability of some unfavorable outcome. Instead, you can easily construct a counterexample utility function by hand, just by setting it to one value for any history with the first prefix and to a different value for any history with the second prefix.

Comment by vadim-kosoy on Cooperative Oracles · 2018-11-17T16:53:09.704Z · score: 1 (1 votes) · LW · GW

The definition of stratified Pareto improvement doesn't seem right to me. You are trying to solve the problem that there are too many Pareto optimal outcomes. So, you need to make the notion of Pareto improvement weaker. That is, you want more changes to count as Pareto improvements, so that fewer outcomes count as Pareto optimal. However, the definition you gave is strictly stronger than the usual definition of Pareto improvement, not strictly weaker (because condition 3 has equality instead of inequality). What it seems like you need is to drop condition 3 entirely.

The definition of almost stratified Pareto optimum also doesn't make sense to me. What problem are you trying to solve? The closure of a set can only be larger than the set. Also, the closure of an empty set is empty. So, on the one hand, any stratified Pareto optimum is in particular an almost stratified Pareto optimum. On the other hand, if there exists an almost stratified Pareto optimum, then there exists a stratified Pareto optimum. So, you neither refine the definition of an optimum nor make existence easier.

Dimensional regret without resets

2018-11-16T19:22:32.551Z · score: 9 (4 votes)
Comment by vadim-kosoy on Two Kinds of Technology Change · 2018-10-11T17:49:48.185Z · score: 1 (1 votes) · LW · GW

Hmm, interesting. But why was the cost of capital relative to labor so high?

Comment by vadim-kosoy on Two Kinds of Technology Change · 2018-10-11T09:56:01.981Z · score: 6 (4 votes) · LW · GW

This analysis seems correct but somewhat misleading. Specifically, I think that when a technology is enabled by a change in economic conditions, it is often the case that the change in economic conditions was caused by a different technology. So, the ultimate limiting factor is still insight.

In particular, Gutenberg's printing press doesn't seem like a great example for the "insight is not the limiting factor" thesis. First, the Chinese had movable type earlier, but it was not as efficient with the Chinese language because of the enormous number of characters, which is why it didn't become more popular in China. Second, you say yourself that "printing presses with movable type followed a century later". A century is still a lot of time! Third, coming back to what I said before, why did paper production only take off in Europe in the 1300s? As far as I understand, it was invented in China, from there it propagated to the Muslim world, and from there it reached Europe through Spain. So, for many centuries, the reason the Europeans didn't use paper was lack of insight. Only when the knowledge that originated in China reached them did they catch on.

Comment by vadim-kosoy on Thinking of the days that are no more · 2018-10-08T18:19:42.925Z · score: 3 (2 votes) · LW · GW

You can get divorced and still have both parents in your kids' lives. Conversely, you can remain married and make your kids miserable. There is no system that can force someone to be a good parent.

Comment by vadim-kosoy on Thinking of the days that are no more · 2018-10-06T19:03:16.139Z · score: 6 (5 votes) · LW · GW

I mostly agree with all of that, but also, the case against "everyone managed it back in the good old days" seems understated IMO. If in the "good old days" everyone stayed in miserable relationships because of social and legal barriers to leaving, that's not a point in favor of the good old days. I don't see the advantage of modern society in this respect as a trade-off, it seems more like a win-win.

Comment by vadim-kosoy on Realism about rationality · 2018-10-05T09:31:15.447Z · score: 1 (1 votes) · LW · GW

Nearly everything you said here was already addressed in my previous comment. Perhaps I didn't explain myself clearly?

It would be trickier for the device I described to pull off such a deception, because it would have to actually halt and show us its output in such cases.

I wrote before that "I wonder how you would tell whether it is the hypercomputer you imagine it to be, versus the realization of the same hypercomputer in some non-standard model of ZFC?"

So, the realization of a particular hypercomputer in a non-standard model of ZFC would pass all of your tests. You could examine its internal state or its output any way you like (i.e. ask any question that can be formulated in the language of ZFC) and everything you see would be consistent with ZFC. The number of steps for a machine that shouldn't halt would be a non-standard number, so it would not fit in any finite storage. You could examine some finite subset of its digits (either from the end or from the beginning), for example, but that would not tell you the number is non-standard. For any question of the form "is the number larger than some known number?" the answer would always be "yes".

But finite resource bounds already prevent us from completely ruling out far-fetched hypotheses about even normal computers. We’ll never be able to test, e.g., an arbitrary-precision integer comparison function on all inputs that could feasibly be written down. Can we be sure it always returns a Boolean value, and never returns the Warner Brothers dancing frog?

Once again, there is a difference of principle. I wrote before that: "...given an uncomputable function $f$ and a system under test $S$, there is no sequence of computable tests that will allow you to form some credence about the hypothesis "$S$ computes $f$" s.t. this credence will converge to $1$ when the hypothesis is true and to $0$ when the hypothesis is false. (This can be made an actual theorem.) This is different from the situation with normal computers (i.e. computable $f$) when you can devise such a sequence of tests."

So, with normal computers you can become increasingly certain your hypothesis regarding the computer is true (even if you never become literally 100% certain, except in the limit), whereas with a hypercomputer you cannot.

Actually, hypothesizing that my device “computed” a nonstandard version of the halting function would already be sort of self-defeating from a standpoint of skepticism about hypercomputation, because all nonstandard models of Peano arithmetic are known to be uncomputable.

Yes, I already wrote that: "Although you can in principle have a class of uncomputable hypotheses s.t. you can asymptotically verify that the function computed by $S$ is in the class, for example the class of all functions $g$ s.t. it is consistent with ZFC that $g$ is the halting function. But the verification would be extremely slow and relatively parsimonious competing hypotheses would remain plausible for an extremely (uncomputably) long time. In any case, notice that the class itself has, in some strong sense, a computable description: specifically, the computable verification procedure itself."

So, yes, you could theoretically become certain the device is a hypercomputer (although reaching high certainty would take a very long time), without knowing precisely which hypercomputer it is, but that doesn't mean you need to add non-computable hypotheses to your "prior", since that knowledge would still be expressible as a computable property of the world.

I don’t know enough about Solomonoff induction to say whether it would unduly privilege such hypotheses over the hypothesis that the device was a true hypercomputer (if it could even entertain such a hypothesis).

Literal Solomonoff induction (or even bounded versions of Solomonoff induction) is probably not the ultimate "true" model of induction; I was just using it as a simple example before. The true model will allow expressing hypotheses such as "all the even-numbered bits in the sequence are $0$", which involve computable properties of the environment that do not specify it completely. Making this idea precise is somewhat technical.

Comment by vadim-kosoy on The Rocket Alignment Problem · 2018-10-04T13:00:05.762Z · score: 13 (11 votes) · LW · GW

I agree with Ben, and also, humanity successfully sent a spaceship to the lunar surface on the second attempt and successfully sent people (higher stakes) to the lunar surface on the first attempt. This shows that difficult technological problems can be solved without extensive trial and error. (Obviously, some trial and error on easier problems was done to get to the point of landing on the Moon, and no doubt the same will be true of AGI. But there is hope that the actual AGI can be constructed without trial and error, or at least without the sort of trial and error where error is potentially catastrophic.)

Comment by vadim-kosoy on Realism about rationality · 2018-10-03T09:47:20.369Z · score: 1 (1 votes) · LW · GW

In some sense, yes, although for conventional computers you might have to settle for very slow verification. Unless you mean that your mind has only finite memory/lifespan, and therefore you cannot verify an arbitrary conventional computer to within any given credence, which is also true. Under favorable conditions, you can quickly verify something in PSPACE (using interactive proof protocols), and given extra assumptions you might be able to do better (if you have two provers that cannot communicate you can do NEXP, or if you have a computer whose memory you can reliably delete you can do an EXP-complete language); however, it is not clear whether you can be justifiably highly certain of such extra assumptions.
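(For reference, the standard results behind the first two claims are Shamir's theorem and the Babai-Fortnow-Lund theorem,

$$\mathsf{IP} = \mathsf{PSPACE}, \qquad \mathsf{MIP} = \mathsf{NEXP},$$

i.e. a single prover suffices for verifying PSPACE and two non-communicating provers suffice for NEXP.)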

See also my reply to lbThingrb.

Comment by vadim-kosoy on Realism about rationality · 2018-10-03T09:35:43.902Z · score: 4 (3 votes) · LW · GW

It is true that a human brain is more precisely described as a finite automaton than a Turing machine. And if we take finite lifespan into account, then it's not even a finite automaton. However, these abstractions are useful models since they become accurate in certain asymptotic limits that are sufficiently useful to describe reality. On the other hand, I doubt that there is a useful approximation in which the brain is a hypercomputer (except maybe some weak forms of hypercomputation like non-uniform computation / circuit complexity).

Moreover, one should distinguish between different senses in which we can be "modeling" something. The first sense is the core, unconscious ability of the brain to generate models, and in particular that which we experience as intuition. This ability can (IMO) be thought of as some kind of machine learning algorithm, and, I doubt that hypercomputation is relevant there in any way. The second sense is the "modeling" we do by manipulating linguistic (symbolic) constructs in our conscious mind. These constructs might be formulas in some mathematical theory, including formulas that represent claims about uncomputable objects. However, these symbolic manipulations are just another computable process, and it is only the results of these manipulations that we use to generate predictions and/or test models, since this is the only access we have to those uncomputable objects.

Regarding your hypothetical device, I wonder how you would tell whether it is the hypercomputer you imagine it to be, versus the realization of the same hypercomputer in some non-standard model of ZFC? (In particular, the latter could tell you that some Turing machine halts when it "really" doesn't, because in the model it halts after some non-standard number of computing steps.) More generally, given an uncomputable function $f$ and a system under test $S$, there is no sequence of computable tests that will allow you to form some credence about the hypothesis "$S$ computes $f$" s.t. this credence will converge to $1$ when the hypothesis is true and to $0$ when the hypothesis is false. (This can be made an actual theorem.) This is different from the situation with normal computers (i.e. computable $f$) when you can devise such a sequence of tests. (Although you can in principle have a class of uncomputable hypotheses s.t. you can asymptotically verify that the function computed by $S$ is in the class, for example the class of all functions $g$ s.t. it is consistent with ZFC that $g$ is the halting function. But the verification would be extremely slow and relatively parsimonious competing hypotheses would remain plausible for an extremely (uncomputably) long time. In any case, notice that the class itself has, in some strong sense, a computable description: specifically, the computable verification procedure itself.)
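One way to make the parenthetical claim precise (this is my own formalization of the statement above, with the names $T$, $S$ and $p_n$ introduced purely for illustration): a computable testing strategy $T$ queries the black box $S$ on finitely many inputs at each stage $n$ and outputs a credence $p^{T,S}_n \in [0,1]$. The claim is that for uncomputable $f$,

$$\neg\exists\, T\ \text{computable}\ \forall S:\ \Big(S\ \text{computes}\ f \Rightarrow p^{T,S}_n \to 1\Big)\ \wedge\ \Big(S\ \text{does not compute}\ f \Rightarrow p^{T,S}_n \to 0\Big),$$

whereas for computable $f$ the obvious strategy (compare $S$'s answers with $f$'s on ever larger finite sets of inputs) does have both convergence properties.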

My point is, the Church-Turing thesis implies (IMO) that the mathematical model of rationality/intelligence should be based on Turing machines at most, and this observation does not strongly depend on assumptions about physics. (Well, if hypercomputation is physically possible, and realized in the brain, and there is some intuitive part of our mind that uses hypercomputation in a crucial way, then this assertion would be wrong. That would contradict my own intuition about what reasoning is (including intuitive reasoning), besides everything we know about physics, but obviously this hypothesis has some positive probability.)

Comment by vadim-kosoy on Realism about rationality · 2018-10-01T21:48:17.574Z · score: 4 (3 votes) · LW · GW

What does it mean to have a box for solving the halting problem? How do you know it really solves the halting problem? There are some computable tests we can think of, but they would be incomplete, and you would only verify that the box satisfies those computable tests, not that it is "really" a hypercomputer. There would be a lot of possible boxes that don't solve the halting problem but pass the same computable tests.

If there is some powerful computational hardware available, I would want the AI to use that hardware. If you imagine the hardware as being hypercomputers, then you can think of such an AI as having a "prior over hypercomputable worlds". But you can alternatively think of it as reasoning using computable hypotheses about the correspondence between the output of this hardware and the output of its sensors. The latter point of view is better, I think, because you can never know the hardware is really a hypercomputer.

Comment by vadim-kosoy on Realism about rationality · 2018-10-01T20:55:18.066Z · score: 8 (4 votes) · LW · GW

Physics is not the territory; physics is (quite explicitly) the models we have of the territory. Rationality consists of the rules for formulating these models, and in this sense it is prior to physics and more fundamental. (This might be a disagreement over use of words. If by "physics" you, by definition, refer to the territory, then it seems to miss my point about Occam's razor. Occam's razor says that the map should be parsimonious, not the territory: the latter would be a type error.) In fact, we can adopt the view that Solomonoff induction (which is a model of rationality) is the ultimate physical law: it is a mathematical rule for making predictions that generates all the other rules we can come up with. Such a point of view, although in some sense justified, would at present be impractical: this is because we know how to compute using actual physical models (including running computer simulations), but not so much using models of rationality. But this is just another way of saying we haven't constructed AGI yet.

I don't think it's meaningful to say that "weird physics may enable super Turing computation." Hypercomputation is just a mathematical abstraction. We can imagine, to a point, that we live in a universe which contains hypercomputers, but since our own brain is not a hypercomputer, we can never fully test such a theory. This, IMO, is the most fundamental significance of the Church-Turing thesis: since we only perceive the world through the lens of our own mind, then from our subjective point of view, the world only contains computable processes.

Comment by vadim-kosoy on The Tails Coming Apart As Metaphor For Life · 2018-09-25T21:24:47.577Z · score: 5 (5 votes) · LW · GW

One way to deal with this is to have an entire set of utility functions (the different "lines"), normalize them so that they approximately agree inside "Mediocristan", and choose the "cautious" strategy, i.e. the strategy that maximizes the minimum of expected utility over this set. This way you are at least guaranteed not to end up in a place that is worse than "Mediocristan".
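A minimal sketch of that rule, with placeholder utility functions, normalization, outcomes and strategies (all made up for illustration):

```python
import numpy as np

# Placeholder utility functions (the "lines"); in practice these would be different
# extrapolations of our values that roughly agree on ordinary ("Mediocristan") outcomes.
def u1(outcome):
    return outcome["wealth"] + 2.0 * outcome["health"]

def u2(outcome):
    return outcome["wealth"] + 5.0 * outcome["health"]

def normalize(utilities, mediocristan_outcomes):
    """Rescale each utility to mean 0 and std 1 over the ordinary outcomes,
    so that the different functions approximately agree where we trust them."""
    normalized = []
    for u in utilities:
        vals = np.array([u(o) for o in mediocristan_outcomes])
        mu, sigma = vals.mean(), vals.std() + 1e-9
        normalized.append(lambda o, u=u, mu=mu, sigma=sigma: (u(o) - mu) / sigma)
    return normalized

def cautious_choice(strategies, lotteries, utilities):
    """Pick the strategy maximizing the minimum expected (normalized) utility:
    a maximin over the set of utility functions, not over world-states."""
    def worst_case_eu(strategy):
        outcomes = lotteries[strategy]  # list of (probability, outcome) pairs
        return min(sum(p * u(o) for p, o in outcomes) for u in utilities)
    return max(strategies, key=worst_case_eu)

# Example usage with made-up numbers: the "risky" lottery wins big under one
# extrapolation but is disastrous under the other, so the cautious rule picks "safe".
ordinary = [{"wealth": w, "health": h} for w in (0.0, 1.0) for h in (0.0, 1.0)]
us = normalize([u1, u2], ordinary)
lotteries = {
    "safe":  [(1.0, {"wealth": 1.0, "health": 1.0})],
    "risky": [(0.5, {"wealth": 10.0, "health": 1.0}), (0.5, {"wealth": 0.0, "health": -5.0})],
}
print(cautious_choice(["safe", "risky"], lotteries, us))
```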

Computational complexity of RL with traps

2018-08-29T09:17:08.655Z · score: 14 (5 votes)

Entropic Regret I: Deterministic MDPs

2018-08-16T13:08:15.570Z · score: 12 (7 votes)

Algo trading is a central example of AI risk

2018-07-28T20:31:55.422Z · score: 25 (15 votes)

The Learning-Theoretic AI Alignment Research Agenda

2018-07-04T09:53:31.000Z · score: 5 (4 votes)

Meta: IAFF vs LessWrong

2018-06-30T21:15:56.000Z · score: 1 (1 votes)

Computing an exact quantilal policy

2018-04-12T09:23:27.000Z · score: 2 (1 votes)

Quantilal control for finite MDPs

2018-04-12T09:21:10.000Z · score: 3 (3 votes)

Improved regret bound for DRL

2018-03-02T12:49:27.000Z · score: 0 (0 votes)

More precise regret bound for DRL

2018-02-14T11:58:31.000Z · score: 1 (1 votes)

Catastrophe Mitigation Using DRL (Appendices)

2018-02-14T11:57:47.000Z · score: 0 (0 votes)

Bugs?

2018-01-21T21:32:10.492Z · score: 4 (1 votes)

The Behavioral Economics of Welfare

2017-12-22T11:35:09.617Z · score: 28 (12 votes)

Improved formalism for corruption in DIRL

2017-11-30T16:52:42.000Z · score: 0 (0 votes)

Why DRL doesn't work for arbitrary environments

2017-11-30T12:22:37.000Z · score: 0 (0 votes)

Catastrophe Mitigation Using DRL

2017-11-22T05:54:42.000Z · score: 0 (0 votes)

Catastrophe Mitigation Using DRL

2017-11-17T15:38:18.000Z · score: 0 (0 votes)

Delegative Reinforcement Learning with a Merely Sane Advisor

2017-10-05T14:15:45.000Z · score: 0 (0 votes)

On the computational feasibility of forecasting using gamblers

2017-07-18T14:00:00.000Z · score: 0 (0 votes)

Delegative Inverse Reinforcement Learning

2017-07-12T12:18:22.000Z · score: 3 (2 votes)

Learning incomplete models using dominant markets

2017-04-28T09:57:16.000Z · score: 1 (1 votes)

Dominant stochastic markets

2017-03-17T12:16:55.000Z · score: 0 (0 votes)

A measure-theoretic generalization of logical induction

2017-01-18T13:56:20.000Z · score: 3 (3 votes)

Towards learning incomplete models using inner prediction markets

2017-01-08T13:37:53.000Z · score: 2 (2 votes)

Subagent perfect minimax

2017-01-06T13:47:12.000Z · score: 0 (0 votes)

Minimax forecasting

2016-12-14T08:22:13.000Z · score: 0 (0 votes)

Minimax and dynamic (in)consistency

2016-12-11T10:42:08.000Z · score: 0 (0 votes)

Attacking the grain of truth problem using Bayes-Savage agents

2016-10-20T14:41:56.000Z · score: 1 (1 votes)

IRL is hard

2016-09-13T14:55:26.000Z · score: 0 (0 votes)

Stabilizing logical counterfactuals by pseudorandomization

2016-05-25T12:05:07.000Z · score: 1 (1 votes)

Stability of optimal predictor schemes under a broader class of reductions

2016-04-30T14:17:35.000Z · score: 0 (0 votes)

Predictor schemes with logarithmic advice

2016-03-27T08:41:23.000Z · score: 1 (1 votes)

Reflection with optimal predictors

2016-03-22T17:20:37.000Z · score: 1 (1 votes)

Logical counterfactuals for random algorithms

2016-01-06T13:29:52.000Z · score: 3 (3 votes)

Quasi-optimal predictors

2015-12-25T14:17:05.000Z · score: 2 (2 votes)

Implementing CDT with optimal predictor systems

2015-12-20T12:58:44.000Z · score: 1 (1 votes)

Bounded Solomonoff induction using optimal predictor schemes

2015-11-10T13:59:29.000Z · score: 1 (1 votes)

Superrationality in arbitrary games

2015-11-04T18:20:41.000Z · score: 7 (6 votes)

Optimal predictor schemes

2015-11-01T17:28:46.000Z · score: 2 (2 votes)

Optimal predictors for global probability measures

2015-10-06T17:40:19.000Z · score: 0 (0 votes)

Logical counterfactuals using optimal predictor schemes

2015-10-04T19:48:23.000Z · score: 0 (0 votes)

Towards reflection with relative optimal predictor schemes

2015-09-30T15:44:21.000Z · score: 1 (1 votes)

Improved error space for universal optimal predictor schemes

2015-09-30T15:08:53.000Z · score: 0 (0 votes)

Optimal predictor schemes pass a Benford test

2015-08-30T13:25:59.000Z · score: 3 (3 votes)

Optimal predictors and propositional calculus

2015-07-04T09:51:38.000Z · score: 0 (0 votes)

Optimal predictors and conditional probability

2015-06-30T18:01:31.000Z · score: 2 (2 votes)

A complexity theoretic approach to logical uncertainty (Draft)

2015-05-11T20:04:28.000Z · score: 5 (5 votes)

Identity and quining in UDT

2015-03-19T19:03:29.000Z · score: 2 (2 votes)