Posts

[Link] Audio recording of Stephen Wolfram discussing AI and the Singularity 2015-11-18T21:41:31.933Z · score: 1 (2 votes)

Comments

Comment by raelwayscot on Open Thread March 21 - March 27, 2016 · 2016-03-25T01:13:24.238Z · score: 0 (0 votes) · LW · GW

Deutsch briefly summarized his view on AI risks in this podcast episode: https://youtu.be/J21QuHrIqXg?t=3450 (Unfortunately there is no transcript.)

What are your thoughts on his views apart from what you've touched upon above?

Comment by raelwayscot on After Go, what games should be next for DeepMind? · 2016-03-10T22:46:47.184Z · score: 7 (7 votes) · LW · GW

Demis Hassabis has already announced in an interview that they'll be working on a StarCraft bot.

Comment by raelwayscot on Open Thread Feb 22 - Feb 28, 2016 · 2016-02-23T12:59:22.706Z · score: 1 (1 votes) · LW · GW

What is your preferred backup strategy for your digital life?

Comment by raelwayscot on [Link] AlphaGo: Mastering the ancient game of Go with Machine Learning · 2016-01-28T23:20:24.722Z · score: 3 (3 votes) · LW · GW

I meant that AI will possibly require high-level credit assignment, e.g. experiences of regret like "I should be more careful in these kinds of situations", or the realization that one particular strategy out of the entire sequence of moves worked out really nicely. Instead, it penalizes/reinforces all moves of one game equally, which is potentially a much slower learning process. It turns out that Go can be played well without much structure in the credit assignment process, hence I said the problem is non-existent, i.e. there wasn't even a need to consider it and thereby further our understanding of RL techniques.
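To make the equal-credit point concrete, here is a minimal sketch (entirely my own illustration, not AlphaGo's actual code) of uniform credit assignment: every move in a game receives the same scalar return, regardless of which moves actually mattered.

```python
def assign_credit(moves, game_outcome):
    # Uniform credit assignment: every move in the game receives the
    # same scalar return (+1 for a win, -1 for a loss), regardless of
    # which moves actually decided the game.
    return [(move, game_outcome) for move in moves]

# Hypothetical 3-move game record that ended in a win.
moves = ["d4", "q16", "c3"]
credited = assign_credit(moves, +1)
# A single brilliant move and two blunders are reinforced equally,
# which is why learning from whole-game outcomes can be slow.
print(credited)  # [('d4', 1), ('q16', 1), ('c3', 1)]
```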

Comment by raelwayscot on [Link] AlphaGo: Mastering the ancient game of Go with Machine Learning · 2016-01-28T20:39:37.260Z · score: 0 (0 votes) · LW · GW

"Nonexistent problems" was meant as hyperbole, to say that they weren't solved in interesting ways and are extremely simple in this setting because the states and rewards are noise-free. I am not sure what you mean by the second question. They simply apply gradient descent to the entire history of moves of the current game so that the expected reward is maximized.

Comment by raelwayscot on [Link] AlphaGo: Mastering the ancient game of Go with Machine Learning · 2016-01-28T19:16:56.142Z · score: 0 (0 votes) · LW · GW

Yes, but as I wrote above, the problems of credit assignment, reward delay and noise are non-existent in this setting, and hence their work does not contribute at all to solving AI.

Comment by raelwayscot on [Link] AlphaGo: Mastering the ancient game of Go with Machine Learning · 2016-01-28T18:15:43.151Z · score: 1 (1 votes) · LW · GW

I think what this result says is thus: "Any tasks humans can do, an AI can now learn to do better, given a sufficient source of training data."

Yes, but that would likely require an extremely large amount of training data, because preparing actions for many kinds of situations means an exponential blow-up in the combinations of possibilities to cover, and hence the model would need to be huge as well. It would also require high-quality data sets with simple correction signals in order to work, which are expensive to produce.

I think, above all, building a real-time AI requires reuse of concepts, so that abstractions can be recombined and adapted to new situations; and concept-based prediction (reasoning) requires one-shot learning, so that trains of thought can be memorized and built upon. In addition, the entire network somehow needs to learn which parts of the network in the past were responsible for current reward signals, which are delayed and noisy. If there is a simple and fast solution to this, then AGI could be right around the corner. If not, it could take several decades of research.

Comment by raelwayscot on [Link] AlphaGo: Mastering the ancient game of Go with Machine Learning · 2016-01-28T14:36:35.857Z · score: 1 (3 votes) · LW · GW

I agree. I don't find this result any more or less indicative of near-term AI than Google's success on ImageNet in 2012. The algorithm learns to map positions to moves and values using CNNs, just as CNNs can be used to learn mappings from images to 350 classes of dog breeds and more. It turns out that Go really is a game about pattern recognition and that, with a lot of data, you can replicate the pattern detection for good moves in a very supervised way (one could call their reinforcement learning actually supervised, because the nature of the problem gives you credit assignment for free).
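As a toy illustration of the "position in, move out" framing (the features, sizes, and numbers here are my own, not from the paper), a softmax policy over a handful of hand-made board features can be fit with plain cross-entropy updates toward an expert's move:

```python
import math

def softmax(z):
    # Numerically stable softmax over a list of logits.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

# Toy "board": 4 position features, 3 legal moves; weights[move][feature].
weights = [[0.0] * 4 for _ in range(3)]

def train_step(features, expert_move, lr=0.1):
    # One supervised (cross-entropy) update toward the expert's move.
    logits = [sum(w * f for w, f in zip(ws, features)) for ws in weights]
    probs = softmax(logits)
    for m in range(3):
        grad = (1.0 if m == expert_move else 0.0) - probs[m]
        for i in range(4):
            weights[m][i] += lr * grad * features[i]

features = [1.0, 0.0, 1.0, 0.5]
for _ in range(200):
    train_step(features, expert_move=1)

logits = [sum(w * f for w, f in zip(ws, features)) for ws in weights]
probs = softmax(logits)
print(round(probs[1], 2))  # probability of the expert move approaches 1
```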

Comment by raelwayscot on Open thread, Jan. 25 - Jan. 31, 2016 · 2016-01-26T21:02:49.208Z · score: 3 (5 votes) · LW · GW

Then which blogs do you agree with on the matter of the refugee crisis? (My intent is just to crowd-source some well-founded opinions because I'm lacking one.)

Comment by raelwayscot on Open thread, Jan. 25 - Jan. 31, 2016 · 2016-01-26T20:12:33.544Z · score: 1 (3 votes) · LW · GW

What are your thoughts on the refugee crisis?

Comment by raelwayscot on Open Thread, January 11-17, 2016 · 2016-01-12T19:27:22.770Z · score: 1 (1 votes) · LW · GW

I was just speaking of weaknesses of the paperclip maximizer thought experiment. I've seen this misunderstanding in at least 4 out of 10 cases in which the thought experiment was brought up.

Comment by raelwayscot on Open Thread, January 11-17, 2016 · 2016-01-12T17:46:30.331Z · score: 0 (0 votes) · LW · GW

I think many people intuitively distrust the idea that an AI could be intelligent enough to transform matter into paperclips in creative ways, but 'not intelligent enough' to understand its goals in a human and cultural context (i.e. to satisfy the needs of the business owners of the paperclip factory). This is often due to the confusion that the paperclip maximizer would get its goal function from parsing the sentence "make paperclips", rather than from a preprogrammed reward function, for example a CNN that is trained to map the number of paperclips in images to a scalar reward.
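A sketch of the distinction (my own toy illustration; the "image" is just a list of labels standing in for the CNN's input): the maximizer's goal is whatever its preprogrammed reward function rewards, and the sentence "make paperclips" never enters into it.

```python
def count_paperclips(image):
    # Stand-in for the trained CNN mentioned above: maps an observation
    # to a scalar reward. Here the "image" is a list of object labels,
    # so counting is trivial; a real network would only approximate this.
    return float(sum(1 for obj in image if obj == "paperclip"))

# The agent optimizes this function directly. At no point does it parse
# the sentence "make paperclips" or consult the factory owners' intent.
camera_frame = ["paperclip", "desk", "paperclip", "stapler"]
reward = count_paperclips(camera_frame)
print(reward)  # 2.0
```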

Comment by raelwayscot on Torture vs. Dust Specks · 2016-01-06T12:49:18.454Z · score: 0 (0 votes) · LW · GW

I think the problem here is the way the utility function is chosen. Utilitarianism is essentially a formalization of the reward signals in our heads. It is a heuristic way of quantifying what we expect a healthy human (one that can grow up and survive in a typical human environment and has an accurate model of reality) to want. All of this only converges roughly to a common utility because we have evolved to have the same needs, which are necessarily pro-life and pro-social (since otherwise our species wouldn't be present today).

Utilitarianism crudely abstracts from the meanings in our heads that we recognize as common goals and assigns numbers to them. We have to be careful about what we assign numbers to in order to get results that we want in all corner cases. I think hooking up the utility meter to neurons that detect minor inconveniences is not a smart way of achieving what we collectively want, because it might contradict our pro-life and pro-social needs. Only when the inconveniences accumulate individually, so that they condense into states of fear/anxiety or noticeably shorten human life, do they affect human goals, and only then does it make sense to include them in utility considerations (which, again, are only a crude approximation of what we have evolved to want).

Comment by raelwayscot on Open Thread, January 4-10, 2016 · 2016-01-06T11:17:19.900Z · score: 5 (6 votes) · LW · GW

Why does E. Yudkowsky voice such strong priors, e.g. wrt. the laws of physics (the many-worlds interpretation), when much weaker priors seem sufficient for most of his beliefs (e.g. weak computationalism/computational monism) and wouldn't make him so vulnerable? (By vulnerable I mean that his work often gets ripped apart as cultish pseudoscience.)

Comment by raelwayscot on Rationality Reading Group: Part Q: Joy in the Merely Real · 2016-01-03T18:46:40.115Z · score: 1 (1 votes) · LW · GW

I would love to see some hard data on the correlation between public interest in science and its degree of 'cult status' vs. 'open science'.

Comment by raelwayscot on Stupid Questions, 2nd half of December · 2015-12-27T16:17:46.065Z · score: 0 (0 votes) · LW · GW

I mean "only a meme" in the sense that morality is not absolute but an individual choice. Of course, there can be arguments for why some memes are better than others; that happens when individuals convince each other of their preferences.

Comment by raelwayscot on Stupid Questions, 2nd half of December · 2015-12-27T15:58:06.121Z · score: 0 (0 votes) · LW · GW

Is it? I think the act of convincing other people of your preferred state of the world is exactly what justifying morality is. But that action policy is only a meme, as you said, which is individually chosen based on many criteria (including aesthetics, peer pressure, and consistency).

Comment by raelwayscot on Stupid Questions, 2nd half of December · 2015-12-27T10:52:51.517Z · score: 0 (0 votes) · LW · GW

Moral philosophy is a huge topic, and its discourse is not dominated by looking at DNA.

Everyone can then choose their preferred state, at least to the extent that it is not indoctrinated or biologically determined. It is rational to invest energy into maintaining or achieving this state (because the state presumably provides you with a steady source of reward), which might involve convincing others of your preferred state or preventing them from threatening it (e.g. by putting them in jail). There is likely an absolute truth (to the extent that physics is consistent from our point of view), but no absolute morality (because it's all memes in an undirected process). Terrorists do nothing wrong from their point of view, but from mine they threaten my preferred state, so I will try to prevent terrorism. We may seem lucky that many preferred states converge to the same, fairly sustainable goals, but that is just an evolutionary necessity, and perhaps mostly a result of empathy and the will to survive (otherwise our species wouldn't have survived in paleolithic groups of hunters and gatherers).

Comment by raelwayscot on Stupid Questions, 2nd half of December · 2015-12-26T23:44:15.748Z · score: 0 (0 votes) · LW · GW

What are the implications of that for how we decide what the right things to do are?

Comment by raelwayscot on Stupid Questions, 2nd half of December · 2015-12-26T23:18:45.348Z · score: 0 (0 votes) · LW · GW

Because then it would argue from features that are built into us. If we can prove the existence of these features with high certainty, then it could perhaps serve as guidance for our decisions.

On the other hand, it is reasonable that evolution does not create such goals because it is an undirected process. Our actions are unrestricted in this regard, and we must only bear the consequences of the system that our species has come up with. What is good is thus decided by consensus. Still, the values we have converged to are shaped by the way we have evolved to behave (e.g. empathy and pain avoidance).

Comment by raelwayscot on Stupid Questions, 2nd half of December · 2015-12-26T18:24:52.876Z · score: 0 (0 votes) · LW · GW

Rather, why is doing it desirable at all? Is it a matter of the culture that currently exists? I mean, is it 'right' to eradicate a certain ethnic group if the majority endorses it?

Comment by raelwayscot on Stupid Questions, 2nd half of December · 2015-12-26T15:40:49.319Z · score: 0 (0 votes) · LW · GW

What is the motivation behind maximizing QALYs? Does it require certain incentives to be present in the culture (endorsement of altruism), or is it rooted elsewhere?

Comment by raelwayscot on Stupid Questions, 2nd half of December · 2015-12-26T01:01:31.962Z · score: 0 (0 votes) · LW · GW

I mean a moral terminal goal. But I guess we would be a large step closer to a solution of the control problem if we could specify such a goal.

What I had in mind is something like this: evolution has provided us with a state that is preferred by everyone who is healthy (i.e. who can, with high probability, survive in the typical situations in which humans have evolved) and who has an accurate mental representation of reality. That state includes being surrounded by other healthy humans, so by induction everyone must reach this state (and also help others to reach it). I haven't carefully thought this through, but I just want to give an idea of what I'm looking for.

Comment by raelwayscot on Stupid Questions, 2nd half of December · 2015-12-25T22:17:55.901Z · score: 1 (1 votes) · LW · GW

Is there a biological basis explaining why utilitarianism and the preservation of our species should motivate our actions? Or is it a purely selfish consideration: I feel well when others in my social environment feel well (and am therefore even dependent on consensus)?

Comment by raelwayscot on Open thread, Dec. 21 - Dec. 27, 2015 · 2015-12-22T15:41:57.996Z · score: 0 (0 votes) · LW · GW

Is that actually the 'strange loop' that Hofstadter writes about?

Comment by raelwayscot on Open thread, Dec. 14 - Dec. 20, 2015 · 2015-12-14T12:42:09.971Z · score: 5 (5 votes) · LW · GW

Here they found dopamine to encode some superposed error signals about actual and counterfactual reward:

http://www.pnas.org/content/early/2015/11/18/1513619112.abstract

Could that be related to priors and likelihoods?

Significance

There is an abundance of circumstantial evidence (primarily work in nonhuman animal models) suggesting that dopamine transients serve as experience-dependent learning signals. This report establishes, to our knowledge, the first direct demonstration that subsecond fluctuations in dopamine concentration in the human striatum combine two distinct prediction error signals: (i) an experience-dependent reward prediction error term and (ii) a counterfactual prediction error term. These data are surprising because there is no prior evidence that fluctuations in dopamine should superpose actual and counterfactual information in humans. The observed compositional encoding of “actual” and “possible” is consistent with how one should “feel” and may be one example of how the human brain translates computations over experience to embodied states of subjective feeling.

Abstract

In the mammalian brain, dopamine is a critical neuromodulator whose actions underlie learning, decision-making, and behavioral control. Degeneration of dopamine neurons causes Parkinson’s disease, whereas dysregulation of dopamine signaling is believed to contribute to psychiatric conditions such as schizophrenia, addiction, and depression. Experiments in animal models suggest the hypothesis that dopamine release in human striatum encodes reward prediction errors (RPEs) (the difference between actual and expected outcomes) during ongoing decision-making. Blood oxygen level-dependent (BOLD) imaging experiments in humans support the idea that RPEs are tracked in the striatum; however, BOLD measurements cannot be used to infer the action of any one specific neurotransmitter. We monitored dopamine levels with subsecond temporal resolution in humans (n = 17) with Parkinson’s disease while they executed a sequential decision-making task. Participants placed bets and experienced monetary gains or losses. Dopamine fluctuations in the striatum fail to encode RPEs, as anticipated by a large body of work in model organisms. Instead, subsecond dopamine fluctuations encode an integration of RPEs with counterfactual prediction errors, the latter defined by how much better or worse the experienced outcome could have been. How dopamine fluctuations combine the actual and counterfactual is unknown. One possibility is that this process is the normal behavior of reward processing dopamine neurons, which previously had not been tested by experiments in animal models. Alternatively, this superposition of error terms may result from an additional yet-to-be-identified subclass of dopamine neurons.
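The two error terms defined in the abstract can be written out as a toy calculation (the numbers and the simple additive combination are my own illustration; the paper leaves the exact form of the superposition open):

```python
def reward_prediction_error(actual, expected):
    # RPE as defined above: actual minus expected outcome.
    return actual - expected

def counterfactual_prediction_error(actual, counterfactual):
    # How much better or worse the experienced outcome was, compared
    # with what an alternative bet would have paid.
    return actual - counterfactual

# Hypothetical bet: expected +5, won +10, but the alternative bet would
# have paid +30.
rpe = reward_prediction_error(10, 5)           # +5: better than expected
cpe = counterfactual_prediction_error(10, 30)  # -20: worse than possible
combined = rpe + cpe  # one conceivable superposition of the two signals
print(rpe, cpe, combined)  # 5 -20 -15
```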

Comment by raelwayscot on Starting University Advice Repository · 2015-12-06T12:05:36.188Z · score: 3 (3 votes) · LW · GW

Some helpful links I've collected over the years:

If you do something related to computer science:

  • https://news.ycombinator.com/item?id=8085148 (work on some side projects, for example program an economy simulator, invent a simple layout/markup language, implement a LISP-machine in C)
  • Get familiar with the UNIX command line, learn Vim and use Spacemacs as your editor. Use org-mode for notes and git/magit for version control of all your projects and notes. Make use of a cloud service to keep all your files accessible from all your devices.

As someone who developed RSI during their studies: if you feel that you don't have enough time to exercise, ignore that voice in your head and get a minimum of 7 minutes of intense workout and 1 hour of very light exercise (e.g. walking, cycling or swimming) each day (and consider two longer intense sessions per week, e.g. 60-minute swimming lessons). After each hour of sitting, take a 5-minute break to stretch or drink a tea (Awareness.app for OS X is a nice software solution that can help you with that). A bad physical condition will affect your mood and mental performance negatively.

Comment by raelwayscot on Stupid Questions November 2015 · 2015-11-21T09:08:27.460Z · score: 0 (0 votes) · LW · GW

Do Bayesianists strongly believe that Bayes' theorem accurately describes how the brain changes its latent variables in the face of new data? It seems very unlikely to me that the brain keeps track of probability distributions and that they sum to one. How do Bayesianists believe this works at the neuronal level?
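For reference, the bookkeeping in question is small for a discrete hypothesis space; the hard part is imagining neurons doing the normalization step. A minimal sketch with hypothetical numbers:

```python
def bayes_update(prior, likelihood):
    # Exact Bayes update over a discrete hypothesis space. The division
    # by z is the normalization step that keeps the posterior summing
    # to one, which is exactly the bookkeeping the question is about.
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnormalized.values())
    return {h: p / z for h, p in unnormalized.items()}

# Two hypotheses and a made-up likelihood for a single observation.
prior = {"H1": 0.5, "H2": 0.5}
likelihood = {"H1": 0.8, "H2": 0.2}
posterior = bayes_update(prior, likelihood)
print(posterior)
```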

Comment by raelwayscot on Open thread, Nov. 16 - Nov. 22, 2015 · 2015-11-16T22:22:17.839Z · score: 0 (0 votes) · LW · GW

Ok, so the motivation is to learn templates to do correlation with at each image location. But where would you get the idea to do the same with the correlation map again? That seems non-obvious to me. Or do you mean biological vision?

Comment by raelwayscot on Open thread, Nov. 16 - Nov. 22, 2015 · 2015-11-16T18:00:04.711Z · score: 1 (1 votes) · LW · GW

I find CNNs a lot less intuitive than RNNs. In which context was it an intuitive idea to train many filters and then successively apply pooling and again filters to smaller versions of the output?
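The recipe being asked about can be written down in a few lines (a 1-D toy of my own, not any particular network): correlate a learned template with the input, keep only the strongest response per window, and repeat the same two steps on the smaller output.

```python
def conv1d(signal, kernel):
    # Valid 1-D correlation: slide the learned template over the input.
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool(signal, size=2):
    # Keep only the strongest response in each window: whether the
    # pattern occurred matters more than exactly where.
    return [max(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]

# A step signal and an edge-detecting template; stacking conv1d/max_pool
# again on `pooled` is exactly the filters-then-pooling-then-filters
# pattern the question refers to.
signal = [0, 0, 1, 1, 1, 0]
edges = conv1d(signal, [-1, 1])  # responds at the up-step and down-step
pooled = max_pool(edges)
print(edges, pooled)  # [0, 1, 0, 0, -1] [1, 0]
```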

Comment by raelwayscot on Life Advice Repository · 2015-10-19T06:58:07.854Z · score: 1 (1 votes) · LW · GW

Could one say that the human brain works best if it is slightly optimistically biased: just enough to get the benefits of the neuromodulation that accompanies positive thinking, but not so much that false expectations have a significant potential to severely disappoint you? Are there some recommended sequences/articles/papers on this matter?

Comment by raelwayscot on Simulations Map: what is the most probable type of the simulation in which we live? · 2015-10-11T19:26:45.833Z · score: 0 (2 votes) · LW · GW

Perhaps the conditions that cause the Fermi paradox are actually crucial for life. If spaceflight were easy, all resources would be exhausted by exponential growth pretty quickly. This would invalidate the 'big distances' point as evidence for a non-streamlined universe, though.

Comment by raelwayscot on Simulations Map: what is the most probable type of the simulation in which we live? · 2015-10-11T09:50:48.798Z · score: 5 (5 votes) · LW · GW

If we are in a simulation, why isn’t the simulation more streamlined? I have a couple of examples for that:

  • Classical physics and basic chemistry would likely be sufficient for life to exist.
  • There are seven uninhabitable planets in our solar system.
  • 99.9…% of everything performs extremely boring computations (dirt, large bodies of fluids and gas etc.).
  • The universe is extremely hostile towards intelligent life (GRBs, supernovae, scarcity of resources, large distances between celestial bodies).

It seems that our simulation hosts would need to have access to vast or unlimited resources. (In that case it would be interesting to consider whether life is sustainable in a world with unlimited resources at all. Perhaps scarcity is somehow required for ethical behavior to develop; malice would perhaps spread too easily.)

I’m a big fan of these infographics by the way.

Comment by raelwayscot on October 2015 Media Thread · 2015-10-02T01:01:04.178Z · score: 2 (2 votes) · LW · GW

Quite good Omega Tau interview on failure modes of mega projects: http://omegataupodcast.net/2015/09/181-why-megaprojects-fail-and-what-to-do-about-it/

Comment by raelwayscot on Open thread, Sep. 28 - Oct. 4, 2015 · 2015-10-01T17:51:11.908Z · score: 2 (2 votes) · LW · GW

Happy Longevity Day!

Comment by raelwayscot on The Library of Scott Alexandria · 2015-09-15T20:18:30.888Z · score: 1 (1 votes) · LW · GW

I would say be flexible as some topics are much more complex than others. I've found that most summaries on this list have a good length.

Comment by raelwayscot on Reading group for Calculus by Spivak · 2015-09-09T18:25:02.487Z · score: 1 (1 votes) · LW · GW

Perhaps you can revive one of these study groups: https://www.reddit.com/subreddits/search?q=spivak

Cross-posting to all of them might reach some people who are interested.

This Baby Rudin group is currently active: https://www.reddit.com/r/babyrudin/