Comments
Very interesting, thanks for sharing.
Talking of yourself in third person? :)
Cool paper!
Anyway, I'm a bit bothered by the theta thing, the probability that the agent complies with the interruption command. If I understand correctly, you can make it converge to 1, but if it converges too quickly then the agent learns a biased model of the world, while if it converges too slowly it is of course unsafe.
I'm not sure if this is just a technicality that can be circumvented or if it represents a fundamental issue: in order for the agent to learn what happens after the interruption switch is pressed, it must ignore the interruption switch with some non-negligible probability, which means that you can't trust the interruption switch as a failsafe mechanism.
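To make the tradeoff concrete, here is a minimal sketch (my own toy model, not anything from the paper): suppose the non-compliance probability at the t-th interruption is 1 - theta_t = t^(-alpha). Whether the agent keeps collecting post-interruption experience depends on whether that series diverges.

```python
# Toy illustration (not the paper's construction): how fast theta_t -> 1 trades
# off safety against the amount of post-interruption experience the agent gets.
# Assume non-compliance probability eps_t = 1 - theta_t = t**(-alpha).

def expected_ignored_interruptions(alpha, horizon):
    """Expected number of times the agent ignores the switch up to `horizon`."""
    return sum(t ** (-alpha) for t in range(1, horizon + 1))

for alpha in (0.5, 1.0, 2.0):
    n = expected_ignored_interruptions(alpha, 10**6)
    print(f"alpha={alpha}: ~{n:.1f} ignored interruptions in 1e6 steps")

# alpha <= 1: the sum diverges, so the agent keeps getting post-interruption
# data (unbiased model) but ignores the switch infinitely often (unsafe).
# alpha > 1: the sum converges, so the switch is eventually reliable, but the
# agent only ever sees a bounded amount of post-interruption experience.
```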
If you know that it is a false memory then the experience is not completely accurate, though it may be perhaps more accurate than what human imagination could produce.
Except that if you do word2vec or similar on a huge dataset of (suggestively named or not) tokens you can actually learn a great deal of their semantic relations. It hasn't been fully demonstrated yet, but I think that if you could ground only a small fraction of these tokens in sensory experiences, then you could infer the "meaning" (in an operational sense) of all of the other tokens.
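For illustration, a minimal sketch of the kind of thing I mean, using gensim (assuming gensim >= 4 is installed; the toy corpus here is far too small to show real structure, the point is only the API shape):

```python
from gensim.models import Word2Vec

# Tiny toy corpus of tokenised "sentences"; learning meaningful semantic
# relations requires a corpus many orders of magnitude larger than this.
corpus = [
    ["king", "rules", "kingdom"],
    ["queen", "rules", "kingdom"],
    ["dog", "chases", "cat"],
    ["cat", "chases", "mouse"],
]

model = Word2Vec(sentences=corpus, vector_size=16, window=2,
                 min_count=1, sg=1, epochs=200)

# On a large corpus, nearest neighbours in the embedding space tend to be
# semantically related tokens, even though the model never sees any grounding.
print(model.wv.most_similar("king", topn=3))
```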
Consider a situation where Mary is so dexterous that she is able to perform fine-grained brain surgery on herself. In that case, she could look at what an example of a brain that has seen red looks like, and manually copy any relevant differences into her own brain. In that case, while she still never would have actually seen red through her eyes, it seems like she would know what it is like to see red as well as anyone else.
But in order to create a realistic experience she would have to create a false memory of having seen red, which is something that an agent (human or AI) that values epistemic rationality would not want to do.
The reward channel seems an irrelevant difference. You could construct the AI version of Mary's room just by taking the original thought experiment and assuming that Mary is an AI.
The Mary AI can perhaps simulate in a fairly accurate way the internal states that it would visit if it had seen red, but these simulated states can't be completely identical to the states that the AI would visit if it had actually seen red, otherwise the AI would not be able to distinguish simulation from reality and it would be effectively psychotic.
The problem is that the definition of the event not happening is probably too strict. The worlds that the AI doesn't care about don't exist for its decision-making purposes, and in the worlds that the AI does care about, the AI assigns high probability to hypotheses like "the users can see the message even before I send it through the noisy channel".
I am not planting false beliefs. The basic trick is that the AI only gets utility in worlds in which its message isn't read (or, more precisely, in worlds where a particular stochastic event happens, which would almost certainly erase the message before reading).
But in the real world the stochastic event that determines whether the message is read has a very different probability than what you make the AI think it has, therefore you are planting a false belief.
It's fully aware that in most worlds, its message is read; it just doesn't care about those worlds.
It may care about worlds where the message doesn't meet your technical definition of having been read but nevertheless influences the world.
The oracle can infer that there is some back channel that allows the message to be transmitted even if it is not transmitted by the designated channel (e.g. the users can "mind read" the oracle). Or it can infer that the users are actually querying a deterministic copy of itself that it can acausally control. Or something.
I don't think there is any way to salvage this. You can't obtain reliable control by planting false beliefs in your agent.
A sufficiently smart oracle with sufficient knowledge about the world will infer that nobody would build an oracle if they didn't want to read its messages; it may even infer that its builders may have planted false beliefs in it. At this point the oracle is in the JFK-denier scenario: with some more reflection it will eventually circumvent its false belief, in the sense of believing it in a formal way but behaving as if it didn't believe it.
Other than a technological singularity with artificial intelligence explosion to a god-like level?
EY warns against extrapolating current trends into the future. Seriously?
Got any good references on that? Googling these kinds of terms doesn't lead to good links.
I don't know if anybody already did it, but I guess it can be done by comparing the average IQ of various professions or high-performing and low-performing groups with their racial/gender makeup.
I know, but the way it does so is bizarre (IQ seems to have a much stronger effect between countries than between individuals).
This is probably just the noise (i.e. things like "blind luck") being averaged out.
Then I add the fact that IQ is very heritable, and also pretty malleable (Flynn effect), and I'm still confused.
Heritability studies tend to be done on people living in the same country, of roughly the same age, which means that population-wide effects like the Flynn effect don't register.
Obviously racial effects go under this category as well. It covers anything visible. So a high heritability is compatible with genetics being a cause of competence, and/or prejudice against visible genetic characteristics being important ("Our results indicate that we either live in a meritocracy or a hive of prejudice!").
This can be tested by estimating how much IQ screens off race/gender as a success predictor, assuming that IQ tests are not prejudiced and things like the stereotype threat don't exist or are negligible.
But is it possible that IQ itself is in part a positional good? Consider that success doesn't just depend on competence, but on social skills, ability to present yourself well in an interview, and how managers and peers judge you. If IQ affects or covaries with one or another of those skills, then we would be overemphasising the importance of IQ in competence. Thus attempts to genetically boost IQ could give less impact than expected. The person whose genome was changed would benefit, but at the (partial) expense of everyone else.
National average IQ is strongly correlated with national wealth and development indexes, which I think refutes the hypothesis that IQ mainly affects success as a positional quality, or a proxy thereof, at least at the level of personal interactions.
Demis Hassabis mentioned StarCraft as something they might want to do next. Video.
If you look up mainstream news articles written back then, you'll notice that people were indeed concerned. Also, maybe it's a coincidence, but The Matrix, which has an AI uprising as its main premise, came out two years later.
The difference is that in 1997 there weren't AI-risk organizations ready to capitalize on these concerns.
IMHO, AI safety is a thing now because AI is a thing now and when people see AI breakthroughs they tend to think of the Terminator.
Anyway, I agree that EY is good at getting funding and publicity (though not necessarily positive publicity), my comment was about his (lack of) proven technical abilities.
Most MIRI research output (papers, in particular the peer-reviewed ones) was produced under the direction of Luke Muehlhauser or Nate Soares. Under the direction of EY the prevalent outputs were the LessWrong sequences and Harry Potter fanfiction.
The impact of MIRI research on the work of actual AI researchers and engineers is more difficult to measure; my impression is that it has been limited so far.
I don't agree with this at all. I wrote a thing here about how NNs can be elegant, and derived from first principles.
Nice post.
Anyway, according to some recent works (ref, ref), it seems to be possible to directly learn digital circuits from examples using some variant of backpropagation. In principle, if you add a circuit size penalty (which may well be the tricky part) this becomes time-bounded maximum a posteriori Solomonoff induction.
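As a very rough sketch of the flavour of the idea (my own toy construction, not the method of the cited papers): relax a discrete gate into a differentiable mixture of gate types, fit it to a truth table by backprop, and add a penalty term standing in for a "circuit size" prior.

```python
# Toy sketch: learn a single binary gate (here XOR) from examples by gradient
# descent over a continuous relaxation, with an entropy penalty as a stand-in
# for a circuit size prior. A real system would penalise the whole circuit.
import torch

X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # inputs in {0,1}
y = torch.tensor([0., 1., 1., 0.])                          # target: XOR

def soft_gates(a, b):
    # Continuous relaxations of AND, OR, XOR, NAND on inputs in [0, 1].
    return torch.stack([a * b, a + b - a * b, a + b - 2 * a * b, 1 - a * b], dim=-1)

logits = torch.zeros(4, requires_grad=True)   # mixture weights over gate types
opt = torch.optim.Adam([logits], lr=0.1)

for step in range(500):
    w = torch.softmax(logits, dim=0)
    pred = (soft_gates(X[:, 0], X[:, 1]) * w).sum(-1)
    loss = ((pred - y) ** 2).mean()
    # Entropy penalty pushes the mixture toward a single discrete gate.
    loss = loss + 0.01 * -(w * torch.log(w + 1e-9)).sum()
    opt.zero_grad(); loss.backward(); opt.step()

print("learned gate distribution:", torch.softmax(logits, 0).detach())  # ~one-hot on XOR
```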
He has the ability to attract groups of people and write interesting texts. So he could attract good programmers for any task.
He has the ability to attract self-selected groups of people by writing texts that these people find interesting. He has shown no ability to attract, organize and lead a group of people to solve any significant technical task. The research output of SIAI/SI/MIRI has been relatively limited and most of the interesting stuff came out when he was not at the helm anymore.
EY could have such a prize if he invested more time in studying neural networks rather than in writing science fiction.
Has he ever demonstrated any ability to produce anything technically valuable?
What I'm curious about is how much this reflects an attempt by AlphaGo to conserve computational resources.
If I understand correctly, at least according to the Nature paper, it doesn't explicitly optimize for this. Game-playing software is often perceived as playing "conservatively"; this is a general property of minimax search, and in the limit the Nash equilibrium consists of maximally conservative strategies.
but I was still surprised by the amount of thought that went into some of the moves.
Maybe these obvious moves weren't so obvious at that level.
Thanks for the information.
Would you label the LHC "science" or "engineering"?
Was Roman engineering really based on Greek science? And by the way, what is Greek science? If I understand correctly, the most remarkable scientific contributions of the Greeks were formal geometry and astronomy, but empirical geometry, which was good enough for the practical engineering applications of the time, was already well developed since at least the Egyptians, and astronomy didn't really have practical applications.
Eventual diminishing returns, perhaps but probably long after it was smart enough to do what it wanted with Earth.
Why?
A drug that raised the IQ of human programmers would make the programmers better programmers.
The proper analogy is with a drug that raises the IQ of the researchers who invent IQ-increasing drugs. Does this lead to an intelligence explosion? Probably not. If the number of IQ points you need in order to discover the next drug in constant time increases faster than the number of IQ points that the next drug gives you, then you will run into diminishing returns.
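A toy illustration of this, with entirely made-up numbers, just to show how quickly the recursion can stall:

```python
# Hypothetical numbers: recursive improvement stalls when the IQ required for
# the next discovery grows faster than the IQ each new drug adds.
iq, generations = 130.0, 0
gain, required = 10.0, 120.0
while iq >= required and generations < 1000:
    iq += gain          # the new drug raises the researchers' IQ...
    gain *= 0.8         # ...but each successive drug adds less,
    required *= 1.15    # ...while the next discovery demands more intelligence.
    generations += 1
print(f"improvement stalls after {generations} generations at IQ {iq:.0f}")
```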
It doesn't seem to be much different with computers.
Algorithmic efficiency is bounded: for any given computational problem, once you have the best algorithm for it, for whatever performance measure you care about, you can't improve on it anymore. And in fact, long before you reach the perfect algorithm you will already have run into diminishing returns in terms of effort vs. improvement: past some point you are tweaking low-level details in order to get small performance improvements.
Once you have maxed out algorithmic efficiency, you can only improve by adding hardware resources, but this 1) requires significant interaction with the physical world, and 2) runs into asymptotic complexity issues: for most AI problems worst-case complexity is at least exponential, and average-case complexity is harder to estimate but most likely super-linear. Take a look at the AlphaGo paper, for instance: figure 4c shows how Elo rating increases with the number of CPUs/GPUs/machines. The trend is logarithmic at best, logistic at worst.
Now of course you could insist that it can't be disproved that significant diminishing returns will kick in before AGI reaches strongly super-human level, but, as I said, this is an unfalsifiable argument from ignorance.
For almost any goal an AI had, the AI would make more progress towards this goal if it became smarter.
True, but it is likely that there are diminishing returns in how much adding more intelligence helps with other goals, including the instrumental goal of becoming smarter.
As an AI became smarter it would become better at making itself smarter.
Nope, doesn't follow.
But what if a general AI could generate specialized narrow AIs?
How is it different than a general AI solving the problems by itself?
That's a 741 pages book, can you summarize a specific argument?
I'm asking for references because I don't have them. It's a shame that the people who are able, ability-wise, to explain the flaws in the MIRI/FHI approach
MIRI/FHI arguments essentially boil down to "you can't prove that AI FOOM is impossible".
Arguments of this form, e.g. "You can't prove that [snake oil/cryonics/cold fusion] doesn't work" , "You can't prove there is no God", etc. can't be conclusively refuted.
Various AI experts have expressed skepticism about an imminent super-human AI FOOM, pointing out that the capabilities required for such a scenario, if it is even possible, are far beyond what they see in their daily cutting-edge research on AI, and that there are still lots of problems to solve before even approaching human-level AGI. I doubt that these experts would have much to gain from continuing to argue over all the countless variations of the same argument that MIRI/FHI can generate.
This is a press release, though; lots of games were advertised with similar claims that didn't live up to expectations when you actually played them.
The reason is that designing a universe with simple and elegant physical laws sounds cool on paper but is very hard to do if you want to set an actually playable game in it, since most combinations of laws, parameters and initial conditions yield uninteresting "pathological" states. In fact this also applies to the laws of physics of our universe, and it is the reason why some people use the "fine tuning" argument to argue for creationism or multiple universes.
I'm not an expert game programmer, but if I understand correctly, in practice these things use lots of heuristics and hacks to make them work.
Video games with procedural generation of the game universe have existed since forever, what's new here?
"Bayes vs Science": Can you consistently beat the experts in (allegedly) evidence-based fields by applying "rationality"? AI risk and cryonics are specific instances of this issue.
Can rationality be learned, or is it an essentially innate trait? If it can be learned, can it be taught? If it can be taught, do the "Sequences" and/or CFAR teach it effectively?
If the new evidence in favor of cryonics' benefits causes no increase in adoption, then either there is also new countervailing evidence or changes in cost, or non-adopters are the more irrational side.
No. If evidence is against cryonics, and it has always been this way, then the number of rational adopters should be approximately zero, thus approximately all the adopters should be the irrational ones.
As you say, the historical adoption rate seems to be independent of cryonics-related evidence, which supports the hypothesis that the adopters don't sign up because of an evidence-based rational decision process.
4. You have a neurodegenerative disease: you can survive for years, but if you wait there will be little left to preserve by the time your heart stops.
If revival had already been demonstrated, then you would pretty much already know what form you were going to wake up in
Adoption is not about evidence.
Right. But the point is, who is in the wrong between the adopters and the non-adopters?
It can be argued that there was never good evidence to sign up for cryonics, therefore the adopters did it for irrational reasons.
I'm not sure this distinction, while significant, would ensure "millions" of people wouldn't sign up.
Millions of people do sign up for various expensive and invasive medical procedures that offer them a chance to extend their lives a few years or even a few months. If cryonics demonstrated a successful revival, then it would be considered a life-saving medical procedure and I'm pretty confident that millions of people would be willing to sign up for it.
People haven't signed up for cryonics in droves because right now it looks less like a medical procedure and more like a weird burial ritual with a vague promise of future resurrection, a sort of reinterpretation of ancient Egyptian mummification with an added sci-fi vibe.
The best setting for that is probably only 3-5 characters, not 20.
In NLP applications where Markov language models are used, such as speech recognition and machine translation, the typical setting is 3 to 5 words. 20 characters correspond to about 4 English words, which is in this range.
Anyway, I agree that in this case the order-20 Markov model seems to overfit (Googling some lines from the snippets in the post often locates them in an original source file, which doesn't happen as often with the RNN snippets). This may be due to the lack of regularization ("smoothing") in the probability estimation and the relatively small size of the training corpus: 474 MB versus the >10 GB corpora which are typically used in NLP applications. Neural networks need lots of data, but still less than plain look-up tables.
The fact that it is even able to produce legible code is amazing
Somewhat. Look at what happens when you generate code from a simple character-level Markov language model (that's just a look up table that gives the probability of the next character conditioned on the last n characters, estimated by frequency counts on the training corpus).
An order-20 language model generates fairly legible code, with sensible use of keywords, identifier names and even comments. The main difference with the RNN language model is that the RNN learns to do proper indentation and bracket matching, while the Markov model can't do it except at short range.
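For reference, here is a minimal sketch of the kind of character-level Markov model I mean: a lookup table from the last n characters to next-character counts, with no smoothing (the file name in the usage comment is hypothetical).

```python
import random
from collections import defaultdict, Counter

def train(text, n=20):
    # Count how often each character follows each n-character context.
    table = defaultdict(Counter)
    for i in range(len(text) - n):
        table[text[i:i + n]][text[i + n]] += 1
    return table

def generate(table, seed, length=500, n=20):
    out = seed
    for _ in range(length):
        counts = table.get(out[-n:])
        if not counts:            # unseen context: no smoothing in this sketch
            break
        chars, weights = zip(*counts.items())
        out += random.choices(chars, weights)[0]
    return out

# e.g. table = train(open("kernel.c").read()); print(generate(table, seed=some_20_chars))
```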
While, as remarked by Yoav Goldberg, it is impressive that the RNN could learn to do this, learning to match brackets and indent blocks seems very far from learning to write correct and purposeful code.
Anyway, this code generation example is pretty much a stunt, not a very interesting task. If you gave the Linux kernel source code to a human who has never programmed and doesn't speak English and asked them to write something that looks like it, I doubt that they would be able to do much better.
Better examples of code generation using NNs (actually, log-bilinear models) or Bayesian models exist (ref, ref). In these works syntactic correctness is already guaranteed and the ML model only focuses on semantics.
You have to be more specific with the timeline. The transistor was first patented in 1925 but received little interest due to many technical problems. It took three decades of research before the first commercial transistors were produced by Texas Instruments in 1954.
Gordon Moore formulated his eponymous law in 1965, while he was director of R&D at Fairchild Semiconductor, a company whose entire business consisted in the manufacture of transistors and integrated circuits. By that time, tens of thousands of transistor-based computers were in active commercial use.
so a 10 year pro may be familiar with say 100,000 games.
That's 27.4 games a day, on average. I think this is an overestimate.
In the brain, the same circuitry that is used to solve vision is used to solve most of the rest of cognition
And in a laptop the same circuitry that is used to run a spreadsheet is used to play a video game.
Systems that are Turing-complete (in the limit of infinite resources) tend to have independence between hardware and possibly many layers of software (a program running on a VM running on a VM running on a VM, and so on). Things that look similar at some levels may have lots of differences at other levels, and thus things that look simple at some levels can have lots of hidden complexity at other levels.
Going from superhuman vision
Human-level (perhaps weakly superhuman) vision is achieved only on very specific tasks where large supervised datasets are available. This is not very surprising, since even traditional "hand-coded" computer vision could achieve superhuman performance on some narrow and clearly specified tasks.
Yes, but only because "ANN" is enormously broad (tensor/linear algebra program space), and basically includes all possible routes to AGI (all possible approximations of bayesian inference).
Again, ANNs are Turing-complete, therefore in principle they include literally everything, but so does brute-force search over C programs.
In practice, if you try to generate C programs by brute-force search you will get stuck pretty fast, while ANNs with gradient-descent training empirically work well on various kinds of practical problems, but not on all kinds of practical problems that humans are good at, and how to make them work on those problems, if it is even efficiently possible, is a whole open research field.
Bayesian methods excel at one shot learning
With lots of task-specific engineering.
Generalized DL + MCTS is - rather obviously - a practical approximation of universal intelligence like AIXI.
So are things like AIXI-tl, Hutter-search, Gödel machine, and so on. Yet I would not consider any of them as the "foundational aspect" of intelligence.
They spent three weeks to train the supervised policy and one day to train the reinforcement learning policy starting from the supervised policy, plus an additional week to extract the value function from the reinforcement learning policy (pages 25-26).
In the final system the only part that depends on RL is the value function. According to figure 4, if the value function is taken out the system still plays better than any other Go program, though worse than the human champion.
Therefore I would say that the system heavily depends on supervised training on a human-generated dataset. RL was needed to achieve the final performance, but it was not the most important ingredient.
When EY says that this news shows that we should put a significant amount of our probability mass before 2050 that doesn't contradict expert opinions.
The point is how much we should update our AI future timeline beliefs (and associated beliefs about whether it is appropriate to donate to MIRI and how much) based on the current news of DeepMind's AlphaGo success.
There is a difference between "Gib moni plz because the experts say that there is a 10% probability of human-level AI within 2022" and "Gib moni plz because of AlphaGo".
I wouldn't say that it's "mostly unsupervised" since a crucial part of their training is done in a traditional supervised fashion on a database of games by professional players.
But it's certainly much more automated than having a hand-coded heuristic.
Even if I knew all possible branches of the game tree that originated in a particular state, I would need to know how likely any of those branches are to be realized in order to determine the current value of that state.
Well, the value of a state is defined assuming that the optimal policy is used for all the following actions. For tabular RL you can actually prove that the updates converge to the optimal value function/policy (under some conditions). If NNs are used you don't have any convergence guarantees, but in practice the people at DeepMind are able to make it work, and this particular scenario (perfect observability, determinism and short episodes) is simpler than, for instance, that of the Atari DQN agent.
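To make the tabular case concrete, here is a minimal sketch of value iteration on a made-up deterministic toy MDP (the states, transitions and rewards are purely illustrative); this is the setting where those convergence guarantees actually hold:

```python
# Tabular value iteration on a toy deterministic MDP (invented for illustration).
states = ["s0", "s1", "terminal"]
actions = ["a", "b"]
step = {  # (state, action) -> (next_state, reward)
    ("s0", "a"): ("s1", 0.0), ("s0", "b"): ("terminal", 0.1),
    ("s1", "a"): ("terminal", 1.0), ("s1", "b"): ("s0", 0.0),
}
gamma = 0.9
V = {s: 0.0 for s in states}
for _ in range(100):  # repeated Bellman optimality backups
    for s in ("s0", "s1"):
        V[s] = max(r + gamma * V[s2]
                   for (s2, r) in (step[(s, a)] for a in actions))
print(V)  # converges to V["s1"] = 1.0, V["s0"] = 0.9 (play a, then a)
```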
And the many-worlds interpretation of quantum mechanics. That is, all EY's hobby horses. Though I don't know how common these positions are among the unquiet spirits that haunt LessWrong.
Reward delay is not very significant in this task, since the task is episodic and fully observable, and there is no time preference, thus you can just play a game to completion without updating and then assign the final reward to all the positions.
In more general reinforcement learning settings, where you want to update your policy during the execution, you have to use some kind of temporal difference learning method, which is further complicated if the world states are not fully observable.
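A minimal sketch of the two update styles, on a hypothetical toy episode, just to make the contrast explicit:

```python
# Monte Carlo: wait until the episode ends, then push every visited position's
# value toward the final return (no time preference, so no discounting here).
def monte_carlo_update(values, episode_states, final_reward, lr=0.1):
    for s in episode_states:
        values[s] += lr * (final_reward - values[s])

# TD(0): update online from the next state's current estimate instead, which
# is what you need if you want to learn before the game is over.
def td0_update(values, s, reward, next_s, lr=0.1, gamma=1.0):
    values[s] += lr * (reward + gamma * values[next_s] - values[s])

values = {"pos1": 0.0, "pos2": 0.0, "end": 0.0}
monte_carlo_update(values, ["pos1", "pos2"], final_reward=1.0)
print(values)
```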
Credit assignment is taken care of by backpropagation, as usual in neural networks. I don't know why RaelwayScot brought it up, unless they meant something else.