Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2006-11-22T20:00:00.000Z · comments (48)
Interesting list. There was also Frank Ramsey.
Who does Oxford have... Julian Huxley?
xodarap on Prizes for ELK proposals
I've been trying to understand this paragraph:
That is, it looks plausible (though still <50%) that we could improve these regularizers enough that a typical "bad" reporter was a learned optimizer which used knowledge of direct translation, together with other tricks and strategies, in order to quickly answer questions. For example, this is the structure of the counterexample discussed in Section: upstream. This is still a problem because e.g. the other heuristics would often "misfire" and lead to bad answers, but it is a promising starting point because in some sense it has forced some optimization process to figure out how to do direct translation.
This comment is half me summarizing my interpretation of it to help others, and half an implicit question for the ARC team about whether my interpretation is correct.
Corrections and feedback on this extremely welcome!
mitchell_porter on Why rationalists should care (more) about free software
they would then only need a slight preponderance of virtue over vice
This assumes that morality has only one axis, which I find highly unlikely.
This is a good and important point. A more realistic discussion of aligning with an idealized human agent might consider personality traits, cognitive abilities, and intrinsic values as among the properties of the individual agent that are worth optimizing, and that's clearly a multidimensional situation in which the changes can interact, even in confounding ways.
So perhaps I can make my point more neutrally as follows. There is both variety and uniformity among human beings, regarding properties like personality, cognition, and values. A process like alignment, in which the corresponding properties of an AI are determined by the properties of the human being(s) with which it is aligned, might increase this variety in some ways, or decrease it in others. Then, among the possible outcomes, only certain ones are satisfactory, e.g. an AI that will be safe for humanity even if it becomes all-powerful.
The question is, how selective must one be in choosing whom to align the AI with? In his original discussions of this topic, back in the 2000s, Eliezer argued that this is not an important issue compared to identifying an alignment process that works at all. He gave Al Qaeda terrorists as a contemporary example: with a good enough alignment process, you could start with them as the human prototype and still get a friendly AI, because they have all the basic human traits, and for a good enough alignment process that should be enough to reach a satisfactory outcome. On the other hand, with a bad alignment process, you could start with the best people we have and still get an unfriendly AI.
One might therefore wish to only share code for the ethical part of the AI
This assumes you can discern the ethical part and that the ethical part is separate from the intelligent part.
Well, again we face the fact that different software architectures and development methodologies will lead to different situations. Earlier, it was that some alignment methodologies will be more sensitive to initial conditions than others. Here it's the separability of intelligence and ethics, or problem-solving ability and problems that are selected to be solved. There are definitely some AI designs where the latter can be cleanly separated, such as an expected-utility maximizer with arbitrary utility function. On the other hand, it looks very hard to pull apart these two things in a language model like GPT-3.
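To illustrate the cleanly separable case, here is a minimal sketch (an illustration only, not any real system's code; all names are hypothetical) of an expected-utility maximizer whose search machinery is independent of whichever utility function is plugged into it:

```python
# Minimal sketch: the problem-solving part (search over actions) is generic,
# while the "ethical part" lives entirely in the swappable `utility` argument.
# Hypothetical illustration, not any deployed system's design.
from typing import Callable, Iterable, TypeVar

Action = TypeVar("Action")

def best_action(actions: Iterable[Action],
                outcomes: Callable[[Action], list[tuple[float, str]]],
                utility: Callable[[str], float]) -> Action:
    """Pick the action with the highest expected utility.
    The search code never changes; only `utility` encodes what is valued."""
    def expected_utility(action: Action) -> float:
        return sum(p * utility(outcome) for p, outcome in outcomes(action))
    return max(actions, key=expected_utility)
```

Nothing analogous is exposed in a language model like GPT-3, which is why the two are so hard to pull apart there.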
My notion was that the risk of sharing code is greatest for algorithms that are powerful general problem solvers with no internal inhibitions about which problems they solve; and that the code most worth sharing is "ethical code" that protects against bad runaway outcomes by acting as an ethical filter.
But even more than that, I would emphasize that the most important thing is just to solve the problem of alignment in the most important case, namely autonomous superhuman AI. So long as that isn't figured out, we're gambling on our future in a big way.
measure on TurnTrout's shortform feed
You also have to assume that the AI knows everything you know, which might not be true if it's boxed.
ryan_b on What's Up With Confusingly Pervasive Consequentialism?
Let's assume for a moment that consequentialism in Eliezer's sense is the most pervasive thing in the problem space (this is not a claim anyone has made, as far as I can tell). What does leaning into consequentialism super hard look like in terms of approaches? The only line of attack I know of which seems to meet the description is the convergent power-seeking sequence.
ryan-beck on Prizes for ELK proposals
I was notified I didn't win a prize so figured I'd discuss what I proposed here in case it sparks any other ideas. The short version is I proposed adding on a new head that would be an intentional human simulator. During training it would be penalized for telling the truth that the diamond was gone when there existed a lie that the humans would have believed instead. The result would hopefully be a head that acted like a human simulator. Then the actual reporter would be trained so that it would be penalized for using a similar amount of compute as the intentional human simulator, or looking at similar nodes or node regions as the intentional human simulator. The hope is that by penalizing the reporter for acting like the intentional human simulator, it would be more likely to do direct translation instead of human simulation.
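A minimal sketch of the second-stage penalty described above (illustrative only; `HeadStats` and its fields are hypothetical stand-ins for however compute use and node access would actually be measured):

```python
# Toy sketch of the reporter's training loss: standard answer loss plus
# penalties for resembling the intentional human simulator in compute use
# and in which predictor nodes it reads. Not ARC's code; names are made up.
from dataclasses import dataclass
import torch
import torch.nn.functional as F

@dataclass
class HeadStats:
    flops: float          # estimated compute this head used on the input
    nodes: frozenset      # predictor nodes/regions this head read from

def reporter_loss(answer_logits: torch.Tensor,
                  label: torch.Tensor,
                  reporter: HeadStats,
                  simulator: HeadStats,
                  alpha: float = 1.0,
                  beta: float = 1.0) -> torch.Tensor:
    base = F.cross_entropy(answer_logits, label)

    # Penalty 1: using a similar amount of compute to the human simulator.
    compute_similarity = 1.0 / (1.0 + abs(reporter.flops - simulator.flops))

    # Penalty 2: overlapping with the nodes the human simulator looks at.
    overlap = len(reporter.nodes & simulator.nodes)
    node_similarity = overlap / max(len(reporter.nodes), 1)

    return base + alpha * compute_similarity + beta * node_similarity
```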
This does have at least one counterexample that I proposed as well, which is that the reporter could simply waste compute doing nothing to avoid matching the intentional human simulator, and could look at additional random nodes it doesn't care about to avoid looking like it was looking at the same nodes as the intentional human simulator. Though I thought there was some possibility that having to do these things might end up incentivizing the reporter to act like a direct translator instead of a human simulator.
Although I'm not sure why this wasn't very promising, my guess is that the counterexample is too obvious and my proposal doesn't gain much ground in keeping the reporter from acting like a human simulator; or someone else has already thought of this approach; or perhaps my counterexample is too similar to the counterexample to "penalize reporters that work with many different predictors", where the reporter could just pretend to not work with other predictors (it's similar in that the reporter could pretend not to look like the intentional human simulator).
Here's my full submission in Google Docs with more description: https://docs.google.com/document/d/1Xa4CDLNJ-VPT7hqEUIHlqCsXVeFCYDB5h7Vn3QJ_qpA/edit?usp=sharing
davidad on davidad's Shortform
Out of curiosity, this morning I did a literature search about "hard-coded optimization" in the gradient-based learning space—that is, people deliberately setting up "inner" optimizers in their neural networks because it seems like a good way to solve tasks. (To clarify, I don't mean deliberately trying to make a general-purpose architecture learn an optimization algorithm, but rather, baking an optimization algorithm into an architecture and letting the architecture learn what to do with it.)
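For concreteness, here is a toy sketch of the pattern I mean (my own illustration, not taken from any of the papers below): the forward pass runs a few hard-coded gradient-descent steps on a learned inner objective, and outer training shapes what that inner objective is.

```python
# Toy "hard-coded inner optimizer" layer: the optimization algorithm
# (unrolled gradient descent) is baked into the architecture, while the
# objective it optimizes is learned. Hypothetical sketch, not from a paper.
import torch
import torch.nn as nn

class InnerOptimizerLayer(nn.Module):
    def __init__(self, dim: int, inner_steps: int = 5, inner_lr: float = 0.1):
        super().__init__()
        # Learned inner objective E(z, x); outer training decides what the
        # hard-coded inner optimizer gets pointed at.
        self.energy = nn.Sequential(
            nn.Linear(2 * dim, 64), nn.Tanh(), nn.Linear(64, 1))
        self.inner_steps = inner_steps
        self.inner_lr = inner_lr

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Inner variable the hard-coded optimizer adjusts at inference time.
        z = torch.zeros_like(x, requires_grad=True)
        for _ in range(self.inner_steps):
            e = self.energy(torch.cat([z, x], dim=-1)).sum()
            # One gradient-descent step on the learned objective;
            # create_graph=True lets outer training differentiate through it.
            (grad,) = torch.autograd.grad(e, z, create_graph=True)
            z = z - self.inner_lr * grad
        return z

# Usage: outer training of `energy` is ordinary backprop through the
# unrolled inner loop, e.g. loss(InnerOptimizerLayer(8)(x), target).backward()
```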
Why is this interesting?
Anyway, here's (some of) what I found:
Here are some other potentially relevant papers I haven't processed yet:
Don't know a ton about this but here are a few thoughts:
- Overall, I think distributed compute is probably not good for training or inference, but might be useful for data engineering or other support functions.
- Folding@home crowdsources compute for expanding Markov state models of possible protein folding paths. Afaik, this doesn't require backpropagation or any similar latency-sensitive updating method. The crowdsourced computers just play out a bunch of scenarios, which are then aggregated and pruned offline. Interesting paths are used to generate new workloads for future rounds of crowdsourcing (a toy sketch of this loop appears after this list).
This is an important disanalogy to deep RL models, and I suspect this is why F@H doesn't suffer from the issues Lennart mentioned (latency, data bandwidth, etc.)
This approach can work for some of the applications that people use big models for - e.g. Rosetta@home does roughly the same thing as AlphaFold, but it's worse at it. (Afaik AlphaFold can't do what F@H does - different problem.)
- F@H and DL both benefit from GPUs because matrix multiplication is pretty general. If future AI systems train on more specialized hardware, it may become too hard to crowdsource useful levels of compute.
- Inference needs less aggregate compute, but often requires very low latency, which probably makes it a bad candidate for distribution.
- IMO crowdsourced compute is still interesting even if it's no good for large model training/inference. It's really good at what it does (see F@H, Rosetta@home, cryptomining collectives, etc.), and MDPs are highly general even with long latencies/aggregation challenges.
Maybe clever engineers could find ways to use it for e.g. A/B testing fine-tunings of large models, or exploiting unique datasources/compute environments (self-driving cars, drones, satellites, IoT niches, phones).
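As referenced above, a toy, self-contained sketch of the Folding@home-style simulate/aggregate/reseed loop (my own simplification, not the actual protocol; states here are just integers on a line, and all names are illustrative):

```python
# Toy sketch of a crowdsourced simulate -> aggregate -> reseed campaign.
# The key property: each work unit is independent, so there is no
# latency-sensitive synchronization between volunteers.
import random
from collections import Counter

def simulate_trajectory(start: int, steps: int) -> list[int]:
    """One volunteer's work unit: a cheap, independent random walk."""
    state, path = start, [start]
    for _ in range(steps):
        state += random.choice([-1, 1])
        path.append(state)
    return path

def crowdsourced_round(seeds: list[int], volunteers: int = 100, steps: int = 50):
    # Embarrassingly parallel: volunteers never talk to each other.
    return [simulate_trajectory(random.choice(seeds), steps) for _ in range(volunteers)]

def run_campaign(rounds: int = 3) -> Counter:
    seeds, visits = [0], Counter()
    for _ in range(rounds):
        trajectories = crowdsourced_round(seeds)
        # Aggregation and pruning happen centrally, offline - the disanalogy
        # with gradient-based training, which needs frequent parameter sync.
        for path in trajectories:
            visits.update(path)
        # Reseed the next round from rarely-visited ("interesting") states.
        seeds = [state for state, _ in visits.most_common()[-10:]]
    return visits

if __name__ == "__main__":
    print(run_campaign().most_common(5))
```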
"VIX futures in contango" is roughly same problem you mentioned about e.g. VIXY going down 90% "over a market cycle". My claim is that you'll lose less to the contango than to the drift in VIXY. If I understand correctly, (I don't have easy access to the data right now) the VIX ETFs actually hold (dynamic) baskets of options, which means they lose a lot of money to slippage / transactions costs as they trade in and out of those positions.tornus on What are the counterarguments to a Faustian Vaccine Hypothesis? ($2k charity bounty)
Thank you for the reminder to explain and not scold—I shall strive to do so.