I recently prepared an overview lecture about research directions in AI alignment for the Moscow AI Safety Hub. I had limited time, so I did the following: I reviewed all the sites on the AI safety map, examined the 'research' sections, and attempted to classify the problems they tackle and the research paths they pursue. I encountered difficulties in this process, partly because most sites lack a brief summary of their activities and objectives (Conjecture is one of the counterexamples). I believe that the field of AI safety would greatly benefit from improved communication, and providing a brief summary of a research direction seems like low-hanging fruit.
So, is a random variable in the sense that it is drawn from a distribution of functions, and the expected value of those functions at each point is equal to . Am I understanding you correctly?
I read it as part of the Agents Foundation course, and I consider this post really effective and clarifying. It got me thinking: can this generalize to other failure modes? For example, if programmers notice that an AI spends too many resources on self-preservation and then train against such behavior, this failure mode would still arise, because self-preservation is an instrumental goal: it is a fact about the world and about the ways in which goals can be achieved in this world.
I'm not a native speaker, can someone please explain the meaning of "Hell is wasted on the evil" in simpler terms?
Thank you, that seems to be the clarification I needed. It also reminded me of a good video that touches on the subject.
Thanks for your answer, I will read the linked post.
I said in the text that I was going to try to convey the "process" in the comments, and I'll try to do it now.
all sophisticated-enough minds
I think that the recursive buck is passed to the word "enough". You need to have a stratification of the sophistication of minds, and a cutoff for when they reach an acceptable level of sophistication.
So in a universe with only bosons (so the Pauli principle doesn't apply), everything is the Same?
When I imagine a room full of photons, I see a lot of things that can be Different. For example, the coordinates of photons, wavelength, polarization, their number.
Or are you saying that the Pauli principle is sufficient, but not necessary?
If you read further, you can see how this is also passing the recursive buck.
You: "There is no clear separation between objects, I only use this to increase my utility function"
Me: "How are you deciding on where to stop dividing reality?"
You: "Well, I calculate my marginal utility from creating an additional concept and then Compare it to zer... ah, yeah, there is the recursive buck. It was even capitalized as I said it."
So yeah, while this is a desirable point to stop, this method still relies on your ability to Differentiate between the usefulness of two models, and as far as I can tell, in the end, we can only feel it.
Sebz n gval fcbg ba gur raq bs Uneel'f jnaq, n phovp zvyyvzrgre bs napube, fgergpurq bhg n guva yvar bs Genafsvtherq fcvqre-fvyx.
sebz gur puncgre 114
Or if I'd - if I'd only gone with - if, that night -
I'm guessing he is talking about the night he lost his potential phoenix.
I think that's an intentional choice by the author. Like what Harry saw was too terrible to acknowledge. Or maybe it's just to create more suspense.
Snape told him that he wanted to check if Harry resembled his father, and the test consisted of stopping bullies, so that might be the reason for Harry's guess.
safety always remains ahead
When was it ever ahead? I mean, to be sure that safety is ahead, you need to first make advancement there compatible with capabilities. And to do that, you shouldn't advance the capabilities.
maybe you meant pairwise linearly independent (by looking at the graph)
Pick many linearly independent linear combinations
isn't there at most linearly independent linear combinations of ?
The current population size that Mars can support is 0, so even 1 person would be overpopulation. To complete the analogy, we are currently sending the entire population to Mars, and someone says: "But what about oxygen? We don't know if it's on Mars, maybe we should work on spacesuits?" and another says, "Nah, we'll figure it out when we get there."