LessWrong 2.0 Reader
"To the best of my knowledge, Vernor did not get cryopreserved. He has no chance to see the future he envisioned so boldly and imaginatively. The near-future world of Rainbows End is very nearly here... Part of me is upset with myself for not pushing him to make cryonics arrangements. However, he knew about it and made his choice."
ape-in-the-coat on Beauty and the Bets
To be frank, it feels as if you didn't read any of my posts on Sleeping Beauty before writing this comment. It seems you are simply annoyed when people argue about substanceless semantics - and, believe me, I sympathise enormously! - and so you assume that I'm doing the same, based on shallow pattern matching ("talks about Sleeping Beauty -> semantic disagreement"), spilling your annoyance at me without checking whether that assumption is actually correct.
Which is a shame, because I've designed this whole series of posts with people like you in mind: someone who starts from the assumption that there are two valid answers, an assumption I myself was quite sympathetic to until I actually went and checked.
If that is indeed the case, please start here [LW · GW], and then I'd appreciate it if you actually engaged with the points I made, because that post addresses the kind of criticism you are making here.
If you had actually read all my Sleeping Beauty posts and seen me highlight the very specific mathematical disagreements between halfers and thirders, and how utterly ungrounded the idea of applying probability theory to "centred possible worlds" is, I don't see how this kind of appeal to both sides still having a point can be a valid response.
Anyway, I'm going to address your comment step by step.
Sleeping Beauty is an edge case where different reward structures are intuitively possible
Different reward structures are possible in any probability theory problem. "Make a bet on a coin toss, but if the outcome is Tails the bet is repeated three times, and if it's Heads you get punched in the face" is a completely possible reward structure for a simple coin toss problem. Is it not very intuitive? Granted, but that is beside the point. Mathematical rules are supposed to always work, even in non-intuitive cases.
Once the payout structure is fixed, the confusion is gone.
People should agree on which bets to make - this is true, and it is exactly what I show in the first part of this post. But the mathematical concept of "probability" is not just about bets, which I talk about in the middle part of this post. A huge part of the confusion is still very much present. Or rather it was, until I actually resolved it in the previous post.
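To make the betting agreement concrete, here is a minimal Monte Carlo sketch in Python (my own illustration, not from the post; the stake and payout values are hypothetical): Beauty stakes one unit on Heads at every awakening.

```python
import random

def simulate(n_experiments=100_000, stake=1.0, payout=3.0):
    """Average net result per experiment when Beauty stakes `stake`
    on Heads at every awakening and a winning ticket returns `payout`.
    Heads -> one awakening, Tails -> two."""
    total = 0.0
    for _ in range(n_experiments):
        heads = random.random() < 0.5
        for _ in range(1 if heads else 2):
            total += (payout - stake) if heads else -stake
    return total / n_experiments

# payout = 3.0 is the break-even ticket (fair odds for P(Heads) = 1/3
# per awakening, i.e. 0.5*(3-1) + 0.5*(-2) = 0 per experiment):
print(simulate())            # ~0.0
print(simulate(payout=2.0))  # ~-0.5: a losing bet under this structure
```

Once the per-awakening reward structure is fixed like this, halfers and thirders must both conclude that a ticket returning 3 units is the break-even bet; the disagreement about which bets to make evaporates.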
Sleeping Beauty is about definitions.
There definitely is a semantic component to the disagreement between halfers and thirders. But it's the least interesting one, and that's why I'm postponing the discussion of it until the next post.
The thing you seem to be missing is that there is also a real, objective disagreement which is obfuscated by the semantic one. People noticed that halfers and thirders use different definitions, concluded that semantics is all there is, and decided not to look further. But they very much should have.
My last two posts are about these objective disagreements. Is there an update on awakening or is there not? There is disagreement about this even among thirders, who apparently agree on the definition of "probability". Are the ways halfers and thirders define probability formally correct? It's a strictly defined mathematical concept, mind you, not some similarity-cluster category border like "sound". Are Tails&Monday and Tails&Tuesday mutually exclusive events? You can't just define mutual exclusivity however you like.
Probability is something defined in math by necessity.
Probability is a measure function over an event space. And if for some mathematical reason you can't construct an event space, your "probability" is ill-defined.
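For reference, here is the standard Kolmogorov formalization being appealed to (a textbook statement, not anything specific to these posts):

```latex
% A probability space is a triple (\Omega, \mathcal{F}, P), where
% \Omega is the sample space, \mathcal{F} is a \sigma-algebra of
% events over \Omega, and P is a measure satisfying:
\begin{align}
  & P : \mathcal{F} \to [0, 1], \qquad P(\Omega) = 1, \\
  & P\Bigl(\bigcup_{i} A_i\Bigr) = \sum_{i} P(A_i)
    \quad \text{for pairwise disjoint } A_i \in \mathcal{F}.
\end{align}
```

Note that countable additivity applies only to events that are actually disjoint in $\mathcal{F}$, which is why the mutual exclusivity of events like Tails&Monday and Tails&Tuesday is a structural fact about the event space, not a matter of definition.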
You all should just call these two probabilities two different words instead of arguing which one is the correct definition for "probability".
I'm doing both. I've shown that only one of these things is formally a probability, and in the next post I'm going to define the other one and explore its properties.
dave-orr on Would you have a baby in 2024?
Heh, that's why I put "strong" in there!
gerald-monroe on AI #57: All the AI News That’s Fit to Print
https://twitter.com/perrymetzger/status/1772987611998462445 - just wanted to bring this to your attention.
It's unfortunate that some snit between Perry and Eliezer over events 30 years ago stopped much discussion of the actual merits of his arguments, as I'd like to see what Eliezer or you have to say in response.
Eliezer responded with: https://twitter.com/ESYudkowsky/status/1773064617239150796. He calls Perry a liar a bunch of times and does give
review-bot on The smallest possible button (or: moth traps!)
the first group permitted to try their hand at this should be humans augmented to the point where they are no longer idiots -- augmented humans so intelligent that they have stopped being bloody idiots like the rest of us; so intelligent they have stopped hoping for clever ideas to work that won't actually work. That's the level of intelligence needed to build something smarter than yourself and survive the experience.
The LessWrong Review [? · GW] runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2024. The top fifty or so posts are featured prominently on the site throughout the year. Will this post make the top fifty?
agentofuser on Plausibility of cyborgism for protecting boundaries?
I can see how advancing those areas would empower membranes to be better at self-defense.
I'm having a hard time visualizing how explicitly adding a concept, formalism, or implementation of membranes/boundaries would help advance those areas (and in turn empower membranes more).
For example, is "what if we add membranes to loom" a question that typechecks? What would "add membranes" reify as in a case like that?
In the other direction, would there be a way to model a system's (stretch goal: a human child's; MVP: a bargaining bot's?) membrane quantitatively somehow, in a way where you can compare before/after for different interventions and estimate how well each does at empowering/protecting the membrane? Would it have a way of distinguishing amount-of-protection added from outside vs. inside? Does "what if we add loom to membranes" compile?
papetoast on 2023 Survey Results
To the four people who picked 37 and thought there was a 5% chance other people would also choose it, well played.
Wow, that's really a replicable phenomenon
mako-yass on All About Concave and Convex Agents
Alternate phrasing: "Oh, you could steal the townhouse at a 1/8 billion probability? How about we make a deal instead. If the rng rolls a number lower than 1/7 billion, I give you the townhouse; otherwise, you deactivate and give us back the world." The convex agent finds that to be a much better deal, accepts, then deactivates.
I guess perhaps it was the holdout who was being unreasonable, in the previous telling.
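For intuition about why the convex agent takes the long shot, here is a generic sketch (my own illustration; neither the comment nor the post specifies these utility functions) of how convex and concave utilities rank a gamble against a sure thing of equal expected value:

```python
# Illustrative utilities only: x**2 is one simple convex (risk-seeking)
# choice, x**0.5 one simple concave (risk-averse) choice.
convex  = lambda x: x ** 2
concave = lambda x: x ** 0.5

def eu(u, p, prize):
    """Expected utility of winning `prize` with probability p, else 0."""
    return p * u(prize)

sure = 1.0               # a guaranteed 1 unit of resources
long_shot = (1e-9, 1e9)  # p = 1e-9 chance of 1e9 units; same EV = 1

for name, u in [("convex", convex), ("concave", concave)]:
    print(name, eu(u, *long_shot) > u(sure))
# convex True   (1e-9 * 1e18 = 1e9 >> 1: takes the gamble)
# concave False (1e-9 * ~3.2e4 ≈ 3e-5 << 1: takes the sure thing)
```

Under any such convex utility, sweetening the long shot's odds even slightly (1/7 billion vs. 1/8 billion) dominates, which is why the agent in the quoted deal accepts and deactivates.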
mishka on AI #57: All the AI News That’s Fit to Print
Emmett Shear continues his argument that trying to control AI is doomed
I think that a recent tweet thread by Michael Nielsen and the quoted one by Emmett Shear represent genuine progress towards making AI existential safety more tractable.
Michael Nielsen observes, in particular:
As far as I can see, alignment isn't a property of an AI system. It's a property of the entire world, and if you are trying to discuss it as a system property you will inevitably end up making bad mistakes
Since AI existential safety is a property of the whole ecosystem (and is, really, not too drastically different from World existential safety), this should be the starting point, rather than stand-alone properties of any particular AI system.
Emmett Shear writes:
Hopefully you’ve validated whatever your approach is, but only one of these is stable long term: care. Because care can be made stable under reflection, people are careful (not a coincidence, haha) when it comes to decisions that might impact those they care about.
And Zvi responds:
Technically I would say: Powerful entities generally caring about X tends not to be a stable equilibrium, even if it is stable ‘on reflection’ within a given entity. It will only hold if caring more about X provides a competitive advantage against other similarly powerful entities, or if there can never be a variation in X-caring levels between such entities that arises other than through reflection, and also reflection never causes reductions in X-caring despite this being competitively advantageous. Also note that variation in what else you care about to what extent is effectively variation in X-caring.
Or more bluntly: The ones that don’t care, or care less, outcompete the ones that care.
Even the best case scenarios here, when they play out the ways we would hope, do not seem all that hopeful.
That all, of course, sets aside the question of whether we could get this ‘caring’ thing to operationally work in the first place. That seems very hard.
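To see the force of Zvi's "the ones that care less outcompete the ones that care" point, here is a toy replicator-dynamics sketch (my own model, with an assumed fitness cost for caring; not anything from the thread):

```python
def replicator(share_caring=0.99, cost=0.01, steps=2000):
    """Two strategies: 'caring' has fitness 1 - cost, 'non-caring' has
    fitness 1. Returns the caring share after `steps` generations."""
    x = share_caring
    for _ in range(steps):
        f_care, f_not = 1.0 - cost, 1.0
        mean = x * f_care + (1 - x) * f_not
        x = x * f_care / mean  # standard discrete replicator update
    return x

print(replicator())  # ~0.0: even a 1% fitness cost drives caring
                     # from 99% of the population to near zero
```

Any mechanism that is supposed to stabilize caring has to break one of the assumptions baked into this toy model, which is roughly what the rest of the comment below attempts.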
Let's now consider this in light of what Michael Nielsen is saying.
I am going to consider only the case where we have plenty of powerful entities with long-term goals and long-term existence which care about those goals and that existence. This seems to be the case Zvi is considering here, and it is the case we understand best, because we also live in a reality with plenty of powerful entities (ourselves, some organizations, etc.) with long-term goals and long-term existence. So this is an incomplete consideration: it only includes scenarios where powerful entities with long-term goals and long-term existence retain a good fraction of the overall available power.
So what do we really need? What properties do we want the World to have? We need a good deal of conservation and non-destruction, and we need the interests of the weaker members of the overall ecosystem, not just the currently smartest or most powerful ones, to be adequately taken into account.
Here is how we might be able to get a trajectory where these properties are stable, despite all the drastic changes of a self-modifying and self-improving ecosystem.
An arbitrary ASI entity (just like an unaugmented human) cannot fully predict the future. In particular, it does not know where it might eventually end up in terms of relative smartness or relative power (relative to the most powerful ASI entities or to the ASI ecosystem as a whole). So if any given entity wants to be long-term safe, it is strongly interested in the ASI society having general principles and practices of protecting its members on various levels of smartness and power. If only the smartest and most powerful are protected, then no entity is long-term safe on the individual level.
This might be enough to produce an effective counterweight to unrestricted competition (just as human societies have mechanisms against unrestricted competition). Basically, smarter-than-human entities at all levels of power are likely to be interested in the overall society having general principles and practices of protecting its members at various levels of smartness and power, and that's why they'll care enough for the overall society to continue to self-regulate and to enforce these principles.
This is not yet the solution, but I think this is pointing in the right direction...
shankar-sivarajan on The Sequences on YouTube
The first few videos will necessarily be terrible, especially, hopefully, by the standards of the 47th video.
Suggestion: do them out of order.