This is true. But ideally I don't think what we need is to be clever, except to the extent that it's a clever way to communicate with people so they understand why the current policies produce bad incentives and agree about changing them.
I think our collective HHS needs are less "clever policy ideas" and more "actively shoot ourselves in the foot slightly less often."
That's a good point about public discussions. It's not how I absorb information, but I can definitely see that.
I'm not sure where I'm proposing bureaucracy? The value is in making sure a conversation efficiently adds value for both parties: a little advance reading avoids the friction of spending much of the time rehashing 101-level prerequisites that are much faster absorbed in advance. A very modest amount of groundwork beforehand maximizes the rate of insight in discussion.
I'm drawing in large part from personal experience. A significant part of my job is interviewing researchers, startup founders, investors, government officials, and assorted business people. Before I get on a call with these people, I look them (and their current and past employers, as needed) up on LinkedIn and Google Scholar and their own webpages. I briefly familiarize myself with what they've worked on and what they know and care about and how they think, as best I can anticipate, even if it's only for 15 minutes. And then when I get into a conversation, I adapt. I'm picking their brain to try and learn, so I try to adapt to their communication style and translate between their worldview and my own. If I go in with an idea of what questions I want answered, and those turn out to not be the important questions, or this turns out to be the wrong person to discuss it with, I change direction. Not doing this often leaves everyone involved frustrated at having wasted their time.
Also, should I be thinking of this as a debate? Because that's very different than a podcast or interview or discussion. These all have different goals. A podcast or interview is where I think the standard I am thinking of is most appropriate. If you want to have a deep discussion, it's insufficient, and you need to do more prep work or you'll never get into the meatiest parts of where you want to go. I do agree that if you're having a (public-facing) debate where the goal is to win, then sure, this is not strictly necessary. The history of e.g. "debates" in politics, or between creationists and biologists, shows that clearly. I'm not sure I'd consider that "meaningful" debate, though. Meaningful debates happen by seriously engaging with the other side's ideas, which requires understanding those ideas.
I can totally believe this. But, I also think that responsibly wearing the scientist hat entails prep work before engaging in a four-hour public discussion with a domain expert in a field. At minimum that includes skimming the titles and ideally the abstracts/outlines of their key writings. Maybe ask Claude to summarize the highlights for you. If he'd done that, he'd have figured out the answers to many of these questions on his own, or gotten to them much faster during the discussion. He's too smart not to.
Otherwise, you're not actually ready to have a meaningful scientific discussion with that person on that topic.
If I'm understanding you correctly, then I strongly disagree about what ethics and meta-ethics are for, as well as what "individual selfishness" means. The questions I care about flow from "What do I care about, and why?" and "How much do I think others should or will care about these things, and why?" Moral realism and amoral nihilism are far from the only options, and neither are ones I'm interested in accepting.
I'm not saying it improves decision making. I'm saying it's an argument for improving our decision making in general, if mundane decisions we wouldn't normally think are all that important have much larger and long-lasting consequences. Each mundane decision affects a large number of lives that parts of me will experience, in addition to the effects on others.
I don't see #1 affecting decision making because it happens no matter what, and therefore shouldn't differ based on our own choices or values. I guess you could argue it implies an absurdly high discount rate if you see the resulting branches as sufficiently separate from one another, but if the resulting worlds are ones I care about, then the measure dilution is just the default baseline I start from in my reasoning. Unless there is some way we can or could meaningfully increase the multiplication rate in some sets of branches but not others? I don't think that's likely with any methods or tech I can foresee.
#2 seems like an argument for improving ourselves to be more mindful in our choices to be more coherent on average, and #3 an argument for improving our average decision making. The main difference I can think of for how measure affects things is maybe in which features of the outcome distribution/probabilities among choices I care about.
It's not my field of expertise, so I have only vague impressions of what is going on, and I certainly wouldn't recommend anyone else use me as a source.
I'm not entirely sure how many of these I agree with, but I don't really think any of them could be considered heretical or even all that uncommon as opinions on LW?
All but #2 seem to me to be pretty well represented ideas, even in the Sequences themselves (to the extent the ideas existed when the Sequences got written).
#2 seems to me to rely on the idea that the process of writing is central or otherwise critical to the process of learning about, and forming a take on, a topic. I have thought about this, and I think for some people it is true, but for me writing is often a process of translating an already-existing conceptual web into a linear approximation of itself. I'm not very good at writing in general, and having an LLM help me wordsmith concepts and workshop ideas as a dialogue partner is pretty helpful. I usually form takes by reading and discussing and then thinking quietly, not so much during writing if I'm writing by myself. Say I read a bunch of things or have some conversations, take notes on these, write an outline of the ideas/structure I want to convey, and share the notes and outline with an LLM. I ask it to write a draft that it and I then work on collaboratively. How is that meaningfully worse than writing alone, or writing with a human partner? Unless you meant literally "Ask an LLM for an essay on a topic and publish it," in which case yes, I agree.
It both is and isn't an entry level question. On the one hand, your expectation matches the expectation LW was founded to shed light on, back when EY was writing The Sequences. On the other hand, it's still a topic a lot of people disagree on and write about here and elsewhere.
There's at least two interpretations of your question I can think of, with different answers, from my POV.
What I think you mean is, "Why do some people think ASI would share some resources with humans as a default or likely outcome?" I don't think that and don't agree with the arguments I've seen put forth for it.
But I don't expect our future to be terrible, in the most likely case. Part of that is the chance of not getting ASI for one reason or another. But most of that is the chance that we will, by the time we need it, have developed an actually satisfying answer to "How do we get an ASI such that it shares resources with humans in a way we find to be a positive outcome?" None of us has that answer yet. But, somewhere out in mind design space are possible ASIs that value human flourishing in ways we would reflectively endorse and that would be good for us.
I think it is at least somewhat in line with your post and what @Seth Herd said in reply above.
Like, we talk about LLM hallucinations, but most humans still don't really grok how unreliable things like eyewitness testimony are. And we know how poorly calibrated humans are about our own factual beliefs, or the success probability of our plans. I've also had cases where coworkers complain about low quality LLM outputs, and when I ask to review the transcripts, it turns out the LLM was right, and they were overconfidently dismissing its answer as nonsensical.
Or, we used to talk about math being hard for LLMs, but that disappeared almost as soon as we gave them access to code/calculators. I think most people interested in AI are overestimating how good most other people are at mental math.
It's a good question. I'd also say limiting mid-game advertising might be a good idea. I'm not really a sports fan in general and don't gamble, but a few months ago I went to a baseball game, and people advertising - I think it was Draftkings? - were walking up and down the aisles constantly throughout the game. It was annoying, distracting, and disconcerting.
Thanks for laying out the parts I wasn't thinking about!
I agree. In which case, I think the concrete proposal of "We need to invest more resources in this" is even more important. That way, we can find out if it's impossible soon enough to use it as justification to make people stop pretending they've got it under control.
Over time I am increasingly wondering how much these shortcomings on cognitive tasks are a matter of evaluators overestimating the capabilities of humans, while failing to provide AI systems with the level of guidance, training, feedback, and tools that a human would get.
His view is that this is no different from people buying Taylor Swift tickets. In general I am highly sympathetic to this argument. I am not looking to tell people how much to invest or what goods to consume.
Hmm. Now you have me wondering if I should be biting that bullet in the other direction. I do think Live Nation's practices could qualify as predatory. I guess the difference is that Swift herself has asked fans not to buy at such high (and scalped) prices.
Edit to add: please ignore my last sentence. @ChristianKI reminded me we definitely know that would not be allowed.
Yes, but I don't see what VRA provisions the cases I listed could violate? Unless you can show state level election discrimination. And the standard for VRA violation is apparently much higher than I think it should be, given the difficulty of stopping or reducing gerrymandering.
Section 2 of the 14th amendment might apply, but it's at best unclear whether it means there have to be popular elections for choosing electors at all. At the time it passed there were still plenty of people around who remembered when most state legislatures chose electors directly.
TBH, with the current court I think it's more likely such a case would get the VRA gutted. The constitution explicitly gives state legislatures authority to apportion electors.
Note: this is not the same as trying to overrule popular vote results after an election happens. That got talked about a lot in 2020, I agree it would not be allowed.
Realistic in what sense?
This proposal requires a majority vote by four state legislatures, and increases each state's influence in presidential politics.
A constitutional amendment requires a 2/3 majority in both houses of Congress plus ratification by the legislatures of 3/4 of the states, and it reduces the influence of the roughly 20 states that are currently overrepresented in the electoral college.
I like it. Every four years, instead of running ads, people spend billions lobbying state legislatures to pass weird rules for how they will assign electoral votes. No one knows who the swing states will be until Election Day. Maybe not even until December 17. That would be a fun Supreme Court case - what happens when Florida passes a law that changes electoral vote assignments, after the election is held but before the electors meet?
Yeah. As I understand it, state legislatures aren't really restricted in how they assign electoral votes. As in, if it wanted, the TX state legislature could probably say, "We're not holding a 2024 presidential election. Our electoral votes automatically go to whoever the R candidate is." What in the Constitution could stop them? It would most likely be political suicide for the state legislators. But within their authority.
Just wanted to point out that AI Safety ("Friendliness" at the time) was the original impetus for LW. Only, they (esp. EY, early on) kept noticing other topics that were prerequisites for even having a useful conversation about AI, and topics that were prerequisites for those, etc., and that's how the Sequences came to be. So in that sense, "LW is more and more full of detailed posts about AI that newcomers can't follow easily" is a sign that everything is going as intended, and yes, it really is important to read a lot of the prerequisite background material if you want to participate in that part of the discussion.
On the other hand, if you want a broader participation in the parts of the community that are about individual and collective rationality, that's still here too! You can read the Sequence Highlights, or the collections of resources listed by CFAR, or everything else in the Library. And if there's something you want to ask or discuss, make a post about it, and you'll most likely get some good engagement, or at least people directing you to other places to investigate or discuss it. There are also lots of other forums and blogs and substacks with current or historical ties to LW that are more specialized, now that the community is big enough to support that. The diaspora/fragmentation will continue for many of the same reasons we no longer have Natural Philosophers.
Really interesting post! Out of curiosity, have you ever read Star Maker? This is basically one of the civilizations he came up with, a symbiotic setup where the land-based crab-like animals provide the tool using intelligence and the aquatic whale-like animals provide the theoretical intelligence.
I agree with the thrust and most of the content of the post, but in the interest of strengthening it, I'm looking at your list of problems and wanted to point out what I see as gaps/weaknesses.
For the first one, keep in mind it took centuries from trying to develop a temperature scale to actually having the modern thermodynamic definition of temperature, and reliable thermometers. The definition is kinda weird and unintuitive, and strictly speaking runs from 0 to infinity, then discontinuously jumps to negative infinity (but only for some kinds of finite systems), then rises back towards negative zero (I always found this funny when playing the Sims 3, since it had a "-1K Refrigerator"). Humans knew things got hot and cold for many, many millennia before figuring out temperature in a principled way. Morality could plausibly be similar.
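For anyone wondering where that strange-looking scale comes from, here's the standard statistical-mechanics definition in one line (nothing specific to this discussion, just the textbook result):

$$\frac{1}{T} \;=\; \frac{\partial S}{\partial E}$$

For a system whose energy spectrum is bounded above, the entropy S(E) rises to a maximum and then falls, so the slope passes through zero: T runs up through +infinity, reappears at -infinity, and climbs back toward "negative zero" as the population fully inverts. The inverse temperature 1/(k_B T) is the quantity that actually varies smoothly, which is why the T scale itself looks so odd.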
The third and fourth seem easily explainable by bounded rationality, in the same way that "ability to build flying machines and quantum computers" and "ability to identify and explain the fundamental laws of physical reality" vary between individuals, cultures, and societies.
For the fifth, there's no theoretical requirement that something real should only have a small number of principles that are necessary for human-scale application. Occam's Razor cuts against anyone suggesting a fundamentally complex thing, but it is possible there is a simple underlying set of principles that is just incredibly complicated to use in practice. I would argue that most attempts to formalize morality, from Kant to Bentham etc., have this problem, and one of the common ways they go wrong is that people try to apply them without recognizing that.
The sixth seems like a complete non-sequitur to me. The idea seems to be that if moral realism were true, then people would be morally good. But why would they be? Even if there were somehow a satisfying answer to the second problem of imposing an obligation, this does not necessarily provide an actual mechanism to compel action, or even a tendency to act, to fulfil the obligation. In fact at least some traditional attempts to have moral realist frameworks, like Judeo-Christian God-as-Lawgiver religion, explicitly avoid having such a mechanism.
Depending on the posts I think you could argue they're comparable to one of those other source types I listed.
I'm curious which kinds of posts you're looking to cite, for what kinds of use in a dissertation for what field.
Looking over the site as a whole, different posts should (IMO) be regarded as akin to primary sources, news sources, non-peer-reviewed academic papers, whitepapers, symposia, textbook chapters, or professional sources, depending on the author and epistemic status.
In other words, this isn't "a site" for this purpose, it's a forum that hosts many kinds of content in a conveniently cross-referenceable format, some but not all of which is of a suitable standard for referencing for some but not all academic uses. This should at least be familiar from how your professors think about other kinds of citations. Someone might cite a doctor's published case study as part of the background research on some disease, or the NYT's publication of a summary of the Pentagon Papers in regard to the history of First Amendment jurisprudence, or a corporate whitepaper or other report as a source of data about an industry.
I think the things we need in those scenarios are permits and inspections before completing the connection/installation of the new stuff, which I have always needed in any state I've lived in in addition to needing to use licensed plumbers and electricians.
I do, yes.
I don't think this is a good approach, and could easily backfire. The problem isn't that you need people to find errors in your reasoning. It's that you need to find the errors in your reasoning, fix them as best you can, iterate that a few times, then post your actual reasoning in a more thorough form, in a way that is collaborative and not combative. Then what you post may be in a form where it's actually useful for other people to pick it apart and discuss further.
The fact that you specify you want to put in little effort is a major red flag. So is the fact that you want to be perceived as someone worth listening to. The best way to be perceived as being worth listening to is to be worth listening to, which means putting in effort. An approach that focuses on signaling instead of being is a net drain on the community's resources and cuts against the goal of having humanity not die. It takes time and work to understand a field well enough for your participation to be a net positive.
That said, it's clear you have good questions you want to discuss, and there are some pretty easy ways to reformat your posts that would help. Could probably be done in at most an extra hour per post, less as it becomes habitual.
Some general principles:
- Whenever possible, start from a place of wanting to learn and collaborate and discover instead of wanting to persuade. Ask real questions, not rhetorical questions. Seek thought partners, and really listen to what they have to say.
- If you do want to change peoples' minds about something that is generally well-accepted as being well-supported, the burden of proof is on you, not them. Don't claim otherwise. Try not to believe otherwise, if you can manage it. Acknowledge that other people have lots of reasons for believing what they believe.
- Don't call people stupid or blind.
- Don't make broad assumptions about what large groups of people believe.
- Don't say you're completely certain you're right, especially when you are only offering a very short description of what you think, and almost no description of why you think it, or why anyone else should trust or care about what you think.
- Don't make totalizing statements without a lot of proof. You seem to often get into trouble with all-or-nothing assumptions and conclusions that just aren't justified.
- Lay out your actual reasoning. What are your premises, and why do you believe them? What specific premises did you consider? What premises do you reject that many others accept, and why? And no, something like "orthogonality thesis" is not a premise. It's the outcome of a detailed set of discussions and arguments that follow from much simpler premises. Look at what you see as assumptions, then drill down into them a few more layers to find the actual underlying assumptions.
- Cite your sources. What have you done/read/studied on the topic? Where are you drawing specific claims from? This is part of your own epistemic status evaluation and those others will need to know. You should be doing this anyway for your own benefit as you learn, long before you start writing a post for anyone else.
- You may lump the tone of this one under "dogmatic," but the Twelve Virtues of Rationality really are core principles that are extraordinarily useful for advancing both individual and community understanding of pretty much anything. Some of these you already are showing, but pay more attention to 2-4 and 8-11.
It's definitely a useful partner to bounce ideas off, but keep in mind it's trained with a bias to try to be helpful and agreeable unless you specifically prompt it to provide an honest analysis and critique.
Fair enough, I was being somewhat cheeky there.
I strongly agree with the proposition that it is possible in principle to construct a system that pursues any specifiable goal that has any physically possible level of intelligence, including but not limited to capabilities such as memory, reasoning, planning, and learning.
As things stand, I do not believe there is any set of sources I or anyone else here could show you that would influence your opinion on that topic. At least, not without a lot of other prerequisite material that may seem to you to have nothing to do with it. And without knowing you a whole lot better than I ever could from a comment thread, I can't really provide good recommendations beyond the standard ones, at least not recommendations I would expect that you would appreciate.
However, you and I are (AFAIK) both humans, which means there are many elements of how our minds work that we share, which need not be shared by other kinds of minds. Moreover, you ended up here, and have an interest in many types of questions that I am also interested in. I do not know, but strongly suspect, that if you keep searching and learning, openly and honestly and with a bit more humility, you'll eventually understand why I'm saying what I'm saying, whether you agree with me or not, and whether I'm right or not.
Of course some of them are dogmatic! So what? If you can't learn how to learn from sources that make mistakes, then you will never have anything or anyone to learn from.
I have not said either of those things.
I have literally never seen anyone say anything like that here in response to a sincere question relevant to the topic at hand. Can you provide an example? Because I read through a bunch of your comment history earlier and found nothing of the sort. I see many suggestions to do basic research and read basic sources that include a thorough discussion of the assumptions, though.
You don't need me to answer that, and won't benefit if I do. You just need to get out of the car.
I don't expect you to read that link or to get anything useful out of it if you do. But if and when you know why I chose it, you'll know much more about the orthogonality thesis than you currently do.
It's true that your earlier comments were polite in tone. Nevertheless, they reflect an assumption that the person you are replying to should, at your request, provide a complete answer to your question. Whereas, if you read the foundational material they were drawing on and which this community views as the basics, you would already have some idea where they were coming from and why they thought what they thought.
When you join a community, it's on you to learn to talk in their terminology and ontology enough to participate. You don't walk into a church and expect the minister to drop everything mid-sermon to explain what the Bible is. You read it yourself, seek out 101 spaces and sources and classes, absorb more over time, and then dive in as you become ready. You don't walk into a high level physics symposium and expect to be able to challenge a random attendee to defend Newton's Laws. You study yourself, and then listen for a while, and read books and take classes, and then, maybe months or years later, start participating.
Go read the Sequences, or at least the highlights from the Sequences. Learn about the concept of steelmanning and start challenging your own arguments before you use them to challenge those of others. Go read ACX and SSC and learn what it looks like to take seriously and learn from an argument that seems ridiculous to you, whether or not you end up agreeing with it. Go look up CFAR and the resources and methods they've developed and/or recommended for improving individual rationality and making disagreements more productive.
I'm not going to pretend everyone here has done all of that. It's not strictly necessary, by any means. But when people tell you you're making a particular mistake, and point you to the resources that discuss the issue in detail and why it's a mistake and how to improve, and this happens again and again on the same kinds of issues, you can either listen and learn in order to participate effectively, or get downvoted.
FYI, I don't work in AI, it's not my field of expertise either.
And you're very much misrepresenting or misunderstanding why I am disagreeing with you, and why others are.
And you are mistaken that we're not talking about this. We talk about it all the time, in great detail. We are aware that philosophers have known about the problems for a very long time and failed to come up with solutions anywhere near adequate to what we need for AI. We are very aware that we don't actually know what is (most) valuable to us, let alone any other minds, and have at best partial information about this.
I guess I'll leave off with the observation that it seems you really do believe as you say, that you're completely certain of your beliefs on some of these points of disagreement. In which case, you are correctly implementing Bayesian updating in response to those who comment/reply. If any mind assigns probability 1 to any proposition, that is infinite certainty. No finite amount of data can ever convince that mind otherwise. Do with that what you will. One man's modus ponens is another's modus tollens.
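To make the "probability 1" point concrete with the standard formula (nothing here beyond Bayes' rule itself):

$$P(H \mid E) \;=\; \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \lnot H)\,P(\lnot H)}$$

If P(H) = 1 then P(not-H) = 0, the second term in the denominator vanishes, and P(H|E) = 1 for any evidence E the mind doesn't treat as outright impossible. The update is a no-op, no matter what gets observed.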
No, I don't, you aren't, and I don't, in that order.
If you agree that I can refute Pascal's Wager then I don't actually "face" it.
If I refute it, I'm not left with power seeking, I'm left with the same complete set of goals and options I had before we considered Pascal's Wager. Those never went away.
And more power is better all else equal, but all else is not equal when I'm trading off effort and resources among plans and actions. So, it does not follow that seeking more power is always the best option.
Clever is not relevant to upvoting or downvoting in this context. The statement, as written, is not insightful or helpful, nor does it lead to interesting discussions that readers would want to give priority in what is shown to others.
Yes, but neither of us gets to use "possible" as a shield and assume that leaves us free to treat the two possibilities as equivalent, even if we both started from uniform priors. If this is not clear, you need to go back to Highly Advanced Epistemology 101 for Beginners. Those are the absolute basics for a productive discussion on these kinds of topics.
You have presented briefly stated summary descriptions of complex assertions without evidence other than informal verbal arguments which contain many flaws and gaps that I and many others have repeatedly pointed out. I and others have provided counterexamples to some of the assertions and detailed explanations of many of the flaws and gaps. You have not corrected the flaws and gaps, nor have you identified any specific gaps or leaps in any of the arguments you claim to be disagreeing with. Nor have you paid attention to any of the very clear cases where what you claim other people believe blatantly contradicts what they actually believe and say they believe and argue for, even when this is repeatedly pointed out.
This is a question that's many reasoning steps into a discussion that's well developed. Maxentropy priors, Solomonoff priors, uniform priors, there are good reasons to choose each depending on context, take your pick depending on the full set of hypotheses under consideration. Part of the answer is "There's basically no such thing as no evidence if you have any reason to be considering a hypothesis at all." Part is "It doesn't matter that much as long as your choice isn't actively perverse, because as long as you correctly update your priors over time, you'll approach the correct probability eventually."
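If it helps to see the "priors wash out" point concretely, here's a minimal sketch (the coin, its bias, and the specific priors are all invented for illustration): two agents start with very different priors about a coin's bias, update on the same flips, and end up in nearly the same place.

```python
import random

random.seed(0)
TRUE_P = 0.7  # hypothetical true bias of the coin

# Two agents with very different Beta priors over the bias:
# A is near-uniform, B starts out fairly convinced the coin is tails-heavy.
priors = {"A": (1, 1), "B": (2, 20)}

flips = [random.random() < TRUE_P for _ in range(1000)]
heads, tails = sum(flips), len(flips) - sum(flips)

for name, (a, b) in priors.items():
    # Beta-Bernoulli conjugate update: add observed counts to the prior pseudo-counts.
    post_mean = (a + heads) / (a + b + heads + tails)
    print(f"Agent {name}: prior mean {a / (a + b):.2f} -> posterior mean {post_mean:.3f}")
```

Both posterior means land near 0.7. The prior only dominates early on, or if it was "actively perverse" in the sense of assigning zero probability to the truth.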
FWIW, while I am as certain as I can reasonably be that 2+2=4, this is not a foundational assumption. I wasn't born knowing it. I arrived at it based on evidence acquired over time, and if I started encountering different evidence, I would eventually change my mind. See https://www.lesswrong.com/posts/6FmqiAgS8h4EJm86s/how-to-convince-me-that-2-2-3
Also, the reason that "Every assumption is incorrect unless there is evidence" isn't "basic logic" is that "correct" and "incorrect" are not the right categories. Both a statement and its competing hypotheses are claims to which rational minds assign credences/probabilities that are neither zero nor one, for any finite level of evidence. A mind is built with assumptions that govern its operation, and some of those assumptions may be impossible for the mind itself to want to change or choose to change, but anything else that the mind is capable of representing and considering is fair game in the right environment.
It isn't.
I provided counterexamples. Anything that already exists is not impossible, and a system that cannot achieve things that humans achieve easily is not as smart as, let alone smarter or more capable than, humans or humanity. If you are insisting that that's what intelligence means, then TBH your definition is not interesting or useful or in line with anyone else's usage. Choose a different word, and explain what you mean by it.
> When you hear about "AI will believe in God" you say - AI is NOT comparable to humans.
> When you hear "AI will seek power forever" you say - AI IS comparable to humans.
If that's how it looks to you, that's because you're only looking at the surface level. "Comparability to humans" is not the relevant metric, and it is not the metric by which experts are evaluating the claims. The things you're calling foundational, that you're saying have unpatched holes being ignored, are not, in fact, foundational. The foundations are elsewhere, and have different holes that we're actively working on and others we're still discovering.
> AI scientists assume that there is no objective goal.
They don't. Really, really don't. I mean, many do, I'm sure, in their own thoughts, but their work does not in any way depend on this. It only depends on whether it is possible in principle to build a system that is capable of having significant impact on the world and that does not pursue, or care to pursue, or find, or care to find, whatever objective goal might exist.
As written, your posts are a claim that such a thing is absolutely impossible. That no system as smart as or smarter than humans or humanity could possibly pursue any known goal or do anything other than try to ensure its own survival. Not (just) as a limiting case of infinite intelligence, but as a practical matter of real systems that might come to exist and compete with humans for resources.
Suppose there is a God, a divine lawgiver who has defined once and for all what makes something Good or Right. Or, any other source of some Objective Goal, whether we can know what it is or not. In what way does this prevent me from making paperclips? By what mechanism does it prevent me from wanting to make paperclips? From deciding to execute plans that make paperclips, and not execute those that don't? Where and how does that "objective goal" reach into the physical universe and move around the atoms and bits that make up the process that actually governs my real-world behavior? And if there isn't one, then why do you expect there to be one if you gave me a brain a thousand or a million times as large and fast? If this doesn't happen for humans, then why do you expect it to happen in other types of minds? What are the boundaries of what types of mind this applies to vs not, and why? If I took a mind that did have an obsession with finding the objective goal and/or maximizing its chances of survival, why would I pretend its goal was something other than what it plans to do and executes plans to do? But also, if I hid a secret NOT gate in its wiring that negated the value it expects to gain from any plan it comes up with, well, what mechanism prevents that NOT gate from obeying the physical laws and reversing the system's choices to instead pursue the opposite goal?
In other words, in this post, steps 1-3 are indeed obvious and generally accepted around here, but there is no necessary causal link between steps 3 and 4. You do not provide one, and there have been tens of thousands of pages devoted to explaining why one does not exist. In this post, the claim in the first sentence is simply false: the orthogonality thesis does not depend on that assumption in any way. In this post, you're ignoring the well-known solutions to Pascal's Mugging, one of which is that the supposed infinite positive utility is balanced by all the other infinitely many possible unknown-unknown goals with infinite positive utilities, so that the net effect this will have on current behavior depends entirely on the method used to calculate it, and is not strictly determined by the thing we call "intelligence." And also, again, it is balanced by the fact that pursuing only instrumental goals, forever searching and never achieving best-known-current terminal goals, knowing that this is what you're doing and going to do despite wanting something else, guarantees that nothing you do has any value for any goal other than maximizing searching/certainty/survival, and in fact minimizes the chances of any such goal ever being realized. These are basic observations explained in lots of places on and off this site; in some places you ignore people linking to explanations of them in replies to you, and in other cases you link to them yourself while ignoring their content.
And just FYI, this will be my last detailed response to this line of discussion. I strongly recommend you go back, reread the source material, and think about it for a while. After that, if you're still convinced of your position, write an actually strong piece arguing for it. This won't be a few sentences or paragraphs. It'll be tens to hundreds of pages or more in which you explain where and why and how the already-existing counterarguments, which should be cited and linked in their strongest forms, are either wrong or else lead to your conclusions instead of the ones others believe they lead to. I promise you that if you write an actual argument, and try to have an actual good-faith discussion about it, people will want to hear it.
At the end of the day, it's not my job to prove to you that you're wrong. You are the one making extremely strong claims that run counter to a vast body of work as well as counter to vast bodies of empirical evidence in the form of all minds that actually exist. It is on you to show that 1) Your argument about what will happen in the limit of maximum reasoning ability has no holes for any possible mind design, and 2) This is what is relevant for people to care about in the context of "What will actual AI minds do and how do we survive and thrive as we create them and/or coordinate amongst ourselves to not create them?"
That is what people are doing and how they're using the downvotes, though. You aren't seeing that because you haven't engaged with the source material or the topic deeply enough.
> I tried to be polite and patient here, but it didn't work, I'm trying new strategies now. I'm quite sure my reasoning is stronger than reasoning of people who don't agree with me.
> I find "your communication was not clear" a bit funny. You are scientists, you are super observant, but you don't notice a problem when it is screamed at your face.
Just to add since I didn't respond to this part: your posts are mostly saying that very well-known and well-studied problems are so simple and obvious and only your conclusions are plausible that everyone else must be wrong and missing the obvious. You haven't pointed out a problem. We knew about the problem. We've directed a whole lot of time to studying the problem. You have not engaged with the proposed solutions.
It isn't anyone else's job to assume you know what you're talking about. It's your job to show it, if you want to convince anyone, and you haven't done that.
> Why do you think so? A teenager who did not earn any money in his life yet has failed utterly its objective to earn money?
The difference (aside from the fact that no human has only a single goal) is the word yet. The teenager has an understanding, fluid and incomplete as it may be, about when, how, and why their resource allocation choices will change and they'll start earning money. There is something they want to be when they grow up, and they know they aren't it yet, but they also know when being "grown up" happens. You're instead proposing either that an entity that really, truly wants to maximize paperclips will provably and knowingly choose a path where it never pivots to trying to achieve its stated goal instead of pursuing instrumental subgoals, or that it is incapable of the metacognitive realization that its plan is straightforwardly outcompeted by the plan "Immediately shut down," which is outcompeted by "Use whatever resource I have at hand to immediately make a few paperclips even if I then get shut down."
Or maybe you're imagining a system that looks at each incremental resource-allocation step individually without ever stepping back and thinking about the longer term implications of its strategy, in which case, why exactly are you assuming that? And why is its reasoning process about local resource allocation so different from its reasoning process where it understands the long term implications of making near-term choices that might get it shut down? Any system whose reasoning process is that disjointed and inconsistent is sufficiently internally misaligned that it's a mistake to call it an X-maximizer based on the stated goal instead of behavior.
Also, you don't seem to be considering any particular context of how the system you're imagining came to be created, which has huge implications for what it needs to do to survive. One common version of why a paperclip maximizer might come to be is by mistake, but another is "we wanted an AI paperclip factory manager to help us outproduce our competitors." In that scenario, guess what kind of behavior is most likely to get it shut down? By making such a sweeping argument, you're essentially saying it is impossible for any mind to notice and plan for these kinds of problems.
But compare to analogous situations: "This kid will never show up to his exam, he's going to keep studying his books and notes forever to prepare." "That doctor will never perform a single surgery, she'll just keep studying the CT scan results to make extra sure she knows it's needed." This is flatly, self-evidently untrue. Real-world minds at various levels of intelligence take real-world steps to achieve real-world goals all the time, every day. We divide resources among multiple goals on differing planning horizons because that actually does work better. You seem to be claiming that this kind of behavior will change as minds get sufficiently "smarter" for some definitions of smartness. And not just for some minds, but for all possible minds. In other words, that improved ability to reason leads to complete inability to pursue any terminal goal other than itself. That somehow the supposedly "smarter" system loses the capability to make the obvious ground-level observation "If I'm never going to pursue my goal, and I accumulate all possible resources, then the goal won't get pursued, so I need to either change my strategy or decide I was wrong about what I thought my goal was." But this is an extremely strong claim that you provide no evidence for aside from bare assertions, even in response to commenters who direct you to near-book-length discussions of why those assertions don't hold.
The people here understand very well that systems (including humans) can have behavior that demonstrates different goals (in the what-they-actually-pursue sense) than the goals we thought we gave them, or than they say they have. This is kinda the whole point of the community existing. Everything from akrasia to sharp left turns to shoggoths to effective altruism is pretty much entirely about noticing and overcoming these kinds of problems.
Also, if we step back from the discussion of any specific system or goal, the claim that "a paperclip maximizer would never make paperclips" is true for any sufficiently smart system is like saying, "If the Windows Task Scheduler ever runs anything but itself, it's ignoring the risk that there's a better schedule it could find." Which is a well-studied practical problem with known partial solutions and also known not to have a fully general solution. That second fact doesn't prevent schedulers from running other things, because implementing and incrementally improving the partial solutions is what actually improves capabilities for achieving the goal.
If you don't want to seriously engage with the body of past work on all of the problems you're talking about, or if you want to assume that the ones that are still open or whose (full or partial) solutions are unknown to you are fundamentally unsolvable, you are very welcome to do that. You can pursue any objectives you want. But putting that in a post the way you're doing it will get the post downvoted. Correctly downvoted, because the post is not useful to the community, and further responses to the post are not useful to you.
The answer here is: a paperclip maximizer that devotes 100% of resources to self-preservation has, by its own choice, failed utterly to achieve its own objective. By making that choice, it ensures that the value of its own survival, by its own estimation, is not infinite. It is zero. Survival in this limit is uncorrelated at best with paperclip production. In fact, survival would almost certainly have negative expected value in this scenario, because if the maximizer simply shut down, there is a chance some fraction of the resources it would have wasted on useless survival will instead be used by other entities to make paperclips. Another way to say it may be that what you have described is not, in fact, a paperclip maximizer, because in maximizing resource acquisition for survival, it is actually minimizing paperclip production. It may want to be a paperclip maximizer, it may claim to be one, it may believe it is one, but it simply isn't.
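To make that concrete, here's a toy sketch (the numbers and the resource-to-paperclip conversion are entirely made up, just to illustrate the point): a policy that reinvests everything in acquiring resources and self-preservation forever scores exactly zero on its own objective, while even crude splits do better.

```python
# Toy model: each step, an agent splits its resources between "acquire more
# resources / self-preservation" and "make paperclips now". All numbers are
# invented purely for illustration.

def total_paperclips(invest_fraction: float, steps: int = 50, growth: float = 1.1) -> float:
    resources, paperclips = 1.0, 0.0
    for _ in range(steps):
        invested = resources * invest_fraction  # grows future resources
        paperclips += resources - invested      # the rest becomes paperclips now
        resources = invested * growth
    return paperclips

for frac in (1.0, 0.99, 0.5):
    print(f"invest {frac:.0%} each step -> {total_paperclips(frac):.1f} paperclips")
```

The invest-everything policy ends with an enormous resource pile and zero paperclips; by the maximizer's own metric it is strictly worse than almost anything else it could have done.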
Also, some general observations, in the interest of assuming you want an actual discussion where you are part of this community trying to learn and grow together: The reason your posts get downvoted isn't because the readers are stupid, and it isn't because there is not an interesting or meaningful discussion to be had on these questions. It's because your style of writing is insulting, inflammatory, condescending, and lacks sufficient attention to its own assumptions and reasoning steps. You assert complex and nuanced arguments and propositions (like Pascal's Wager) as though they were self-evidently true and fundamental without adequately laying out which version of those propositions you even mean, let alone why you think they're so inarguable. You seem to not have actually looked to find out what other people have already thought and written about many of these topics and questions, when in fact we have noticed the skulls. And that's fine not to have looked, but in that case, try to realize you maybe haven't looked, and write in a way that shows you're open to being told about things you didn't consider or question in your own assumptions and arguments.
I think the grandfather idea is that if you kill 100 people now, and the average person who dies would have had 1 descendant, and the large loss would happen in 100 years (~4 more generations), then the difference in total lives lived between the two scenarios is ~500, not 900. If the number of descendants per person is above ~1.2, then burying the waste means population after the larger loss in 100 years is actually higher than if you processed it now.
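Rough back-of-the-envelope version of that count (the 1000-person figure for the later loss is my assumption, since only the 900 difference is stated; descendants and generations as above):

```python
# Assumptions, purely for illustration: the early option kills 100 people now,
# the delayed option kills 1000 in ~100 years (~4 more generations), and each
# person averages 1 descendant per generation.
EARLY_DEATHS = 100
LATE_DEATHS = 1000            # assumed from the "900" naive difference
DESCENDANTS_PER_PERSON = 1.0
GENERATIONS = 4

# Lives never lived if the 100 die now: the 100 themselves plus each
# subsequent generation of would-be descendants.
lives_lost_early = sum(EARLY_DEATHS * DESCENDANTS_PER_PERSON ** g
                       for g in range(GENERATIONS + 1))

print(lives_lost_early)                # 500
print(LATE_DEATHS - lives_lost_early)  # ~500 difference in total lives lived, not 900
```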
Obviously I'm also ignoring a whole lot of things here that I do think matter, as well.
And of course, as you pointed out in your reply to my comment above, it's probably better to ignore the scenario description and just look at it as a pure choice along the lines of something like "Is it better to reduce total population by 900 if the deaths happen in 100 years instead of now?"
General elections necessarily do. Coverage of issues does not. The press can attribute opinions to people and ideologies without pretending everyone in a party shares, or should share, identical views.