It's also available on Android.
Finitism doesn't reject the existence of any given natural number (although ultrafinitism might), nor the validity of the successor function (counting), nor even the notion of a "potential" infinity (like time), just the idea of a completed one being an object in its own right (which can be put into a set). The Axiom of Infinity doesn't let you escape the notion of classes which can't themselves be an element of a set. Set theory runs into paradoxes if we allow it. Is it such an invalid move to disallow the class of Naturals as an element of a set, when even ZFC must disallow the Surreals for similar reasons?
Before Cantor, all mathematicians were finitists. It's not a weird position historically.
We do model physics with "real" numbers, but that doesn't mean the underlying reality is infinite or even infinitely divisible. My finitism is motivated by my understanding of physics and cosmology, not the other way around. Nature seems to cut us off from any access to any completed infinity, and it's not clear that even potential infinities are allowed (hence my sympathy with ultrafinitism). I have no need of that axiom.
Quantum Field Theory, though traditionally modeled using continuous mathematics, implies the Bekenstein bound: a finite region of space contains a finite amount of information. There are no "infinite bits" available to build the real numbers with. However densely you store information, at some point your storage medium collapses into a black hole, and packing in more must take up more space.
Physical space can't be a continuum like the "reals". It's not infinitely divisible. Measuring distance with increasing precision requires higher-frequency waves, and thus higher energies, which eventually have enough effective mass to gravitationally distort the very space you are measuring, ultimately collapsing into a black hole.
Below a certain limit, distance isn't physically meaningful. If you assume an electron is a point particle with "infinitesimal" size and you zoom in enough, you should be able to get arbitrarily high electric field strength. But at some point, high enough field strength results in vacuum polarization: virtual electron/positron pairs get pushed around and finally one of the positrons annihilates whatever you thought the real electron was, and then one of the virtual electrons doesn't have anything to pair with and becomes the real one. It's as if the electron is jumping around. You can't nail it down. It doesn't physically have a position down below a certain scale in time and space. There are no infinite bits. All the fundamental particle types are like this. There are no infinitesimal point particles. They're just waves.
There's also a cosmological horizon limiting how much of the Universe we can see. There's also a (related) past temporal horizon at the Big Bang. We can't see a completed past-temporal or spatial infinity, in any direction. We're not sure of the Ultimate Fate of the Universe, but it looks like Heat Death is probably it, given our current understanding of physics. So there's a future limit as well. The other likely candidate Fates are finite in time as well.
But even supposing finite information content in a finite region seems to be enough to make potential-infinite time not really meaningful. There's a finite number of states possible, so eventually all reachable states are reached. If physics is deterministic (it seems to be), then we get into a cycle. So time is better modeled as a finite circle, rather than an infinite line. And if it's not deterministic? Then we still saturate all reachable states, the order just gets shuffled around a bit. There's no physical way to tell the difference.
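To make the "finite states plus determinism means a cycle" step concrete, here's a minimal sketch. The toy update rule and the count of 1000 states are made up for illustration and are not a claim about real physics. The point is only the pigeonhole argument: a deterministic rule on a finite state space has to revisit some state within that many steps, and from then on it repeats.

```python
def find_cycle(step, start):
    """Iterate a deterministic update rule until some state repeats.

    Returns (steps_before_cycle, cycle_length).
    """
    seen = {}  # state -> time step at which it was first visited
    state = start
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state)
        t += 1
    first_visit = seen[state]
    return first_visit, t - first_visit


# Hypothetical toy "physics": 1000 possible states and an arbitrary fixed rule.
N_STATES = 1000


def toy_step(state):
    return (state * state + 1) % N_STATES


tail, period = find_cycle(toy_step, start=3)
# Pigeonhole: the trajectory cannot visit more distinct states than exist.
assert tail + period <= N_STATES
print(f"enters a cycle of length {period} after {tail} steps")
```

Swapping in any other deterministic rule on a finite set gives the same qualitative result; only the tail and cycle lengths change.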
Potential-infinite space is the same way. Any accessible region has a finite number of states, so at least some of them must repeat exactly in other regions. If there's some determinism to the pattern, then it's maybe better modeled as some curled-up finite space (although aperiodic tilings are also possible). If it's random, then we still saturate all reachable states, the order just gets shuffled around a bit. There's no physical way to tell the difference. Once all reachable states have been saturated, why does it matter if they appear only once or a googol or infinity times?
I installed Mindfulness Bell on my phone, and every time it chimes, I ask myself, "Should I be doing something else right now?" Sometimes I'm being productive and don't need to stop. When I notice I've started ignoring it, I change the chime sound so I notice it again. The interval is adjustable. If I'm stuck scrolling social media, this often gives me the opportunity to stop. Doesn't always work though. I also have it turned off at night so I can sleep. This is a problem if I get stuck on social media at night when I should be sleeping. Instead, after bed time, I progressively dim the lights and screen to the point where I can barely read it. That's usually enough to let me fall asleep.
I'm hearing intuitions, not arguments here. Do you understand Cantor's Diagonalization argument? This proves that the set of all integers is "smaller" (in a well-defined way) than the set of all real numbers, despite the set of all integers being already infinite in size. And it doesn't end there. There is no largest set.
Russell's paradox arises when a set definition refers to itself. For example, in a certain town, the barber is the one who shaves all those (and only those) who do not shave themselves. This seems to make sense on its face. But who shaves the barber? Contradiction! Not all set definitions are valid, and this includes the universal one, which can be proved to not exist in many ways, at least in the usual ZFC (and similar).
There are two ways to construct a universal object. Either make it a non-set notion like a "proper class", which can't be an element of a set (and thus can't contain itself or any other proper class), or restrict the axiom of comprehension in a way which results in a non-well-founded set theory. Cantor's Theorem doesn't hold for all sets in NF. The diagonal set argument can't be constructed (in all cases) under its rules. NF has a universal set that contains itself, but it accomplishes this by restricting comprehension to stratified formulas. I'm not a set theorist, so I'm still not sure I understand this properly, but it looks like an infinite hierarchy of set types, each with its own universal set. Again, no end to the hierarchy, but in practice all the copies behave the same way. So instead of strictly two types of classes, the proper class and the small class, you have some kind of hyperset that can contain sets, but not other hypersets, and hyper-hypersets that can contain both, but not other hyper-hypersets, and so forth, ad infinitum.
Personally, I'm rather sympathetic to the ultrafinitists, and might be a finitist myself. I can accept the slope of a vertical line being "infinite" in the limit. That's just an artifact of how we chose to measure something. Measure it differently, and the infinity disappears. I can also accept a potential infinity, like not having a largest integer, because the successor function can make a bigger one. We can make an abstract algorithm run on an abstract machine that can count, and it has a finite description. But taking the "completed" set of all integers as an object itself rubs me the wrong way. That had to be tacked on as a separate axiom. It's unphysical. No operation could possibly construct a physical model of such a thing. It's an impossible object. One could try to point to a pre-existing model, but we physically cannot verify it. It would take infinite time, space, or precision, which is again unphysical.
Similarly, there is no physical way to verify an infinite God exists, because we physically cannot distinguish it from a (sufficiently large, but) finite one. I might be willing to call such an alien a small-g "god", but it's not the big-G omni-everything one in valentinslepukhin's definition. That only leaves some kind of a priori logical argument, because it can't be an empirical one, but it has to be based on axioms I can accept, doesn't it? I can entertain weird axioms for the sake of argument, but I'm not seeing one short of "God exists", which is blatant question begging.
The main idea here is that one can always derive a "greater" set (in terms of cardinality) from any given set, even if the given set is already infinite, because there are higher degrees of infinity. There is no greatest infinity, just like there is no largest number. So even if (hypothetically) a Being with infinite knowledge exists, there could be Beings with greater knowledge than that. No matter which god you choose, there could be one greater than that, meaning there are things the god you chose doesn't know (and hence He isn't "omniscient", and therefore isn't "God", because this was a required attribute.)
I don't know how to interpret "all existing objects", because I don't know what counts as an "object" in your definition. Set theory doesn't require ur-objects (although those are known variations) and just starts with the empty set, meaning all "objects" are themselves sets. The powerset operation evaluates to the set of all subsets of a set. The powerset of a set always has greater cardinality than the set you started with. That is, for any given collection of "objects", the number of possible groupings of those objects is always a greater number than the number of objects, even if the collection of objects you started with had an infinite number to begin with. So no, this doesn't prove that an infinite universe cannot exist, just that there are degrees of infinities (and no "greatest" one).
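For a finite collection you can check the powerset claim by brute force. The sketch below is only an illustration with a made-up three-element set and a made-up pairing. It counts the groupings, then builds Cantor's diagonal set for an attempted pairing of objects with subsets and confirms the pairing misses it. The same diagonal construction is what carries the argument over to infinite sets, where you can't just count.

```python
from itertools import chain, combinations


def powerset(items):
    """Return every subset of items as a list of frozensets."""
    items = list(items)
    return [
        frozenset(combo)
        for combo in chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)
        )
    ]


def diagonal_set(domain, f):
    """Cantor's diagonal set: the elements that are not in their own image."""
    return frozenset(x for x in domain if x not in f(x))


objects = {"a", "b", "c"}  # hypothetical "objects"
subsets = powerset(objects)
assert len(subsets) == 2 ** len(objects)  # 8 groupings of 3 objects

# One attempted pairing of each object with a subset (arbitrary example).
pairing = {
    "a": frozenset({"a", "b"}),
    "b": frozenset(),
    "c": frozenset({"a"}),
}
missed = diagonal_set(objects, pairing.__getitem__)
assert missed not in pairing.values()  # the pairing cannot cover this subset
print(sorted(missed))
```

Whatever pairing you try, the diagonal set differs from the subset assigned to each object on the question of whether it contains that very object, so it is never in the image.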
Naive set theory leads to paradoxes when defining self-referential sets. The idea of "infinite" gods seems to have similar problems. There are various ways to resolve this. The typical one used in foundations of mathematics is the notion of a collection that is too large to be a set, a "proper class". ("Class" used to be synonymous with "set".) But later on in the discussion it was pointed out that this isn't the only possible resolution.
I don't know of any officially sanctioned way. But, hypothetically, meeting a publicly-known real human person in person and giving them your public pgp key might work. Said real human could vouch for you and your public key, and no one else could fake a message signed by you, assuming you protect your private key. It's probably sufficient to sign and post one message proving this is your account (profile bio, probably), and then we just have to trust you to keep your account password secure.
Would it help if we wore helmets?
Hissp v0.5.0 is up.
python -m pip install hissp
If you always wanted to learn about Lisp macros, but only know Python, try the Hissp macro tutorials.
That seems to be getting into Game Theory territory. One can model agents (players) with different strategies, even suboptimal ones. A lot of the insight from Game Theory isn't just about how to play a better strategy, but how changing the rules affects the game.
Not sure I understand what you mean by that. The Universe seems to follow relatively simple deterministic laws. That doesn't mean you can use quantum field theory to predict the weather. But chaotic systems can be modeled as statistical ensembles. Temperature is a meaningful measurement even if we can't calculate the motion of all the individual gas molecules.
If you're referring to human irrationality in particular, we can study cognitive bias, which is how human reasoning diverges from that of idealized agents in certain systematic ways. This is a topic of interest at both the individual level of psychology, and at the level of statistical ensembles in economics.
It's short for "woo-woo", a derogatory term skeptics use for magical thinking.
I think the word originates as onomatopoeia from the haunting woo-woo Theremin sounds played in black-and-white horror films when the ghost was about to appear. It's what the "supernatural" sounds like, I guess.
It's not about the belief being unconventional as much as it being irrational. Just because we don't understand how something works doesn't mean it doesn't work (it just probably doesn't), but we can still call your reasons for thinking so invalid. A classic skeptic might dismiss anything associated categorically, but rationalists judge by the preponderance of the evidence. Some superstitions are valid. Prescientific cultures may still have learned true things, even if they can't express them well to outsiders.
Use a smart but not self-improving AI agent to antagonize the world with the goal of making advanced societies believe that AGI is a bad idea and precipitating effective government actions. You could call this the Ozymandias approach.
ChaosGPT already exists. It's incompetent to the point of being comical at the moment, but maybe more powerful analogues will appear and wreak havoc. Considering the current prevalence of malware, it might be more surprising if something like this didn't happen.
We've already seen developments that could have been considered AI "warning shots" in the past. So far, they haven't been enough to stop capabilities advancement. Why would the next one be any different? We're already living in a world with literal wars killing people right now, and crazy terrorists with various ideologies. It's surprising what people get used to. How bad would a warning shot have to be to shock the world into action given that background noise? Or would we be desensitized by then by the smaller warning shots leading up to it? Boiling the frog, so to speak. I honestly don't know. And by the time a warning shot gets that bad, can we act in time to survive the next one?
Intentionally causing earlier warning shots would be evil, illegal, destructive, and undignified. Even "purely" economic damage at sufficient scale is going to literally kill people. Our best chance is civilization stepping up and coordinating. That means regulations and treaties, and only then the threat of violence to enforce the laws and impose the global consensus on any remaining rogue nations. That looks like the police and the army, not terrorists and hackers.
We have already identified some key resources involved in AI development that could be restricted. The economic bottlenecks are mainly around high energy requirements and chip manufacturing.
Energy is probably too connected to the rest of the economy to be a good regulatory lever, but the U.S. power grid can't currently handle the scale of the data centers the AI labs want for model training. That might buy us a little time. Big tech is already talking about buying small modular nuclear reactors to power the next generation of data centers. Those probably won't be ready until the early 2030s. Unfortunately, that also creates pressures to move training to China or the Middle East where energy is cheaper, but where governments are less concerned about human rights.
A recent hurricane flooding high-purity quartz mines made headlines because chip producers require it for the crucibles used in making silicon wafers. Lower purity means accidental doping of the silicon crystal, which means lower chip yields per wafer, at best. Those mines aren't the only source, but they seem to be the best one. There might also be ways to utilize lower-purity materials, but that might take time to develop and would require a lot more energy, which is already a bottleneck.
The very cutting-edge chips required for AI training runs require some delicate and expensive extreme-ultraviolet lithography machines to manufacture. They literally have to plasmify tin droplets with a pulsed laser to reach those frequencies. ASML Holdings is currently the only company that sells these systems, and machines that advanced have their own supply chains. They have very few customers, and (last I checked) only TSMC was really using them successfully at scale. There are a lot of potential policy levers in this space, at least for now.
I do not really understand how technical advance in alignment realistically becomes a success path. I anticipate that in order for improved alignment to be useful, it would need to be present in essentially all AI agents or it would need to be present in the most powerful AI agent such that the aligned agent could dominate other unaligned AI agents.
The instrumental convergence of goals implies that a powerful AI would almost certainly act to prevent any rivals from emerging, whether aligned or not. In the intelligence explosion scenario, progress would be rapid enough that the first mover achieves a decisive strategic advantage over the entire world. If we find an alignment solution robust enough to survive the intelligence explosion, it will set up guardrails to prevent most catastrophes, including the emergence of unaligned AGIs.
I don’t expect uniformity of adoption and I don’t necessarily expect alignment to correlate with agent capability. By my estimation, this success path rests on the probability that the organization with the most capable AI agent is also specifically interested in ensuring alignment of that agent. I expect these goals to interfere with each other to some degree such that this confluence is unlikely. Are your expectations different?
Alignment and capabilities don't necessarily correlate, and that accounts for a lot of why my p(doom) is so high. But more aligned agents are, in principle, more useful, so rational organizations should be motivated to pursue aligned AGI, not just AGI. Unfortunately, alignment research seems barely tractable, capabilities can be brute-forced (and look valuable in the short term), and corporate incentive structures being what they are, in practice, what we're seeing is a reckless amount of risk taking. Regulation could alter the incentives to balance the externality with appropriate costs.
How about "bubble lighting" then?
The forms of approaches that I expected to see but haven’t seen too much of thus far are those similar to the one that you linked about STOP AI. That is, approaches that would scale with the addition of approximately average people.
Besides STOP AI, there's also the less extreme PauseAI. They're interested in things like lobbying, protests, lawsuits, etc.
I presume that your high P(doom) already accounts for your estimation of the probability of government action being successful. Does your high P(doom) imply that you expect these to be too slow, or too ineffective?
Yep, most of my hope is on our civilization's coordination mechanisms kicking in in time. Most of the world's problems seem to be failures to coordinate, but that's not the same as saying we can't coordinate. Failures are more salient, but that's a cognitive bias. We've achieved a remarkable level of stability, in the light of recent history. But rationalists can see more clearly than most just how mad the world still is. Most of the public and most of our leaders fail to grasp some of the very basics of epistemology.
We used to think the public wouldn't get it (because most people are insufficiently sane), but they actually seem appropriately suspicious of AI. We used to think a technical solution was our only realistic option, but progress there has not kept up with more powerful computers brute-forcing AI. In desperation, we asked for more time. We were pleasantly surprised at how well the message was received, but it doesn't look like the slowdown is actually happening yet.
As a software engineer, I've worked in tech companies. Relatively big ones, even. I've seen the pressures and dysfunction. I strongly suspected that they're not taking safety and security seriously enough to actually make a difference, and reports from insiders only confirm that narrative. If those are the institutions calling the shots when we achieve AGI, we're dead. We desperately need more regulation to force them to behave or stop. I fear that what regulations we do get won't be enough, but they might.
Other hopes are around a technical breakthrough that advances alignment more than capabilities, or the AI labs somehow failing in their project to produce AGI (despite the considerable resources they've already amassed), perhaps due to a breakdown in the scaling laws or some unrelated disaster that makes the projects too expensive to continue.
However, it seems to be a less reasonable approach if time scales are short or probabilities are high.
I have a massive level of uncertainty around AGI timelines, but there's an uncomfortably large amount of probability mass on the possibility that through some breakthrough or secret project, AGI was achieved yesterday and the news just hasn't caught up with me. We're out of buffer. But we might still have decades before things get bad. We might be able to coordinate in time, with government intervention.
I would expect this would include the admission of ideas which would have previously been pruned because they come with negative consequences.
What ideas are those?
Protesters are expected to be at least a little annoying. Strategic unpopularity might be a price worth paying if it gets results. Sometimes extremists shift the Overton Window.
I mean, yes, hence my comment about ChatGPT writing better than this, but if word gets out that Stop AI is literally using the product of the company they're protesting in their protests, it could come off as hypocrisy.
I personally don't have a problem with it, but I understand the situation at a deeper level than the general public. It could be a wise strategic move to hire a human writer, or even ask for competent volunteer writers, including those not willing to join the protests themselves, although I can see budget or timing being a factor in the decision.
Or they could just use one of the bigger Llamas on their own hardware and try to not get caught. Seems like an unnecessary risk though.
The press release strikes me as poorly written. It's middle-school level. ChatGPT can write better than this. Exactly who is your (Stop AI's) audience here? "The press"?
Exclamation points are excessive. "Heart's content"? You're not in this for "contentment". The "you can't prove it, therefore I'm right" argument is weak. The second page is worse. "Toxic conditions"? I think I know what you meant, but you didn't connect it well enough for a general audience. "accelerate our mass extinction until we are all dead"? I'm pretty sure the "all dead" part has to come before the "extinction". "(and abusing his sister)"? OK, there's enough in the public record to believe that Sam is not (ahem) "consistently candid", but I'm at under 50% about the sister abuse even then on priors. Do you want to get sued for libel on top of your jail time? Is that a good strategy?
I admire your courage and hope you make an impact, but if you're willing to pay these heavy costs, of getting arrested, and facing jail time etc., then please try to win! Your necessity defense is an interesting idea, but if this is the best you can do, it will fail. If you can afford to hire a good defense attorney, you can afford a better writer! Tell me how this move is 4-D chess and not just a blunder.
The mods probably have access to better analytics. I, for one, was a long-time lurker before I said anything.
My current estimate of P(doom) in the next 15 years is 5%. That is, high enough to be concerned, but not high enough to cash out my retirement. I am curious about anyone harboring a P(doom) > 50%. This would seem to be high enough to support drastic actions. What work has been done to develop rational approaches to such a high P(doom)?
I mean, what do you think we've been doing all along?
I'm at like 90% in 20 years, but I'm not claiming even one significant digit on that figure. My drastic actions have been to get depressed enough to be unwilling to work in a job as stressful as my last one. I don't want to be that miserable if we've only got a few years left. I don't think I'm being sufficiently rational about it, no. It would be more dignified to make lots of money and donate it to the organization with the best chance of stopping or at least delaying our impending doom. I couldn't tell you which one that is at the moment though.
Some are starting to take more drastic actions. Whether those actions will be effective remains to be seen.
In my view, technical alignment is not keeping up with capabilities advancement. We have no alignment tech robust enough to even possibly survive the likely intelligence explosion scenario, and it's not likely to be developed in time. Corporate incentive structures and dysfunction make the labs insufficiently cautious. Even without an intelligence explosion, we also have no plans for the likely social upheaval from rapid job loss. The default outcome is that human life becomes worthless, because that's already the case in such economies.
Our best chance at this point is probably government intervention to put the liability back on reckless AI labs for the risks they're imposing on the rest of us, if not an outright moratorium on massive training runs.
Gladstone has an Action Plan. There's also https://www.narrowpath.co/.
On a related topic, I am looking to explore how to determine the right scale of the objective function for revenge (or social correction if you prefer a smaller scope). My intuition is that revenge was developed as a mechanism to perform tribal level optimizations. In a situation where there has been a social transgression, and redressing that transgression would be personally costly but societally beneficial, what is the correct balance between personal interest and societal interest?
This is a question for game theory. Trading a state of total anarchy for feudalism, a family who will avenge you is a great deterrent to have. It could even save your life. Revenge is thus a good thing. A moral duty, even. Yes, really. For a smaller scope, being quick to anger and vindictive will make others reluctant to mess with you.
Unfortunately, this also tends to result in endless blood feuds as families get revenge for the revenge for the revenge, at least until one side gets powerful enough to massacre the other. In the smaller scope, maybe you exhaust yourself or risk getting killed fighting duels to protect your honor.
We've found that having a central authority to monopolize violence rather than vengeance and courts to settle disputes rather than duels works better. But the instincts for anger and revenge and taking offense are still there. Societies with the better alternatives now consider such instincts bad.
Unfortunately, this kind of improved dispute resolution isn't available at the largest and smallest scales. There is no central authority to resolve disputes between nations, or at least not ones powerful enough to prevent all wars. We still rely on the principle of vengeance (second strike) to deter nuclear wars. This is not an ideal situation to be in. At the smaller scale, poor inner-city street kids join gangs that could avenge them, use social media to show off weapons they're not legally allowed to have, and have a lot of anger and bluster, all to try to protect themselves in a system that can't or won't do that for them.
So, to answer the original question, the optimal balance really depends on your social context.
at the personal scale it might yield the decision that one should go work in finance and accrue a pile of utility. But if you apply instrumental rationality to an objective function at the societal scale it might yield the decision to give all your spare resources to the most effective organizations you can find.
Yes. And yes. See You Need More Money for the former, Effective Altruism for the latter, and Earning to give for a combination of the two.
As for which to focus on, well, Rationality doesn't decide for you what your utility function is. That's on you. (surprise! you want what you want)
My take is that maybe you put on your own oxygen mask first, and then maybe pay a tithe, to the most effective orgs you can find. If you get so rich that even that's not enough, why not invest in causes that benefit you personally, but society as well? (Medical research, for example.)
I also don't feel the need to aid potential future enemies just because they happen to be human. (And feel even less obligation for animals.) Folks may legitimately differ on what level counts as having taken care of themselves first. I don't feel like I'm there yet. Some are probably worse off than me and yet giving a lot more. But neglecting one's own need is probably not very "effective" either.
The questions seem underspecified. You haven't nailed down a single world, and different worlds could have different answers. Many of the laws of today no longer make sense in worlds like you're describing. They may be ignored and forgotten or updated after some time.
If we have the technology to enhance human memory for perfect recall, does that violate copyright, since you're recording everything? Arguably, it's fair use to remember your own life. Sharing that with others gets murkier. Also, copyright was originally intended to incentivize creation. Do we still need that incentive when AI becomes more creative than we are? I think not.
You can already find celebrity deepfakes online, depicting them in situations they probably wouldn't approve of. I imagine the robot question has similar answers. We haven't worked that out yet, but there seem to be legal trends towards banning it, but without enough teeth to actually stop it. I think culture can adapt to the situation just fine even without a ban, but it could take some time.
What we'd ask depends on the context. In general, not all rationalist teachings are in the form of a question, but many could probably be phrased that way.
"Do I desire to believe X if X is the case and not-X if X is not the case?" (For whatever X in question.) This is the fundamental lesson of epistemic rationality. If you don't want to lie to yourself, the rest will help you get better at that. But if you do, you'll lie to yourself anyway and all your acquired cleverness will be used to defeat itself.
"Am I winning?" This is the fundamental lesson of instrumental rationality. It's not enough to act with Propriety or "virtue" or obey the Great Teacher. Sometimes the rules you learned aren't applicable. If you're not winning and it's not due to pure chance, you did it wrong, propriety be damned. You failed to grasp the Art. Reflect, and actually cut the enemy.
Those two are the big ones. But there are more.
Key lessons from Bayes:
- How much do I currently believe this?
- What's my best guess, or upper/lower bound, or probability distribution?
- Do I have sufficient cause to even entertain the hypothesis?
- Is the evidence more consistent with my hypothesis, or its converse?
- How ad-hoc is the hypothesis? Is there a weaker claim more likely to be true?
- Is there a chain of causality linking the evidence to the hypothesis? What else is on that chain? Can I gather more direct evidence? Have I double-counted indirect evidence?
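Several of the Bayes questions above boil down to one small calculation: how much you believed it before, and how much more expected the evidence is if the hypothesis is true than if it's false. Here's a rough sketch; the function name and all the numbers are made up for illustration.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(H | E) from P(H), P(E | H), and P(E | not H) via Bayes' theorem."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator


# Hypothetical numbers: a 1% prior, and evidence that is nine times more
# likely if the hypothesis is true than if it is false.
posterior = update(prior=0.01, p_evidence_if_true=0.9, p_evidence_if_false=0.1)
print(f"{posterior:.3f}")

# Evidence that is equally likely either way should not move the belief at all.
unchanged = update(prior=0.3, p_evidence_if_true=0.5, p_evidence_if_false=0.5)
assert abs(unchanged - 0.3) < 1e-12
```

Even strong evidence leaves a low-prior hypothesis fairly unlikely, which is why "Do I have sufficient cause to even entertain the hypothesis?" comes before weighing the evidence.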
Others I thought of:
- Am I confused?
- What's a concrete example?
- What do I expect? (And then, "How surprised am I?")
- Assuming this failed, how surprised would I be? What's the most obvious reason why it would fail? Can I take a step to mitigate that/improve my plans? (And repeat.)
- Does this happen often enough to be worth the cost of fixing it?
- Has anyone else solved this problem? Have I checked? (Web, LLMs, textbooks?)
- Have I thought about this for at least five minutes?
- Do I care? Should I? Why?
- Wanna bet?
- Can I test this?
- What's my biggest problem? What's the bottleneck? What am I not allowed to care about? If my life was a novel, what would I be yelling at the protagonist to do? Of my current pursuits, which am I pursuing ineffectively? What sparks my interest?
- Am I lying?
I'm not claiming this list is exhaustive.
I think #1 implies #2 pretty strongly, but OK, I was mostly with you until #4. Why is it that low? I think #3 implies #4, with high probability. Why don't you?
#5 and #6 don't seem like strong objections. Multiple scenarios could happen multiple times in the interval we are talking about. Only one has to deal the final blow for it to be final, and even blows we survive, we can't necessarily recover from, or recover from quickly. The weaker civilization gets, the less likely it is to survive the next blow.
We can hope that warning shots wake up the world enough to make further blows less likely, but consider that the opposite may be true. Damage leads to desperation, which leads to war, which leads to arms races, which leads to cutting corners on safety, which leads to the next blow. Or human manipulation/deception through AI leads to widespread mistrust, which prevents us from coordinating on our collective problems in time. Or AI success leads to dependence, which leads to reluctance to change course, which makes recovery harder. Or repeated survival leads to complacency until we boil the frog to death. Or some combination of these, or similar cascading failures. It depends on the nature of the scenario. There are lots of ways things could go wrong, many roads to ruin; disaster is disjunctive.
Would warnings even work? Those in the know are sounding the alarm already. Are we taking them seriously enough? If not, why do you expect this to change?
I don't really have a problem with the term "intelligence" myself, but I see how it could carry anthropomorphic baggage for some people. However, I think the important parts are, in fact, analogous between AGI and humans. But I'm not attached to that particular word. One may as well say "competence" or "optimization power" without losing hold of the sense of "intelligence" we mean when we talk about AI.
In the study of human intelligence, it's useful to break down the g factor (what IQ tests purport to measure) into fluid and crystallized intelligence. The former is the processing power required to learn and act in novel situations; the latter is what has been learned, plus the ability to call upon and apply that knowledge.
"Cognitive skills" seems like a reasonably good framing for further discussion, but I think recent experience in the field contradicts your second problem, even given this framing. The Bitter Lesson says it well. Here are some relevant excerpts (it's worth a read and not that long).
The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. [...] Seeking an improvement that makes a difference in the shorter term, researchers seek to leverage their human knowledge of the domain, but the only thing that matters in the long run is the leveraging of computation.
[...] researchers always tried to make systems that worked the way the researchers thought their own minds worked---they tried to put that knowledge in their systems---but it proved ultimately counterproductive, and a colossal waste of researcher's time, when, through Moore's law, massive computation became available and a means was found to put it to good use.
[...] We have to learn the bitter lesson that building in how we think we think does not work in the long run. The bitter lesson is based on the historical observations that 1) AI researchers have often tried to build knowledge into their agents, 2) this always helps in the short term, and is personally satisfying to the researcher, but 3) in the long run it plateaus and even inhibits further progress, and 4) breakthrough progress eventually arrives by an opposing approach based on scaling computation by search and learning. The eventual success is tinged with bitterness, and often incompletely digested, because it is success over a favored, human-centric approach.
One thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.
[...] the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds [...] these are part of the arbitrary, intrinsically-complex, outside world. They are not what should be built in, as their complexity is endless; instead we should build in only the meta-methods that can find and capture this arbitrary complexity. [...] We want AI agents that can discover like we can, not which contain what we have discovered.
Your conception of intelligence in the "cognitive skills" framing seems to be mainly about the crystallized sort: the knowledge and skills, and the application thereof. You see how complex and multidimensional that is and object to the idea that collections of such should be well-ordered, which would make concepts like "smarter-than-human", if not wholly devoid of meaning, at least wrongheaded.
I agree that "competence" is ultimately a synonym for "skill", but you're neglecting the fluid intelligence. We already know how to give computers the only "cognitive skills" that matter: the ones that let you acquire all the others. The ability to learn, mainly. And that one can be brute-forced with more compute. All the complexity and multidimensionality you see comes from something profoundly simple, algorithms measured in mere kilobytes of source code, interacting with data from the complex and multidimensional real world.
In the idealized limit, what I call "intelligence" is AIXI. Though the explanation is long, the definition is not. It really is that simple. All else we call "intelligence" is mere approximation and optimization of that.
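For reference, here is the definition as I remember it from Hutter's presentation. The notation is his, not anything established in this thread: $a_i$, $o_i$, $r_i$ are the action, observation, and reward at step $i$; $m$ is the horizon; $U$ is a universal Turing machine; $\ell(q)$ is the length of program $q$ in bits.

```latex
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
  \bigl[ r_k + \cdots + r_m \bigr]
  \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

Read right to left: weight every program consistent with the history by its simplicity, sum the future rewards those programs predict, and pick the action that maximizes the expectation. It is uncomputable, which is why everything practical has to be an approximation.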
Strong ML engineering skills (you should have completed at least the equivalent of a course like ARENA).
What other courses would you consider equivalent?
I don't know why we think we can colonize Mars when we can't even colonize Alaska. Alaska at least has oxygen. Where are the domed cities with climate control?
Specifically, while the kugelblitz is a prediction of general relativity, quantum pair production from strong electric fields makes it infeasible in practice. Even quasars wouldn't be bright enough, and those are far beyond the energy level of a single Dyson sphere. This doesn't rule out primordial black holes forming at the time of the Big Bang, however.
It might still be possible to create micro black holes with particle accelerators, but how easy this is depends on some unanswered questions about physics. In theory, such an accelerator might need to be a thousand light years across at most, but this depends on achievable magnetic field strength. (Magnetars?) On the other hand, if compactified extra dimensions exist (like in string theory), the minimum required energy would be lower. One that small would evaporate almost instantly though. It's not clear if it could be kept alive long enough to get any bigger.
What are efficient Dyson spheres probably made of?
There are many possible Dyson sphere designs, but they seem to fall into three broad categories: shells, orbital swarms, and bubbles. Solid shells are probably unrealistic. Known materials aren't strong enough. Orbital swarms are more realistic but suffer from some problems with self-occlusion and possibly collisions between modules. Limitations on available materials might still make this the best option, at least at first.
But efficient Dyson spheres are probably bubbles. Rather than being made of satellites, they're made of statites, that is, solar sails that don't orbit, but hover. Since both gravitational acceleration and radiant intensity follow the inverse square law, the same design would function at almost any altitude above the Sun, with some caveats. These could be packed much more closely together than the satellites of orbital swarms while maybe using less material. Eric Drexler proposed 100 nm thick aluminum films with some amount of supporting tensile structure. Something like that could be held open by spinning, even with no material compressive structure. Think about a dancer's dress being held open while pirouetting and you get the idea.
The radiation needs to be mostly reflected downwards for the sails to hover, but it could still be focused on targets as long as the net forces keep the statites in place. Clever designs could probably approach 100% coverage.
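The claim that the same design hovers at almost any altitude can be checked with a back-of-the-envelope force balance. Both light pressure and gravity scale as 1/r², so the distance cancels and what remains is a critical mass per unit area. The sketch below assumes an idealized flat sail facing the Sun, rounded textbook constants, and bulk aluminum density for the film; the function name is made up for illustration.

```python
import math

# Physical constants in SI units (rounded standard values).
SOLAR_LUMINOSITY = 3.828e26  # W
SOLAR_MASS = 1.989e30        # kg
G = 6.674e-11                # m^3 kg^-1 s^-2
C = 2.998e8                  # m/s


def critical_areal_density(reflectivity=1.0):
    """Areal density (kg/m^2) at which light pressure balances solar gravity.

    Assumes a flat sail facing the Sun. reflectivity=1 is a perfect mirror,
    reflectivity=0 a perfect absorber. The distance r cancels out because
    both forces fall off as 1/r^2.
    """
    return (1.0 + reflectivity) * SOLAR_LUMINOSITY / (
        4.0 * math.pi * G * SOLAR_MASS * C
    )


ALUMINUM_DENSITY = 2700.0         # kg/m^3, bulk value (an assumption for thin film)
film = ALUMINUM_DENSITY * 100e-9  # areal density of a 100 nm film, kg/m^2
limit = critical_areal_density()
print(f"critical: {limit * 1e3:.2f} g/m^2, bare film: {film * 1e3:.2f} g/m^2")
```

Under these assumptions a perfect mirror hovers at roughly 1.5 g/m², and a bare 100 nm aluminum film weighs a fraction of that, which leaves the rest of the mass budget for tensile structure and payload. Real sails reflect imperfectly and are not always face-on, so treat the margin as optimistic.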
What percent of the solar system can be converted into Dyson-sphere material? Are gas giants harvestable?
Eventually, almost all of it, but you don't need all of it to get full coverage. Yes, they're harvestable; at the energy scales we're talking about, even stellar material is harvestable via star lifting. The Sun contains over 99% of the mass of the Solar System.
How long would it take to harvest that material?
I don't know, but I'll go with the 31 years and 85 days for an orbital swarm as a reasonable ballpark. Bubbles are a different design and may take even less material, but either way, we're talking about exponential growth in energy output that can be applied to the construction. At some point the energy matters more than the matter.
What would the radius of a Dyson sphere be? (i.e. how far away from the sun is optimal). How thick?
I'd say as close to the Sun as the materials can withstand (because this takes less material), so probably well within the orbit of Mercury. Too much radiation and the modules would burn up. Station keeping becomes more difficult when you have to deal with variable Solar wind and coronal mass ejections, and these problems are more severe closer in.
The individual statite sails would be very thin. Maybe on the order of 100 nm for the material, although the tensile supports could be much thicker. I don't know how many sails an optimal statite module would use (maybe just 1). But the configuration required for focus and station keeping probably isn't perfectly flat, so a minimal bounding box around a module could be much thicker still.
An energy efficient Dyson Sphere probably looks like a Matrioshka brain, with outer layers collecting waste heat from the inner layers. Layers could be much farther apart than the size of individual modules.
If the sphere is (presumably) lots of small modules, how far apart are they?
Statites could theoretically be almost touching, especially with active station keeping, which is probably necessary anyway. What's going to move them? Solar wind variation? Micrometeoroid collisions? Gravitational interactions with other celestial bodies? Remember, statites work about the same regardless of altitude, so there can be layers with some amount of overlap.
If an AI is moderately 'nice', leaves Earth alone but does end up converting the rest of the solar system into a Dyson sphere, how fucked is Earth?
Very, probably. And we wouldn't have to wait for the whole (non-Sun) Solar System to be converted before we're in serious trouble.
Rob Miles' YouTube channel has some good explanations about why alignment is hard.
We can already do RLHF, the alignment technique that made ChatGPT and derivatives well-behaved enough to be useful, but we don't expect this to scale to superintelligence. It adjusts the weights based on human feedback, but this can't work once the humans are unable to judge actions (or plans) that are too complex.
If we don't mind the process being slow, this could be achieved by a single "crawler" machine that would go through the matrix field by field and do the updates. Since the machine is finite (albeit huge), this would work.
Not following. We can already update the weights. That's training, tuning, RLHF, etc. How does that help?
We have a goal A, that we want to achieve and some behavior B, that we want to avoid.
No. We're talking about aligning general intelligence. We need to avoid all the dangerous behaviors, not just a single example we can think of, or even numerous examples. We need the AI to output things we haven't thought of, or why is it useful at all? If there's a finite and reasonably small number of inputs/outputs we want, there's a simpler solution: that's not an AGI—it's a lookup table.
You can think of an LLM's weights as a lossy compression of the corpus it was trained on. If you can predict text better than chance, you don't need as much capacity to store it, so an LLM could be a component in a lossless text compressor as well. But the predictors generated by the training process generalize beyond their corpus to things that haven't been written yet. They have an internal model of possible worlds that could have generated the corpus. That's intelligence.
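The link between prediction and compression is easy to demonstrate with a predictor far dumber than an LLM. In the sketch below, each byte costs -log2(p) bits, where p is what the model predicted for that byte before seeing it; an arithmetic coder can get close to that total. The adaptive byte-frequency model is a stand-in chosen for brevity, not how any real LLM-based compressor works.

```python
import math
from collections import Counter


def adaptive_code_length_bits(data: bytes) -> float:
    """Ideal code length of `data` under an adaptive byte-frequency predictor.

    Each byte costs -log2(p) bits, where p is the predictor's probability
    for that byte before seeing it (Laplace-smoothed counts over 256
    possible byte values).
    """
    counts = Counter()
    seen = 0
    total_bits = 0.0
    for byte in data:
        p = (counts[byte] + 1) / (seen + 256)
        total_bits += -math.log2(p)
        counts[byte] += 1
        seen += 1
    return total_bits


text = b"the cat sat on the mat. " * 50
print(adaptive_code_length_bits(text), "bits vs", 8 * len(text), "bits raw")
```

A predictor that knows nothing spends 8 bits per byte. Anything that predicts better than that spends fewer, and a better predictor (one that has learned words, grammar, or facts about the world) spends fewer still.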
Get a VPN. It's good practice when using public Wi-Fi anyway. (Best practice is to never use public Wi-Fi. Get a data plan. Tello is reasonably priced.) Web filters are always imperfect, and I mostly object to them on principle. They'll block too little or too much, or more often a mix of both, but it's a common problem in e.g. schools. Are you sure you're not accessing the Wi-Fi of the business next door? Maybe B&N's was down.
Takeover, if misaligned, also counts as doom. X-risk includes permanent disempowerment, not just literal extinction. That's according to Bostrom, who coined the term:
One where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.
A reasonably good outcome might be for ASI to set some guardrails to prevent death and disasters (like other black marbles) and then mostly leave us alone.
My understanding is that Neuralink is a bet on "cyborgism". It doesn't look like it will make it in time. Cyborgs won't be able to keep up with pure machine intelligence once it begins to take off, but maybe smarter humans would have a better chance of figuring out alignment before it starts. Even purely biological intelligence enhancement (e.g., embryo selection) might help, but that might not be any faster.
Aschenbrenner also wrote https://situational-awareness.ai/. Zvi wrote a review.
There's some UC San Francisco research to back up this view. California has the nation's biggest homeless population mainly due to unaffordable housing, not migration from elsewhere for a nicer climate.
I notice that https://metaphor.systems (mentioned here earlier) now redirects to Exa. Have you compared it to Phind (or Bing/Windows Copilot)?
Relatively healthy people do occasionally become homeless due to misfortune, but they usually don't stay homeless. It could be someone from the lower class living paycheck to paycheck who has a surprise expense they're not insured for and can't make rent. It could be a battered woman and her children escaping domestic abuse. They get services, they get back on their feet, they get housed. Ideally, the social safety nets would work faster and better than they do in practice, but the system basically works out for them.
The persistently homeless are a different story. They're often mentally ill and/or addicted to drugs. They don't have the will or capacity to take advantage of the available services. In the past, such people were often institutionalized. These institutions often didn't act in their best interests, but they at least kept them off the streets so they couldn't harass the rest of us. Your proposal probably wouldn't help them.
Why do we need more public bathrooms? I'm skeptical because if there was demand for more bathrooms, then I'd expect the market to produce them.
I wouldn't expect so. Why would you think that? Markets have a problem handling unpriced externalities without regulation. (Tragedy of the commons.) Pollution is a notable example of market failure, and the bathrooms issue is a special case of exactly this. Why pay extra to dispose of your waste properly if you can get away with dumping it elsewhere? As a matter of public health, it's better for everyone if this type of waste goes in the sewers and not in the alley, even if the perpetrators can't afford a coffee. How would you propose we stop the pollution? Fining them wouldn't help, even if we could catch them reliably, which would be expensive, because they don't have any money to take. Jailing them would probably cost taxpayers more than maintaining bathrooms would. Taxpayers are already paying for the sewer system (a highly appropriate use of taxation). This is just an expansion of the same.
The table isn't legible with LessWrong's dark theme.
I feel like these would be more effective if standardized, dated and updated. Should we also mention gag orders? Something like this?
As of June 2024, I have signed no contracts or agreements whose existence I cannot mention.
As of June 2024, I am not under any kind of gag order whose existence I cannot mention.
Last updated June 2024. I commit to updating at least annually.
Could LessWrong itself be compelled even if the user cannot? Should we include PGP signatures or something?
I thought it was mostly due to the high prevalence of autism (and the social anxiety that usually comes with it) in the community. The more socially agentic rationalists are trying.
But probably he should be better at communication e.g. realizing that people will react negatively to raising the possibility of nuking datacenters without lots of contextualizing.
Yeah, pretty sure Eliezer never recommended nuking datacenters. I don't know who you heard it from, but this distortion is slanderous and needs to stop. I can't control what everybody says elsewhere, but it shouldn't be acceptable on LessWrong, of all places.
He did talk about enforcing a global treaty backed by the threat of force (because all law is ultimately backed by violence, don't pretend otherwise). He did mention that destroying "rogue" datacenters (conventionally, by "airstrike") to enforce said treaty had to be on the table, even if the target datacenter is located in a nuclear power that might retaliate (possibly risking a nuclear exchange), because risking unfriendly AI is worse.
The argument chain you presented (Deep Learning -> Consciousness -> AI Armageddon) is a strawman. If you sincerely think that's our position, you haven't read enough. Read more, and you'll be better received. If you don't think that, stop being unfair about what we said, and you'll be better received.
Last I checked, most of us were agnostic on the AI Consciousness question. If you think that's a key point to our Doom arguments, you haven't understood us; that step isn't necessarily required; it's not a link in the chain of argument. Maybe AI can be dangerous, even existentially so, without "having qualia". But neither are we confident that AI necessarily won't be conscious. We're not sure how it works in humans, but it seems to be an emergent property of brains, so why not artificial brains as well? We don't understand how the inscrutable matrices work either, so it seems like a possibility. Maybe gradient descent and evolution stumbled upon similar machinery for similar reasons. AI consciousness is mostly beside the point. Where it does come up is usually not in the AI Doom arguments, but in questions about what we ethically owe AIs, as moral patients.
Deep Learning is also not required for AI Doom. Doom is a disjunctive claim; there are multiple paths for getting there. The likely-looking path at this point would go through the frontier LLM paradigm, but that isn't required for Doom. (However, it probably is required for most short timelines.)
You are not wrong to complain. That's feedback. But this feels too vague to be actionable.
First, we may agree on more than you think. Yes, groupthink can be a problem, and gets worse over time, if not actively countered. True scientists are heretics.
But if the science symposium allows the janitor to interrupt the speakers and take all day pontificating about his crackpot perpetual motion machine, it's also of little value. It gets worse if we then allow the conspiracy theorists to feed off of each other. Experts need a protected space to converse, or we're stuck at the lowest common denominator (incoherent yelling, eventually). We unapologetically do not want trolls to feel welcome here.
Can you accept that the other extreme is bad? I'm not trying to motte-and-bailey you, but moderation is hard. The virtue lies between the extremes, but not always exactly in the center.
What I want from LessWrong is high epistemic standards. That's compatible with opposing viewpoints, but only when they try to meet our standards, not when they're making obvious mistakes in reasoning. Some of our highest-karma posts have been opposing views!
Do you have concrete examples? In each of those cases, are you confident it's because of the opposing view, or could it be their low standards?
the problem "How do we stop people from building dangerous AIs?" was "research how to build AIs".
Not quite. It was to research how to build friendly AIs. We haven't succeeded yet. What research progress we have made points to the problem being harder than initially thought, and capabilities turned out to be easier than most of us expected as well.
Methods normal people would consider to stop people from building dangerous AIs, like asking governments to make it illegal to build dangerous AIs, were considered gauche.
Considered by whom? Rationalists? The public? The public would not have been so supportive before ChatGPT, because almost nobody expected general AI so soon, if they thought about the topic at all. It wasn't an option at the time. Talking about this at all was weird, or at least niche, certainly not something one could reasonably expect politicians to care about. That has changed, but only recently.
I don't particularly disagree with your prescription in the short term, just your history. That said, politics isn't exactly our strong suit.
But even if we get a pause, this only buys us some time. In the long(er) term, I think either the Singularity or some kind of existential catastrophe is inevitable. Those are the attractor states. Our current economic growth isn't sustainable without technological progress to go with it. Without that, we're looking at civilizational collapse. But with that, we're looking at ever widening blast radii for accidents or misuse of more and more powerful technology. Either we get smarter about managing our collective problems, or they will eventually kill us. Friendly AI looked like the way to do that. If we solve that one problem, even without world cooperation, it solves all the others for us. It's probably not the only way, but it's not clear the alternatives are any easier. What would you suggest?
I can think of three alternatives.
First, the most mundane (but perhaps most difficult), would be an adequate world government. This would be an institution that could easily solve climate change, ban nuclear weapons (and wars in general), etc. Even modern stable democracies are mostly not competent enough. Autocracies are an obstacle, and some of them have nukes. We are not on track to get this any time soon, and much of the world is not on board with it, but I think progress in the area of good governance and institution building is worthwhile. Charter cities are among the things I see discussed here.
Second might be intelligence enhancement through brain-computer interfaces. Neuralink exists, but it's early days. So far, it's relatively low bandwidth. Probably enough to restore some sight to the blind and some action to the paralyzed, but not enough to make us any smarter. It might take AI assistance to get to that point any time soon, but current AIs are not able, and future ones will be even more of a risk. This would certainly be of interest to us.
Third would be intelligence enhancement through biotech/eugenics. I think this looks like encouraging the smartest to reproduce more rather than the misguided and inhumane attempts of the past to remove the deplorables from the gene pool. Biotech can speed this up with genetic screening and embryo selection. This seems like the approach most likely to actually work (short of actually solving alignment), but this would still take a generation or two at best. I don't think we can sustain a pause that long. Any anti-AI enforcement regime would have too many holes to work indefinitely, and civilization is still in danger for the other reasons. Biological enhancement is also something I see discussed on LessWrong.
Yep. It would take a peculiar near-miss for an unfriendly AI to preserve Nature, but not humanity. Seemed obvious enough to me. Plants and animals are made of atoms it can use for something else.
By the way, I expect the rapidly expanding sphere of Darkness engulfing the Galaxy to happen even if things go well. The stars are enormous repositories of natural resources that happen to be on fire. We should put them out so they don't go to waste.