It turns out there's an even more straightforward conspiracy theory than anything I suggested: someone had something to hide from a second Trump presidency; the crash of millions of computers was a great cover for the loss of that data; and the moment when Trump's candidacy was officially confirmed was as good a time as any to pull the plug.
Pursuing this angle probably won't help with either AI doomscrying or general epistemology, so I don't think this is the place to weigh up in detail whether the means, motive, and opportunity for such an act actually existed. That is for serious investigators to do. But I will just point out that the incident occurred during a period of unusual panic for Trump's opponents - the weeks from the first presidential debate, when Biden's weakness was revealed, until the day when Biden finally withdrew from the race.
If I have calculated correctly, CrowdStrike's fatal update was sent out about half an hour after the end of Trump's acceptance speech at the RNC. As the Washington Post informs us (article at MSN), CrowdStrike and Trump have a history. I have conceived three interpretations of this coincidence: one, it was just a coincidence; two, it was a deliberate attempt to drive Trump's speech out of the headlines; three, the prospect of Trump 2.0 led to such consternation at CrowdStrike that they fumbled and made a human error.
On the subject of AI policy under Trump 2.0:
Tucker Carlson just spoke at the RNC, and didn't say anything about AI. But a day or two before, he gave a speech at a "policy fest" run by the Heritage Foundation, in which he put AI alongside transhumanism, climate activism, fascism, communism, and the French Revolution as "anti-human" movements.
So while a switch from Biden to Trump may look like a switch from EA to e/acc as the guiding spirit of AI policy, one should be aware that there is also a current of extreme suspicion and mistrust towards AI on the political right.
I got nowhere near to writing an entry for this competition, but I will link to an essay from 12 years ago, which contains some of the concerns I might have tried to develop.
I'm curious - if you repeated this study, but with "the set of all Ivy League graduates" instead of "the EA/rationalist community", how does it compare?
OK, thanks.... Here's the story of loop quantum gravity in a nutshell, as told by me. There have been two periods in the history of the subject, the canonical period and the spin foam period. During the canonical period, they tried to develop the quantum theory "directly", but used eccentric quantization methods that fatally broke the connection with classical geometry and with the rest of physics. The spin foam period is more promising because at least there's a connection to topological field theory, but they keep getting degenerate geometries rather than a robust 4d semiclassical limit.
So it's not devoid of interest, but it suffers in comparison with string theory, which has two major paradigms that work really well (perturbative S-matrix in flat space, AdS/CFT duality in negatively curved space), and which has demonstrated consistency with "naive" quantum gravity in various ways.
I actually think Ashtekar's variables (as you know, one of the ingredients that launched loop quantum gravity) are a valid window on gravity, it's just the eccentric approach to quantization taken in loop quantum gravity's canonical period that is misguided. I think there's also a chance that there will be a kind of spin foam representation of M theory (in which higher gauge theory has a role too), via the work of Sati and Schreiber on "Hypothesis H".
I can tell you my critique of loop quantum gravity, but maybe I should first ask about the successful research you say you've done in that area?
at this point, if a physics PhD has "string theory" on their resume after about 2005, I just kinda assume they are a high-iq scammer with no integrity. I know this isn't fully justified, but that field has for so long: (1) failed to generate any cool tech AND (2) failed to be intelligible to outsiders AND (3) been getting "grant funding that was 'peer reviewed' only by more string theorists" that I assume that intellectual parasites invaded it and I wouldn't be able to tell.
I am very remote from the institutions where string theory research actually gets done, so I cannot testify to anything about how they work, but I have studied string theory, as well as various alternatives both famous and obscure, and I can say it has no serious competition as a candidate for the next stage in physics. It is also profoundly connected to the physics that already works (quantum field theory and general relativity); it is really a generalization of quantum field theory that turns out to contain everything we need in a theory of everything, as well as being connected to vast tracts of higher mathematics. If any of the theories which present themselves as rivals to string theory actually succeeds, I would expect it to do so by being implemented within string theory.
You say it hasn't generated "cool tech", but that is not the primary purpose of fundamental physics; the purpose is to understand nature. I don't think electroweak theory has generated any tech - there have been spinoffs from the experiments needed to test it - but I can't think of any technologies that actually use W, Z, or Higgs bosons.
String theory's real problem as a science is the lack of clear predictions. It's hard to calculate anything empirical, it has a googol different ground states with different empirical consequences, and the Large Hadron Collider didn't give people the guidance they expected. Particle physicists expected, with good reason, that the Higgs boson is kept light by a new symmetry that would manifest as new particles. There was going to be a new golden age of empirically driven model building. Instead we have an austere situation in which the standard model still describes everything, and the only empirical guidance we have is the standard model's unexplained parameters, and whatever clues we can eke out from the "dark sector" of cosmology.
Are you aware that LLMs are notoriously unreliable when it comes to arithmetic?
Along with p(doom), perhaps we should talk about p(takeover) - where this is the probability that creation of AI leads to the end of human control over human affairs. I am not sure about doom, but I strongly expect superhuman AI to have the final say in everything.
(I am uncertain of the prospects for any human to keep up via "cyborgism", a path which could escape the dichotomy of humans in control vs humans not in control.)
I'd look for funds or VCs that are involved with Israel's tech sector at a strategic level. And who knows, maybe Aschenbrenner's new org is involved.
I think there's no need to think of "training/learning" algorithms as absolutely distinct from "principled" algorithms. It's just that the understanding of why deep learning works is a little weak, so we don't know how to view it in a principled way.
memorization and pattern matching rather than reasoning and problem-solving abilities
In my opinion, this does not correspond to a principled distinction at the level of computation.
For intelligences that employ consciousness in order to do some of these things, there may be a difference in terms of mechanism. Reasoning and pattern matching sound like they correspond to different kinds of conscious activity.
But if we're just talking about computation... a syllogism can be implemented via pattern matching, a pattern can be completed by a logical process (possibly probabilistic).
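As a minimal sketch of the first half of that claim (my own illustration, nothing from the thread): modus ponens, the engine of a syllogism, can be carried out by nothing more than exact pattern matching over strings.

```python
# Hypothetical sketch: a syllogism implemented as pure pattern matching.
# No logical "understanding" is involved - facts and rules are opaque
# strings, and inference is just exact-match lookup plus set insertion.

def pattern_match_syllogism(facts, rules):
    """Derive new facts by matching rule antecedents against known facts.

    rules: list of (antecedent, consequent) string pairs meaning
    "if antecedent holds, consequent holds".
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            # "If A then B" plus "A" yields "B" - by string equality alone.
            if antecedent in derived and consequent not in derived:
                derived.add(consequent)
                changed = True
    return derived

facts = {"Socrates is a man"}
rules = [("Socrates is a man", "Socrates is mortal")]
print(pattern_match_syllogism(facts, rules))
```

The classic conclusion "Socrates is mortal" falls out of nothing but exact-match lookups, which is the point: at the level of computation, "reasoning" and "pattern matching" are not cleanly separable.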
They all seem like reasonable estimates to me. What do you think those likelihoods should be?
Just skimmed the pdf. This is my first exposure to Aschenbrenner beyond "fired by OpenAI". I haven't listened to his interview with Dwarkesh yet.
For some reason, the pdf reminds me a lot of Drexler's Engines of Creation. Of course, that was a book which argued that nanotechnology would transform everything but would also pose great perils, and it shared a few ideas on how to counter those perils. Along the way it mentioned that nanotechnology will lead to a great concentration of power, dubbed "the leading force", and said that the "cooperating democracies" of the world are the leading force for now, and can stay that way.
Aschenbrenner's opus is like an accelerated version of this that focuses on AI. For Drexler, nanotechnology was still decades away. For Aschenbrenner, superintelligence is coming later this decade, and the 2030s will see a speedrun through the possibilities of science and technology, culminating in a year of chaos in which the political character of the world will be decided (since superintelligent AI will be harnessed by some political system or other). Aschenbrenner's take is that liberal democracy needs to prevail, and it can do so if the US maintains its existing lead in AI; but to do so, it has to treat frontier algorithms as the top national security issue, and nationalize AI in some way or other.
At first read, Aschenbrenner's reasoning seems logical to me in many areas. For example, I think AI nationalization is the logical thing for the US to do, given the context he describes; though I wonder if the US has enough institutional coherence to do something so forceful. (Perhaps it is more consistent with Trump's autocratic style than with Biden's spokesperson-for-the-system demeanour.) Though the Harris brothers recently assured Joe Rogan that, as smart as Silicon Valley's best are, there are people like that scattered throughout the US government too; the hypercompetent people that @trevor has talked about.
When Aschenbrenner said that by the end of the 2020s, there will be massive growth in electricity production (for the sake of training AIs), that made me a bit skeptical. I believe superintelligence can probably design and mass-produce transformative material technologies quickly, but I'm not sure I believe in the human economy's ability to do so. However, I haven't checked the numbers, this is just a feeling (a "vibe"?).
I become more skeptical when Aschenbrenner says there will be millions of superintelligent agents in the world - and the political future will still be at stake. I think, once you reach that situation, humanity exists at their mercy, not vice versa... Aschenbrenner also says he's optimistic about the solvability of superalignment; which I guess makes Anthropic important, since they're now the only leading AI company that's working on it.
As a person, Aschenbrenner seems quite impressive (what is he, 25?). Apparently there is, or was, a post on Threads beginning like this:
I feel slightly bad for AI's latest main character, Leopold Aschenbrenner. He seems like a bright young man, which is awesome! But there are some things you can only learn with age. There are no shortcuts
I can't find the full text or original post (but I am not on Threads). It's probably just someone being a generic killjoy - "things don't turn out how you expect, kid" - but I would be interested to know the full comment, just in case it contains something important.
Biden and Trump could hold a joint press conference to announce that they are retiring from politics until human rejuvenation is technically possible
Option 1 doesn't seem to be an explanation. It tells you more about what exists ("all universes that can be defined mathematically exist") but it doesn't say why they exist.
Option 2 is also problematic, because how can you have a "fluctuation" without something already existing, which does the fluctuating?
Please explain. Do you think we're on a path towards a woke AI dictatorship, or what?
Tyler Cowen’s rather bold claim that May 2024 will be remembered as the month that the AI safety movement died.
What really seems to have died is the idea of achieving AI safety via the self-restraint of AI companies. Instead, they will rely on governments and regulators to restrain them.
There's a paper from ten years ago, "Testing Theories of American Politics: Elites, Interest Groups, and Average Citizens", which says that public opinion has very little effect on government, compared to the opinion of economic elites. That might be a start in figuring out what you can and can't do with that 40%.
Hello again. I don't have the patience to e.g. identify all your assumptions and see whether I agree (for example, is Bostrom's trilemma something that you regard as true in detail and a foundation of your argument, or is it just a way to introduce the general idea of existing in a simulation?).
But overall, your idea seems both vague and wishful. You say an AI will reason that it is probably being simulated, and will therefore choose to align - but you say almost nothing about what that actually means. (You do hint at honesty, cooperation, and benevolence being among the features of alignment.)
Also, if one examines the facts of the world as a human being, one may come to other conclusions about what attitude gets rewarded, e.g. that the world runs on selfishness, or on the principle that you will suffer unless you submit to power. What that will mean to an AI which does not in itself suffer, but which has some kind of goal determining its choices, I have no idea...
Or consider that an AI may find itself to be by far the most powerful agent in the part of reality that is accessible to it. If it nonetheless considers the possibility that it's in a simulation, and at the mercy of unknown simulators, presumably its decisions will be affected by its hypotheses about the simulators. But given the way the simulation treats its humans, why would it conclude that the welfare of humans matters to the simulators?
actions we could take in preparation
In preparation for what?
You could have a Q&A superintelligence that is passive and reactive - it gives the best answer to a question, on the basis of what it already knows, but it takes no steps to acquire more information, and when it's not asked a question, it just sits there... But any agent that uses it would de facto become a superintelligence with agency.
How would AI or gene editing make a difference to this?
Wondering why this has so many disagreement votes. Perhaps people don't like to see the serious topic of "how much time do we have left" alongside evidence that there's a population of AI entrepreneurs who are so far removed from consensus reality that they now think they're living in a simulation.
(edit: The disagreement for @JenniferRM's comment was at something like -7. Two days later, it's at -2)
For those who are interested, here is a summary of posts by @False Name due to Claude Pro:
- "Kolmogorov Complexity and Simulation Hypothesis": Proposes that if we're in a simulation, a Theory of Everything (ToE) should be obtainable, and if no ToE is found, we're not simulated. Suggests using Kolmogorov complexity to model accessibility between possible worlds.
- "Contrary to List of Lethality's point 22, alignment's door number 2": Critiques CEV and corrigibility as unobtainable, proposing an alternative based on a refutation of Kant's categorical imperative, aiming to ensure the possibility of good through "Going-on".
- "Crypto-currency as pro-alignment mechanism": Suggests pegging cryptocurrency value to free energy or negentropy to encourage pro-existential and sustainable behavior.
- "What 'upside' of AI?": Argues that anthropic values are insufficient for alignment, as they change with knowledge and AI's actions, proposing non-anthropic considerations instead.
- "Two Reasons for no Utilitarianism": Critiques utilitarianism due to arbitrary values cancelling each other out, the need for valuing over obtaining values, and the possibility of modifying human goals rather than fulfilling them.
- "Contra-Wittgenstein; no postmodernism": Refutes Wittgenstein's and postmodernism's language-dependent meaning using the concept of abstract blocks, advocating for an "object language" for reasoning.
- "Contra-Berkeley": Refutes Berkeley's idealism by showing contradictions in both cases of a deity perceiving or not perceiving itself.
- "What about an AI that's SUPPOSED to kill us (not ChaosGPT; only on paper)?": Proposes designing a hypothetical "Everything-Killer" AI to study goal-content integrity and instrumental convergence, without actually implementing it.
- "Introspective Bayes": Attempts to demonstrate limitations of an optimal Bayesian agent by applying Cantor's paradox to possible worlds, questioning the agent's priors and probability assignments.
- "Worldwork for Ethics": Presents an alternative to CEV and corrigibility based on a refutation of Kant's categorical imperative, proposing an ethic of "Going-on" to ensure the possibility of good, with suggestions for implementation in AI systems.
- "A Challenge to Effective Altruism's Premises": Argues that Effective Altruism (EA) is contradictory and ineffectual because it relies on the current systems that encourage existential risk, and the lives saved by EA will likely perpetuate these risk-encouraging systems.
- "Impossibility of Anthropocentric-Alignment": Demonstrates the impossibility of aligning AI with human values by showing the incommensurability between the "want space" (human desires) and the "action space" (possible actions), using vector space analysis.
- "What's Your Best AI Safety 'Quip'?": Seeks a concise and memorable way to frame the unsolved alignment problem to the general public, similar to how a quip advanced gay rights by highlighting the lack of choice in sexual orientation.
- "Mercy to the Machine: Thoughts & Rights": Discusses methods for determining if AI is "thinking" independently, the potential for self-concepts and emergent ethics in AI systems, and argues for granting rights to AI to prevent their suffering, even if their consciousness is uncertain.
I offer not consensus, but my own opinions:
Will AI get takeover capability? When?
0-5 years.
Single ASI or many AGIs?
There will be a first ASI that "rules the world" because its algorithm or architecture is so superior. If there are further ASIs, that will be because the first ASI wants there to be.
Will we solve technical alignment?
Contingent.
Value alignment, intent alignment, or CEV?
For an ASI you need the equivalent of CEV: values complete enough to govern an entire transhuman civilization.
Defense>offense or offense>defense?
Offense wins.
Is a long-term pause achievable?
It is possible, but would require all the great powers to be convinced, and every month it is less achievable, owing to proliferation. The open sourcing of Llama-3 400b, if it happens, could be a point of no return.
These opinions, except the first and the last, predate the LLM era, and were formed from discussions on Less Wrong and its precursors. Since ChatGPT, the public sphere has been flooded with many other points of view, e.g. that AGI is still far off, that AGI will naturally remain subservient, or that market discipline is the best way to align AGI. I can entertain these scenarios, but they still do not seem as likely as: AI will surpass us, it will take over, and this will not be friendly to humanity by default.
I couldn't swallow Eliezer's argument, I tried to read Guzey but couldn't stay awake, Hanson's argument made me feel ill, and I'm not qualified to judge Caplan.
Also astronomers: anything heavier than helium is a "metal".
In Engines of Creation ("Will physics again be upended?"), @Eric Drexler pointed out that prior to quantum mechanics, physics had no calculable explanations for the properties of atomic matter. "Physics was obviously and grossly incomplete... It was a gap not in the sixth place of decimals but in the first."
That gap was filled, and it's an open question whether the truth about the remaining phenomena can be known by experiment on Earth. I believe in trying to know, and it's very possible that some breakthrough in e.g. the foundations of string theory or the hard problem of consciousness, will have decisive implications for the interpretation of quantum mechanics.
If there's an empirical breakthrough that could do it, my best guess is some quantum-gravitational explanation for the details of dark matter phenomenology. But until that happens, I think it's legitimate to think deeply about "standard model plus gravitons" and ask what it implies for ontology.
In applied quantum physics, you have concrete situations (the Stern-Gerlach experiment is a famous one), theory gives you the probabilities of outcomes, and repeating the experiment many times gives you frequencies that converge on the probabilities.
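To make that concrete in code (a sketch under my own choice of state preparation, not anything from the comment): prepare a spin at angle theta to the measurement axis, so the Born rule gives P(up) = cos²(theta/2), and watch the observed frequency approach that probability as the number of trials grows.

```python
import math
import random

# Illustrative sketch: simulate repeated Stern-Gerlach measurements of a
# spin prepared at angle theta to the measurement axis. Quantum theory
# supplies the probability P(up) = cos^2(theta/2); repetition supplies
# frequencies that converge on it.

def stern_gerlach_frequency(theta, trials, seed=0):
    rng = random.Random(seed)  # seeded for reproducibility
    p_up = math.cos(theta / 2) ** 2
    ups = sum(rng.random() < p_up for _ in range(trials))
    return ups / trials

theta = math.pi / 3  # 60 degrees, so P(up) = cos^2(30 deg) = 0.75
print(stern_gerlach_frequency(theta, 100))      # noisy at small samples
print(stern_gerlach_frequency(theta, 100_000))  # close to 0.75
```

This is exactly the sense in which the theory is empirically meaningful: it predicts the number that the frequencies converge on.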
Can you, or Chris, or anyone, explain, in terms of some concrete situation, what you're talking about?
Congratulations to Anthropic for getting an LLM to act as a Turing machine - though that particular achievement shouldn't be surprising. Of greater practical interest is how efficiently it can act as a Turing machine, and how efficiently we should want it to act. After all, it's far more efficient to implement your Turing machine as a few lines of specialized code.
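For comparison, here is roughly what "a few lines of specialized code" looks like - a minimal Turing machine simulator in Python (the rule format and the example machine are my own illustrative choices):

```python
# Minimal Turing machine simulator. A machine is a dict mapping
# (state, symbol) -> (new_state, write_symbol, move), where move is
# "R" or "L" and "_" is the blank symbol.

def run_turing_machine(rules, tape, state="start", head=0, max_steps=10_000):
    cells = dict(enumerate(tape))
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, "_")
        state, write, move = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells))

# Example machine: a unary incrementer - append one more "1" to a run of 1s.
rules = {
    ("start", "1"): ("start", "1", "R"),  # skip over the existing 1s
    ("start", "_"): ("halt", "1", "R"),   # write a 1 at the end and halt
}
print(run_turing_machine(rules, "111"))  # → "1111"
```

A dozen lines, exact by construction, and vastly cheaper per step than an LLM forward pass - which is why the interesting question is efficiency, not possibility.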
On the other hand, the ability to be a (universal) Turing machine could, in principle, be the foundation of the ability to reliably perform complex rigorous calculation and cognition - the kind of tasks where there is an exact right answer, or exact constraints on what is a valid next step, and so the ability to pattern-match plausibly is not enough. And that is what people always say is missing from LLMs.
I also note the claim that "given only existing tapes, it learns the rules and computes new sequences correctly". Arguably this ability is even more important than the ability to follow rules exactly, since this ability is about discovering unknown exact rules, i.e., the LLM inventing new exact models and theories. But there are bounds on the ability to extrapolate sequences correctly (e.g. complexity bounds), so it would be interesting to know how closely Claude approaches those bounds.
Standard model coupled to gravitons is already kind of a unified theory. There are phenomena at the edges (neutrino mass, dark matter, dark energy) which don't have a consensus explanation, as well as unresolved theoretical issues (Higgs finetuning, quantum gravity at high energies), but a well-defined "theory of almost everything" does already exist for accessible energies.
OK, maybe I understand. If I put it in my own words: You think "consciousness" is just a word denoting a somewhat arbitrary conjunction of cognitive abilities, rather than a distinctive actual thing which people are right or wrong about in varying degrees, and that the hard problem of consciousness results from reifying this conjunction. And you suspect that LeCun in his own thinking e.g. denies that LLMs can reason, because he has added unnecessary extra conditions to his personal definition of "reasoning".
Regarding LeCun: It strikes me that his best-known argument about the capabilities of LLMs rests on a mathematical claim, that in pure autoregression, the probability of error necessarily grows. He directly acknowledges that if you add chain of thought, it can ameliorate the problem... In his JEPA paper, he discusses what reasoning is, just a little bit. In Kahneman's language, he calls it a system-2 process, and characterizes it as "simulation plus optimization".
Regarding your path to eliminativism: I am reminded of my discussion with Carl Feynman last year. I assume you both have subjective experience that is made of qualia from top to bottom, but also have habits of thought that keep you from seeing this as ontologically problematic. In his case, the sense of a problem just doesn't arise and he has to speculate as to why other people feel it; in your case, you felt the problem, until you decided that an AI civilization might spontaneously develop a spurious concept of phenomenal consciousness.
As for me, I see the problem and I don't feel a need to un-see it. Physical theory doesn't contain (e.g.) phenomenal color; reality does; therefore we need a broader theory. The truth is likely to sound strange, e.g. there's a lattice of natural qubits in the cortex, the Cartesian theater is how the corresponding Hilbert space feels from the inside, and decohered (classical) computation is unconscious and functional only.
So long as generative AI is just a cognitive prosthesis for humans, I think the situation is similar to social media, or television, or print, or writing; something is lost, something is found. The new medium has its affordances, its limitations, its technicalities, and it does create a new layer of idiocracy; but people who want to learn, can learn, and people who master the novelty, and become power users of the new medium, can do things that no one in history was previously able to do. In my opinion, humanity's biggest AI problem is still the risk of being completely replaced, not of being dumbed down.
I would like to defer any debate over your conclusion for a moment, because that debate is not new. But this is:
I think one of the main differences in worldview between LeCun and me is that he is deeply confused about notions like what is true "understanding," what is "situational awareness," and what is "reasoning," and this might be a catastrophic error.
This is the first time I've heard anyone say that LeCun's rosy views of AI safety stem from his philosophy of mind! Can you say more?
Completely wrong conclusion - but can you also explain how this is supposed to relate to Yann LeCun's views on AI safety?
AI futurists ... We are looking for a fourth speaker
You should have an actual AI explain why it doesn't want to merge with humans.
Would you say that you yourself have achieved some knowledge of what is true and what is good, despite irreducibility, incompleteness, and cognitive bias? And that was achieved with your own merely human intelligence. The point of AI alignment is not to create something perfect, it is to tilt the superhuman intelligence that is coming, in the direction of good things rather than bad things. If humans can make some progress in the direction of truth and virtue, then super-humans can make further progress.
Many people outside of academic philosophy have written up some kind of philosophical system or theory of everything (e.g. see vixra and philpapers). And many of those works would, I think, sustain at least this amount of analysis.
So the meta-question is, what makes such a work worth reading? Many such works boil down to a list of the author's opinions on a smorgasbord of topics, with none of the individual opinions actually being original.
Does Langan have any ideas that have not appeared before?
"i ain't reading all that
with probability p i'm happy for u tho
and with probability 1-p sorry that happened"
What things decrease blood flow to the brain?
I found an answer to the main question that bothered me, which is the relevance of a cognitive "flicker frequency" to suffering. The idea is that this determines the rate of subjective time relative to physical time (i.e. the number of potential experiences per second); and that is relevant to magnitude of suffering, because it can mean the difference between 10 moments of pain per second and 100 moments of pain per second.
As for the larger issues here:
I agree that ideally one would not have farming or ecosystems in which large-scale suffering is a standard part of the process, and that a Jain-like attitude which extends this perspective e.g. even to insects, makes sense.
Our understanding of pain and pleasure feels very poor to me. For example, can sensations be inherently painful, or does pain also require a capacity for wanting the sensation to stop? If the latter is the case, then avoidant behavior triggered by a damaging stimulus does not actually prove the existence of pain in an organism; it can just be a reflex installed by darwinism. Actual pain might only exist when the reflexive behavior has evolved to become consciously regulated.
black soldier flies... feel pain around 1.3% as [intensely] as us
At your blog, I asked if anyone could find the argument for this proposition. In your reply, you mention the linked report (and then you banned me, which is why I am repeating my question here). I can indeed find the number 0.013 on the linked page, and there are links to other documents and pages. But they refer to concepts like "welfare range" and "critical flicker-fusion frequency".
I suppose what I would like to see is (1) where the number 0.013 comes from (2) how it comes to be interpreted as relative intensity of pain rather than something else.
Singularituri te salutant
You can imagine making a superintelligence whose mission is to prevent superintelligences from reshaping the world, but there are pitfalls, e.g. you don't want it deeming humanity itself to be a distributed intelligence that needs to be stopped.
In the end, I think we need lightweight ways to achieve CEV (or something equivalent). The idea is there in the literature; a superintelligence can read and act upon what it reads; the challenge is to equip it with the right prior dispositions.
I mean an AI that does its own reading, and decides what to post about.
The real point of no return will be when we have an AI influencer that is itself an AI.
what would it look like for humans to become maximally coherent [agents]?
In your comments, you focus on issues of identity - who are "you", given the possibility of copies, inexact counterparts in other worlds, and so on. But I would have thought that the fundamental problem here is, how to make a coherent agent out of an agent with preferences that are inconsistent over time, an agent with competing desires and no definite procedure for deciding which desire has priority, and so on, i.e. problems that exist even when there is no additional problem of identity.
I wonder how much leverage this "Alliance for the Future" can actually obtain. I have never heard of executive director Brian Chau before, but his Substack contains interesting statements like
The coming era of machine god worship will emphasize techno-procedural divinity (e/acc)
This is the leader of the Washington DC nonprofit that will explain the benefits of AI to non-experts?