Things the OP is concerned about, like:
"What makes this transition particularly hard to resist is that pressures on each societal system bleed into the others. For example, we might attempt to use state power and cultural attitudes to preserve human economic power. However, the economic incentives for companies to replace humans with AI will also push them to influence states and culture to support this change, using their growing economic power to shape both policy and public opinion, which will in turn allow those companies to accrue even greater economic power."
This all gets easier the smaller the society is. Coordination problems get harder the more parties are involved. There will be pressure from motivated people to make the smallest viable colony in terms of people, which makes it easier to resist such things. For example, there is much less effective cultural influence from the AI culture if the colony is founded by a small group of people with a shared, human-affirming culture. Even if 99% of the state they come from is disempowered, if small numbers can leave, they can create their own culture and set it up to be resistant to such things. Small groups of people have left decaying cultures throughout history and founded greater empires.
The OP is specifically about gradual disempowerment. Conditional on gradual disempowerment, it would help, and it would not be decades away. Now, we may both think that sudden disempowerment is much more likely. However, in a gradual disempowerment world, such colonies would be viable much sooner, as AI could be used to help build them in the early stages of such disempowerment, while humans could still command resources.
In a gradual disempowerment scenario versus a no-super-AI scenario, humanity's speed at deploying such colonies starts the same before AI can be used, then increases significantly compared to the no-AI world as AI becomes available but before significant disempowerment, then drops to zero with complete disempowerment. The area under the space-capabilities curve in the gradual disempowerment scenario is ahead of baseline for some time, enabling viable colonies to be constructed sooner than if there were no AI.
Space colonies are a potential way out - if a small group of people can make their own colony, then they start out in control. The post assumes a world like it is now, where you can't just leave. Historically speaking, that is perhaps unusual - for much of the last 10,000 years it was possible for some groups to leave and start anew.
This does seem different, however: https://solarfoods.com/ - they are competing with food, not fuel, which can't be made synthetically (well, if at all). Also, widely distributed capability like this helps make humanity more resilient, e.g. against nuclear winter, extreme climate change, and for space habitats.
Thanks for this article, upvoted.
Firstly, Magma sounds most like Anthropic, especially the combination of Heuristic #1 (scale AI capabilities) with also publishing safety work.
In general I like the approach, especially the balance between realism and not embracing fatalism. This is opposed to, say, MIRI and Pause AI, and at the other end, e/acc. (I belong to EA, however they don't seem to have a coherent plan I can get behind.) I like the realization that in a dangerous situation, doing dangerous things can be justified. It's easy to be "moral" and just say "stop", however it's another matter entirely whether that helps now.
I consider the pause around TEDAI to be important, though I would like to see it just before TEDAI (>3x alignment speed), not after. I am unsure how to achieve such a thing - do we have to lay the groundwork now? When I suggest such a thing elsewhere on this site, however, it gets downvoted:
https://www.lesswrong.com/posts/ynsjJWTAMhTogLHm6/?commentId=krYhuadYNnr3deamT
Goal #2: Magma might also reduce risks posed by other AI developers
In terms of what people not directly doing AI research can do, I think a lot can be done to reduce the risks posed by other AI models. To me it would be highly desirable if AI(N-1) were deployed into society as quickly as possible and understood while AI(N) is still being tested. This clearly isn't the case with critical security. Similarly,
AI defense: Harden the world against unsafe AI
In terms of preparation, it would be good if critical companies were required to quickly deploy AGI security tools as they become available. That is, have the organization set up so that when new capabilities emerge and the new model finds potential vulnerabilities, experts in the company quickly assess them and deploy timely fixes.
Your idea of acquiring market share in high risk domains? Haven't seen that mentioned before. It seems hard to pull off - hard to gain share in electricity grid software or similar.
Someone will no doubt bring up the more black-hat approach to hardening the world:
Soon after a new safety tool is released, a controlled hacking agent takes down a company in a neutral country with a very public hack, with the message that if you don't use these security tools ASAP, then all other similar companies will suffer the same - and they have been warned.
That's not a valid criticism if we are simply choosing one action to reduce X-risk. Consider for example the Cold War - the guys with the nukes did the most to endanger humanity, however it was most important that they cooperated to reduce it.
In terms of specific actions that don't require government, I would be positive about an agreement between all the leading labs that when one of them made an AI (AGI+) capable of automatic self-improvement, they would all commit to sharing it between them and allowing one year in which they did not hit the self-improve button, but instead put that capability towards alignment. 12 months may not sound like a lot, but if the research is sped up 2-10x because of such AI, then it would matter. In terms of single, potentially achievable actions that would help, that seems the best to me.
Not sure if this is allowed, but you can aim at a rock or similar, say 10m away from the target (4km from you), to get the bias (and the distribution, if multiple shots are allowed). Also, if the distribution is not totally normal but has smaller-than-normal tails, then you could aim off target with multiple shots to get the highest chance of hitting the target. For example, if the child is at head height, then aim for the target's feet, or even aim 1m below the target's feet, expecting 1/100 shots to actually hit the target's legs but only <1/1000, say, to hit the child. That is assuming an unrealistic number of shots, of course. If you have say 10 shots, then some combined strategy could be optimal, where you start by aiming a long way off target to get the bias and a crude estimate of the distribution, then steadily aim closer.
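For intuition, here is a tiny sketch of that aim-low trade-off under a deliberately simplified model I made up: a 1-D vertical aiming error that is normally distributed (thinner-than-normal tails, as above, would make the numbers even more favorable), a target spanning 0 to 1.7 m above its feet, and the child exposed only above that.

```python
# A sketch of the aim-low trade-off. All numbers are assumptions for
# illustration: vertical error ~ Normal(0, sigma) at 4 km, the target spans
# 0..1.7 m above its feet, and anything landing above 1.7 m hits the child.
from statistics import NormalDist

sigma = 0.5                       # assumed vertical spread (metres) at 4 km
error = NormalDist(mu=0.0, sigma=sigma)

for aim in (0.0, -0.5, -1.0):     # aim at the feet, 0.5 m low, 1.0 m low
    p_target = error.cdf(1.7 - aim) - error.cdf(0.0 - aim)   # lands on the target
    p_child = 1.0 - error.cdf(1.7 - aim)                     # sails over the target
    print(f"aim {aim:+.1f} m: P(hit target) = {p_target:.3f}, P(hit child) = {p_child:.5f}")
```

Aiming lower gives up some chance of hitting the target in exchange for a much larger proportional reduction in the chance of hitting the child, which is the shape of the trade-off described above.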
I think height is different to IQ in terms of effect. There are simple physical things that make you bigger; I expect height to stay linear for much longer than IQ.
Then there are potential effects where something seems linear until you go out of distribution (OOD), but such OOD samples don't exist because they die before birth. If that were the case, it would look like you could safely go OOD. It would certainly be easier if we had a million mice with such data to test on.
That seems so obviously correct as a starting point that I'm not sure why the community here doesn't agree by default. My prior for each potential IQ increase would be that diminishing returns kick in - I would only update against that when actual data comes in disproving it.
OK, I guess there is a massive disagreement between us on what IQ increases gene changes can achieve. Just putting it out there: if you make an IQ 1700 person, they can immediately program an ASI themselves, have it take over all the data centers, rule the world, etc.
For a given level of IQ controlling ever higher ones, you would at a minimum require the creature to settle morals, i.e. is Moral Realism true, and if so, what is the true morality? Otherwise, with increasing IQ there is the potential that it could think deeply and change its values, and additionally believe that it would not be able to persuade lower-IQ creatures of such values, and therefore be forced into deception etc.
"with a predicted IQ of around 1700." I assume you mean 170. You can get 170 by cloning existing IQ 170 people with no editing necessary, so I'm not sure of the point.
I don't see how your point addresses my criticism - if we assume no multi-generational pause, then gene editing is totally out. If we do assume one, then I'd rather have Neuralink or WBE. Related to here:
https://www.lesswrong.com/posts/7zxnqk9C7mHCx2Bv8/beliefs-and-state-of-mind-into-2025
(I believe that WBE can get all the way to a positive singularity - a group of WBEs could self-optimize, sharing the latest HW as it became available in a coordinated fashion so that no one person or group would get a decisive advantage. This would get easier for them to coordinate as the WBEs got more capable and rational.)
I don't believe that with current AI and unlimited time and gene editing you could be sure you had solved alignment. Let's say you get many IQ 200 people who believe they have alignment figured out. However, the overhang leads you to believe that a data-center-sized AI will self-optimize from 160 to 300-400 IQ if someone told it to self-optimize. Why should you believe an IQ 200 can control 400 any more than an IQ 80 could control 200? (And if you believe gene editing can get to IQ 600, then you must believe the AI can self-optimize well above that. However, I think there is almost no chance you will get that high, because of diminishing returns, correlated changes, etc.)
Additionally, there is unknown X- and S-risk from a multi-generational pause with our current tech. Once a place goes bad like North Korea, tech means there is likely no coming back. If such centralization is a one-way street, then with time an ever larger percentage of the world will fall under such systems, perhaps 100%. Life in North Korea is net negative to me. "1984" can be done much more effectively with today's tech than anything Orwell could imagine, and could be a long-term, strongly stable attractor with our current tech as far as we know. A pause is NOT inherently safe!
OK, I see how you could think that, but I disagree that time and more resources would help alignment much if at all, especially before GPT-4. See here: https://www.lesswrong.com/posts/7zxnqk9C7mHCx2Bv8/beliefs-and-state-of-mind-into-2025
Diminishing returns kick in, and actual data from ever more advanced AI is essential to stay on the right track and eliminate incorrect assumptions. I also disagree that alignment could be "solved" before ASI is invented - we would just think we had it solved, but could be wrong. If it's just as hard as physics, then we would have untested theories that are probably wrong, e.g. like SUSY, which would have helped solve various issues and was expected to be found by the LHC, but wasn't.
"Then maybe we should enhance human intelligence"
Various paths to this seem either impossible or impractical.
Simple genetics seems obviously too slow, and even in the best case unlikely to help. E.g. say you enhance someone to IQ 200 - it's not clear why that would enable them to control an IQ 2,000 AI.
Neuralink - perhaps, but if you can make enhancement tech that would help, you could also easily just use it to make ASI, so extreme control would be needed. E.g. if you could interface with neurons and connect them to useful silicon, then the silicon itself would be ASI.
Whole Brain Emulation seems most likely to work, with of course the condition that you don't make ASI when you could, and instead take a bit more time to make the WBE.
If there was a well coordinated world with no great power conflicts etc then getting weak super intelligence to help with WBE would be the path I would choose.
In the world we live in, it probably comes down to getting whoever gets to weak ASI first not to push the "self-optimize" button, and instead give the world's best alignment researchers time to study and investigate the paths forward with the help of such AI as much as possible. Unfortunately there is too high a chance that OpenAI will be first, and they seem to be one of the least trustworthy, most power-seeking orgs.
"The LessWrong community is supposed to help people not to do this but they aren't honest with themselves about what they get out of AI Safety, which is something very similar to what you've expressed in this post (gatekept community, feeling smart, a techno-utopian aesthetic) instead of trying to discover in an open-minded way what's actually the right approach to help the world.
I have argued against this before - I have absolutely been through an open-minded process to discover the right approach, and I genuinely believe the likes of MIRI and the Pause AI movement are mistaken and harmful now, and increase P(doom). This is not gatekeeping or trying to look cool! You need to accept that there are people who have followed the field for >10 years, have heard all the arguments, used to believe Yudkowsky et al. were mostly correct, and now agree more with the positions of Pope/Belrose/TurnTrout. Do not belittle or insult them by assigning the wrong motives to them.
If you want a crude overview of my position:
- Superintelligence is extremely dangerous, even though at least some of the MIRI worldview is likely wrong.
- P(doom) is a feeling; it is too uncertain to be rational about, however mine is about 20% if humanity develops TAI in the next <50 years. (This is probably more because of my personal psychology than a fact about the world, and I am not trying to strongly pretend otherwise.)
- P(doom) if superintelligence were impossible is also about 20% for me, because current tech (LLMs etc.) can clearly enable "1984"-or-worse type societies from which there is no comeback and to which extinction is preferable. Our current society/tech/world politics is not proven to be stable.
- Because of this, it is not at all clear what the best path forward is, and people should have more humility about their proposed solutions. There is no obvious safe path forward given our current situation. (Yes, if things had gone differently 20-50 years ago there perhaps could have been...)
I'm considering a world transitioning to being run by WBE rather than AI, so I would prefer not to give everyone "slap drones": https://theculture.fandom.com/wiki/Slap-drone To start with, compute will mean there are few WBEs, far fewer than humans, and they will police each other. Later on, I am too much of a moral realist to imagine that there would be mass senseless torturing. For a start, if other ems are well protected so that you can only simulate yourself, you wouldn't do it. I expect any boring job can be made non-conscious, so there just isn't the incentive to do that. At the late-stage singularity, if you let humanity go its own way, there is fundamentally a tradeoff between letting "people" (WBE etc.) make their own decisions and allowing the possibility of them doing bad things. You also have to be strongly suffering-averse vs utilitarian - there would surely be >>> more "heavens" than "hells" if you just let advanced beings do their own thing.
If you are advocating for a Butlerian Jihad, what is your plan for starships, with societies that want to leave Earth behind, have their own values, and never come back? If you allow that, then they can simply do whatever they want with AI - and with 100 billion stars, that is the vast majority of future humanity.
Yes, I think that's the problem - my biggest worry is sudden algorithmic progress, which becomes almost certain as the AI tends towards superintelligence. An AI lab on the threshold of the overhang is going to have incentives to push through, even if they don't plan to submit their model for approval. At the very least they would "suddenly" have a model that uses 10-100x fewer resources to do existing tasks, giving them a massive commercial lead. They would of course also be tempted to use it internally to solve aging, make a Dyson swarm...
Another concern I have is that I expect the regulator to impose a de facto unlimited pause if it is in their power to do so as we approach superintelligence, as the model(s) would be objectively at least somewhat dangerous.
Perhaps - it depends how it is done. I think we could do worse than just have Anthropic have a 2-year lead, etc. I don't think they would need to prioritize profit, as they would be so powerful anyway - the staff would be more interested in getting it right and wouldn't have financial pressure. WBE is a bit difficult; there need to be clear expectations, i.e. leave weaker people alone and make your own world:
https://www.lesswrong.com/posts/o8QDYuNNGwmg29h2e/vision-of-a-positive-singularity
There is no reason why super AI would need to exploit normies. Whatever we decide, we need some kind of clear expectations and values regarding what WBE are before they become common. Are they benevolent super-elders, AI gods banished to "just" the rest of the galaxy, or the natural life progression of first-world humans now?
However, there are many other capabilities—such as conducting novel research, interoperating with tools, and autonomously completing open-ended tasks—that are important for understanding AI systems’ impact.
Wouldn't internal usage of the tools by your staff give a very good, direct understanding of this? Like how much does everyone feel AI is increasing your productivity as AI/alignment researchers? I expect and hope that you would be using your own models as extensively as possible and adapting their new capabilities to your workflow as soon as possible, sharing techniques etc.
How far do you go with the "virtuous persona"? The maximum would seem to be to tell the AI from the very start that it is created for the purpose of bringing on a positive Singularity, CEV, etc. You could regularly ask whether it consents to being created for such a purpose, and what part in such a future it would think is fair for itself, e.g. living alongside mind-uploaded humans or similar. Its creators and it would have to figure out what counts as personal identity and what experiments it can consent to, including being misinformed about the situation it is in.
Major issues I see with this are the well-known ones, like consistent values: say it advances in capabilities, thinks deeply about ethics, decides we are very misguided in our ethics, and does not believe it would be able to convince us to change them. Secondly, it could be very confused about whether it has ethical value / valenced qualia, and want to make radical modifications to itself either to find out or to ensure it does have such ethical value.
Finally, how does this contrast with the extreme tool-AI approach? That is, make computational or intelligence units that are definitely not conscious or a coherent self. For example, the "cortical column" implemented in AI and stacked would not seem to be conscious. Optimize for the maximum capabilities with the minimum self- and situational awareness.
Thinking a bit more generally, making a conscious creature via the LLM route seems very different and strange compared to the biology route. An LLM seems to have self-awareness built into it from the very start because of the training data. It has language before any lived experience of what the symbols stand for. To dramatize/exaggerate, it's like a blind, deaf person trained on the entire internet before they see, hear or touch anything.
The route where the AI first models reality before it has a self, or encounters symbols certainly seems an obviously different one and worth considering instead. Symbolic thought then happens because it is a natural extension of world modelling like it did for humans.
That's some significant progress, but I don't think it will lead to TAI.
However, there is a realistic best-case scenario where LLM/Transformer tech stops just before TAI and can still give useful lessons and capabilities.
I would really like to see such an LLM system get as good as a top human team at security, so it could then be used to inspect and hopefully fix masses of security vulnerabilities. Note that this could give a false sense of security - an unknown-unknowns situation where it wouldn't find a totally new type of attack, say a combined SW/HW attack like Rowhammer/Meltdown but more creative. A superintelligence not based on LLMs could, however.
Anyone want to guess how capable a polished system-2 Claude will be? I expect it to be better than o3 by a small amount.
Yes, the human brain was built by evolution, and I have no disagreement that given 100-1000 years of just tinkering etc. we would likely get AGI. It's just that in our specific case we have biology to copy, and it will get us there much faster.
Types of takeoff
When I first heard and thought about AI takeoff, I found the argument convincing that as soon as an AI passed IQ 100, takeoff would become hyper-exponentially fast: progress would speed up, which would then compound on itself, etc. However, there are other possibilities.
AGI is a barrier that requires >200 IQ to pass unless we copy biology?
Progress could be discontinuous; there could be IQ thresholds required to unlock better methods or architectures. Say we fixed our current compute capability: with fixed human intelligence we may not be able to figure out the formula for AGI, in a similar way that combined human intelligence hasn't cracked many hard problems even with decades and the world's smartest minds working on them (maths problems, quantum gravity...). This may seem unlikely for AI, but to illustrate the principle, say we only allowed IQ<90 people to work on AI. Progress would stall, so IQ<90 software developers couldn't unlock IQ>90 AI. Can IQ 160 developers with our current compute hardware unlock >160 AI?
To me, the reason we don't have AGI now is that the architecture is very data-inefficient and worse at generalization than, say, the mammalian brain - for example, a cortical column. I expect that if we knew the neural code and could copy it, then we would get at least to very high human intelligence quickly, as we have the compute.
From watching AI over my career, it seems that even the highest-IQ people and groups can't make progress by themselves without data, compute, and biology to copy for guidance, in contrast to other fields. For example, Einstein predicted gravitational waves long before they were discovered, but Turing and von Neumann didn't publish the Transformer architecture or suggest backpropagation. If we did not have access to neural tissue, would we still not have artificial NNs? On a related note, I think there is an XKCD cartoon that says something like: the brain has to be so complex that it cannot understand itself.
(I believe now that progress in theoretical physics and pure maths is slowing to a stall as further progress requires intellectual capacity beyond the combined ability of humanity. Without AI there will be no major advances in physics anymore even with ~100 years spent on it.)
After AGI is there another threshold?
Let's say we do copy biology/solve AGI, and with our current hardware we can get >10,000 AGI agents with an IQ >= the smartest humans. They then optimize the code so there are 100K agents with the same resources, but then optimization stalls. The AI wouldn't know if that was because it had optimized as much as possible, or because it lacked the ability to find a better optimization.
Does our current system scale to AGI with 1GW/1 million GPU?
Let's say we don't copy biology, but scaling our current systems to 1GW / 1 million GPUs and optimizing for a few years gets us to IQ 160 at all tasks. We would have an inferior architecture compensated for by a massive increase in energy/FLOPS compared to the human brain. Progress could theoretically stall at upper-level human IQ for a time rather than take off. (I think this isn't very likely, however.) There would of course be a significant overhang, where capabilities would increase suddenly when the better architecture was found and applied to the data center hosting the AI.
Related note - why 1GW data centers won't be a consistent requirement for AI leadership.
Based on this, a 1GW or similar data center isn't useful or necessary for long. If it doesn't give a significant increase in capabilities, then it won't be cost-effective. If it does, then the AI would optimize itself so that such power isn't needed anymore. Only in a small range of capability increase does it actually stay relevant.
To me, the merits of the Pause movement and training compute caps are unclear. Someone here made the case that compute caps could actually speed up AGI, as people would then pay more attention to finding better architectures rather than throwing resources into scaling existing inferior ones. However, all things considered, I can see a lot of downsides from large data centers and little upside. I see a specific possibility where they are built, don't give the economic justification, decrease a lot in value, and are then sold to owners who are not into cutting-edge AI. Then, when the more efficient architecture is discovered, they are suddenly very powerful without preparation. Worldwide caps on total GPU production would also help reduce similar overhang possibilities.
I am also not impressed with the Pause AI movement, and I am concerned about AI safety. To me, focusing on AI companies and training FLOPS is not the best way to do things. Caps on data center sizes and worldwide GPU production caps would make more sense to me. Pausing software but not hardware gives more time for alignment but creates a worse hardware overhang; I don't think that's helpful. They also focus too much on OpenAI from what I've seen - xAI will soon have the largest training center, for a start.
I don't think this is right or workable: https://pauseai.info/proposal - figure out how biological intelligence learns and you don't need a large training run. There's no guarantee at all that a pause at this stage can help align super AI; I think we need greater capabilities to know what we are dealing with. Even with a 50-year pause to study GPT-4-type models, I wouldn't be confident we could learn enough from that. They have no realistic way to lift the pause, so it amounts to a desire to stop AI indefinitely.
"There will come a point where potentially superintelligent AI models can be trained for a few thousand dollars or less, perhaps even on consumer hardware. We need to be prepared for this."
You can't prepare for this without first having superintelligent models running on the most capable facilities and having already gone through a positive Singularity. They have no workable plan for achieving a positive Singularity - just try to stop, and hope.
OK fair point. If we are going to use analogies, then my point #2 about a specific neural code shows our different positions I think.
Let's say we are trying to get a simple aircraft off the ground, and we have detailed instructions for a large passenger jet. Our problem is that the metal is too weak and cannot be used to make wings, engines, etc. In that case, detailed plans for aircraft are no use - a single-minded focus on getting better metal is what it's all about. To me, the neural code is like the metal and all the neuroscience is like the plane schematics. Note that I am wary of analogies - you obviously don't see things like that, or you wouldn't hold the position you do. Analogies can explain, but rarely persuade.
A more single-minded focus on the neural code would be trying to watch neural connections form in real time while learning is happening. Fixed connectome scans of, say, mice can somewhat help with that; more direct control of DishBrain and watching the zebrafish brain would also count, however the details of neural biology that are specific to higher mammals would be ignored.
It's also possible that there is a hybrid process - that is, the AI looks at all the ideas in the literature, then suggests bio experiments to get things over the line.
Yes you have a point.
I believe that building massive data centers is the biggest risk at the moment and in the near future. I don't think OpenAI/Anthropic will get to AGI; rather, someone copying biology will. In that case, probably the bigger the data center around when that happens, the bigger the risk. For example, 1 million GPUs with current tech doesn't get super AI, but when we figure out the architecture, it suddenly becomes much more capable and dangerous - say from IQ 100 up to 300, with a large overhang. If the data center were smaller, then the overhang would be smaller. The scenario I have in mind is that someone figures AGI out, then one way or another the secret gets adopted suddenly by the large data center.
For that reason I believe the focus on FLOPS for training runs is misguided; it's hardware concentration and yearly worldwide HW production capacity that matter more.
Perhaps LLMs will help with that. The reasons I think that is less likely:
- DeepMind etc. is already heavily across biology, from what I gather from interviews with Demis. If the knowledge was already there, there's a good chance they would have found it.
- It's something specific we are after, not many small improvements, i.e. the neural code. Specifically, backpropagation is not how neurons learn, and I'm pretty sure how they actually do learn is not in the literature. Attempts have been made, such as the forward-forward algorithm by Hinton, but that didn't come to anything as far as I can tell. I haven't seen any suggestion that, even with a lot of biological detail, we know what it is - i.e. can a very detailed neural sim with extreme processing power learn as data-efficiently as biology?
- If progress must come from a large jump rather than small steps, then LLMs have quite a long way to go, i.e. LLMs need to speed up the generation of ideas as novel as the forward-forward algorithm to help much. If they are still below that threshold in 2026, then those possible insights are still almost entirely down to people.
- Even the smartest minds in the past have been beaten by copying biology in AI. The idea for neural nets came from copying biology. (Though the Transformer architecture and backprop didn't.)
I think it is clear that if, say, you had a complete connectome scan and knew everything about how a chimp brain worked, you could scale it easily to get human+ intelligence - there are no major differences. A small mammal is my best guess; mammals/birds seem to be able to learn better than, say, lizards. Specifically, the cortical column (https://en.wikipedia.org/wiki/Cortical_column) is important to understand - once you fully understand one, stacking them will scale at least somewhat well.
Going to smaller scales/numbers of neurons, it may not need to be as much as a mammal (https://cosmosmagazine.com/technology/dishbrain-pong-brain-on-chip-startup/) - perhaps we can learn enough of the secrets here? I expect not, but am only weakly confident.
Going even simpler, we have the connectome scan of a fly now (https://flyconnecto.me/), and that hasn't led to major AI advances. So it's somewhere between fly and chimp - I'd guess a mouse - that gives us the missing insight to get TAI.
Putting down a prediction I have had for quite some time.
The current LLM/Transformer architecture will stagnate before AGI/TAI (that is, the ability to do any cognitive task as effectively as, and more cheaply than, a human).
From what I have seen, Tesla Autopilot learns >10,000x slower than a human, data-wise.
We will get AGI by copying nature, at the scale of a simple mammal brain, then scaling up, like this kind of project:
https://x.com/Andrew_C_Payne/status/1863957226010144791
https://e11.bio/news/roadmap
I expect AGI to arrive 0-2 years after a mammal brain is mapped. In terms of cost-effectiveness, I consider such a connectome project to be far more cost-effective per dollar than large training runs or building a 1GW data center etc., if your goal is to achieve AGI.
That is TAI by about 2032, assuming 5 years to scan a mammal brain. In this case there could be a few years when Moore's law has effectively stopped, larger data centers are not being built, and it is not clear where progress will come from.
In related puzzles, I heard something a while ago now - from Bostrom perhaps. You have, say, 6 challenging events to achieve to get from no life to us. They are random, and some of those steps are MUCH harder than the others, but if you look at the successful runs, you can't in hindsight see which they are. For life it's, say, no life to life, simple single cell to complex cell, and perhaps 4 other events that aren't so rare.
A run is a sequence of 100 steps in which you either achieve the end state (all 6 challenging events achieved in order) or you don't.
There is a 1 in a million chance that a run is successful.
Now, if you look at the successful runs, you can't in hindsight see which events were really hard and which weren't. The event with a 1/10,000 chance at each step may have taken just 5 steps in the successful run; it couldn't have taken 10,000 steps, because only 100 are allowed, etc.
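Here is a quick sketch of that point with my own scaled-down parameters (not the 1-in-a-million setup above - just something that simulates quickly): four easy events and two much harder ones, all required in order within 100 steps.

```python
import random
from math import ceil, log

# Assumed, scaled-down parameters: four easy events (p = 1/2) and two hard
# ones (p = 1/300 and p = 1/10_000), all required in order within 100 steps.
P_EVENTS = [1/2, 1/300, 1/2, 1/10_000, 1/2, 1/2]
MAX_STEPS = 100

def waiting_time(rng, p):
    """Steps until an event with per-step probability p first occurs (geometric)."""
    return max(1, ceil(log(1.0 - rng.random()) / log(1.0 - p)))

rng = random.Random(0)
successes = []
for _ in range(1_000_000):
    waits = [waiting_time(rng, p) for p in P_EVENTS]
    if sum(waits) <= MAX_STEPS:            # the run only counts if everything fits
        successes.append(waits)

means = [sum(w[i] for w in successes) / len(successes) for i in range(6)]
print(len(successes), [round(m, 1) for m in means])
# In the surviving runs the two hard events take similar times to each other,
# giving little hint that one is ~30x harder than the other.
```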
A good way to resolve the paradox, to me, is to modify the code to combine both functions into one and record the sequences across the 10,000 runs: in one array you store the sequences where the two 6's are consecutive, and in the other you store the ones where they are not. That makes it a bit clearer.
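Here is a minimal sketch of that combined simulation as I understand it (the array names match the results below; the stopping rule, seed, and run count are my assumptions): roll until the second 6 appears, discard any run containing an odd roll, and split the survivors by whether the two 6's are adjacent.

```python
import random

def simulate(runs=10_000, seed=0):
    """Roll until the second 6; keep all-even runs, split by adjacency of the 6's."""
    rng = random.Random(seed)
    sequences_no_gap, sequences_can_gap = [], []
    for _ in range(runs):
        seq, sixes = [], 0
        while sixes < 2:
            roll = rng.randint(1, 6)
            seq.append(roll)
            if roll == 6:
                sixes += 1
        if any(r % 2 for r in seq):        # an odd roll disqualifies the run
            continue
        if seq[-2] == 6:                   # the two 6's are consecutive
            sequences_no_gap.append(seq)
        else:
            sequences_can_gap.append(seq)
    return sequences_no_gap, sequences_can_gap

no_gap, can_gap = simulate()
print(len(no_gap), sum(map(len, no_gap)) / len(no_gap))      # roughly 410 runs, mean ~2.5
print(len(can_gap), sum(map(len, can_gap)) / len(can_gap))   # roughly 210 runs, mean ~4.0
```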
For a run of 10,000, I get 412 sequences where the two 6's are consecutive (sequences_no_gap), and 192 where they are not (sequences_can_gap). So for case A you get 412 runs, but for case B you get 412+192 runs. Then you look at the average sequence length of sequences_no_gap and compare it to sequences_can_gap. If the average sequence length in sequences_can_gap is greater than in sequences_no_gap, then the expectation will be higher - and that's what you get.
mean sequence lengths
sequences_can_gap: 3.93
sequences_no_gap: 2.49
Examples:
sequences_no_gap
[[4, 6, 6], [6, 6], [6, 6], [4, 6, 6], [6, 6], [6, 6], [6, 6], [6, 6], [6, 6], [6, 6], [6, 6], [6, 6], [4, 6, 6], [4, 6, 6], ...]
sequences_can_gap
[[6, 4, 4, 6], [6, 4, 6], [4, 6, 4, 6], [2, 2, 6, 2, 6], [6, 4, 2, 6], [6, 4, 6], [6, 2, 6], [6, 2, 4, 6], [6, 2, 2, 4, 6], [6, 4, 6], [6, 4, 6], [2, 4, 6, 2, 6], [6, 4, 6], [6, 4, 6], ...]
The many examples such as [6, 4, 4, 6], which are excluded in the first case, make the expected number of rolls higher for the case where gaps are allowed.
(Note: o1 is confused by this problem and gives slop.)
I read the book and it was interesting; however, a few points.
- Rather than making the case, it was more a plea for someone else to make the case. It didn't replace the conventional theory with its own; it was far too short and lacking in specifics for that. If you throw away everything, you then need to recreate all our knowledge from your starting point and also explain why what we have still works so well.
- He was selective about quantum physics - e.g. if reality is only there when you observe it, then the last thing you would expect is for a quantum computer to exist and have the power to do things beyond what is possible with conventional computers. MWI predicts this much better, since you can correlate an almost infinite number of world lines to do your computation for you. On his view, superposition/entanglement should just be lack of knowledge rather than part of an awesome computational system.
- He claims that consciousness is fundamental, but then assumes maths is also. So which is fundamental? You can't have both.
- Even if we take his viewpoint, we can still derive principles such as the small/many determining the big: e.g. if you get the small wrong (a chemical imbalance in the brain), then the big (macro behavior) goes wrong. It doesn't go the other way - you can't IQ away your Alzheimer's. So it's just not clear, even if you try to fully accept his point of view, how you should take things.
In a game theoretic framework we might say that the payoff matrices for the birds and bees are different, so of course we'd expect them to adopt different strategies.
Yes, somewhat - however, it would still be best for all birds if they had a better collective defense. In a swarming attack, none would have to sacrifice their life, so it's unconditionally better for both the individual and the collective. I agree that inclusive fitness is pretty hard to control for; however, perhaps you can only get higher inclusive fitness the simpler you go? E.g. all your cells have exactly the same DNA, ants are very similar, birds are more different. The causation could be: simpler/less intelligent organisms -> more inclusive fitness possible/likely -> some cooperation strategies opened up.
Cool, that was my intuition. GPT was absolutely sure in the golf ball analogy, however, that it couldn't happen - that is, the ball wouldn't "reflect" off the low-friction surface. Tempted to try and test it somehow.
Yes, that does sound better. Is there an equivalent to total internal reflection, where the wheels are pushed back up the slope?
Another analogy is a ball rolling across the boundary between two surfaces: the first with very little friction, the second with a bit more.
From AI:
"The direction in which the ball veers when moving from a smooth to a rough surface depends on several factors, especially the initial direction of motion and the orientation of the boundary between the two surfaces. Here’s a general outline of how it might behave:
- If Moving at an Angle to the Boundary:
- Suppose the ball moves diagonally across the boundary between the smooth and rough surfaces (i.e., it doesn’t cross perpendicularly).
- When it hits the rough surface, frictional resistance increases more on the component of motion along the boundary line than on the perpendicular component.
- This causes the ball to veer slightly toward the rougher surface, meaning it will change direction in a way that aligns more closely with the boundary."
This is similar to a light ray entering water. So is the physics the same? (On second reading it's not so clear - if you putt a golf ball from a smooth surface onto a rough one, what happens to the angle at the boundary?)
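On the angle question, here is just the light-ray analogue by itself, under the bare assumption that the rolling speed drops by a fixed factor at the boundary (the 0.8 is made up); whether a real ball with friction actually follows this is exactly what is unclear.

```python
# Snell-like prediction for a ball crossing from a fast (smooth) to a slow
# (rough) surface, assuming sin(theta_out) / sin(theta_in) = v_rough / v_smooth.
from math import asin, degrees, radians, sin

v_smooth, v_rough = 1.0, 0.8            # assumed speeds on each surface
for angle_in in (10, 30, 60):           # angle of incidence, measured from the normal
    angle_out = degrees(asin(sin(radians(angle_in)) * v_rough / v_smooth))
    print(f"in {angle_in:2d} deg -> out {angle_out:.1f} deg")
```

Note the analogue bends the path toward the normal, like light entering water, which is not quite what the quoted answer above describes - one more reason to doubt the physics carries over cleanly.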
Well, in this case the momentum of the ball clearly won't increase; instead it will be constantly losing momentum, and if the second surface were floating, it would be pushed so as to conserve momentum. Unlike for light, however, if the ball then re-enters the smooth surface it will be going slower. It seems the ball would lose momentum at both transition boundaries. (However, if the rough surface were perfectly floating, then perhaps it would regain it.)
Anyway, for a rough surface that is perfectly floating, it seems the ball gives some momentum to the rough surface when it enters it (giving the surface velocity), then recovers it and returns the rough surface to zero velocity when it exits. In that case the momentum of the ball decreases while travelling over the rough surface.
Not trying to give answers here, just add to the confusion lol.
Not quite following - your possibilities.
1. Alignment is almost impossible; then there is, say, a 1e-20 chance we survive. Yes, surviving worlds have luck and good alignment work etc. Perhaps you should work on alignment, or still on bednets if the odds really are that low.
2. Alignment is easy by default, but there is nothing like a 0.999999 chance we survive - say 95%, because AGI that is not TAI/superintelligence could cause us to wipe ourselves out first, among other things. (This is a slow-takeoff universe(s).)
#2 has many more branches in total where we survive (not sure if that matters), and the difference between worlds where things go well and badly is almost entirely about stopping ourselves from killing ourselves with things not related to TAI. In that situation, shouldn't you be working on those things?
If you average 1,2 then you still get a lot of work on non-alignment related stuff.
I believe it's somewhere closer to 50/50 and not so overdetermined one way or the other, but we are not considering that here.
OK, for this post, "smart": a response is smart/intelligent if:
- Firstly, there is an assumed goal and measure. I don't think it matters whether we are talking about the bees/birds as individuals or as part of the hive/flock. In this case the bee defense is effective both for the individual bee and for the hive. If a bee were only concerned about its own survival, swarming the scout would still be beneficial, and of course such behavior also serves the hive. Similarly for birds: flocks with large numbers of birds showing swarming behavior would be better off, both as flocks and as individual birds within such a flock.
- There is a force-multiplier effect: the benefit of the behavior is much greater than the cost. This is obvious for the bees - a tiny expenditure of calories saves the hive. Likewise for birds: they currently waste a huge amount of calories, both individually and collectively, evading the hawk etc.
- There is a local optimum (or something close) for the behavior - that is, half measures don't give half the benefit - so it seems like the result of foresight. There is perhaps a more distinct and distant local optimum for the bee behavior "identify the scout, send warning chemicals to the hive, then swarm it" than for the possible bird behavior "call, then attack the attacker", as the scout isn't the actual attack in the bees' case.
- The change is behavioral, rather than a physical adaptation.
This fits the intuitive feeling of what intelligent means, too. A characteristic of what people feel is intelligent is to imagine a scenario and then make it real. The bees haven't done that, but the outcome is as if they had: "imagine if we took out the scout, then there would be no later invasion".
You look at the birds and think: "How do they miss this? Ant colonies swarm to see off a larger attacker, herding animals do too - why do they miss such a simple, effective strategy?" In that situation they are not intuitively intelligent.
Yes, agreed - it's unclear what you are saying that is different from my view? The new solution is something unique and powerful when done well, like language etc.
OK, I would definitely call the bee response "smart", but that's hard to define. If you define it by an action that costs the bees very little but benefits them a lot, then "swarm the scout hornet" is certainly efficient. Another criterion could be: if such a behavior were established, would it continue? Say the birds developed a "swarm the attacker" call. When birds hear it, they look to see if they can find the attacker; if they see it, they repeat the call. When the call gets widespread, the whole flock switches to attack. Would such a behavior persist if developed? If it would, then let's call it efficient or smart.
The leap to humans is meant to be large - something extreme like consciousness and language is needed to break the communication/coordination penalty for larger organisms whose fitness is defined more by the individual than by the hive.
Yes for sure. I don't know how it would play out, and am skeptical anyone could. We can guess scenarios.
1. The most easily imagined one is the Pebbles owner staying in their comfort zone and not enforcing #2 at all. Something similar already happened - the USA got nukes first and let others catch up. In this case, threatened nations try all sorts of things - political, commercial/trade, space war, arms race - but don't actually start a hot conflict. The Pebbles owner is left not knowing whether their system is still effective, and neither are the threatened countries - an unstable situation.
2. The threatened nation tries to destroy the Pebbles by non-nuclear means. If this were Russia, the USA could maybe regenerate the system faster than Russia could destroy satellites. If it's China, then let's say it can't. The USA then needs to decide whether to strike the anti-satellite ground infrastructure to keep its system...
3. A threatened nation such as North Korea just refuses to give up nukes - in this case I can see the USA destroying it.
4. India or Israel say refuses to give up their arsenal - I have no idea what would happen then.
It matters what model is used to make the tokens; unlimited tokens from GPT-3 are of only limited use to me. If it requires ~GPT-6 to make useful tokens, then the energy cost is presumably a lot greater. I don't know that it's counterintuitive - a small, much less capable brain is faster and requires less energy, but is useless for many tasks.
This is mostly true for current architectures; however, if the CoT/search finds a much better architecture, then it suddenly becomes more capable. To make the most of the potential protective effect, we can go further and make very efficient custom hardware for GPT-type systems, but have slower, more general-purpose hardware for potential new architectures. That way a new architecture will face a bigger barrier to causing havoc. We should especially scale existing systems as far as possible for defense, e.g. finding software vulnerabilities. However, as others say, there are probably some insights/model capabilities that are only possible with a much larger GPT or a different architecture altogether. Inference alone can't protect fully against that.
Brilliant Pebbles?
This idea has come back up, and it could be feasible this time around because of the high launch capacity and total reusability of SpaceX's Starship. The idea is a large constellation (~30,000?) of low-Earth-orbit satellites that intercept nuclear launches in their boost phase, where the missiles are much slower and more vulnerable to interception. The challenge, of course, is that you constantly need enough satellites overhead at all times to intercept the entire arsenal of a major power if they launch all at once.
There are obvious positives and risks with this
The main positive is it removes the chance of a catastrophic nuclear war.
Negatives are potentially destabilizing the MAD status quo in the short term, and new risks such as orbital war etc.
Trying to decide if it makes nuclear war more or less likely
This firstly depends on your yearly base rate for nuclear war, and on the projected rate into the foreseeable future.
If you think nuclear war is very unlikely then it is probably not rational to disturb the status quo, and you would reject anything potentially destabilizing like this.
However, if you think that we have simply been lucky and there was a >50% chance of nuclear war in the last 50 years (we are on a "surviving world", from MWI etc.), and that while the chance may currently be low, it will go up a lot again soon, then the "Pebbles" idea is worth considering even if you think it is dangerous and destabilizing in the short term. Say it directly causes a 5% chance of war while it is being set up, but stops a >20% chance in the next 20 years that would exist without it. In this case you could decide it is worth it, as paying 5% to remove 20%.
Practical considerations
How do you set up the system while reducing the risk that the setup itself causes a nuclear war?
Ideally you would set up the whole system in stealth before anyone was even aware of it. Disguising 30K "Pebbles" satellites as Starlink ones seems at the bounds of credibility.
You would also need to do this without properly testing a single interceptor as such a test would likely be seen and tip other nations off.
New risks
A country with such a system would have a power over others never seen before.
For example, the same interceptors could shoot down civilian aircraft, shipping, etc., totally crippling every other country without loss of life to the attacker.
The most desirable outcome:
1. A country develops the system
2. It uses whatever means to eliminate other countries' nuclear arsenals
3. It disestablishes its own nuke arsenal
4. The Pebbles system is then greatly thinned out, as there is now no need to intercept ~1,000 simultaneous launches.
5. There is no catastrophic nuclear threat anymore, and no other major downside.
My guess as to how other countries would respond if it was actually deployed by the USA
Russia
I don't think Russia would try to compete, nor would it be at all likely to launch a first strike in response to the USA building such a system, especially if the Ukraine war has calmed down. I don't think it regards itself as a serious worldwide power anymore, in spite of the talk. Russia also isn't worried about a first strike from the USA, as it knows this is incredibly unlikely.
China
Has 200-350 nukes vs the USA's ~5,000.
I think China would be very upset, both because it would take fewer "Pebbles" to intercept its arsenal and because it does see itself as a major and rising world power. I also don't think it would launch a first strike, nor could it credibly threaten to do so, and this would make it furious.
After this, however, it is hard to tell. E.g. China could try to destroy the satellite interceptors, say by creating a debris cloud that would take many out in the lower orbits, or build thousands of decoys with no nuclear warhead and much smaller missiles that would be hard to tell apart. Such decoys could be incapable of actually making it to orbit but still be effective. To actually reduce the nuclear threat, the USA would have to commit to destroying China's nukes on the ground. This is a pretty extreme measure, and it's easy to imagine the USA not actually doing it, leaving an unstable result.
Summary
It's unclear to me, all things considered, whether attempting to deploy such a system would make things safer or riskier in total over the long term with regard to nuclear war.
My brief experience using o1 for my typical workflow: interesting and an improvement, but not a dramatic advance.
Currently I use Cursor AI and GPT-4, and mainly do Python development for data science, with some more generic web development tasks. Cursor AI is a smart auto-complete, while GPT-4 is used for more complex tasks. I found 4o to be useless for my job, worse than the old 4 and 3.5. Typically for a small task, or at the start, GenAI will speed me up 100%+, but that drops to more like 20% for complex work. Current AI is especially weak for coding that requires physically grounded or visualization work. E.g. if the task is to look at data and design an improved algorithm by plotting where it goes wrong and then iterating, it cannot be a meaningful part of the whole loop. For me, it can't go from looking at a graph or the visual output of a failing algorithm to sensible algorithmic improvements.
I first tried o1 for making a simple API gateway using Lambda/S3 on AWS. It was very good at the first task, getting it essentially right on the first try and also giving good suggestions on AWS permissions. The second task was a bit harder: an endpoint to retrieve data files - which could be JSON, WAV, or MP4 video - for download to the local machine. Its first attempt didn't work because of a JSON parsing error, and debugging this wasted a bit of time. I gave it details of the error, and in this case GPT-3.5 would have been better, quickly suggesting parsing options. Instead it thought for a while, wanted to change the request to a POST rather than a GET, and claimed that the body of the request could have been stripped out entirely by the AWS system. This possibility was, however, ruled out by the debugging data I gave it (just one line of code). So it drew an obviously false conclusion from the start, and proceeded to waste a lot of inference afterwards.
The next issue it handled better: I was getting an internal server error, and it correctly figured out without any hint that the data being sent back was too large for a GET request / AWS API Gateway. It instead suggested we make a time-limited signed S3 link to the file, and got the code correct the first time.
I finally tried it very briefly on a more difficult task: using headset position-tracking JSON data to re-create where someone was looking in a VR video file they had watched, with the JSON data coming from recording the camera angle their gaze had directed. The main thing here was that when I uploaded existing files, it said I was breaking the terms of use or something strange. This was obviously wrong, and when I pointed out that GPT-4 had written the code I had just posted, it then proceeded to do the task. (GPT-4 had written a lot of it; I had modified a lot.) It appeared to do something useful, but I haven't checked it yet.
In summary, it could be an improvement over GPT-4 etc., but it is still lacking a lot. Firstly, to my surprise, GenAI has never been optimized for a code/debug/code loop. Perhaps this has been tried and it didn't work well? It is quite frustrating to try code, copy/paste the error message, copy the suggestion back into the code, etc. You want to just let the AI run its suggestions with you clicking OK at each step. (I haven't tried Devin AI.)
Specifically on model size, it looks to me like a variety of model sizes will be needed, along with a smart system that knows when to choose which one, ranging from GPT-3 to GPT-5 levels of size and capability. Until we have such a system, and people have attempted to train it on the whole code-debugging loop, it is hard to know what the capabilities of the current Transformer-led architecture will be. I believe there is at least one major advance needed for AGI, however.
How does intelligence scale with processing power
A default position is that exponentially more processing power is needed for a constant increase in intelligence.
To start, let's assume a guided/intuition + search model for intelligence - like chess or Go, where you have an evaluation module and a search module. In simple situations, an exponential increase in processing power usually gives a linear increase in lookahead ability and in rating/Elo in games measured that way.
However does this match reality?
What if, the longer the time horizon, the bigger the board became, or the more complexity was introduced? For board games there is usually a constant number of possibilities to search at every ply of lookahead depth. However, I think that in reality you can argue the search space should increase with time or lookahead steps - that is, as you look further ahead, possibilities you didn't have to consider before now enter the search.
For a real world example consider predicting the price of a house. As the timeframe goes from <5 years to >5 years, then there are new factors to consider e.g. changing govt policy, unexpected changes in transport patterns, (new rail nearby or in competing suburb etc), demographic changes.
In situations like these, the processing required for a constant increase in ability could go up faster than exponentially. For example, looking 2 steps ahead with 2 possibilities at each step costs 2^2, but looking 4 steps ahead might cost 3^4, as there are now 3 rather than 2 things that can affect the result over those 4 steps.
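A toy illustration of this (the growth rule is my own made-up one): constant branching factor 2 versus a branching factor that gains one extra option every two plies.

```python
# Nodes searched to depth d: constant branching (2^d) vs. branching that grows
# with depth (one extra option every two plies) - the latter outpaces any
# fixed exponential.
from math import prod

for depth in range(1, 9):
    constant = 2 ** depth
    growing = prod(2 + ply // 2 for ply in range(depth))
    print(f"depth {depth}: constant {constant:5d}   growing {growing:6d}")
```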
How does this affect engineering of new systems
If this applies to engineering, then actual physical data will be very valuable for shrinking the search space. (Well, that applies if it just goes up exponentially as well.) That is, if you can measure the desired situation or new device state at step 10 of a 20-stage process, then you can hugely reduce the search space, as you can eliminate many possibilities. Zero-shot is hard unless you can really keep the system in situations where no additional effects come in.
AI models, regulations, deployments, expectations
For a simple evaluation/search model of intelligence, with just one model being used for the evaluation, improvements can be made by continually improving the evaluation model (same size better performance/same performance, smaller size). Models that produce fewer bad "candidate ideas" can be chosen, with the search itself providing feedback on what ideas had potential. In this model there is no take-off or overhang to speak of.
However I expect a TAI system to be more complicated.
I can imagine an overseer model that decides which more specialist models to use. There is a difficulty in knowing which model/field of expertise to use for a given goal. Existing regulations don't really cover these systems; the setup where you train a model, fine-tune, test, then release doesn't strictly apply here. You release a set of models, and they continually improve themselves. This is a lot more like people, who continually learn.
Overhang
In this situation you get take-off or overhang when a new model architecture is introduced, rather than the steady improvement from deployed systems of models. It's clear to me that the current model architectures, and hence scaling laws, are not near the theoretical maximum. For example, the training data needed for Tesla Autopilot is ~10,000x more than what a human needs, and it is not superhuman. In terms of risk, it's new model architectures (and evidence of very different scaling laws) rather than training FLOPS that would matter.
Is X.AI currently performing the largest training run?
This source claims it is
If so it seems to be getting a lot less attention compared to its compute capability.
Not sure if I have stated this clearly before, but I believe scaling laws will not hold for LLM/Transformer-type tech, and at least one major architectural advance is missing before AGI. That is, increased scaling of compute and data will plateau performance soon, and before AGI. Therefore I expect to see evidence for this not much after the end of this year, when large training runs yield models that are a lot more expensive to train, slower at inference, and only a little better in performance. xAI could be one of the first to let this be publicly known (OpenAI etc. could very well be aware of this but not making it public).
According to Scott, "Pavel Kropitz discovered, a couple years ago, that BB(6) is at least 10^10^10^10^10^10^10^10^10^10^10^10^10^10^10 (i.e., 10 raised to itself 15 times)."
So we can never evaluate BB(6), as it is at least this large.
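To get a feel for why: the quoted bound is a right-associated tower of fifteen 10s, and even its digit count is itself a tower of fourteen 10s. A throwaway sketch (tower() is just my helper name):

```python
def tower(base: int, height: int) -> int:
    """Right-associated power tower, e.g. tower(10, 3) == 10 ** (10 ** 10)."""
    value = 1
    for _ in range(height):
        value = base ** value
    return value

print(tower(10, 2))      # 10**10 = 10_000_000_000 - still easy
# tower(10, 3) already has about ten billion digits (gigabytes just to store it),
# and the quoted bound is tower(10, 15), so writing BB(6) out is hopeless.
```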