Building Big Science from the Bottom-Up: A Fractal Approach to AI Safety

post by Lauren Greenspan (LaurenGreenspan) · 2025-01-07T03:08:51.447Z · LW · GW · 2 comments

Contents

  What is Big Science?
  Paths toward a Big Science of AI Safety
    The Way Things Are
      As it stands, big science is more likely to be top-down
    Why bottom-up could be better
    How to scale from where we are now
  Appendix: Potential Pushback of this Approach

Epistemic Status: This post is an attempt to condense some ideas I've been thinking about for quite some time. I took some care grounding the main body of the text, but some parts (particularly the appendix) are pretty off the cuff, and should be treated as such. 

The magnitude and scope of the problems related to AI safety have led to an increasingly public discussion about how to address them. Risks of sufficiently advanced AI systems involve unknown unknowns that could impact the global economy, national and personal security, and the way we investigate, innovate, and learn. Clearly, the response from the AI safety community should be as multi-faceted and expansive as the problems it aims to address. In a previous post [LW · GW], we framed fruitful collaborations between applied science, basic science, and governance as trading zones mediated by a safety-relevant boundary object (a safety case sketch) without discussing the scale of the collaborations we were imagining.

As a way of analyzing the local coordination between different scientific cultures, a trading zone can sometimes have a fractal structure; at every level of granularity  – a coarse graining of what is meant by ‘local’ – another trading zone appears. In addressing a narrow problem or AI safety sub-goal, like a particular safety case assumption, a small collaboration between different academic groups at an institution (or even across institutions) can constitute a trading zone. Zooming all the way out, the entire field of AI safety can be seen as a trading zone between basic science, applied science, and governance, with a grand mission like ‘create safety cases’ or ‘solve safety’ as its boundary object. This large-scale trading zone may result from (or lead to!) the development of big science for AI safety. Moreover, smaller trading zones would likely benefit from an overarching culture of coordination and resource-sharing that big science would provide.

In this post, I discuss the possible pathways to a big science for AI safety in terms of what is possible, what is likely, and what would be most beneficial, in my view. To ground the discussion, I hold that AI safety should have[1]:

  1. Disciplinary flexibility, so that basic science – from mathematical theory to ethical frameworks – can weigh in on many aspects of the problem;
  2. Active engagement between academia, industry, and governance;
  3. A unifying research focus that is broad enough to maintain a sufficient richness of ideas.

What is Big Science?

‘Big science’ is a term originally coined by Alvin M. Weinberg in 1961 to describe large-scale research efforts that require substantial resources, collaboration, and infrastructure. It typically involves joint input and progress from basic science and applied science or engineering, and has downstream effects on industry. It often spans nation-states, though this is not a strict requirement. Government involvement is essential for big science, though it may not be its main driving force.

Throughout this post, we’ll classify big science as either top-down – organized around an objective set by government, as with the Manhattan Project and the Human Genome Project – or bottom-up – driven by the consensus and priorities of the scientific community itself, as with CERN[2].

The impact of big science tends to be far-reaching, transforming science, industry, and society. The above examples led to the adoption of nuclear energy, better standards for ethics and security in biomedicine, and the world wide web, to name just a few. Each example also received substantial pushback. The public was mainly concerned with the overall cost relative to public benefit, ethics, and safety (like the worry that a mini black hole would swallow Geneva). In addition to ethics and safety, the scientific community was more concerned with the risks of increased bureaucracy in the pursuit of science, and with the distribution of funds across disciplines (Kaiser, Utz). The Human Genome Project, for example, invited criticism that the government-ordained focus of the project was too narrow and would stifle creativity and curiosity among biologists. But scientists also push back on bottom-up big science projects: in the 1990s, prominent condensed matter physicists registered their opposition to the Superconducting Super Collider (SSC), testifying before Congress that the project would underfund their own field, which had much more immediate economic and societal benefits than large-scale exploratory particle physics. In this case, the testimony was so compelling that the US government scrapped the SSC, and condensed matter climbed to the top tier of the physics hierarchy.

Paths toward a Big Science of AI Safety

The above examples show that big science, by mobilizing scientific practice in a particular direction, often produces a shift in the shared set of values and norms that make up a scientific culture. Depending on your place within that culture, these shifts can be desirable (or not). The examples also demonstrate big science’s impact on society at large – an impact that could be even greater for AI, given the many futures toward which its development could lead. In the rest of this post, I will unpack some possible paths to big science for AI safety, and argue that a bottom-up approach, though by no means inevitable, is most likely to maintain the core criteria for AI safety laid out in the introduction.

The Way Things Are

The technical AI safety research ecosystem can be roughly divided between large AI labs, smaller AI safety research organizations, and independent researchers. Coverage of the problem, though reasonable given the field’s nascent status, is somewhat disjointed across these groups. In spite of the immense resources and research effort, AI safety currently lacks the coordination that would qualify it as ‘big science’. Namely, it lacks:

  1. Infrastructural Centralization (e.g. CERN)
  2. Ideological Centralization (i.e. ‘build a bomb’ or ‘sequence the human genome’)
  3. Science.

First: infrastructure. There is a certain amount of resource pooling among independent researchers and small research organizations, and many of the models they use and tools they develop are small-scale or open-source. While there is a lot of room for good theoretical or experimental work from this group, resource constraints largely restrict them to models around the scale of GPT-2. Independent researchers are particularly decentralized, and while there is nothing to stop them from collaborating across nation-states, most collaboration tends to cluster around virtual communication channels for distinct research programmes (mech interp, dev interp, SLT). In contrast, AI labs hold the largest share of physical and intellectual infrastructure. While their efforts to understand AI systems often go hand-in-hand with driving capabilities, they nevertheless produce key insights. Much of their work involves large-scale empirics on state-of-the-art models and real-world datasets, behind a mounting wall of secrecy that limits external collaboration. The result is sub-optimal coordination in the field, which splits into at least two camps (independent/small AI safety and big AI), and at worst many more than that.

Next: ideology. By this, I mean a big science problem specification and research goals that would guide a scientific community. Historically, these have been set top-down by governments or bottom-up by scientific consensus, and have played a pivotal role in coordination, resource allocation, and progress in big science collaborations. In AI safety, however, no such unifying objective currently exists. Instead, the field is shaped by competing agendas across stakeholders, including governing bodies, academic researchers, and AI labs of varying stages and sizes.

Among these, large AI labs are the most prolific in terms of research output. They would play the largest role in guiding a centralized ideology, both through their own work and through relationships with governing bodies that rely on lab members’ expertise to understand pathways and timelines to AGI. Big labs often produce quality work, and have taken steps to include external perspectives through academic collaborations, researcher access programs, red teaming networks, and fellowship programs. However, these programs, coupled with the labs’ government connections, can implicitly and explicitly incentivize independent and academic researchers to work within existing frameworks rather than pursue ‘riskier’ problem set-ups or alternative methodologies. Instead of unifying the research landscape under a flexible ideology, these efforts further separate the AI safety ideology set by large AI labs from those of small AI safety organizations, independent researchers, and academic groups. Moreover, competition between the top AI labs leads to further fragmentation, as each lab operates with its own priorities. This means that AI labs, even if they currently have the loudest voice, are not suited to establishing a robust, central ideology that could unite the field in big science, as that ideology would depend on short-horizon industry incentives and on which lab comes out on top.

Finally: science. The lack of a central ideological focus reflects the current absence of a foundational scientific discipline – and corresponding scientific culture – underpinning AI safety. Such a shared understanding would play an important role in coordinating big science between research groups with varying standards of scientific experience and rigor. While AI safety draws on a patchwork of theoretical insights, empirical methods, and methodological input from the fields underpinning AI (physics, neuroscience, mathematics, computer science…), it lacks a cohesive framework for addressing its core challenges, leading, for example, to inconsistent formulations of the ‘problem’ of AI safety.

The current priorities of the AI industry reflect and reinforce this dearth of fundamental scientific research. In their new hires, labs seem increasingly concerned with the engineering skills needed to build out existing applied research agendas, suggesting less effort to build up scientific foundations. This is likely due to epistemic taste and to increasing pressure to compete given shortening timelines, a pressure seemingly reflected in AI safety funding schemes. Whereas a few years ago, freakish prosperity in the AI safety space caused a push for big ideas, now AI is ubiquitous, the economic stakes are higher, and the funds are sparser. Add to that the narrative within the AI safety community that current frontier models are already near-AGI, which leads many funders to focus on the state of the art rather than hedge their bets on the uncertain outputs, and longer time horizons, that basic science investigations could provide.

All this is to say that within AI safety, coordination with basic science is currently limited, constraining the number of new ideas with a high potential for impact. At the same time, government funding for scientific projects that rely heavily on AI and its development to solve domain-specific problems is increasing, further separating the use-inspiration for AI research in these two groups (solve science vs. solve safety). Without explicit incentives to work on projects that address safety concerns, basic scientists will continue to ‘publish or perish’ in their own disciplines just as industry labs continue to ‘produce or perish’ in theirs.

As it stands, big science is more likely to be top-down

In my view, desirable characteristics of big science for AI safety include infrastructural centralization that improves basic science’s access to and understanding of the problem, and ideological centralization that is broad enough to allow for a pluralism of ideals and projects. In short: it should be bottom-up, and should set the conditions for a basic science of AI safety.

However, if big science is inevitable (and I’m not sure it is), it seems more likely to be top-down. Drivers for a big science of AI safety could include:

  1. A high-profile AI accident: This could prompt governments to impose ad-hoc restrictions on industry or seize control of industry infrastructure. Depending on the nature and severity of the accident, the resulting centralization could produce top-down big science via soft nationalization [EA · GW].
  2. Escalating geo-political tensions or a Manhattan Project for AI: Both of these drivers are already in motion, and if they lead to big science (and not something like soft nationalization), it would almost certainly be top-down. They stem from heightened competition between nation-states. While the recent recommendation from the U.S.-China Economic and Security Review Commission to start a Manhattan Project for AI seems like a false start, it could be read as a warning to the CCP and its allies. It also addresses real security concerns for an AI-driven future, raises the question of where a Manhattan Project for AI should be hosted, and casts doubt on the prospects of a globally centralized ‘trillion dollar cluster’ for AI equity and safety. If the US did pursue such a project, it would likely have two major effects. First, the government would be forced to take industry claims of creating AGI seriously, and to take steps to lock those capabilities down. Second, to maintain secrecy, it could restrict academic research and limit open-source collaboration. These dynamics echo the “bureaucratic overreach and political misuse” (Kaiser et al.) of the original Manhattan Project. In that case, there was a clear, urgent threat to mobilize against. A similar project for AI would not only limit innovation and investigation in both private and public sectors, but would do so without addressing practical problems of long-term AI governance. From an AI safety perspective, a top-down big science directive of ‘build AGI first at any cost’ is not desirable.
  3. A major breakthrough in engineering: In this case, the field risks a premature or overly-constraining ideological centralization, with a top-down safety objective set not by governance or academic consensus, but by industry. Industry priorities currently emphasize short-term deliverables, engineering prowess over foundational understanding, and empirical efforts to interpret or align the capabilities of current or near-future models. As mentioned earlier, large AI labs and their collaborations drive substantial progress in understanding and shaping AI systems, but they also shape many aspects of AI safety, potentially including a top-down objective – an influence that would likely ramp up after a significant gain in engineering. On paper, the objective looks something like ‘make AI safe’, which seems broad enough to maintain AI safety’s pre-paradigmatic open-mindedness. In practice, however, there are implicit parameters for who can work on AI safety and under what conditions, set by the labs’ focus and reinforced by the funding landscape. If the goal-specifications of industry and governance feed back into one another, AI safety risks entrenching increasingly narrow goals, methods, and standards into its scientific culture, stifling the diversity of ideas essential for robust AI research and governance. In other words, while large AI labs and their collaborations drive substantial progress, they represent a narrow ideological perspective that is shaping the epistemic priorities of the field, including what it means to make AI systems safe.
  4. A major breakthrough in basic science: This is the driver most likely to produce bottom-up big science. While we are likely in the midst of an empirically-driven phase of AI research, this is not guaranteed to remain the case forever. If experimental leads were to run dry or engineering were to shift significantly, the field’s lack of investment in other (theoretically-driven, foundational, longer time-horizon) approaches may leave us unprepared to address changing safety concerns efficiently and adequately. Complementing current efforts with a stronger, more diverse representation from basic science could foster a richer ideology and a more balanced ecosystem for breakthroughs in AI safety.

Why bottom-up could be better

The way nationalization or centralization happens is incredibly important. From the point of view of the scientific community, it would set the standard for ‘good’ research, prioritizing which projects get funded in terms of methodology, focus, stakes, timeliness, and take-off speeds. From a governance standpoint, it is tied to the scale, scope, and solution of the AI safety problem, and dictates who holds power over AI and all its baggage (good and bad).

The AI governance challenge is by no means simple, and its strategy will also depend on the ideology around which big science coordinates. Myopic or overly-constraining rallying cries could yield a big science with too narrow a focus, leaving us unprepared to tackle a threat we didn’t see coming. While top-down big science does lead to gains in basic science and industry, these are not guaranteed to hold up to a threat model predicated on unknown unknowns. By the time we realize we’re headed in the wrong direction, it could be too late.

A top-down approach could succeed given the right goal. The UK AISI is an example of locally centralized, big-ish science with clear safety objectives that nevertheless fosters a culture amenable to pure scientific inquiry. That said, I think that many concerns about power concentration and race dynamics [LW · GW], or about the strategic advantage [? · GW] (SA) of centralized AGI, assume a top-down model. A bottom-up approach to big science could balance the voice of AI labs in the cooperative development of safe AI while also increasing its feasibility given shortening timelines [? · GW]. Or, we might view bottom-up big science as a ‘well designed SA approach’ that can accelerate progress with an equitable distribution of power and safety checks in place. Moreover, it could naturally foster the ‘good attributes’ of AI safety I laid out at the beginning.

CERN, for example, brings together theorists, experimentalists, and phenomenologists studying not only particle, nuclear, and astrophysics, but also the structure of matter, geophysics, and environmental science. A similar mosaic scientific culture is possible in a CERN for AI[3]. While this post focuses mainly on technical AI research, I would also include fields like economics, psychology, and anthropology for maximal disciplinary flexibility. Doing so would allow basic science to weigh in on different aspects of AI safety, from mathematical theories to ethical frameworks.

This multi-disciplinarity feeds into the next desideratum: active engagement between academia, industry, and governance. Bottom-up big science achieves this quite naturally. Though the research itself is driven by the needs of the scientific community rather than a government directive, coordination between scientists at scale can only be achieved with government support. As a public good, a CERN for AI would also need access to some of the data and models of industry labs. While there may be hurdles to overcome here, it is not unrealistic to expect the mutual benefits of a collaboration of this kind – including a wider research pool for industry and proofs-of-concept for basic science – to outweigh corporate reservations.

Finally, the type of research would coordinate around a unifying focus that is broad enough to maintain a sufficient richness of ideas. Allowing basic scientists to pursue AI (safety) research as basic scientists, at a scale similar to industry labs, could provide the conditions of exploration and innovation necessary for a breakthrough – or many breakthroughs – to be achieved. Given academia’s track record of spinning up bottom-up big science projects, this is not a stretch. The resource-pooling would add an efficiency that basic science currently lacks, without overly constraining the research space. It would also prioritize coordination over competition, making for more equitable access for researchers (even if some amount of nationalization is likely).

How to scale from where we are now

It is currently unlikely that academia would mobilize across disciplines to engage in bottom-up big science. Even if it did, it is unlikely that researchers would immediately coordinate around the problems with the highest potential for safety relevance. Rather than attempting large-scale coordination from the outset, we should focus on facilitating small-scale trading zone collaborations between basic scientists, applied scientists, and governance. These can orient basic research toward an AI safety application while also making this research more legible to those outside its scientific culture. If we create enough of these trading zones across different disciplines, they can begin to coordinate naturally with each other, creating larger trading zones with cross-disciplinary input from basic science and a ‘local’ language that is legible to a larger population. Continuing in this way facilitates the coarse-graining mentioned above. In this view, bottom-up big science is an emergent phenomenon that arises from a ‘critical mass’ of individual trading zone collaborations. At some scale, mobilization becomes possible, yielding an aggregate trading zone between basic science, applied science, and governance unified by the goal of creating a safety-relevant science of AI.

This model for driving science encourages flexible, rather than rigid, centralization, preserving the desiderata laid out for the AI safety field and fostering an epistemically diverse scientific culture. It creates the conditions for a system of checks and balances – as well as mutual benefit – between industry, governance, and academia, and levels the playing field between these domains. If we take the analogy of a fractal seriously, there is no limit to the scale that can be achieved. Big science of this type would continue to drive advancements in safe AI and basic science, pushing the limits of knowledge, innovation, and human impact in lockstep.

Appendix: Potential Pushback of this Approach

It would be naive to expect that this effort would be without pushback. Like the historical examples we looked at earlier, the public, private, and academic sectors will all have something to say about any new allocation of resources.

From academia, we might expect arguments like ‘AI is not science’. When the 2024 Nobel Prize in Physics was awarded to Hinton and Hopfield, it sparked a heated debate about the relationship between physics and AI which, to my mind, demonstrated two key ideas:

  1. Morally, culturally, and ideologically, physics is not AI, and
  2. There is a lot of basic science to discover about AI, and this understanding can feed back into other areas.

Hopefully, basic scientists’ fears that all science will become AI science will be assuaged if they are allowed to work on AI guided by the standards and values of their own scientific culture. The goal of large-scale scientific coordination isn’t to turn every basic scientist into an AI scientist, but to bring together ideas from disparate fields to make progress on a poorly understood technology that – like it or not – is likely to be integrated into every aspect of our lives before long, including scientific inquiry. The sharing of resources – time, expertise, manpower – facilitated by collaborations can allow researchers to take part in AI-focused projects without abandoning the disciplinary commitments of their academic departments. Uncoordinated research in AI has also led to work from different disciplines expressing similar ideas in different terms; cross-disciplinary collaborations centered on AI-specific problems can foster shared insights that benefit each involved field.

Funders or governance may complain that ‘basic science takes too long’ to be competitive given shortening timescales. However, a centralized research infrastructure – including large-scale datasets, computational resources, safety sandboxes, and shared knowledge – could minimize the duplication of effort that slows progress. Progress could be sped up even more by accelerating the most promising research directions that arise from big science with a focused research organization (FRO). In addition to spinning up more start-ups, FROs can produce public goods on accelerated timescales, minimizing the time-sink and public cost of the basic science research phase before producing something of use.

Governing bodies and the public may also be concerned about whether this approach would be secure enough. In AI safety, there are many independent researchers and small research projects or programs, and it would take a lot of infrastructure to vet and keep track of them. However, there may be a way to give tiered levels of access to resources through collaborations. My view is that a CERN for AI would increase accountability and offer a middle-of-the-road solution to transparency, somewhere between locking research in a black box of secrecy and making it completely open-source. Security concerns feed into doubts about whether this would really be AI safety research. Mainly, this comes down to public trust in the incentives of basic scientists, and to how large the scale of collaboration ends up being (if it is not a global endeavor, what institutional safeguards are in place to prevent it from becoming a Manhattan Project in disguise?). Like at CERN, researchers could apply to run experiments, and these could be vetted for relevance to safety. From my perspective (as a basic scientist), AI science treated in this way is AI safety science, since understanding a thing allows us to reliably build and control it. Indeed, most AI safety interpretability or technical alignment research has a basic science flavor.

  1. ^

     These are a few criteria I think are important, and should not be taken as a minimal or exhaustive list.

  2. ^

     In the case of CERN, this aim is explicit, though this does not remove it from political alliances.

  3. ^

     For the sake of simplicity, I’ll continue to refer to the bottom-up big science ideal as ‘CERN for AI’, though this should be read as an analogous project, rather than the gold standard.

2 comments


comment by Jonas Hallgren · 2025-01-07T12:01:56.151Z · LW(p) · GW(p)

I really like this! For me it also paints a vision for what could be, which might inspire action.

Something that I've generally thought would be really nice to have over the last couple of years is a vision for what a decentralized AI Safety field could look like, and what the specific levers to pull would be to get there.

What does the optimal form of a decentralized AI Safety science look like? 

How does this incorporate parts of meta science and potentially decentralized science? 

What does this look like with literature review from AI systems? How can we use AI systems themselves to create such infrastructure in the field? What do such communication pathways optimally look like?

I feel that there is so much low-hanging fruit here. There are so many algorithms that we could apply to make things better. Yes, we've got some forums, but holy smokes, could the underlying distribution and optimisation systems be optimised. Maybe the lightcone crew could cook something up in this direction?

comment by Lauren Greenspan (LaurenGreenspan) · 2025-01-07T15:30:55.742Z · LW(p) · GW(p)

Thanks for the comment! I do hope that the thoughts expressed here can inspire some action, but I'm not sure I understand your questions. Do you mean 'centralized', or are you thinking about the conditions necessary for many small scale trading zones? 

In this way, I guess the emergence of big science could be seen as a phase transition from decentralization -> centralization.