Embracing complexity when developing and evaluating AI responsibly

post by Aliya Amirova (aliya-amirova) · 2024-10-11T17:46:50.208Z · LW · GW · 9 comments

Contents

      Complex Intervention Development and Evaluation Framework: A Blueprint for Ethical and Responsible AI Development and Evaluation
  1. Applying the Complex Intervention Framework to AI
    1.1. Contextual Awareness
    1.2. Underlying Theory/Mechanism of Action for AI
    1.3. Stakeholder Engagement
    1.4. Addressing Uncertainties
    1.5. Iterative Refinement
    1.6. Feasibility and Acceptability
    1.7. Economic and Societal Impact
  2. Evidence Synthesis and Rigorous Evaluation
    2.1. Designing Evaluation Protocols and Building an Evidence Base
    2.2. Bias Auditing
  3. Complex Adaptive Systems in AI: Key Properties
    3.1. Emergence
    3.2. Feedback Loops
    3.3. Adaptation
    3.4. Self-Organisation
  Conclusion

Complex Intervention Development and Evaluation Framework: A Blueprint for Ethical and Responsible AI Development and Evaluation

The rapid evolution of artificial intelligence (AI) presents significant opportunities, but it also raises serious concerns about its societal impact. Ensuring that AI systems are fair, responsible, and safe is more critical than ever, as these technologies become embedded in various aspects of our lives. In this post, I’ll explore how a framework used for developing and evaluating complex interventions (1;2), common in public health and the social sciences, can offer a structured approach for navigating the ethical challenges of AI development.

 

I used to believe that by applying first principles and establishing a clear set of rules, particularly in well-defined environments like games, we could guarantee AI safety. However, when I encountered large language models (LLMs) that interact with vast bodies of human knowledge, I realised how limited this view was. These models function in the "wild": a world characterised by dynamic rules, fluid social contexts, and hard-to-predict human behaviour and interactions. This complexity calls for approaches that embrace uncertainty and acknowledge the messiness of real-world environments. This realisation led me to draw on methods used in social science and public health, particularly those related to complex interventions designed to influence behaviour and create change (1;2).

 

Some AI systems, such as Gemini, ChatGPT, and Llama, are developed, much like complex interventions, to interact with humans as agents and potentially to influence human behaviour in dynamic, real-world contexts. First principles alone are insufficient to ensure fairness and safety in these systems: foreseeing and eliminating bias in such a complex setting is a challenge, and pinpointing biases in a complex system in a top-down manner is inefficient. Instead, we need rigorous, evidence-based approaches that place stakeholder acceptability and complex-system design at their heart. Such methodologies already exist in medicine and healthcare (1;2); they can systematically assess the knowns and set out the uncertainties and unknowns of how these systems interact with the world.

 

Complex interventions are designed to bring about change within complex systems. They often involve multiple interacting components at various levels of organisation or analysis (individual, systemic, population-based, etc.) and aim to influence the behaviour of the system and, subsequently, targeted outcomes. Examples include public health programmes (e.g., the five-a-day fruit and vegetable programme, tobacco bans), educational reforms (e.g., the introduction of a new curriculum), or social policy initiatives (e.g., universal basic income programmes). Developing and evaluating these interventions requires a holistic approach that takes into account context, stakeholder engagement, and iterative refinement. From the outset, the needs and constraints of the target group are considered, ensuring the intervention is not only safe, with no or minimal harm, but is also feasible, acceptable to stakeholders, and set out to achieve the outcomes stakeholders prefer.

 

1. Applying the Complex Intervention Framework to AI

AI systems, particularly those like LLMs, share many characteristics with complex interventions. They operate within layered social and cultural contexts, interact with diverse individuals and communities, and might influence decision-making processes. Therefore, the principles and best practices for complex interventions can be adapted to inform responsible AI design and implementation. Here are several key considerations for applying this framework to AI systems’ development:

 

1.1. Contextual Awareness

AI systems must be developed with a deep understanding of the social, cultural, and political contexts in which they operate. This includes acknowledging potential biases, interpersonal dynamics, and ethical concerns that vary depending on the specific context.

1.2. Underlying Theory/Mechanism of Action for AI

Just as with complex interventions, AI systems should have a well-defined "program theory" that outlines how the AI will function, interact with its environment, and influence user behaviour. This theory should be transparent and accessible to stakeholders.

1.3. Stakeholder Engagement

Involving diverse stakeholders—including representatives from affected communities, as well as social scientists—is critical for developing AI systems in an inclusive and responsible way. This engagement ensures that different perspectives are incorporated, promoting fairness and equity.

1.4. Addressing Uncertainties

AI systems, like complex interventions, operate in uncertain environments and can have unintended consequences. Ongoing monitoring and evaluation are essential to identify and mitigate these risks.

1.5. Iterative Refinement

AI systems should be designed for continuous improvement based on real-world feedback, data, and evaluation. This iterative process ensures that the system adapts over time and remains aligned with ethical standards and societal needs.

1.6. Feasibility and Acceptability

AI systems, like complex interventions, must be feasible and acceptable to the target audience. This requires careful consideration of technical, economic, social, and ethical factors.

Ensuring that AI systems are both feasible and acceptable is critical to their success and responsible deployment. These considerations echo those found in the design of complex interventions, where practical constraints and user acceptance shape outcomes. Feasibility encompasses technical, economic, and operational dimensions. AI systems must be implementable given current technological capabilities, scalable within resource limits, and sustainable over time.

 

1.6.1. Technical Feasibility: Can the system be built and deployed effectively, and can it run reliably with the available infrastructure and data?

1.6.2. Economic Feasibility: Is the system cost-effective and sustainable?

1.6.3. Social Acceptability: Will the system be trusted and embraced by its users? For example, will users trust an AI-powered virtual assistant to handle sensitive personal information?

1.6.4. Equity: Will access to the system and its benefits be distributed equitably across the groups it is intended to serve?

1.7. Economic and Societal Impact

The broader economic and societal impact of AI must be carefully evaluated, especially with regard to job displacement, privacy concerns, and the equitable distribution of benefits and risks.

 

2. Evidence Synthesis and Rigorous Evaluation

For responsible AI development, rigorous evaluation is as crucial as it is for complex interventions. This involves designing robust evaluation protocols and establishing an evidence base for best practices.

2.1. Designing Evaluation Protocols and Building an Evidence Base

AI systems should be assessed using rigorous methodology (e.g., RCTs and meta-analyses of robust evidence) and real-world outcome measures, drawing on real-world testing environments, control groups, and methods traditionally used in medical research and epidemiology.
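
To make the trial analogy concrete, here is a minimal sketch in Python of what a two-arm comparison could look like. The data are simulated, and the outcome measure, group sizes, and effect size are hypothetical; a real protocol would pre-register these choices and the analysis plan.

```python
# A minimal sketch of a two-arm evaluation: a primary outcome is measured in a
# control group (usual practice) and an AI-assisted group, and the difference is
# tested as in a simple RCT analysis. All data here are simulated; the outcome
# measure, group sizes, and effect size are illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical primary outcome (e.g., a task score from 0 to 100) per participant.
control = rng.normal(loc=62.0, scale=10.0, size=200)      # usual practice
ai_assisted = rng.normal(loc=65.0, scale=10.0, size=200)  # with the AI system

t_stat, p_value = stats.ttest_ind(ai_assisted, control, equal_var=False)
difference = ai_assisted.mean() - control.mean()

print(f"Mean difference: {difference:.2f} points (t = {t_stat:.2f}, p = {p_value:.4f})")
```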

AI safety and ethics should be informed by systematic reviews, empirical research, and collaboration between academia, industry, and other stakeholders, while adhering to best practices for transparent reporting. In medical and healthcare research, examples of frameworks for transparent and clear reporting include EQUATOR (Enhancing the Quality and Transparency of Health Research), CONSORT (Consolidated Standards of Reporting Trials), and PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses). These frameworks help ensure that research findings are transparent, reproducible, and methodologically sound; similar reporting standards should be adapted for AI safety research to maintain integrity and clarity in the evaluation protocols of AI systems.

2.2. Bias Auditing

Regularly auditing AI models for bias, diversifying data sources, and monitoring AI-human interactions can help mitigate biases. Bias auditing and ongoing monitoring (see, e.g., the work of Joy Buolamwini) should be carried out routinely to ensure that AI models remain fair and inclusive.
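
As a rough illustration, a basic audit can start by comparing a model's accuracy and positive prediction rates across demographic groups. The sketch below assumes that predictions, ground-truth labels, and a group attribute are available in a table; the data and group names are invented for illustration.

```python
# A minimal sketch of a bias audit: per-group accuracy and positive prediction
# rates are compared for a classifier. The table, group labels, and values are
# invented; a real audit would use held-out data, more groups, and additional
# fairness metrics.
import pandas as pd

audit = pd.DataFrame({
    "group":      ["A", "A", "A", "A", "B", "B", "B", "B"],
    "label":      [1, 0, 1, 0, 1, 0, 0, 1],
    "prediction": [1, 0, 0, 0, 1, 1, 0, 1],
})

summary = (
    audit.assign(correct=audit["label"] == audit["prediction"])
         .groupby("group")
         .agg(accuracy=("correct", "mean"), positive_rate=("prediction", "mean"))
)
print(summary)
print("Positive-rate gap between groups:",
      summary["positive_rate"].max() - summary["positive_rate"].min())
```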

 

3. Complex Adaptive Systems in AI: Key Properties

 

3.1. Emergence

Emergent properties in AI systems are often unanticipated outcomes that arise from complex interactions between the system’s components and its environment. As in social interventions, this makes predicting all outcomes difficult, particularly as AI systems interact with a vast and varied user base.

To address emergence, AI developers must consider the system’s potential interactions with its social environment and remain vigilant about unintended outcomes. This necessitates a flexible approach to refinement, allowing the system to evolve based on real-world feedback.
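
One crude way to operationalise this vigilance is to track outcome metrics after deployment and flag departures from a baseline for human review. The sketch below assumes a weekly complaint-rate metric and a simple threshold rule; both are illustrative stand-ins rather than a validated monitoring method.

```python
# A crude sketch of post-deployment monitoring for unintended outcomes: a weekly
# outcome metric (here, a hypothetical complaint rate) is compared against an
# early baseline window, and weeks that drift beyond a simple threshold are
# flagged for human review. The data and the 3-sigma rule are assumptions made
# for illustration.
import numpy as np

weekly_complaint_rate = np.array([0.021, 0.019, 0.020, 0.022, 0.021, 0.027, 0.031, 0.034])

baseline = weekly_complaint_rate[:4]                 # early post-launch weeks
threshold = baseline.mean() + 3 * baseline.std()

for week, rate in enumerate(weekly_complaint_rate, start=1):
    if rate > threshold:
        print(f"Week {week}: rate {rate:.3f} exceeds threshold {threshold:.3f} - flag for review")
```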

 

3.2. Feedback Loops

Feedback occurs when a change in one part of the system influences other parts, either reinforcing or mitigating the original change. For AI systems, feedback mechanisms can influence user behaviour in ways that amplify both positive and negative outcomes.

Designing responsible AI requires understanding and managing these feedback loops. Developers should regularly audit how AI systems influence user behaviour and be prepared to adjust algorithms to avoid reinforcing harmful patterns.
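
To see how quickly such a loop can amplify a small difference, here is a toy simulation of a hypothetical recommender that routes most new exposure to whichever item currently has more engagement. The numbers are invented; the point is only the qualitative rich-get-richer dynamic.

```python
# A toy simulation of a reinforcing feedback loop, assuming a hypothetical
# recommender that gives most new exposure to the item with more engagement.
# A small initial difference is amplified over time: more exposure leads to
# more engagement, which leads to more exposure. Numbers are invented.
item_a, item_b = 100.0, 110.0          # item B starts with a slight edge
new_engagement_per_step = 100.0

for step in range(20):
    leader_share, trailer_share = 0.8, 0.2          # the "algorithm" favours the current leader
    if item_b >= item_a:
        item_b += leader_share * new_engagement_per_step
        item_a += trailer_share * new_engagement_per_step
    else:
        item_a += leader_share * new_engagement_per_step
        item_b += trailer_share * new_engagement_per_step

total = item_a + item_b
print(f"Final engagement share: A = {item_a / total:.2f}, B = {item_b / total:.2f}")
```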

 

3.3. Adaptation

AI systems, like any part of a complex adaptive system, must be capable of adjusting to changes in the environment, whether those are technical, social, or regulatory shifts.

Adaptation in AI systems requires developers to anticipate the ways in which both users and other stakeholders may respond to the technology. By maintaining flexibility in design and deployment, AI systems can evolve alongside changing societal expectations or regulations.

3.4. Self-Organisation

Self-organisation refers to the spontaneous order that arises from local interactions, rather than from top-down control. In AI, this can manifest in decentralised systems or in how users co-opt technologies for new purposes.

 

AI developers can foster beneficial self-organisation by providing flexible, open systems that enable communities to contribute, iterate, and innovate. However, this must be balanced with governance mechanisms to prevent harmful or unethical uses.

Conclusion

The framework for complex intervention development and evaluation provides a valuable blueprint for managing the ethical and practical challenges posed by AI. By embracing principles such as contextual awareness, stakeholder engagement, iterative refinement, stakeholder acceptability and rigorous evaluation, we can guide the development of AI systems that are not only innovative but also fair, responsible, and safe for all.

 

Consideration of the properties of complex adaptive systems—emergence, feedback, adaptation, and self-organisation—offers a valuable paradigm for the responsible development of AI. When combined with a focus on feasibility and acceptability, these principles help ensure that AI systems are not only technically sound but also socially and ethically aligned with the needs of the communities they serve. By adopting this comprehensive approach, we can create AI systems that are adaptive, equitable, and sustainable in the long term.

 

1.  Skivington K, Matthews L, Simpson SA, Craig P, Baird J, Blazeby JM, et al. A new framework for developing and evaluating complex interventions: update of Medical Research Council guidance. BMJ. 2021 Sep 30;374:n2061.

2.  Craig P, Dieppe P, Macintyre S, Michie S, Nazareth I, Petticrew M, et al. Developing and evaluating complex interventions: The new Medical Research Council guidance. BMJ. 2008 Sep 29;337:a1655.

9 comments

Comments sorted by top scores.

comment by Seth Herd · 2024-10-12T19:00:47.515Z · LW(p) · GW(p)

I did wonder if this was AI written before seeing the comment thread. It takes a lot of effort for a human to write like an AI. Upvoted for effort.

I think this also missed the mark with the LW audience because it is about AI safety, which is largely separate from AGI alignment. LW is mostly focused on the second. It's addressing future systems that have their own goals, whereas AI safety addresses tool systems that don't really have their own goals. Those issues are important; but around here we tend to worry that we'll all die when a new generation of systems with goals are introduced, so we mostly focus on those. There is a lot of overlap between the two, but that's mostly in technical implementations rather than choosing desired behavior.

Replies from: aliya-amirova, aliya-amirova
comment by Aliya Amirova (aliya-amirova) · 2024-10-13T21:18:06.224Z · LW(p) · GW(p)

The main point I am trying to make is that AGI risks cannot be deduced or theorised solely in abstract terms. They must be understood through rigorous empirical research on complex systems. If you view AI as an agent in the world, then it functions as a complex intervention. It may or may not act as intended by its designer, it may or may not deliver preferred outcomes, and it may or may not be acceptable to the user. There are ways to estimate uncertainty in each of these parameters through empirical research. Actually, there are degrees to which it acts as intended, degrees to which it is acceptable, and so on. This calls for careful empirical research and system-level understanding.

comment by Aliya Amirova (aliya-amirova) · 2024-10-13T18:25:59.361Z · LW(p) · GW(p)

I write academic papers in healthcare, psychology, and epidemiology for peer-review. I don't write blog posts every day, so thank you for your patience with this particular style, which was devised for guidelines and frameworks.

Thank you for sharing your thoughts on AI alignment, AI safety, and imminent threats. I posted this essay to demonstrate how public health guidelines and system thinking can be useful in preventing harm, inequality, and avoiding unforeseen negative outcomes in general. I wanted the LessWrong audience to gain perspectives from other fields that have been addressing rapidly emerging innovations—along with their benefits and harms—for centuries, with the aim of minimising risk and maximising benefit, keeping the wider public in mind.

I am aware of the narrative around the 'paperclip maximiser' threat. However, I believe it is important to recognise that the risks AI brings should not be viewed in the context of a single threat, a single bias, or one path to extinction. AI is a complex system, used in a complex setting—the social structure. It should be studied with due rigour, with a focus on understanding its complexity.

If you can suggest literature on AGI alignment that recognises the complexity of the issue and applies systems thinking to the problem, I would be grateful.

comment by Aliya Amirova (aliya-amirova) · 2024-10-11T18:07:19.676Z · LW(p) · GW(p)

@Mitchell_Porter [LW · GW] What made you think that I am not a native English speaker and what made you think that this post was written by AI? 

Replies from: aliya-amirova, T3t
comment by Aliya Amirova (aliya-amirova) · 2024-10-11T18:53:35.811Z · LW(p) · GW(p)

@RobertM [LW · GW] @Mitchell_Porter [LW · GW]. 

 

I guess the standardised language for framework development fails the Turing Test. 

The title is a play on words, merging the title of the guidelines authored by The Medical Research Council, "Complex Intervention Development and Evaluation Framework" (1), with that of The Economic Forum for AI publication "A Blueprint for Equity and Inclusion in Artificial Intelligence" (2). The blog post I wrote closely follows the standardised structure for frameworks and guidelines, with specific subheadings that are easy to quote.

"Addressing Uncertainties" is a major requirement in the iterative process of development and refinement of complex intervention. I did not come up with it; it is an agreed-upon requirement in high-risk health application and research. 

Would you like to engage with the content of the post? I thought LessWrong was about engaging in a debate where people learn and attempt to reach consensus. 

Replies from: Mitchell_Porter
comment by Mitchell_Porter · 2024-10-11T20:54:59.724Z · LW(p) · GW(p)

My apologies. I'm usually right when I guess that a post has been authored by AI, but it appears you really are a native speaker of one of the academic idioms that AIs have also mastered. 

As for the essay itself, it involves an aspect of AI safety or AI policy that I have neglected, namely, the management of socially embedded AI systems. I have personally neglected this in favor of SF-flavored topics like "superalignment" because I regard the era in which AIs and humans have a coexistence in which humans still have the upper hand as a very temporary thing. Nonetheless, we are still in that era right now, and hopefully some of the people working within that frame will read your essay and comment. I do agree that the public health paradigm seems like a reasonable source of ideas, for the reasons that you give. 

comment by RobertM (T3t) · 2024-10-11T18:32:07.360Z · LW(p) · GW(p)

Not Mitchell, but at a guess:

  • LLMs really like lists
  • Some parts of this do sound a lot like LLM output:
    • "Complex Intervention Development and Evaluation Framework: A Blueprint for Ethical and Responsible AI Development and Evaluation"
    • "Addressing Uncertainties"
  • Many people who post LLM-generated content on LessWrong often wrote it themselves in their native language and had an LLM translate it, so it's not a crazy prior, though I don't see any additional reason to have guessed that here.

Having read more of the post now, I do believe it was at least mostly human-written (without this being a claim that it was at least partially written by an LLM). It's not obvious that it's particularly relevant to LessWrong. The advice on the old internet was "lurk more"; now we show users warnings like this [LW(p) · GW(p)] when they're writing their first post.

comment by Mitchell_Porter · 2024-10-10T22:55:44.530Z · LW(p) · GW(p)

(edit: looks like [LW(p) · GW(p)] I spoke too soon and this essay is 100% pure, old-fashioned, home-grown human)

This appears to be yet another post that was mostly written by AI. Such posts are mostly ignored. 

This may be an example of someone who is not a native English speaker, using AI to make their prose more idiomatic. But then we can't tell how many of the ideas come from the AI as well. 

If we are going to have such posts, it might be useful to have them contain an introductory note, that says something about the process whereby they were generated, e.g. "I wrote an outline in my native language, and then [specific AI] completed the essay in English", or, "This essay was generated by the following prompt... this was the best of ten attempts", and so on. 

Replies from: aliya-amirova, aliya-amirova
comment by Aliya Amirova (aliya-amirova) · 2024-10-11T17:53:55.375Z · LW(p) · GW(p)

Hey, be civil! That is not nice. I am a human, I did not use AI.