Developing AI Safety: Bridging the Power-Ethics Gap (Introducing New Concepts)

post by Ronen Bar (ronen-bar) · 2025-04-20T04:40:42.983Z · LW · GW · 0 comments

This is a link post for https://forum.effectivealtruism.org/posts/FREY5kyC8mWr5ow4S/developing-ai-safety-bridging-the-power-ethics-gap

Contents

  TLDR
  My Point of View
  Human History Trends
  The Focus of the AI Safety Space
  Suggesting New Concepts, Redefining or Highlighting Existing Ones

TLDR

This post can be seen as a continuation of this [EA · GW] post.

(To further explore this topic, you can watch a 34-minute video outlining a concept map of the AI space and potential additions. Recommended viewing speed: 1.25x).

This post drew some insights from the Sentientism podcast and the Buddhism for AI course.

My Point of View

I am looking at the AI safety space mainly through three fundamental questions: What is? What is good? How do we get there?

Historically, human power, driven by increasing data and intelligence, has scaled rapidly and exponentially. Our ability to understand and predict "what is" continues to grow. However, our ethical development (our understanding of "what is good") is not keeping pace. The power-ethics gap is like a car accelerating ever faster while the driver's skill improves only slightly as the ride goes on. This is arguably one of the most critical problems globally. The imbalance has contributed significantly to suffering and killing throughout history, potentially more in recent times than ever before. The widening power-ethics gap appears correlated with large-scale, human-caused harm.

The Focus of the AI Safety Space

Eliezer Yudkowsky, who describes himself as 'the original AI alignment person,' is one of the most prominent figures in the AI safety space. His philosophical work, the many concepts he created, and his discussion forum and organizations have significantly shaped the AI safety field. I am in awe of his tremendous work and contribution to humanity, but he has a significant blind spot in his understanding of "what is". Yudkowsky operates within a framework in which, by his own claim, (almost) only humans are considered sentient, whereas scientific evidence suggests that probably all vertebrates, and possibly many invertebrates, are sentient. This discrepancy is crucial: one of the key founders of the AI safety space has built his perspective on an unscientific assumption that limits his view to a tiny fraction of the world's sentient beings.

The potential implications of this are profound, and they highlight the necessity of re-evaluating AI safety from a broader ethical perspective, one encompassing all sentient beings, both present and future. This requires introducing new concepts and potentially redefining existing ones. The work is critical because the pursuit of artificial intelligence is primarily focused on increasing power (capabilities), and so risks further widening the existing power-ethics gap within humanity.

Since advanced AI poses the threat of taking control and mastery away from humans, two crucial pillars for AI safety emerge: maintaining meaningful human control (power) and ensuring ethical alignment (ethics). Currently, the field heavily prioritizes the former, while the latter remains underdeveloped. From an ethical perspective, particularly one concerned with the well-being of sentientkind ('sentientkind' being analogous to 'humankind' but inclusive of all feeling beings), AI safety and alignment could play a greater role. Given that AI systems may eventually surpass human capabilities, the values embedded in them will have immense influence.

We must strive to prevent an AI-driven power-ethics gap far exceeding the one already present in humans.

Suggesting New Concepts, Redefining or Highlighting Existing Ones

[Image: A monkey (representing evolution), a human, an AI, and a superintelligence. To achieve a good world, we probably need the last three to be aligned.]
