Why modelling multi-objective homeostasis is essential for AI alignment (and how it helps with AI safety as well)

post by Roland Pihlakas · 2025-01-12

Contents

  Introduction
  Why Utility Maximisation Is Insufficient
  Homeostasis as a More Correct and Safer Goal Architecture
    1. Multiple Conjunctive Objectives
    2. Task-Based Agents or Taskishness — “Do the Deed and Cool Down”
    3. Bounded Stakes: Reduced Incentive for Extremes
    4. Natural Corrigibility and Interruptibility
  Diminishing Returns and the “Golden Middle Way”
  Formalising Homeostatic Goals
  Parallels with Other Ideas in Computer Science
  Open Challenges and Future Directions
  Addendum about Unbounded Objectives
  Conclusion

I have noticed that there has been very little, if any, discussion of why and how homeostasis is significant, even essential, for AI alignment and safety. This post aims to begin amending that situation. Throughout, I will treat alignment and safety as explicitly separate subjects, both of which benefit from homeostatic approaches.

This text is a distillation and reorganisation of three of my older blog posts on Medium: 

I will probably share more such distillations or weaves of my old writings in the future.


Introduction

Much of AI safety discussion revolves around the potential dangers posed by goal-driven artificial agents. In many of these discussions, the agent is assumed to maximise some utility function over an unbounded timeframe. This simplification, while mathematically convenient, can yield pathological outcomes. A classic example is the so-called “paperclip maximiser”, a “utility monster” which steamrolls over other objectives to pursue a single goal (e.g. creating as many paperclips as possible) indefinitely. “Specification gaming”, Goodhart’s law, and even “instrumental convergence” are also closely related phenomena.

However, in nature, organisms do not typically behave like pure maximisers. Instead, they operate under homeostasis: a principle of maintaining various internal and external variables (e.g. temperature, hunger, social interactions) within certain “good enough” ranges. Going far beyond those ranges — too hot, too hungry, too socially isolated — leads to dire consequences, so an organism continually balances multiple needs. Crucially, “too much of a good thing” is just as dangerous as too little.

In this post, I argue that an explicitly homeostatic, multi-objective model is a more suitable paradigm for AI alignment. Moreover, correctly modelling homeostasis increases AI safety, because homeostatic goals are bounded — there is an optimal zone rather than an unbounded improvement path. This bounding lowers the stakes of each objective and reduces the incentive for extreme (and potentially destructive) behaviours.


Why Utility Maximisation Is Insufficient

In the standard utility maximisation framework, the agent’s central goal is to increase some singular measure of “value” or “reward” over time. Because such objectives are typically represented as unbounded (economic growth, for instance, can always be increased by some amount), maximisers tend to push them beyond any reasonable point. This dynamic is what yields the risk of “berserk” behaviour.

By contrast, homeostasis signals that there is an optimal zone, an “enough” amount, for each objective. Pursuing more than enough is wasteful and even harmful, so representing such objectives as unbounded maximisation targets would be grossly inaccurate. Take eating and drinking as an example: both activities need to be kept within their ranges and balanced against each other at some sensible time granularity.
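
To make the contrast concrete, here is a minimal toy sketch. It is illustrative only: the function names and the setpoint of 2 litres are arbitrary placeholders, not anything prescribed elsewhere in this post.

```python
# Contrast between an unbounded "more is always better" reward and a bounded,
# homeostatic one where both "too little" and "too much" are penalised.

def unbounded_reward(amount: float) -> float:
    """Monotonic reward: every extra unit counts as an improvement."""
    return amount

def homeostatic_reward(amount: float, setpoint: float = 2.0) -> float:
    """Bounded reward: peaks at the setpoint and falls off on both sides,
    e.g. daily water intake in litres."""
    return -(amount - setpoint) ** 2

if __name__ == "__main__":
    for litres in [0.0, 1.0, 2.0, 5.0, 20.0]:
        print(litres, unbounded_reward(litres), homeostatic_reward(litres))
```

Under the unbounded representation, 20 litres looks ten times better than 2 litres; under the homeostatic one it is clearly worse.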


Homeostasis as a More Correct and Safer Goal Architecture

1. Multiple Conjunctive Objectives

Real organisms typically have several needs (food, water, social connection, rest, etc.) that must all be satisfied to at least a “sufficient” level. A homeostatic agent works analogously: each objective has its own target range, and all of them need to be kept reasonably satisfied at the same time.

When an AI has multiple conjunctive goals, it cannot ignore most while excessively optimising just one. Instead, the synergy among objectives drives it toward a safer, middle-ground strategy. In effect, massive efforts in any single dimension become exponentially harder to justify because you are “pulling away” from the target ranges in others.

2. Task-Based Agents or Taskishness — “Do the Deed and Cool Down”

Homeostatic goals naturally lead to bounded or “task-based” behaviour rather than indefinite optimisation. If all current objectives are satisfied, a homeostatic agent is content to remain idle. It does not keep scanning the universe for hypothetical improvements once the setpoints are reached. This “settle to rest” feature significantly limits the potential damage the agent might cause. Even if such an agent accidentally does something very harmful on a personal level, it is less likely to affect entire nations.

For example, an AI tasked with making 100 paperclips in a homeostatic framework will strive to produce enough paperclips to satisfy the “paperclip count” objective. Once the target is met (often even just approximately) — and so long as other objectives are also in a comfortable range — it does not chase infinitely more micro-optimisations.
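
As a rough sketch of this “do the deed and cool down” behaviour (the tolerance value and function names below are my own illustrative choices), the toy loop keeps working only while the paperclip count is meaningfully below target and then settles to rest:

```python
# Toy illustration of task-based ("taskish") behaviour: the agent works until the
# setpoint is approximately reached, then idles instead of optimising forever.

TARGET_PAPERCLIPS = 100
TOLERANCE = 2          # reaching the target "often even just approximately" is fine

def make_paperclip(state: dict) -> None:
    state["paperclips"] += 1

def step(state: dict) -> bool:
    """Return True if the agent chose to act, False if it is content to idle."""
    deficit = TARGET_PAPERCLIPS - state["paperclips"]
    if deficit > TOLERANCE:
        make_paperclip(state)
        return True
    return False          # setpoint reached: no incentive to keep pushing

if __name__ == "__main__":
    state = {"paperclips": 0}
    while step(state):
        pass
    print("Settled with", state["paperclips"], "paperclips")
```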

3. Bounded Stakes: Reduced Incentive for Extremes

Because each objective is bounded, the stakes of a discrepancy in any one dimension remain limited by the need to stay balanced in the others. The system cannot keep ramping up its efforts in a single dimension without suffering detrimental side effects (such as drifting off course in every other dimension). This effect reduces the likelihood of extremist outcomes.

4. Natural Corrigibility and Interruptibility

A hallmark of AI alignment is ensuring that we can correct or shut down an agent if it goes off track. Simple maximisers often resist shutdown because they interpret it as a threat to their singular goal. By contrast, a homeostatic system has far less to lose from being paused or corrected: no single objective dominates its behaviour, and once its setpoints are satisfied it has no strong drive to keep running at all.


Diminishing Returns and the “Golden Middle Way”

An important corollary of multi-objective homeostasis is diminishing returns. Once a target is nearly satisfied in one dimension, the agent’s “bang for the buck” in that dimension is no longer large; it becomes more “cost-effective” to address other objectives that are further from their optimum. Thus the agent naturally rotates its attention toward whichever objectives are currently furthest from their setpoints, rather than over-investing in any single one.

This “golden middle way” extends even to safety measures themselves. An AI with an excessively strong safety constraint might be tempted to devote the entire universe’s resources to verifying it is 100% safe — unless that safety measure also has a built-in diminishing returns principle. By bounding the safety checks themselves, we avoid another pathological scenario: the AI going to extremes to confirm it has never ever caused any harm, while ignoring the rest of its objectives. I hope the trade-off here is clear: nobody would build, or give resources to, an agent that is “safe” to such an extreme, so that would be a losing strategy anyway.


Formalising Homeostatic Goals

In reinforcement learning or other goal frameworks, one can use a loss or negative-utility term for each objective. Each objective is measured against its target (also known as its “setpoint”), and both “too little” and “too much” push the system away from equilibrium. Summing or combining these losses in a way that penalises big deviations in any single dimension more strongly than small deviations in many fosters a “balancing” behaviour. One simple formula might be:

$$L = \sum_i w_i \,(x_i - x_i^{*})^2$$

Here:

  • $x_i$ is the current value of objective $i$;
  • $x_i^{*}$ is the setpoint (target value) of objective $i$;
  • $w_i$ is a weight expressing the relative importance of objective $i$.

Some objectives can also be framed as “safety constraints” about not disturbing the environment too much, and these can be folded into the same overall homeostatic reward system. These would be the “low impact” and “minimal side effects” objectives, and sometimes also the “keep future options open” and “maximise human autonomy” objectives. They are “negative goals” by nature: they are about not changing the existing state, as opposed to the “positive goals” (usually the “performance” objectives), which are about achieving some new state.
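
Below is a minimal sketch of how such a combined homeostatic loss could be computed. The objective names, setpoints, and weights are illustrative placeholders only; note that the “negative” low-impact objective simply gets a setpoint of zero.

```python
# Minimal sketch of a combined homeostatic loss: each objective has a setpoint,
# and deviations in either direction are penalised quadratically, so one large
# deviation hurts more than many small ones.

def homeostatic_loss(values: dict, setpoints: dict, weights: dict) -> float:
    return sum(
        weights[name] * (values[name] - setpoints[name]) ** 2
        for name in setpoints
    )

# Illustrative objectives: two "positive" performance goals and one "negative"
# low-impact safety goal whose setpoint is zero ("do not disturb the environment").
setpoints = {"paperclips": 100.0, "battery_level": 0.8, "environment_disturbance": 0.0}
weights   = {"paperclips": 1.0,   "battery_level": 5.0, "environment_disturbance": 20.0}

current = {"paperclips": 98.0, "battery_level": 0.5, "environment_disturbance": 0.1}
print(homeostatic_loss(current, setpoints, weights))
```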


Parallels with Other Ideas in Computer Science

As the machine-learning-minded among you may notice, the formula above is very similar to the loss of a regression algorithm, except that here the squared errors are computed over a plurality of objectives rather than a plurality of data points. In both cases the motivation is to avoid overfitting to a subset of the data or objectives.

I propose an additional perspective: The distinction between constraints versus objective functions in combinatorial optimisation is analogous to the distinction between safety objectives and performance objectives. 

It is notable that in combinatorial optimisation, constraints have always been a natural part of the problem setup, alongside objectives. Unfortunately, this has often not been the case in reinforcement learning. Yet in reality both performance and safety objectives should be present. 

One difference between “safety objectives” and “constraints in combinatorial optimisation” is that various safety objectives might be treated as “soft” constraints: they can be traded off up to a point, but not too much. When they are violated notably, they become increasingly similar to hard constraints in their effects.

In special use cases, some safety considerations can also be treated as “hard” constraints relative to the performance objectives, while still being traded off or balanced among a plurality of other safety objectives. (These are not just theoretical thoughts: I have implemented such advanced homeostatic setups, for example, in multi-objective workforce planning algorithms that have been successfully time-tested over the past 15 years.)
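
One way to express the “soft constraint that hardens” idea in code is sketched below. The threshold and growth rate are illustrative assumptions, not values from this post or from the workforce-planning system mentioned above.

```python
# Sketch of a soft safety constraint that behaves increasingly like a hard one:
# small violations are traded off gently, but beyond a threshold the penalty
# grows so steeply that it dominates any plausible performance gain.

def soft_constraint_penalty(violation: float,
                            threshold: float = 1.0,
                            steepness: float = 4.0) -> float:
    if violation <= 0.0:
        return 0.0                       # constraint satisfied, no penalty
    if violation <= threshold:
        return violation ** 2            # gentle, tradable penalty
    # Past the threshold the penalty explodes, mimicking a hard constraint.
    return threshold ** 2 + 1000.0 * (violation - threshold) ** steepness

for v in [0.0, 0.5, 1.0, 1.5, 3.0]:
    print(v, soft_constraint_penalty(v))
```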

Finally, the most obvious parallel: control systems and control theory. This framework inherently treats objectives as bounded and homeostatic. It supports changing setpoints and multiple dimensions/objectives, which can even be arranged hierarchically. Cybernetics, a broader and closely related field, is also worth mentioning; it is curious that it has evolved largely separately from AI. Is there a reason we don’t talk more about cybernetic alignment instead of AI alignment?


Open Challenges and Future Directions

While promising, a multi-objective homeostatic approach has many subtleties:

  1. Time Granularity
    • In the example of eating and drinking, both activities need to be balanced at some time granularity. It would be neither necessary nor productive to keep these objectives maximally balanced at all times. That would mean, for example, an agent running back and forth between food and water sources without stopping at either for more than the smallest timestep, which would ultimately leave it “equally” under-supplied on both objectives, since most of its time would be spent switching tasks. Deciding on an optimal time granularity, during which unequal treatment of competing objectives is allowed, requires a bit of strategic thinking on the agent’s part (see the sketch after this list).
  2. Handling Evolving Targets
    • Objectives (or their setpoints) may change over time, especially when humans are involved. Ideally, the agent should accept these new targets without undue manipulation such as resisting the changes, escaping, or reverting the objectives. Likewise, the agent should not itself induce such changes. In this post I have mentioned general principles for addressing this, but the details still need to be worked out and tested in practice.
  3. Soft vs. Hard Constraints
    • Not all objectives are created equal. Some constraints (e.g. “don’t kill humans”) might be hard constraints with extremely high weight or even lexicographic priority. But introducing lexicographic ordering may require some hard decisions. Additionally, one must be diligent and make sure that these higher-priority objectives cannot cause extreme losses in unacceptable dimensions of the lower-priority objectives. Alternatively, one should ensure that the higher-priority set includes all objectives in which extreme losses are undesirable.
  4. Goodhart’s Law and Multi-Scale Measures
    • We might measure safety and low impact at multiple “levels” (individual, community, global, etc.). Aggregating or normalising them incorrectly risks inadvertently re-enabling Goodhart-like exploits. For example, there is a possible error mode where one large aggregated discrepancy (let’s say, at the gender level) is split up into multiple small discrepancies (at the individual level) while being insufficiently represented at the higher level.
  5. Coordination of Multiple Agents
    • If multiple homeostatic agents co-exist, how do they avoid interfering with each other’s setpoints? Most likely some authorisation and power-hierarchy framework is relevant here. Such a system would then inevitably also need mechanisms for accountability and auditing of the higher-ups, who must be held liable when a subordinate system executes unsafe actions down the line.
  6. Tradeoffs Between “Interruptibility” and “Getting Work Done”
    • Ideally, we want the agent to do its job of achieving performance objectives while also remaining open to external corrections or even a shutdown. Balancing these is partly social/political and partly technical. Yet there is a well-known risk of systems becoming too important to shut down. Perhaps aligned systems could be throttled gradually, so that a “shutdown” command becomes a “chill down a bit” command.
  7. Integration of Bounded and Unbounded Objectives
    • Even after introducing the concept of homeostatic objectives, many performance objectives may still belong to the unbounded class. In contrast to the bounded objectives I focused on in this post, unbounded objectives could in principle reach arbitrarily large positive rewards. Yet these potential rewards should not dominate the safety objectives, nor crowd out the balancing of other performance objectives. This topic is explored in a bit more detail in the addendum below.
  8. Rate-Limited Objectives
    • Even though homeostatic objectives are bounded locally in time, in various cases they can be unbounded across time. For example, a human need for novelty may be satiated for a day, but it will arise again the next day. In other words, these are rate-limited objectives. Rate limiting is also important for sustainability purposes, to avoid exhausting the renewable resources in the environment, so homeostasis fulfils a sort of balancing role here as well. I imagine the expansion of humanity could often be described from such a perspective too: humanity may have an intrinsic need to one day grow beyond planet Earth, but such growth needs to be sustainable.
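
As a rough illustration of the time-granularity point (item 1 above), the sketch below lets an agent keep working on its current need until some other need is worse by a margin, instead of switching on every timestep. The margin, decay rates, and objective names are illustrative assumptions of mine, not a worked-out proposal.

```python
# Toy illustration of time granularity: instead of switching between "food" and
# "water" every timestep, the agent sticks with its current task until another
# deficit exceeds the current one by a hysteresis margin, avoiding thrashing.

import random

SETPOINT = 10.0
SWITCH_MARGIN = 3.0     # how much worse another objective must be before switching

def most_pressing(levels: dict) -> str:
    return min(levels, key=levels.get)   # the objective furthest below its setpoint

def simulate(steps: int = 50) -> None:
    levels = {"food": 10.0, "water": 10.0}
    current = "food"
    switches = 0
    for _ in range(steps):
        # Both needs decay over time; replenishing one does nothing for the other.
        for k in levels:
            levels[k] -= random.uniform(0.2, 0.6)
        candidate = most_pressing(levels)
        if candidate != current and levels[candidate] + SWITCH_MARGIN < levels[current]:
            current = candidate          # switch only when clearly worthwhile
            switches += 1
        levels[current] = min(SETPOINT, levels[current] + 1.0)   # work on current need
    print("levels:", levels, "task switches:", switches)

if __name__ == "__main__":
    simulate()
```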

These are open research challenges, but a homeostatic approach at least provides a conceptual blueprint of how to incorporate multi-dimensional, bounded goals into AI systems without inviting the pathological extremes of naive maximisation. This post is a conversation starter and there is so much to explore further.


Addendum about Unbounded Objectives

I fully acknowledge that various instrumental objectives could still have an unbounded nature: for example, accumulating money, or building guarantees against future risks. (Though in the case of risks, people familiar with the concept of antifragility might argue that reducing risk too much in the short term would paradoxically mean more fragility in the long run.)

Even so, pursuing these unbounded objectives becomes safer when the homeostatic principles described above are applied to the relevant bounded objectives alongside them.

Additionally, unbounded objectives are usually described by concave utility functions in economics. In other words, these objectives too have their own form of diminishing marginal returns. Besides biology, economics is the other fundamental and well-established field describing our needs; AI alignment should surely consult both. An aligned agent should be able to model diminishing returns, and thus balance between a plurality of unbounded objectives, just as well as it should be able to properly model homeostasis. As economists have observed, humans prefer averages in all objectives to extremes in a few.
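
A minimal sketch of this point, under my own illustrative assumptions (logarithmic utilities and a fixed effort budget), shows that a concave utility already rewards spreading effort across objectives rather than going to an extreme in one:

```python
# Sketch: with concave (diminishing-returns) utilities, splitting a fixed budget
# across several unbounded objectives yields more total utility than pouring
# everything into a single one.

import math

def total_utility(allocations: list[float]) -> float:
    # log(1 + x) is one standard concave utility with diminishing marginal returns.
    return sum(math.log1p(x) for x in allocations)

budget = 10.0
extreme  = [budget, 0.0, 0.0]                 # everything into a single objective
balanced = [budget / 3] * 3                   # the "golden middle way"

print("extreme: ", total_utility(extreme))    # ~2.4
print("balanced:", total_utility(balanced))   # ~4.4
```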

You can read more about balancing multiple unbounded objectives via diminishing returns in the following blog posts I have co-authored:


Conclusion

Homeostasis — the idea of multiple objectives each with a bounded “sweet spot” — offers a more natural and safer alternative to unbounded utility maximisation. By ensuring that an AI’s needs or goals are multi-objective and conjunctive, and that each is bounded, we significantly reduce the incentives for runaway or berserk behaviours.

Such an agent tries to stay in a “golden middle way”, switching focus among its objectives according to whichever is most pressing. It avoids extremes in any single dimension because going too far throws off the equilibrium in the others. This balancing act also makes it more corrigible, more interruptible, and ultimately safer.

In short, modelling multi-objective homeostasis is a step toward creating AI systems that exhibit the sane, moderate behaviours of living organisms — an important element in ensuring alignment with human values. While no single design framework can solve all challenges of AI safety, shifting from “maximise forever” to “maintain a healthy equilibrium” is a crucial part of the solution space.


Thanks for reading! If you have thoughts, questions, improvement suggestions, resource and collaborator references, feedback, ideas, or alternative formulations for multi-objective homeostasis, please share in the comments.
