Beyond Biomarkers: Understanding Multiscale Causality

post by Matěj Nekoranec (matej-nekoranec) · 2024-07-07T09:56:22.554Z · LW · GW · 0 comments


  Top-down or bottom-up?
  We are generalization machines
  Let’s select the right operational space
  Personomics: A Multiscale Approach
  Case study of global health crisis
  Reference list:

In exercise science, we typically derive causality in a bottom-up manner. When we evaluate performance, we assess factors such as cardiovascular capacity, metabolic efficiency, or muscular contractile capacity. However, I’ve always grappled with a chicken-and-egg dilemma in exercise physiology. This dilemma highlights the challenge of understanding sequences of events where mutual dependencies exist — each outcome depends on a preceding event, and vice versa.

Consider a simple example: biomechanical testing of an NBA basketball player might reveal that certain parameters (x, y, z) predispose them to excel at that competition level. However, we can also argue that these parameters likely developed in response to the competitive demands of the game. As players advance to higher leagues, they face greater technical demands, which drive their development and the evolution of their biomechanical parameters.

This creates a paradoxical situation. If structure gives rise to behaviour, but structure simultaneously evolves in response to environmental constraints, where does causality lie? What comes first, the chicken or the egg? Does causality arise from the bottom-up physiological blueprint or the top-down constraints of a specific ecological niche?

Top-down or bottom-up?

While trying to understand biological causation, I came across an interesting stream of research from the well-known Oxford physiologist Denis Noble, a pioneer of the mathematical modelling of cardiac electrophysiology. He discusses the concept of biological relativity, according to which no level of the biological hierarchy holds privileged causation (Noble et al., 2019). In simple terms, lower levels are responsible for dynamics, while higher levels constrain the lower levels by setting boundary conditions.

Example derived from Noble et al. (2019). Global cell properties, such as electric potential, regulate molecular-level properties, such as ion channel proteins, which in turn influence changes in cell properties.

Noble explains that the differential equations describing lower biological levels have an infinite set of solutions until they are constrained by higher-level boundary conditions. This means that multiscale nested biological systems operate as a two-way street, with higher emergent levels imposing boundary conditions on lower levels and thus serving as top-down controllers. While low-level descriptions define the system’s dynamics, the particular solution to these dynamics is selected by top-down constraints, such as an athlete making decisions. So how does adaptation work across different levels?

Diagram showing different scales of biological hierarchy. Derived from Noble et al. (2019).
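
As a toy illustration of the point about boundary conditions (not taken from Noble et al., 2019): the first-order equation dV/dt = -(V - V_rest)/tau has the general solution V(t) = V_rest + C·exp(-t/tau), one curve for every value of the constant C. Only a condition supplied from outside the equation selects a single trajectory. The function name, the parameters `v_boundary`, `tau`, and `v_rest`, and all numbers below are assumptions made for illustration.

```python
import math

def relaxation(t, v_boundary, tau=1.0, v_rest=0.0):
    """Solution of dV/dt = -(V - v_rest)/tau that satisfies V(0) = v_boundary.

    The dynamics (the equation) are identical for every call; the
    hypothetical boundary value picks one curve out of the infinite family.
    """
    return v_rest + (v_boundary - v_rest) * math.exp(-t / tau)

# Same low-level dynamics, three different constraints imposed from outside.
for v0 in (-1.0, 0.5, 2.0):
    trajectory = [round(relaxation(t, v0), 3) for t in (0.0, 1.0, 2.0)]
    print(v0, trajectory)
```

The equation alone does not say which of the three printed trajectories occurs; that information enters only through the boundary value.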

We are generalization machines

One of the outstanding preprints published this year introduced a novel concept that reformulates adaptation and natural selection as rules of induction (Buckley et al., 2024).

In simple terms, the concept works as follows. Consider the cardiovascular system. We impose a top-down stimulus in the form of a specific problem with certain parameters, for instance, training. The system’s immediate reaction is to optimize within its existing structural capacity. In other words, it seeks a local minimum (see diagram below) as a solution to the immediate external problem. If the external problem persists, the model parameters must change to find a better solution than the initial local minimum. For example, when someone begins training, their cardiovascular response will be very aggressive. Over time, however, the model parameters adjust to handle the external problem of exercise better (a change in structure). This suggests that structure evolves, to some extent, as a generalized model of the problem at hand. This, in turn, changes the problem space again in an iterative loop. A direct quotation from Buckley et al. (2024) helps clarify the concept:

“Recognizing that adaptation requires learning, and learning requires generalization, and generalization requires an inductive bias, helps us to understand how adaptation really works and what is required.”

This means that every level or scale of the biological hierarchy learns, or generalizes, from the data it receives. However, the type of data varies across biological scales and structures. For example, the hearts of strength and endurance athletes show distinct structural and morphological adaptations, such as differences in heart wall thickness (Mihl et al., 2008). This suggests that generalization is driven by the stress as perceived from the heart’s perspective, which can be simplified as a function of pressure and blood volume.

Diagram borrowed from Buckley et al. (2024). Iterative loop between optimisation and learning. In the first step, the system finds a local minimum; it then needs to find a better solution than that local minimum by adjusting the model parameters, i.e., by learning.
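
The loop between fast optimisation and slow learning can be sketched with a deliberately simple toy model. This is not the model from Buckley et al. (2024); it is a minimal sketch under stated assumptions: a single state `x`, a single structural parameter `w` (the state the system "prefers"), an external demand `d`, and a quadratic energy with hypothetical stiffness constants `k` and `f`. Because the energy is convex, the local minimum here is also the global one.

```python
def energy(x, w, d, k=1.0, f=1.0):
    """Internal strain (distance from preferred state w) plus
    external strain (distance from the demand d)."""
    return 0.5 * k * (x - w) ** 2 + 0.5 * f * (x - d) ** 2

def relax_state(x, w, d, k=1.0, f=1.0, lr=0.1, steps=200):
    """Fast loop: gradient descent on the state x, parameters held fixed."""
    for _ in range(steps):
        grad = k * (x - w) + f * (x - d)
        x -= lr * grad
    return x

def adapt(w=0.0, d=1.0, eta=0.2, cycles=30):
    """Slow loop: after each relaxation, nudge the parameter w toward
    the stressed state, so the same demand costs less energy next time."""
    x = w
    history = []
    for _ in range(cycles):
        x = relax_state(x, w, d)          # optimise within current structure
        history.append(energy(x, w, d))   # cost of the compromise reached
        w += eta * (x - w)                # structural change ("learning")
    return w, history

w_final, history = adapt()
print(round(history[0], 4), round(history[-1], 6), round(w_final, 3))
```

In this sketch the first cycle is the most "expensive" one, and the residual energy shrinks on every cycle as `w` drifts toward the demand `d`, which loosely mirrors the aggressive early response to training followed by structural adjustment.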

Conceptualizing adaptation through natural induction can be useful, but it must always be connected to the relevant operational domain, that is, to the appropriate scales at which to operate. For instance, if we want to address genetic mutations in ion channels, we should use the language of molecular biology; psychiatry will not be very effective in this case. But how useful is it to describe large-scale behaviour, such as athletic performance, using only molecular biology?

Let’s select the right operational space

Olav Aleksander Bu, coach of Kristian Blummenfelt, put it well: at a certain point, we need to put all the granularity into a black box, because trying to understand every detail may not be useful. When dealing with a concept as broad as performance, which spans multiple scales, we should prioritize the scale, and the subsequent interventions, where we get the biggest “bang for the buck”.

As we go deeper into the biological hierarchy, we face a much larger combinatorial space, which can be hard to handle at the practical level. A large combinatorial space means that we cannot solve all the problems at all scales at the same time, because the number of possible interventions across scales would be almost infinite. For example, structural remodelling of the heart and the lactate shuttle mechanism operate at completely different spatiotemporal scales. However, they both respond to the same top-down constraint, exercise, which increases metabolic rate (and lactate production) and therefore the need for increased blood flow.
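
A back-of-the-envelope sketch of why the space grows so quickly, under a strong simplifying assumption: every candidate intervention ("lever") is either applied or not, and levers at different scales can be combined freely. The lever counts per scale are hypothetical numbers chosen only to show the growth, not estimates from the literature.

```python
def combinations_by_depth(levers_per_scale):
    """Cumulative number of on/off intervention combinations as each
    deeper scale is added to the analysis (assumes binary levers)."""
    totals = []
    levers_so_far = 0
    for levers in levers_per_scale:
        levers_so_far += levers
        totals.append(2 ** levers_so_far)
    return totals

# Hypothetical: 3 levers at the personal level, 10 at the organ level,
# 50 at the cellular level.
print(combinations_by_depth([3, 10, 50]))
# [8, 8192, 9223372036854775808]
```

Even with these modest made-up counts, adding two deeper scales takes the number of possible intervention combinations from eight to roughly nine quintillion.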

As we move up the biological hierarchy, data representation becomes more coarse-grained and less detailed. By selecting the right high-level parameter (e.g., total training volume), we can influence multiple low-level biomarkers simultaneously without having to go down and target each low-level parameter at its own scale. This can also be limiting, however, because the missing granularity of high-level data can restrict our understanding in certain scenarios.

Visualization of the hierarchical structure of different biological scales. We can start from an initial level (n-level) at which certain phenomena arise. For example, performance arises at the personal level. Subsequent lower levels (n-1, n-2, etc.) can be derived based on utility. With increasing depth, we expand the combinatorial space for problem-solving across biological scales.

Furthermore, data representation is not limited to a third-person perspective. As noted above, performance spans multiple scales, and one of them is the scale of being an athlete: making decisions and perceiving the world from a first-person position. Factors such as perceived self-efficacy have been linked to the likelihood of injury (Olmedilla et al., 2018). For example, we can quantify load in terms of overall running volume, but how athletes subjectively interpret this load varies widely. Does this suggest that third-person data reflects objective truth while first-person data represents subjective bias? Not necessarily. These perspectives simply operate at different scales. Total training load can be linked to the physiological level, while self-efficacy reflects personal perception and experience at a higher level, that of an athlete who is contextually embedded within an environment.

This leads us to the idea that connecting different scales is the way forward in modern data analysis. However, despite significant advances in data acquisition and analytics, I have a feeling that we lack conceptual agreement on how to conduct the analysis. Sometimes we operate in completely different operational domains or scales and then argue about who has the better predictive ability. Is there a way to connect it all together?

Personomics: A Multiscale Approach

Personomics is a very niche stream of conceptual research, but it effectively targets the multiscale nature of the human body (Constant, 2024; Ziegelstein, 2020). While focusing solely on either bottom-up or top-down approaches can be limiting, connecting these levels can be an effective strategy for addressing global challenges such as ageing, precision medicine, or high performance.

Personomics could work by stratifying physiological data according to specific contextual situations and examining how strongly neighbouring scales correlate. For instance, wearable devices may show very low heart rate variability, which suggests high sympathetic activation and poor coping with stress. However, this finding means little on its own if the person is primarily constrained by a personal context, such as a stressful period following the loss of a job. First-person assessments, such as self-reports, can serve as a gateway to contextualizing physiological data.
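
A minimal sketch of what such stratification could look like in practice. The record layout, the field names `hrv_ms` and `context`, the context labels, and all values are hypothetical and exist only to illustrate the idea; real data would need proper statistics rather than a simple mean.

```python
from collections import defaultdict
from statistics import mean

def stratify_by_context(records):
    """Group heart rate variability readings by self-reported context
    and return the mean reading for each context."""
    groups = defaultdict(list)
    for record in records:
        groups[record["context"]].append(record["hrv_ms"])
    return {context: mean(values) for context, values in groups.items()}

# Hypothetical wearable readings paired with self-reported context.
records = [
    {"hrv_ms": 62, "context": "baseline"},
    {"hrv_ms": 58, "context": "baseline"},
    {"hrv_ms": 35, "context": "job loss"},
    {"hrv_ms": 31, "context": "job loss"},
]
print(stratify_by_context(records))
```

Pooled together, these readings would look like a noisy, moderately low average; split by context, the low values are clearly tied to one life situation rather than to physiology alone.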

This approach aligns well with current developments in Natural Language Processing (NLP), which can be used to differentiate semantic and contextual clusters in self-reports and link them to specific physiological responses. By integrating NLP with low-level physiological data, we can build a more comprehensive understanding of how different scales of the biological hierarchy interact.

Case study of global health crisis

It seems paradoxical that despite increasing information about the positive effects of exercise, nutrition, and sleep, the overall health of the Western population continues to decline.

The main problem is that we are operating at completely wrong scales. While biomarkers and physiological data give shape and contour to a global health crisis, to pull effective causal levers, we need to shift problem-solving to a completely different space — not physiological, but personal or even societal. These are the primary scales from which we constrain downstream physiological parameters.

For example, we know that a higher daily step count is reliably associated with general health benefits. However, consider an imaginary person, Joe, who has a typical corporate career and commutes 90 minutes daily to work. While he could cut down on work to prioritize his health, this is unlikely, because his primary optimization problem is his career, not his health. In other words, his body has learned to optimize under the constraints of a corporate career, generalizing in a way that degrades physical health.

Diagram showing the interconnectedness between different scales and how certain top-down constraints can influence lower levels.

We can either try to solve the problem at Joe’s level, by changing his perspective or working environment, or go even higher, to the societal level, and recognize that certain biological parameters are downstream consequences of top-down constraints to which biology has generalized. This leads to the idea of social engineering and to the question of how societal structures map back onto physiology.

Expanding this problem to society, we can argue that the global health crisis is a generalized rule imposed by societal priorities on what we need to optimize for.


We kicked off this article with the classic chicken-and-egg dilemma, only to discover that it is less about which came first (where causality lies) and more about the constant interaction between structure and environment. The key lies in the operational domain we choose to analyze, considering both the dynamics of lower-level scales and the constraints imposed from the top down. It seems that causality is fixed only in our models; in biology, it flows continuously across levels and yields to our interventions only if we choose the right operational domain and gather enough contextual information. This conceptualization points toward a form of data holism (e.g., personomics) that can dynamically traverse biological hierarchies.

Reference list:

Buckley, C. L., Lewens, T., Levin, M., Millidge, B., Tschantz, A., & Watson, R. A. (2024). Natural Induction: Spontaneous adaptive organisation without natural selection. In bioRxiv.

Constant, A. (2024). Personomics: Precision psychiatry done right. The British Journal for the Philosophy of Science.

Mihl, C., Dassen, W. R. M., & Kuipers, H. (2008). Cardiac remodelling: concentric versus eccentric hypertrophy in strength and endurance athletes. Netherlands Heart Journal: Monthly Journal of the Netherlands Society of Cardiology and the Netherlands Heart Foundation, 16(4), 129–133.

Noble, R., Tasaki, K., Noble, P. J., & Noble, D. (2019). Biological Relativity requires circular causality but not symmetry of causation: So, where, what and when are the boundaries? Frontiers in Physiology, 10, 827.

Olmedilla, A., Rubio, V. J., Fuster-Parra, P., Pujals, C., & García-Mas, A. (2018). A Bayesian approach to sport injuries likelihood: Does player’s self-efficacy and environmental factors plays the main role? Frontiers in Psychology, 9.

Ziegelstein, R. C. (2020). Personomics: The missing link in the evolution from precision medicine to personalized medicine. In The Road from Nanomedicine to Precision Medicine (pp. 957–966). Jenny Stanford Publishing.

