Collaboration in Science: Happier People ↔ Better Research

post by nadinespy · 2023-06-08T17:42:40.516Z · LW · GW · 0 comments


This is a cross-post from my personal blog and the Effective Altruism Forum [EA · GW]. I haven’t modified anything w.r.t. publishing it on LessWrong, so there may be small details that do not fit the LW context (e.g., I ask people to write me an email if they want to provide feedback, but obviously, there is no need for that on this platform), so please bear this in mind.

TL;DR

Shifting from individualistic to collaborative work in academia can improve scientific progress and increase well-being. This may be particularly important in the mind sciences, where correct research methodology is challenging. Current academic incentives hinder collaboration, favouring narcissistic or anti-social personalities. Fundamental change may require rebuilding the academic space on different principles, accepting temporary instability for long-term benefits. This post provides both a big-picture view and a detailed, comprehensive treatise on the topic of collaboration, with a special focus on the mind sciences, and many further resources to look into.

Blog post summary (longer TL;DR)

Table of contents

  1. Isolation in science 
  2. An appeal for collaboration
    1. Have more people do the actual research
    2. Have more task division, external feedback, and exchange up the hierarchy
    3. A look into the non-academic sector
    4. Are collaborative structures stifling scientific creativity and solitary work?
    5. Wrap-up
  3. An appeal for collaboration in the mind sciences
    1. On the nature of the mind sciences
    2. Why collaboration and feedback help
    3. Towards a shared, multi-disciplinary understanding of the subject of study
  4. Towards moral values in science
  5. How incentives prevent us from rigorous and value-based science & what change is needed
    1. Current incentives in academia
    2. Rebuilding the academic space - a quest for experiments and prioritizing long-term benefits
  6. Acknowledgements
  7. References
    1. Scientific papers
    2. Other resources/links

Recently, I organized – alongside two colleagues of mine, Federico Micheli and Tomasz Korbak – a symposium on Rethinking Computational Approaches to the Mind. It gave me an opportunity to ask what the challenges and possible ways forward are in scientific fields that, in the broadest sense, study the (human) mind using computational approaches, and to re-evaluate what I in particular could or should be focusing on.[1]

This is a hard question. As part of the answer that I will provide in the following, I will mainly talk about collaboration in scientific projects.

A few notes before I set off: 

1.0 Isolation in science

To better understand the problem, let's examine the typical experience of individual scientists working as part of a research lab or group – what is it actually like to do science from this point of view? 

A great portion of scientific results are probably produced and communicated in the following way: one person (most likely a doctoral or post-doctoral researcher) is largely responsible for carrying out investigations and writing them up in a paper. Meanwhile, a group of probably fewer than 10 people (or whatever the practice is in a given subfield), including the Principal Investigator (PI), contributes occasional feedback during the process (often meaning once or twice a year), helps optimize a given study for publication, and reviews the paper once a first draft has been delivered. Thus, most of the time, only one person is deeply involved in a given project. Some people and some projects can flourish in this way, but I believe that will only be true for a minority. My experience is that people would often prefer a set-up in which they can work more closely with others, thereby learning from each other more efficiently and getting feedback more quickly. I also believe it would make many projects, and thus science at large, better.

Being the sole investigator is an unfortunate situation: the risk of being too confident and ignoring serious flaws in one’s research is real, and so is being trapped in a bubble with little to no self-correcting mechanisms. High-level (infrequent) feedback from folks who are otherwise not deeply engaged in the project can’t compensate for mistakes in the actual day-to-day work, e.g., in code or other modes of implementation, in designing and carrying out experiments, in analyses or mathematical proofs, or in lines of reasoning whose flaws would become apparent only to someone strongly involved with the material. Over the years, I have noticed that while everyone in academia is terrified of the potential flaws that remain undiscovered in their work, everyone equally accepts and/or ignores this (willingly or unwillingly) – because that’s the practice, and it’s effortful to go against it.

I have also heard many times about the suffering that researchers go through when they feel the burden of loneliness whilst doing their research – they often lack timely feedback on their work, are overwhelmed by the amount and diversity of tasks they need to take on to lead their projects through different stages, and struggle with prioritizing and being their own managers (this is particularly true for post-doctoral researchers). Or, in contrast, they may have little to do, or not know what to do at all, for large periods of time – and thus waste a lot of it (this is particularly true for doctoral researchers). None of these struggles are helped by the feeling that the outside world cares little about how they are doing (both personally and work-wise). There is no feedback loop communicating the message “Hey, your work is important” – a message that has tremendous effects on productivity.

This situation is most acute for doctoral researchers. They are probably the most isolated – and most vulnerable – group of researchers in the academic sphere, despite being responsible for a significant portion of published research.

2.0 An appeal for collaboration 

In the following, I will make the argument that collaboration does two things: it makes people happier – projects become easier, and pain is alleviated (particularly for junior scientists like doctoral researchers) – and it makes scientific results more accurate and more robust by having error correction built in. I believe that, many times, the best way to more quickly and substantially improve both scientific work and the well-being of researchers is to establish collaborative patterns and feedback loops at various levels. These include working with others more closely on projects, getting more external feedback, collaborations or adversarial approaches between groups, as well as more exchange between funders and researchers, to name a few examples.
While I use the term “collaboration” in a very liberal way (meaning anything from mere feedback from others on solo work, to close partnership in a project, to greater ties between researchers and policy-makers, and anything in between), and look at how people could work together at various levels of organization, I am particularly concerned with collaboration at the “lowest level”, i.e., where the actual research is done. In what follows, I argue that in a given research project, close collaboration with a team-player rather than a one-against-all spirit is often favourable.

Somewhere on the Snowdon Horseshoe, Wales. (CC-BY-SA 4.0)

2.1 Have more people do the actual research

I find it crucial to have more research projects or tasks in which people are (more) equally invested time-wise. For instance, have two or more PIs leading a lab instead of one, or get rid of the pyramidal model (the “1-lab-1-PI model”) entirely, as suggested by Romain Brette in this post – a model in which power is concentrated in one (most likely) permanently employed PI employing a temporary workforce – and, instead, distribute financial responsibilities, and therefore power, more equally among researchers. Also, let the actual research be done not only by less experienced people “lower” in the academic hierarchy, but also by people who have many years or decades of experience, e.g., the PIs. Let people work on the codebase together or share equal responsibilities for paper writing. Or let doctoral researchers do projects together (or with their supervisors).[2]

Everyone who has worked with another individual or a group of people truly collaboratively, i.e., where people share the time spent on a project as well as the risks of failing, knows how much better things turn out to be – with much less time wasted, much less redundancy, and very likely fewer errors – because two or more heavily invested and interested minds are better than one. Things like code review, the questioning of assumptions, or getting the details right in the design of a new experimental study are a lot easier if done together. Of course, collaboration does not guarantee that truth will be found – but, in conjunction with external corrective input (see more on that in the next section), it does help raise the probability that wrong ideas are incrementally replaced by (slightly) less wrong ideas, or that there is at least a better idea of where the barriers are.
Collaboration is also more fun than the current hyper-competitive, individualistic culture that dominates the research space, as people have more certainty, less isolation, and therefore less toxicity in their work environment – a win-win situation for both people and science. It would be in the interest of science – and, as a consequence, of society – to implement more collaborative patterns of that form. Mistakes and redundancies would be avoided or discovered more quickly, which would accelerate the progress curve. This is a more sustainable way to move in the right direction faster.

2.2 Have more task division, external feedback, and exchange up the hierarchy

To make collaboration work in the science space, we also need to acknowledge the merit of task division, and have a less strict division between “scientists” and “supporters” (such as, e.g., research software engineers, research data stewards, research infrastructure developers, research computing support officers, research community managers etc.) when crediting research contributions. Although people in non-traditional, but research-related roles often do contribute to research in a significant manner, their work is systematically undervalued. Good collaboration comes with crediting everyone’s contribution appropriately.[3]

Moreover, people forming a coalition to work on a project will depend on further feedback from people outside of it – just like one person may engage in false/problematic beliefs or bad practices, so may groups of people. Thus, a group will likewise rely on external corrective input. There’s a sweet spot to be found in terms of how varied this input should be, but as a heuristic, the more pluralistic it is, the more people can get the best out of their projects.

On a more systemic level, a tighter overall dialogue between science, policy-making, and funding would be desirable to ensure that good research is funded while bad research isn’t. Currently, exchange between researchers and funders is not, or only weakly, cultivated, which may result in bad decisions on the funders’ side: often, they are not aware of bad research practices and therefore fund the wrong projects, leading to research waste (i.e., research on which funds have been spent – avoidably – in vain). To avoid research waste, researchers should flag bad research practices as a problem to inform policy-making. Some people alternatively suggest improved peer review to better judge the quality of research. However, given the number of research studies being conducted, the demand for thorough peer review is too high, and there is a natural limit to the number of people who can do reviews. One solution could be to simply conduct fewer studies and therefore publish fewer papers. (Doing fewer rather than more studies may also naturally reinforce collaborative structures.)

2.3 A look into the non-academic sector 

To create a stark contrast with academic individualism, let’s consider a (newly launched) start-up: normally, to achieve the start-up’s goals, it will be necessary to collaborate heavily within teams as well as, potentially, with people outside the start-up. This entails good task division. The start-up is necessarily tightly linked to the outside world – feedback will be quick (and often brutal), so corrective forces will necessarily unfold (or the start-up dies). I think a healthy portion of this kind of close collaboration and solid embedding into greater feedback structures would be good for science at large. A good example of close collaboration where people do different things, but are overall well integrated into a whole, is Anthropic (not a start-up anymore, but a large scale-up) – an AI safety and research company that aims to build interpretable and safe AI systems: some employees do scaling work, some do interpretability work, others focus on human feedback work, and yet another team does societal impacts work, all of which, as Chris Olah (one of the co-founders) put it in an 80k podcast, is united "[…] in a single package and in a tightly integrated team. […] We’re all working very closely together, and there’s very blurry lines, and lots of collaboration."

Ashdown Forest, East Sussex. (CC-BY-SA 4.0)

To be clear, I am not suggesting that science should operate exactly the way industry, or any other non-academic sector, does. I don’t know what the best working structures for science would precisely look like – but I do know that we can benefit from the vast knowledge and experience on teamwork that non-academic sectors have on offer. It’s not black and white – we could surely adopt some things from fields outside of academia to make better, more efficient, and more impactful science. E.g., in a Clearer Thinking podcast episode on so-called Focused Research Organizations (FROs), Adam Marblestone explains how science can learn from start-ups. An FRO would tackle projects that 

“[…] would produce public goods, such as data sets or tools, that could make research faster and easier”, but for which “[a]cademics can rarely muster the time, focus and workforce coordination needed to turn a proof-of-principle technology into a robust, scalable technique or to transform a research project into a platform.”  

It is thus a type of non-profit start-up which has full-time scientists, engineers, and executives, and funding for about 5 years (Marblestone et al., 2022).

To find out what works and what doesn’t, we’d simply need to start experimenting more.[4] I do not advocate prescribing a particular way of working in science in an absolute way. However, I do contend that in an increasingly complex world where technology progresses rapidly, fields of knowledge become more and more differentiated, the world more inter-connected, and the amount of knowledge ever more vast, the scientific community would do better, on average, if we had solid feedback loops and collaborative patterns in place.

Good feedback systems and working structures in science would not only make the scientific domain better, but the world at large: to not stagnate, but make further progress in the 21st century, we, in fact, need more people working in science. This, in turn, would call for and require true and functioning collaboration (something that Will MacAskill emphasizes, too, in his book What We Owe The Future).

2.4 Are collaborative structures stifling scientific creativity and solitary work?

Some folks may argue that working with people closely effectively slows down productivity, or that it stifles creativity or solitary work by imposing too many constraints, or giving too much unnecessary structure. I’ll first address the point on creativity. Personally, I believe this is a somewhat romanticised view of science that doesn’t fit the 21st century. Of course, for some people and projects, an individualistic, more isolated approach to work might be best, so it’s, in part, a matter of personal fit. And there is a general tension between collaboration and letting people do their own stuff – for some projects, there can be sharp diminishing returns in adding new people, and there is a coordination cost. As always, flexibility in accommodating people and project needs is hugely important.

But the point here is not to argue for or against any particular working structure in an absolute way. Rather, I want to point to large patterns in how people experience doing science, and to how ubiquitously incentives and working structures in academia work against people’s well-being and against doing good science. The way science is currently done is not the best way for a lot of people. What I’ve been hearing from many fellow academics is that people’s productivity is stifled exactly because they are left alone and feel that what they do does not matter to anyone or anything, reducing their work’s quality along the way.

To be clear, I think creativity, in its chaotically structureless, almost artistic sense, is part of science – there may often be creative elements or processes involved in generating ideas or finding solutions to problems at different stages in a research project. I do not think at all that such creativity stands in contradiction with collaborative working structures.

Regarding the second point, on solitary work, I likewise do not think it is inherently incompatible with having more collaboration in science. More collaboration in science means neither that solo work within collaborative projects is impossible, nor that solo projects are excluded. I can imagine that in most collaborative science projects, solo work, in one way or another, will often be necessary. E.g., while two people may share the responsibility for a project, they may do parts of it together or alone, or first alone and then together. Again, what is good at any given point in time will depend on the project and the people. The crucial bit is that investment and responsibility are meaningfully shared, and that people communicate with each other. Thus, while I do argue that in academia we lean, on average, too much towards individual work, and therefore should make a shift towards more collaboration, I don’t see any incompatibility with having solo work within collaborative projects, or with continuing to have solo projects, too.

2.5 Wrap-up  

For now, we’re in an unbalanced situation where we’re leaning far too strongly towards one extreme, i.e., a fragmented, individualistic research culture with slow or little to no corrective forces. For most people, it is not helpful to be left alone in a research project and be the sole person responsible for it. This is not because people lack the intelligence to do science, but because, most of the time, the nature of the problem or project they’re working on is not such that it can be solved most effectively by working on it entirely alone. Moreover, the usual way of working in academia is not conducive to the well-being of most scientists.

So, I believe that in the 21st century, working patterns in science should change. We have an outdated structure in place that views the individual as the decisive unit for scientific output, which is also reflected in battles over authorship roles in scientific papers (see a promising solution to circumvent those in Demaine & Demaine, 2023). Science would benefit from building stronger, genuinely collaborative structures at different levels – most importantly at the level where the actual research is done – to avoid and correct mistakes more quickly, build robust knowledge, and thus progress faster. This would also lead to less toxic work environments for researchers. In this context, let me quote from the biographical background of Michael Nielsen, a leader in quantum computing and the open science movement, describing the merits of open science:

"At its heart, open science is a combination of a pragmatic belief in the value of better tools and systems, and old-fashioned Baconian values: a belief that scientific knowledge is held collectively by humanity, and that science and humanity’s interests are best served by a combination of open sharing, collaboration, competition, and robust debate."

Rhinog Fawr, Wales. (CC-BY-SA 4.0)

3.0 An appeal for collaboration in the mind sciences 

I think many, if not all, scientific fields can benefit from more collaboration of the kind that I described above. But I think there is an argument to be made that to help improve robustness and correctness in the mind sciences, collaboration and greater feedback loops may be particularly beneficial due to the nature of the subject. I’ll explain what I mean.

(By “mind sciences”, I mean, without being exhaustive, fields like (cognitive and/or computational) neuroscience, (computational) cognitive science, psychology, social sciences, subfields of artificial intelligence, artificial life, and philosophy.)

In the mind sciences, there are fewer unequivocal ways of verifying/falsifying ideas compared to physics, so there is less of an “external force” pushing people to agree on (and subsequently refer to) common established knowledge. While there may be a clear method for, e.g., doing mathematical proofs, there are no such unambiguous ways of, e.g., translating a theory about some cognitive process into an experimental implementation. Close collaborations, in conjunction with inputs from a wider range of researchers and disciplines (including adversarial approaches), may help reduce errors in all stages or parts of a project.

Let me first explain what I mean by “fewer unequivocal ways of verifying/falsifying ideas“ in the mind sciences and get to Paul Meehl.

3.1 On the nature of the mind sciences 

Meehl (1990) identified, already more than 30 years ago, “obfuscating factors” that render the evidence landscape for what he called weak explanatory theories (i.e., theories that concern directional differences or associations between two variables/phenomena of interest) uninterpretable, consequently labelling this as

“[…] the well known deficiency of most branches of the social sciences to have the kind of cumulative growth and theoretical integration that characterizes the history of the more successful scientific disciplines [such as physics].”

One of those obfuscating factors is, for instance, a loose derivation chain between theory and observation: put in a simplified way, we have a theory of interest, some auxiliary statements, some experimental conditions that are to be tested, as well as statements about which observations are to be expected. Meehl’s concern is that

“[…] very few derivation chains running from the theoretical premises to the predicted observational relation are deductively tight”, thus, “[…] a falsified prediction cannot constitute a strict, strong, definitive falsifier of the [...] theory.”
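
To make the structure of this worry explicit (a standard Duhem-Quine-style rendering, not Meehl’s exact notation): a predicted observation $O$ follows only from the conjunction of the theory $T$ with auxiliary assumptions $A_1, \dots, A_n$ and experimental conditions $C_p$,

$$(T \land A_1 \land \dots \land A_n \land C_p) \Rightarrow O, \qquad \text{hence} \qquad \neg O \Rightarrow \neg (T \land A_1 \land \dots \land A_n \land C_p).$$

A failed prediction thus falsifies only the conjunction: the blame can always be placed on an auxiliary assumption or on the experimental conditions rather than on $T$ itself.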

Another obfuscating factor is, e.g., the “crud factor”: the fact that in the psychological and social sciences, everything correlates with everything to some extent through complex causal structures. It is thus hard to find real differences, or real correlations or patterns, for which there will be some true but complicated multivariate causal graph. While those are not fundamentally unexplainable, they, practically or epistemologically speaking, are.[5]
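
To make the crud factor concrete, here is a minimal simulation sketch in Python (my illustration, not from Meehl; the variables and effect sizes are made up): two nominally unrelated outcome variables that weakly share a handful of common causes will correlate “significantly” given a large enough sample.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 100_000  # a large sample, as is common in big-data psychology

    # A handful of weak common causes (think: SES, age, general ability, ...).
    common = rng.normal(size=(n, 5))

    # Two outcomes, each mostly noise, each weakly loaded on the common causes.
    x = common @ rng.uniform(0.05, 0.15, size=5) + rng.normal(size=n)
    y = common @ rng.uniform(0.05, 0.15, size=5) + rng.normal(size=n)

    r, p = stats.pearsonr(x, y)
    print(f"r = {r:.3f}, p = {p:.1e}")  # a tiny r, yet p is (near) zero

The correlation is “real” in the statistical sense – but reading it as support for any particular directional theory is exactly the mistake Meehl warns about.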

Meehl’s concerns tie in with Yarkoni’s sharp, more recent critique of research practices in the psychological sciences: the main concern expressed in his 2022 paper The Generalizability Crisis is that

“[v]erbally expressed psychological constructs – things like cognitive dissonance, language acquisition, and working memory capacity – cannot be directly measured with an acceptable level of objectivity and precision. What can be measured objectively and precisely are operationalizations of those constructs – for example, a performance score on a particular digit span task, or the number of English words an infant has learned by age 3.” (Emphasis added.)

As a consequence, the

“[…] validity of the original verbal assertion now depends not only on what happens to be true about the world itself, but also on the degree to which the chosen proxy measures successfully capture the constructs of interest – what psychometricians term construct validity. […] The key question is how closely the verbal and quantitative expressions of one’s hypothesis align with each other.”

Sadly, a large portion of statistical hypotheses in the psychological sciences “[…] cannot plausibly be considered reasonable operationalizations of the verbal hypotheses they are meant to inform.”[6]
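
A small numerical sketch (mine, not from Yarkoni’s paper; the reliability values are arbitrary) shows one quantitative face of this problem: by the classic psychometric attenuation formula, the correlation observed between two proxy measures equals the true correlation between the constructs, shrunk by the square root of the product of the measures’ reliabilities.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 50_000
    r_true = 0.5  # true correlation between the latent constructs

    # Latent constructs with correlation r_true.
    cov = np.array([[1.0, r_true], [r_true, 1.0]])
    constructs = rng.multivariate_normal([0.0, 0.0], cov, size=n)

    # Imperfect operationalizations: construct plus measurement noise,
    # scaled so that each measure has the stated reliability.
    rel_x, rel_y = 0.7, 0.5
    x = constructs[:, 0] + np.sqrt(1 / rel_x - 1) * rng.normal(size=n)
    y = constructs[:, 1] + np.sqrt(1 / rel_y - 1) * rng.normal(size=n)

    r_observed = np.corrcoef(x, y)[0, 1]
    print(r_observed, r_true * np.sqrt(rel_x * rel_y))  # both ≈ 0.30

Even before the deeper conceptual worries raised by Yarkoni (or by Lakens et al. below), a verbal hypothesis about constructs is thus tested against a systematically weakened statistical signal.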

For Yarkoni, there are three ways to respond to this:
1. opting out of psychological science entirely,
2. abandoning inferential statistics in favour of descriptive statistics and qualitative discussion, or
3. adopting better standards.

Regarding the third point, he suggests a number of concrete changes to standard research practice.

Ultimately, Yarkoni gloomily concedes that the third path is effortful, and it’s an open question how the psychological sciences can be helped by it.[7]

Yarkoni’s fundamental critique of psychological research elicited a large number of responses, one of which came from Lakens et al. (2022) and challenged Yarkoni’s assumption that every psychological or cognitive construct is identical to the space of its possible operationalizations. Whereas, e.g., the concept of “color” indeed is identical to the space of possible operationalizations – the colors in the visible spectrum – many concepts, such as, for instance, “anger”, aren’t, meaning that they are “[…] semantically richer than and cannot be reduced to their operationalizations (e.g., anger is not just what anger measures measure).” As a consequence, merely focusing on aligning verbal and quantitative expressions of one’s hypotheses, as demanded by Yarkoni, would fail to acknowledge the fundamental gap between verbal and statistical hypotheses in the first place, “[…] no matter how expansive the fitted model is.”

So… it can all get quite complicated. There is more to say about the struggles in the psychological – and, by extension, the mind – sciences, but I will restrict myself at this point to offering only a glimpse into some of the issues that arise. The point here is to showcase some of the difficulties and disputes that the mind sciences face. (Looking at the 38 responses elicited by Yarkoni’s paper might be an interesting starting point for learning more.)

3.2 Why collaboration and feedback help 

I claimed earlier that the mind sciences may particularly benefit from people being more in exchange and working collaboratively – in order to develop a shared understanding of the subject as well as a common ground of knowledge, thereby increasing responsibility towards a common, shared goal or agenda.

As I already said above, people forming a coalition may also reinforce each other in believing something that is false or problematic, or in engaging in bad practices, which is why a group will rely no less on external corrective input than a single person. While collaborations and embedment into greater feedback loops do not guarantee that truth will be found, they come with some useful error-correction mechanisms (such as code review, the questioning of assumptions and experimental designs, and the reviewing of results, to name a few) built in. To restate my earlier claim, I think that collaboration and feedback at various levels do raise the probability that wrong ideas are incrementally replaced by (slightly) less wrong ideas, or that there is at least a better idea of where the barriers are. This of course applies to science in general, but may be particularly valid for the mind sciences, as it can be so easy to get things wrong in those fields – the research methodology is, in a way, much more complicated than in, e.g., fields like mathematics or particle physics, as it is less unequivocally right or wrong. Hence, the need to debate what counts as good research methodology is much higher.

Error correction in the mind sciences may be particularly successful, if collaboration happens between, and/or feedback comes from, people with very different disciplinary lenses on a given topic – something that is of utmost importance in the interdisciplinary nature of the mind sciences (see more on that in the following section).

Ashdown Forest, East Sussex. (CC-BY-SA 4.0)

As things stand, the nature of the knowledge produced in the mind sciences makes it rather easy to pursue wrong or hardly falsifiable ideas (supported by poor analyses), making it hard for false, ill-defined, or useless theories/ideas/narratives to die out (I wrote something related to this in this Twitter thread preceding the symposium mentioned at the beginning).[8] In a way, one can argue that, exactly because the mind sciences have faced from the very beginning these kinds of problems that the more “exact” sciences haven’t, they may already have a better understanding of how to handle them. For instance, recent, more empirical branches of physics – where statistical inferences are equally made, hypotheses tested, and where the topic of investigation is not a general theory with very high predictive power – may face similar problems, but are less mature in dealing with them. Notwithstanding, the mind sciences have not (yet) matured enough, and a lot of the knowledge currently produced will turn out to be noise and eventually die out. They are still in a very messy, or, as some may state, pre-paradigmatic stage (Kuhn, 2012).

3.3 Towards a shared, multi-disciplinary understanding of the subject of study

It is currently hard to come up with a good research objective and a good corresponding method in the mind sciences, exactly because they are in this messy state. One must consult a lot of philosophy-of-science and meta-science questions, but also practical as well as value-based ones – what works practically, and what is valuable to pursue?

Being embedded in close collaborations, as well as in a greater social/interdisciplinary context which encourages exchange on those fronts, may help to overcome a “crisis of purpose”, as Jessica Thompson (2021) put it in her paper Forms of explanation and understanding for neuroscience and artificial intelligence. Many folks – particularly early career researchers – will resonate with the following excerpt:

“Many junior scientists go through a period of questioning or a crisis of purpose at some point during their training. Perhaps especially because of the diverse background assumptions of my collaborators, I was encouraged to question the most basic aspects of my research. Why was I doing this research? To what scientific goal would my research contribute? How were the proposed methods suited to that goal? I found it very challenging to situate my research into a broader scientific enterprise. I could not easily describe how what I was working on would somehow bring my community closer to some ultimate goal. I wanted to “discover the computational mechanisms underlying auditory perception” but I couldn’t precisely tell you what that meant or how I would know whether I had succeeded at taking a useful step towards that goal.”

She, too, emphasizes that people need to cooperate with people from different backgrounds to do science well, “particularly in the interdisciplinary field of neuroscience which most likely lacks a single theory of explanation to account for all explanation”:

“[This lack] suggests that we access the truth in cooperation and in concert, not in isolation, and that who we cooperate with matters. [...]
Explanations are not typically the product of individual experiments or even individual research groups. Achieving explanatory understanding of complex cognitive and neural phenomena will require the sustained coordination of a diverse scientific community.”

Thus, neuroscience, and I’d say the mind sciences more broadly, face particular challenges that can be better overcome if folks work together.

4.0 Towards moral values in science 

The benefit of joining forces extends to science at large – including all disciplines. As I’ve been arguing throughout this post, this likely produces both better science and happier scientists. This has a strong moral side to it that I would like to make very explicit in this section.

In a paper called Beyond kindness: a proposal for the flourishing of science and scientists (2022), a group of people who formed the Flourishing Science Think Tank suggest that, in order to do science well, we need to adhere to the values of science itself and “[…] recast [it] from a competitively managed activity of knowledge production to a collaboratively organized moral practice that puts kindness and sharing at its core.” They build correspondences between the values of science and an ethical framework of flourishing derived from the Buddhist tradition, comprising the four “immeasurable values” of that tradition: loving kindness, compassion, empathetic joy, and equanimity.

Practically, implementing those values at both the individual/small-group level and the level of science governance and institutional management could take many concrete forms.

The assessments flowing from the Flourishing Science Think Tank resonate well with me. A world in which people collaborate will require balancing a lot of things in order to join forces and make scientific progress more efficiently and sustainably.

At the end of the day, we need to ask ourselves: What do we care about most? Some people entertain the view that what science pursues is “simply” truth. Yet seeking the truth is something no one can do on their own – it is an entirely collaborative effort, and thus closely related to moral questions regarding how we seek the truth in conjunction. Being moral with respect to one another is instrumental to getting closer to the truth. It is also instrumental to increasing people’s well-being.[10] (In some strands of moral philosophy, such as utilitarian hedonism, morality may be defined as explicitly optimizing for people’s subjective sense of well-being.)

We need value-based science. Those values that allow science to be done well are intricately related to those values that we pursue in our relationships with other human beings (including ourselves).

Importantly, the values pursued must be appropriate – e.g., it is possible for collaborative group structures to be strongly fostered within authoritarian norms that stifle critique and pluralism in thinking. Thus, collaborative structures are helpful only as long as the moral values of a given community are appropriate.

The Saddle, next to Forcan Ridge, Scotland. (CC-BY-SA 4.0)

5.0 How incentives prevent us from rigorous and value-based science & what change is needed 

There are already great collaborative efforts taking place in the scientific domain, particularly in the open science community, where it’s all about advancing knowledge in a collaborative, ethical way. Big communities include The Turing Way, Open Life Science, the Psychological Science Accelerator, the Brainhack community (see also Gau et al., 2021), as well as the communities around the Organization for Human Brain Mapping, the Society for Research Software Engineering, and the Software Sustainability Institute, to name a few examples.

So people are collaborating. But not at the necessary scale. From a global perspective, academia operates within outdated, unethical policies and working structures where credit is given to the very few people who have succeeded within the problematic standard incentive system currently in place.

5.1 Current incentives in academia 

There are two facts about the incentive system to consider and which hold, to some extent, for any scientific discipline: 

  1. academia’s #1 currency is writing a certain number of papers in certain well-known journals – this is all that eventually counts in the pursuit of academic success – and
  2. scientific journal publishers have discovered that they can make a lot of money by charging huge publication fees – all that eventually counts for them is making a profit.

Take both points together, and you’ll get a lot of research waste. If all that people eventually care about is the quantity of papers and journal prestige, and if all that journal publishers eventually care about is making profit, then no wonder the bulk of research that is being produced is of low quality.[11]

Due to points 1) and 2), every (big or small) group can pursue its work without paying much attention to questions about its overall rigour, value, and/or practical feasibility, as long as academic survival is guaranteed by complying with those incentives. That is, no matter how much feedback one’s research may spark, and whatever colleagues might say about it more precisely, such feedback may well remain inconsequential as long as one still manages to publish the study. We all know examples of how easy it can be for bad research to get published no matter what. Thus, complying with academia’s incentives can be done pretty well in silos, which, in effect, decouples one’s research activity from any truly consequential feedback system (i.e., feedback that would force one to change something about one’s research). To be clear, I am not saying that one should, at any cost, take into account and be influenced by other people’s opinions about one’s research. If one honestly and dispassionately disagrees with someone, then that’s fine. The crucial bit is to do so honestly and dispassionately – two attributes which are not incentivized in current academia, often leading to motivated reasoning.

As stated above, points 1) and 2) hold, to some extent, for any scientific discipline. However, I can imagine this problem is more severe for the mind sciences than for, e.g., maths or physics, leading to relatively more low-quality work in those fields. Thus, in the mind sciences, we have two mutually reinforcing forces for doing bad research: academic incentives, and the problems inherent in the subject of study as outlined in section 3.1. As things stand, both the current incentives in academia and the nature of the knowledge produced in the mind sciences give scientists carte blanche to pursue wrong or hardly falsifiable ideas.

Value-based science is not easy given those status-quo incentives. They may lead to, or exacerbate, the lack of good, ethical, collaborative science (I am not sure about the true underlying causal graph here – it’s complicated). If we didn’t evaluate the impact of research, as well as a researcher’s skills, using the number of papers and journal impact, things would certainly look different. Right now, even if the scientific community as a whole wanted to be more collaborative and implement better, robust feedback systems, with the incentives we have there is just little to no capacity to get this going: everyone is too busy getting their number of papers up. I’ve heard many researchers explicitly say that this is what they are optimizing for, or at the very least worrying about. Building good and ethical research processes which foster a team spirit rather than a one-against-all spirit in projects, with feedback from both inside and outside the lab, takes time and would decrease the amount of scientific output in the form of papers tremendously.

As things stand, I do believe current incentives favour particularly narcissistic and/or sociopathic personalities who are the least likely to change the situation. According to the Flourishing Science Think Tank,

“large-scale managerial transformation of the academic enterprise [...], [has led] towards increasing hyper-competition that has promoted (too) much ego-centric behavior and (too) little of the moral and collaborative behavior required for ‘doing good science’.”

This is accompanied by the fact that talent is leaving science (Gewin, 2022). Many people argue that changing the status quo may require excluding neoliberalism from the university sphere. Neoliberalism in the context of academia denotes the framework within which universities have increasingly been governed by the profit motive, as state funding has been replaced by privately paid fees. This, in turn, gave rise to homogenised performance metrics to assess research and teaching. Others don’t see much of a problem in the profit motive of neoliberalism, but suggest eliminating the misalignment between academia’s business model and doing good research via a greater diversity of metrics and market mechanisms. One must generally be careful with using metrics though: taking a look at European universities, it becomes clear that metrics are going to be gamed – profit or not. In Germany and France, for example, universities are almost exclusively financed by the state, but the scientific enterprise suffers from the same problems. ("When a measure becomes a target, it ceases to be a good measure" – see Goodhart's law.)
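
Goodhart-style dynamics arise even without strategic gaming. Here is a toy sketch of my own (under the simplifying assumption that the metric is just research quality plus independent noise): selecting hard on the metric yields far less of the real thing than the metric scores of the selected suggest.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 100_000
    quality = rng.normal(size=n)           # what we actually care about
    metric = quality + rng.normal(size=n)  # what we can measure and reward

    top = np.argsort(metric)[-n // 100:]   # select the top 1% by the metric
    print(f"mean metric of selected:  {metric[top].mean():.2f}")   # ≈ 3.8
    print(f"mean quality of selected: {quality[top].mean():.2f}")  # ≈ 1.9, half as much

Once people can additionally shift effort away from quality towards the metric itself, the gap only widens – which is the gaming described above.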

Currently, the two most ubiquitously used metrics are the ones already mentioned a few times in this post: number of papers and prestige of journals. On the one hand, reducing an individual’s output to simple, objective metrics, such as number of publications or journal impacts, saves time and is unambiguous. On the other hand, the long-term costs – for both scientific progress and the well-being of scientists – of using simple quantitative metrics to assess a researcher are large. People’s well-being as well as the quality and robustness of science are at risk.

5.2 Rebuilding the academic space - a quest for experiments and prioritizing long-term benefits 

Tal Yarkoni would say, as he does in this post, that people could well choose not to optimize for academic incentives. I agree. They could. Needless to say, everyone has a moral responsibility to do the right thing. But a bottom-up movement will not be enough on its own – incentives, and therefore top-down policies, will need to change, too.

We may need to fundamentally re-build the academic space under different assumptions and principles. This will also entail a re-assessment of how we evaluate research and, consequently, a scientist – what do we think should distinguish a “successful” from an “unsuccessful” research project, and what is a good researcher? Regarding the latter: is the criterion productivity? People will produce a lot of low-quality research. Is the criterion publishing in high-impact journals? People will do fraudulent science to do so. Is the criterion the future career of students? People will select students with promising profiles rather than educating them. So, how should we instead evaluate someone’s ability to do good science? This seems quite intangible, and directly relates to the difficulty of having the right criteria when hiring a scientist. What are good criteria for estimating someone’s ability to do research (rather than, more specifically, their ability to supervise students, or write good code, etc.)? In finding answers to those questions, we may want to shift the balance towards a process-oriented rather than an outcome-oriented approach. Whatever ideas we come up with, we must be clear about the fact that whenever quantitative metrics are used as proxies to evaluate and reward scientists, those metrics become subject to exploitation, provided it is easier to exploit them than to do something else (i.e., better research).

Overall, we will need to dare to experiment heavily and see what works. As Stuart Ritchie put it in this article, "[w]e shouldn’t be afraid to trial and test new and creative ideas, even if they might make science look very different from the status quo a decade ago, or even today.” As part of a difficult-to-implement yet sustainable long-term systemic solution, Dienes (2023) suggests to “[…] radically reform the operation of universities, […] bas[ing them] on existing established open democratic practices.” Whatever solutions we come up with, they will come with costs, likely take time, and therefore slow things down, at least at the beginning.[12] But just as with the increasing adoption of open science practices, building a solid base and making things right from the ground up first can be more effective and beneficial in the long run. This means taking one step back in order to go three steps forward. Slowing down the production of scientific outcomes to build processes conducive to long-term scientific progress will eventually pay off. I think this is the case even if the final processes built make outcome production overall slower, e.g., to ensure scientific rigour. In that case, prioritizing what to do research on will be key – which, in times of too many studies being conducted, would be a good thing. In this context, the Importance-Tractability-Neglectedness framework [? · GW] – a core method in the Effective Altruism movement (coming from science) for prioritizing which problems are most impactful to work on – might be an option[13], as might using impact certificates to accelerate academic research. This post [EA · GW] identifies reasons why the choice of research questions is often poor: difficulties in getting an overview of the field, bad publishing priorities of journals, bad funding priorities of grantmakers, too short-term projects, lack of creativity and boldness, as well as lack of connection with the end-user in the design of research questions. The Importance-Tractability-Neglectedness framework could also be applied to the topic of improving academia as an institution itself.
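
As a purely illustrative sketch of how the ITN framework is often operationalized (the three factors are commonly combined multiplicatively; all problem names and scores below are hypothetical, made up for demonstration):

    # Toy illustration of Importance-Tractability-Neglectedness (ITN) scoring.
    # All problems and scores are hypothetical, purely for demonstration.
    problems = {
        "reform research-evaluation metrics": (8, 4, 6),  # (importance, tractability, neglectedness)
        "improve peer-review capacity":       (6, 5, 4),
        "fund replication studies":           (7, 6, 5),
    }

    # Combine the factors multiplicatively, so that a problem scoring zero on
    # any one factor is never prioritized.
    ranked = sorted(problems.items(), key=lambda kv: kv[1][0] * kv[1][1] * kv[1][2], reverse=True)
    for name, (i, t, n) in ranked:
        print(f"{name}: {i * t * n}")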

It is hard for humans to prioritize solutions that are optimal in the long term but come with short- or medium-term costs. Yet at no other point in time have we been so much in a position – and so able to afford – to think about the long-term future and be guided by it in the actions we take today.[14] Long-term solutions particularly rely on people acting in concert. While it is challenging to distinguish problems that greatly require joint forces from those which don’t, getting the incentives in academia right certainly is a collective, global endeavour and therefore belongs to the former category.

A lot of people, groups, and organizations have joined forces already. E.g., Changing Expectations is a Royal Society programme that “[…] aims to understand how best to steward research culture through a shifting research landscape” and involves many topics I have talked about in this post. As another example, one of the questions the Institute for Progress focuses on is “[…] how [we can] change the incentives and funding structures within science to produce more breakthrough research.” Another good resource is the Metascience Conference.

If we can get more people – from academia and any other area, as it’s a societal problem – to stare into the abyss, we may take a discrete step towards making science more rigorous, impactful, and inclusive, and more in accordance with people’s values and well-being. Achieving long-term systemic change will all the more involve the need to act in unison, and will require us to be okay with the temporary loss of stability that inevitably comes with big change.

Sandwood Bay, Scotland. (CC-BY-SA 4.0)

Acknowledgements 

I am thankful to Guillaume Corlouer and Reny Baykova for critically reviewing this post and suggesting text edits to improve its flow and understandability. I am especially thankful to Moritz Boos for critically reviewing this post and thoroughly inspecting the arguments made – for multiple in-depth discussions on philosophy of science and the inherent difficulties in the mind sciences, as well as for making substantial efforts in text editing to improve flow and understandability.

I am very grateful for the feedback I have received from all three of them.

I am also grateful for the feedback I’ve received from Zoltan Dienes, Nicolas Macé, and Dhruva Raman, which has led to modifications of the post after publishing it.

References

Scientific papers 

Demaine, E. D., & Demaine, M. L. (2023). Every Author as First Author. arXiv preprint arXiv:2304.01393.

Dienes, Z. (2023). The credibility crisis and democratic governance: How to reform university governance to be compatible with the nature of science. Royal Society Open Science, 10(1), 220808.

Dijstelbloem, H., Huisman, F., Miedema, F., & Mijnhardt, W. (2014). Science in Transition Status Report: Debate, Progress and Recommendations. Available online at http://www.scienceintransition.nl/wp-content/uploads/2014/07/Science-in-Transition-Status-Report-June-2014.pdf.

Flourishing Science Think Tank (2022): Dienes, Z., Fucci, E., Rees, M. G., Lübbert, A., Schumann, F., Van Vugt, M. Beyond kindness: a proposal for the flourishing of science and scientists.

Gau, R., Noble, S., Heuer, K., Bottenhorn, K. L., Bilgin, I. P., Yang, Y. F., ... & Marinazzo, D. (2021). Brainhack: Developing a culture of open, inclusive, community-driven neuroscience. Neuron, 109(11), 1769-1775.

Gewin, V. (2022). Has the 'great resignation' hit academia? Nature, 211-213.

Ioannidis, J. P., Klavans, R., & Boyack, K. W. (2018). Thousands of scientists publish a paper every five days. Nature, 561, 167-169.

Kuhn, T. S. (2012). The structure of scientific revolutions. University of Chicago Press.

Lakens, D., Uygun Tunç, D., & Tunç, M. (2022). There is no generalizability crisis. Behavioral and Brain Sciences, 45.

Marblestone, A., Gamick, A., Kalil, T., Martin, C., Cvitkovic, M., & Rodriques, S. G. (2022). Unblock research bottlenecks with non-profit start-ups. Nature, 601(7892), 188-190.

Meehl, P. E. (1990). Why summaries of research on psychological theories are often uninterpretable. Psychological Reports, 66(1), 195-244.

Orben, A., & Lakens, D. (2020). Crud (re) defined. Advances in Methods and Practices in Psychological Science, 3(2), 238-247.

Smaldino, P. E., & McElreath, R. (2016). The natural selection of bad science. Royal Society Open Science, 3(9), 160384.

Thompson, J. A. (2021). Forms of explanation and understanding for neuroscience and artificial intelligence. Journal of Neurophysiology.

Yarkoni, T. (2022). The generalizability crisis. Behavioral and Brain Sciences, 45, e1.

Other resources/links referred to in this post (in chronological order)

Symposium on Rethinking Computational Approaches to the Mind – Fundamental Challenges and Future Perspectives 

Effective Altruism Forum 

Academic precarity and the single PI lab model 

Anthropic 

Chris Olah on what the hell is going on inside neural networks 

What We Owe The Future 

Biographical Background: Michael Nielsen 

Twitter thread about Symposium on Computational Approaches to the Mind 

Goodhart’s Law 

The Turing Way 

Open Life Science 

The Psychological Science Accelerator 

Brainhacks 

Organization for Human Brain Mapping 

Society for Research Software Engineering 

Software Sustainability Institute 

No, it’s not The Incentives—it’s you 

Rebuilding After the Replication Crisis 

Importance-Tractability-Neglectedness framework [? · GW]

Effective Altruism 

Why scientific research is less effective in producing value than it could be: a mapping [EA · GW]

Changing expectations

Institute for Progress – Metascience 

Metascience Conference 

Staring into the abyss as a core life skill

Clearer Thinking Podcast Episode 157: Science is learning from start-ups (with Adam Marblestone) 

The Mythical Man-Month 

Why Anima International suspended the campaign to end live fish sales in Poland  [EA · GW]

Quote from Richard Hamming 

Identifying Impactful Research Topics 

  1. ^

     “Computational approaches” denote, in the most basic sense, the usage of somewhat sophisticated maths in one’s research (going beyond things like statistical significance testing, which also uses maths – as does anything that is not purely qualitative). In the context of neuroscience, the Organization for Computational Neurosciences (CNS) describes computational neuroscience as an 

    “[…] interdisciplinary field for development, simulation, and analysis of multi-scale models and theories of neural function from the level of molecules, through cells and networks, up to cognition and behavior.” 

    I had ventured out to, broadly speaking, study information-theoretic quantities reflecting concepts like emergence and complexity (both are strongly related, but not the same) in complex dynamical systems. 

  2. ^

    Some will say that this will cause problems in assessing their eligibility to get a doctoral degree, as the assessment is particularly about someone’s individual skills. I believe this is a minor problem, and I’d be confident that there would be ways to assess someone’s abilities in that scenario, too. What we would gain would outweigh the difficulties in assessing someone’s performance by large margins: we’d stop wasting a huge amount of human resources that could have been spent more usefully in the first place. How many academic works end up in a drawer? How many mistakes could have been avoided, and how much faster could the learning process have been, if there was someone to whom a given project would be equally (or at least to some significant extent) important? And how much more could resources have been spent on something more impactful?

  3. ^

    This gets us to the question of what contributions to acknowledge in a scientific publication, and, in relation to this, whom to grant co-authorship. Scientific papers are still academia’s #1 currency. Although I do hope that this changes, as long as it remains the case, I think it would be good to have a very encompassing approach to granting co-authorship: as a heuristic, I’d grant an authorship role to anyone who had, in small or big ways, contributed to a given research project, and specify clearly in a dedicated contributions section in what way people had done so (e.g., via paper writing, providing input via discussion, developing code, collecting data, providing further engineering infrastructure, and the like). Importantly, this would not mean that everyone should be involved in writing the paper. Most people probably wouldn’t be. Thus, while everyone involved in the project would be named as an "author", not everyone would have worked on the actual text. This does, and in a way does not, change the concept of "authorship" – normally, when reading papers, people assume that the authors named on it have worked on the text in some way. Yet, in reality, this is often not true; e.g., it may happen that someone is granted co-authorship – even without having read the paper – because that person provided the funding. (I have experienced such a case myself.)

  4. ^

    To choose another somewhat random example, in The Mythical Man-Month – Essays on Software Engineering, Frederick Brooks shares his wisdom on how to manage complex large-scale software engineering projects (which he did as a project manager for the IBM System/360 computer family and then for OS/360). This is not quite the research context, but it is still generally highly insightful w.r.t. how teams can be made to work well, no matter their size. As science projects become more large-scale, too, and research software engineering teams grow bigger and bigger, managing large software efforts will increasingly become necessary and thus key to success. The essays (including their photographic underpinning) are delightful to read.

  5. ^

    Even 50 years after the crud factor was first introduced, Orben & Lakens (2020) highlight “[…] a[n ever] common and deep-seated lack of understanding about what the crud factor is”. With the increased usage of large-scale data in the psychological sciences, its importance is “[…] currently experiencing a renaissance.”

  6. ^

    This also has serious implications for replication efforts, which Yarkoni deems to be in vain in many cases, namely, when the experimental design of the study to be replicated is fundamentally uninformative.

  7. ^

    For that reason, Yarkoni urges everyone to honestly evaluate their scientific contributions (long quote ahead): 

    “[…] all psychologists, and early-career researchers in particular, owe it to themselves to spend some time carefully and dispassionately assessing the probability that the work they do is going to contribute meaningfully – even if only incrementally – to our collective ability either to understand the mind or to practically improve the human condition. [...] I am also sympathetic to objections that it’s not fair to expect individual researchers to pro-actively hold themselves to a higher standard than the surrounding community, knowing full well that a likely cost of doing the right thing is that one’s research may become more difficult to pursue, less exciting, and less well received by others. Unfortunately, the world we live in isn’t always fair. I don’t think anyone should be judged very harshly for finding major course correction too difficult an undertaking after spending years immersed in an intellectual tradition that encourages rampant overgeneralization. One is always free to pretend that small p-values obtained from extremely narrow statistical operationalizations can provide an adequate basis for sweeping verbal inferences about complex psychological constructs. But no one else – not one’s peers, not one’s funders, not the public, and certainly not the long-term scientific record – is obligated to honor the charade.”

  8. ^

    This applies not only to theories, ideas, or narratives, but to whole research agendas: consequential feedback may also entail that years- or decades-old research agendas need to be abandoned, if they turn out to be false, headed in the wrong direction, or simply low priority. Right now, it can be quite easy to pursue an agenda and just stick to it no matter what; doing so is even somewhat required for a good scientific reputation. Acknowledging that a broader agenda has turned out wrong should instead be looked at with respect, if not rewarded. (See an, admittedly arbitrary, example of how it could be done from Anima International, which had suspended a campaign to end live fish sales [EA · GW] in Poland – I tried, but couldn’t find anything suitable from the science context!)

  9. ^

    Tools are key to advancing science, as they not only empower more people to engage in research, but may expand the boundaries of what we are able to know – consider this illustrative quote from Richard Hamming (it may appear quite out of context with respect to the main topic of this post, but it’s great, so I include it nonetheless): 

    “Just as there are odors that dogs can smell and we cannot, as well as sounds that dogs can hear and we cannot, so too there are wavelengths of light that we cannot see, and flavors we cannot taste. Why then, given that our brains are wired the way they are, does the remark ‘Perhaps there are thoughts we cannot think’ surprise you? Evolution so far may possibly have blocked us from being able to think in some directions. There could be unthinkable thoughts.”

  10. ^

    There is no such thing as “value-free” research in the first place. Every research objective a scientist pursues is a statement about its value – about how it is being prioritized while other possible endeavours in the world are left aside. In a world of limited lives and resources, this always has moral implications. We inevitably make moral choices in anything we do (and don’t do): we can’t help but be embedded in the world, profiting from the various resources it provides to us, and are therefore obliged to be considerate about our impact on it and how we make use of those resources.

  11. ^

    This is also reflected in a mass production of papers no one can handle – there are way too many of them (Ioannidis, Klavans, & Boyack, 2018). It’s impossible to look into all of those that seem relevant (unless one works in an absolute niche), so people make, to some significant extent, arbitrary choices (probably reflecting systemic biases). People are also, of course, not oblivious to the fact that a lot of bad and fraudulent research is published, so they often do not believe the literature they come across, leading to a “contemporary crisis of faith in research” where “[…] many prominent researchers believe that as much as half of the scientific literature – not only in medicine, but also in psychology and other fields – may be wrong” (Smaldino, 2016).

  12. ^

    From Dijstelbloem et al. (2014) (long quote ahead): 

    "We need gradual change by means of debate and experimentation. We need to endure that all participants in the system act in unison on the basis of discussions about problems and then proceed to experiment with innovations. Acting in unison is vital when analysing the quality of young researchers and research teams who are dependent on a reliable and predictable system. There has to be agreement about this with research institutions and those parties that finance research. A great many researchers, at all hierarchical levels, can identify with the Science in Transition problem analysis. However, a great many researchers also say that they are incapable of change, even though they would very much like to. For an individual researcher or even an individual university it is, essentially, impossible to withdraw from the system. This means that the system as a whole has to change and that requires a political stimulus. Various participants are pointing the finger at each other. Universities do want to evaluate research and researchers on the basis of quality and societal impact but only want to do so if all of universities and NOW participate and proceed in the same manner. Changes to this type of system are linked to temporary loss of stability. There is no avoidance of cost of some kind. Universities fear loss of income and will not therefore readily have a tendency to enter into such a transition."

  13. ^

    There was an event called Identifying Impactful Research Topics on exactly that framework at the Metascience Conference on May 3rd.

  14. ^

    If you’re interested in what the full consequences of longtermist thinking look like, both philosophically and practically speaking, have a look at Will MacAskill’s book What We Owe The Future – A Million-Year View.
