Is it ethical to work in AI "content evaluation"?

post by anon_databoy123 (noob1234) · 2025-01-27T19:58:26.176Z · LW · GW · 2 comments

I recently graduated with a CS degree and have been freelancing in "content evaluation" while figuring out what's next. The work involves paid tasks aimed at improving LLMs, which fall into a few broad categories.

Recently, I have been questioning the ethics of this work. The models I work with are not cutting-edge, but improving them could still contribute to AI arms race dynamics. The platform is operated by Google, which might place more emphasis on safety compared to OpenAI, though I do not have enough information to be sure. Certain tasks, such as those aimed at helping models distinguish between harmful and benign responses, seem like they could be geared towards applying RLHF and are conceivably net-positive. Others, such as comparing model performance across a range of tasks, might be relevant to interpretability, but I am less certain about this.

Since I lean utilitarian, I have considered offsetting potential harm by donating part of my earnings to AI safety organizations. At the same time, if the work is harmful enough on balance, I would rather stop altogether. Another option would be to focus only on tasks that seem clearly safety-related or low-risk, though this would likely mean earning less, which could reduce prospective donations.

2 comments

comment by Dave Orr (dave-orr) · 2025-01-28T01:28:44.766Z · LW(p) · GW(p)

I'm probably too conflicted to give you advice here (I work on safety at Google DeepMind), but you might want to think through, at a gears level, what could concretely happen with your work that would lead to bad outcomes. Then you can balance that against positives (getting paid, becoming more familiar with model outputs, whatever).

You might also think about how your work compares to whoever would replace you on average, and what implications that might have as well.

Replies from: noob1234
comment by anon_databoy123 (noob1234) · 2025-01-28T22:36:49.539Z · LW(p) · GW(p)

Part of why I ask is that it's difficult for me to construct a concrete gears-level picture of how (if at all) my work influences eventual transformative AI. I'm unsure about the extent to which refining current models' coding capabilities accelerates timelines, whether some tasks might be net-positive, whether these impacts are easily offset, etc.