Is it ethical to work in AI "content evaluation"?

post by anon_databoy123 (noob1234) · 2025-01-27T19:58:26.176Z · LW · GW · 2 comments

I recently graduated with a CS degree and have been freelancing in "content evaluation" while figuring out what's next. The work involves paid tasks aimed at improving LLMs, which fall into a few broad categories.

Recently, I have been questioning the ethics of this work. The models I work with are not cutting-edge, but improving them could still contribute to AI arms race dynamics. The platform is operated by Google, which might place more emphasis on safety compared to OpenAI, though I do not have enough information to be sure. Certain tasks, such as those aimed at helping models distinguish between harmful and benign responses, seem like they could be geared towards applying RLHF and are conceivably net-positive. Others, such as comparing model performance across a range of tasks, might be relevant to interpretability, but I am less certain about this.

Since I lean utilitarian, I have considered offsetting potential harm by donating part of my earnings to AI safety organizations. At the same time, if the work is harmful enough on balance, I would rather stop altogether. Another option would be to focus only on tasks that seem clearly safety-related or low-risk, though this would likely mean earning less, which could reduce prospective donations.

2 comments

comment by Dave Orr (dave-orr) · 2025-01-28T01:28:44.766Z · LW(p) · GW(p)

I'm probably too conflicted to give you advice here (I work on safety at Google DeepMind), but you might want to think through, at a gears level, what could concretely happen with your work that would lead to bad outcomes. Then you can balance that against positives (getting paid, becoming more familiar with model outputs, whatever).

You might also think about how your work compares to whoever would replace you on average, and what implications that might have as well.

Replies from: noob1234
comment by anon_databoy123 (noob1234) · 2025-01-28T22:36:49.539Z · LW(p) · GW(p)

Part of why I ask is that it's difficult for me to construct a concrete gears-level picture of how (if at all) my work influences eventual transformative AI. I'm unsure about the extent to which refining current models' coding capabilities accelerates timelines, whether some tasks might be net-positive, whether these impacts are easily offset, etc.