post by [deleted] · · ? · GW · 0 comments

comment by Milan W (weibac) · 2024-11-10T20:47:26.595Z

I'd rather say that RLHF+'ed chatbots are upon-reflection-not-so-shockingly sycophantic, since they have been trained to satisfy their conversational partner.