LessWrong 2.0 Reader
This seems a bit odd given the past literature on LLMs. As I've noted before, you can train away inner-monologue problems specifically via knowledge-distillation, somewhat analogous to your finetuning, and it's also possible to ask models to solve multiple problems simultaneously, analogous to your base task (or to do various kinds of speculative or parallelized decoding at a lower level). There is enormous computational waste and slack, and capacity to spare for multiple problems. So the failure of the OA "finetuning" of GPT-3.5 here is unexpected: I can't think of any previous result aimed at making forward passes do more that failed completely (although of course such results may just not get reported, or I didn't happen to read them, etc.).
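Concretely, the distillation setup I have in mind looks something like this sketch (Python; the teacher trace is a made-up placeholder, not data from the post): a teacher model is prompted to think step by step, and the student is then finetuned to map question directly to final answer, so the reasoning has to happen inside its forward pass.

```python
# Hypothetical sketch of inner-monologue distillation: take (question,
# chain-of-thought, answer) triples produced by a teacher model and emit
# a finetuning dataset mapping question -> answer directly, so the
# student must internalize the reasoning. The triple below is an
# illustrative placeholder.
import json

teacher_traces = [
    {
        "question": "If a train travels 60 miles in 1.5 hours, what is its speed?",
        "chain_of_thought": "Speed = distance / time = 60 / 1.5 = 40.",
        "answer": "40 mph",
    },
    # ... many more traces sampled from a teacher prompted to think step by step
]

with open("distilled.jsonl", "w") as f:
    for t in teacher_traces:
        # Drop the monologue: the training target is the final answer alone.
        record = {
            "messages": [
                {"role": "user", "content": t["question"]},
                {"role": "assistant", "content": t["answer"]},
            ]
        }
        f.write(json.dumps(record) + "\n")
```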
I notice this is not the first time I've left a puzzled comment on a post where the authors failed to make GPT-3.5 do something via OA "finetuning" that it seemed like it should definitely have been capable of after finetuning, or that non-OA models did successfully do... And the common ingredient seems to be the OA "finetuning".
I'm not aware of any experiments by third parties demonstrating that OA "finetuning" works like it's supposed to. Maybe someone should do that before more people try to do AI safety research predicated on the assumption that using OA's "finetuning" is telling you anything meaningful about LLMs in general, rather than being like, say, trying to understand LLM poetry by looking at ChatGPT's rhymes or LLM linguistic knowledge by asking one to spell words.
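A first-pass version of such a check could look like the sketch below, using the OpenAI Python SDK (openai>=1.0): finetune gpt-3.5-turbo on a task we know is trainable and verify the returned model actually learned it. The filename and held-out question are placeholders.

```python
# Minimal third-party sanity check of OA finetuning, assuming a JSONL
# file of {"messages": [...]} examples (e.g. the distilled.jsonl above).
import time
from openai import OpenAI

client = OpenAI()

# Upload the training data and start a finetuning job.
train_file = client.files.create(
    file=open("distilled.jsonl", "rb"), purpose="fine-tune"
)
job = client.fine_tuning.jobs.create(
    training_file=train_file.id, model="gpt-3.5-turbo"
)

# Poll until the job finishes.
while True:
    job = client.fine_tuning.jobs.retrieve(job.id)
    if job.status in ("succeeded", "failed", "cancelled"):
        break
    time.sleep(60)

# Evaluate the finetuned model on held-out items.
if job.status == "succeeded":
    resp = client.chat.completions.create(
        model=job.fine_tuned_model,
        messages=[{"role": "user", "content": "held-out test question"}],
    )
    print(resp.choices[0].message.content)
```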
dagon on Are There Other Ideas as Generally Applicable as Natural Selection
The other common optimization process people generally refer to is "markets". In the same sense that "evolution" is what happens when variation and selection combine, "market" is what happens when multiple traders repeatedly choose how to exchange things.
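As a toy illustration of that framing, here is a sketch (all numbers made up) where repeated trading choices drive a posted price toward the market-clearing point, much as selection drives a population toward fitness:

```python
# Toy price discovery as an optimization loop: traders with private
# reservation prices repeatedly decide whether to trade at a posted
# price, and the price adjusts toward balance (a crude tatonnement).
import random

random.seed(0)
buyers = [random.uniform(5, 15) for _ in range(100)]   # max willingness to pay
sellers = [random.uniform(5, 15) for _ in range(100)]  # min acceptable price

price = 1.0
for step in range(200):
    demand = sum(1 for b in buyers if b >= price)   # buyers who would buy
    supply = sum(1 for s in sellers if s <= price)  # sellers who would sell
    price += 0.01 * (demand - supply)               # excess demand raises price

print(f"converged price ~= {price:.2f}")  # near where the two curves cross
```

The point is only the structural analogy: many local choices plus feedback behave like an optimizer, with no trader aiming at the equilibrium.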
o-o on Hot take: The AI safety movement is way too sectarian and this is greatly increasing p(doom)
Is there evidence that METR had more than nominal impact? I also think the lack of clout will limit his influence in the government. To some government employee, he's just someone from a random startup they've never heard of trying to exert outsized influence.
bec-hawk on Stephen Fowler's Shortform
I was more focused on the ‘company’ part. To my knowledge there is no such thing as a non-profit company?
buck on Stephen Fowler's Shortform
As a non-profit it is obligated not to take opportunities to profit, unless those opportunities are part of satisfying its altruistic mission.
buck on Stephen Fowler's Shortform
In your initial post, it sounded like you were trying to say:
This grant was obviously ex ante bad. In fact, it's so obvious that it was ex ante bad that we should strongly update against everyone involved in making it.
I think that this argument is in principle reasonable. But to establish it, you have to demonstrate that the grant was extremely obviously ex ante bad. I don't think your arguments here come close to persuading me of this.
For example, re governance impact, when the board fired sama, markets thought it was plausible he would stay gone. If that had happened, I don't think you'd assess the governance impact as "underwhelming". So I think that (if you're in favor of sama being fired in that situation, which you probably are) you shouldn't consider the governance impact of this grant to be obviously ex ante ineffective.
I think that arguing about the impact of grants requires much more thoroughness than you're using here. I think your post has a bad "ratio of heat to light": you're making a provocative claim but not really spelling out why you believe the premises.
jacques-thibodeau on Ilya Sutskever and Jan Leike resign from OpenAI [updated]
In case people missed this, another safety researcher recently left OpenAI: Ryan Lowe.
I don't know Ryan's situation, but he was a "research manager working on AI alignment."
dave-orr on Hot take: The AI safety movement is way too sectarian and this is greatly increasing p(doom)
I think your model is a bit simplistic. METR has absolutely influenced the behavior of the big labs, including DeepMind. Even if all impact goes through the big labs, you could have more influence outside a lab than as one of many employees within one. Being the head of a regulatory agency that oversees the labs sets policy in a much more direct way than a mid-level exec within a company can.
andeslodes on keltan's Shortform
I'm confused by what you mean when you say GPT-4o is bad. In my experience it has been stronger than plain GPT-4, especially at more complex tasks. I do physics research, and it's the first model that can actually improve the computational efficiency of parts of my code that implement physical models. It has also become more useful for discussing my research, in the sense that it dives deeper into specialized topics, where the previous GPT-4 would just respond in a very handwavy way.
amalthea on [Linkpost] Please don't take Lumina's anticavity probiotic
It's not an entirely unfair characterization.