Comments

Comment by sepiatone on Model evals for dangerous capabilities · 2024-09-23T11:58:06.670Z · LW · GW

"Open-sourcing scaffolding or sharing techniques is supererogatory."

I would think sharing the scaffolding would be important, since stronger scaffolding could skew evaluation results. From the full paragraph, you seem to suggest that sufficient information about the scaffolding should be published, so I'm curious what you mean here.

Comment by sepiatone on EIS II: What is “Interpretability”? · 2024-06-06T05:01:46.333Z · LW · GW