Posts

Hopenope's Shortform 2024-12-22T10:52:39.610Z

Comments

Comment by Hopenope (baha-z) on Hopenope's Shortform · 2025-01-31T12:36:02.958Z · LW · GW

Is the bitter lesson also applicable to AI alignment? What do you think?

Comment by Hopenope (baha-z) on Hopenope's Shortform · 2025-01-22T10:34:27.828Z · LW · GW

Is CoT faithfulness already obsolete? How does it survive concepts like latent-space reasoning or RL-based manipulations (R1-Zero)? Is it realistic to think that these highly competitive companies will simply not use them and will ignore the compute efficiency?

Comment by Hopenope (baha-z) on Hopenope's Shortform · 2025-01-09T19:29:50.065Z · LW · GW

I am not sure that longer timelines are always safer. For example, when comparing a two-year timeline to a five-year one, the shorter timeline has a lot of advantages. In both cases you need to outsource a lot of alignment research to AI anyway, and in the shorter timeline the amount of compute and the number of players with significant compute are lower, which reduces both the racing pressure and the takeoff speed.

Comment by Hopenope (baha-z) on Hopenope's Shortform · 2025-01-07T17:31:33.776Z · LW · GW

What happened to the Waluigi effect? It used to be a big issue, and some people were against it, but suddenly it is pretty much forgotten. Is there any related research, or are there any recent demos, that examine it in more detail?

Comment by Hopenope (baha-z) on Hopenope's Shortform · 2024-12-28T20:20:24.366Z · LW · GW

If you have a very short timeline, and you don't think that alignment is solvable in such a short time, then what can you still do to reduce x-risk?

Comment by Hopenope (baha-z) on Hopenope's Shortform · 2024-12-22T10:52:39.708Z · LW · GW

Many expert-level benchmarks greatly overestimate the range and diversity of their experts' knowledge. A person with a PhD in physics is probably at undergraduate level in many parts of physics that are unrelated to their research area, and sometimes we see this even within an expert's own domain (neurologists usually forget about nerves that are not clinically relevant).