Comments

Comment by voyantvoid on A Golden Age of Building? Excerpts and lessons from Empire State, Pentagon, Skunk Works and SpaceX · 2023-09-05T14:18:54.052Z · LW · GW

For the Skunk Works and SpaceX examples, I found myself wondering whether some aspects, like powerful and decisive managers, are strictly better or merely increase variance, and so appear more often when looking only at the most successful projects. I haven't read much of the primary and secondary sources for progress studies. How easy would it be to find details of the practices of average or failed projects to compare against?

Comment by voyantvoid on Paper: On measuring situational awareness in LLMs · 2023-09-05T12:07:16.869Z · LW · GW

The Curse of Reversal seems to match the lack of bidirectionality of ROME edits mentioned here: https://www.alignmentforum.org/posts/QL7J9wmS6W2fWpofd/but-is-it-really-in-rome-an-investigation-of-the-rome-model

Comment by voyantvoid on Attempted Gears Analysis of AGI Intervention Discussion With Eliezer · 2021-11-17T21:57:52.684Z · LW · GW

RE: claim 25 about the need for research organisations: my first thought is that government national security organisations might be suitable venues for this kind of research, as they have several apparent advantages:

  • Large budgets
  • Existing culture and infrastructure for research in secret with internal compartmentalisation
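  • Comparatively good track record for keeping results secret in crypto, such as GCHQ's early discovery of public-key cryptography (including an equivalent of RSA) or the NSA's prior knowledge of differential cryptanalysis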
  • Routes to internal prestige and advancement without external publication
  • Preventing the creation of unaligned AI would accord with their national security goals

However, they may introduce problems of their own:

  • Clearance requirements limit the talent pool that can work with them
  • As government organisations with less of a start-up culture, they may be less accommodating of this kind of research
  • An information leak that one organisation is researching this area could lead to international arms races
  • Any tools developed that are suitable for public release may be seen as untrustworthy by association, as with the skepticism towards the NSA's crypto advice
  • A research group would be more beholden to higher-ups who would likely be less sympathetic to the necessity of alignment work compared to capability work

Has this option been discussed already?