Posts

Catastrophic Cyber Capabilities Benchmark (3CB): Robustly Evaluating LLM Agent Cyber Offense Capabilities (2024-11-05T01:01:08.083Z)

Comments