On April 15, 2025, OpenAI released an updated Preparedness Framework, its process for tracking and preparing for frontier AI capabilities that could create new risks of severe harm. The revision focuses on how those risks are identified, prioritized, and mitigated.
Key Updates:
- Refined Risk Assessment Criteria: OpenAI now prioritizes risks that are plausible, measurable, severe, net new, and either instantaneous or irremediable. A capability is tracked when its risk meets all five criteria, which gives a consistent test for deciding which threats receive dedicated evaluations and safeguards.
- Updated Capability Categories:
  - Tracked Categories: Biological and Chemical capabilities, Cybersecurity, and AI Self-improvement, all areas with established evaluations and safeguards.
  - Research Categories: Long-range Autonomy, Sandbagging (intentional underperformance), Autonomous Replication and Adaptation, Undermining Safeguards, and Nuclear and Radiological risks. These areas could lead to severe harm but do not yet meet the criteria to be Tracked Categories.
- Operational Enhancements: The framework gives clearer guidance on how safeguards are evaluated, governed, and disclosed, with the aim of making the safety process more transparent and actionable.
Source: https://openai.com/index/updating-our-preparedness-framework/