Collaboration for AI Model Safety Research
OpenAI and Anthropic have recently signed an agreement with the AI Safety Institute, housed within the National Institute of Standards and Technology (NIST), to collaborate on AI model safety research, testing, and evaluation. Under the partnership, both companies will give the AI Safety Institute access to major new AI models before and after their public release. The arrangement mirrors the pre-release testing of foundation models that AI developers already conduct with the U.K.'s AI Safety Institute.
The Importance of Responsible AI Development
Elizabeth Kelly, Director of the AI Safety Institute, welcomed the collaborations, calling them an important milestone in advancing the science of AI safety and responsible AI development. The AI Safety Institute will also provide feedback to OpenAI and Anthropic on potential safety improvements to their models, working closely with its counterparts at the U.K. AI Safety Institute.
Defining U.S. AI Rules
Both OpenAI and Anthropic believe that signing this agreement with the AI Safety Institute will help shape how the U.S. develops rules for responsible AI. Jason Kwon, OpenAI's Chief Strategy Officer, expressed the company's support for the institute's mission and its role in defining U.S. leadership in responsible AI development. OpenAI has been vocal about the need for regulation of AI systems and is committed to providing its models to government agencies for safety testing and evaluation ahead of release.
Strengthening Safety Measures
Anthropic, which has hired some of OpenAI's safety and superalignment team, had already sent its Claude 3.5 Sonnet model to the U.K.'s AI Safety Institute for testing before its public release. Jack Clark, Anthropic's Co-founder and Head of Policy, emphasized the importance of working with the U.S. AI Safety Institute to rigorously test the company's models and to identify and mitigate risks, saying the partnership sets new benchmarks for safe and trustworthy AI development.
Regulating Model Safety
The U.S. AI Safety Institute at NIST was established through the Biden administration's executive order on AI. While the executive order calls on AI model developers to submit models for safety evaluation before public release, it is not legislation and could be rescinded by a future administration. NIST acknowledges that submitting models for safety evaluation remains voluntary but believes doing so will contribute to the safe and trustworthy development and use of AI.
Addressing Concerns and Ensuring Accountability
While the agreement between the U.S. AI Safety Institute, OpenAI, and Anthropic is viewed by AI safety groups as a step in the right direction, concerns remain about the vagueness of the term "safety" and the absence of clear regulations in the field. Nicole Gill, Executive Director and Co-founder of Accountable Tech, emphasized that AI companies must follow through on their promises and commitments, and that regulators need insight into the rapid development of AI to ensure better and safer products.
In conclusion, the collaboration between OpenAI, Anthropic, and the U.S. AI Safety Institute represents a significant step toward responsible AI development and model safety. Through joint testing and evaluation, the companies aim to advance the science of AI safety and define U.S. leadership in the field. It remains crucial, however, that AI companies follow through on their commitments and that regulators monitor the development and use of AI to ensure it is safe and trustworthy.