
Anthropic Launches Program to Fund Development of New AI Benchmarks

Anthropic, an AI company, has announced the launch of a program aimed at funding the development of new benchmarks to evaluate the performance and impact of AI models. The program will provide grants to third-party organizations that can effectively measure advanced capabilities in AI models. According to Anthropic, the demand for high-quality evaluations in AI safety is outpacing the supply.

The current benchmarks used in AI testing often fail to capture how the average person uses AI systems. Additionally, some benchmarks were released before the advent of generative AI and may no longer measure what they claim to measure. Anthropic proposes creating challenging benchmarks focused on AI security and societal implications. These benchmarks would assess a model’s ability to carry out cyberattacks, enhance weapons of mass destruction, and manipulate or deceive people through deepfakes or misinformation.

Anthropic also intends to support research into benchmarks that probe AI’s potential for aiding in scientific study, conversing in multiple languages, mitigating biases, and self-censoring toxicity. To achieve this, Anthropic envisions new platforms that allow subject-matter experts to develop their own evaluations and large-scale trials involving thousands of users. The company has hired a full-time coordinator for the program and may purchase or expand projects that have scaling potential.

While Anthropic’s effort to support new AI benchmarks is commendable, some may question its intentions given its commercial ambitions in the AI race. Anthropic wants the evaluations it funds to align with its own AI safety classifications, which could effectively force applicants to accept definitions of “safe” or “risky” AI that they may not fully agree with.

There may also be skepticism regarding Anthropic’s references to “catastrophic” and “deceptive” AI risks, such as those involving nuclear weapons. Many experts argue that there is little evidence AI will gain world-ending capabilities anytime soon. They contend that claims of imminent “superintelligence” draw attention away from pressing issues like AI’s tendency to hallucinate.

Overall, Anthropic hopes its program will serve as a catalyst for making comprehensive AI evaluation an industry standard. However, it remains to be seen whether other efforts to create better AI benchmarks will be willing to collaborate with an AI vendor whose loyalty ultimately lies with its shareholders.