Anthropic, a San Francisco-based AI startup founded by researchers who broke away from OpenAI, has published an overview of its red-teaming practices, outlining four approaches and the advantages and disadvantages of each. Red teaming, the security practice of attacking one’s own system to uncover and fix vulnerabilities, has taken on a prominent role in discussions of AI regulation. The Biden administration’s AI executive order mandates that companies developing high-risk foundation models notify the government during training and share all red-teaming results, while the EU AI Act also contains requirements around providing information from red teaming. Anthropic’s approaches include using language models to red team, red teaming in multiple modalities, domain-specific expert red teaming, and open-ended, general red teaming. The company concludes with policy recommendations, including suggestions to fund and encourage third-party red teaming and to create clear policies tying the scaling of development and the release of new models to red-teaming results. As lawmakers rally around red teaming as a way to ensure powerful AI models are developed safely, the practice deserves close scrutiny.
