• Build the foundational systems that power Anthropic's Safeguards infrastructure — including data storage and management, metric and evaluation systems, and tooling for human and agentic review.
• Develop robust, multi-layered defences for real-time improvement of safety mechanisms that work at scale, detecting unwanted model behaviours and preventing disallowed use.
• Ensure the day-to-day running of Safeguards systems to a high operational bar that serves both safety and customers while reducing the human intervention required.
• Work across the stack using Python to build systems that monitor models, prevent misuse, and ensure user well-being.
• Be based at least 25% of the time in Anthropic's London office.
📋 Job Requirements
• Hold a bachelor's degree in Computer Science or Software Engineering, or have comparable experience.
• Have 4–10+ years of experience in a software engineering position.
• Have proficiency in Python.
• Have the ability to work across the stack.
• Have strong communication skills and the ability to explain complex technical concepts to non-technical stakeholders.
🌟 Nice-to-have
• Have experience building trust and safety, anti-spam, fraud, or abuse detection and mitigation mechanisms for AI/ML systems.
• Have experience building metrics and measurement systems or data and privacy management systems.
• Have worked closely with operational teams to build custom internal tooling.
• Be proficient in TypeScript or Rust.
• Have experience with Claude Code or similar agentic coding tools.
🎯 Responsibilities
• Develop foundational infrastructure for Safeguards including data storage, metrics, evaluation systems, and review tooling.
• Build real-time, multi-layered safety defences that operate at scale.
• Ensure reliable day-to-day operation of Safeguards systems.
• Build systems to detect unwanted model behaviours and prevent disallowed model use.
• Reduce the amount of human intervention and oversight required through automation and robust system design.
About Anthropic
😃 What Anthropic offers
• Earn £255,000–£325,000 per year.
• Receive visa sponsorship — Anthropic retains an immigration lawyer and makes every reasonable effort to support visa applications.
• Access optional equity donation matching, generous vacation and parental leave, and flexible working hours.
• Work on foundational safety infrastructure at one of the world's most important AI safety organisations, directly upholding principles of safety, transparency, and oversight.
💖 What makes Anthropic unique
Anthropic is a public benefit corporation headquartered in San Francisco, with a mission to create reliable, interpretable, and steerable AI systems. The Safeguards team builds the foundational safety, oversight, and intervention mechanisms for Anthropic's AI systems — monitoring models, preventing misuse, and ensuring user well-being at scale.