Anthropic

Research Engineer, Machine Learning (RL Velocity)

Engineering

£370,000 - £630,000

Hybrid • Full-time • Senior or above • Visa Sponsorship • London

Posted on 10 Jun 2026

About the role

💼 What you will do

• Own the efficiency and reliability of Anthropic's RL Science stack: the infrastructure, tooling, and systems that let researchers iterate quickly on training runs.
• Build and improve core RL training infrastructure, identifying and removing bottlenecks across the stack through debugging, profiling, and rearchitecting where needed.
• Partner closely with researchers and adjacent engineering teams (inference, sandboxing, and others) to understand pain points and ship tooling that makes them faster.
• Own the reliability and performance of research runs end to end, contributing to design decisions that shape how Anthropic does RL at scale.
• Be based at least 25% of the time in Anthropic's London office.

📋 Job Requirements

• Have strong software engineering fundamentals with a track record of building performant, reliable systems.
• Have worked on ML infrastructure, distributed systems, or research tooling.
• Care about enabling other people's work and find leverage through platforms rather than individual experiments.
• Be comfortable operating across the stack, from low-level performance work to RL algorithms.
• Have a bias toward shipping and iterating quickly with high agency and low ego.

🌟 Nice-to-have

• Have experience with large-scale distributed training including RL, pre-training, or post-training.
• Have familiarity with JAX, PyTorch, or similar ML frameworks.
• Have a track record of operating at the edge of research and infrastructure in a fast-moving environment.

🎯 Responsibilities

• Build and improve the RL training infrastructure researchers depend on day to day.
• Identify and remove bottlenecks across the RL stack through debugging, profiling, and rearchitecting.
• Partner with researchers and adjacent engineering teams to ship tooling that accelerates their work.
• Own reliability and performance of research runs end to end.
• Contribute to design decisions shaping how Anthropic does RL at scale.

About Anthropic

😃 What Anthropic offers

• Earn £370,000–£630,000 per year.
• Receive visa sponsorship: Anthropic retains an immigration lawyer and makes every reasonable effort to support visa applications.
• Access optional equity donation matching, generous vacation and parental leave, and flexible working hours.
• Work on high-leverage infrastructure where small improvements compound across every researcher and every training run at one of the world's leading AI research organisations.

💖 What makes Anthropic unique

Anthropic is a public benefit corporation headquartered in San Francisco, with a mission to create reliable, interpretable, and steerable AI systems. The RL Velocity team owns the efficiency and reliability of Anthropic's RL Science stack, building the infrastructure and tooling that enable researchers to iterate quickly on training runs and ship better models faster. The London team is part of Anthropic's broader RL engineering effort.

Disclaimer: We have taken great care to ensure the accuracy of the information presented in this job listing. However, job details, requirements, and benefits can change at any time. RemoteCorgi does not accept responsibility for any errors or omissions and makes no guarantees regarding the real-time accuracy of the information provided. Some content on this page is written with the help of AI under strict human supervision to maintain our quality standards and incorporate our expertise. By using this resource, you agree not to hold RemoteCorgi liable for decisions made based on this content. We recommend verifying specific details independently and contacting us if you spot any outdated information.
