Data Scientist - Reinforcement Learning
We’re hiring a Data Scientist with hands-on Reinforcement Learning experience to help design and operate pricing and decisioning systems used in live, production environments.
This role focuses on applying RL to real optimisation problems , where models directly influence outcomes and must perform under real-world constraints such as noisy data, delayed rewards, and system trade-offs. This is not a research-only role-production impact matters.
What You’ll Be Working On
- Designing, training, and improving reinforcement learning models for pricing and decision optimisation
- Applying ML to sequential decision-making and optimisation problems
- Deploying, operating, and iterating on models in production environments
- Using AWS ML services including SageMaker and Bedrock
- Partnering closely with data engineering and product teams to integrate models into live systems
- Monitoring model performance and improving behaviour based on real-world feedback
Core Technology Stack
Machine Learning : Reinforcement Learning, decision-focused MLCloud ML : AWS SageMaker, AWS BedrockProgramming : PythonWhat We’re Looking For
Demonstrated experience applying reinforcement learning in production systemsStrong applied ML skills with a focus on decisioning, optimisation, or control problemsSolid Python development skillsExperience deploying, maintaining, and iterating on production ML modelsComfort working with imperfect data, evolving requirements, and real-world constraintsNice to Have
Experience with dynamic pricing, auctions, or optimisation systemsBackground in building ML systems that operate at scaleExperience working closely with data engineering teams on end-to-end ML pipelinesWhy This Role
Work on live systems where models directly impact outcomesHigh ownership from model design through to production performanceMeaningful, technically challenging optimisation problemsOpportunity to apply reinforcement learning beyond theory and experimentationIf you’re excited by seeing reinforcement learning models operate in the real world—and improving them over time-this role is well worth exploring.