Talent.com
Researcher - Reinforcement Learning
Huawei Technologies Canada Co., Ltd. • Edmonton, Alberta, CA
30+ days ago
Job type
  • Temporary
Job description

Huawei Canada has an immediate 12-month contract opening for a Reinforcement Learning Researcher.


About the team:

Founded in 2012, the Noah’s Ark lab has evolved into a prominent research organization with notable achievements in academia and industry. The lab’s mission focuses on advancing artificial intelligence and related fields to benefit the company and society. Driven by impactful, long-term projects, the lab aims to advance state-of-the-art research while integrating innovations into the company's products and services. Its research areas include LLMs, RL, NLP, computer vision, AI theory, and autonomous driving.

About the job:

  • Enabling Large Language Models (LLMs) to learn from experience, interaction, and environment feedback, moving beyond static fine-tuning toward continual, agentic self-improvement.

  • LLM post-training paradigms (e.g., RLHF, GRPO, and reward-free methods).

  • Agentic reinforcement learning for tool-using and browsing-based LLMs trained in interactive environments.

  • Agentic evaluation and benchmarking, including design of multi-turn, verifiable reasoning tasks.

  • Your work will involve implementing and evaluating new training and evaluation pipelines for reasoning-enhanced LLMs and tool-using agents, scaling experiments on large GPU clusters, and contributing to scientific insights and publications in this emerging area.


Job requirements

About the ideal candidate:

  • PhD in Computer Science or a related field, or a master's degree with comparable experience.

  • Strong foundation in deep learning, including architectures such as Transformers and optimization techniques for large models.

  • Practical or research experience in reinforcement learning, self-supervised learning, or language model fine-tuning.

  • Proven research record in AI, with at least one first-author paper at a top-tier venue such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, or ICRA.

  • Solid proficiency in Python and experience with PyTorch, DeepSpeed, Megatron, and other distributed training frameworks.

  • Familiarity with LLM post-training pipelines (RLHF, GRPO/PPO, SFT, LoRA, MoE, etc.) is an asset.

  • Experience with multi-agent RL or with tool-use, browser, or coding agents is an asset.

  • Strong communication and writing skills; enthusiasm for open research and collaborative problem-solving.

Huawei aims to support a French-speaking work environment for its employees in Quebec. We have taken steps to avoid requiring a language other than French for this position. However, proficiency in English is essential for this role for the following reasons:

The person will be required to communicate regularly with colleagues located outside Quebec, where English is the primary language used for communication between offices. In addition, the nature of the tasks related to this position, which falls within a highly specialized field of artificial intelligence, also requires knowledge of English.
