Researcher - Reinforcement Learning

Huawei Technologies Canada Co., Ltd. • Edmonton, Alberta, CA
30+ days ago
Job type
  • Temporary
Job description

Huawei Canada has an immediate 12-month contract opening for a Reinforcement Learning Researcher.


About the team:

Founded in 2012, the Noah’s Ark Lab has evolved into a prominent research organization with notable achievements in academia and industry. The lab’s mission is to advance artificial intelligence and related fields to benefit the company and society. Driven by impactful, long-term projects, the lab aims to advance state-of-the-art research while integrating innovations into the company's products and services, in areas including LLMs, RL, NLP, computer vision, AI theory, and autonomous driving.

About the job:

  • Enabling Large Language Models (LLMs) to learn from experience, interaction, and environment feedback, moving beyond static fine-tuning toward continual, agentic self-improvement.

  • LLM post-training paradigms (e.g., RLHF, GRPO, and reward-free methods).

  • Agentic reinforcement learning for tool-using and browsing-based LLMs trained in interactive environments.

  • Agentic evaluation and benchmarking, including design of multi-turn, verifiable reasoning tasks.

  • Your work will involve implementing and evaluating new training and evaluation pipelines for reasoning-enhanced LLMs and tool-using agents, scaling experiments on large GPU clusters, and contributing to scientific insights and publications in this emerging area.


Job requirements

About the ideal candidate:

  • PhD in Computer Science or a related field, or a master's degree with comparable experience.

  • Strong foundation in deep learning, including architectures such as Transformers and optimization techniques for large models.

  • Practical or research experience in reinforcement learning, self-supervised learning, or language model fine-tuning.

  • Proven research record in AI, with at least one first-author paper at a top-tier venue such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, or ICRA.

  • Solid proficiency in Python and experience with PyTorch, DeepSpeed, Megatron and other distributed training frameworks.

  • Familiarity with LLM post-training pipelines (RLHF, GRPO/PPO, SFT, LoRA, MoE, etc.) is an asset.

  • Experience with multi-agent RL or with tool-use, browser, or coding agents is an asset.

  • Strong communication and writing skills; enthusiasm for open research and collaborative problem-solving.

Huawei aims to support a French-speaking work environment for its employees in Quebec. We have taken steps to avoid requiring a language other than French for this position. However, proficiency in English is essential for this role for the following reasons:

The person will be required to communicate regularly with colleagues located outside Quebec, where English is the primary language used for communication between offices. In addition, the nature of the tasks related to this position, which falls within a highly specialized field of artificial intelligence, also requires knowledge of English.
