Talent.com
Researcher - Reinforcement Learning
Huawei Canada • Markham, York Region, CA
10 days ago
Job type
  • Temporary
Job description

Huawei Canada has an immediate 12-month contract opening for a Reinforcement Learning Researcher.

About the team

Founded in 2012, the Noah’s Ark Lab has evolved into a prominent research organization with notable achievements in academia and industry. The lab’s mission is to advance artificial intelligence and related fields to benefit the company and society. Driven by impactful, long‑term projects, the lab aims to advance state‑of‑the‑art research in areas including LLMs, RL, NLP, computer vision, AI theory, and autonomous driving, while integrating innovations into the company’s products and services.

About the job

The role centers on enabling Large Language Models (LLMs) to learn from experience, interaction, and environment feedback, moving beyond static fine‑tuning toward continual, agentic self‑improvement. Research areas include:

  • LLM post‑training paradigms (e.g., RLHF, GRPO, and reward‑free methods).
  • Agentic reinforcement learning for tool‑using and browsing‑based LLMs trained in interactive environments.
  • Agentic evaluation and benchmarking, including the design of multi‑turn, verifiable reasoning tasks.

Your work will involve implementing and evaluating new training and evaluation pipelines for reasoning‑enhanced LLMs and tool‑using agents, scaling experiments on large GPU clusters, and contributing to scientific insights and publications in this emerging area.

About the ideal candidate

  • PhD in Computer Science or a related field, or a master’s degree with comparable experience.
  • Strong foundation in deep learning, including architectures such as Transformers and optimization techniques for large models.
  • Practical or research experience in reinforcement learning, self‑supervised learning, or language model fine‑tuning.
  • Proven research record in AI, with at least one first‑author paper at a top‑tier venue such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, or ICRA.
  • Solid proficiency in Python and experience with PyTorch, DeepSpeed, Megatron, and other distributed training frameworks.
  • Familiarity with LLM post‑training pipelines (RLHF, GRPO / PPO, SFT, LoRA, MoE, etc.) is an asset.
  • Experience with multi‑agent RL or with tool‑use, browser, or coding agents is an asset.
  • Strong communication and writing skills; enthusiasm for open research and collaborative problem‑solving.

Seniority Level

Entry level

Employment type

Contract

Job function / Industries

Human Resources / Telecommunications
