Talent.com
Researcher - Reinforcement Learning
Huawei Canada • Markham, York Region, CA
12 days ago
Job type
  • Temporary
Job description

Huawei Canada has an immediate 12-month contract opening for a Reinforcement Learning Researcher.

About the team

Founded in 2012, the Noah’s Ark Lab has evolved into a prominent research organization with notable achievements in academia and industry. The lab’s mission is to advance artificial intelligence and related fields to benefit the company and society. Driven by impactful, long‑term projects, the lab aims to advance state‑of‑the‑art research while integrating innovations into the company’s products and services, in areas including LLMs, RL, NLP, computer vision, AI theory, and autonomous driving.

About the job

  • Enabling Large Language Models (LLMs) to learn from experience, interaction, and environment feedback, moving beyond static fine‑tuning toward continual, agentic self‑improvement.
  • LLM post‑training paradigms (e.g., RLHF, GRPO, and reward‑free methods).
  • Agentic reinforcement learning for tool‑using and browsing‑based LLMs trained in interactive environments.
  • Agentic evaluation and benchmarking, including the design of multi‑turn, verifiable reasoning tasks.

Your work will involve implementing and evaluating new training and evaluation pipelines for reasoning‑enhanced LLMs and tool‑using agents, scaling experiments on large GPU clusters, and contributing to scientific insights and publications in this emerging area.

About the ideal candidate

  • PhD in Computer Science or a related field, or a master’s degree with comparable experience.
  • Strong foundation in deep learning, including architectures such as Transformers and optimization techniques for large models.
  • Practical or research experience in reinforcement learning, self‑supervised learning, or language model fine‑tuning.
  • Proven research record in AI, with at least one first‑author paper at a top‑tier venue such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, or ICRA.
  • Solid proficiency in Python and experience with PyTorch, DeepSpeed, Megatron, and other distributed training frameworks.
  • Familiarity with LLM post‑training pipelines (RLHF, GRPO / PPO, SFT, LoRA, MoE, etc.) is an asset.
  • Experience with multi‑agent RL or with tool‑use, browser, or coding agents is an asset.
  • Strong communication and writing skills; enthusiasm for open research and collaborative problem‑solving.

Seniority Level

Entry level

Employment type

Contract

Job function / Industries

Human Resources / Telecommunications
