Researcher - Reinforcement Learning

Huawei Canada • Markham, York Region, CA

12 days ago

Job type

Temporary

Job description

Huawei Canada has an immediate 12-month contract opening for a Reinforcement Learning Researcher.

About the team

Founded in 2012, the Noah’s Ark Lab has evolved into a prominent research organization with notable achievements in academia and industry. The lab’s mission is to advance artificial intelligence and related fields to benefit the company and society. Driven by impactful, long‑term projects, the aim is to advance state‑of‑the‑art research while integrating innovations into the company’s products and services, including LLMs, RL, NLP, computer vision, AI theory, and autonomous driving.

About the job

Enabling Large Language Models (LLMs) to learn from experience, interaction, and environment feedback, moving beyond static fine‑tuning toward continual, agentic self‑improvement.

LLM post‑training paradigms (e.g., RLHF, GRPO, reward‑free methods).

Agentic reinforcement learning for tool‑using and browsing‑based LLMs trained in interactive environments.

Agentic evaluation and benchmarking, including design of multi‑turn, verifiable reasoning tasks.

Your work will involve implementing and evaluating new training and evaluation pipelines for reasoning‑enhanced LLMs and tool‑using agents, scaling experiments on large GPU clusters, and contributing to scientific insights and publications in this emerging area.

About the ideal candidate

PhD in Computer Science or a related field, or a master’s degree with comparable experience.
Strong foundation in deep learning, including architectures such as Transformers and optimization techniques for large models.
Practical or research experience in reinforcement learning, self‑supervised learning, or language model fine‑tuning.
Proven research record in AI, with at least one first‑author paper at a top‑tier venue such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, or ICRA.
Solid proficiency in Python and experience with PyTorch, DeepSpeed, Megatron, and other distributed training frameworks.
Familiarity with LLM post‑training pipelines (RLHF, GRPO / PPO, SFT, LoRA, MoE, etc.) is an asset.
Experience with multi‑agent RL or with tool‑use, browser, or coding agents is an asset.
Strong communication and writing skills; enthusiasm for open research and collaborative problem‑solving.

Seniority Level

Entry level

Employment type

Contract

Job function / Industries

Human Resources / Telecommunications
