Talent.com
Researcher - Reinforcement Learning
Huawei Canada • Markham, York Region, CA
Posted 9 days ago
Contract type
  • Temporary
Job description

Huawei Canada has an immediate 12-month contract opening for a Reinforcement Learning Researcher.

About the team

Founded in 2012, the Noah’s Ark Lab has evolved into a prominent research organization with notable achievements in academia and industry. The lab’s mission is to advance artificial intelligence and related fields to benefit the company and society. Driven by impactful, long‑term projects, the lab aims to advance state‑of‑the‑art research while integrating innovations into the company’s products and services, in areas including LLMs, RL, NLP, computer vision, AI theory, and autonomous driving.

About the job

  • Enabling Large Language Models (LLMs) to learn from experience, interaction, and environment feedback, moving beyond static fine‑tuning toward continual, agentic self‑improvement.
  • LLM post‑training paradigms (e.g., RLHF, GRPO, and reward‑free methods).
  • Agentic reinforcement learning for tool‑using and browsing‑based LLMs trained in interactive environments.
  • Agentic evaluation and benchmarking, including the design of multi‑turn, verifiable reasoning tasks.

Your work will involve implementing and evaluating new training and evaluation pipelines for reasoning‑enhanced LLMs and tool‑using agents, scaling experiments on large GPU clusters, and contributing to scientific insights and publications in this emerging area.

About the ideal candidate

  • PhD in Computer Science or a related field, or a master’s degree with comparable experience.
  • Strong foundation in deep learning, including architectures such as Transformers and optimization techniques for large models.
  • Practical or research experience in reinforcement learning, self‑supervised learning, or language model fine‑tuning.
  • Proven AI research record, with at least one first‑author paper at a top‑tier venue such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, or ICRA.
  • Solid proficiency in Python and experience with PyTorch and distributed training frameworks such as DeepSpeed and Megatron.
  • Familiarity with LLM post‑training pipelines (RLHF, GRPO / PPO, SFT, LoRA, MoE, etc.) is an asset.
  • Experience with multi‑agent RL or with tool‑use, browser, or coding agents is an asset.
  • Strong communication and writing skills; enthusiasm for open research and collaborative problem‑solving.

Seniority Level

Entry level

Employment type

Contract

Job function / Industries

Human Resources / Telecommunications
