Researcher - Reinforcement Learning

Huawei Canada • Markham, York Region, CA

Posted 9 days ago

Contract type

Temporary

Job description

Huawei Canada has an immediate 12-month contract opening for a Reinforcement Learning Researcher.

About the team

Founded in 2012, Noah’s Ark Lab has grown into a prominent research organization with notable achievements in academia and industry. Its mission is to advance artificial intelligence and related fields to benefit the company and society. Driven by impactful, long‑term projects, the lab aims to advance state‑of‑the‑art research while integrating innovations into the company’s products and services, including LLMs, RL, NLP, computer vision, AI theory, and autonomous driving.

About the job

The role focuses on enabling Large Language Models (LLMs) to learn from experience, interaction, and environment feedback, moving beyond static fine‑tuning toward continual, agentic self‑improvement. Focus areas include:

LLM post‑training paradigms (e.g., RLHF, GRPO, and reward‑free methods).

Agentic reinforcement learning for tool‑using and browsing‑based LLMs trained in interactive environments.

Agentic evaluation and benchmarking, including design of multi‑turn, verifiable reasoning tasks.

Your work will involve implementing and evaluating new training and evaluation pipelines for reasoning‑enhanced LLMs and tool‑using agents, scaling experiments on large GPU clusters, and contributing to scientific insights and publications in this emerging area.

About the ideal candidate

PhD in Computer Science or a related field, or a master’s degree with comparable experience.
Strong foundation in deep learning, including architectures such as Transformers and optimization techniques for large models.
Practical or research experience in reinforcement learning, self‑supervised learning, or language model fine‑tuning.
Proven research record in AI, with at least one first‑author paper at a top‑tier venue such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, or ICRA.
Solid proficiency in Python and experience with PyTorch, DeepSpeed, Megatron, and other distributed training frameworks.
Familiarity with LLM post‑training pipelines (RLHF, GRPO / PPO, SFT, LoRA, MoE, etc.) is an asset.
Experience with multi‑agent RL or with tool‑use, browser, or coding agents is an asset.
Strong communication and writing skills; enthusiasm for open research and collaborative problem‑solving.

Seniority Level

Entry level

Employment type

Contract

Job function / Industries

Human Resources / Telecommunications

