Researcher - Reinforcement Learning

Huawei Technologies Canada Co., Ltd. • Edmonton, Alberta, CA
Posted more than 30 days ago
Contract type
  • Temporary
Job description

Huawei Canada has an immediate 12-month contract opening for a Reinforcement Learning Researcher.


About the team:

Founded in 2012, the Noah’s Ark Lab has evolved into a prominent research organization with notable achievements in academia and industry. The lab’s mission is to advance artificial intelligence and related fields to benefit the company and society. Driven by impactful, long-term projects, the lab aims to advance the state of the art while integrating innovations into the company's products and services, in areas including LLMs, RL, NLP, computer vision, AI theory, and autonomous driving.

About the job:

  • Enabling Large Language Models (LLMs) to learn from experience, interaction, and environment feedback, moving beyond static fine-tuning toward continual, agentic self-improvement.

  • LLM post-training paradigms (e.g., RLHF, GRPO, reward-free methods).

  • Agentic reinforcement learning for tool-using and browsing-based LLMs trained in interactive environments.

  • Agentic evaluation and benchmarking, including design of multi-turn, verifiable reasoning tasks.

  • Your work will involve implementing and evaluating new training and evaluation pipelines for reasoning-enhanced LLMs and tool-using agents, scaling experiments on large GPU clusters, and contributing to scientific insights and publications in this emerging area.


Job requirements

About the ideal candidate:

  • PhD in Computer Science or a related field, or a master's degree with comparable experience.

  • Strong foundation in deep learning, including architectures such as Transformers and optimization techniques for large models.

  • Practical or research experience in reinforcement learning, self-supervised learning, or language model fine-tuning.

  • Proven research record in AI, with at least one first-author paper at a top-tier venue such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, or ICRA.

  • Solid proficiency in Python and experience with PyTorch, DeepSpeed, Megatron and other distributed training frameworks.

  • Familiarity with LLM post-training pipelines (RLHF, GRPO/PPO, SFT, LoRA, MoE, etc.) is an asset.

  • Experience with multi-agent RL or with tool-use, browser, or coding agents is an asset.

  • Strong communication and writing skills; enthusiasm for open research and collaborative problem-solving.

Huawei aims to support a French-speaking work environment for its employees in Quebec. We have taken steps to avoid requiring a language other than French for this position. However, proficiency in English is essential for this role for the following reasons:

The person will be required to communicate regularly with colleagues located outside Quebec, where English is the primary language used for communication between offices. In addition, the nature of the tasks related to this position, which falls within a highly specialized field of artificial intelligence, also requires knowledge of English.
