Talent.com

Performance Jobs in Kitchener, ON

Last updated: 1 day ago
Principal Engineer AI Inference Performance

Huawei Technologies Canada Co., Ltd.
Waterloo, ON, CA
Permanent (CDI)
Last updated: 30+ days ago
restaurant manager

Mary Brown's Chicken & Taters
Kitchener, ON, CA
CAD34.1 hourly
Permanent
Last updated: 10 days ago
restaurant manager

A&W 4955
Kitchener, ON, CA
CAD22 hourly
Permanent
Last updated: 6 days ago
Building Department Manager / Director

CanMar Recruitment
Kitchener Central, Ontario, Canada
CAD45000–CAD50000 yearly
Direct Placement
Last updated: 30+ days ago
Service Center Manager

All-Brite Glass & Tint
Kitchener, Ontario, Canada
CAD60 hourly
Full-time
Last updated: 30+ days ago
Remote Customer Care, Sales & Team Manager-SD

AO GLOBE LIFE
Kitchener, ON, CA
Remote
Full-time
Last updated: 3 days ago
Performance Coach (Personal Trainer)

Movati Athletic
Waterloo, ON NTA, CAN
CAD25–CAD30 hourly
Full-Time
Last updated: 30+ days ago
Commercial Insurance Sales Manager

Western Coast Insurance Services
Kitchener, Ontario, Canada
CAD75000–CAD112000 yearly
Full-time
Last updated: 30+ days ago
Badminton coach

KCKW Badminton
Kitchener, ON, CA
CAD27–CAD28.5 hourly
Permanent
Last updated: 18 days ago
DB2 DBA

Vodastra
Kitchener, ON, CA
CAD45 hourly
Full-time
Last updated: 30+ days ago
Sales Enablement Program Specialist

Waste Management
Kitchener, ON, Canada
CAD97000–CAD120000 yearly
Full-time
Last updated: 30+ days ago
  • Promoted
Administrative Assistant

InsideHigherEd
Kitchener, Ontario, Canada
CAD75673 yearly
Full-time
Last updated: 1 day ago
Project Manager, CAN Health (2 year contract)

Communitech
Kitchener, ON, CA
CAD122900 yearly
Communitech Contract
Last updated: 30+ days ago
Manager, Product Personal Auto - 9516

DGA Careers
Kitchener, Ontario
CAD80000–CAD95000 yearly
Full-Time, Permanent
Last updated: 30+ days ago
Assistant Branch Manager

BMO
Kitchener, ON
CAD42300–CAD78400 yearly
Part-time
Last updated: 30+ days ago
Store Manager, Fairway Road

PartSource
Kitchener, ON
CAD33.65 hourly
Full-time
Last updated: 30+ days ago
restaurant manager

OSCAR'S FAMILY RESTAURANT AND PIZZERIA
Kitchener, ON, CA
CAD22 hourly
Permanent
Last updated: 18 days ago
restaurant manager

Pizza Hut
Kitchener, ON, CA
CAD20.13 hourly
Permanent
Last updated: 18 days ago
restaurant manager

SHIVAS DOSA RESTAURANT
Kitchener, ON, CA
CAD28.5 hourly
Permanent
Last updated: 18 days ago
Merchandise Processing Manager

Talize
Kitchener, ON, CAN
CAD90000–CAD105000 yearly
Full-time
Last updated: 30+ days ago
Principal Engineer AI Inference Performance

Huawei Technologies Canada Co., Ltd.
Waterloo, ON, CA
30+ days ago
Job description

Our team has an immediate permanent opening for a Principal Engineer.

Responsibilities:

  • Develop and maintain real-time and historical performance monitoring tools for AI inference workloads, including profiling tools for various AI model types (small models, LLMs, VLMs, and multimodal systems) in applications like conversational AI, video processing, and real-time analytics.
  • Analyze and classify inference workloads based on characteristics such as prefill, decode, pre/post-processing overheads, and computational complexity to develop tailored optimization strategies.
  • Develop performance models that consider the systematic factors of AI inference, including model size, architecture (e.g., transformers, CNNs), application-specific constraints (e.g., latency for conversational AI), and compute resource characteristics (GPU, TPU, CPU, and specialized accelerators).
  • Optimize inference workloads across various hardware resources by reducing latency, minimizing memory overhead, and improving throughput. Techniques include quantization, pruning, fusion, and caching. Ensure that models can scale efficiently across diverse compute platforms, from edge devices to large-scale cloud infrastructures.
  • Lead efforts in creating benchmarks for different types of inference tasks. Utilize tools such as NVIDIA Nsight, PyTorch Profiler, and TensorBoard to gain insights into inference performance across diverse hardware platforms.
  • Conduct benchmarking and performance comparisons across various hardware platforms (e.g., GPUs, TPUs, edge accelerators) to identify bottlenecks and optimization opportunities. Provide recommendations for software and hardware improvements based on inference throughput, latency, and power consumption.
  • Work closely with AI research, software engineering, and DevOps teams to improve the end-to-end AI inference pipeline, ensuring optimized deployments across different production environments. Collaborate with system architects to incorporate resource-aware optimizations into design practices.
  • Develop strategies to ensure the scalability of inference workloads in production environments, considering both model performance and resource scaling, whether in on-premises environments, cloud infrastructure, or edge computing devices.

What you’ll bring to the team:

  • Ph.D. or Master’s degree in Computer Science, Electrical Engineering, Machine Learning, or related field.
  • Minimum of 5 years of experience in AI/ML engineering with a focus on inference performance, workload analysis, and system optimization.
  • Extensive experience with AI frameworks (e.g., TensorFlow, PyTorch, ONNX) and model optimization techniques (e.g., quantization, pruning, kernel fusion, and hardware-aware tuning).
  • Proficient with profiling tools (e.g., TensorBoard, PyTorch Profiler, NVIDIA Nsight) and workload analysis for diverse AI models and applications.
  • Expertise in optimizing small models, large language models (LLMs), VLMs, and multimodal models for inference.
  • Strong programming skills in Python, C++, CUDA, and experience with low-level hardware performance tuning.
  • Familiarity with performance modeling methodologies and frameworks for predicting inference workload performance under varying conditions.
  • Proven expertise in data parallelism, model parallelism, pipeline parallelism, and other distributed-systems techniques for improving performance at scale.