About Us
At Lemurian Labs, we're on a mission to bring the power of AI to everyone—without leaving a massive environmental footprint. We care deeply about the impact AI has on our society and planet, and we're building a solid foundation for its future, ensuring AI grows sustainably and responsibly. Innovation should help the world, not harm it.
We are building a high-performance, portable compiler that lets developers "build once, deploy anywhere." Yes, anywhere. We're talking about seamless cross-platform compatibility, so you can train your models in the cloud, deploy them to the edge, and everything in between—all while optimizing for resource efficiency and scalability.
If the idea of sustainably scaling AI motivates you and you're excited about making AI development both powerful and accessible, then we'd love to have you. Join us at Lemurian Labs, where you can have fun building the future—without leaving a mess behind.
The Role
We're looking for a Senior ML Performance Engineer to architect and lead our Performance Testing Platform from the ground up. You'll be the technical authority on how we measure, validate, and optimize the performance of large language models (Llama 3.2 70B, DeepSeek, and others) before and after compiler optimization on modern GPU architectures.
This is a high-impact role where you'll directly influence our product quality and our customers' success. You'll work at the intersection of ML systems, GPU architecture, and performance engineering—building the infrastructure that proves our compiler delivers real value.
What You'll Do
- Design and build a comprehensive performance testing platform for evaluating LLM inference workloads across GPU clusters
- Define and implement the benchmarking methodology, metrics, and test suites that measure latency, throughput, memory utilization, power consumption, and model accuracy
- Establish baseline performance for unoptimized models (Llama 3.2 70B, DeepSeek, etc.) and validate post-optimization improvements
- Develop automated testing pipelines for continuous performance validation across compiler releases and model updates
- Investigate performance bottlenecks using profiling tools (ROCm profilers, GPU traces, system-level monitoring) and work with the compiler team to drive optimizations
- Create dashboards and reporting that provide clear visibility into performance trends, regressions, and wins
- Collaborate cross-functionally with compiler engineers, ML engineers, and DevOps to ensure performance testing is integrated into our development workflow
- Document best practices for performance testing and optimization of ML workloads on GPU hardware
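The benchmarking methodology described above (per-request latency percentiles and token throughput) can be sketched minimally in Python. This is an illustrative harness, not Lemurian Labs' actual platform; the `benchmark` helper, its parameters, and the stand-in inference callable are all hypothetical.

```python
import time
import statistics

def benchmark(run_inference, num_tokens, warmup=2, iters=10):
    """Measure per-request latency and token throughput for an inference callable.

    run_inference: zero-argument callable that executes one inference request
    num_tokens: tokens generated per request, used to derive throughput
    """
    for _ in range(warmup):                  # warm-up runs excluded from statistics
        run_inference()
    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        run_inference()
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    p50 = statistics.median(latencies)
    p95 = latencies[int(0.95 * (len(latencies) - 1))]
    return {
        "p50_s": p50,
        "p95_s": p95,
        "tokens_per_s": num_tokens / p50,    # throughput at median latency
    }

# Stand-in for a real model call; a real harness would invoke the LLM runtime here.
stats = benchmark(lambda: time.sleep(0.01), num_tokens=128)
```

A production platform would extend this with memory and power sampling, GPU-side timing, and accuracy checks, but the latency/throughput core follows the same shape.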
What You'll Bring
- 7+ years of experience in performance engineering, benchmarking, or systems engineering roles
- Deep understanding of ML inference workloads, particularly transformer-based models and LLMs
- Hands-on experience with GPU programming and optimization (CUDA, ROCm, or similar)
- Strong programming skills in Python and C/C++
- Proven track record of building performance testing infrastructure or benchmarking platforms from scratch
- Experience with ML frameworks (PyTorch, TensorFlow, ONNX Runtime, vLLM, TensorRT-LLM, etc.)
- Proficiency with profiling and debugging tools for GPU workloads
- Strong analytical skills with the ability to design experiments, analyze results, and communicate findings clearly
- Experience with CI/CD systems and test automation frameworks
Nice to Have
- Experience with AMD GPUs (MI200/MI300 series) and the ROCm ecosystem
- Knowledge of compiler optimization techniques and their impact on performance
- Experience with distributed inference and multi-GPU workloads
- Familiarity with ML model quantization, pruning, and other optimization techniques
- Background in high-performance computing or systems-level optimization
- Experience with infrastructure-as-code and container tooling (Kubernetes, Docker, Terraform)
- Contributions to open-source ML or systems projects
Personal Attributes
- Obsessive about details: you notice the 2% regression that others miss
- Self-driven: you take ownership and don't wait for permission to solve problems
- Collaborative mindset: you work well across teams and help others succeed
- Passionate about sustainability: you care about making AI more efficient and environmentally responsible
- Clear communicator: you can explain complex technical concepts to both engineers and stakeholders
Salary depends on experience and geographical location.
This salary range may be inclusive of several career levels and will be narrowed during the interview process based on a number of factors, such as the candidate’s experience, knowledge, skills, and abilities, as well as internal equity among our team.
Additional benefits for this role may include: equity, company bonus opportunities, medical, dental, and vision benefits; retirement savings plan; and supplemental wellness benefits.
Lemurian Labs ensures equal employment opportunity without discrimination or harassment based on race, color, religion, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity or expression, age, disability, national origin, marital or domestic/civil partnership status, genetic information, citizenship status, veteran status, or any other characteristic protected by law.