Talent.com
Sr. Systems Design Engineer - Data Center GPU

Advanced Micro Devices • Markham, York Region, CA
30+ days ago
Job type
  • Full-time
Job description

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next‑generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond.

Together, we advance your career.

THE ROLE :

We are looking for a dynamic, energetic Senior Systems Design Engineer to join our growing Data Center GPU team. Our team fosters and encourages continuous technical innovation to showcase successes as well as facilitate continuous career development. In this role, you will work closely with the automation, infrastructure, and validation teams to ensure scalability and reliability. You will also document processes and best practices, and provide training for internal teams. As a key contributor to the success of AMD’s product, you will be part of a leading team to drive and improve AMD’s abilities to deliver the highest quality, industry-leading technologies to market.

THE PERSON :

As a Systems Design Engineer, you will drive balanced, scalable, and automated solutions. In this high-visibility position, your software systems engineering expertise will be essential to product development, definition, and root cause resolution. You will have strong problem-solving and debugging skills, excellent communication and collaboration abilities, and the ability to work in fast-paced, cross-functional environments.

KEY RESPONSIBILITIES :

Containerization & Image Management

  • Design, build, and maintain Docker images optimized for ML / AI workloads.
  • Implement multi‑stage builds, image hardening, and vulnerability scanning.
  • Manage Docker registries (e.g., Harbor) and enforce retention policies for large‑scale deployments.

Automation & Orchestration

  • Develop and maintain Python‑based automation scripts for Conductor workflows.
  • Implement CI / CD pipelines for automated container builds and workload deployment.
  • Integrate orchestration frameworks (Conductor, Kubernetes, Slurm) for multi‑node workload execution.

ML / AI Workload Enablement

  • Enable training and inference workloads using frameworks like PyTorch, TensorFlow, and vLLM.
  • Optimize distributed training and inference across multi‑node clusters using MPI and RDMA.
  • Collaborate with app experts to benchmark and tune performance for AI / HPC workloads.

Infrastructure & Performance

  • Integrate ROCm stack and GPU resource management into containerized environments.
  • Troubleshoot latency, networking, and storage bottlenecks for at‑scale workloads.
  • Implement monitoring and logging for containerized ML workloads.

PREFERRED EXPERIENCE :

  • Strong proficiency in Python and automation frameworks.
  • Hands‑on experience with containers (Docker, Podman) and container orchestration (Kubernetes).
  • Familiarity with CI / CD tools (Jenkins, GitHub Actions) and infrastructure‑as‑code (Terraform, Ansible).
  • Knowledge of ML frameworks (PyTorch, TensorFlow) and GPU acceleration (ROCm, CUDA).
  • Understanding of networking concepts (RDMA, MPI) for distributed workloads.
  • Prior experience enabling ML / AI workloads in production or HPC environments.
  • Exposure to orchestration platforms like Conductor or similar workflow engines.

ACADEMIC CREDENTIALS :

  • Bachelor's or Master's degree in electrical or computer engineering, with a minimum of 5‑7 years of relevant experience

LOCATION : Markham, ON

    Benefits offered are described in AMD benefits at a glance.

    AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee‑based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and / or third‑party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.
