Talent.com
Sr. Systems Design Engineer - Data Center GPU
Advanced Micro Devices • Markham, York Region, CA
30+ days ago
Job type
  • Full-time
Job description

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next‑generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond.

Together, we advance your career.

THE ROLE :

We are looking for a dynamic, energetic Senior Systems Design Engineer to join our growing Data Center GPU team. Our team fosters and encourages continuous technical innovation to showcase successes as well as facilitate continuous career development. In this role, you will work closely with the automation, infrastructure, and validation teams to ensure scalability and reliability. You will also document processes and best practices, and provide training for internal teams. As a key contributor to the success of AMD’s product, you will be part of a leading team to drive and improve AMD’s abilities to deliver the highest-quality, industry-leading technologies to market.

THE PERSON :

As a Systems Design Engineer, you will drive balanced, scalable, and automated solutions. In this high-visibility position, your software systems engineering expertise will be essential to product development, definition, and root cause resolution. You will have strong problem-solving and debugging skills, excellent communication and collaboration abilities, and the ability to work in fast-paced, cross-functional environments.

KEY RESPONSIBILITIES :

Containerization & Image Management

  • Design, build, and maintain Docker images optimized for ML / AI workloads.
  • Implement multi‑stage builds, image hardening, and vulnerability scanning.
  • Manage Docker registries (e.g., Harbor) and enforce retention policies for large‑scale deployments.

Automation & Orchestration

  • Develop and maintain Python‑based automation scripts for Conductor workflows.
  • Implement CI / CD pipelines for automated container builds and workload deployment.
  • Integrate orchestration frameworks (Conductor, Kubernetes, Slurm) for multi‑node workload execution.

ML / AI Workload Enablement

  • Enable training and inference workloads using frameworks like PyTorch, TensorFlow, and vLLM.
  • Optimize distributed training and inference across multi‑node clusters using MPI and RDMA.
  • Collaborate with app experts to benchmark and tune performance for AI / HPC workloads.

Infrastructure & Performance

  • Integrate ROCm stack and GPU resource management into containerized environments.
  • Troubleshoot latency, networking, and storage bottlenecks for at‑scale workloads.
  • Implement monitoring and logging for containerized ML workloads.

PREFERRED EXPERIENCE :

  • Strong proficiency in Python and automation frameworks.
  • Hands‑on experience with Docker and container orchestration (Kubernetes, Podman).
  • Familiarity with CI / CD tools (Jenkins, GitHub Actions) and infrastructure‑as‑code (Terraform, Ansible).
  • Knowledge of ML frameworks (PyTorch, TensorFlow) and GPU acceleration (ROCm, CUDA).
  • Understanding of networking concepts (RDMA, MPI) for distributed workloads.
  • Prior experience enabling ML / AI workloads in production or HPC environments.
  • Exposure to orchestration platforms like Conductor or similar workflow engines.

ACADEMIC CREDENTIALS :

  • Bachelor’s or Master’s degree in electrical or computer engineering, with a minimum of 5‑7 years of relevant experience.

LOCATION : Markham, ON

    Benefits offered are described : AMD benefits at a glance.

    AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee‑based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and / or third‑party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.
