Talent.com
AI Developer – Rack Systems Engineering

Advanced Micro Devices, Inc • MARKHAM, Ontario, Canada
30+ days ago
Job type
  • Full-time
Job description

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

The Role

The AI Developer – Rack Systems Engineering brings AI-driven work automation and intelligent tooling directly into AMD’s rack-scale hardware workflows. This senior technical role designs and implements Python-based and LLM-powered agents that automate and streamline how rack systems are planned, configured, validated, debugged, reported, and managed across the organization.

You will work at the intersection of AI development and rack-level hardware execution, turning complex, cross-discipline engineering activities into reliable automated workflows that reduce manual effort, shorten cycle time, improve process consistency, and increase predictability for leadership and partner teams. As a Senior Member of Technical Staff (SMTS), you are expected to solve complex, non-recurring problems, lead significant changes in existing processes, operate with minimal supervision, and mentor other developers and engineers.

Key Responsibilities

AI Development for Rack-Scale Hardware Workflows

  • Design and build Python-based AI agents, automations, and workflow pipelines that support rack-level configuration, readiness tracking, debug workflows, execution reporting, and day-to-day work automation for rack systems.
  • Develop LLM-driven tools and multi-agent workflows that transform existing rack-systems processes, checklists, spreadsheets, and tribal knowledge into reliable, reusable, and increasingly automated AI flows aligned with DPEG’s AI enablement roadmap.
  • Implement robust data processing, orchestration, and integration logic that connects AI agents to design, planning, validation, and reporting sources used by rack-systems teams.
  • Identify repetitive, manual, and time-consuming engineering tasks and convert them into scalable AI-assisted or fully automated workflows that improve execution speed and consistency.

Cross‑Functional Collaboration

Collaborate closely with the following groups to ensure AI workflows reflect real rack‑scale engineering needs and constraints:

  • Firmware Engineering – Ingest and structure firmware inputs, status, and configuration requirements into AI‑driven flows supporting rack integration and debug.
  • Product Ops – Embed AI agents into product and rack‑systems process development, capacity views, and execution dashboards to reduce manual reporting and improve decision speed.
  • System Design – Align AI logic with system‑level architectures, design constraints, and configuration rules at the rack level.
  • Quality Engineering – Use AI agents to surface risks, coverage gaps, and recurring issue patterns from quality and defect data.
  • PC Board Design – Support board‑level inputs to rack configurations (BOMs, options, constraints) via AI‑assisted extraction and transformation.
  • Hardware Development – Ensure AI workflows accurately represent hardware states, dependencies, and readiness as racks move through development milestones.
  • Failure Engineering – Apply AI‑driven triage, pattern detection, and summarization across failure analysis artifacts, logs, and lessons‑learned to feed back into rack‑systems processes.
  • Systems Architecture – Capture architectural rules, design intents, and trade‑offs into AI logic used to guide rack‑level decisions.
  • Testing and Validation – Automate test‑result aggregation, coverage summaries, and risk views using AI agents integrated into validation workflows and dashboards.

Enterprise‑Ready AI Implementation

  • Use modern AI-assisted developer tools such as GitHub, VS Code, Cursor, Claude-based tools, and OpenAI-style code agents to rapidly prototype, automate, and harden rack-focused AI workflows.
  • Implement AI solutions within AMD’s enterprise AI environment, adhering to internal security, governance, and deployment patterns.
  • Design workflows that can scale across teams while respecting data boundaries, confidentiality, and applicable AI security standards.
  • Build automations that are maintainable, auditable, and suitable for repeated use in production engineering environments rather than one-off prototypes.

Technical Leadership

  • Own end-to-end AI solution definition for selected rack-systems workflows: problem framing, automation opportunity identification, architecture, implementation, deployment, and handoff to ongoing owners.
  • Drive significant improvements in how rack-systems processes are executed, reducing manual steps and accelerating time-to-insight for leadership.
  • Advise and guide peers and cross-team representatives in AI development techniques, workflow automation approaches, AI-tool usage, and best practices for integrating AI into hardware-centric workflows.

Required Experience & Skills

  • Strong, proven Python development capability applied to automation, data processing, or system integration.
  • Hands‑on experience building and deploying LLM‑driven agents, workflows, or tools that deliver measurable impact (e.g., time savings, quality improvement, reduced manual effort).
  • Proficiency with GitHub and modern development workflows (branching, reviews, CI).
  • Familiarity with AI‑assist coding environments such as Cursor, Claude‑based code tools, or similar agentic coding platforms.
  • Experience with hardware development processes and workflows.
  • Experience with board design is desirable; experience with system design is a big plus.
  • Proven track record operating at the following level:
      • Solving complex, non‑recurring technical problems
      • Making technical decisions with 6–12‑month impact
      • Working with minimal supervision
      • Mentoring less‑experienced engineers and influencing multi‑team technical direction

Academic Qualifications

  • Bachelor’s degree in Computer Science, Computer Engineering, Software Engineering, Electrical Engineering, Systems Engineering, or related field.
  • Advanced degree is a plus.

Benefits offered are described in AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD’s “Responsible AI Policy” is available here.

This posting is for an existing vacancy.
