Kubernetes Reliability Engineer at Search Atlas

Search Atlas • Toronto, ON, Canada
4 days ago
Job type
  • Full-time
Job description
Be a key player at Search Atlas, architecting Kubernetes-based platforms that keep AI execution running with 99.99% reliability. This role demands expertise in Terraform, ArgoCD, and high-concurrency systems.

As Platform Reliability Engineer, you will build and maintain the Autonomous Nervous System for Atlas Brain. You’ll optimize ML inference pipelines, automate infrastructure processes, and design self-healing systems. The position requires an innovator who can push the boundaries of operational excellence for our autonomous marketing systems.

Key Responsibilities:
  • Architect and maintain EKS/GKE-based Kubernetes platforms
  • Automate infrastructure deployment with Terraform and ArgoCD
  • Optimize high-concurrency crawling systems for real-time decisions
  • Establish SLOs for AI execution and agent task completion
  • Implement distributed monitoring solutions with OpenTelemetry and Grafana

Requirements:
  • 6+ years in Platform Engineering or SRE roles
  • Mastery of Terraform, ArgoCD, and GitOps workflows
  • Expertise in Kubernetes networking and security
  • Hands-on MLOps experience for autonomous agents
  • Proficiency in Python for automation scripts

With your skills in platform engineering and automation, drive Search Atlas to the next level of autonomous execution.

Similar jobs

Infrastructure & Reliability Engineer at Rootly

Rootly • Toronto, ON, CA
Full-time

Join Rootly as an Infrastructure & Reliability Engineer to optimize incident response mechanisms. This role requires a blend of software engineering and operations expertise in a vibrant startup cul...

Remote AWS DevOps Engineer with Kubernetes Expertise

Newrich, Inc. • Toronto, ON, CA
Remote
Full-time

Become a driving force for innovation as a Remote AWS DevOps Engineer, specializing in Kubernetes and Docker. Embrace ownership in a role focused on automation and cloud services while collaborating...

Expert Site Reliability Engineer at IBM

IBM Computing • Toronto
Full-time

Take your career to the next level as an Expert Site Reliability Engineer with IBM Software, driving Confluent incident management efficiencies. Play a key role in improving cloud-based reliability ...

DevOps & Site Reliability Engineer Role

Appspace • Toronto, ON, CA
Full-time

Shape the future of cloud operations as a DevOps Engineer at Appspace. Focus on automation and performance in a dynamic environment. In your role as Senior DevOps & Site Reliability Engineer, you wil...

Kubernetes Engineer at Pinterest

Pinterest • Toronto, Ontario, Canada
Full-time

Transform technology at Pinterest as a Kubernetes Engineer focused on reliability and innovation. Collaborate with teams in Ontario to enhance system performance and automate processes. Pinterest’s P...

Remote Platform Engineer — Cloud & Kubernetes Ops

Planet • Toronto, ON, CA
Remote
Full-time

A leading global space and data company is seeking a Software Engineer in Platform Operations. This full-time remote role prioritizes building and operating cloud infrastructure supporting engineeri...

Remote Senior Site Reliability Engineer Role

Viafoura • Toronto, ON, CA
Remote
Full-time

Advance your career as a Senior Site Reliability Engineer at Viafoura, specializing in Kubernetes and AWS infrastructure. This remote role positions you to improve our platform's performance and sca...

DevOps Engineer - AWS, CI/CD & Reliability Focus

Newton • Toronto, ON, CA
Full-time

A leading cryptocurrency firm in Canada is seeking a DevOps Engineer to improve CI/CD workflows and manage infrastructure. The ideal candidate will have experience with AWS, automation, and operatio...

Forward Deployed Senior Engineer at Cloudflare

Cloudflare Area 1 Security • Toronto, ON, CA
Full-time

Transform customer solutions as a Forward Deployed Senior Engineer with Cloudflare in Toronto, ON. This role combines deep technical expertise with real customer engagement and production-level impa...

Senior Site Reliability Engineer (Remote-First)

VySystems • Toronto, ON, CA
Remote
Full-time

A leading technology company is seeking a Senior Site Reliability Engineer with robust Kubernetes knowledge to work remotely. Ideal candidates have over 6 years of experience in IT disciplines, prof...

Remote AWS DevOps Engineer with Kubernetes Expertise

Newrich Network • Toronto, ON, CA
Remote
Full-time

Become a driving force for innovation as a Remote AWS DevOps Engineer, specializing in Kubernetes and Docker. Embrace ownership in a role focused on automation and cloud services while collaborating...

Lead Site Reliability Engineer at iManage

iManage • Toronto, ON, CA
Full-time

Advance your career as a Lead Site Reliability Engineer at iManage, focused on maintaining and enhancing cloud resilience while enjoying flexible work arrangements. You will play a crucial role in d...

Lead DevOps Engineer at TekWissen

TekWissen ® • Markham, ON, CA
Full-time

Empower your career as a Lead DevOps Engineer with TekWissen in Ann Arbor, Michigan. Collaborate on cutting-edge semiconductor technologies while mastering CI/CD processes and deployment strategies....

Innovative Site Reliability Engineer for Cloud and AI Solutions

Themesoft Inc. • Toronto
Full-time

Lead the charge in site reliability engineering focusing on cloud systems and AI-driven observability. Leverage your strong Python scripting and experience with tools like PagerDuty and Moogsoft. In ...

Senior Reliability Engineer - AWS & Kubernetes

Cloudbeds • Toronto, ON, CA
Full-time

Step into the role of Senior Site Reliability Engineer at Cloudbeds and help innovate our hospitality solutions with AWS and Kubernetes. Enjoy a fully remote position that promotes a diverse, collab...

Senior Site Reliability Engineer Focused on Kubernetes Infrastructure

Chainlink Labs • Toronto, ON, CA
Full-time

Elevate decentralized architecture as a Senior Site Reliability Engineer. Spearhead Kubernetes-based infrastructure for decentralized applications, driving scalability, security, and operational eff...

Senior Site Reliability Engineer — Kubernetes, AWS & Observability

Thinkific • Toronto, ON, CA
Full-time

A leading e-learning provider in Canada is seeking a Senior Site Reliability Engineer to enhance and secure their infrastructure supporting online course creators. This role involves improving perfo...

Kubernetes Engineer

Rumble • Toronto, ON, CA
Full-time

Our mission is to restore the internet to its roots by making it free and open once again. This role will focus on our new CAPI/CAPO-based Kubernetes solution, which is designed to be compatible wit...

AWS DevOps Engineer at OceanMD

OceanMD • Toronto, ON, CA
Full-time

Advance your career with OceanMD as an AWS DevOps Engineer in a hybrid role. Based in Toronto, Vancouver, or Victoria, you will work on cloud infrastructure and CI/CD automation. This position involv...

Lead Engineer for AI Systems at Upwork

Upwork • Toronto, ON, CA
Full-time

Enhance data-driven decision-making as a Lead Engineer for AI Systems at Upwork in Toronto. Utilize advanced AI capabilities to drive business impact. In this role, you'll be part of the Data Platfor...