Talent.com
No longer accepting applications
Senior Site Reliability Engineer
Cloudbeds • Ahuntsic North, ca
12 days ago
Job type
  • Full-time
Job description
At Cloudbeds, we transform hospitality with a platform that powers properties in 150 countries and processes billions in bookings annually. Our remote, global team builds AI‑powered solutions for hotels, ranging from independent properties to large groups, integrating with hundreds of partners.

Responsibilities

Design and implement reliable and scalable AWS architecture to meet organizational needs.

Maintain and support highly loaded Kubernetes (EKS) clusters and infrastructure‑related components.

Support the CI/CD process with ArgoCD and GitOps.

Automate platform deployments using Terraform infrastructure‑as‑code.

Develop and continuously improve product observability and monitoring systems based on Grafana, Prometheus, DataDog, and CloudWatch.

Respond to and participate in incident management and root cause analysis, minimizing service impact.

Optimize system performance and troubleshoot issues as they arise.

Collaborate with development teams to establish monitoring best practices and ensure systems meet reliability targets.

Collaborate with security teams to implement and maintain security best practices.

Provide infrastructure support rotation guidance to other engineering teams.

Qualifications

5+ years as a DevOps engineer or SRE within the AWS ecosystem.

5+ years working with Kubernetes (EKS) and Helm charts.

Experience designing, building, and supporting CI/CD pipelines with ArgoCD and GitHub Actions.

Proficiency with infrastructure‑as‑code methodologies using Terraform.

Skilled in observability and monitoring with Grafana, Prometheus, DataDog, and CloudWatch.

Experience with incident management, full‑stack troubleshooting, performance analysis, and root cause analysis (RCA).

Knowledge of web application systems such as Nginx, ingress controllers, load balancing, and CDN.

Experience with databases (MySQL, PostgreSQL, Aurora) and middleware (Redis, Memcached, SQS).

Strong networking skills with VPC, Security Groups, and Network ACLs.

Ability to work remotely and manage your own time in a global team.

Good written and verbal communication in English.

Bachelor’s degree in Computer Science or equivalent experience.

Bonus Skills

Advanced database administration experience (Aurora, MySQL, PostgreSQL).

Experience working in a PCI‑compliant environment.

Experience with Kong API Gateway.

Benefits

Remote First, Remote Always.

PTO in accordance with local labor requirements.

Monthly Wellness Fridays: an extra-long weekend each month.

Fully paid parental leave.

Home office stipend based on country of residency.

Professional development courses through Cloudbeds University.

Access to manager training, upskilling, and knowledge transfer.

Inclusion

Cloudbeds is a proud Equal Opportunity Employer that celebrates diversity. We do not discriminate based on race, religion, color, national origin, gender (including pregnancy or related medical conditions), sexual orientation, gender identity, gender expression, age, veteran status, disability, or any other legally protected characteristic. We provide reasonable accommodations, including an American Sign Language interpreter, for applicants with disabilities.
