Talent.com
Site Reliability Engineer
HCLTech • Toronto, Canada
24 days ago
Job type
  • Full-time
Job description

Join our SRE squad supporting ~1000 AWS-hosted services for BMO. You’ll own operational reliability, rapid triage, and proactive maintenance across production and non-prod, partnering closely with Cloud Engineering, SOC, and application teams.

Key Responsibilities

Deliver 24×7 monitoring, incident response, and problem management; drive MTTA / MTTR reduction and SLO / SLI adherence.

Perform preventive health checks; analyze ticket trends to implement continual service improvements and automation to reduce toil.

Execute blameless postmortems and high-quality RCA; maintain SOPs / runbooks and reliability dashboards.

Configure / tune observability (Dynatrace, CloudWatch, ELK); enable self-healing workflows and workload optimizations.

Support change / service requests within agreed SLAs; collaborate during transitions and onboard new AWS services.

Core Skills & Tools

AWS: Lambda, ECS / Fargate / EC2, API Gateway, SNS / SQS, Kinesis, RDS; IAM / KMS foundations.

Observability & ITSM: Dynatrace, CloudWatch, ELK; ServiceNow for incidents / changes; SLI / SLO dashboards.

Reliability Practices: Error budgets, capacity / performance benchmarking, automation / runbook execution, FinOps awareness.

Qualifications

5+ years in SRE / DevOps or L2 operations for cloud-native stacks; strong AWS production experience.

Proven incident / change / problem management in 24×7 environments; adept at RCA and postmortems.

Hands‑on with observability tooling and operational automation; excellent collaboration and documentation skills.

Shift Coverage & Locations

Follow-the-sun model with overlapping handoffs between Canada and India to ensure continuous support. Success is measured by uptime, MTTR / MTTD, change failure rate, error‑budget consumption, SLO adherence, RCA quality, and continual service improvement (CSI) throughput.
