Talent.com
Site Reliability Engineer (SRE) – Observability
Astra North Infoteck Inc. • Toronto, ON, ca
29 days ago
Job type
  • Full-time
Job description
Job Description: Site Reliability Engineer (SRE) – Observability
Toronto - Hybrid (1-2 days office)

Role Summary
We are looking for an Observability Engineer to help implement, operate, and improve observability capabilities across our applications and platforms. This role focuses on hands-on onboarding, instrumentation, dashboarding, and alerting, working under established standards and guidance from senior engineers.

You will collaborate with application, SRE, and operations teams to ensure systems are observable, supportable, and production-ready.

Key Responsibilities
Observability Implementation
• Implement and maintain metrics, logs, and traces for applications and infrastructure
• Assist with onboarding applications into observability platforms (e.g., Dynatrace, ELK, Datadog)
• Configure dashboards, alerts, and basic anomaly detection

Application Support & Instrumentation
• Work with development teams to enable structured logging, basic distributed tracing, and core metrics
• Validate observability requirements during Production Readiness Reviews (PRR)
• Troubleshoot missing or low-quality telemetry

Monitoring & Alerting
• Configure alerts based on golden signals (latency, errors, traffic, saturation)
• Help reduce alert noise by tuning thresholds and alert logic
• Support incident response by gathering logs, metrics, and traces

Operations & Reliability
• Support root cause analysis using observability tools
• Maintain dashboards and documentation used by on-call and support teams
• Participate in on-call rotations (as applicable)

Automation & Continuous Improvement
• Assist in automating observability onboarding and validation tasks
• Create and maintain reusable dashboards and alert templates
• Follow established observability standards and best practices

Required Qualifications
• 2–4 years of experience in observability or SRE
• Working knowledge of metrics, logs, and basic tracing concepts
• Hands-on experience with at least one observability platform (Dynatrace, Elastic/ELK, Datadog, New Relic, etc.)
• Basic understanding of SLIs/SLOs and service health indicators
• Experience with cloud platforms or hybrid environments
• Ability to write scripts (Python, Bash, PowerShell) for automation and troubleshooting


Preferred Qualifications
• Experience with OpenTelemetry or APM agents
• Familiarity with Kubernetes or containerized workloads
• Experience working with incident management tools (PagerDuty, ServiceNow)
• Exposure to Dynatrace/Kibana ELK or similar cloud-native monitoring
• Experience in regulated or enterprise environments
