Talent.com
Site Reliability Engineer – APM, Dynatrace, Observability

Astra North Infoteck Inc. • Toronto, ON, CA
9 days ago
Job type
  • Full-time
Job description
Duration: 12 months

Location: Toronto

Hybrid: 2 days in office a week

SRE Lead
Deep application and system-level knowledge across complex end-to-end environments, including tightly integrated on-prem and cloud-native services, supporting large-scale, multi-tier transaction flows
Prior hands-on experience with APM and observability platforms, including Dynatrace or comparable enterprise observability tools, with the ability to instrument, analyze, and troubleshoot complex distributed applications
Proven deep troubleshooting experience resolving issues across multilayer, end-to-end (E2E) environments, spanning application, infrastructure, network, and platform layers across on-prem and cloud services
Drives and executes the SRE WCCS roadmap
Hands-on role from day 1
Observability experience expectations: see the SRE Observability SME description below
Deep knowledge and experience in implementing SRE practices and guiding complex SRE implementations across the industry
Would provide:
o Assessments of current capability that help identify gaps and contribute to the SRE WCCS roadmap
o Ability to navigate multi-team SRE and IT Ops environments to drive results
o Creative workarounds and solutions
SRE Observability SME
Hands-on role from day 1
Day 1 Dynatrace expertise, i.e.:
o DQL
o Gen3 dashboards
o Traces on Grail
o ActiveGate plugins
o SRG Workflow development
o Biz Events
Prior hands-on experience with APM and observability platforms, including Dynatrace or comparable enterprise observability tools, with the ability to instrument, analyze, and troubleshoot complex distributed applications
Deep troubleshooting expertise leveraging observability signals (metrics, events, logs, and traces) to identify root causes and resolve failures across multilayer E2E environments
Deep background in observability fundamentals (MELT: metrics, events, logs, traces)
Expert-level dashboard design (including related UI/UX design)
Experienced in troubleshooting performance and other non-functional issues
Familiar with SRE concepts as outlined in the Google SRE book, workbook, etc.
Expertise in AWS observability: CloudWatch (CW), Application Signals, metrics, logs, traces, Lambda, API Gateway
Able to come up with creative ways to monitor and observe systems such as IBM DataPower where sufficient observability isn't present
Development with Python, AWS Lambda, ECS, Azure Functions
Understands the fundamentals of how AI-based systems are built and monitored
Background in or knowledge of OpenTelemetry (OTel)
Experienced in financial services or an equivalent domain, i.e. very complex end-to-end transactions (e.g. 50 systems working together to fulfil one customer request)
Platform Engineering experience
Shipping platform capabilities (e.g., self-service onboarding pipeline, policy-as-code, golden signals-as-code, standardized instrumentation libraries).
Depth of knowledge for the role
Programming depth: requires strong programming skills in Python and Node.js and experience building backend integration components.
Looking for
Practical observability experience with multi-system integration
In-depth Observability


Requirements
60-70