Talent.com
Astra North Infoteck Inc.
Site Reliability Engineer – APM, Dynatrace, Observability
Astra North Infoteck Inc. • Toronto, ON, CA
2 days ago
Job type
  • Full-time
Job description
Site Reliability Engineer – APM, Dynatrace, Observability

Duration: 12 months

Location: Toronto

Hybrid: 2 days in office a week

SRE Lead
Deep application and system-level knowledge across complex end-to-end environments, including tightly integrated on-prem and cloud-native services, supporting large-scale, multi-tier transaction flows
Prior hands-on experience with APM and observability platforms, including Dynatrace or comparable enterprise observability tools, with the ability to instrument, analyze, and troubleshoot complex distributed applications
Proven deep troubleshooting experience resolving issues across multilayer, end-to-end (E2E) environments, spanning application, infrastructure, network, and platform layers across on-prem and cloud services
The person is to drive and execute the SRE WCCS roadmap
Hands-on role from day 1
Observability experience expectations: see the SRE Observability SME description below
Deep knowledge and experience in implementing SRE practices and guiding complex SRE implementations across the industry
Would provide:
o Assessments of current capability that help identify gaps and contribute to the SRE WCCS roadmap
o Ability to navigate multi-team SRE and IT Ops environments to drive results
o Creative workarounds and solutions
SRE Observability SME
Hands-on role from day 1
Day 1 Dynatrace expertise, i.e.:
o DQL
o Gen3 dashboards
o Traces on Grail
o ActiveGate plugins
o SRG Workflow development
o Biz Events
Prior hands-on experience with APM and observability platforms, including Dynatrace or comparable enterprise observability tools, with the ability to instrument, analyze, and troubleshoot complex distributed applications
Deep troubleshooting expertise leveraging observability signals (metrics, events, logs, and traces) to identify root causes and resolve failures across multilayer E2E environments
Deep background in observability fundamentals: MELT (metrics, events, logs, traces)
Expert-level dashboard skills (including related UI/UX design)
Experienced in troubleshooting performance and other non-functional issues
Familiar with SRE concepts as outlined in the Google SRE book, workbook, etc.
Expertise in AWS observability: CloudWatch (CW), Application Signals, metrics, logs, traces, Lambda, and API Gateway
Able to come up with creative ways to monitor and observe systems such as IBM DataPower where sufficient observability isn't present
Development with Python, AWS Lambda, ECS, Azure Functions
Understands the fundamentals of how AI-based systems are built and monitored
Background in or knowledge of OpenTelemetry (OTel)
Experience in Financial Services or an equivalent domain, i.e. very complex end-to-end transactions, e.g. 50 systems working together to fulfil one customer request
Platform Engineering experience
Shipping platform capabilities (e.g., self-service onboarding pipeline, policy-as-code, golden signals-as-code, standardized instrumentation libraries).
Depth of knowledge for the role
Programming depth: strong programming skills in Python and Node.js, including building backend integration components.
Looking for
Practical observability experience with multi-system integration
In-depth observability knowledge


Requirements
60-70