Talent.com
Site Reliability Engineer

E-IT • Ottawa, CA
1 day ago
Job type
  • Full-time
Job description
Key Responsibilities:
- Incident Management and Reliability: Lead the incident management process, ensuring high availability and performance of the applications. Develop and implement SRE practices to improve system reliability and resilience.
- Monitoring and Observability: Utilize Dynatrace, Splunk, and Grafana to monitor system health, detect anomalies, and provide actionable insights for performance optimization.
- Root Cause Analysis: Conduct thorough root cause analysis of incidents and outages, developing long-term solutions to prevent recurrence.
- DevOps Practices: Collaborate with development and operations teams to streamline CI/CD pipelines, automate workflows, and implement infrastructure as code (IaC) for efficient service deployment and management.
- Networking Expertise: Provide expertise in networking technologies (Cisco, Arista, AVI, etc.), ensuring robust network infrastructure design, implementation, and troubleshooting. Utilize tools like Wireshark for in-depth network analysis and debugging.
- Collaboration and Leadership: Work closely with cross-functional teams to share knowledge, mentor junior engineers, and lead by example in adopting best practices in SRE, DevOps, and networking.
- Innovation and Continuous Improvement: Stay abreast of industry trends and new technologies, advocating for and implementing innovative solutions to enhance system reliability and performance.

Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Information Technology, or related field.
- 10+ years of experience in an SRE/DevOps role, with a proven track record in managing high-availability systems.
- Strong expertise in monitoring and observability tools (Dynatrace, Splunk, Grafana).
- Proficient in network debugging and analysis tools, including Wireshark.
- Solid understanding of on-prem and hybrid cloud infrastructure (VMware, Linux, Windows, Azure) and container orchestration (Kubernetes, Docker).
- Certifications in relevant technologies (Dynatrace, Splunk) are a plus.
- Excellent communication and leadership skills, capable of leading incident response initiatives and collaborating effectively across teams.
- Excellent problem-solving skills, with the ability to conduct comprehensive root cause analysis and troubleshooting.

Similar jobs

Site Reliability Engineer

E-IT • Ottawa, ON, CA
Full-time

Incident Management and Reliability: Lead the incident management process, ensuring high availability and performance of the applications. Develop and implement SRE practices to improve system relia...

Site Reliability Engineer

Vertex Elite LLC • Ottawa, ON, CA
Full-time

Monitoring / Observability tools - Dynatrace, ELK etc. Platform / cloud Observability - OpenShift, Prometheus / Azure Cloud etc. Collaborate with various Infrastructure, Applications, platforms, and c...

Staff Site Reliability Engineer

Thinkific • Ottawa, ON, CA
Full-time

Are you an experienced Site Reliability Engineer looking for a new challenge? We’re looking for a Staff Site Reliability Engineer (SRE). As a Staff Site Reliability E...

Senior Site Reliability Engineer Focused on Kubernetes Infrastructure

Chainlink Labs • Ottawa, ON, CA
Full-time

Elevate decentralized architecture as a Senior Site Reliability Engineer. Spearhead Kubernetes-based infrastructure for decentralized applications, driving scalability, security, and operational eff...

Senior Site Reliability Engineer - Remote

ClickHouse • Ottawa, ON, CA
Remote
Full-time

Recognized on the 2025 Forbes Cloud 100 list, ClickHouse is one of the most innovative and fast-growing private cloud companies. With more than 3,000 custome...

Site Reliability Engineer

Qlik • Ottawa, ON, CA
Full-time

A Gartner® Magic Quadrant™ Leader for 15 years in a row, Qlik transforms complex data landscapes into actionable insights, driving strategic business outcomes. Serving over 40,000 global customers, ...

Sr. Site Reliability Engineer I

Axon Enterprise • Ottawa, ON, CA
Full-time

At Axon, we’re on a mission to Protect Life. We’re explorers, pursuing society’s most critical safety and justice issues with our ecosystem of devices and cloud software. Like our products, we work b...

Experienced Site Reliability Engineer - Remote

Tech Insights • Ottawa, ON, CA
Remote
Full-time

TechInsights seeks a Senior Site Reliability Engineer to enhance AI operations from anywhere in Canada. Oversee reliability strategies, manage error budgets, and collaborate closely with engineering...

Cloud-Focused Site Reliability Engineer Driving Automation and Reliability

Dayforce US, Inc. • Ottawa, ON, CA
Full-time

Play a vital role as a Site Reliability Engineer, enhancing cloud systems' automation and reliability. Collaborate with teams and build strong relationships while working remotely in a dynamic envir...

Senior Site Reliability Engineer

Thinkific • Ottawa, ON, CA
Full-time

Are you an experienced Site Reliability Engineer looking for a new challenge? Senior Site Reliabil...

Site Reliability Engineer

TELUS Digital • Ottawa, ON, CA
Full-time

Welcome to TELUS Digital, where innovation drives impact at a global scale. As an award-winning digital product consultancy and the digital division of TELUS, one of Canada’s largest telecommunicat...

Senior Site Reliability Engineer

Entrust Corporation • Ottawa, ON, CA
Full-time

At Entrust, we’re shaping the future of identity-centric security solutions. From our comprehensive portfolio of solutions to our flexible, global workplace, we empower careers, foster collaboration...

Lead Site Reliability Engineer in Cloudbeds

Cloudbeds • Ottawa, ON, CA
Full-time

Step into a pivotal role as a Lead Site Reliability Engineer with Cloudbeds, transforming the hospitality industry with cutting-edge technology. Enjoy a fully remote working environment. This role re...

Remote Site Reliability Engineer — Build Reliable, Automated Systems

Dayforce US, Inc. • Ottawa, ON, CA
Remote
Full-time

A global human capital management company is seeking a Site Reliability Engineer to bridge software engineering and operations. This role involves building reliable systems through automation and pr...

Senior Site Reliability Engineer

Vantage • Ottawa, ON, CA
Full-time

Do you enjoy keeping systems reliable, performant, and scalable while continuing to grow your technical depth? As a Senior Site Reliability Engineer (SRE) / DevOps Engineer at Vantage, you’ll contr...

Site Reliability Engineer Ottawa, ON, CA + 2 more

Qlik • Ottawa, ON, CA
Full-time

Hybrid. A Gartner® Magic Quadrant™ Leader for 15 years in a row, Qlik transforms complex data landscapes into actionable insights, driving strategic...

Site Reliability Engineer

Tecsys Inc. • Ottawa, ON, CA
Permanent

Having recognized the advantages of remote work, including employee morale, productivity, reduced commuting on employee wellbeing and the environment, we are proud to be a digital-first company. The...

Senior Site Reliability Engineer I

Instacart • Ottawa, ON, CA
Permanent

Join our team as a Senior Site Reliability Engineer II, where your expertise will play a crucial role in maintaining the backbone of our platform's operations. You'll take on challenges directly, en...

Sr. Site Reliability Engineer I

Axon • Ottawa, ON, CA
Full-time

Join Axon and be a Force for Good. At Axon, we’re on a mission to Protect Life. We’re explorers, pursuing society’s most critical safety and justice issues with our ecosystem of devices and cloud sof...

Site Reliability Engineer with Automation Focus

Yelp • Ottawa, ON, CA
Full-time

Join a collaborative, remote SRE team dedicated to ensuring service reliability. In this role, leverage your expertise in automation and systems management to support a platform serving millions. You...