Principal Site Reliability Engineering Specialist (SRE)

CGI • Vancouver, Canada

16 days ago

Job type

Full-time

Job description

Position Description:

Location: Edmonton

Open to other locations within proximity to a CGI Office

Hybrid work model

We are hiring a Senior Site Reliability Engineer (SRE) with a strong foundation in building and operating reliable, scalable, and resilient cloud platforms. You bring a reliability and performance engineering mindset to everything you do, balancing operational stability with modernization and automation. In this role, you will apply core SRE practices, including SLIs/SLOs, observability, incident management, and operational automation, while temporarily supporting a regional support strategy engagement focused on assessing and strengthening large-scale operational environments. You will work closely with platform, operations, and architecture teams to evaluate current-state practices, identify reliability and support gaps, and contribute to the definition of future-state operating models and implementation roadmaps.

Beyond this engagement, the role is designed for ongoing, hands-on SRE delivery, where you will lead and implement monitoring, reliability engineering, automation, and tooling across cloud and hybrid environments. You will collaborate with cross-functional teams to design, build, and continuously improve platform reliability, engineering standards, and operational excellence practices for mission-critical services.

This position places you in a client-facing, high-impact environment, where your technical depth, operational judgment, and ability to translate reliability principles into practical outcomes will directly influence service stability, modernization efforts, and future cloud initiatives. If you are a proven SRE who thrives in complex environments and values both hands-on engineering and operational leadership, this role offers the opportunity to make a meaningful and lasting impact.

Your future duties and responsibilities:

Who are you?

You are a senior Site Reliability Engineer who thrives on solving complex reliability and operational challenges at scale. You are curious, collaborative, and continuously focused on improving how platforms, infrastructure, and services are operated and supported. Your strength lies in applying sound engineering judgment to real-world operational problems, balancing reliability, performance, and maintainability. You are equally comfortable working hands-on with tools and systems and stepping back to assess how operational practices, support models, and workflows impact service reliability. You can engage confidently in technical discussions with engineers while also communicating clearly with operational leaders and stakeholders to explain risks, trade-offs, and improvement opportunities.

With a mindset grounded in continuous improvement and learning, you champion modernization, automation, and pragmatic reliability practices. You are trusted for your ability to identify root causes rather than symptoms, to raise concerns early, and to translate reliability principles into practical, actionable outcomes. Your peers value your technical depth and calm leadership in complex environments, and teams rely on you to elevate operational maturity and execution quality. At CGI, we recognize strong SRE practitioners and provide the environment and support for them to grow, contribute, and make a meaningful impact across engagements.

Responsibilities

Develop, operate, and evolve monitoring, logging, and alerting capabilities across cloud and hybrid environments, while temporarily contributing SRE expertise to assess and rationalize existing operational monitoring practices as part of a regional support strategy initiative.
Define, implement, and continuously improve SLIs, SLOs, and SLAs for platform and service reliability, applying these principles during the engagement to evaluate current-state service outcomes and inform future-state reliability targets.
Lead and participate in incident response, problem investigation, and root cause analysis, leveraging hands-on SRE experience to identify systemic reliability issues and recurring operational failure patterns observed across regional support operations.
Design and automate reliability and operational processes, including integration with CI/CD pipelines and operational workflows, while contributing insights into where automation and tooling can reduce manual effort and improve support consistency across regions.
Collaborate closely with DevOps, platform engineering, architecture, and application teams, providing SRE leadership during this engagement and transitioning seamlessly to tool- and platform-heavy delivery roles on future projects.
Analyze and document current operational workflows, support models, and escalation paths, translating frontline operational insights into actionable reliability and service improvement recommendations.
Contribute to the definition of future-state operating models and implementation roadmaps by applying SRE and operational excellence principles to improve reliability, supportability, and scalability.
Provide regular status updates and risk assessments, highlighting operational risks, dependencies, and reliability impacts to support informed decision-making.

Required qualifications to be successful in this role:

5+ years of experience in Site Reliability Engineering, platform engineering, or infrastructure operations, with demonstrated ability to apply reliability principles across both delivery and operational contexts.

Strong proficiency with observability and monitoring platforms such as Grafana, Prometheus, ELK, New Relic, or equivalent, with the ability to assess, design, and improve monitoring strategies in complex environments.

Hands-on experience operating cloud platforms (Azure, AWS, and/or GCP), including production support, reliability engineering, and operational troubleshooting.

Strong automation and scripting skills using tools such as Python, Bash, Ansible, or equivalent, with a mindset focused on reducing toil and improving operational efficiency.

Excellent communication skills in English (French considered an asset), with the ability to clearly articulate technical concepts to both technical and non-technical stakeholders.

Proven track record of improving system reliability, availability, and operational stability, including measurable reductions in incident frequency or impact.

Experience analyzing and documenting operational workflows, support models, and escalation paths within IT or platform operations environments.

Ability to facilitate technical and operational workshops with engineers, operations teams, and service stakeholders to validate findings and align on improvements.

Working knowledge of ITSM/ITIL practices (Incident, Problem, Change), particularly as they relate to reliability, supportability, and operational maturity.

Experience working in regulated, enterprise, or public-sector environments where documentation quality, security classification, and auditability are required.

CGI is providing a reasonable estimate of the pay range for this role. The determination of this range includes factors such as skill set level, geographic market, experience and training, and licenses and certifications. Compensation decisions depend on the facts and circumstances of each case. A reasonable estimate of the current range is $90,–$,. This role is a future opportunity.

Use of the term ‘engineering’ in this job posting refers to the technical sense related to Information Technology (IT) and does not imply that the individual practices engineering or possesses the requisite license as prescribed by the applicable provincial or territorial engineering regulator. We are seeking individuals with expertise in IT engineering-related functions, but licensure from an engineering regulator is not a prerequisite for this position. Engineering is a regulated profession in Canada which is restricted in terms of use of titles and designation.

Skills:

Finance&Ops Apps Solution Arch
