Talent.com
Tech Talent International
Senior Site Reliability Engineer (SRE) – Automation & Observability

Tech Talent International • Montreal, QC, Canada
28 days ago
Job type
  • Full-time
  • Permanent
Job description

Tech Talent International (SI) supplies technical talent to a variety of clients ranging from Fortune 100/500/1000 companies to small and mid-sized organizations in Canada/US and Europe.

We currently have an opening for a Senior Site Reliability Engineer (SRE) – Automation & Observability with our large consulting client, working onsite at a major financial services client in downtown Montreal.

Role: Cybersecurity - Senior Site Reliability Engineer (SRE) – Automation & Observability

Type: Permanent or Contract 40 hrs/week

Location: Hybrid - Downtown Montreal, QC (the role starts with 5 days in office for the first 3 months, then moves to a hybrid setup: 3 days onsite, 2 days from home)

Salary: $110,000 - $120,000 + 9% bonus + 3-5 weeks paid vacation + RRSP contribution + benefits + sick/personal days

Position Overview

The Automation team consists of several Subject Matter Experts (SMEs) who assist the Global Process Owner in designing, building, and maintaining the organization's IT services. While leading the company's IT services team, the IT Service Manager strives to develop reliable IT services and improve the organization's existing IT service infrastructure.

IT Service Managers are responsible for maintaining a high standard of service delivery while managing the organization's IT services and anticipating and resolving issues that may arise within company systems or client environments. These services include infrastructure monitoring, task automation, server asset management, and network inventory management.

Change, incident, problem, and request management, along with CMDB (Configuration Management Database) functions, are core services widely used throughout CIB IT. The ITSM team serves as the bridge between IT and business stakeholders, ensuring coordination and predictability for CIB IT and its business operations.

The team includes SMEs focused on key service areas as directed by management, with the objective of delivering high-quality services through various platforms that maximize efficiency and produce consistent results.

Within the Automation & Observability organization, the Production Smart Automation team provides production support services for the Analytics Consulting and Digital Assets IT clusters. This includes both functional and technical support as well as project delivery for production and non-production platforms. The team operates globally and consists of approximately 10 members located in Paris, Warsaw, Mumbai, and Montreal.

Key Responsibilities

The Site Reliability Engineer (SRE) will be part of a multidisciplinary team providing Level 1 and Level 2 technical and project support. This is a production-focused role requiring a broad range of technical expertise.

The SRE will work closely with development and infrastructure teams to:

  • Monitor, manage, and proactively improve the availability and performance of production environments, from presentation and application layers through infrastructure layers.
  • Plan and implement application deployments, load testing activities, and configuration changes.
  • Ensure production environments are operational and available while collaborating with teams to understand user needs.
  • Contribute to medium- and large-scale technical projects, including architecture reviews, solution design, application upgrades, and migrations to new platforms.
  • Collaborate on prioritized tasks while providing regular status updates and maintaining focus on target solutions.
  • Understand delivery lifecycle phases to ensure work is completed according to defined specifications and timelines.
  • Identify opportunities to improve operational efficiency and contribute to automation initiatives.
  • Provide constructive feedback and recommendations to management regarding performance, capacity, and system design.
  • Assist in documenting architectures and designs, as well as distributing meeting minutes and action items.

The SRE will also work with other teams to respond to incidents and resolve issues quickly, often under pressure, in order to restore normal business services. As a result, participation in on-call rotations and after-hours support may be required.

Candidates should possess both the aptitude and desire to learn new technologies and contribute innovative ideas that may benefit the department.

Requirements

Candidates should have:

  • 5–7 years of experience in a similar role.
  • Experience providing multidisciplinary technical support within a team environment.
  • Practical knowledge of performance and capacity management across:
    • Applications
    • Databases
    • Networks
  • Strong automation skills and mindset.

Skills & Competencies

Systems Administration

  • Strong Linux/Unix administration skills
  • Good knowledge of Windows environments

Containerization & Cloud

  • Strong knowledge of Docker and Kubernetes
  • Understanding of cloud-based platforms and solutions

Infrastructure & Networking

  • Good understanding of enterprise infrastructure, firewalls, and networking concepts
  • Knowledge of load-balancing technologies
  • Strong understanding of networking fundamentals

Security

  • Experience with APIs
  • Familiarity with CyberArk or HashiCorp Vault

Databases

  • Experience with SQL Server
  • Experience with Oracle
  • Exposure to NoSQL databases

Monitoring & Observability

  • Experience configuring application monitoring tools such as Dynatrace

DevOps & CI/CD

Experience with:

  • Jenkins
  • Bitbucket
  • Artifactory
  • Ansible
  • ArgoCD

Development & Automation

  • Knowledge of software development and scripting methodologies
  • Demonstrated programming ability in languages such as Python

IT Service Management

  • Good understanding of ITIL processes
  • Understanding of user and server authentication mechanisms that enable automated deployment cycles while maintaining strong security controls

Personal Attributes

  • Strong problem-solving abilities
  • Team-oriented mindset
  • Customer-focused approach
