Talent.com
Senior Site Reliability Engineer
Entrust Corporation • Quebec, Canada
23 hours ago
Job type
  • Full-time
Job description

Join us at Entrust

At Entrust, we’re shaping the future of identity-centric security solutions. From our comprehensive portfolio of solutions to our flexible, global workplace, we empower careers, foster collaboration, and build solutions that help keep the world moving safely.

Get to Know Us

Headquartered in Minnesota, Entrust is an industry leader in identity-centric security solutions, serving over 150 countries with cutting-edge, scalable technologies. But our secret weapon? Our people. It’s the curiosity, dedication, and innovation that drive our success and help us anticipate the future.

Position Overview:

The Instant Financial Issuance as a Service (IFIaaS) Cloud Service includes a wide array of components including web services, application servers, and databases hosted in an on-prem environment. The Sr. Site Reliability Engineer (SRE) will be responsible for ensuring that the SaaS platform is reliable, available, and performant, as well as scalable, secure, and cost-effective. Ultimately, the individual will be responsible for the platform uptime, functional management of all the IFIaaS cloud environments, applications, networks, scoping projects, and the resolution of application and network issues.

How You Can Make an Impact:

The Instant Financial Issuance as a Service (IFIaaS) Cloud Platform spans multiple on‑prem environments. The Senior Site Reliability Engineer (SRE) will play a critical role in ensuring the platform’s reliability, scalability, security, and operational excellence across these geographically distributed environments. Given the asymmetric nature of our data centers, the SRE will design and operate systems that prioritize local HA while ensuring effective, tested, and compliant failover for DR scenarios. This role includes responsibility for platform uptime, environment management, network and application reliability, observability, automation maturity, compliance, and operational excellence.

Responsibilities:

  • Own SLOs/SLIs for availability (99.9%), latency, error rate, and quality of service across microservices.

  • Design/operate end‑to‑end observability: metrics, logs, traces, synthetic checks, real‑user monitoring (RUM).

  • Instrument services (Windows services, APIs, background jobs) with structured logs and trace context.

  • Build health probes and SLA monitors for critical transactions and cross-service dependencies.

  • Monitor system issues using metrics such as uptime, latency, error rate, throughput, and availability.

  • Deploy and maintain monitoring and on-call tools such as Splunk On-Call, Prometheus, and Datadog.

  • Lead incident response (triage, comms, coordination, real-time mitigation) and conduct blameless postmortems with actionable follow-ups.

  • Maintain and continuously improve runbooks, escalation paths, on-call rotations, and paging policies.

  • Implement MTTA/MTTR reduction programs.

  • Stand up war room protocols and ensure stakeholder updates during incidents.

  • Forecast compute, storage, and network needs; track headroom against growth and peak patterns.

  • Conduct performance profiling and bottleneck analyses (CPU, memory, I/O, thread pools, connection pools).

  • Optimize resource allocation on VMware (DRS, affinity rules, reservations) and Windows VM tuning (kernel, TCP stack, NICs).

  • Validate scaling strategies (horizontal vs. vertical) and implement auto-scaling where supported.

  • Standardize gold images, configuration baselines, and desired state for Windows Server (PowerShell DSC or equivalent).

  • Manage patching (OS, middleware, runtime) with maintenance windows aligned to error budgets.

  • Ensure backup, snapshot, and restore strategies meet RPO/RTO; regularly test restores.

  • Maintain secure baselines (CIS benchmarks for Windows/VMware), vulnerability management, and patch cadence.

  • Support compliance audits (PCI-CP, PCI-DSS, SOC 2/ISO 27001), produce evidence (configs, logs, access reviews), and remediate gaps.

  • Automate provisioning (VM templates, DSC/Ansible for Windows, Terraform for VMware) and configuration drift detection/correction.

  • Build runbooks to reduce toil (deploy, scale, rollback, etc.).

  • Create reliability guardrails (pre‑flight checks, change freeze rules, policy controls) as code.

  • Continuously refactor scripts/runbooks into idempotent automation.

  • Collaborate with development teams and other stakeholders to identify potential risks, such as security vulnerabilities, performance bottlenecks, deployment issues, or configuration errors.

  • Implement risk mitigation strategies, such as patching, backup, redundancy, encryption, or testing.

  • Collaborate with product teams and other teams to understand user needs, expectations, and satisfaction.

  • Coach engineers on SRE principles, incident handling, and reliability-centric design.

  • Lead knowledge sharing, runbook quality, and postmortem culture (blameless, action-oriented).

  • Provide after-hours support for production issues on a rotational basis with other team members to ensure system availability 24/7/365.
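
For readers less familiar with the terms, the 99.9% availability target and the error budgets mentioned in the list above come down to simple arithmetic. The sketch below is illustrative only: the 30-day window and the function names are assumptions made for this example, not Entrust's actual SLO definition or tooling.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window.

    The 30-day default window is an assumption for illustration.
    """
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo)


def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Minutes of error budget left after the downtime already incurred."""
    return error_budget_minutes(slo, window_days) - downtime_minutes


if __name__ == "__main__":
    # A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
    print(f"{error_budget_minutes(0.999):.1f} minutes of budget")
    # After a hypothetical 13.2-minute incident, about 30 minutes remain.
    print(f"{budget_remaining(0.999, 13.2):.1f} minutes remaining")
```

Under these assumptions, a maintenance window longer than the remaining budget would push the service below its availability target for that period.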

Basic Qualifications:

  • 5+ years of experience in SRE, DevOps, or Software Engineering roles supporting distributed, production-grade environments, with strong skills in troubleshooting microservices, Windows/VMware systems, and on‑prem hybrid infrastructure.

  • Hands‑on experience with automation and observability, including Terraform/Ansible/DSC, CI/CD pipelines, logs/metrics/tracing systems, and enterprise monitoring tools such as Datadog, Prometheus, or Splunk.

  • Demonstrated capability with infrastructure automation tools (Terraform, Ansible, Jenkins, Octopus, PowerShell DSC, etc.).

  • Proficiency in VMware, Windows Server administration, networking fundamentals, and system‑level performance analysis.

  • Hands‑on experience operating and troubleshooting enterprise microservices, APIs, and distributed application stacks in on‑prem/hybrid infrastructure.

  • Must have: Ability to provide after-hours production support on a rotational basis to ensure 24/7/365 system availability.

Preferred Qualifications:

  • Demonstrated integrity and accountability, including reliability, ownership of mistakes, and commitment to high operational standards across compliance-sensitive environments (PCI‑DSS, PCI‑CP, SOC 2).

  • High self‑confidence, strong presentation and communication abilities, and a history of leading through example, helping establish a culture of operational excellence and continuous improvement.

  • Leadership behaviors, including initiative, thoughtful risk‑taking, reflective decision‑making, and the ability to take action confidently amid uncertainty.

Where you will be: This hybrid role requires three in-office days per week in Minneapolis, Ottawa, Colorado, or Dallas, as outlined in the job description. Entrust operates with a distributed workforce.

About Entrust:

Entrust keeps the world moving safely by enabling trusted identities, payments and data protection around the globe. Today more than ever, people demand seamless, secure experiences, whether they’re crossing borders, making a purchase, or accessing corporate networks. With our unmatched breadth of digital security and credential issuance solutions, it’s no wonder the world’s most entrusted organizations trust us.

Entrust Corporation is an EOE/AA/Veteran/People with Disabilities employer.

NO AGENCIES, NO RELOCATION

At Entrust, we don’t just offer jobs – we offer career journeys. Here is what you can expect when you join our team:

  • Career Growth: Whether you’re a budding developer or a seasoned expert, we’re invested in your professional journey. With learning-forward initiatives and exciting challenges, your growth is our priority.

  • Flexibility: Life is all about balance. Whether you’re remote, hybrid, or on-site, we offer flexible options that fit your lifestyle.

  • Collaboration: Here, your voice matters. Our teams thrive on sharing ideas, brainstorming solutions, and working together to build a better tomorrow.

We believe in securing identities—but it doesn’t stop there. At Entrust, we’re passionate about valuing all identities. Our culture is built on diversity, inclusion, and respect. From unconscious bias training for our leaders to global affinity groups that connect colleagues across the globe, we’re creating a community where everyone is encouraged to be themselves.

Ready to Make an Impact?

If you’re excited by the prospect of innovating, growing your career, and collaborating in a dynamic environment, Entrust is the place for you. Join us in making a difference. Let’s build a more secure world—together.

Apply today!

Compensation Range:

In the US: The anticipated starting base pay for this position is: $129,098-$189,343 per year (in the primary posting location). Actual compensation will be determined based on geographic location, education, skills and experience. This position is also eligible for the company’s discretionary annual incentive plan. In addition to your pay, Entrust offers eligible colleagues and their dependents comprehensive health and well-being programs which include medical, vision, dental, a generous 401(k) matching contribution, life and disability insurance, mental health coaching, virtual fitness programs, paid personal time off plus 12 paid holidays, parental leave and education reimbursement. Please speak with the recruiter for more details. Note: Benefit and Compensation programs are subject to eligibility requirements and other terms of the applicable plan or program. Entrust has the right to end, suspend or amend any of its plans at any time in whole or in part.

In Canada: The pay range for this position is $120,500 - $170,500 per year. This position is also eligible for the company's discretionary annual incentive plan. Actual compensation will be determined based on education, skills and experience. In addition to your pay, Entrust offers eligible colleagues and their dependents comprehensive benefits, vacation, paid time off and paid holidays. Please speak with the recruiter for more details. Note: Benefit and Compensation programs are subject to eligibility requirements and other terms of the applicable plan or program. Entrust has the right to end, suspend or amend any of its plans at any time in whole or in part.

Entrust is an EEO/AA/Disabled/Veterans Employer

Entrust values diversity and inclusion and we are committed to building a diverse workforce with wide perspectives and innovative ideas. We welcome applications from qualified individuals of all backgrounds, and we strive to provide an accessible experience for candidates of all abilities.

Recruiter:

Grace Rusingiza