Talent.com

Reliability engineer Jobs in Vancouver, BC

Last updated: 3 days ago

Site Reliability Engineer (SRE) – Azure AKS, Observability & Terraform

Astra North Infoteck Inc.
Vancouver, BC, CA
Full-time

Site Reliability Engineer (SRE) – Azure AKS, Observability & Terraform. Observability, SRE, DevOps roles with expertise in infrastructure and application reliability. Dynatrace, ELK, Splunk, Page...

Construction Engineer

Bouygues
Vancouver, BC, CA
CA$100,000.00 yearly
Full-time

Regarding Hiring: This role offers the potential for employment with any of the joint venture companies involved in the project. Hiring decisions are based on equal opportunities across each team. Ab...

Senior Site Reliability Engineer

Semios
Vancouver, British Columbia, Canada
CA$120,000.00 yearly
Full-time

Founded in 2010, Semios Group is a leading agricultural technology company helping growers, agronomists, and ag retailers manage over 200 million acres across five countries. Semios pioneered variab...

Engineer - E3

TEEMA
Burnaby, British Columbia
CA$63.00 hourly
Temporary

The E3 – Engineer is a fully qualified and accredited engineering professional who applies theoretical and practical knowledge to provide engineering design, estimating, planning, and quality manag...

Electrical Engineer

Saltworks Technologies
Richmond, BC, CA
CA$120,000.00 yearly
Full-time

Saltworks Technologies is a leading Canadian cleantech company solving two of the world's most pressing industrial challenges: water sustainability and critical minerals supply. We design and build ...

DevOps Engineer

BALLY’S INTRALOT SA
Vancouver, BC, CA
Full-time

Join INTRALOT as a DevOps Engineer. At INTRALOT, we shape the future of gaming through innovation and technology. Our global presence, diverse culture, and dynamic teams set the foundation for a peop...

Process Engineer

TWD Technologies Ltd.
Burnaby, BC, CA
CA$115,000.00 yearly
Full-time

Process Engineer. At TWD, we know the secret to success isn’t just technical skill, it’s the people behind it. That’s why we hire for attitude first. If you’re collaborative, ...

Electrical Engineer

Vard Marine Inc.
Vancouver, BC, CA
Full-time

Who we are? Vard Marine is the market leader in designing specialized ships to meet diverse mission requirements in harsh operating environments – vessels such as polar research ships...

Data Engineer

DarkVision
North Vancouver, British Columbia
CA$70,000.00 yearly
Full-time

DarkVision is seeking a Data Engineer to join our Imaging & AI team. You will ensure the accuracy and reliability of our reporting pipelines by owning pipeline validation, data transformation logic ...

Senior Site Reliability Engineer

The Semios Group
Vancouver, BC, CA
CA$120,000.00 yearly
Full-time

Founded in 2010, Semios Group is a leading agricultural technology company helping growers, agronomists, and ag retailers manage over 200 million acres across five countries. Semios pioneered variab...

Electrical Engineer

Hatch
Vancouver, BC, CA
Full-time

Join a company that is passionately committed to the pursuit of a better world through positive change. With more than 70 years of business and technical expertise in. With practical solutions that a...

QA Engineer

veritree
Vancouver, BC, CA
CA$80,000.00 yearly
Full-time

Launched in 2021, our technology measures and verifies the impact of global restoration efforts from the ground up. We are on a mission to plant 1 billion verified trees by 2030, collaborating with ...

AI Security Control Developer/Site Reliability Engineer (Global Security)

Royal Bank of Canada
Vancouver, Canada
Full-time

As an AI Security Control Developer/Site Reliability Engineer (Global Security) on the AI Security team, you will build, operate, and continuously validate the security controls that protect RBC's ...

Site Reliability Engineer (SRE) – Azure AKS, Observability & Terraform

Astra North Infoteck Inc.
Vancouver, BC, CA
3 days ago
Job type
  • Full-time
Job description
Remote Role

Key Responsibilities

  • Observability, SRE, DevOps roles with expertise in infrastructure and application reliability
  • Dynatrace, ELK, Splunk, PagerDuty
  • SLI/SLO frameworks
  • Azure Kubernetes Service (AKS), Terraform, Azure managed services

What will you do?

  • Design and implement observability-as-code solutions using Terraform for monitoring pipelines, dashboards, and alerting across distributed systems
  • Drive observability improvements using Dynatrace, ELK, Splunk, and PagerDuty for real-time performance insights and system visibility
  • Instrument applications for end-to-end observability including distributed tracing, metrics collection, and log aggregation across Node.js and .NET microservices and event-driven architectures
  • Troubleshoot complex production incidents across service layers, databases, caches, and APIs using SLI/SLO frameworks
  • Investigate and resolve Azure Kubernetes Service (AKS) infrastructure issues, ensuring the reliability and scalability of containerized workloads, using Terraform and Azure services (SQL MI, Redis, Functions, Event Grid)
  • Translate business requirements into observable, resilient systems aligned to SLIs/SLOs
  • Automate operational tasks using Infrastructure-as-Code and CI/CD to reduce toil and improve resilience
  • Lead incident response and remediation for critical systems, including blameless postmortems and chaos engineering practices
  • Collaborate with development, platform, and business teams to improve availability, scalability, and operational excellence

What do you need to succeed?

Must-have

  • 8+ years of experience in SRE, DevOps, or Observability roles focused on infrastructure and application reliability
  • Strong expertise in Dynatrace, ELK, Splunk, and PagerDuty, and in observability principles (instrumentation, correlation IDs, SLIs/SLOs)
  • Advanced proficiency in Azure Kubernetes Service (AKS), Terraform, and Azure managed services (SQL MI, Redis, Functions, Event Grid)
  • Hands-on experience with observability instrumentation (distributed tracing, metrics, logs) across Node.js and .NET microservices and event-driven systems
  • Strong troubleshooting skills across distributed systems (services, databases, caches, APIs) in production environments
  • Incident management expertise using PagerDuty and ServiceNow, including high-severity incident resolution and RCA
  • Knowledge of incident, problem, and change management, SRE principles, blameless postmortems, and chaos engineering
  • Strong communication and leadership skills for cross-functional coordination and incident handling