Talent.com
Platform Reliability Engineer
J&M Group • Toronto, Canada
28 days ago
Job type
  • Full-time
Job description

Technical Skills

Cloud Platforms: Azure, AWS

Infrastructure as Code (IaC): Terraform, ARM templates, CloudFormation

Scripting Languages: Python, PowerShell, Bash

Monitoring & Observability: Azure Monitor, Log Analytics, Prometheus

CI/CD Tools: Azure DevOps, GitHub Actions

Platform Services: Compute, Storage, Networking, Data Plane Infrastructure

Security & Compliance: Access control models, cloud security practices

Platform Governance: Unity Catalog (nice to have)

Operational Excellence: SRE principles, SLOs, SLIs

Automation & Cost Optimization: Platform automation, cost reduction strategies

Soft Skills

Effective communication and cross-team collaboration

Strong problem-solving and analytical mindset

Proactive, independent, and team-oriented work style

Attention to detail

Experience & Qualifications

3-6 years in platform engineering, SRE, or infrastructure roles

Bachelor's degree in Computer Science, IT, or related field

Experience in agile or iterative development environments

Certifications (nice to have): Azure Administrator, Azure DevOps Engineer, AWS Solutions Architect

Job Summary

We are looking for a skilled and motivated Platform Reliability Engineer to support and optimize our platform services. This role bridges the gap between infrastructure services and the platform capabilities required by development and operations teams. The engineer will contribute to the automation, reliability, cost optimization, and service excellence of core platform components hosted in the cloud (Azure/AWS). This is a hands-on technical role focused on enabling reliable, secure, and scalable platform foundations for enterprise-scale workloads.

Key Responsibilities

Support the design and implementation of core platform services that enable development teams to build, deploy, and operate applications reliably.

Develop Infrastructure as Code (IaC) templates and scripts using tools like Terraform or ARM to automate provisioning and configuration.

Monitor and maintain platform services including compute, storage, networking, and data plane infrastructure for scalability and performance.

Collaborate with development, cloud engineering, and security teams to ensure platform alignment with architectural standards and security requirements.

Implement observability practices using tools for monitoring, logging, and alerting to support performance tuning and incident detection.

Troubleshoot platform-related incidents, perform root cause analysis, and document findings for continuous improvement.

Participate in deployment activities, ensuring proper controls and validations are in place when promoting workloads to production.

Support optimization initiatives to reduce costs across services such as compute, storage, Synapse, and platform integration tools.

Contribute to ongoing platform modernization efforts, including migration from legacy configurations to unified governance models such as Unity Catalog.

Qualifications

Bachelor's degree in Computer Science, Information Technology, or a related field.

3-6 years of experience in platform engineering, SRE, or related infrastructure roles.

Practical experience with Azure or AWS cloud services, particularly related to infrastructure and platform-level resource management.

Proficiency in Infrastructure as Code (IaC) tools such as Terraform, ARM templates, or CloudFormation.

Hands-on experience with monitoring and observability solutions (e.g., Azure Monitor, Log Analytics, Prometheus).

Familiarity with CI/CD pipelines and release processes (e.g., Azure DevOps, GitHub Actions).

Strong scripting skills (Python, PowerShell, or Bash) to automate tasks and workflows.

Understanding of access control models, security practices, and compliance in cloud platforms.

Familiarity with SRE principles and operational excellence metrics (SLOs, SLIs).

Experience working in agile or iterative environments.

Soft Skills

Effective communicator with the ability to coordinate across platform, security, cloud, and development teams.

Strong problem-solving mindset with attention to detail.

Proactive and collaborative team player, able to work independently and drive issues to resolution.

Nice to Have

Exposure to Unity Catalog or similar data governance tooling in the context of platform services.

Experience supporting platform migrations or re-architecture projects.

Certification in Azure Administrator, Azure DevOps Engineer, or AWS Solutions Architect.

This role is ideal for someone with a strong technical foundation who is ready to take on ownership of platform-level responsibilities, contribute to modernization efforts, and apply SRE practices to maintain high availability and performance of services.

Seniority level: Entry level

Employment type: Contract

Industries: IT Services and IT Consulting

