Join to apply for the Site Reliability Engineer role at Kyndryl.
Recruitment & Strategic Staffing @Kyndryl | Partnering with IT Consultants in Financial Services & Technology
- Position: Site Reliability Engineer
- Client: Financial Services - Capital Markets Technology
- Duration: 12-month contract with potential extensions
- Location: Toronto, Canada (2 to 3 days onsite per week)
- Language: English
- Hours: 37.5 hours per week
Our client is looking for a Site Reliability Engineer (SRE) to enhance the reliability, performance, and efficiency of mission‑critical batch workloads across Capital Markets Technology. The SRE will serve as a technical lead focused on automation, application development, systems performance engineering, and observability using Dynatrace. This position is pivotal in driving operational excellence and maturing reliability practices across the organization.
Qualifications
- Expert‑level Python skills, including performance tuning, concurrency (async/multiprocessing), testing, and packaging.
- Strong Linux systems engineering expertise (kernel tuning, networking, process management, filesystem optimization).
- Proven experience optimizing batch workloads for performance, reliability, and cost efficiency.
- Deep knowledge of Dynatrace for observability (dashboards, KPIs, tagging, alerts, anomaly detection).
- Hands‑on experience with Apache Airflow (DAG design, scheduler tuning, SLA management).
- Strong understanding of distributed systems concepts: retries, idempotency, backpressure, data integrity.
- Experience with CI/CD pipelines (GitHub Actions, Azure DevOps, Jenkins) and Infrastructure as Code (Terraform, Ansible).
- Familiarity with containers and orchestration tools (Docker, Kubernetes).
- Excellent incident management, troubleshooting, and communication skills.
Responsibilities
- Reliability & Performance: Engineer resilient and performant batch processing pipelines by reducing runtime and minimizing failures.
- Observability: Implement and maintain Dynatrace dashboards, alerts, and runbooks to ensure deep visibility into system health.
- Systems Engineering: Configure and tune Linux and Windows environments for optimal reliability and speed.
- Automation & Orchestration: Design and refine Airflow DAGs, automate deployments with CI/CD pipelines, and reduce operational toil through code.
- Incident Management: Lead incident response, conduct root‑cause analysis, and implement improvements based on post‑mortems and SLOs.
- Security & Compliance: Ensure all reliability and automation processes adhere to security best practices and regulatory compliance standards.
Please note this is for a contract position with one of our clients and not a full-time employment role with Kyndryl Canada.
- Seniority level: Mid‑Senior level
- Employment type: Contract
- Job function: Information Technology
- Industries: IT Services and IT Consulting