Work Mode: Hybrid (2 days per week in-person at Toronto office preferred)
Skills required:
· 10–12 years in technical program/project management, including 3–5 years in data platforms and AI/ML operations.
· Strong understanding of data architectures (lake/lakehouse, warehouse, streaming), data governance, and MLOps/ModelOps concepts.
· MLOps/AI: Azure ML, SageMaker, Vertex AI; MLflow, model registry, feature stores, drift/fairness/explainability tools.
· Data Governance: Purview, Collibra, Alation; data lineage, cataloging, DQ tooling.
· Orchestration and CI/CD: Airflow, Prefect, dbt; GitHub Actions/Azure DevOps/Jenkins; Terraform/Bicep/CloudFormation.
· Monitoring/Observability: Prometheus/Grafana, cloud-native monitors, logging, data quality monitors, model monitoring.
· Cloud & Data: Azure (Synapse, Fabric), AWS (S3/Glue/Redshift), GCP (BigQuery/Dataflow), Databricks, Snowflake.
· Proven experience embedding security/privacy-by-design and RAI principles into delivery and ops.
· Excellent stakeholder management, vendor management, and executive communication skills.
Roles and responsibilities:
· Program Delivery Leadership
· Own end-to-end delivery of data platform and AI/ML operational initiatives: discovery, design, implementation, hypercare and steady-state operations.
· Maintain the multi-quarter roadmap, backlog and release trains (Scrum, Kanban, SAFe); run standups, PI planning, demos and retros.
· Manage dependencies across data ingestion, storage, processing, cataloging, lineage, access, MLOps pipelines and app integrations.
· Orchestrate cross-functional squads (Data Engineering, Platform SRE, Security, Risk, Legal and Business) to deliver secure, governed and compliant data capabilities and AI services at scale.
· Own roadmaps, delivery governance, risk controls, release management and post-production reliability for data and AI workloads, and ensure Responsible AI principles are codified into day-to-day operations.
· Platform Technical Ownership
· Partner with Platform Engineering and SRE to evolve the data platform reference architecture.
· Drive integration and operationalization of MLOps and ModelOps practices.
· Oversee environment strategy (dev, test, stage, prod), IaC-driven provisioning, cost guardrails and performance SLAs.
· Responsible AI and Data Governance
· Embed Responsible AI guardrails into the SDLC and runtime: model cards, fairness/bias checks, explainability, human-in-the-loop review, drift monitoring and incident response.
· Operationalize data governance: metadata catalog, lineage, PII classification, DLP, RBAC (Role-Based Access Control), ABAC (Attribute-Based Access Control), data quality SLAs and retention/deletion schedules.
· Align with privacy, security and regulatory frameworks (e.g., privacy laws, model risk management and AI assurance frameworks).
· Risk and Compliance Controls
· Maintain risk register, control library, audit trail, approvals and evidence for releases and model lifecycle events.
· Run change advisory board (CAB) workflows for platform and model changes; ensure traceability from requirements to deployment and monitoring.
· Stakeholder Management and Communication
· Translate business outcomes into measurable platform and AI service capabilities, SLIs and SLOs.
· Provide executive-level status reporting (OKRs, KPIs, burn-up/burn-down, RAID, budget vs. actuals).
Certifications (nice-to-have):
· PMP/Prince2, CSM/SAFe, Azure/AWS/GCP data/AI, Databricks/Snowflake, Governance/Privacy.
Technical Program Manager, MLOps and Data Governance • Toronto, ON, CA