Job Description
Work Mode : Hybrid (2 days per week in-person at Toronto office preferred)
Skills required :
- 10–12 years in technical program / project management with at least 3–5 years in data platforms and AI / ML operations.
- Strong understanding of data architectures (lake / lakehouse, warehouse, streaming), data governance, and MLOps / ModelOps concepts.
- MLOps / AI : Azure ML, SageMaker, Vertex AI; MLflow, model registry, feature stores, drift / fairness / explainability tools.
- Data Governance : Purview, Collibra, Alation; data lineage, cataloging, DQ tooling.
- Orchestration and CI / CD : Airflow, Prefect, dbt; GitHub Actions / Azure DevOps / Jenkins; Terraform / Bicep / CloudFormation.
- Monitoring / Observability : Prometheus / Grafana, cloud-native monitors, logging, data quality monitors, model monitoring.
- Cloud & Data : Azure (Synapse, Fabric), AWS (S3 / Glue / Redshift), GCP (BigQuery / Dataflow), Databricks, Snowflake.
- Proven experience embedding security / privacy-by-design and Responsible AI (RAI) principles into delivery and ops.
- Excellent stakeholder management, vendor management, and executive communication skills.
Roles and responsibilities :
Program Delivery Leadership
- Own end-to-end delivery of data platform and AI / ML operational initiatives, from discovery through design, implementation, hypercare and steady-state operations.
- Maintain multi-quarter roadmap, backlog and release trains (Scrum, Kanban, SAFe); run standups, PI planning, demos and retros.
- Manage dependencies across data ingestion, storage, processing, cataloging, lineage, access, MLOps pipelines and app integrations.
- Orchestrate cross-functional squads (Data Engineering, Platform / SRE, Security, Risk, Legal and Business) to deliver secure, governed and compliant data capabilities and AI services at scale.
- Own roadmaps, delivery governance, risk controls, release management and post-production reliability for data and AI workloads, ensuring Responsible AI principles are codified into day-to-day operations.
Platform Technical Ownership
- Partner with Platform Engineering / SRE to evolve the data platform reference architecture.
- Drive integration and operationalization of MLOps and ModelOps practices.
- Oversee environment strategy (dev / test / stage / prod), IaC-driven provisioning, cost guardrails and performance SLAs.
Responsible AI & Data Governance
- Embed Responsible AI guardrails into SDLC and runtime : model cards, fairness / bias checks, explainability, human-in-the-loop, drift monitoring and incident response.
- Operationalize data governance : metadata catalog, lineage, PII classification, DLP, RBAC (Role-Based Access Control), ABAC (Attribute-Based Access Control), data quality SLAs, retention / deletion schedules.
- Align with privacy, security and regulatory frameworks (e.g. privacy laws, model risk management and AI assurance frameworks).
Risk and Compliance Controls
- Maintain risk register, control library, audit trail, approvals and evidence for releases and model lifecycle events.
- Run change advisory (CAB) workflows for platform and model changes; ensure traceability from requirements to deployment and monitoring.
Stakeholder Management & Communication
- Translate business outcomes into measurable platform and AI service capabilities, SLIs and SLOs.
- Provide executive-level status (OKRs, KPIs, burn-up / burn-down, RAID, budget vs. actuals).
Certifications (nice-to-have) :
- PMP / Prince2, CSM / SAFe, Azure / AWS / GCP data / AI, Databricks / Snowflake, Governance / Privacy.