The Senior Data Developer Lead - Digital Analytics joins our growing hybrid team in Vancouver; candidates should already be local, as we are not looking to relocate anyone. We are interested in your data implementation experience.
Must Have Skills:
7 to 10 years of hands-on data implementation experience across ETL, Databricks, Python, Spark, and SQL databases.
Responsibilities and Required Skills:
- Strong proficiency in Python (especially data packages such as pandas and NumPy) and SQL for analytics, database development, and data modeling.
- Professional experience with data ingestion, ETL, and ELT for structured and unstructured data.
- Experience with DevOps and CI/CD for data.
- Python - strong proficiency, unit-testing expertise, and good knowledge of data packages such as pandas, SQLAlchemy, and Alembic (see the pandas unit-test sketch after this list).
- SQL - strong proficiency in DDL and DML, including window functions (e.g., LAG), CTEs, subqueries, joins, optimization, and performance profiling; knowledge of how SQL behaves on different platforms such as Spark and PostgreSQL (a window-function sketch follows this list).
- Spark - PySpark, Spark SQL, batch and streaming processing, partitioning/liquid clustering, Delta tables, Parquet (a Delta batch sketch follows this list).
- Databricks - Workflows/Jobs, Clusters, SQL Warehouse, Unity Catalog, performance profiling, log analysis.
- Azure Event Hubs or a similar streaming solution; understanding how best to consume streaming data for aggregations and for parallelization/scaling.
- PostgreSQL - queries, indexes (including the different index types), performance profiling, JSON columns.
- Data Modeling: Dimensional Modeling (including experience with this model as used in BI tools) and Normal Forms.
- Docker containerization.
- dbt - models, seeds, multiple environments, parameters, macros, unit tests, data tests, incremental materialization, snapshots.
- Infrastructure as code and CI/CD tools: Kubernetes, Argo, Crossplane, Terraform, or similar.
- Migrations - database schema versioning with, for example, Alembic (preferred), Flyway, or Liquibase (a minimal Alembic sketch follows this list).
- SQL and NoSQL databases, and selecting the best fit for different use cases.
- Logging, and the ability to query logs using KQL (Azure) or similar.
- Support, maintain, optimize, and create ETL/ELT pipelines, both batch and streaming, in Databricks (PySpark, Databricks SQL), Python, SQL, and/or dbt.
- Proficient in Dimensional Modeling and database normalization (Normal Forms).
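To give a concrete sense of the Python and unit-testing depth described above, here is a minimal sketch of a pandas transform with a pytest-style test. The function and column names (add_revenue, units, unit_price) are invented for illustration only.

```python
import pandas as pd

def add_revenue(df: pd.DataFrame) -> pd.DataFrame:
    """Derive a revenue column from units sold and unit price."""
    out = df.copy()  # avoid mutating the caller's frame
    out["revenue"] = out["units"] * out["unit_price"]
    return out

def test_add_revenue():
    df = pd.DataFrame({"units": [2, 3], "unit_price": [5.0, 1.5]})
    result = add_revenue(df)
    assert result["revenue"].tolist() == [10.0, 4.5]
```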
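The SQL bullet refers to fluency like the following: a CTE feeding a LAG window function, run here through Spark SQL. The table and columns (daily_sales, store_id, sale_date, revenue) are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("window-demo").getOrCreate()

# Day-over-day revenue per store: the CTE aggregates to daily grain,
# then LAG looks one row back within each store's date-ordered partition.
spark.sql("""
    WITH daily AS (
        SELECT store_id, sale_date, SUM(revenue) AS revenue
        FROM daily_sales
        GROUP BY store_id, sale_date
    )
    SELECT store_id,
           sale_date,
           revenue,
           LAG(revenue) OVER (PARTITION BY store_id ORDER BY sale_date) AS prev_day_revenue
    FROM daily
""").show()
```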
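For the Spark and Databricks bullets, a typical batch step looks like the sketch below: read raw Parquet, deduplicate, and write a partitioned Delta table. The paths, table names, and columns (/mnt/raw/events, analytics.events, event_id, event_date) are placeholders, not a prescribed layout.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("etl-demo").getOrCreate()

raw = spark.read.parquet("/mnt/raw/events")   # raw Parquet landing zone
deduped = raw.dropDuplicates(["event_id"])    # keep re-runs idempotent

(deduped.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("event_date")                # enables partition pruning
    .saveAsTable("analytics.events"))
```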
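And for the migrations bullet, a minimal Alembic revision script might look like this; the revision ids and the customers table are illustrative only (real ids are generated by `alembic revision`).

```python
from alembic import op
import sqlalchemy as sa

# Revision identifiers, normally generated by Alembic.
revision = "abc123"
down_revision = None

def upgrade():
    op.create_table(
        "customers",
        sa.Column("id", sa.Integer, primary_key=True),
        sa.Column("email", sa.Text, nullable=False),
        sa.Column("created_at", sa.DateTime, server_default=sa.func.now()),
    )
    op.create_index("ix_customers_email", "customers", ["email"], unique=True)

def downgrade():
    op.drop_index("ix_customers_email", table_name="customers")
    op.drop_table("customers")
```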
Life at Capgemini
Capgemini supports all aspects of your well-being throughout the changing stages of your life and career. For eligible employees, we offer:
- Collaborating with teams of creative, fun, and driven colleagues
- Flexible work options enabling time and location-based flexibility
- Company-provided home office equipment
- Virtual collaboration and productivity tools to enable hybrid teams
- Comprehensive benefits program (Health, Welfare, Retirement and Paid time off)
- Other perks and wellness benefits, like discount programs and gym/studio access
- Paid Parental Leave and coaching, baby welcome gift, and family care/illness days
- Back-up childcare/elder care, childcare discounts, and subsidized virtual tutoring
- Tuition assistance and weekly hot skill development opportunities
- Experiential, high-impact learning series events
- Access to mental health resources and mindfulness programs
- Access to join Capgemini Employee Resource Groups around communities of interest
About Capgemini
Capgemini is a global business and technology transformation partner, helping organizations accelerate their dual transition to a digital and sustainable world while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With its strong heritage of over 55 years, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions, leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, cloud, and data, combined with its deep industry expertise and partner ecosystem. The Group reported 2023 global revenues of €22.5 billion.