Data Engineer - Remote - Canada
Zortech Solutions - AB, Canada
- Full-time
- Remote
- Quick Apply
Role: Data Engineer with expertise in Apache Spark, PySpark, Python, and AWS services (particularly Glue)
Location: Remote - Canada
Duration: 6+ months
Job Description:
We are looking for an experienced Data Engineer with expertise in Apache Spark, PySpark, Python, and AWS services (particularly Glue) to join our team.
The ideal candidate will have hands-on experience with ETL processes in the cloud, a deep understanding of data pipelines, and the ability to work with large datasets efficiently.
This role will focus on designing, building, and optimizing data workflows on AWS Cloud using Spark-based frameworks and Python.
Mandatory Skill Sets:
- Proficiency with Apache Spark and PySpark for big-data processing and transformation.
- Hands-on experience with AWS Glue for building ETL workflows in the cloud.
- Strong Python programming skills, particularly for data manipulation, automation, and integration with Spark and Glue.
- Solid understanding of ETL principles and data pipeline design, including optimization techniques.
- Experience with AWS data-processing services (e.g., S3, Glue, Lambda, Redshift).
- Proficiency in writing optimized SQL, including performance tuning.
- Ability to translate complex business requirements into technical solutions.
- Experience with Apache Airflow for orchestrating data workflows.
- Knowledge of data warehousing concepts and cloud-native analytics tools.
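The ETL principles and pipeline-design skills listed above can be sketched in plain, framework-agnostic Python. In the role itself the extract and load stages would typically run through Spark DataFrames or Glue DynamicFrames; the record schema and cleaning rules below are illustrative assumptions, not part of the posting.

```python
# Minimal extract-transform-load sketch; the field names and the
# normalize/filter rules are hypothetical examples of data-quality steps.

def extract(rows):
    """Extract stage: yield raw records from a source (here, an in-memory list)."""
    yield from rows

def transform(records):
    """Transform stage: normalize names and drop records missing an amount."""
    for rec in records:
        if rec.get("amount") is None:
            continue  # basic data-quality filter
        yield {"name": rec["name"].strip().title(), "amount": float(rec["amount"])}

def load(records):
    """Load stage: collect into a destination (here, a list standing in for a table)."""
    return list(records)

def run_pipeline(source_rows):
    """Compose the stages; with Spark these would be DataFrame operations instead."""
    return load(transform(extract(source_rows)))

cleaned = run_pipeline([
    {"name": "  ada lovelace ", "amount": "10.5"},
    {"name": "grace hopper", "amount": None},  # dropped by the quality filter
])
print(cleaned)  # [{'name': 'Ada Lovelace', 'amount': 10.5}]
```

Keeping each stage a small generator-based function mirrors how Spark pipelines chain lazy transformations before a final action materializes the result.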
Key Responsibilities:
- Spark & PySpark Development: Design and implement scalable data processing pipelines using Apache Spark and PySpark for large-scale data transformations.
- ETL Pipeline Development: Develop, maintain, and optimize ETL processes to efficiently extract, transform, and load data across various sources and destinations.
- AWS Glue Integration: Use AWS Glue for serverless ETL, including creating, running, and monitoring Glue jobs for data transformations and integrations.
- Python Scripting: Write efficient, reusable Python code to support data manipulation, analysis, and transformation in Spark and Glue environments.
- Data Pipeline Optimization: Ensure data workflows are optimized for performance, scalability, and cost-efficiency on AWS.
- Collaboration: Work closely with data analysts, data scientists, and other engineering teams to build reliable data solutions that support business analytics and decision-making.
- Documentation & Best Practices: Document processes, workflows, and code while adhering to best practices in data engineering, cloud architecture, and ETL design.
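The optimized-SQL requirement above can be sketched with the stdlib sqlite3 module standing in for a warehouse such as Redshift; the table, columns, and index name are hypothetical, and the tuning step shown (indexing a filter column, then checking the query plan) is one common technique, not the posting's prescribed method.

```python
import sqlite3

# Hypothetical orders table; sqlite3 stands in for a cloud warehouse here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("west", 10.0), ("east", 25.0), ("west", 5.0)],
)

# Tuning step: index the filter column so the planner can seek on it
# instead of scanning the whole table.
conn.execute("CREATE INDEX idx_orders_region ON orders (region)")

# EXPLAIN QUERY PLAN reveals whether the index is actually used.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(amount) FROM orders WHERE region = ?", ("west",)
).fetchall()
print(plan)  # plan rows typically reference idx_orders_region

total = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE region = ?", ("west",)
).fetchone()[0]
print(total)  # 15.0
```

On a real warehouse the same habit applies: inspect the plan (e.g., EXPLAIN in Redshift) before and after a change rather than assuming an index or sort key helped.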