Role : Senior Data Engineer
Location : Mississauga, ON
Contract
Mandatory Skills : Data Lake Storage, Azure Data Factory, RDD / DataFrame / SQL
Responsibilities :
- Design, develop, optimize, and maintain scalable data pipelines using Databricks (Spark / Scala) on Azure.
- Collaborate with cross-functional teams to deliver robust end-to-end data solutions.
- Write efficient SQL queries for ETL processes, data validation, and reporting.
- Optimize performance of complex data processing workflows.
- Implement best practices for code quality, testing, deployment, and documentation.
- Participate in code reviews and mentor junior engineers.
- Work closely with stakeholders to understand business requirements and translate them into technical solutions.
- Integrate data solutions with other applications; exposure to .NET APIs or React-based UIs is a plus.
Required Skills & Qualifications :
- Bachelor's or Master's degree in Computer Science or a related field.
- 6+ years of hands-on experience in software engineering / data engineering roles.
- Strong experience with the Databricks platform for big data analytics.
- Proficient in the Scala programming language; solid understanding of the Spark framework (RDD / DataFrame / SQL).
- Hands-on experience with Azure cloud services (e.g., Data Lake Storage, Azure Data Factory).
- Strong background in database design and development; excellent SQL skills (T-SQL, PL/SQL, etc.).
- Understanding of CI / CD pipelines for automated deployments on cloud platforms.
- Experience working within Agile / Scrum methodologies.
- Advanced knowledge of Python for scripting, automation, and AI / ML development.
- Familiarity with LangChain for integrating and orchestrating large language model workflows.
- Hands-on experience with Azure OpenAI to leverage generative AI capabilities in applications.
- Understanding of vector databases for efficient storage and retrieval of high-dimensional embeddings, which underpin semantic search and recommendation systems.
- Knowledge of RAG (Retrieval-Augmented Generation) techniques to combine information retrieval with generative AI models for enhanced application intelligence.