Overview
The Data Engineer will develop data architectures and ingestion pipelines in an Azure Databricks environment for financial services clients, specifically in the insurance and banking sectors. This fully remote contract position runs for 2-3 months, with potential for extension, and requires close collaboration with stakeholders to ensure effective data processing and governance.
Responsibilities
- Define and build a data lake architecture on Azure (ADLS + Databricks).
- Implement layered data models (bronze/silver/gold) for ingestion, processing, and reporting.
- Develop ingestion patterns from API sources, both batch and streaming.
- Establish data governance, metadata management, and access controls.
- Define storage structures, naming conventions, and schema design.
- Develop API connectors and ingestion pipelines using Python.
- Implement integration services feeding into the Databricks environment.
- Build data transformation logic using PySpark/Python (data cleansing, standardisation, validation).
Requirements
- Experience with Azure Databricks.
- Proficiency in Python.
- Familiarity with the financial services sector, particularly insurance or banking.
- Knowledge of data lake architecture and data governance principles.
- Experience with API integration and data ingestion patterns.
- Experience building data transformation and cleansing logic with PySpark/Python.