Overview
We are seeking a skilled Data Engineer with Azure and Databricks expertise to join our team on a fully remote 3-month contract, with the possibility of extension. In this role, you will work with stakeholders to design and implement a robust data lake architecture, and to ingest and process data reliably for reporting in the financial services sector, particularly insurance.
Responsibilities
- Define and build a data lake architecture on Azure using Azure Data Lake Storage (ADLS) and Databricks.
- Implement a layered data model (bronze/silver/gold) for efficient data ingestion and reporting.
- Develop batch and streaming ingestion patterns for API sources.
- Establish data governance processes, including metadata management and access controls.
- Define storage structures and naming conventions to ensure data consistency.
- Create API connectors and ingestion pipelines in Python.
- Build transformation logic in PySpark/Python for data cleansing and validation.
- Configure monitoring, logging, and alerting to ensure pipeline reliability.
Requirements
- Proven experience with Azure Databricks and Python programming.
- Strong understanding of data ingestion and pipeline development.
- Experience building data lake architectures on cloud platforms.
- Familiarity with data governance and metadata management practices.
- Knowledge of API development and integration services.
- Background in the financial services sector, especially insurance.
- Proficiency in data transformation and processing using PySpark.