This role gives the candidate the opportunity to deliver enterprise-scale data workloads that follow a data lakehouse architecture on a modern cloud-based data stack for our new Commercial Risk Solutions Ecosystem. These data workloads are strategic feeds supporting the BI / Analytics solutions that are essential to the value proposition of our ecosystem.
responsibilities :
Cross-functional Team : Work within a cross-functional Agile team alongside the Product Owner, Data Architect and Data Analysts to deliver on release goals for a global program.
Data Workload Design : Design scalable and efficient data workload architectures that integrate our transactional systems with a new lakehouse data model.
Data Integration : Develop, test and maintain data pipelines (ETL) that integrate diverse data sources into a unified format, embedding best practices and standards.
Data Management : Manage and optimize data to ensure efficient data storage, retrieval, and processing.
Data Quality Management : Implement automated data quality checks and ensure data integrity throughout the migration process.
Documentation : Create and maintain comprehensive documentation for data processes, ensuring knowledge transfer, observability and supportability.
Performance Monitoring : Monitor and optimize data performance to meet defined service-level agreements.
Troubleshooting : Identify and resolve data-related issues in a timely manner, collaborating with relevant teams.
requirements-expected :
Advanced Technical Skills :
In-depth knowledge of programming languages such as Python, including Spark / Scala.
Experience with ETL tools and lakehouse architectures (3+ years) through Databricks, Apache Spark SQL or similar.
Strong SQL skills for data manipulation and querying.
Pipeline efficiency optimization skills.
Hands-on experience with Agile technical practices, source versioning and Agile project management tools (Azure Repos, GitLab, Azure DevOps, Jira, Confluence or similar).
Database Knowledge : Familiarity with relational and non-relational databases (e.g. Oracle, MS SQL Server).