Responsibilities
Design and implement Lakehouse architecture using Databricks, AWS S3, Glue Catalog, and Delta Lake.
Set up and manage multi-environment Databricks Workspaces (Dev, UAT, Prod) with consistent configuration and governance.
Implement cluster policies for standardized compute usage and cost efficiency.
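For illustration only, a cluster policy is a JSON document of attribute constraints; the values below (instance types, worker counts, auto-termination limits) are placeholder choices, and such a policy would be registered through the Databricks cluster policies UI or REST API.

    # Illustrative cluster policy definition; all limits and node types are placeholders.
    cost_control_policy = {
        "spark_version": {"type": "unlimited", "defaultValue": "auto:latest-lts"},
        "node_type_id": {"type": "allowlist", "values": ["m5.xlarge", "m5.2xlarge"]},
        "autoscale.max_workers": {"type": "range", "maxValue": 8},
        "autotermination_minutes": {"type": "range", "maxValue": 60, "defaultValue": 30},
    }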
Create and orchestrate Databricks Jobs and Workflows, integrating with CI/CD systems and version control.
Configure fine-grained access controls using Unity Catalog, Lake Formation, and IAM roles.
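As a minimal sketch, Unity Catalog privileges can be granted from Spark SQL; the catalog, schema, table, and group names below are hypothetical.

    # `spark` is the SparkSession that Databricks notebooks provide by default.
    # Object and principal names are made up for illustration.
    spark.sql("GRANT USE CATALOG ON CATALOG lakehouse_prod TO `data-analysts`")
    spark.sql("GRANT USE SCHEMA ON SCHEMA lakehouse_prod.payments TO `data-analysts`")
    spark.sql("GRANT SELECT ON TABLE lakehouse_prod.payments.transactions TO `data-analysts`")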
Manage user provisioning and role-based access with SCIM integrations to IdPs (e.g., Okta, AWS SSO).
Implement audit trails, lineage tracking, and data masking techniques to meet GDPR, PSD2, and Open Banking standards.
Collaborate with compliance teams to align data handling practices with consent and privacy regulations.
Develop scalable ETL/ELT pipelines in PySpark and Spark SQL, handling batch and streaming data flows.
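A minimal sketch of such a pipeline, assuming hypothetical S3 paths and table names: one batch load of a daily extract and one Auto Loader stream, both landing in Delta tables.

    # Paths, schema locations, and table names are assumptions for illustration.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()  # provided automatically in Databricks notebooks

    # Batch: load a daily extract from S3 and append it to a bronze Delta table.
    daily = (spark.read.format("json")
             .load("s3://example-bucket/openbanking/accounts/2024-01-01/")
             .withColumn("ingested_at", F.current_timestamp()))
    daily.write.format("delta").mode("append").saveAsTable("bronze.accounts")

    # Streaming: incrementally pick up new files with Auto Loader and write to Delta.
    stream = (spark.readStream.format("cloudFiles")
              .option("cloudFiles.format", "json")
              .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/transactions")
              .load("s3://example-bucket/openbanking/transactions/"))
    (stream.writeStream.format("delta")
           .option("checkpointLocation", "s3://example-bucket/_checkpoints/transactions")
           .toTable("bronze.transactions"))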
Ingest and transform data from Open Banking APIs, Amazon Aurora, and third-party aggregators.
Optimize pipeline performance using Z-Ordering, Delta Lake compaction, and caching strategies.
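By way of example, compaction and Z-Ordering are typically run as Delta maintenance commands; the table and column names below are placeholders.

    # OPTIMIZE compacts small files; ZORDER BY co-locates rows on common filter columns.
    # `spark` is the Databricks-provided SparkSession; names are illustrative.
    spark.sql("OPTIMIZE bronze.transactions ZORDER BY (account_id, booking_date)")
    # Cache a frequently joined lookup table for the rest of the session.
    spark.table("bronze.accounts").cache().count()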
Monitor cluster health, job failures, and performance metrics using native tools and AWS CloudWatch.
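As an illustrative sketch, custom job metrics can be published to CloudWatch with boto3; the namespace, dimension, and region below are assumptions.

    # Hypothetical custom metric; namespace, dimension, and region are illustrative.
    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name="eu-central-1")
    cloudwatch.put_metric_data(
        Namespace="Lakehouse/Pipelines",
        MetricData=[{
            "MetricName": "FailedRecords",
            "Dimensions": [{"Name": "Pipeline", "Value": "openbanking_transactions"}],
            "Value": 0.0,
            "Unit": "Count",
        }],
    )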
Diagnose and resolve issues across jobs, notebooks, and integrations with AWS services.
Act as the go-to Databricks expert for other engineering and analytics teams.
Enforce coding standards covering modular notebooks and library packages, secure secrets handling, parameterized and reusable notebooks, schema enforcement with Delta Lake, and segregated compute clusters.
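One possible shape of these standards in practice, with hypothetical widget names, schema, and paths: the notebook reads its environment and load date as parameters, and the declared schema plus Delta's schema enforcement reject writes that drift from the contract.

    # Parameterized notebook cell; `dbutils` and `spark` come with the Databricks runtime.
    # Widget names, schema fields, and paths are assumptions for illustration.
    env = dbutils.widgets.get("environment")      # for example "dev", "uat", "prod"
    load_date = dbutils.widgets.get("load_date")  # for example "2024-01-01"

    from pyspark.sql.types import StructType, StructField, StringType, DecimalType, DateType

    expected_schema = StructType([
        StructField("account_id", StringType(), False),
        StructField("amount", DecimalType(18, 2), True),
        StructField("booking_date", DateType(), True),
    ])

    df = (spark.read.schema(expected_schema)
          .parquet(f"s3://example-bucket/{env}/landing/transactions/{load_date}/"))
    # Delta enforces the target table's schema on write, so a mismatched DataFrame fails fast.
    df.write.format("delta").mode("append").saveAsTable(f"{env}_bronze.transactions")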
Promote reusability, version control, and deployment automation using GitHub and CI/CD pipelines.
Must-Have Skills
6+ years of hands-on Databricks experience on AWS, with a focus on scalable data pipeline development.
Proficiency in PySpark, Spark SQL, Delta Lake, and Databricks Workflows.
Strong understanding of cloud data lake architecture using S3, Glue, and Lake Formation.
Hands-on experience with user access management, IAM policies, and Unity Catalog or similar governance tools.
Good-to-Have Skills
Experience with ETL tools such as HVR, AWS Glue or DBT.
Familiarity with Amazon Aurora (MySQL/PostgreSQL) or equivalent relational databases.
Understanding of Open Banking APIs, API standards (e.g., FAPI), and consent frameworks.
Exposure to DevOps practices and tools like Terraform, GitLab, Jenkins.
Experience working in a regulated industry, preferably Open Banking, fintech, or financial services.
Preferred Certifications
Databricks Certified Data Engineer - Associate or Professional
AWS Certified Solutions Architect - Associate
Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization and, therefore, it is expected that every person working for, or on behalf of, Mastercard is responsible for information security and must:
Job ID R-249220
Senior Data Engineer • Gdańsk, Poland