We are seeking a highly skilled and visionary Data Architect to lead the design and implementation of our global data model at InPost. This position is ideal for individuals passionate about data ecosystems, medallion architecture, and top-notch frameworks for processing large-scale datasets. The role covers both performance aspects (vast data volumes) and conceptual aspects (naming conventions and their consistency).
Your responsibilities:
- Defining and overseeing the implementation of data modeling strategies
- Defining and overseeing the data naming strategy (establishing conventions, harmonizing existing names of schemas, objects, and columns, and ensuring compliance)
- Training data engineers and data consultants on data modeling approaches (theory, tools, strategies)
- Creating technical specifications for data model development and data architecture documentation
- Ensuring effective collaboration between engineering and analytical teams
- Tracking trends and introducing innovations in data modeling approaches (tools, performance, naming)
Requirements:
- Minimum 6 years of experience in the field of data engineering / analytics engineering
- 2 years of experience as a Data Architect
- Expert in SQL
- Advanced knowledge of Python, Spark, Databricks, Azure, and dbt + willingness to work with in-house frameworks for creating data objects
- Extensive experience in designing and implementing data warehouses (preferably modern big data warehouses, data lakehouse), (long-term) experience with data modeling in star schema, theoretical and practical knowledge of the Kimball approach
- Deep understanding of medallion architecture
- Proven track record of successfully delivering Big Data & Analytics solutions
- Proficiency in modern Data & Analytics technology stacks and architectural patterns
- Knowledge and experience in optimization of: queries in the Big Data stack, ETL processes, data storage solutions, and partitioning strategies for high-performance analytics and processing
- Familiarity with the Parquet file format and experience working with Delta Lake, including data versioning, storage optimization, ensuring data integrity, and handling ACID transactions
- Understanding of the end-to-end development lifecycle of analytical data products
- High level of communication and cross-team collaboration skills
- Familiarity with CI/CD and git (GitLab preferred)
- Fluent Polish and English
We will also appreciate:
- Knowledge of BI tools (Power BI preferred / Tableau)
- Experience in establishing and enforcing data governance policies using Unity Catalog in Databricks for centralized data access control, lineage, and auditing
- Capability in defining and managing data quality frameworks, metadata management, and data cataloging
- Expertise in implementing role-based access control (RBAC) and data encryption (at rest and in transit) to ensure data security and compliance with regulations such as GDPR, CCPA, and HIPAA
- Experience in building global data models