Job Responsibilities
- Design, develop, and maintain ETL processes using AWS analytics services for extracting, transforming, and loading data from various sources into our data lake and data warehouse.
- Collaborate with cross-functional teams to understand data requirements, identify data sources, and implement solutions for data ingestion and integration.
- Support data and ETL migrations from on-premises platforms to the cloud.
- Develop custom transformations and data pipelines using PySpark and Python as needed.
- Ensure data quality and integrity throughout the ETL process, implementing appropriate error handling and monitoring mechanisms.
- Implement and enforce data governance policies and procedures within the data warehouse, ensuring compliance with regulatory requirements and industry best practices.
- Work closely with data scientists, analysts, and business stakeholders to gather data requirements and design and implement data models, schemas, and structures, with a focus on data governance principles such as data lineage, data quality monitoring, and metadata management.
- Expand the business intelligence and data analytics product capabilities using GenAI technology.
- Monitor and troubleshoot data pipelines to ensure data quality, consistency, and performance.
- Ensure data consistency, availability, and recoverability through backup and disaster recovery procedures.
- Collaborate with stakeholders to define and document data governance standards, processes, and guidelines, and facilitate training and awareness programs as needed.
- Continuously evaluate and implement best practices, tools, and technologies to improve data governance practices and ensure data security, privacy, and confidentiality.
What We're Looking For:
- 8+ years' experience with data warehouse and data lake ETL platforms.
- 8+ years' experience with data modelling frameworks such as Data Vault 2.0, star schema, and snowflake schema.
- 8+ years' experience with SQL scripting, including SCD1, SCD2, and SCD3 logic.
- Expert in distributed data processing frameworks such as Spark (Core, Streaming, SQL), Storm, and Flink.
- 3+ years' experience with Infrastructure as Code (IaC) using Terraform and CloudFormation.
- 5+ years' experience with AWS cloud services, e.g., AWS Glue, Athena, EMR, Firehose, Kinesis, Redshift, RDS, DMS, S3, AppFlow, SQS, Lambda, Airflow, EventBridge.
- 8+ years' experience working with relational databases (Oracle, SQL Server, DB2, PostgreSQL, MySQL, SAP HANA) on AWS and/or on-prem infrastructure.
- 5+ years' experience with NoSQL solutions such as DynamoDB, Bigtable, MongoDB, and Cassandra.
- 5+ years' experience with Python, Java, or Scala programming languages.
- 5+ years' experience with business intelligence technologies such as Power BI, Tableau, and QuickSight.
- 5+ years' experience with CI/CD pipelines on platforms such as GitHub and Azure DevOps.
- Experienced in the Hadoop ecosystem with AWS cloud distribution.
- Experienced in big data ingestion tools such as Sqoop, Flume, and NiFi, and in distributed messaging and ingestion frameworks such as Theobald, Kafka, Pulsar, and Pub/Sub.
- Agile methodology and skills, including experience with Scrum ceremonies and work management tools (e.g., JIRA, Confluence, ADO).
- Work as a cross-functional team member to support multiple work streams and products.
- Experienced in AWS Foundation and Organizations.
Education and Certification:
- Bachelor's degree in Computer Science, Software Engineering, Data Science, or equivalent work experience.
- AWS certification: Database, Data Engineer, AI/ML, or AWS Certified Solutions Architect.
Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology and Engineering
Industries: Motor Vehicle Parts Manufacturing
Location: Ostrava, Moravia-Silesia, Czechia