Responsibilities
- Design, develop, and maintain ETL processes using AWS analytics services for extracting, transforming, and loading data from various sources into our data lake and data warehouse.
- Collaborate with cross‑functional teams to understand data requirements, identify data sources, and implement solutions for data ingestion and integration.
- Support data and ETL migrations from on‑premises to cloud platforms.
- Develop custom transformations and data pipelines using PySpark and Python as needed.
- Ensure data quality and integrity throughout the ETL process, implementing appropriate error handling and monitoring mechanisms.
- Implement and enforce data governance policies and procedures within the data warehouse, ensuring compliance with regulatory requirements and industry best practices.
- Work closely with data scientists, analysts, and business stakeholders to gather data requirements and to design and implement data models, schemas, and structures, with a focus on data governance principles such as data lineage, data quality monitoring, and metadata management.
- Expand the business intelligence and data analytics product capabilities using GenAI technology.
- Monitor and troubleshoot data pipelines to ensure data quality, consistency, and performance.
- Ensure data consistency, availability, and recoverability through backup and disaster recovery procedures.
- Collaborate with stakeholders to define and document data governance standards, processes, and guidelines, and facilitate training and awareness programs as needed.
- Continuously evaluate and implement best practices, tools, and technologies to improve data governance practices and ensure data security, privacy, and confidentiality.
What We’re Looking For
- 8+ years’ experience with data warehouse and data lake ETL platforms.
- 8+ years’ experience with data modeling frameworks such as Data Vault 2.0, Star Schema, or Snowflake Schema.
- 8+ years’ experience with SQL scripting, including SCD1, SCD2, and SCD3 logic.
- Expert in distributed data processing frameworks such as Spark (Core, Streaming, SQL), Storm, Flink, etc.
- 3+ years’ experience with Infrastructure as Code (IaC) using Terraform or CloudFormation.
- 5+ years’ experience with AWS Cloud services, e.g., AWS Glue, Athena, EMR, Firehose, Kinesis, Redshift, RDS, DMS, S3, AppFlow, SQS, Lambda, Airflow, EventBridge, etc.
- 8+ years’ experience working with relational databases (Oracle, SQL Server, DB2, PostgreSQL, MySQL, SAP HANA) on AWS and/or on‑prem infrastructure.
- 5+ years’ experience with NoSQL solutions such as DynamoDB, Bigtable, MongoDB, Cassandra.
- 5+ years’ experience with Python, Java, or Scala programming languages.
- 5+ years’ experience with business intelligence technologies such as Power BI, Tableau, QuickSight.
- 5+ years’ experience with CI/CD pipelines on platforms such as GitHub or Azure DevOps.
- Experienced in the Hadoop ecosystem with AWS cloud distribution.
- Experienced in big data ingestion tools such as Sqoop, Flume, and NiFi, and distributed messaging and ingestion frameworks like Theobald, Kafka, Pulsar, Pub/Sub, etc.
- Agile methodology and skills, including experience with Scrum ceremonies and work management tools (e.g., JIRA, Confluence, ADO).
- Work as a cross‑functional team member to support multiple work streams and products.
- Experienced in AWS Foundation and Organizations.
Education and Certifications
- Bachelor's degree in Computer Science, Software Engineering, or Data Science, and/or equivalent work experience.
- AWS certifications: Database, Data Engineer, AI/ML, AWS Certified Solutions Architect.