Overview
Jacobs is seeking an experienced AI Data Engineer to join our team in Krakow as part of the Digital & Data Hub. The role focuses on building scalable AI/ML data architectures in cloud environments: integrating large-scale datasets, automating pipelines, and enabling the data flow that supports advanced machine learning initiatives across the organization.
Responsibilities
- Design, develop, and optimize data architectures that support AI and Machine Learning workflows
- Integrate large-scale datasets from multiple sources to ensure consistency, accessibility, and performance
- Build scalable, automated data pipelines to support real-time and batch data processing
- Utilize cloud-based ETL services (e.g., Glue, Lambda, Data Pipeline or equivalent) to manage data ingestion, transformation, and integration tasks
- Collaborate with data scientists and ML engineers to ensure data infrastructure aligns with AI/ML model requirements
- Monitor and maintain data workflows to ensure reliability, scalability, and cost-efficiency
- Continuously improve data engineering practices to support evolving AI/ML use cases and business needs
Qualifications
- 5+ years of experience in data engineering or related roles, with a strong focus on cloud-based solutions
- Proficiency in programming languages such as Python, Scala, or similar
- Solid understanding of machine learning frameworks such as TensorFlow or PyTorch
- Strong experience in data classification, including the identification of PII data entities
- Knowledge and experience with retrieval augmented generation (RAG) and agent-based workflows
- Deep understanding of how to re-rank and improve LLM outputs using index and vector stores
- Ability to leverage cloud services (e.g., SageMaker, Comprehend, Entity Resolution or similar) to solve complex data and AI-related challenges
- Ability to manage and deploy machine learning models and frameworks at scale using cloud infrastructure
- Strong analytical and problem-solving skills, with the ability to innovate and develop new approaches to data engineering and AI/ML
- Experience with ETL services in cloud environments (such as Glue, Lambda, Data Pipeline or equivalent) to handle data processing and integration tasks effectively
- Experience in core cloud services including IAM, VPC, compute instances, object storage, relational databases, monitoring, and logging tools
Nice to Have
- Experience with data privacy and compliance requirements, especially related to PII data
- Familiarity with advanced data indexing techniques, vector databases, and other technologies that improve the quality of AI/ML outputs
We offer
- Rewarding employment: full-time employment or B2B contract with a salary that matches your qualifications
- Hybrid work model: flexibility to work from home with several office days per month
- Flexible hours: start your day anytime between 7:30 and 10:00 AM
- Comprehensive benefits, including Lux Med medical care, psychological support, life insurance, My Benefit cafeteria system, Multisport card co-financing, and a car/bike park sharing system
- Co-financed holidays
- Global projects: engage in international projects
- Inclusive networks and employee resource groups
- Continuous learning opportunities, including the Graduate Development Program and self-learning platforms
- Language courses in English, German, and Polish
We are an inclusive employer and support recruitment regardless of age, disabilities, gender identity, race, religion, sexual orientation, or family status. If you require reasonable adjustments in the recruitment process, please contact the team at