Big Data Engineer

RTB House, Warsaw, Masovian Voivodeship, Poland

Job description

We are looking for experienced developers who will help to build and expand our data processing platform.

Your tasks:

Development and maintenance of distributed systems processing large amounts of data (mostly in real time) for the needs of our RTB platform
Optimization of the developed software in terms of efficiency and resource consumption
Ensuring the reliability and scalability of the solutions built
Creating performance and correctness tests for new system components
Analysis of new technologies in terms of their applicability in production conditions
Development of tools for monitoring and analyzing the operation of the production system
Continuous optimization of existing tools and processes

Selected technologies used:

Java, Python

Hadoop, Kafka

Kafka Streams, Flume, Logstash

Docker, Jenkins, Graphite

Aerospike, PostgreSQL

Google BigQuery, Elastic

Selected issues that we have dealt with recently:

Replacement of the framework in the data processing component (transition from Storm to Kafka Streams)

Creating a data stream merger based on the Kafka Client API

Creating a user profile synchronizer between data centers based on Kafka Streams

Creating a component that calculates aggregates based on the Kafka Client API and Bloom filters

Implementation of Logstash for loading and Elastic for querying indexed data (transition from Flume + Solr)

Creating end-to-end monitoring of data correctness and delay

Replacement of the component that streams data to BigQuery and HDFS (from Flume to an in-house solution based on the Kafka Client API)

Continuous system maintenance, detection and resolution of performance problems, as well as scaling due to the growing amount of data

Our expectations:

Proficiency in programming

Excellent understanding of how complex IT systems work (from the hardware level, through software, to algorithmics)

Good knowledge of basic methods of creating concurrent programs and distributed systems (from thread level to continental level)

Practical ability to observe, monitor and analyze the operation of production systems (and draw valuable conclusions from it)

The ability to critically analyze the solutions created in terms of efficiency (from estimating the theoretical performance of the designed systems to detecting and removing actual performance problems in production)

Readiness to work in the DevOps model

Additional advantages:

Experience in creating distributed systems

Good knowledge of selected Big Data technologies such as Hadoop, Kafka, Storm, Spark or Flink

Knowledge of application profiling methods and tools (preferably for Java, at both the JVM and Linux levels)

We offer:

Attractive salary

Work in a team of enthusiasts who are willing to share their knowledge and experience

Extremely flexible working conditions - we do not have core hours, we do not have holiday limits, and you can work fully remotely

Access to the latest technologies and the opportunity to put them to real use in a large-scale, highly dynamic project

Do you have questions about the project, team, style of work? Visit our tech blog: http://techblog.rtbhouse.com/jobs/
