Site Reliability Engineering Manager

IBM • Kraków, Województwo małopolskie, Polska

30+ days ago

Job description

Introduction

A career in IBM Software means you'll be part of a team that transforms our customers' challenges into solutions.

Seeking new possibilities and always staying curious, we are a team dedicated to creating the world's leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless global opportunities for our IBMers, so the door is always open for those who want to grow their career.

IBM's product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.

Your Role And Responsibilities

As a Site Reliability Engineering (SRE) Manager in the IBM Software organization, you will lead and manage a team of site reliability engineers. You will be responsible for ensuring the reliability, scalability, and operational efficiency of IBM Asset Lifecycle Management services. You will hire, train, and mentor team members, assign tasks, set goals, and conduct performance evaluations. You will work closely with development teams, SRE peers, and engineering managers to automate infrastructure management, optimize system performance, and enhance monitoring capabilities. Overall, an SRE Manager plays a crucial role in aligning engineering and operations to achieve reliable software systems, combining technical expertise with leadership and management skills to drive continuous improvement and ensure high-quality service delivery.

Key Responsibilities

Leadership

Provide strategic guidance to engineering teams on architectural decisions and directions.
Empower teams to achieve technical excellence, with a focus on reliability, scalability, and simplicity.
Foster collaboration across engineering, product, and other cross-functional teams to deliver optimal solutions.

Monitoring & Observability

Design and implement monitoring solutions to gain insights into system health, performance, and reliability.

Build and maintain intuitive dashboards for real-time visibility into critical system metrics.

Set up proactive alerting mechanisms to detect and resolve issues before they impact end users.

Incident Management

Lead incident response, performing root cause analysis (RCA) and implementing long-term fixes to improve system resilience.

Build observability solutions with monitoring, logging, and alerting using tools like Prometheus, Grafana, and Instana.

Define and monitor Service Level Objectives (SLOs) and Service Level Agreements (SLAs) to ensure service reliability.

Security & Compliance

Ensure compliance with security best practices and regulatory requirements across all infrastructure components.

Implement secret management, encryption, and access control for sensitive systems and data.

Participate in security audits, vulnerability assessments, and compliance automation efforts.

Cross-Team Collaboration & DevOps Culture

Collaborate closely with development, operations, and security teams to design and implement resilient architectures.

Promote SRE best practices, such as blameless postmortems, incident retrospectives, and operational readiness reviews.

Mentor junior engineers and contribute to knowledge sharing across teams to build a strong SRE culture.

Preferred Education

Bachelor's Degree

Required Technical And Professional Expertise

Bachelor's degree in computer science engineering / information technology

5+ years of experience working in global organizations, with the ability to communicate effectively with executives, leaders, and individual contributors across the organization.

5+ years of SRE experience working with telemetry, observability, self-healing solutions, and platform automation.

Cloud & Infrastructure: Expertise in Kubernetes, OpenShift, Docker, IBM Cloud, and other cloud platforms.
