The Private Cloud SRE L3 position sits within the Client Enterprise Computing organization. This role supports cloud and container-based infrastructure in a high-availability, globally distributed environment. As a member of the global L3 team, you will provide advanced technical support, participate in on-call rotations, and collaborate with engineering teams on performance, testing, and automation.
Key Responsibilities :
- Provide L3 support for the Client private cloud infrastructure and participate in the on-call rotation.
- Collaborate with internal engineering teams to test and validate software releases, upgrades, and infrastructure changes.
- Drive process improvements including automation, scripting, documentation, and incident management.
- Help develop capacity planning, performance monitoring, and alerting solutions.
- Coordinate closely with L2 teams and global L3 peers to ensure consistent support across regions.
- Champion operational excellence through robust change, incident, and problem management practices.
Required Qualifications :
- 5–7 years of relevant experience in systems or infrastructure roles.
- 3–5 years of hands-on experience with Linux systems in enterprise environments.
- Strong understanding of server infrastructure, virtualization, and cloud computing architectures.
- Proven experience with Kubernetes and Docker in a production setting.
- Solid grasp of internet and networking protocols (TCP/IP, HTTP/S) and security protocols (SSL/TLS, Kerberos).
- Strong scripting skills (e.g., Python preferred) for automation and tooling.
- Experience with Agile development and DevOps/SRE methodologies.
- Excellent communication skills and the ability to work effectively with diverse teams and stakeholders.
Preferred / Nice-to-Have Skills :
- Experience with cloud-native monitoring tools (e.g., Prometheus, Grafana, ELK stack).
- Hands-on experience in enterprise-scale hosting environments.
- Familiarity with high-availability system design and disaster recovery strategies.
- Knowledge of monitoring architecture, including deployment of agents, custom dashboards, and alerting logic.
- Prior work experience in regulated environments (e.g., financial services) is a plus.
Soft Skills :
- Strong problem-solving and incident management capabilities.
- Ability to manage multiple high-pressure issues simultaneously.
- Highly organized with attention to detail and a proactive attitude toward continuous improvement.