Role : Software Developer Reliability
Team : Market Access & Risk
As a Reliability Software Engineer, you will play a critical role in ensuring the performance, stability and availability of our software systems, as well as their day-to-day operations.
The role therefore requires strong software development ability, along with strong analytical skills.
You will primarily be developing reliability features directly in our applications, implementing observability capabilities, running benchmarks to measure performance, and building automation and tooling to support the operations of our systems.
Operations are important to ensure business continuity. They include responding to level-2 support escalations, monitoring our infrastructure capacity, and tweaking system configuration to address user requests.
Position Overview :
System Reliability : Develop incremental stability, recovery, scalability and performance improvements. Perform root cause analyses to understand the source of incidents. Suggest and implement remedial actions in response to incidents
Observability : Monitor, measure, and analyze the performance, availability and stability of technology systems to identify areas of improvement and allow the team to make data-driven decisions
Performance Optimization : Optimize performance of production systems to address bottlenecks and improve system response times, resource utilization, and overall application performance
Automation and Tooling : Develop and maintain automation systems and tooling for operations, deployment, and incident management to reduce manual intervention and enhance system stability
Production Management : Provide level-2 support for incident response to ensure business uptime. Work closely with core developers and support teams to plan and prepare for scaling technology systems to accommodate user demands
Required Qualifications :
- Education : Bachelor’s degree in Computer Science or related subject
- Experience : 4+ years of proven experience in Software Engineering, Software Reliability, or a similar role, with hands-on experience in software development and providing L2 support
- Experience developing in Python or a similar language, and familiarity with version control systems such as Git
- Experience working in a Linux environment
- Problem-Solving Skills : Strong analytical and problem-solving skills with a keen eye for detail and a proactive approach to resolving issues
- Communication : Excellent communication and collaboration skills to work effectively with cross-functional teams
- Adaptability : Ability to work in a fast-paced and dynamic environment, adapting to changing priorities and requirements
- Automation and Tooling : Experience developing automation tools and implementing configuration management
Nice to have :
- C++ or KDB / q development experience
- Experience with Slurm, Airflow or middleware such as Kafka and AMPS