Our client, Netskrt.io, is looking for a Director of Live Operations & Systems Reliability to oversee its managed service. Netskrt’s eCDN service comprises three major components: intelligent content collection, staging and distribution; adaptive networking, leveraging connectivity as and when available; and an edge cache that allows users to access the content they want locally, using the apps and subscriptions that they already have.
Your prime responsibility and priority is to ensure customer excellence. You are passionate about system reliability and able to influence and drive the strategic Systems Reliability Engineering mission. As the leader of Live Operations / Systems Reliability, you are responsible for monitoring and maintaining the health of the system.
We are a highly motivated team, dedicated to delivering products and services that improve the customer experience when accessing internet video at the edges of the network. You are somebody who enjoys solving problems and has a customer-centric mindset. You should be passionate not only about learning new technologies, but also about running systems and software in the real world. You must enjoy a close-knit team environment of shared responsibility, and be a team player and a self-starter. You have exceptional technical skills and enjoy solving challenging problems. You are a quick learner, you adapt easily, and you have great interpersonal and communication skills.
Netskrt offers the opportunity to obtain hands-on experience with storage, networking, security, and cloud technologies. As part of the Netskrt team, you will have the opportunity to design and implement solutions to challenging problems in a startup environment, working with accomplished engineers and a leadership team with a proven track record of success.
Key Responsibilities:
- Monitor, manage and maintain Netskrt’s managed service
- Manage availability, latency, scalability and efficiency by instilling engineering reliability into our deployed systems, with a focus on fault-tolerant approaches
- Drive quality accountability within the organization with well-defined processes, metrics, and goals for process quality. This includes leading effective postmortems and ensuring actions are followed up
- Drive capacity planning, performance analysis, instrumentation and other nonfunctional systems requirements
- Define and report progress on strategic initiatives and project-level tasks to all stakeholders, including senior executives and clients, using effective communication approaches with each constituency
- Implement metrics-driven processes to ensure service quality targets are met
- Engage, influence, and evangelize SRE practices with development, operational and product groups to align technology service / solution delivery.
Required Qualifications, Skills, Experience:
- Degree in Computer Science or related technical field
- Accomplished leader with 5+ years managing regional and global teams and systems
- Expert knowledge in all aspects of designing, developing, and managing large real-time systems
- Project and process management
- Prior successful experience as a systems performance or systems reliability engineer
- Mastery of Linux / Unix
- Mastery of coding / scripting languages (e.g., C++, PHP, Python, Perl)
- Mastery of fault-tolerant approaches in a large-scale distributed environment and high-performance systems
- Demonstrated experience working in large, complex systems environments
- Deep understanding of internet and networking protocols
- Analytical mind with excellent problem-solving skills
- Excellent time management, communication, decision-making, presentation, leadership and organizational skills
- Ability to lead across functions and motivate a matrix staff
Desired Qualifications:
- Proven leader of technology solutions in a high-volume transaction environment
- Maintain excellent written and verbal communications with clients, employees, and management chain, including status reports, project plans, presentations, etc.
- Familiarity with security frameworks and risk management methodologies
- Knowledge of patch management, intrusion detection / prevention systems
- Cloud computing and cloud technologies (AWS, OpenStack)
- Experience with caching and CDN (content delivery network) technologies (Netflix, Amazon, Google, Limelight, Akamai, Fastly)
- Knowledge of data protection operations and legislation (e.g. GDPR)
- Experience with securing IoT and / or autonomous remote devices
Any questions about the company or to apply: [email protected] or [email protected]