Senior Site Reliability Engineer

Unreal Gigs
CA
Remote
Full-time

We are seeking an experienced Site Reliability Engineer (SRE) who is passionate about leveraging data and automation to optimize a highly dynamic infrastructure.

This role entails managing infrastructure and internal tooling to streamline operations and ensure a seamless customer experience.

As a member of our team, you will be instrumental in scaling our infrastructure, automating tasks to reduce manual effort, and fostering a culture of innovation and continuous improvement.

What You’ll Do

  • By 30 Days:
      • Utilize your expertise in observability to enhance our existing tools and scale our platform to accommodate growth.
      • Improve automation processes to streamline infrastructure scaling and enhance the development experience.
  • By 90 Days:
      • Contribute to diversifying and scaling our platform across additional regions to meet growing demands.
      • Assess options for upgrading our real-time data pipeline to support enhanced multi-regional capabilities.
      • Provide platform support to engineering teams, leveraging data insights to drive decision-making.
  • By 1 Year:
      • Collaborate with engineering to redefine observability standards for the Fathom platform and implement improvements to minimize friction.
      • Participate in designing and implementing enhancements to our elastic multi-regional storage platform.
      • Lead initiatives to enhance platform reliability and efficiency.

Requirements

Hard Skills:

  • Proficiency with Infrastructure as Code (IaC) and GitOps tools.
  • Strong foundation in Observability best practices and implementation.
  • Experience working in a Software as a Service (SaaS) or Platform as a Service (PaaS) environment.
  • Familiarity with our tech stack, including Google Compute, Kubernetes, Message Queues, Prometheus, ClickHouse, ArgoCD, and GitHub Actions.

Knowledge of Golang is a plus, and familiarity with Ruby/Rails is a bonus.

Soft Skills:

  • Curiosity-driven with a focus on delivering tangible results.
  • Ability to tackle a wide range of challenges with a generalist mindset.
  • Resilience and determination to solve complex problems.
  • Openness to diverse perspectives and a commitment to decisions once made.
  • Strong collaboration skills, with the ability to communicate complex insights effectively.
  • Independence in managing workload and priorities effectively.

Benefits

What You'll Get

  • The opportunity to shape the dynamic platform of a rapidly growing company.
  • A role that encompasses infrastructure scaling, development team support, and internal tooling development.
  • Collaboration with a dynamic and supportive team.
  • A supportive environment that fosters innovation, creativity, and personal growth.
  • Competitive compensation and benefits package, including:
      • Comprehensive health, dental, and vision insurance plans.
      • Flexible spending accounts for medical expenses.
      • Retirement savings plans with employer matching contributions.
      • Generous vacation and paid time off policies to support work-life balance.
      • Professional development stipend for ongoing learning and skill enhancement.
      • Wellness programs and resources, including gym memberships or wellness app subscriptions.
      • Employee assistance programs for mental health support and counseling.
      • Opportunities for remote work and flexible scheduling.
      • Company-sponsored events, team outings, and social activities to foster camaraderie and collaboration.

Join Us

If you're passionate about driving the data journey at Fathom and contributing your analytical expertise to our mission, we invite you to apply.

Join us and become a key player in our data-driven success story. Apply now!

4 days ago