Site Reliability Engineer

Dapper Labs

Vancouver, British Columbia

$110K-$140K a year (estimated)

Full-time

We’re looking for a Site Reliability Engineer who wants to be at the technical core of an organization that’s completely reshaping how distributed applications on blockchains can reach massive audiences.

You will join a Site Reliability Engineering team that has the ability to architect, build, and iterate on resilient, scalable systems.

SRE also guides the organization in areas of Observability, Reliability, and Incident Response. The support we provide the other engineering teams enables them to deliver features that wow and delight our customers at a fast pace.

In the role, you can expect to help us launch reliable products and services with your experience and skills. You’ll join an established team with a focus on providing highly technical support to the rest of the Engineering organization.

You will be leveraging infrastructure-as-code, submitting code changes via Pull Requests, and finding creative solutions for the unique and varying needs of each Engineering team.

You’ll contribute to the improvement of our in-house systems by researching and applying the latest and greatest technology to our stack.

You’ll be empowered to fully apply your experience, lessons learned, and technical abilities in an environment with little tech debt, no on-prem servers, and a strong foundation based on cloud-native technologies such as Kubernetes and industry-leading cloud platforms.

Every day, you’ll collaborate with a world-class team both in our Vancouver office and distributed worldwide.

What we’ll accomplish together:

Develop effective infrastructure (cloud platform services, networking, Kubernetes, etc.) for our projects to deploy onto, ensuring projects are scalable, resilient, and reliable in support of growing products.
Build shared observability services, including metrics, logs, tracing, and dashboarding, and act as a center of excellence that partners with other teams to define SLOs and actionable error budgets for everyone’s services.
Respond to infrastructure incidents and support the larger Engineering team with their product incident response strategy.
Perform post-mortems and in-depth root cause analysis to ensure we are always improving.
Enhance tools and automation to fill the gaps in our current systems as well as build entirely new ones as we face bigger and more complex challenges.

A little about you:

You execute on defined projects to achieve team-level goals and independently define the right solutions or use existing approaches to solve defined problems.
You understand OS, networking, Kubernetes, and other cloud-native services and can debug system issues and identify system bottlenecks.
You have experience working with Infrastructure as Code systems like Terraform, Pulumi, or CloudFormation.
You have experience collecting and processing metrics from tools such as Prometheus, Datadog, or New Relic and are familiar with the concepts of SLIs and SLO targets.
You are comfortable responding to production incidents and can fight fires with a calm and level head, leveraging post-mortems to apply lessons learned.
You have experience coding and developing applications. Bonus points for Go experience.
You are comfortable diving into an unfamiliar system and finding your way around.
While you believe in processes and the power of planning, you understand that you will often have to roll with the punches and prioritize the most impactful tasks on the fly.
You have a strong ability to collaborate with cross-functional teams and build solid working relationships with everyone in the organization, from individual contributors to the CEO.
You have experience building and working on deployment systems.
You have self-awareness about your strengths and areas for development.

At Dapper Labs, we're looking for people who are passionate about what they do. You're encouraged to apply even if your experience doesn't precisely match the job description!

14 days ago
