Talent.com
Distributed Software Engineer
No longer accepting applications
Cerebras Systems Inc. • Toronto, Canada
17 days ago
Job type
  • Full-time
Job description

About The Role

Cerebras Systems is a pioneer in large-scale AI supercomputers. These multi-exaflop supercomputers are deployed in some of the biggest datacenters and are built using our Wafer-Scale Cluster technology, a cluster of several Wafer Scale Engine (WSE) chips. The Cluster engineering team is responsible for delivering all cluster-related software.

Responsibilities

Automate bare‑metal configuration of networking, OS, and application software in large clusters of Cerebras WSE, servers, and switches.

Additional push-button workflows for cluster upgrades, downgrades, and security patching, with key metrics to minimize downtime on clusters.

An orchestration and scheduler system for resource allocation, job submission, and job placement in a multi-user environment on a cluster.

Seamless support for both on‑premise and cloud mode deployment and operations.

A robust system for monitoring, detecting and handling failures for a variety of resources on the clusters (including High Availability of clusters).

Broad cluster and job monitoring and visualization capabilities, along with alerting systems.

User facing tools to monitor the status of jobs and collect metrics.

Administrator facing tools to manage and operate large clusters.

Skills & Qualifications

Strong track record of software architecture, system design and development.

Strong track record of development for distributed clusters.

Strong understanding of the Kubernetes (K8s) software ecosystem, Prometheus, and Grafana.

Strong debugging skills with distributed systems.

Strong skills in developing tests for new features and regression-testing existing features.

Why Join Cerebras

People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:

Build a breakthrough AI platform beyond the constraints of the GPU.

Publish and open source their cutting‑edge AI research.

Work on one of the fastest AI supercomputers in the world.

Enjoy job stability with startup vitality.

Our simple, non‑corporate work culture that respects individual beliefs.

Apply today and join the forefront of groundbreaking advancements in AI!

Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.

