Site Reliability Engineer

SGS
ON, Canada
Remote
Full-time

Job Description

The Site Reliability Engineer will play a critical part in ensuring the reliability, supportability, scalability, and performance of our .NET stack applications built with ASP.NET MVC, Angular, and Web API.

  • Partner with developers and product operations teams to understand application requirements and translate them into operational practices.
  • Design, implement, and maintain infrastructure automation tools using Infrastructure as Code (IaC) methodologies.
  • Monitor application health and performance metrics, proactively identifying and resolving potential issues.
  • Implement incident response procedures to ensure timely resolution of outages and service disruptions.
  • Establish and improve best practices for product solution design, architecture, and development.
  • Participate in peer and team code reviews, and develop comprehensive coding standards and guidelines to ensure consistency, maintainability, and quality in software development. Clear protocols for code formatting, naming conventions, error handling, testing, and documentation enhance code readability, reduce defects, and facilitate knowledge sharing among team members.

  • Collaborate with engineers to develop and implement disaster recovery plans.
  • Continuously improve monitoring and alerting processes to ensure efficient problem identification and resolution.
  • Stay up-to-date on the latest advancements in .NET infrastructure and SRE best practices.

Qualifications

  • Bachelor's degree required
  • Minimum 3 years of experience in a related technical role (e.g., Systems Administrator, Network Engineer) required
  • Experience with configuration management tools like Ansible, Puppet, or Chef preferred
  • Azure experience required
  • Familiarity with monitoring and alerting tools (.NET performance counters, Azure Application Insights, Prometheus, Grafana) preferred
  • Ability to manage and coordinate multiple projects in a fast-paced, highly professional environment
  • While coding proficiency is not required, a strong understanding of the .NET ecosystem and a desire to delve into infrastructure and automation will be essential for success.
  • Strong understanding of system administration principles, including operating systems (Windows Server preferred) and networking concepts.
  • Ability to work independently and as part of a team

Additional Information

SGS is an Equal Opportunity Employer, and as such we recruit, hire, train, and promote persons in all job classifications without regard to race, color, religion, sex, national origin, disability, age, marital status, sexual orientation, gender identity or expression and Indigenous status, or any other characteristics protected by law.

To perform this job successfully, an individual must be able to perform each essential duty satisfactorily with or without reasonable accommodations.

The requirements listed above are representative of the knowledge, skills, and/or abilities required.

This job description should not be construed as an exhaustive statement of duties, responsibilities, or requirements, but a general description of the job.

Nothing contained herein restricts the company's rights to assign or reassign duties and responsibilities to this job at any time.

Accommodations are available on request for qualified candidates during each stage of the recruitment process.

Please note that candidates applying for Canadian job openings should be authorized to work in Canada.
