Talent.com

Java architect Jobs in Beaconsfield, QC

Java architect • beaconsfield qc

Last updated: 26 days ago

Resiliency Architect

Intact Financial Corporation • Bourassa, Robert, Montréal
CA$149,600.00 yearly
Full-time

Product Cybersecurity Architect

Randstad Canada • Saint-Eustache, Quebec, CA
Temporary

Our client, an international pioneer in sustainable public transit solutions and heavy vehicle manufacturing, is seeking a talented Product Cybersecurity Architect. In this strategic role, you will ...

Resiliency Architect

Intact Financial Corporation • Bourassa, Robert, Montréal
30+ days ago
Salary: CA$149,600.00 yearly
Job type: Full-time
Job description

Pay at Intact is about much more than just salary.

  • Flexible work arrangements and a hybrid work model

  • Possibility to purchase up to 5 extra days off per year

  • Multiple benefits offered to support physical and mental wellbeing, including telemedicine, Wellness account and much more

  • Share plan & other savings: up to 12% of salary or even more (ask how you could earn guaranteed income for life)

Salary range (but not limited to):

149,600 - 182,800

Annual bonus target, based on the base salary, with a potential payout of up to double the target (subject to personal and company performance):

15%

As part of our commitment to Win As A Team, we share our success with employees through our annual bonus plan and Employee Share Purchase Plan (ESPP) – with Intact matching 50% of your net shares.

Our pension offerings provide flexibility and long-term security for our employees beyond their careers. We are one of the few companies offering the opportunity to receive guaranteed income for life via our defined benefit pension plan.

Salary for the candidate will be determined taking into consideration a number of factors including: experience, skills, qualifications, anticipated contribution to role, internal equity, etc. The salary range presented above is based on a 35-hour workweek and would represent a majority of different candidate profiles. However, we encourage candidates who may fall outside of this range to apply as well.


About the role

We are seeking a Resiliency Architect to define and drive our end-to-end resiliency architecture and production reliability posture across Azure, AWS, Google Cloud, and on‑prem environments.

This person will be responsible for design standards, production readiness, and enforcement mechanisms at enterprise scale.

The ideal candidate combines deep SRE expertise with advanced systems architecture and a strong vision for explicit blue/green and chaos engineering practices, alongside AI/GenAI. The goal is to make systems reliable, leverage AI as a force multiplier for resiliency, transform team workflows, and deliver resilient, intelligent user solutions.

What you'll do here:

Core objectives:

  • Establish the enterprise resiliency architecture, patterns, and production guardrails for all critical platforms and services.

  • Govern design quality through rigorous architecture reviews and production readiness assessments.

  • Make blue/green deployments and chaos engineering first-class, codified practices across the estate: design, tooling, automation, and continuous validation.

  • Integrate AI/GenAI into reliability engineering: robust AI system architectures, AI-assisted observability, causal detection, and autonomous remediation.

  • Lead the evolution of disaster recovery, ransomware protection, and continuity strategies grounded in hard SLAs/SLOs and measurable business outcomes.

Key responsibilities:

  • Own the resiliency reference architecture for multi-cloud/hybrid (multi-region/zone, active-active/passive, blast-radius reduction) and define/enforce NFRs (availability, latency, durability, RTO/RPO).

  • Establish governance via design reviews, production gates, policy-as-code, scorecards, and automated controls integrated with CI/CD, IaC, and runtime platforms.

  • Standardize blue/green deployment architecture and engineer safe traffic shifting, health gates, progressive cutovers, rollback, and zero-downtime data migrations.

  • Lead an enterprise chaos engineering program (experiments, failure injection, game days) and feed outcomes back into architecture guardrails and SLO improvements.

  • Define production readiness standards (capacity/saturation, graceful degradation, retries/backoff, circuit breakers, rate limiting) and codify runbooks, dependency maps, and failover topologies validated via DR drills and rehearsals.

  • Drive observability and SRE practices: OpenTelemetry adoption, distributed tracing, SLIs/SLOs/SLAs, error budgets, and executive reliability dashboards.

  • Architect DR and cyber-resilience (immutable/air-gapped backups, PITR, ransomware-resistant segmentation, recovery validation) aligned to regulatory and audit needs.

  • Guide platform and data resiliency across Kubernetes/service mesh, replication/consensus, geo-distribution, and event streaming (DLQs, backpressure, reprocessing).

  • Enable reliable AI/GenAI systems and AI-driven operations (monitoring/guardrails, anomaly detection, predictive modeling, human-in-the-loop remediation, ops copilots).

  • Serve as principal resilience authority: mentor teams, lead councils/forums, and communicate tradeoffs clearly to executives and engineers.

What you bring to the table:

  • 10+ years in SRE/Platform/Infrastructure/Systems Architecture with proven large-scale, production-critical experience across Azure, AWS, GCP, and on‑prem.

  • Multi‑region traffic management, global load balancing, DNS/BGP, TLS/mTLS, CDN/edge patterns.

  • Kubernetes ecosystems (AKS/EKS/GKE), service meshes (Istio/Linkerd), autoscaling strategies, readiness/liveness, topology constraints.

  • Observability stacks: OpenTelemetry, Prometheus/Grafana, Jaeger/Tempo, ELK/OpenSearch, commercial APM; correlation and topology modeling.

  • Data resilience: consensus/replication (Raft/Paxos), partitioning, PITR, snapshots, CDC; caches (Redis), databases (Aurora, Cosmos DB, Spanner).

  • IaC and automation: Terraform/Pulumi, GitOps (Argo CD/Flux), policy‑as‑code (OPA), CI/CD patterns (blue/green, canary, progressive delivery).

  • Chaos engineering, DR orchestration, and automated failover at enterprise scale.

  • For candidates located in Quebec, bilingualism is required because the role involves regular interaction with English-speaking colleagues across the country.

  • No Canadian work experience is required; however, candidates must be eligible to work in Canada.

AI/GenAI competencies:

  • Architecting reliable AI systems: model serving (Ray/SageMaker/Vertex), vector stores (Pinecone/FAISS/pgvector), retrieval pipelines, guardrails and safety.

  • MLOps: model monitoring (drift, performance, hallucination detection), feature pipelines, lineage/observability, prompt/content governance.

  • Applying AI to operations: causal detection, predictive resiliency, autonomous remediation frameworks.

  • Strong software engineering skills (Go/Python/TypeScript) and systems thinking; excellent communication (written, visual, verbal) and executive presence.

#LI-Hybrid

This is a new role within our growing team.