Job Description

Role: SRE Cloud Engineer
Location: Remote
Duration: Long Term

Roles & Responsibilities

• Proficiency in one or more major cloud platforms, such as Azure (preferred), Google Cloud, AWS, or others
• Experience with modularized IaC (Infrastructure as Code) and industry tooling, e.g. Terraform/OpenTofu, Ansible, Packer
• Configuration management and secrets management (Consul, Vault, KMS systems, Consul Template)
• Operational concerns: disaster recovery, disaster resilience, monitoring, alerting
• ITSM: change, knowledge, and incident management
o Containerization technologies: Docker/ContainerD, Kubernetes (K8S, K3S), container registries and artifact tools (e.g. Artifactory, GCR, ACR, ECR)
o Administration of Kubernetes clusters
o Understanding of microservices principles and best practices for designing scalable, modular architectures, including API Gateway technologies
o Understanding of event-driven architecture principles, best practices, and tooling (e.g. Pub/Sub, Kafka)
o Familiarity with serverless computing services and functions (e.g. Azure Functions, AWS Lambda)
o Proficiency in cloud networking concepts, including VPCs, subnets, routing, load balancing, firewalling (including WAF), and security groups
o Strong knowledge of cloud security best practices, including identity and access management (IAM), encryption, security group configuration, and SAST & DAST tooling (e.g. CrowdStrike, Qwiet.AI, Prisma Cloud)
o Experience with cloud monitoring and logging tools such as AWS CloudWatch, EFK/ELK Stack, OpenTelemetry, Dynatrace, Datadog, New Relic
o Knowledge of diagramming tools and methodologies (e.g. Miro, Diagrams.net, Figjam, Mermaid.JS)
o Educated to degree level in Computer Science or equivalent

Responsibilities

Help with the Cloud Architectural Artifacts
• Networking, firewalling, routing
• Implementation-level architecture for scalable IaC strategies (e.g. modularization)
• Platform vs. service level
• Contribute to and maintain our org-wide Architecture Decision Records

Enable our Cloud Policies, Procedures and Standards
• Disaster resiliency, recovery, and availability planning
• IaC tooling, use, and quality gating
• Cloud technology stack/services consultancy and selection in accordance with technical and quality requirements (scalability, performance, security, compliance)

Support our Cloud Operational Practices
• Drive Ecommerce and Partstown-wide projects
• Cloud infrastructure TCO modeling and optimization
• Capacity planning through continuous monitoring of resource utilization and demand trends
• Manage relationships with cloud service providers/vendors to ensure smooth operations and support
• Lead and ensure compliance with organizational security practices and auditing
• Ensure maximal uptime of our infrastructure and services through monitoring, alerting, golden signals, and on-call support

Drive best-in-class cloud strategies
• Knowledge transfer, training, and mentoring of cloud practices across the engineering org