Full-time
Service Region: UCC

Company Description

We are a Digital Product Engineering company that is scaling in a big way! We build products, services, and experiences that inspire, excite, and delight. We work at scale, across all devices and digital mediums, and our people exist everywhere in the world (19,000+ experts across 33 countries, to be exact). Our work culture is dynamic and non-hierarchical. We are looking for great new colleagues. That is where you come in!

Job Description

Experienced L3 SRE engineer working on a business-critical SaaS application.
Capacity to provide L3 support across the full stack, including infra, backend, and front-end, before escalation to the engineering business unit.
Capacity to automate SRE tooling to provide proactive L3 support, in line with our technical monitoring strategy.
Capacity to work under business pressure on business-critical applications.
Capacity to communicate appropriately with L1, L2, Engineering, Product Managers, leadership, and end users during troubleshooting.
Experience with incident and problem management.
Experience with multitenant applications.
Solid understanding of networking concepts (TCP/IP, DNS, routing, etc.), including VPCs, subnets, firewalls, load balancing, TLS and SSL.
Experience with CI/CD pipelines (e.g., Jenkins, GitHub Actions) and version control.
Proficiency in Python and React/Next.js.
Experience with monitoring and logging to analyze and track resource utilization and application performance and to identify potential issues, using tools like Grafana, Prometheus, Loki, or ELK.
Experience with AWS, particularly EKS, serverless, queues, and various databases.

Qualifications

Must Have Skills: EKS, GitHub Actions, Python (Strong), Kubernetes (Expert), Prometheus.

Good to Have Skills:
Previous experience building a user-facing GenAI/LLM software application.
Security best practices in cloud environments, including AWS Managed Services (RDS, Batch, Lambda, Fargate, Step Functions, SQS/SNS, etc.).
FastAPI and Next.js experience.
Cloud security concepts (IAM, access control).