SRE Engineer at Ansira

This SRE will ensure the reliability, performance, and scalability of our MarTech SaaS platform that serves millions of users running thousands of marketing campaigns daily. They'll be responsible for monitoring systems, responding to incidents, and implementing automation to improve platform reliability.

About Ansira

Ansira is a leading marketing technology company dedicated to helping brands connect with customers and grow their businesses. Our platform integrates internal and external teams across channels, markets, and regions to deliver impactful brand-to-local growth strategies. At Ansira, we empower companies by optimizing marketing performance through AI-powered technology, growing partner ecosystems, cultivating brand loyalty, and ensuring profitable client growth. We serve a variety of industries, including financial services, retail, automotive, and technology.

About The Role

Join our growing organization as a Site Reliability Engineer and help ensure the reliability and performance of our SaaS platform that serves millions of users executing thousands of marketing campaigns every day. You'll be joining a lean, high-impact team where your work directly influences the experience of our customers and the success of their marketing efforts. This is a remote-first position where you'll play a crucial role in maintaining and improving the reliability, scalability, and performance of our mission-critical systems.

What You'll Do

- Monitor & Alert: Design, implement, and maintain comprehensive monitoring and alerting systems using tools such as Prometheus, Grafana, and Datadog to ensure early detection of issues and optimal system performance
- Incident Response: Lead incident response efforts, conduct root cause analyses, and implement preventive measures to reduce future occurrences
- Automation: Build and maintain automation tools and processes to reduce manual work, improve deployment reliability, and enhance system resilience
- Reliability Engineering: Identify and implement reliability improvements across our platform, working closely with development teams to embed best practices
- Capacity Planning: Monitor system performance trends and plan for scaling needs to support our growing user base and campaign volume
- Documentation: Create and maintain runbooks, procedures, and system documentation to support the team and improve knowledge sharing

What We're Looking For

Required

- 3+ years of hands-on experience in site reliability engineering, DevOps, or similar roles with a focus on monitoring and reliability improvements
- Strong knowledge of SRE best practices, including SLIs/SLOs, error budgets, and reliability engineering principles
- Cloud platform experience with services like Compute Engine, Kubernetes, Cloud SQL, and related infrastructure components
- Datadog or similar expertise for monitoring, alerting, and observability
- Backend development experience with Java, PHP, and/or Node.js to understand and troubleshoot application-level issues
- Incident management skills, including on-call experience, troubleshooting under pressure, and post-incident review processes
- Automation mindset with experience in scripting and Infrastructure as Code principles

Preferred

- SaaS platform experience, particularly in high-volume environments serving millions of users
- MarTech or AdTech industry background with an understanding of campaign management systems
- Experience scaling systems that handle thousands of concurrent operations
- CI/CD pipeline experience and deployment automation
- Knowledge of security best practices for cloud environments

What We Offer

- Remote-first culture with flexible working arrangements
- High-impact role in a small, collaborative team where your contributions directly matter
- Growth opportunities as we scale our platform and expand our engineering team
- Competitive compensation and benefits package
- Learning budget for professional development and certifications
- Modern tech stack with opportunities to work with cutting-edge solutions

Our Environment

You'll be working with systems that process millions of user interactions daily across thousands of active marketing campaigns. Our platform operates at significant scale, requiring robust monitoring, quick incident response, and continuous reliability improvements. As part of a small cross-functional team, you'll have the opportunity to make a substantial impact on both our technical infrastructure and our growing engineering culture.

Ready to Apply?

We're looking for someone who thrives in a fast-paced environment, enjoys solving complex technical challenges, and wants to help build reliable systems that power successful marketing campaigns for our customers.

Please submit your resume explaining:

- Your relevant SRE/reliability engineering experience
- Examples of monitoring and automation improvements you've implemented
- Why you're interested in joining a MarTech company

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Software Development