SWE922 CHIEF SOFTWARE OPERATIONS SPECIALIST

Bebeeobservability


Job Title: Senior Observability Engineer

About the Role:

This is a senior-level engineering position focused on observability: deploying and managing technology that improves enterprise application reliability and performance through advanced monitoring, logging, and tracing practices. The successful candidate will report directly to the Technical Lead of Observability Platforms and collaborate with application owners to implement observability solutions. They will provide our organization with relevant insights into the behavior of our critical applications, ensuring that issues are identified and resolved proactively, before they impact our customers.

This role requires a high level of technical expertise, as well as excellent collaboration and communication skills. The ideal candidate will have a strong background in software engineering, including experience with modern programming languages and frameworks such as C#, Java, JavaScript, or NodeJS. They will also have experience with distributed systems, microservices architectures, cloud computing platforms such as AWS and Azure, and Linux/Windows administration. A bachelor's degree or equivalent years of relevant work experience is required, along with 5+ years of overall technical experience. A strong understanding of Agile/Scrum development methodologies is also essential.

Responsibilities:

- Monitor System Integration: Administer and integrate monitoring systems into our BigPanda Monitor of Monitors (MOM), providing a comprehensive 'single pane of glass' for our Virtual Integrated Command Center operators.
- Advanced Observability Solutions: Implement and administer observability tools, including Dynatrace, to ensure comprehensive visibility into system performance.
- Performance Analysis and Optimization: Analyze system performance metrics, identify optimization opportunities, and collaborate with development teams to implement performance improvements, ensuring the scalability of our systems.
- Incident Response Expertise: Participate in incident response activities, applying your expertise to quickly identify issues. Conduct post-incident reviews to identify causes and recommend preventive measures.
- Knowledge Sharing and Documentation: Develop best practices for observability and share your knowledge with other team members through workshops and detailed documentation, promoting a culture of continuous improvement.

Requirements:

- Bachelor's degree or equivalent years of relevant work experience.
- Typically requires 5+ years of overall relevant technical experience.
- 3+ years of experience with Dynatrace implementation and administration.
- 1+ year of experience with other APM tools (e.g., New Relic, AppDynamics, Zabbix, Prometheus, SolarWinds, Avantra, or LogicMonitor).
- 3+ years of experience in enterprise software engineering with modern programming languages and frameworks such as C#, Java, JavaScript, or NodeJS.
- Experience with distributed systems, microservices architectures, and cloud computing platforms (e.g., AWS, Azure).
- Windows and Linux administration experience.
- 1+ year of experience working in an Agile/Scrum environment.
- Experience with containerization technologies such as Docker and Kubernetes.
- Experience with Terraform.
- Experience with Ansible.
- Experience with ServiceNow integration.

What We Offer:

We offer a comprehensive benefits package, including mindfulness programs, volunteer paid time off, a company volunteer and donation matching program, an employee assistance program, personalized wellbeing programs, an on-demand digital course library, and local benefits.
Under our hybrid policy, employees work on site at our Rockwell location at least on Mondays, Tuesdays, and Thursdays, unless they have a business obligation out of the office.
