We are hiring an accomplished Senior Data Software Engineer to strengthen our team building a secure, cutting-edge document management solution hosted on AWS. As a key team member, you will collaborate with highly skilled professionals to refine an end-to-end information lifecycle platform built on technologies such as AWS Glue, Amazon Athena, and Apache Spark. Your work will focus on optimizing the scalability, performance, and reliability of a modern system that streamlines digital document management for a global audience.

Responsibilities
- Spearhead the design, development, and optimization of complex data pipelines and workflows using AWS Glue and related technologies
- Architect scalable, efficient, and secure data models in Athena and S3 to meet advanced reporting and analytics needs
- Develop and maintain ETL processes with Apache Spark to handle large-scale, high-volume data workloads
- Collaborate with BI analysts, architects, and cross-functional teams to enhance data-driven workflows for Business Intelligence and analytics
- Establish best practices that optimize cost, performance, and security for cloud-based solutions built on fully managed AWS services
- Design CI/CD pipelines to automate deployment, improve scalability, and boost productivity in data workflows
- Monitor key performance metrics with modern observability tools to ensure high reliability and cost efficiency
- Support reporting dashboard development by delivering accurate, well-structured, and timely data models
- Produce high-quality, enterprise-grade code by enforcing best practices for testing, versioning, and documentation
- Diagnose and resolve complex issues in data workflows to maintain consistent system reliability and uptime
- Mentor junior team members, promoting growth, teamwork, and technical excellence

Requirements
- 3+ years of experience in data engineering or software development
- Proficiency in AWS Glue, Amazon Athena, and core AWS services including S3, Lambda, and CloudFormation
- Expertise in Apache Spark and experience building large-scale, high-performance data processing systems
- Knowledge of BI process analysis and the ability to collaborate with analytics teams to optimize data workflow implementations
- Strong SQL skills, with demonstrated experience writing and tuning complex queries for data manipulation and analytics
- Understanding of data lake architecture, ETL pipeline design, and modern data storage principles
- Competency in designing and managing CI/CD pipelines that integrate data workflows with deployment frameworks
- Familiarity with tools such as Amazon Kinesis, Apache Hive, or Elastic Kubernetes Service to enhance data processing capabilities
- Ability to design secure, efficient, and reliable solutions that follow best practices for cloud-first architectures
- Excellent written and verbal communication skills in English (B2+ level)

Nice to have
- Proficiency in Amazon Elastic Kubernetes Service (EKS) for managing containerized applications at scale
- Understanding of Amazon Kinesis for real-time data stream and event processing
- Skills in Apache Hive for building and maintaining efficient, high-performing data warehouses
- Demonstrated expertise in improving workflows and efficiency across Business Intelligence platforms
- Experience with programming languages such as Java, Scala, or Node.js for extending data processing solutions

We offer/Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn