Job Summary

We are seeking a seasoned Data Engineer to join our team. As a key member of our data engineering team, you will play a pivotal role in designing, developing, and maintaining robust data solutions in on-premises environments. You will work closely with internal teams and client stakeholders to build and optimize data pipelines and analytical tools using Python, Scala, SQL, Spark, and Hadoop ecosystem technologies. This role requires deep hands-on experience with big data technologies in traditional data centre environments (non-cloud).

Responsibilities

- Design, build, and maintain on-prem data pipelines to ingest, process, and transform large volumes of data from multiple sources into data warehouses and data lakes
- Develop and optimize Spark (Scala) and SQL jobs for high-performance batch and real-time data processing
- Ensure the scalability, reliability, and performance of data infrastructure in an on-prem setup
- Collaborate with data scientists, analysts, and business teams to translate their data requirements into technical solutions
- Troubleshoot and resolve issues in data pipelines and data processing workflows
- Monitor, tune, and improve Hadoop clusters and data jobs for cost and resource efficiency
- Stay current with on-prem big data technology trends and suggest enhancements to improve data engineering capabilities

Requirements

- Bachelor's degree in software engineering or a related field
- 5+ years of experience in data engineering or a related domain
- Strong programming skills in Python or Scala
- Expertise in SQL with a solid understanding of data warehousing concepts
- Hands-on experience with Hadoop ecosystem components (e.g., HDFS, Hive) and open table formats such as Apache Hudi, Iceberg, and Delta Lake
- Proven ability to design and manage data solutions in on-prem environments (no cloud dependency)
- Experience integrating third-party data from a variety of sources, including APIs
- Proficiency in Airflow or a similar orchestration tool
- Strong problem-solving skills with the ability to work independently and collaboratively
- Excellent communication skills and the ability to engage with technical and non-technical stakeholders

Desirable Qualifications

- Master's degree in data science or a related field
- Knowledge of Google and Facebook APIs, and experience accessing data via S3 buckets and SFTP servers
- Prompt engineering skills with a basic understanding of GenAI

Language Skills

You will need excellent written and verbal English for clear and effective communication with the team.

Benefits and Perks

- Certifications in AWS (we are AWS Partners), Databricks, and Snowflake.
- Access to AI learning paths to stay up to date with the latest technologies.
- Study plans, courses, and additional certifications tailored to your role.
- Access to Udemy Business, offering thousands of courses to boost your technical and soft skills.
- English lessons to support your professional communication.

Mentoring and Development

- Career development plans and mentorship programs to help shape your path.

Celebrations & Support

- Special day rewards to celebrate birthdays, work anniversaries, and other personal milestones.
- Company-provided equipment.

Flexible Working Options

We offer flexible working options to help you strike the right balance between work and personal life.