Job Description

AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards. If you're looking for a place to grow, make an impact, and work with people who care, we'd love to meet you!

WHAT YOU WILL DO
- ML Infrastructure Support and Development: Build and maintain scalable ML infrastructure on Databricks, leveraging Unity Catalog and feature stores to support model development and deployment;
- Drift Detection Frameworks: Design and implement frameworks for detecting data and model drift, ensuring continuous monitoring and high reliability of ML models in production;
- Model Calibration & Versioning: Develop model calibration frameworks and establish versioning practices to maintain transparency and reproducibility across the ML lifecycle;
- Low-Latency Orchestration: Design and optimize reinforcement learning (RL) orchestration pipelines, including Contextual Bandits, for real-time execution in low-latency environments;
- Automated Training Pipelines: Create automated frameworks for training, retraining, and validating ML models, enabling efficient experimentation and deployment;
- CI/CD for ML: Implement CI/CD best practices to streamline the deployment and monitoring of ML models, integrating with Databricks workflows and Git-based version control systems;
- Collaboration: Work closely with ML Scientists to ship, deploy, and maintain models;
- Monitoring & Optimization: Build tools for model performance monitoring, operational analytics, and drift mitigation, ensuring reliable operation in production environments.

MUST HAVES
- 3+ years in MLOps, ML Engineering, Data Engineering, or related roles, focusing on deploying and managing ML workflows in production environments;
- 5+ years of experience using Python;
- Proficient in using Databricks (2-3 years), Apache Spark, MLflow, Unity Catalog, and feature stores;
- Familiarity with ML lifecycle tools such as MLflow, Kubeflow, and Airflow;
- Strong knowledge of Git workflows, CI/CD practices, and tools like GitLab or similar;
- Strong understanding of model performance monitoring, drift detection, and retraining workflows;
- Upper-Intermediate English level.

THE BENEFITS OF JOINING US
- Professional growth: Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.
- Competitive compensation: We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities.
- A selection of exciting projects: Join projects that involve modern solutions development and top-tier clients, including Fortune 500 enterprises and leading product brands.
- Flextime: Tailor your schedule for an optimal work-life balance, with the option of working from home or going to the office, whichever makes you happiest and most productive.

Your application doesn't end here! To unlock the next steps, check your email and complete your registration on our Applicant Site. An incomplete registration will end your application process.