ML OPS ENGINEER ID38029 – 2,500 SIGN-ON BONUS - GLW80

Ingepsy


AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards. If you're looking for a place to grow, make an impact, and work with people who care, we'd love to meet you.

Key Responsibilities:
- Build and maintain scalable ML infrastructure on Databricks, leveraging Unity Catalog and feature stores to support model development and deployment;
- Design and implement frameworks for detecting data and model drift, ensuring continuous monitoring and high reliability of ML models in production;
- Develop model calibration frameworks and establish versioning practices to maintain transparency and reproducibility across the ML lifecycle;
- Design and optimize reinforcement learning (RL) orchestration pipelines, including Contextual Bandits, for real-time execution in low-latency environments;
- Create automated frameworks for training, retraining, and validating ML models, enabling efficient experimentation and deployment;
- Implement CI/CD best practices to streamline the deployment and monitoring of ML models, integrating with Databricks workflows and Git-based version control systems;
- Work closely with ML Scientists to ship, deploy, and maintain models;
- Build tools for model performance monitoring, operational analytics, and drift mitigation, ensuring reliable operation in production environments.
Requirements:
- 3+ years in MLOps, ML Engineering, Data Engineering, or related roles, focusing on deploying and managing ML workflows in production environments;
- 5+ years of experience using Python;
- Proficiency in Databricks (2-3 years), Apache Spark, MLflow, Unity Catalog, and feature stores;
- Familiarity with ML lifecycle tools such as MLflow, Kubeflow, and Airflow;
- Strong knowledge of Git workflows, CI/CD practices, and tools like GitLab or similar;
- Strong understanding of model performance monitoring, drift detection, and retraining workflows.
