Machine Learning Operations Engineer

AgileEngine is a leader in application development and AI/ML, with a people-first culture recognized by multiple Best Place to Work awards. We are looking for a talented Machine Learning Operations Engineer to join us on our mission to drive innovation and growth.

Key Responsibilities:
- Build and maintain scalable machine learning infrastructure on Databricks, leveraging Unity Catalog and feature stores for model development and deployment.
- Design and implement frameworks for detecting data and model drift, ensuring continuous monitoring and high reliability of machine learning models in production.
- Develop calibration frameworks and establish versioning practices for transparency and reproducibility across the machine learning lifecycle.
- Design and optimize reinforcement learning orchestration pipelines, including Contextual Bandits, for real-time, low-latency environments.
- Create frameworks for training, retraining, and validating machine learning models to enable efficient experimentation and deployment.
- Implement best practices for CI/CD to streamline deployment and monitoring of machine learning models, integrating with Databricks workflows and Git systems.
- Collaborate closely with machine learning scientists to ship, deploy, and maintain models.

Requirements:
- 3+ years of experience in MLOps, ML Engineering, Data Engineering, or related roles focusing on machine learning workflows in production.
- 5+ years of experience with Python.
- Proficiency with Databricks (2-3 years), Apache Spark, MLflow, Unity Catalog, and feature stores.
- Familiarity with machine learning lifecycle tools such as MLflow, Kubeflow, and Airflow.
- Strong knowledge of Git workflows, CI/CD practices, and tools like GitLab.
- Understanding of model performance monitoring, drift detection, and retraining workflows.