About the position
Quantix is seeking an MLOps Engineer to own the operational backbone of our machine learning ecosystem. You will design automated pipelines, manage model deployments, and ensure that our predictive systems run efficiently, reliably, and at scale. This role bridges data science and engineering, turning research models into production-grade services used across institutional asset management and on-chain intelligence.
Job responsibilities
- Build and maintain automated pipelines for model training, testing, deployment, and monitoring.
- Develop CI/CD workflows specifically tailored for machine learning and data-driven systems.
- Manage model serving environments, ensuring high uptime, low latency, and production reliability.
- Implement tools for experiment tracking, version control, dataset management, and reproducibility.
- Optimize cloud and compute resources to support large-scale ML workloads efficiently.
- Collaborate with ML researchers and infrastructure engineers to operationalize new models.
- Monitor model health, detect drift, and manage automated retraining workflows.
- Create internal tooling that streamlines development, deployment, and continuous improvement of ML systems.
- Ensure proper observability through logging, alerting, dashboards, and performance analytics.
Required skills
- Strong experience with Python and ML ecosystem tooling.
- Hands-on experience with MLOps frameworks such as MLflow, Kubeflow, Airflow, Weights & Biases, or similar.
- Solid understanding of CI/CD pipelines, DevOps practices, and automated deployment workflows.
- Familiarity with containerization (Docker), orchestration (Kubernetes), and distributed systems.
- Experience deploying and monitoring production ML models or data pipelines.
- Ability to optimize model performance, inference speed, and system reliability.
- Familiarity with cloud environments (AWS, GCP, Azure) and scalable ML infrastructure.
- Strong debugging, system analysis, and performance optimization skills.
- Bonus: experience with GPU workloads, tensor/inference runtimes, or real-time model refresh cycles.