About the position
Quantix is seeking a Data Engineer to architect and maintain the data infrastructure that powers our machine learning models, analytics engines, and intelligence systems. You will design real-time pipelines, optimize data flows, and ensure the availability, quality, and performance of large market and on-chain datasets. This role is essential to building a reliable foundation for AI-driven insights across institutional finance.
Job responsibilities
- Build and maintain scalable data pipelines for ingestion, transformation, and storage of large multi-source datasets.
- Design workflows for real-time and batch data processing that give models fast, reliable access to clean data.
- Develop ETL/ELT systems that handle market data, on-chain data, alternative datasets, and metadata efficiently.
- Optimize database schemas, query performance, and storage architectures for analytical workloads.
- Work closely with ML and research teams to ensure datasets are structured and enriched for modelling.
- Implement data validation, quality checks, and monitoring systems to maintain consistency and reliability.
- Manage data pipelines across cloud environments, ensuring security, scalability, and high availability.
- Build internal tools and APIs that improve data accessibility and streamline research workflows.
- Troubleshoot pipeline issues, performance bottlenecks, and latency challenges in production environments.
Required skills
- Strong proficiency in Python, SQL, and data processing frameworks.
- Experience with ETL/ELT pipelines, workflow orchestration, and data transformation tools.
- Familiarity with distributed data systems such as Spark, Flink, Dask, or similar frameworks.
- Solid understanding of databases (PostgreSQL, BigQuery, Snowflake, or similar).
- Experience handling large-scale datasets, especially time-series or structured financial data.
- Knowledge of cloud platforms (AWS, GCP, Azure) and scalable storage systems (S3, GCS, etc.).
- Ability to work with real-time streaming tools such as Kafka, Pub/Sub, or Kinesis (preferred).
- Strong understanding of data integrity, quality checks, and pipeline observability.
- Bonus: experience with on-chain data, blockchain analytics, or API and data-source integrations.