Job Description

Let’s shape the future together!

Your Role & Responsibilities

  • Own the end-to-end data lifecycle, from collection and annotation to quality assurance, governance, and cloud storage, so that high-quality datasets feed into model training pipelines.
  • In close collaboration with research teams, define the dataset strategy and roadmap for scalable, diverse training data that supports advanced machine learning models.
  • Architect and continuously improve a robust cloud-based data platform (including ingestion, labeling workflows, cataloging, versioning, and compliance processes).
  • Operationalize the data lifecycle with clear processes, KPIs, and feedback loops spanning data collection, model training, and deployment.
  • Lead and mentor a high-performing data team, fostering technical excellence, ownership, and continuous professional growth.
  • Partner closely with Product, Research, and Engineering teams as the central point of contact for data strategy and data-related initiatives.
  • Collaborate with robotics and systems teams to support data collection and integration workflows within robotics environments (e.g., ROS2-based systems).

Required Technical & Professional Expertise

  • 3+ years of experience leading or managing technical teams in a fast-paced environment.
  • 5+ years of professional experience in data engineering, machine learning systems, cloud infrastructure, or related domains.
  • An advanced degree (Master’s or PhD) in Computer Science, Machine Learning, Data Engineering, or a related field.
  • Proven ability to design and scale large data pipelines, data platforms, or cloud-based storage systems.
  • Experience building or managing datasets for deep learning systems, ideally including multimodal data.
  • Experience with data lifecycle management, data collection operations, or high-throughput cloud data platforms is a strong advantage but not required.
  • Strong foundations in Python, cloud data tooling, and modern data/ML engineering practices.
  • Experience working within robotics ecosystems (ROS2 or similar frameworks) is beneficial but not required.