Worldly is expanding its data infrastructure and AI capabilities to help companies unlock insights, build credible sustainability claims, and power compliance with evolving regulations worldwide, all while managing complex global data challenges. The company is looking for an MLOps Engineer to design, deploy, and support next-generation data infrastructure and AI systems that unify structured and unstructured data at scale across regions, including China.
Requirements
- 4+ years of experience in ML engineering, MLOps, or data infrastructure roles.
- Proven hands-on experience with containerized open-source data tools such as:
  - Object stores: MinIO, Ceph, or HDFS
  - Table formats: Apache Iceberg, Hudi, or Delta Lake
  - Query engines: Trino/Presto, ClickHouse, or DuckDB
  - Workflow orchestration: Airflow, Dagster, or Prefect
  - ML tools: MLflow, LangChain, Hugging Face, or vLLM
  - ETL/ELT tools: Airbyte, NiFi, or dbt
- Experience managing infrastructure across multiple regions, including self-hosted deployments (Kubernetes, Docker Compose, Terraform, etc.).
- Strong understanding of data engineering best practices, including security, governance, and versioning.
- Experience deploying AI/ML infrastructure in China-compatible cloud environments (e.g., Alibaba Cloud, Huawei Cloud).
Responsibilities
- Design and deploy data lakehouse infrastructure using open-source technologies (e.g., MinIO, Apache Iceberg, Trino) to ingest and manage high-volume structured and unstructured data.
- Build and scale ML pipelines using modern tools such as MLflow and LangChain/Haystack, orchestrated via Airflow or Dagster (see the pipeline sketch after this list).
- Implement data ingestion and transformation workflows using tools like Apache NiFi, Airbyte, and dbt.
- Support federated querying and real-time analytics via Trino, ClickHouse, or StarRocks (see the query sketch after this list).
- Enable retrieval-augmented generation (RAG) and other LLM-powered applications by integrating the data lake with AI/ML systems (see the retrieval sketch after this list).
- Develop CI/CD pipelines for ML models, infrastructure-as-code, and data pipeline deployments.
- Monitor, debug, and optimize data and ML services running across distributed environments (including mainland China).
- Collaborate cross-functionally with data scientists, platform & DevOps engineers, and sustainability analysts to translate real-world use cases into scalable MLOps workflows.
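To give a concrete flavor of the pipeline work above, here is a minimal sketch of an Airflow DAG that chains an ingestion stub into an MLflow-tracked training step. It assumes Airflow 2.4+ and a reachable MLflow tracking server; the tracking URI, DAG id, task bodies, and logged values are all hypothetical placeholders.

```python
from datetime import datetime

import mlflow
from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest():
    # Stand-in for an ingestion step (e.g., triggering an Airbyte sync).
    print("ingesting raw data")


def train():
    # Log a stub training run; the tracking URI is a hypothetical placeholder.
    mlflow.set_tracking_uri("http://mlflow.example.internal:5000")
    with mlflow.start_run(run_name="nightly-train"):
        mlflow.log_param("model", "baseline")
        mlflow.log_metric("rmse", 0.42)


with DAG(
    dag_id="ml_pipeline_sketch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    train_task = PythonOperator(task_id="train", python_callable=train)
    ingest_task >> train_task  # ingestion must finish before training starts
```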
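Similarly, the federated-query responsibility often reduces to pointing one SQL endpoint at many catalogs. A minimal query sketch with the Python trino client follows; the coordinator host, catalog, schema, and table names are hypothetical and would come from the actual deployment.

```python
import trino

# Connect to a Trino coordinator fronting an Iceberg catalog (e.g., backed by
# MinIO). Host, catalog, schema, and table names are hypothetical placeholders.
conn = trino.dbapi.connect(
    host="trino.example.internal",
    port=8080,
    user="mlops",
    catalog="iceberg",
    schema="sustainability",
)
cur = conn.cursor()
cur.execute(
    "SELECT facility_id, avg(energy_kwh) AS avg_kwh "
    "FROM facility_metrics "
    "GROUP BY facility_id "
    "ORDER BY avg_kwh DESC "
    "LIMIT 10"
)
for facility_id, avg_kwh in cur.fetchall():
    print(facility_id, avg_kwh)
```

Trino's value in this setup is that Iceberg tables and other catalogs are reachable through one SQL endpoint; ClickHouse or StarRocks would be queried through their own drivers.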
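Finally, the retrieval step behind RAG comes down to embedding documents and ranking them by similarity to a query. The self-contained retrieval sketch below shows only the mechanics: the hash-seeded embed() is a stand-in for a real encoder (e.g., a Hugging Face model), and the document strings are invented, so the ranking itself carries no semantic meaning here.

```python
import numpy as np


def embed(text: str) -> np.ndarray:
    # Stand-in for a real embedding model; seeding from the text hash keeps
    # the sketch dependency-free but is not semantically meaningful.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(64)


# Hypothetical document snippets that would live in the data lake.
docs = [
    "Factory A water usage report, Q1",
    "Factory B energy audit summary",
    "Supplier onboarding checklist",
]
doc_vecs = np.stack([embed(d) for d in docs])


def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    # Rank documents by cosine similarity to the query embedding.
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]


# The retrieved snippets would then be placed into the LLM prompt as context.
print(retrieve("energy consumption at Factory B"))
```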
Other
- US - Remote