Capgemini is looking to hire a seasoned Data Engineer to develop and deploy machine learning solutions. In this role, you will help the world's leading organizations unlock the value of technology and build a more sustainable, more inclusive world.
Requirements
- Programming: Python (Pandas, NumPy, PyTorch, TensorFlow) and SQL.
- MLOps: Experience with tools like MLflow, Kubeflow, Docker, Kubernetes, and CI/CD pipelines.
- Generative AI & NLP: Expertise in transformer models (e.g., GPT, BERT), Hugging Face, and LangChain.
- Data Engineering: Proficiency in PySpark and distributed data processing.
- Cloud: Proven experience with one major cloud platform (AWS, Azure, or GCP) is an added advantage.
Responsibilities
- Develop and deploy scalable ML/AI solutions, including MLOps pipelines for CI/CD, model monitoring, and governance.
- Lead the design and development of GenAI and NLP solutions for applications such as text summarization, conversational AI, and entity recognition.
- Build and optimize data pipelines using PySpark and SQL for large-scale data processing.
- Support scalable ML development and deployment.
- Collaborate with stakeholders to align AI initiatives with business goals.
- Mentor junior team members.
Other
- 7+ years of experience in developing and deploying machine learning solutions.
- Bachelor's degree, with experience in data, AI, and ML.
- Flexible work.
- Healthcare, including dental, vision, mental health, and well-being programs.