Northeastern University is looking to solve complex challenges at the intersection of Large Language Models, Computer Vision, and Predictive Analytics while ensuring production reliability and scalability for its clients.
Requirements
- Expert knowledge of at least one major cloud platform (AWS, GCP, or Azure)
- Strong programming skills in Python and experience with infrastructure-as-code tools
- Proficient with containerization (Docker) and orchestration (Kubernetes)
- Knowledge of cloud data services (e.g., Redshift, BigQuery, Synapse)
- Experience with GitHub Actions, Jenkins, or equivalent CI/CD tools
- Knowledge of cloud-native security services (e.g., AWS KMS, GCP Cloud KMS, Azure Key Vault)
- Experience with GPU-based compute resources
Responsibilities
- Design and implement scalable model serving architectures for both GenAI (LLMs, diffusion models) and traditional ML models
- Build and maintain real-time and batch inference pipelines with high availability and fault tolerance
- Optimize AI workloads for performance, cost-efficiency, and low-latency inference
- Develop distributed model training and inference architectures leveraging GPU-based compute resources
- Implement serverless and containerized solutions using Docker, Kubernetes, and cloud-native services
- Architect end-to-end MLOps pipelines covering training, validation, deployment, and monitoring
- Implement CI/CD pipelines for ML model deployment using GitHub Actions, Jenkins, or equivalent
Other
- Bachelor's degree and 3+ years of experience in software engineering with a focus on cloud infrastructure, plus 1+ years of hands-on experience deploying ML models to production
- Strong collaboration skills to work with Data Scientists and help clients realize transformative AI projects
- Ability to work in a research environment
- Compliance with data privacy requirements