JPMorgan Chase is looking to design and deliver trusted, market-leading technology products in a secure, stable, and scalable way. The role involves building and maintaining highly scalable, resilient infrastructure platforms that enable training and inference for Large Language Models.
Requirements
- Good knowledge of cloud computing delivery models (IaaS, PaaS, and SaaS) and deployment models (public, private, and hybrid cloud)
- Proficient in Linux environments, including scripting and administration
- Familiarity with cloud data services and big data processing tools
- Foundational understanding of machine learning concepts such as transformer architecture, ML training, and inference
- Experience with one or more high-performance computing and machine learning frameworks, such as vLLM, Ray, or Slurm, is preferred
- Strong hands-on coding experience with Python and/or Golang
- Experience in solutions design and engineering, including containerization (Docker, Kubernetes) and cloud service providers (AWS, Azure, GCP)
Responsibilities
- Execute software solution design, development, and technical troubleshooting, thinking beyond routine or conventional approaches to build solutions or break down technical problems
- Create secure and high-quality production code and maintain algorithms that run synchronously with appropriate systems
- Produce architecture and design artifacts for complex applications, and take accountability for ensuring that the resulting code meets design constraints
- Gather, analyze, synthesize, and develop visualizations and reporting from large, diverse data sets in service of continuous improvement of software applications and systems
- Engineer infrastructure platforms that are secure, scalable, and optimized for AI and machine learning workloads
- Collaborate with AI teams to understand computational needs and translate these into infrastructure requirements
- Design and implement continuous integration and delivery pipelines for machine learning workloads
Other
- Formal training or certification in software engineering concepts and 3+ years of applied experience
- Ability to think beyond routine or conventional approaches to build solutions or break down technical problems
- Ability to collaborate with AI teams to understand computational needs and translate these into infrastructure requirements
- Ability to contribute to software engineering communities of practice and events that explore new and emerging technologies
- Ability to proactively identify hidden problems and patterns in data and use these insights to drive improvements to coding hygiene and system architecture