nou Systems, Inc. (nSI) is hiring a Machine Learning Operations (MLOps) Engineer to help solve diverse and challenging technical problems in advanced AI/ML applications for missile defense, battle management, and related domains.
Requirements
- Strong proficiency in Linux-based development, shell scripting (Bash), and remote development on GPU-accelerated hardware.
- Hands-on experience with containerization and reproducibility using Podman (or similar) and tools like Conda, Poetry, or Pixi, including setup behind proxies and secure artifact repositories (e.g., Artifactory).
- Proven ability to build and manage CI/CD pipelines in GitLab (or equivalent), including automated testing, MR enforcement, and runner configuration.
- Familiarity with Python-based machine learning frameworks and tooling (e.g., PyTorch, PyTorch Lightning, Optuna, MLflow, Ray RLlib).
- Experience deploying ML solutions in secure, classified, or DoD-restricted environments.
- Knowledge of infrastructure-as-code tools (e.g., Terraform, Ansible) and DevSecOps best practices.
- Understanding of software supply chain security, including artifact validation and SBOMs.
Responsibilities
- Build and maintain GPU-capable development environments and containerized workflows.
- Develop CI/CD pipelines (e.g., GitLab CI) with automated testing, linting, and deployment.
- Implement and enforce GitLab merge request policies and code quality standards.
- Create shell scripts and config files to streamline setup, reproducibility, and remote development (e.g., SSH-based HPC access).
- Define and document standardized ML development practices across teams.
- Collaborate with ML researchers, software engineers, and cybersecurity teams to transition prototypes into production-ready systems.
- Support secure deployment of ML solutions in DoD-restricted environments.
Other
- A bachelor’s degree in computer science, engineering, data science, or a related technical field.
- U.S. citizenship and the ability to obtain a Secret security clearance.
- 4 years of experience in DevOps or MLOps, including support for machine learning workloads.
- Strong communication skills with a track record of writing clear technical documentation and supporting onboarding across technical teams.
- Final compensation for this position is determined by a variety of factors, such as a candidate’s relevant work experience, skills, certifications, and geographic location.