d-Matrix is focused on unleashing the potential of generative AI to power the transformation of technology. They are seeking to scale their software infrastructure to support the entire Software Engineering organization and the development of their ML accelerator systems.
Requirements
- Proficient in C/C++ and Python.
- Proficient with the GitLab merge process, merge trains, and CI/CD.
- Proficient with Docker and Podman containers.
- Experience with orchestration tools, such as Kubernetes (K8s), used for data center deployment of ML workloads.
- Experience with Bazel, code coverage, data source integration, and DevOps metrics.
- Experience with root-causing CI issues, linting, bisect tooling, remote code coverage, and security and vulnerability test coverage.
- Minimum of 7 years of industry experience in software infrastructure and DevOps.
Responsibilities
- Development, enhancement, and maintenance of GitLab, Docker, and the tools that support the development of our ML accelerator systems, in both hardware and software.
- Root-causing CI issues, linting, bisect tooling, remote code coverage, and security and vulnerability test coverage.
- Orchestration tools, such as Kubernetes (K8s), used for data center deployment of ML workloads.
- Bazel, code coverage, data source integration, and DevOps metrics.
- GitLab merge process, merge trains, and CI/CD.
- Docker and Podman containers.
- C/C++ and Python.
Other
- Hybrid, working onsite at our Santa Clara, CA headquarters 3-5 days per week.
- Prior startup, small team, or incubation experience.
- Work experience at an AI compute/subsystem company.
- Prior experience as a DevOps engineer.
- Willingness to stay up to date with the latest trends and research in the ML community and to understand how those trends affect d-Matrix's requirements and approach.