Susquehanna is hiring a Platform Development team member to manage and tune its HPC environment and to help build and optimize high-performance trading systems, research compute clusters, databases, and support systems.
Requirements
- 5-7 years of progressive experience building Linux- and/or Windows-based HPC platforms.
- Familiarity with kernel-level and I/O subsystem tweaks and tools such as sysctl, strace, tcpdump, and netstat.
- Bonus points for equivalent Windows knowledge (registry, procmon, Wireshark, tshark).
- Recent hands-on experience with automation in Python or other tools.
- Experience administering Lustre, GPFS, VAST, or other parallel filesystems.
- Understanding of resource schedulers such as HTCondor, Slurm, or similar.
Responsibilities
- Contribute to our library of home-grown tools, written primarily in Python and Bash, that automate monitoring and maintenance, allowing you to focus on performance-related issues and projects.
- Tune operating systems and batch workflows for performance.
- Dive deep on root-cause analysis of systems issues.
- Integrate the resulting tools, tuning changes, and fixes into our systems effectively and efficiently.
- Oversee all aspects of our HPC environment, including the scheduler, parallel filesystems, GPUs, and interconnects.
- Implement and optimize high-performance storage solutions, including Lustre, VAST, and GPFS, to support and improve cluster performance.
- Develop strategies to ensure optimal resource allocation and scalability, using analytics to forecast needs and design efficient, reliable systems.
Other
- A Bachelor’s degree in Engineering, Computer Science, Information Systems, or a related discipline.