ServiceNow is looking to build and optimize a high-performance inferencing platform to provide high-quality AI solutions to its enterprise customers globally, transforming user experience and workflow efficiency.
Requirements
- Experience leveraging AI, or thinking critically about how to integrate it, in work processes, decision-making, or problem-solving.
- Low Latency Optimization: Experience in optimizing models for low-latency inference, which is important for real-time applications.
- High Throughput Optimization: Knowledge of maximizing inference throughput.
- Real-time Systems: Understanding the constraints of real-time systems on model inference.
- Model Quantization and Compression: Practical experience in reducing model size and computational cost.
- Proficiency in prompt engineering and in developing LLM-based features.
- Proficiency in Python and Golang, with a strong grasp of software engineering principles.
Responsibilities
- Utilize your expertise in Python and Golang to develop high-performance components of the AI Platform.
- Collaborate with cross-functional teams to integrate AI capabilities seamlessly into workflows and user experiences.
- Ensure reliability and performance of AI models by applying best practices in software engineering and AI inferencing.
- Stay ahead of the curve by quickly learning emerging technologies and applying them to enhance the AI Platform.
Other
- This role is based in our Santa Clara office and requires two days in the office.
- Minimum of 5 years of experience working in a software development role.
- Demonstrated ability to thrive in fast-paced, dynamic environments.
- Knowledge of unit testing, profiling, and code tuning.