PitchBook aims to extract meaningful insights from its wealth of structured and unstructured data, including reports, news, and other textual content, by delivering AI-powered features that drive insight generation on the PitchBook Platform.
Requirements
- Proven expertise in natural language processing (NLP) and machine learning, including hands-on experience with classifiers, transformer models, large language models (LLMs), and widely used ML and data science libraries such as scikit-learn, pandas, numpy, TensorFlow, and PyTorch.
- Experience delivering production-grade GenAI or LLM-based systems with measurable business impact.
- Familiarity with the LangChain ecosystem, including tools such as LangSmith and LangGraph, and experience using them in production environments.
- Deep proficiency in building and maintaining scalable data pipelines and distributed systems using technologies such as Apache Kafka, Airflow, and cloud data platforms like Snowflake.
- Strong programming skills in Python and SQL; working knowledge of additional languages such as Java or Scala is a plus.
- Practical experience with cloud-native development, containerization, and orchestration technologies such as Docker and Kubernetes.
- Demonstrated ability to solve complex technical problems, contribute to architectural decisions, and deliver high-performance, reliable solutions.
Responsibilities
- Deliver high-impact AI and ML capabilities that drive insight generation on the PitchBook Platform.
- Provide hands-on expertise in designing, building, and deploying AI/ML models and services with a focus on NLP, summarization, semantic search, classification, and prediction.
- Build and optimize models that leverage classifiers, transformers, LLMs, and other NLP techniques to generate meaningful insights from structured and unstructured data.
- Collaborate with engineering, product management, and data collection teams to ensure models are informed by high-quality data and support strategic product goals.
- Explore and experiment with emerging technologies, methodologies, and tools in the fields of GenAI, NLP, and search.
- Contribute to best practices in model transparency, monitoring, evaluation, and compliance.
- Apply principles from Agile, Lean, and Fast-Flow methodologies to support efficient model development and deployment cycles.
Other
- Bachelor's, Master's, or PhD in Computer Science, Mathematics, Data Science, or a related technical field.
- 8+ years of experience in software engineering or machine learning engineering, with a strong focus on AI/ML applications in insight generation, summarization, semantic search, and prediction.
- Excellent communication and collaboration skills, with experience working cross-functionally with product managers, engineers, and data scientists in globally distributed teams.
- Experience working in fast-paced, data-driven environments. Prior exposure to fintech or financial data platforms is a strong advantage.
- Demonstrated experience authoring research papers for peer-reviewed AI/ML conferences (e.g., NeurIPS, ICML, ACL) and active participation in the broader AI research community are strongly preferred.