Principal AI Data Scientist / Engineer

Caterpillar Inc.

$144,960 - $235,440
Sep 4, 2025
Irving, TX, US

Caterpillar Inc. is seeking a Principal AI Data Scientist / Engineer to lead the AI-enablement vision for its next-generation, enterprise-scale accounting data harmonizer system. The goal is a resilient, highly accurate system that serves as the backbone of enterprise-wide corporate accounting.

Requirements

  • Mastery of Python (for AI/ML) and strong proficiency in at least one compiled, high-performance language (e.g., Go, Java, C#/.NET) for building scalable backend services
  • Extensive experience architecting solutions on AWS or Azure
  • Knowledge of Docker and Kubernetes (K8s) in a production environment
  • Proven experience designing systems utilizing Kafka or similar technologies (e.g., Kinesis, RabbitMQ)
  • Demonstrated experience deploying LLMs in a production environment for data-centric tasks (not just chatbots)
  • Specific expertise in building RAG pipelines, managing Vector Databases (e.g., Pinecone, Weaviate, PGVector), and advanced prompt/context engineering
  • Hands-on experience with Neo4J, AWS Neptune, or CosmosDB, specifically applied to Entity Resolution (ER) or Master Data Management (MDM)

Responsibilities

  • Apply prior experience in end-to-end architecture to a multi-source data platform, evaluating its scalability, performance, and resilience.
  • Develop and implement sophisticated strategies for Entity Resolution (ER) by utilizing Large Language Models (LLMs) and Graph Databases (e.g., Neo4J, AWS Neptune, CosmosDB) to accurately map, reconcile, and standardize accounting data across diverse sources.
  • Architect and deploy production-grade Retrieval-Augmented Generation (RAG) pipelines for complex data interpretation and standardization. This includes managing the underlying Vector Databases and optimizing prompt/context engineering for high accuracy.
  • Understand performance SLAs and meet them with specialized databases: columnar OLAP engines (e.g., DuckDB, ClickHouse) for rapid analytics and in-memory caches (e.g., Redis) for low-latency access.
  • Engage with IT experts on cloud deployment strategy (AWS/Azure), containerization (Docker), and orchestration (Kubernetes) to ensure robust, scalable, and observable deployments.
  • Collaborate directly with Accounting, ERP knowledge owners, IT, MDM, and Data Quality teams to translate complex accounting requirements into scalable, automated technical solutions.

Other

  • Must demonstrate strong initiative, interpersonal skills, and the ability to communicate effectively
  • Strong understanding of corporate accounting principles, consolidation processes, ERP system data structures (e.g., SAP, Oracle), and the nuances of accounting data
  • Travel requirements will be less than 10%
  • Sponsorship is NOT available
  • This position requires working onsite five days a week.