
Principal AI Data Scientist / Engineer
- Irving, TX
- Permanent
- Full-time
- IT Architecture Experience: Leverage previous experience in end-to-end architecture for a multi-sourced data platform, evaluating scalability, performance, and resilience.
- AI-Driven Entity Resolution: Develop and implement sophisticated strategies for Entity Resolution (ER) by utilizing Large Language Models (LLMs) and Graph Databases (e.g., Neo4j, AWS Neptune, CosmosDB) to accurately map, reconcile, and standardize accounting data across diverse sources.
- Advanced RAG Implementation: Architect and deploy production-grade Retrieval-Augmented Generation (RAG) pipelines for complex data interpretation and standardization. This includes managing the underlying Vector Databases and optimizing prompt/context engineering for high accuracy.
- Performance Optimization: Understand performance SLAs. Leverage specialized databases such as OLAP solutions (e.g., DuckDB, ClickHouse) for rapid analytics and in-memory stores/caching (e.g., Redis) for low-latency access.
- Cloud Infrastructure and Deployment: Engage with IT experts on cloud deployment strategy (AWS/Azure), containerization (Docker) and orchestration (Kubernetes) to ensure robust, scalable, and observable deployments.
- Cross-Functional Strategy: Collaborate directly with Accounting, ERP knowledge owners, IT, MDM, and Data Quality teams to translate complex accounting requirements into scalable, automated technical solutions.
- Must demonstrate strong initiative, interpersonal skills, and the ability to communicate effectively
- Programming Proficiency: Mastery of Python (for AI/ML) and strong proficiency in at least one compiled, high-performance language (e.g., Go, Java, C#/.NET) for building scalable backend services
- Cloud Expertise: Extensive experience architecting solutions on AWS or Azure.
- Containerization & Orchestration: Knowledge of Docker and Kubernetes (K8s) in a production environment
- Streaming/Messaging: Proven experience designing systems utilizing Kafka or similar technologies (e.g., Kinesis, RabbitMQ)
- Demonstrated experience deploying LLMs in a production environment for data-centric tasks (not just chatbots)
- Specific expertise in building RAG pipelines, managing Vector Databases (e.g., Pinecone, Weaviate, PGVector), and advanced prompt/context engineering
- Experience with deep learning frameworks (PyTorch, TensorFlow) and the HuggingFace ecosystem
- Experience with Agentic Frameworks (e.g., LangGraph, AutoGen)
- Entity Resolution: Proven track record of solving complex entity resolution challenges at scale
- Graph Databases: Hands-on experience with Neo4j, AWS Neptune, or CosmosDB, specifically applied to ER or MDM
- Data Warehousing: Deep expertise in Snowflake architecture and optimization.
- Fast Analytics: Experience utilizing OLAP databases (e.g., DuckDB) and in-memory stores (e.g., Redis) for performance optimization
- Strong understanding of corporate accounting principles, consolidation processes, ERP system data structures (e.g., SAP, Oracle), and the nuances of accounting data
- Certifications in AWS or Azure architecture
- Experience with Infrastructure as Code (IaC) tools (e.g., Terraform)
- A track record of technical excellence in a Fortune 500 environment
- This position can be located in Irving, TX or Peoria, IL
- Travel requirements will be less than 10%
- This role currently has no direct reports
- Domestic relocation is available to those who qualify
- Sponsorship is NOT available
- Our goal at Caterpillar is for you to have a rewarding career. Our teams are critical to the success of our customers who build a better world.
- Here you earn more than just a salary because we value your performance. We offer a total rewards package that provides benefits on day one (medical, dental, vision, RX, and 401K) along with the potential of an annual bonus. Additional benefits include paid vacation days and paid holidays.
- All qualified individuals, including minorities, females, veterans and individuals with disabilities, are encouraged to apply.
- These benefits also apply to part-time employees