Trend Micro is looking to bridge the gap between LLM/SLM research and enterprise productization to shape the next generation of agentic AI for cybersecurity.
Requirements
- Proven end-to-end experience bringing LLM/SLM research into production — from fine-tuning and inference optimization to evaluation and AI Ops integration.
- Deep understanding of data-model-infrastructure trade-offs and optimization under real business constraints.
- Hands-on experience with at least one fine-tuning or adaptation framework (e.g., LLaMA Factory, NeMo, PEFT, LoRA, Transformers).
- Strong knowledge of GPU-accelerated inference (e.g., vLLM, NIM, Triton, CUDA, NCCL, PyTorch/XLA).
- Familiarity with AI Ops toolchains (e.g., Weights & Biases, MLflow, Ray Serve, BentoML).
- Proficiency in Python and experience building containerized AI microservices (e.g., Docker, Kubernetes, Ray).
- 3+ years of applied AI/ML research or engineering, including 2+ years in production-scale deployment.
Responsibilities
- Drive research-to-production of LLM/SLM systems — from design and fine-tuning to evaluation, deployment, and continual adaptation in enterprise agent workflows.
- Lead technical choices — determine when to apply context engineering, prompt tuning, continued pretraining, supervised fine-tuning, reasoning fine-tuning, LoRA, or RL.
- Architect high-performance inference and serving using vLLM, NVIDIA NIM, Triton, CUDA, or other optimized frameworks.
- Integrate reinforcement learning frameworks (veRL, SkyRL, Torch, Ray RLlib) to enhance reasoning, adaptability, and agent feedback loops.
- Develop and operationalize AI Ops pipelines: build benchmarks and metrics for model evaluation, observability, drift detection, and lifecycle automation.
- Advance agent interoperability using A2A (Agent-to-Agent) or MCP (Model Context Protocol) for large-scale coordination.
- Collaborate with cybersecurity researchers to embed threat reasoning, anomaly detection, and defensive logic directly into model behavior.
Other
- This is a hybrid role based out of our Austin, TX office and requires in-office presence three days a week.
- Research-driven yet delivery-focused — capable of balancing innovation with practical deployment.
- Data- and results-oriented — every hypothesis must be measurable.
- Ownership mentality — from exploration and experiment to evaluation, optimization, and monitoring.
- Passionate about turning AI research into defensible, intelligent, and proactive cybersecurity systems.