Push the frontier of agentic LLMs and reinforcement learning for Normal Computing's agentic code generation tool.
Requirements
- Strong Python and ML framework experience (PyTorch preferred; JAX/Hugging Face a plus).
- Demonstrated ability to turn research into working systems; reproducibility mindset (tests, seeds, configs, logging).
- Experience designing eval harnesses and success metrics for sequential/agentic tasks.
- Comfortable with data acquisition/curation from documents/logs; good instincts about data quality and licenses.
- Research background in program synthesis/codegen, constrained decoding, or execution‑based rewards.
- Experience with offline RL from tool traces or human corrections.
- Familiarity with semiconductor/chip domains or other complex technical specs.
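The "reproducibility mindset (tests, seeds, configs, logging)" above can be made concrete with a minimal sketch: one illustrative pattern (the `RunConfig` name and fields are hypothetical, not from this posting) is to freeze all run parameters in a config, seed every RNG up front, and log the serialized config so any result can be replayed bit-for-bit.

```python
import json
import logging
import random
from dataclasses import dataclass, asdict

# Hypothetical experiment config; field names are illustrative only.
@dataclass(frozen=True)
class RunConfig:
    seed: int = 0
    n_samples: int = 100

def run_experiment(cfg: RunConfig) -> float:
    """Seed RNGs and log the full config so the run is replayable."""
    random.seed(cfg.seed)
    logging.info("config=%s", json.dumps(asdict(cfg)))
    # Stand-in for a training/eval loop: a deterministic Monte Carlo estimate.
    xs = [random.random() for _ in range(cfg.n_samples)]
    return sum(xs) / cfg.n_samples

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    a = run_experiment(RunConfig(seed=42))
    b = run_experiment(RunConfig(seed=42))
    assert a == b  # same seed + same config -> identical result
```

In a real system the same idea extends to framework RNGs (e.g. `torch.manual_seed`) and to logging git SHAs and dataset versions alongside the config.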
Responsibilities
- Design and implement multi‑agent and RL approaches for agentic code generation and tool‑use.
- Build research prototypes that integrate with our agentic code generation tool; collaborate to productionize wins.
- Create evaluation suites: task specs, pass/fail checkers, coverage, cost/latency dashboards.
- Acquire and curate datasets from PDFs/logs/tables; generate synthetic data where appropriate; maintain data cards and licensing.
- Analyze experiments with disciplined ablations; document results and decisions.
- Stay current on LLM agents, RL (offline/online, RLHF/RLAIF), constrained decoding, and program synthesis.
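The "pass/fail checkers" and "execution‑based rewards" mentioned above share one core mechanic: run candidate code against tests and score the outcome. A minimal sketch, assuming assertion-style unit tests as the spec (the `check_candidate` helper is hypothetical, and a production harness would add sandboxing and resource limits):

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

def check_candidate(code: str, tests: str, timeout: float = 5.0) -> bool:
    """Execute candidate code plus its tests in a subprocess.

    Pass/fail is the process exit code: any failed assertion or crash
    makes the script exit nonzero. Timeouts count as failures.
    """
    program = textwrap.dedent(code) + "\n\n" + textwrap.dedent(tests)
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "candidate.py"
        path.write_text(program)
        try:
            proc = subprocess.run(
                [sys.executable, str(path)],
                capture_output=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False
        return proc.returncode == 0

# Example: a model-generated solution scored against its unit tests.
solution = "def add(a, b):\n    return a + b\n"
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"
print(check_candidate(solution, tests))  # True when all assertions pass
```

The boolean (or a pass rate over many tests) can then serve directly as a reward signal for RL, or feed the coverage and cost/latency dashboards described above.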
Other
- PhD in CS/AI/ML (or equivalent research experience), ideally with publications in multi‑agent RL, agentic AI, or RL for language/code.
- Clear communicator who partners well with engineers.
- Open‑source contributions (e.g., CleanRL, RLlib, AutoGen, LangGraph, CrewAI, Transformers).
- Track record of shipping research to production and measuring impact.