Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are designed to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today, which can accelerate only a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.
Requirements
- Deep grasp of fixed-point arithmetic, quantization theory, and statistical calibration.
- Fluent in Python, PyTorch or TensorFlow, NumPy/Pandas/SciPy, and data-viz tools (Matplotlib/Plotly).
- Hands-on experience with at least one quantization toolkit (PyTorch FX PTQ/QAT, TensorFlow Lite, ONNX Runtime, TVM, MLIR Quant).
- Working knowledge of CNNs, Transformers, and other DNN architectures.
Responsibilities
- Design statistically rigorous experiments to compare PTQ, QAT, pruning, and mixed-precision schemes on vision, language, and multimodal models.
- Build calibration datasets; develop Python notebooks/dashboards to track accuracy, latency, power, and memory trade-offs.
- Perform layer- and token-level error analysis to guide numerical-format choices.
- Partner with the compiler team to convert your findings into turnkey SDK flows and reference configs.
- Publish internal whitepapers and external benchmarks, and present results to customers and at industry events.
- Monitor academic literature in compression and efficient inference; translate promising ideas into reproducible prototypes.
Other
- M.S./Ph.D. in CS, EE, Applied Math, or similar, with 5+ years in ML model optimization or data-science-driven research.
- Integrity, Humility, Happiness
- Initiative, Collaboration, Completion