Publications

(2024). Pushing the Limits of Large Language Model Quantization via the Linearity Theorem.

(2024). PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression.
