A dynamic parallel method for performance optimization on hybrid CPUs Paper • 2411.19542 • Published Nov 29, 2024 • 5 • 2
Efficient Post-training Quantization with FP8 Formats Paper • 2309.14592 • Published Sep 26, 2023 • 11 • 2
Effective Quantization for Diffusion Models on CPUs Paper • 2311.16133 • Published Nov 2, 2023 • 4 • 1
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs Paper • 2309.05516 • Published Sep 11, 2023 • 10 • 2