This significantly reduces quantization loss, so you can run models in 4-bit precision with little to no performance degradation.
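To make "quantization loss" concrete, the sketch below measures the mean squared error introduced when weights are rounded to a low-bit grid and then restored. It is only an illustration under stated assumptions: it uses plain symmetric absmax round-to-nearest quantization as a stand-in baseline, not the method referred to above, and the helper names (`quantize_dequantize`, `quantization_loss`) and the synthetic Gaussian weights are hypothetical.

```python
import random


def quantize_dequantize(weights, bits=4):
    """Round weights to a signed `bits`-bit grid and map them back to floats.

    Assumption: symmetric absmax scaling with round-to-nearest, used here
    only as a simple baseline for illustration.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for signed 4-bit, 127 for signed 8-bit
    absmax = max(abs(w) for w in weights)
    scale = absmax / qmax if absmax > 0 else 1.0
    restored = []
    for w in weights:
        q = max(-qmax, min(qmax, round(w / scale)))  # integer code
        restored.append(q * scale)                   # dequantized value
    return restored


def quantization_loss(weights, bits=4):
    """Mean squared error between original and dequantized weights."""
    restored = quantize_dequantize(weights, bits)
    return sum((w - r) ** 2 for w, r in zip(weights, restored)) / len(weights)


# Synthetic stand-in for a weight tensor (hypothetical data, not a real model).
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

loss_4bit = quantization_loss(weights, bits=4)
loss_8bit = quantization_loss(weights, bits=8)
print(f"4-bit MSE: {loss_4bit:.3e}")
print(f"8-bit MSE: {loss_8bit:.3e}")
```

With this naive baseline the 4-bit error is noticeably larger than the 8-bit error, which is the gap that more careful 4-bit quantization schemes aim to narrow.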