Spaces:

Ahmadzei
/

RAG

Runtime error

RAG

File size: 465 Bytes

5fa1a76

Compute data type
To speedup computation, you can change the data type from float32 (the default value) to bf16 using the bnb_4bit_compute_dtype parameter in [BitsAndBytesConfig]:

import torch
from transformers import BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

Normal Float 4 (NF4)
NF4 is a 4-bit data type from the QLoRA paper, adapted for weights initialized from a normal distribution.