With Jukebox, there are several `lm_head` modules that should be skipped using the `llm_int8_skip_modules` parameter in [BitsAndBytesConfig], so that those modules are kept in their original precision instead of being converted to 8-bit:
```py
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "bigscience/bloom-1b7"

# quantize to 8-bit, but leave the lm_head module in its original precision
quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_skip_modules=["lm_head"],
)

model_8bit = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config,
)
```
## Finetuning
With the PEFT library, you can finetune large models such as flan-t5-large and facebook/opt-6.7b that have been loaded with 8-bit quantization.
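A minimal sketch of how this typically looks with LoRA adapters through PEFT; the model choice and the LoRA hyperparameters (`r`, `lora_alpha`, `target_modules`) below are illustrative assumptions, not recommendations from this guide:

```py
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# load the base model in 8-bit
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-6.7b",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)

# prepare the quantized model for training (freezes base weights, upcasts norms)
model = prepare_model_for_kbit_training(model)

# attach small trainable LoRA adapters; the 8-bit base weights stay frozen
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # illustrative choice for OPT-style models
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

From here the model can be passed to a training loop or the Trainer as usual; only the adapter parameters receive gradients, which keeps memory requirements low on top of the 8-bit base weights.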