marianna13 committed
Commit eb62fda · 1 Parent(s): c823f45

Update README.md

  1. README.md +48 -0
README.md CHANGED
@@ -18,3 +18,51 @@ The following `bitsandbytes` quantization config was used during training:
 
 
  - PEFT 0.4.0.dev0
+
+ ### Usage
+
+ ```python
+ import torch
+ from peft import PeftModel
+ from transformers import GenerationConfig, LlamaForCausalLM, LlamaTokenizer
+
+ # Generation settings
+ temperature: float = 0.1
+ top_p: float = 0.75
+ top_k: int = 40
+ num_beams: int = 4
+ max_new_tokens: int = 128
+
+ load_8bit: bool = False
+ base_model: str = "<base-model>"  # placeholder: the LLaMA checkpoint the adapter was trained on (not given here)
+ lora_weights: str = "marianna13/alpaca-lora-sum"
+
+ # Load the tokenizer and base model, then attach the LoRA adapter
+ tokenizer = LlamaTokenizer.from_pretrained(base_model)
+ model = LlamaForCausalLM.from_pretrained(
+     base_model,
+     load_in_8bit=load_8bit,
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+ model = PeftModel.from_pretrained(model, lora_weights, torch_dtype=torch.float16)
+
+ prompt = "<your prompt>"  # placeholder: the input text to run through the model
+ input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(model.device)
+
+ generation_config = GenerationConfig(
+     temperature=temperature,
+     top_p=top_p,
+     top_k=top_k,
+     num_beams=num_beams,
+ )
+
+ with torch.no_grad():
+     generation_output = model.generate(
+         input_ids=input_ids,
+         generation_config=generation_config,
+         return_dict_in_generate=True,
+         output_scores=True,
+         max_new_tokens=max_new_tokens,
+     )
+ ```
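
One way to keep the sampling defaults from the usage snippet overridable per call is to collect them in a small helper. The following is a minimal sketch and not part of the README: `GenerationSettings` and `build_generation_kwargs` are hypothetical names introduced here for illustration, and the returned dictionary is only assumed to be passed on to `GenerationConfig`.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for the defaults listed in the usage snippet."""

    temperature: float = 0.1
    top_p: float = 0.75
    top_k: int = 40
    num_beams: int = 4


def build_generation_kwargs(
    settings: GenerationSettings = GenerationSettings(), **overrides: Any
) -> Dict[str, Any]:
    """Merge the default settings with per-call overrides; overrides win."""
    merged = asdict(settings)
    merged.update(overrides)
    return merged


if __name__ == "__main__":
    # Override one default and add one extra generation option.
    print(build_generation_kwargs(num_beams=1, repetition_penalty=1.2))
```

The result can be unpacked as `GenerationConfig(**build_generation_kwargs(...))`. Extra keys such as `repetition_penalty` pass through unchanged, so any option that `GenerationConfig` accepts can be supplied without editing the helper.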