Mr-Vicky-01 committed · verified
Commit a5c10e0 · 1 Parent(s): 3280dfc

Unsloth Model Card

Files changed (1)
  1. README.md +14 -67
README.md CHANGED
@@ -1,74 +1,21 @@
  ---
- library_name: transformers
+ base_model: unsloth/gemma-3-1b-it-unsloth-bnb-4bit
+ tags:
+ - text-generation-inference
+ - transformers
+ - unsloth
+ - gemma3_text
  license: apache-2.0
+ language:
+ - en
  ---

- ## INFERENCE
-
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
- import torch
-
- # Load model and tokenizer
- tokenizer = AutoTokenizer.from_pretrained("Mr-Vicky-01/gemma-qna")
- model = AutoModelForCausalLM.from_pretrained("Mr-Vicky-01/gemma-qna")
-
- # Define the system prompt
- prompt = """
- <bos><start_of_turn>user
- You are Securitron, Created by Aquilax, a helpful AI assistant specialized in providing accurate and professional responses. Always prioritize clarity and precision in your answers.
- """
-
- # Initialize conversation history
- conversation_history = []
-
- # Set up device
- device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
- model.to(device)
-
- i = 0
- while True:
-     user_prompt = input("\nUser Question: ")
-     if user_prompt.lower() == 'break':
-         break
-
-     if i == 0:
-         user_message = f"""
- {user_prompt}<end_of_turn>
- <start_of_turn>model"""
-         i += 1
-     else:
-         user_message = f"""
- <start_of_turn>user
- {user_prompt}<end_of_turn>
- <start_of_turn>model"""
-
-     # Add the user's question to the conversation history
-     conversation_history.append(user_message)
-
-     # Keep only the five most recent history entries
-     conversation_history = conversation_history[-5:]
-
-     # Build the full prompt
-     current_prompt = prompt + "\n".join(conversation_history)
-
-     # Tokenize the prompt
-     encodeds = tokenizer(current_prompt, return_tensors="pt", truncation=True).input_ids.to(device)
-
-     # Initialize TextStreamer for real-time token generation
-     text_streamer = TextStreamer(tokenizer, skip_prompt=True)
-
-     # Generate response with TextStreamer
-     response = model.generate(
-         input_ids=encodeds,
-         streamer=text_streamer,
-         max_new_tokens=2048,
-         use_cache=True,
-         pad_token_id=106,
-         eos_token_id=106,
-         num_return_sequences=1
-     )
-
-     # Finalize conversation history with the assistant's response
-     conversation_history.append(tokenizer.decode(response[0]).split('<start_of_turn>model')[-1].split('<end_of_turn>')[0].strip())
- ```
+ # Uploaded finetuned model
+
+ - **Developed by:** Mr-Vicky-01
+ - **License:** apache-2.0
+ - **Finetuned from model :** unsloth/gemma-3-1b-it-unsloth-bnb-4bit
+
+ This gemma3_text model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
+
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)