DoeyLLM committed 644cf90 (verified) · Parent(s): 29630f3

Update README.md

Files changed (1): README.md (+110, -0)

## **Model Summary**
This model is a fine-tuned version of **LLaMA 3.2-3B**, optimized using **LoRA (Low-Rank Adaptation)** on the [NVIDIA ChatQA-Training-Data](https://huggingface.co/datasets/nvidia/ChatQA-Training-Data) dataset. It is tailored for conversational AI, question answering, and other instruction-following tasks, with support for sequences of up to 1024 tokens.

---

## **Key Features**
- **Base Model**: LLaMA 3.2-3B
- **Fine-Tuning Framework**: LoRA (a sketch of a comparable setup follows this list)
- **Dataset**: NVIDIA ChatQA-Training-Data
- **Max Sequence Length**: 1024 tokens
- **Use Case**: Instruction-based tasks, question answering, conversational AI
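
The training script and LoRA hyperparameters were not published with this model, so the following is only a minimal sketch of what a comparable setup could look like with Hugging Face `peft`. The base checkpoint name and the rank, alpha, dropout, and target modules are illustrative assumptions, not the released configuration.

```python
# Hypothetical reconstruction of a LoRA fine-tuning setup
# (all hyperparameter values below are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "meta-llama/Llama-3.2-3B"  # assumed base checkpoint

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Attach low-rank adapters to the attention projections; only these
# small matrices are trained, while the 3B base weights stay frozen.
lora_config = LoraConfig(
    r=16,                      # assumed adapter rank
    lora_alpha=32,             # assumed scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the small trainable fraction

# Training would then proceed with inputs truncated to the model's
# 1024-token limit, e.g. tokenizer(..., truncation=True, max_length=1024).
```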

## **Model Usage**
This fine-tuned model is suitable for:
- **Conversational AI**: Chatbots and dialogue agents with improved contextual understanding.
- **Question Answering**: Generating concise and accurate answers to user queries.
- **Instruction Following**: Responding to structured prompts.
- **Longer-Context Tasks**: Reasoning over inputs of up to 1024 tokens.

# **How to Use DoeyLLM / OneLLM-Doey-V1-Llama-3.2-3B**

This guide explains how to use the **DoeyLLM** model on both app (iOS) and PC platforms.

---

## **App (iOS): Use with OneLLM**

OneLLM brings versatile large language models (LLMs) to your device: Llama, Gemma, Qwen, Mistral, and more. Enjoy private, offline GPT and AI tools tailored to your needs.

With OneLLM, experience the capabilities of leading-edge language models directly on your device, all without an internet connection. Get fast, reliable, and intelligent responses, while keeping your data secure with local processing.

### **Quick Start for iOS**

Follow these steps to run the **DoeyLLM** model in the OneLLM app:

1. **Download OneLLM**
   Get the app from the [App Store](https://apps.apple.com/us/app/onellm-private-ai-gpt-llm/id6737907910) and install it on your iOS device.

2. **Load the DoeyLLM Model**
   Use the OneLLM interface to load the DoeyLLM model directly into the app:
   - Navigate to the **Model Library**.
   - Search for `DoeyLLM`.
   - Select the model and tap **Download** to store it locally on your device.

3. **Start Conversing**
   Once the model is loaded, you can begin interacting with it through the app's chat interface. For example:
   - Tap the **Chat** tab.
   - Type your question or prompt, such as:
     > "Explain the significance of AI in education."
   - Receive real-time, intelligent responses generated locally.

### **Key Features of OneLLM**
- **Versatile Models**: Supports various LLMs, including Llama, Gemma, and Qwen.
- **Private & Secure**: All processing occurs locally on your device, ensuring data privacy.
- **Offline Capability**: Use the app without requiring an internet connection.
- **Fast Performance**: Optimized for mobile devices, delivering low-latency responses.

For more details or support, visit the [OneLLM App Store page](https://apps.apple.com/us/app/onellm-private-ai-gpt-llm/id6737907910).

## **PC: Use with Transformers**

The DoeyLLM model can also be used on PC platforms through the `transformers` library, enabling robust and scalable inference for various NLP tasks.

### **Quick Start for PC**
Follow these steps to use the model with Transformers:

1. **Install Transformers**
   Ensure you have `transformers >= 4.43.0` installed. Update or install it via pip:

   ```bash
   pip install --upgrade transformers
   ```
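
   To confirm that the installed version meets the requirement, you can print it from Python:

   ```bash
   python -c "import transformers; print(transformers.__version__)"
   ```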

2. **Load the Model**
   Use the `transformers` library to load the model and tokenizer. Starting with `transformers >= 4.43.0`, you can run conversational inference using the `pipeline` abstraction or by leveraging the Auto classes with the `generate()` function:

   ```python
   import torch
   from transformers import pipeline

   model_id = "OneLLM-Doey-V1-Llama-3.2-3B"

   # Build a chat-capable text-generation pipeline; device_map="auto"
   # places the model on a GPU when one is available.
   pipe = pipeline(
       "text-generation",
       model=model_id,
       torch_dtype=torch.bfloat16,
       device_map="auto",
   )

   # Chat-style input: a list of {"role", "content"} messages.
   messages = [
       {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
       {"role": "user", "content": "Who are you?"},
   ]

   outputs = pipe(
       messages,
       max_new_tokens=256,
   )

   # The pipeline returns the whole conversation; the last entry in
   # "generated_text" is the assistant's reply.
   print(outputs[0]["generated_text"][-1])
   ```
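
As an alternative to the pipeline, the Auto classes with `generate()` mentioned in step 2 can be used directly. The following is a minimal sketch that assumes the repository ships a tokenizer with a chat template:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OneLLM-Doey-V1-Llama-3.2-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Render the conversation with the model's chat template; keep the
# prompt within the model's 1024-token sequence limit.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```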

## **Responsibility & Safety**

As part of our responsible release approach, we followed a three-pronged strategy for managing trust and safety risks:

1. Enable developers to deploy helpful, safe, and flexible experiences for their target audience and for the use cases supported by Llama.
2. Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm.
3. Provide protections for the community to help prevent the misuse of our models.

---
license: apache-2.0
datasets: