|
--- |
|
license: apache-2.0 |
|
language: |
|
- en |
|
base_model: |
|
- meta-llama/Llama-3.1-8B-Instruct
|
pipeline_tag: text-generation |
|
tags: |
|
- lora |
|
- adapter |
|
- writing |
|
- CoT |
|
--- |
|
# Merged-Llama-Adapters-317-320 |
|
|
|
A merged LoRA adapter combining four fine-tuned adapters (317-320) for the Llama-3.1-8B-Instruct language model.
|
|
|
## Model Details |
|
|
|
- Base Model: meta-llama/Llama-3.1-8B-Instruct
|
- Adaptation Method: Merged LoRA |
|
- Source Adapters: |
|
- https://huggingface.co/kevin009/llama317 |
|
- https://huggingface.co/kevin009/llama318 |
|
- https://huggingface.co/kevin009/llama319 |
|
- https://huggingface.co/kevin009/llama320 |
|
|
|
## Merger Configuration |
|
|
|
### Source Adapters |
|
|
|
All source adapters share the following configuration (a corresponding `LoraConfig` sketch follows the list):
|
- Rank (r): 16 |
|
- Alpha: 16 |
|
- Target Modules: |
|
- q_proj (Query projection) |
|
- k_proj (Key projection) |
|
- v_proj (Value projection) |
|
- o_proj (Output projection) |
|
  - up_proj (MLP up projection)

  - down_proj (MLP down projection)

  - gate_proj (MLP gate projection)
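
For reference, this shared setup corresponds roughly to the following PEFT `LoraConfig`. This is a sketch of the presumed training-time configuration, not a file shipped with the source adapters; the `lora_dropout`, `bias`, and `task_type` values are assumptions.

```python
from peft import LoraConfig

# Approximate configuration shared by adapters 317-320
# (only r, alpha, and the target modules are documented above;
#  dropout, bias, and task_type are assumed defaults)
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "up_proj", "down_proj", "gate_proj",     # MLP projections
    ],
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
)
```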
|
|
|
### Merger Details |
|
|
|
- Merger Method: Weighted adapter combination via PEFT's `add_weighted_adapter` (TIES)

- Merger Weights: Equal weights for each adapter (effectively 0.25 each)
|
- Combined Rank: 16 (maintained from source adapters) |
|
|
|
## Usage |
|
|
|
This merged adapter must be used with the base Llama-3.1-8B-Instruct model.
|
|
|
### Loading the Model |
|
|
|
```python |
|
from peft import PeftModel, PeftConfig |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
# Load base model |
|
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
|
|
|
# Load merged LoRA adapter |
|
model = PeftModel.from_pretrained(base_model, "path_to_merged_adapter") |
|
``` |
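
Once loaded, the model behaves like any other causal language model. A minimal generation sketch (the prompt and decoding settings below are illustrative placeholders, not recommendations):

```python
# Build a chat-style prompt with the tokenizer's chat template
messages = [{"role": "user", "content": "Write a short paragraph about autumn."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate with the merged adapter active
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```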
|
|
|
## Limitations and Biases |
|
|
|
- This merged adapter inherits limitations and biases from: |
|
- The base Llama-3.1-8B-instruct model |
|
- All four source adapters |
|
- The merging process may result in: |
|
- Potential loss of specialized capabilities from individual adapters |
|
- Averaged behavior across different adapter specializations |
|
- Possible interference between adapter weights |
|
|
|
## Merging Process |
|
|
|
The adapters were merged using the following approach: |
|
1. Weighted combination of adapter weights via PEFT's `add_weighted_adapter` (TIES method)

2. Equal weighting applied to each source adapter (effectively 0.25 each)
|
3. Preservation of original LoRA rank and architecture |
|
|
|
### Method Used |
|
|
|
The adapters were merged using the PEFT (Parameter-Efficient Fine-Tuning) library's weighted adapter combination feature, `add_weighted_adapter`. The process combines multiple LoRA adapters according to the specified weights; in this case the TIES combination method was used, as shown below.
|
|
|
### Step-by-Step Merging Process |
|
|
|
1. Load the base model and initial adapter: |
|
```python |
|
from peft import PeftModel, PeftConfig |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
MODEL_NAME = "meta-llama/Meta-Llama-3.1-8B-Instruct" |
|
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME) |
|
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME) |
|
|
|
# Load first adapter as base (the bare names here are local adapter directories;
# the published source adapters live at kevin009/llama317-320 on the Hub)

peft_model = PeftModel.from_pretrained(model, "llama319", adapter_name="llama319")
|
``` |
|
|
|
2. Load additional adapters: |
|
```python |
|
# Load remaining adapters |
|
peft_model.load_adapter("llama320", adapter_name="llama320") |
|
peft_model.load_adapter("llama318", adapter_name="llama318") |
|
peft_model.load_adapter("llama317", adapter_name="llama317") |
|
``` |
|
|
|
3. Configure and execute the merger: |
|
```python |
|
# Define adapters and their weights |
|
adapters = ["llama319", "llama320", "llama318", "llama317"] |
|
weights = [1.0, 1.0, 1.0, 1.0] # Equal weights for all adapters |
|
|
|
# Merge adapters |
|
peft_model.add_weighted_adapter( |
|
adapters, |
|
weights, |
|
"merge", |
|
combination_type="ties", # Using ties combination method |
|
density=0.2 # Density parameter for merger |
|
) |
|
|
|
# Set active adapter to merged version |
|
peft_model.set_adapter("merge") |
|
|
|
# Save the merged adapter |
|
peft_model.save_pretrained("merged") |
|
``` |
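
Optionally, the merged adapter can also be folded into the base model weights to produce a standalone checkpoint that does not require PEFT at inference time. A short sketch under the same setup (the output directory name is a placeholder):

```python
# Fold the active ("merge") adapter into the base weights and drop the PEFT wrapper
merged_model = peft_model.merge_and_unload()

# Save a standalone model that can be loaded with AutoModelForCausalLM alone
merged_model.save_pretrained("merged-full-model")
tokenizer.save_pretrained("merged-full-model")
```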
|
|
|
### Key Parameters |
|
|
|
- `combination_type="ties"`: Uses the TIES merging method (TrIm, Elect Sign & Merge), which prunes and sign-aligns adapter weights to reduce interference between adapters
|
- `density=0.2`: The fraction of weight values retained during TIES pruning (the remaining values are zeroed before merging)
|
- `weights=[1.0, 1.0, 1.0, 1.0]`: Equal weighting for all adapters, so each contributes roughly equally (about 0.25 after normalization); a plain linear-combination alternative is sketched after this list
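
PEFT's `add_weighted_adapter` also supports a plain linear combination. If an explicit equal-weight (0.25 per adapter) linear interpolation were preferred over TIES, the call would look roughly like the sketch below; this is shown for comparison only and is not how the published adapter was produced.

```python
# Comparison only: an equal-weight linear merge instead of TIES
peft_model.add_weighted_adapter(
    adapters=["llama319", "llama320", "llama318", "llama317"],
    weights=[0.25, 0.25, 0.25, 0.25],  # explicit 0.25 per adapter
    adapter_name="merge_linear",
    combination_type="linear",  # requires all adapters to share the same rank (r=16 here)
)
peft_model.set_adapter("merge_linear")
```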
|
|
|
### Notes |
|
|
|
- The order of loading adapters may affect the final result |
|
- Equal weights were chosen to maintain balanced influence from each adapter |
|
- The merged adapter maintains the same architecture and rank as the original adapters |
|
- While this adapter merges multiple fine-tunes, each component was developed as part of independent research efforts to explore and improve language model capabilities as part of an R&D process.
|
|
|
## License |
|
|
|
Licensed under Apache 2.0 License. |
|
|
|
This merged adapter is part of independent individual research work. While the code is open-source under the Apache 2.0 license, please note: |
|
|
|
- You are free to use, modify, and distribute this adapter following the Apache 2.0 license terms |
|
- This work is provided "as is" without warranties or conditions of any kind |
|
- This is an independent research project and not affiliated with any organization |
|
- Attribution is appreciated but not required |
|
- For full license details, see: https://www.apache.org/licenses/LICENSE-2.0 |