|
--- |
|
license: apache-2.0 |
|
language: |
|
- en |
|
base_model: |
|
- meta-llama/Llama-3.1-8B-Instruct
|
pipeline_tag: text-generation |
|
tags: |
|
- lora |
|
- adapter |
|
- writing |
|
- CoT |
|
--- |
|
# Merged-Llama-Adapters-317-320 |
|
|
|
A merged LoRA adapter combining four fine-tuned adapters (317-320) for the Llama-3.1-8B-Instruct language model.
|
|
|
## Model Details |
|
|
|
- Base Model: meta-llama/Llama-3.1-8B-Instruct
|
- Adaptation Method: Merged LoRA |
|
- Source Adapters: |
|
- https://huggingface.co/kevin009/llama317 |
|
- https://huggingface.co/kevin009/llama318 |
|
- https://huggingface.co/kevin009/llama319 |
|
- https://huggingface.co/kevin009/llama320 |
|
|
|
## Merger Configuration |
|
|
|
### Source Adapters |
|
|
|
All source adapters share the following configuration (a corresponding `LoraConfig` sketch follows the list):
|
- Rank (r): 16 |
|
- Alpha: 16 |
|
- Target Modules: |
|
- q_proj (Query projection) |
|
- k_proj (Key projection) |
|
- v_proj (Value projection) |
|
- o_proj (Output projection) |
|
  - up_proj (MLP up projection)

  - down_proj (MLP down projection)

  - gate_proj (MLP gate projection)
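
For reference, this shared setup corresponds roughly to the following PEFT `LoraConfig`. This is a sketch of the presumed training-time configuration, not a file shipped with the source adapters; the `lora_dropout`, `bias`, and `task_type` values are assumptions.

```python
from peft import LoraConfig

# Approximate configuration shared by adapters 317-320
# (only r, alpha, and the target modules are documented above;
#  dropout, bias, and task_type are assumed defaults)
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "up_proj", "down_proj", "gate_proj",     # MLP projections
    ],
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
)
```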
|
|
|
### Merger Details |
|
|
|
- Merger Method: Weighted adapter combination via PEFT's `add_weighted_adapter` (TIES)

- Merger Weights: Equal weights for each adapter (effectively 0.25 each)
|
- Combined Rank: 16 (maintained from source adapters) |
|
|
|
## Usage |
|
|
|
This merged adapter must be used with the base Llama-3.1-8B-Instruct model.
|
|
|
### Loading the Model |
|
|
|
```python |
|
from peft import PeftModel, PeftConfig |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
# Load base model |
|
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
|
|
|
# Load merged LoRA adapter |
|
model = PeftModel.from_pretrained(base_model, "path_to_merged_adapter") |
|
``` |
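
Once loaded, the model behaves like any other causal language model. A minimal generation sketch (the prompt and decoding settings below are illustrative placeholders, not recommendations):

```python
# Build a chat-style prompt with the tokenizer's chat template
messages = [{"role": "user", "content": "Write a short paragraph about autumn."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate with the merged adapter active
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```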
|
|
|
## Limitations and Biases |
|
|
|
- This merged adapter inherits limitations and biases from: |
|
- The base Llama-3.1-8B-instruct model |
|
- All four source adapters |
|
- The merging process may result in: |
|
- Potential loss of specialized capabilities from individual adapters |
|
- Averaged behavior across different adapter specializations |
|
- Possible interference between adapter weights |
|
|
|
## Merging Process |
|
|
|
The adapters were merged using the following approach: |
|
1. Weighted combination of adapter weights via PEFT's `add_weighted_adapter` (TIES method)

2. Equal weighting applied to each source adapter (effectively 0.25 each)
|
3. Preservation of original LoRA rank and architecture |
|
|
|
### Method Used |
|
|
|
The adapters were merged using the PEFT (Parameter-Efficient Fine-Tuning) library's weighted adapter combination feature, `add_weighted_adapter`. The process combines multiple LoRA adapters according to the specified weights; in this case the TIES combination method was used, as shown below.
|
|
|
### Step-by-Step Merging Process |
|
|
|
1. Load the base model and initial adapter: |
|
```python |
|
from peft import PeftModel, PeftConfig |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
MODEL_NAME = "meta-llama/Meta-Llama-3.1-8B-Instruct" |
|
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME) |
|
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME) |
|
|
|
# Load first adapter as base (the bare names here are local adapter directories;
# the published source adapters live at kevin009/llama317-320 on the Hub)

peft_model = PeftModel.from_pretrained(model, "llama319", adapter_name="llama319")
|
``` |
|
|
|
2. Load additional adapters: |
|
```python |
|
# Load remaining adapters |
|
peft_model.load_adapter("llama320", adapter_name="llama320") |
|
peft_model.load_adapter("llama318", adapter_name="llama318") |
|
peft_model.load_adapter("llama317", adapter_name="llama317") |
|
``` |
|
|
|
3. Configure and execute the merger: |
|
```python |
|
# Define adapters and their weights |
|
adapters = ["llama319", "llama320", "llama318", "llama317"] |
|
weights = [1.0, 1.0, 1.0, 1.0] # Equal weights for all adapters |
|
|
|
# Merge adapters |
|
peft_model.add_weighted_adapter( |
|
adapters, |
|
weights, |
|
"merge", |
|
combination_type="ties", # Using ties combination method |
|
density=0.2 # Density parameter for merger |
|
) |
|
|
|
# Set active adapter to merged version |
|
peft_model.set_adapter("merge") |
|
|
|
# Save the merged adapter |
|
peft_model.save_pretrained("merged") |
|
``` |
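
Optionally, the merged adapter can also be folded into the base model weights to produce a standalone checkpoint that does not require PEFT at inference time. A short sketch under the same setup (the output directory name is a placeholder):

```python
# Fold the active ("merge") adapter into the base weights and drop the PEFT wrapper
merged_model = peft_model.merge_and_unload()

# Save a standalone model that can be loaded with AutoModelForCausalLM alone
merged_model.save_pretrained("merged-full-model")
tokenizer.save_pretrained("merged-full-model")
```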
|
|
|
### Key Parameters |
|
|
|
- `combination_type="ties"`: Uses the TIES merging method (TrIm, Elect Sign & Merge), which prunes and sign-aligns adapter weights to reduce interference between adapters
|
- `density=0.2`: The fraction of weight values retained during TIES pruning (the remaining values are zeroed before merging)
|
- `weights=[1.0, 1.0, 1.0, 1.0]`: Equal weighting for all adapters, so each contributes roughly equally (about 0.25 after normalization); a plain linear-combination alternative is sketched after this list
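
PEFT's `add_weighted_adapter` also supports a plain linear combination. If an explicit equal-weight (0.25 per adapter) linear interpolation were preferred over TIES, the call would look roughly like the sketch below; this is shown for comparison only and is not how the published adapter was produced.

```python
# Comparison only: an equal-weight linear merge instead of TIES
peft_model.add_weighted_adapter(
    adapters=["llama319", "llama320", "llama318", "llama317"],
    weights=[0.25, 0.25, 0.25, 0.25],  # explicit 0.25 per adapter
    adapter_name="merge_linear",
    combination_type="linear",  # requires all adapters to share the same rank (r=16 here)
)
peft_model.set_adapter("merge_linear")
```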
|
|
|
### Notes |
|
|
|
- The order of loading adapters may affect the final result |
|
- Equal weights were chosen to maintain balanced influence from each adapter |
|
- The merged adapter maintains the same architecture and rank as the original adapters |
|
- While this adapter merges multiple fine-tunes, each component was developed as part of independent research efforts to explore and improve language model capabilities as part of an R&D process.
|
|
|
## License |
|
|
|
Licensed under Apache 2.0 License. |
|
|
|
This merged adapter is part of independent individual research work. While the code is open-source under the Apache 2.0 license, please note: |
|
|
|
- You are free to use, modify, and distribute this adapter following the Apache 2.0 license terms |
|
- This work is provided "as is" without warranties or conditions of any kind |
|
- This is an independent research project and not affiliated with any organization |
|
- Attribution is appreciated but not required |
|
- For full license details, see: https://www.apache.org/licenses/LICENSE-2.0 |