---
datasets:
- davzoku/moecule-stock-market-outlook
- davzoku/moecule-kyc
base_model:
- unsloth/Llama-3.2-1B-Instruct
pipeline_tag: question-answering
---

# Moecule 2x1B M9 KS

<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/63c51d0e72db0f638ff1eb82/8BNZvdKBuSComBepbH-QW.png" width="150" height="150" alt="logo"> <br>
</p>

## Model Details

This model is a mixture of experts (MoE) built from task-specific experts using the [RhuiDih/moetify](https://github.com/RhuiDih/moetify) library. All relevant expert models, LoRA adapters, and datasets are available at [Moecule Ingredients](https://huggingface.co/collections/davzoku/moecule-ingredients-67dac0e6210eb1d95abc6411).

## Key Features

- **Zero Additional Training:** Combine existing domain-specific or task-specific experts into a powerful MoE model without any additional training! The sketch after this list illustrates the idea.
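
In rough terms, each ingredient model contributes its own copy of the routed modules (`mlp` and `q_proj` in the creation command below), the rest of the base model is shared, and a small newly initialized router blends the experts' outputs per token (the "Routers parameters" in the log further down). The snippet below is a minimal, illustrative sketch of token-level soft routing over two expert MLPs; it is not moetify's actual implementation, and every name in it is made up for illustration.

```python
import torch
import torch.nn as nn


class TinyMoELayer(nn.Module):
    """Illustrative sketch: blend the outputs of two frozen expert MLPs per token."""

    def __init__(self, hidden_dim: int, expert_mlps):
        super().__init__()
        self.experts = nn.ModuleList(expert_mlps)  # weights taken from the expert models
        self.router = nn.Linear(hidden_dim, len(expert_mlps), bias=False)  # the only new weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, hidden_dim)
        weights = self.router(x).softmax(dim=-1)          # (tokens, num_experts)
        expert_outs = torch.stack([expert(x) for expert in self.experts], dim=-1)
        return (expert_outs * weights.unsqueeze(1)).sum(dim=-1)  # weighted mix per token


# Two stand-in "experts" shaped like a small transformer MLP.
hidden = 16
experts = [
    nn.Sequential(nn.Linear(hidden, 4 * hidden), nn.SiLU(), nn.Linear(4 * hidden, hidden))
    for _ in range(2)
]
layer = TinyMoELayer(hidden, experts)
print(layer(torch.randn(5, hidden)).shape)  # torch.Size([5, 16])
```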
23 |
+
|
24 |
+
## System Requirements
|
25 |
+
|
26 |
+
| Steps | System Requirements |
|
27 |
+
| ---------------- | ---------------------- |
|
28 |
+
| MoE Creation | > 22.5 GB System RAM |
|
29 |
+
| Inference (fp16) | GPU with > 5.4 GB VRAM |

## MoE Creation

To reproduce this model, run the following commands:

```shell
# clone the moetify fork that fixes a dependency issue
git clone -b fix-transformers-4.47.1-FlashA2-dependency --single-branch https://github.com/davzoku/moetify.git
cd moetify && pip install -e .

python -m moetify.mix \
    --output_dir ./moecule-2x1b-m9-ks \
    --model_path unsloth/Llama-3.2-1B-Instruct \
    --modules mlp q_proj \
    --ingredients \
        davzoku/kyc_expert_1b \
        davzoku/stock_market_expert_1b
```
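
As a quick sanity check (a minimal sketch, assuming the command above completed, wrote its output to `./moecule-2x1b-m9-ks`, and that the moetify fork is still installed in the environment), you can load the merged checkpoint locally and compare its parameter count with the totals logged in the next section:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the freshly mixed model from the local output directory. trust_remote_code=True
# mirrors the inference snippet below so the custom MoE architecture resolves.
model = AutoModelForCausalLM.from_pretrained("./moecule-2x1b-m9-ks", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("./moecule-2x1b-m9-ks")

print(sum(p.numel() for p in model.parameters()))  # expected: 2371028992 (see "Model Parameters")
```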

## Model Parameters

```shell
INFO:root:Stem parameters: 626067456
INFO:root:Experts parameters: 1744830464
INFO:root:Routers parameters: 131072
INFO:root:MOE total parameters (numel): 2371028992
INFO:root:MOE total parameters : 2371028992
INFO:root:MOE active parameters: 2371028992
```
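
For context, the fp16 VRAM figure in the system requirements lines up with these counts: all ~2.37B parameters are active, so the weights alone take roughly 4.7 GB in fp16, before activations, the KV cache, and framework overhead. A back-of-the-envelope check (illustrative arithmetic only, not part of the moetify output):

```python
total_params = 2_371_028_992           # "MOE total parameters" from the log above
fp16_weight_bytes = total_params * 2   # 2 bytes per parameter in fp16
print(f"{fp16_weight_bytes / 1e9:.2f} GB")  # ~4.74 GB for the weights alone
```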

## Inference

To run inference with this model, you can use the following code snippet:

```python
# Install the moetify fork that fixes a dependency issue (shell commands; run them
# in a terminal, or prefix them with "!" in a notebook):
#   git clone -b fix-transformers-4.47.1-FlashA2-dependency --single-branch https://github.com/davzoku/moetify.git
#   cd moetify && pip install -e .

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "<model-name>"  # e.g. the model id of this repository

model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)


def format_instruction(row):
    return f"""### Question: {row}"""


greedy_generation_config = GenerationConfig(
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=1,
    max_new_tokens=128,
    repetition_penalty=1.2,
)

input_text = "In what ways did Siemens's debt restructuring on March 06, 2024 reflect its strategic priorities?"
formatted_input = format_instruction(input_text)
inputs = tokenizer(formatted_input, return_tensors="pt").to("cuda")

with torch.no_grad():
    outputs = model.generate(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        generation_config=greedy_generation_config,
    )

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```

## The Team

- CHOCK Wan Kee
- Farlin Deva Binusha DEVASUGIN MERLISUGITHA
- GOH Bao Sheng
- Jessica LEK Si Jia
- Sinha KHUSHI
- TENG Kok Wai (Walter)

## References

- [Flexible and Effective Mixing of Large Language Models into a Mixture of Domain Experts](https://arxiv.org/abs/2408.17280v2)
- [RhuiDih/moetify](https://github.com/RhuiDih/moetify)