---
base_model: EleutherAI/pythia-70m-deduped
model_name: "Pythia-70M Sarcasm LoRA by hyvve"
library_name: peft
tags:
- text-generation
- lora
- peft
- sarcasm
- pythia
- fine-tuning
- causal-lm
- EleutherAI
license: apache-2.0
pipeline_tag: text-generation
---

# Model Card for Pythia-70M Sarcasm LoRA

This model is a LoRA (Low-Rank Adaptation) fine-tune of the `EleutherAI/pythia-70m-deduped` model, adapted for tasks related to sarcasm.

## Model Details

### Model Description

This is a PEFT LoRA adapter for the `EleutherAI/pythia-70m-deduped` model, fine-tuned on a dataset related to sarcasm. As a Causal Language Model (CLM), its primary function is to predict the next token in a sequence. The fine-tuning aims to imbue the model with an understanding, or at least a stylistic representation, of sarcastic language.

- **Developed by:** [hyvve](https://hyvve.xyz) (based on job configurations)
- **Model type:** Causal Language Model (specifically, a LoRA adapter for a GPT-NeoX-based model)
- **Language(s) (NLP):** English (derived from the base model and the assumed dataset language)
- **License:** Apache-2.0 (inherited from the base model `EleutherAI/pythia-70m-deduped`)
- **Finetuned from model:** `EleutherAI/pythia-70m-deduped`

### Model Sources

- **Repository (LoRA Adapter):** `https://huggingface.co/manny-uncharted/pythia-70m-sarcasm-lora` (based on `hf_target_model_repo_id`)
- **Base Model Repository:** `https://huggingface.co/EleutherAI/pythia-70m-deduped`
- **Paper (Pythia suite):** [arXiv:2304.01373](https://arxiv.org/abs/2304.01373)

## Uses

### Direct Use

This LoRA adapter is intended to be loaded on top of the `EleutherAI/pythia-70m-deduped` base model. It can be used for:

* Generating text with a sarcastic tone or style.
* Completing prompts in a sarcastic manner.
* Research into modeling nuanced aspects of language, such as sarcasm, with smaller LMs.

**Note:** Because the fine-tuning dataset was extremely small (~1,000 examples, split across files of 200 examples each), the model's ability to robustly generate or understand sarcasm is very limited. It primarily serves as a pipeline and integration test.

### Downstream Use

* Further fine-tuning on larger, more diverse sarcasm datasets (see the sketch below).
* Integration into applications requiring conditional text generation with a sarcastic flavor (e.g., chatbots, creative writing tools), though extensive further tuning would be necessary.
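As a starting point for continued training, the sketch below loads this adapter in trainable mode and resumes causal-LM fine-tuning with the PEFT and Transformers libraries. It is a minimal, illustrative example, not the original training setup: the dataset file (`sarcasm_corpus.jsonl`), its `text` field, and all hyperparameters are assumptions you would replace with your own.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from peft import PeftModel

base_model_id = "EleutherAI/pythia-70m-deduped"
adapter_model_id = "manny-uncharted/pythia-70m-sarcasm-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
# is_trainable=True keeps the LoRA weights unfrozen so training can continue
model = PeftModel.from_pretrained(base_model, adapter_model_id, is_trainable=True)

# Hypothetical JSONL corpus with one {"text": "..."} record per line
dataset = load_dataset("json", data_files="sarcasm_corpus.jsonl")["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="pythia-70m-sarcasm-lora-continued",
    per_device_train_batch_size=8,
    num_train_epochs=3,
    learning_rate=2e-4,  # illustrative values only
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    # mlm=False yields causal-LM labels (inputs shifted by one token)
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("pythia-70m-sarcasm-lora-continued")  # saves only the adapter weights
```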
### Out-of-Scope Use

* Reliable sarcasm detection or classification without significant further development and evaluation.
* Generating harmful, biased, or offensive content, even if framed as sarcasm.
* Use in critical applications where misinterpretation of sarcasm could have negative consequences.
* Generating fluent, coherent, and factually accurate long-form text beyond the capabilities of the 70M-parameter base model.

## Bias, Risks, and Limitations

* **Limited Scope:** Fine-tuned on a very small dataset (~1,000 examples), so its understanding and generation of sarcasm will be superficial and will not generalize.
* **Inherited Biases:** Inherits biases from the `EleutherAI/pythia-70m-deduped` base model, which was trained on The Pile. These can include societal, gender, and racial biases.
* **Misinterpretation of Sarcasm:** Sarcasm is highly context-dependent and subjective. The model may generate text that is inappropriately sarcastic or fail to interpret sarcastic prompts correctly.
* **Potential for Harmful Sarcasm:** Sarcasm can be used to convey negativity or veiled aggression. The model might inadvertently generate such content.
* **Numerical Instability:** During the logged training run, an `eval_loss: nan` was observed, indicating potential issues with evaluation on the tiny validation set or numerical instability under the given configuration. The `train_loss: 0.0` also suggests extreme overfitting or problems with the learning process on such limited data.

### Recommendations

* **Thorough Evaluation:** Before any production use, the model (after further fine-tuning on a substantial dataset) would require rigorous evaluation for both sarcasm-generation quality and potential biases.
* **Content Moderation:** Downstream applications should implement content moderation and safety filters.
* **Context is Key:** Use with clear context, and be aware that its sarcastic capabilities are likely very brittle due to the limited training data.
* **Do Not Use for Critical Decisions:** This model, in its current state, is not suitable for any critical applications.

## How to Get Started with the Model

To use this LoRA adapter, load the base model and then apply the adapter with the PEFT library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model_id = "EleutherAI/pythia-70m-deduped"
adapter_model_id = "manny-uncharted/pythia-70m-sarcasm-lora"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the base model. If the adapter was trained with QLoRA, pass a
# BitsAndBytesConfig here, as during training. For simplicity, this example
# loads without quantization; adapt as needed (see the 4-bit sketch below).
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    # quantization_config=BitsAndBytesConfig(...),  # add if loading in 4-bit/8-bit
    # torch_dtype=torch.float16,  # or torch.bfloat16
    device_map="auto",
)

# Load the PEFT LoRA adapter
model = PeftModel.from_pretrained(base_model, adapter_model_id)
model = model.merge_and_unload()  # optional: merge the adapter into the base model for faster inference

# Generate text; adjust generation parameters as needed
prompt = "The weather today is just "  # example prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
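The comments above mention QLoRA-style loading. As an illustrative sketch (not a confirmed record of how this adapter was trained), the base model could be loaded in 4-bit with `bitsandbytes` before attaching the adapter; the quantization settings below are assumptions. With a quantized base model, it is generally better to keep the adapter unmerged rather than calling `merge_and_unload()`.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Illustrative 4-bit NF4 quantization config; not necessarily the training setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-70m-deduped",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the adapter and leave it unmerged: merging LoRA weights into
# 4-bit quantized base weights is lossy, so inference runs through the adapter
model = PeftModel.from_pretrained(base_model, "manny-uncharted/pythia-70m-sarcasm-lora")
```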