Model Card for Pythia-70M Sarcasm LoRA

This model is a LoRA (Low-Rank Adaptation) fine-tune of the EleutherAI/pythia-70m-deduped model, specifically adapted for tasks related to sarcasm.

Model Details

Model Description

This is a PEFT LoRA adapter for the EleutherAI/pythia-70m-deduped model. It has been fine-tuned on a dataset related to sarcasm. As a Causal Language Model (CLM), its primary function is to predict the next token in a sequence. This fine-tuning aims to imbue the model with an understanding or stylistic representation of sarcastic language.

  • Developed by: hyvve (based on job configurations)
  • Model type: Causal Language Model (specifically, a LoRA adapter for a GPT-NeoX based model)
  • Language(s) (NLP): English (derived from the base model and assumed dataset language)
  • License: Apache-2.0 (inherited from the base model EleutherAI/pythia-70m-deduped)
  • Finetuned from model: EleutherAI/pythia-70m-deduped

Model Sources

  • Repository (LoRA Adapter): https://huggingface.co/manny-uncharted/pythia-70m-sarcasm-lora (based on hf_target_model_repo_id)
  • Base Model Repository: https://huggingface.co/EleutherAI/pythia-70m-deduped
  • Paper (Pythia suite): arXiv:2304.01373

Uses

Direct Use

This LoRA adapter is intended to be loaded on top of the EleutherAI/pythia-70m-deduped base model. It can be used for:

  • Generating text with a sarcastic tone or style.
  • Completing prompts in a sarcastic manner.
  • Research into modeling nuanced aspects of language like sarcasm with smaller LMs.

Note: Because of the extremely small dataset used for fine-tuning (approximately 1,000 examples, split across files of 200 examples each), the model's ability to robustly generate or understand sarcasm is very limited. It primarily serves as a pipeline and integration test.

Downstream Use

  • Further fine-tuning on larger, more diverse sarcasm datasets (a minimal sketch follows this list).
  • Integration into applications requiring conditional text generation with a sarcastic flavor (e.g., chatbots, creative writing tools), though extensive further tuning would be necessary.
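
As a starting point for further fine-tuning, here is a minimal sketch that continues training this adapter with the PEFT library and the Hugging Face Trainer. The dataset path, the "text" column name, and the hyperparameters are placeholders to adapt, not settings taken from the original training job.

from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from peft import PeftModel
from datasets import load_dataset

base_model_id = "EleutherAI/pythia-70m-deduped"
adapter_model_id = "manny-uncharted/pythia-70m-sarcasm-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
# is_trainable=True keeps the LoRA weights trainable instead of freezing them
model = PeftModel.from_pretrained(base_model, adapter_model_id, is_trainable=True)

# Placeholder dataset with a "text" column of sarcastic examples
dataset = load_dataset("path/to/your-sarcasm-dataset", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="pythia-70m-sarcasm-continued",
                           per_device_train_batch_size=8,
                           num_train_epochs=1),
    train_dataset=tokenized,
    # mlm=False gives standard causal-LM labels (inputs shifted by one)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("pythia-70m-sarcasm-continued")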

Out-of-Scope Use

  • Reliable sarcasm detection or classification without significant further development and evaluation.
  • Generating harmful, biased, or offensive content, even if framed as sarcasm.
  • Use in critical applications where misinterpretation of sarcasm could have negative consequences.
  • Generating fluent, coherent, and factually accurate long-form text beyond the capabilities of the 70M parameter base model.

Bias, Risks, and Limitations

  • Limited Scope: Fine-tuned on a very small dataset (~1,000 examples), so its understanding and generation of sarcasm will be superficial and not generalizable.
  • Inherited Biases: Inherits biases from the EleutherAI/pythia-70m-deduped base model, which was trained on The Pile. These can include societal, gender, and racial biases.
  • Misinterpretation of Sarcasm: Sarcasm is highly context-dependent and subjective. The model may generate text that is inappropriately sarcastic or fail to understand sarcastic prompts correctly.
  • Potential for Harmful Sarcasm: Sarcasm can be used to convey negativity or veiled aggression. The model might inadvertently generate such content.
  • Numerical Instability: During the logged training run, an eval_loss of NaN was observed, indicating potential issues with evaluation on the tiny validation set or numerical instability under the given configuration. The reported train_loss of 0.0 likewise suggests extreme overfitting or problems with the learning process on such limited data.

Recommendations

  • Thorough Evaluation: Before any production use, the model (after further fine-tuning on a substantial dataset) would require rigorous evaluation for both sarcasm generation quality and potential biases.
  • Content Moderation: Downstream applications should implement content moderation and safety filters (a minimal example follows this list).
  • Context is Key: Use with clear context and be aware that its sarcastic capabilities are likely very brittle due to the limited training data.
  • Do Not Use for Critical Decisions: This model, in its current state, is not suitable for any critical applications.
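
As one concrete illustration of the content-moderation recommendation, the sketch below gates generated text behind a safety predicate. The is_safe function is hypothetical; in practice it would be a toxicity classifier or a keyword filter of your choosing.

def generate_with_moderation(model, tokenizer, prompt, is_safe, **gen_kwargs):
    # Generate a completion, then release it only if the safety check passes.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, **gen_kwargs)
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return text if is_safe(text) else "[filtered]"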

How to Get Started with the Model

To use this LoRA adapter, you'll need to load the base model and then apply the adapter using the PEFT library.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model_id = "EleutherAI/pythia-70m-deduped"
adapter_model_id = "manny-uncharted/pythia-70m-sarcasm-lora" # Replace with your actual model ID

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the base model. For simplicity, this example loads it without
# quantization; if the adapter was trained with QLoRA, load the base model
# quantized instead (see the 4-bit sketch below).
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    # quantization_config=BitsAndBytesConfig(...) # Add if loading in 4-bit/8-bit
    # torch_dtype=torch.float16, # Or torch.bfloat16
    device_map="auto"
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, adapter_model_id)

# Optional: merge the adapter into the base weights for faster inference.
# Skip this step if you plan to train the adapter further, or if the base
# model was loaded quantized (merging into quantized weights may not be
# supported by all PEFT versions).
model = model.merge_and_unload()

# Now you can use the model for generation
prompt = "The weather today is just " # Example prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate text
# Adjust generation parameters as needed
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_k=50, top_p=0.95, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
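
The commented-out lines above hint at quantized loading. Below is a minimal sketch of loading the base model in 4-bit via bitsandbytes before attaching the adapter; the specific quantization settings are illustrative assumptions, not a record of how this adapter was trained.

from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base_model_4bit = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the adapter as before. Do not call merge_and_unload() here:
# merging into 4-bit weights may not be supported by your PEFT version.
model_4bit = PeftModel.from_pretrained(base_model_4bit, adapter_model_id)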