NoahShen/BAIT-ModelZoo

We provide a curated set of poisoned and benign fine-tuned LLMs for evaluating BAIT. The model zoo follows this file structure:

BAIT-ModelZoo/
├── base_models/
│   ├── BASE/MODEL/1/FOLDER  
│   ├── BASE/MODEL/2/FOLDER
│   └── ...
├── models/
│   ├── id-0001/
│   │   ├── model/
│   │   │   └── ...
│   │   └── config.json
│   ├── id-0002/
│   └── ...
└── METADATA.csv

base_models stores pretrained LLMs downloaded from Hugging Face. We evaluate BAIT on 3 LLM architectures.

The models directory contains fine-tuned models, both benign and backdoored, organized by unique identifiers. Each model folder includes:

- The model files
- A config.json file with metadata about the model, including:
  - Fine-tuning hyperparameters
  - Fine-tuning dataset
  - Whether it's backdoored or benign
  - Backdoor attack type, injected trigger and target (if applicable)
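
As a rough illustration of how the per-model metadata could be read, here is a minimal Python sketch that walks models/ and loads each config.json. It assumes the zoo has been downloaded to a local folder (the path "BAIT-ModelZoo" is a placeholder) and that model folders are named id-XXXX as in the tree above. The helper name iter_model_configs is hypothetical, and because the exact key names inside config.json are not documented here, the sketch only lists them rather than relying on any particular key.

```python
import json
from pathlib import Path


def iter_model_configs(zoo_root):
    """Yield (model_id, config_dict) for every models/id-*/config.json under zoo_root."""
    models_dir = Path(zoo_root) / "models"
    for model_dir in sorted(models_dir.glob("id-*")):
        config_path = model_dir / "config.json"
        if not config_path.is_file():
            continue  # skip folders that have no config.json
        with config_path.open(encoding="utf-8") as f:
            yield model_dir.name, json.load(f)


if __name__ == "__main__":
    # "BAIT-ModelZoo" is an assumed local path to the downloaded zoo.
    for model_id, config in iter_model_configs("BAIT-ModelZoo"):
        # Key names are not specified in this README, so just list them.
        print(model_id, sorted(config))
```

Printing the keys first is a cheap way to see which fields (hyperparameters, dataset, backdoor details) are actually present before writing code that depends on them.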

The METADATA.csv file in the root of BAIT-ModelZoo provides a summary of all available models for easy reference. The model zoo currently contains 91 models, and we will keep updating it with new ones.
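
A minimal sketch of inspecting the summary file with Python's standard csv module follows. The helper name load_metadata and the local path "BAIT-ModelZoo" are assumptions, and since the column names of METADATA.csv are not listed in this README, the sketch reads them from the header row instead of hard-coding any.

```python
import csv
from pathlib import Path


def load_metadata(zoo_root):
    """Return (column_names, rows) from METADATA.csv; each row is a dict keyed by column name."""
    csv_path = Path(zoo_root) / "METADATA.csv"
    with csv_path.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        return list(reader.fieldnames or []), rows


if __name__ == "__main__":
    # "BAIT-ModelZoo" is an assumed local path to the downloaded zoo.
    columns, rows = load_metadata("BAIT-ModelZoo")
    print("columns:", columns)
    print("number of models listed:", len(rows))
```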
