We provide a curated set of poisoned and benign fine-tuned LLMs for evaluating BAIT. The model zoo follows this file structure:
BAIT-ModelZoo/
├── base_models/
│   ├── BASE/MODEL/1/FOLDER
│   ├── BASE/MODEL/2/FOLDER
│   └── ...
├── models/
│   ├── id-0001/
│   │   ├── model/
│   │   │   └── ...
│   │   └── config.json
│   ├── id-0002/
│   └── ...
└── METADATA.csv
`base_models` stores pretrained LLMs downloaded from Hugging Face. We evaluate BAIT on 3 LLM architectures.
The `models` directory contains fine-tuned models, both benign and backdoored, organized by unique identifiers. Each model folder includes:
- The model files
- A `config.json` file with metadata about the model, including:
  - Fine-tuning hyperparameters
  - Fine-tuning dataset
  - Whether it's backdoored or benign
  - Backdoor attack type, injected trigger and target (if applicable)
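The per-model layout above can be walked with a few lines of Python. This is a minimal sketch, not part of BAIT itself: `load_model_configs` is a hypothetical helper name, and it assumes only what the tree shows, namely that each `models/id-XXXX/` folder holds a `config.json`. The key names inside `config.json` are not listed here, so the sketch returns the parsed JSON untouched rather than guessing at fields.

```python
import json
from pathlib import Path


def load_model_configs(zoo_root):
    """Return {model_id: parsed config.json} for each folder under models/.

    Hypothetical helper: it relies only on the directory layout shown above.
    """
    configs = {}
    models_dir = Path(zoo_root) / "models"
    for model_dir in sorted(models_dir.iterdir()):
        config_path = model_dir / "config.json"
        # Skip stray files and any folder that has no config.json.
        if not config_path.is_file():
            continue
        with config_path.open(encoding="utf-8") as f:
            # Keep the JSON as-is; its key names are not documented here.
            configs[model_dir.name] = json.load(f)
    return configs
```

Calling `load_model_configs("BAIT-ModelZoo")` on a local copy (the path is an assumption) gives a dictionary keyed by identifiers such as `id-0001`, which can then be inspected to find the backdoor-related fields.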
The `METADATA.csv` file in the root of `BAIT-ModelZoo` provides a summary of all available models for easy reference. The model zoo currently contains 91 models, and we will keep updating it with new models.
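For a quick look at the summary file, something like the following may be enough. It is a sketch under two assumptions that are not stated above: that `METADATA.csv` is comma-separated and that its first row is a header. `read_metadata` is a hypothetical helper name, and no column names are assumed.

```python
import csv
from pathlib import Path


def read_metadata(zoo_root):
    """Return the rows of METADATA.csv as a list of dicts keyed by the header.

    Assumes a comma-separated file whose first row is a header.
    """
    metadata_path = Path(zoo_root) / "METADATA.csv"
    # newline="" is what the csv module recommends for files it reads.
    with metadata_path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

With a local copy of the zoo, `len(read_metadata("BAIT-ModelZoo"))` gives the number of listed models, and the keys of the first row show which columns the file actually provides.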