---
library_name: transformers
pipeline_tag: translation
license: mit
datasets:
- westenfelder/NL2SH-ALFA
language:
- en
base_model: Qwen/Qwen2.5-Coder-3B-Instruct
model-index:
- name: Qwen2.5-Coder-3B-Instruct-NL2SH
results:
- task:
type: translation
name: Natural Language to Bash Translation
dataset:
        type: westenfelder/NL2SH-ALFA
name: NL2SH-ALFA
split: test
metrics:
- type: accuracy
value: 0.51
name: InterCode-ALFA
source:
name: InterCode-ALFA
url: https://arxiv.org/abs/2502.06858
---
# Model Card for Qwen2.5-Coder-3B-Instruct-NL2SH
This model translates natural language (English) instructions to Bash commands.
## Model Details
### Model Description
This model is a fine-tuned version of the [Qwen2.5-Coder-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct) model trained on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) dataset for the task of natural language to Bash translation (NL2SH). For more information, please refer to the [paper](https://arxiv.org/abs/2502.06858).
- **Developed by:** [Anyscale Learning For All (ALFA) Group at MIT-CSAIL](https://alfagroup.csail.mit.edu/)
- **Language:** English
- **License:** MIT License
- **Finetuned from model:** [Qwen/Qwen2.5-Coder-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct)
### Model Sources
- **Repository:** [GitHub Repo](https://github.com/westenfelder/NL2SH)
- **Paper:** [LLM-Supported Natural Language to Bash Translation](https://arxiv.org/abs/2502.06858)
## Uses
### Direct Use
This model is intended for research on machine translation. The model can also be used as an educational resource for learning Bash.
### Out-of-Scope Use
This model should not be used in production or automated systems without human verification.
**Considerations for use in high-risk environments:** Due to its low accuracy and potential to generate harmful commands, this model should not be deployed in high-risk environments.
## Bias, Risks, and Limitations
This model has a tendency to generate overly complex and incorrect Bash commands. It may produce harmful commands that delete data or corrupt a system. This model is not intended for natural languages other than English, scripting languages other than Bash, or multi-line Bash scripts.
### Recommendations
Users are encouraged to use this model as a Bash reference tool and should not execute commands without verification.
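As a concrete safeguard, generated commands can be routed through an explicit confirmation step before anything touches the shell. The sketch below is a minimal, hypothetical wrapper (not part of the released model); it assumes the `translate` function defined in the quick-start example in the next section.
```python
import subprocess

def translate_and_confirm(instruction):
    # `translate` is the helper from the quick-start example below
    command = translate(instruction)
    print(f"Suggested command: {command}")
    # Require an explicit "y" before executing anything the model produced
    if input("Run this command? [y/N] ").strip().lower() == "y":
        subprocess.run(command, shell=True, check=False)
    else:
        print("Command not executed.")
```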
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
def translate(prompt):
    model_name = "westenfelder/Qwen2.5-Coder-3B-Instruct-NL2SH"
    tokenizer = AutoTokenizer.from_pretrained(model_name, clean_up_tokenization_spaces=False)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="cuda", torch_dtype=torch.bfloat16)
    messages = [
        {"role": "system", "content": "Your task is to translate a natural language instruction to a Bash command. You will receive an instruction in English and output a Bash command that can be run in a Linux terminal."},
        {"role": "user", "content": prompt},
    ]
    # Build the chat-formatted prompt and move it to the model's device
    tokens = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_tensors="pt"
    ).to(model.device)
    attention_mask = torch.ones_like(tokens)
    # Greedy decoding: sampling parameters are explicitly unset
    outputs = model.generate(
        tokens,
        attention_mask=attention_mask,
        max_new_tokens=100,
        do_sample=False,
        temperature=None,
        top_p=None,
        top_k=None,
    )
    # Strip the prompt tokens and decode only the generated command
    response = outputs[0][tokens.shape[-1]:]
    return tokenizer.decode(response, skip_special_tokens=True)

nl = "List files in the /workspace directory that were accessed over an hour ago."
sh = translate(nl)
print(sh)
```
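Note that generation is configured for greedy decoding (`do_sample=False`, with `temperature`, `top_p`, and `top_k` explicitly unset), so the model produces a deterministic translation for a given instruction.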
## Training Details
### Training Data
This model was trained on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) dataset.
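For a quick look at the data, the dataset can be loaded directly from the Hub with the `datasets` library. The snippet below is a minimal sketch that assumes only the dataset identifier and the test split named in the evaluation section, not any particular column layout.
```python
from datasets import load_dataset

# Load the NL2SH-ALFA dataset from the Hugging Face Hub
ds = load_dataset("westenfelder/NL2SH-ALFA")
print(ds)              # available splits and their columns
print(ds["test"][0])   # inspect the first test example
```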
### Training Procedure
Please refer to Sections 4.1 and 4.3.4 of the [paper](https://arxiv.org/abs/2502.06858) for information about data pre-processing, training hyperparameters, and hardware.
## Evaluation
This model was evaluated on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) test set using the [InterCode-ALFA](https://github.com/westenfelder/InterCode-ALFA) benchmark.
### Results
This model achieved an accuracy of **0.51** on the InterCode-ALFA benchmark.
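For a faithful measurement, use the InterCode-ALFA harness linked above; its accuracy metric is not reproducible with string matching alone. As a rough sanity check only, a naive exact-match baseline can be sketched as follows, assuming hypothetical column names `nl` and `bash` (verify against the actual dataset schema) and the `translate` function from the quick-start example.
```python
from datasets import load_dataset

# Naive exact-match baseline -- NOT the InterCode-ALFA metric,
# which is computed by the benchmark harness linked above.
ds = load_dataset("westenfelder/NL2SH-ALFA", split="test")
correct = 0
for example in ds:
    # "nl" and "bash" are assumed field names; adjust to the real schema
    prediction = translate(example["nl"])
    correct += int(prediction.strip() == example["bash"].strip())
print(f"Exact-match accuracy: {correct / len(ds):.2f}")
```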
## Environmental Impact
Experiments were conducted using private infrastructure with an approximate carbon efficiency of 0.432 kgCO2eq/kWh. A cumulative 12 hours of computation was performed on an RTX A6000 GPU (TDP of 300 W). Total emissions are estimated at 1.56 kgCO2eq (12 h × 0.3 kW × 0.432 kgCO2eq/kWh ≈ 1.56 kgCO2eq), of which 0 percent was directly offset. Estimates were made using the [Machine Learning Emissions Calculator](https://mlco2.github.io/impact#compute).
## Citation
**BibTeX:**
```
@misc{westenfelder2025llmsupportednaturallanguagebash,
title={LLM-Supported Natural Language to Bash Translation},
author={Finnian Westenfelder and Erik Hemberg and Miguel Tulla and Stephen Moskal and Una-May O'Reilly and Silviu Chiricescu},
year={2025},
eprint={2502.06858},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.06858},
}
```
## Model Card Authors
Finn Westenfelder
## Model Card Contact
Please email finnw@mit.edu or make a pull request.