---
library_name: mlc-llm
base_model: nvidia/OpenCodeReasoning-Nemotron-1.1-14B
tags:
- mlc-llm
- web-llm
---
# OpenCodeReasoning-Nemotron-1.1-14B-q0f16-MLC

This is the [OpenCodeReasoning-Nemotron-1.1-14B](https://huggingface.co/nvidia/OpenCodeReasoning-Nemotron-1.1-14B) model in MLC format with `q0f16` quantization. The model can be used with [MLC-LLM](https://github.com/mlc-ai/mlc-llm) and [WebLLM](https://github.com/mlc-ai/web-llm).

## Example Usage

Before running the examples, please follow the [installation guide](https://llm.mlc.ai/docs/install/mlc_llm.html#install-mlc-packages).

### Chat CLI

```bash
mlc_llm chat HF://JackBinary/OpenCodeReasoning-Nemotron-1.1-14B-q0f16-MLC
```
### REST Server

```bash
mlc_llm serve HF://JackBinary/OpenCodeReasoning-Nemotron-1.1-14B-q0f16-MLC
```
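Once the server is running, it exposes an OpenAI-compatible chat completions endpoint. A minimal request sketch, assuming the server's default host `127.0.0.1` and port `8000` (adjust to match your `mlc_llm serve` configuration):

```shell
# Query the local MLC-LLM REST server via its OpenAI-compatible API.
# Assumes the server above is already running on 127.0.0.1:8000.
curl -X POST http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "HF://JackBinary/OpenCodeReasoning-Nemotron-1.1-14B-q0f16-MLC",
    "messages": [{"role": "user", "content": "Write a Python function that reverses a string."}],
    "stream": false
  }'
```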
### Python API

```python
from mlc_llm import MLCEngine

# Create an engine from the MLC-format model on Hugging Face.
model = "HF://JackBinary/OpenCodeReasoning-Nemotron-1.1-14B-q0f16-MLC"
engine = MLCEngine(model)

# Stream chat completion deltas and print them as they arrive.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print("\n")

# Shut down the engine and release its resources.
engine.terminate()
```
## Documentation

For more on MLC LLM, visit the [documentation](https://llm.mlc.ai/docs/) and the [GitHub repo](https://github.com/mlc-ai/mlc-llm).