---
library_name: ggml
language:
- ru
- en
pipeline_tag: text-generation
license: apache-2.0
license_name: apache-2.0
license_link: https://huggingface.co/MTSAIR/Kodify-Nano-GGUF/blob/main/Apache%20License%20MTS%20AI.docx
---

# Kodify-Nano-GGUF πŸ€–

Kodify-Nano-GGUF is the GGUF version of [MTSAIR/Kodify-Nano](https://huggingface.co/MTSAIR/Kodify-Nano), optimized for CPU/GPU inference with Ollama/llama.cpp. It is a lightweight LLM for code development tasks with minimal resource requirements.

## Running the Model

You can run Kodify Nano on Ollama in two ways:

1. **Using Docker**
2. **Locally** (provides faster responses than Docker)

### Method 1: Running Kodify Nano on Ollama in Docker

#### Without NVIDIA GPU:

```bash
docker run -e OLLAMA_HOST=0.0.0.0:8985 -p 8985:8985 --name ollama -d ollama/ollama
```

#### With NVIDIA GPU:

```bash
docker run --runtime nvidia -e OLLAMA_HOST=0.0.0.0:8985 -p 8985:8985 --name ollama -d ollama/ollama
```

> **Important:**
> - Ensure Docker is installed and running.
> - If port 8985 is occupied, replace it with any available port and update the plugin configuration.

#### Load the model:

```bash
docker exec ollama ollama pull hf.co/MTSAIR/Kodify-Nano-GGUF
```

#### Rename the model:

```bash
docker exec ollama ollama cp hf.co/MTSAIR/Kodify-Nano-GGUF kodify_nano
```

#### Start the model:

```bash
docker exec ollama ollama run kodify_nano
```

---

### Method 2: Running Kodify Nano on Ollama Locally

1. **Download Ollama:** https://ollama.com/download
2. **Set the port:**
   ```bash
   export OLLAMA_HOST=0.0.0.0:8985
   ```
   > **Note:** If port 8985 is occupied, replace it and update the plugin configuration.
3. **Start the Ollama server:**
   ```bash
   ollama serve &
   ```
4. **Download the model:**
   ```bash
   ollama pull hf.co/MTSAIR/Kodify-Nano-GGUF
   ```
5. **Rename the model:**
   ```bash
   ollama cp hf.co/MTSAIR/Kodify-Nano-GGUF kodify_nano
   ```
6. **Run the model:**
   ```bash
   ollama run kodify_nano
   ```

## Plugin Installation

### For Visual Studio Code

1. Download the [latest Kodify plugin](https://mts.ai/ru/product/kodify/?utm_source=huggingface&utm_medium=pr&utm_campaign=post#models) for VS Code.
2. Open the **Extensions** panel on the left sidebar.
3. Click **Install from VSIX...** and select the downloaded plugin file.

### For JetBrains IDEs

1. Download the [latest Kodify plugin](https://mts.ai/ru/product/kodify/?utm_source=huggingface&utm_medium=pr&utm_campaign=post#models) for JetBrains.
2. Open the IDE and go to **Settings > Plugins**.
3. Click the gear icon (βš™οΈ) and select **Install Plugin from Disk...**.
4. Choose the downloaded plugin file.
5. Restart the IDE when prompted.

---

### Changing the Port in Plugin Settings (for Visual Studio Code and JetBrains)

If you changed the Docker port from `8985`, update the plugin's `config.json` (see the example after these steps):

1. Open any file in the IDE.
2. Open the Kodify sidebar:
   - **VS Code**: `Ctrl+L` (`Cmd+L` on Mac).
   - **JetBrains**: `Ctrl+J` (`Cmd+J` on Mac).
3. Access the `config.json` file:
   - **Method 1**: Click **Open Settings** (VS Code) or **Kodify Config** (JetBrains), then navigate to **Configuration > Chat Settings > Open Config File**.
   - **Method 2**: Click the gear icon (βš™οΈ) in the Kodify sidebar.
4. Modify the `apiBase` port under `tabAutocompleteModel` and `models`.
5. Save the file (`Ctrl+S` or **File > Save**).
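For reference, the relevant part of `config.json` typically looks like the sketch below. This is only an illustration: apart from `models`, `tabAutocompleteModel`, and `apiBase` (named in step 4 above), the field names and values shown are assumptions, and your file may contain additional settings that should be left untouched.

```jsonc
{
  "models": [
    {
      "title": "Kodify Nano",              // illustrative; keep the value already in your file
      "provider": "ollama",                // illustrative; keep the value already in your file
      "model": "kodify_nano",
      "apiBase": "http://localhost:8985"   // replace 8985 with the port you chose
    }
  ],
  "tabAutocompleteModel": {
    "title": "Kodify Nano",
    "provider": "ollama",
    "model": "kodify_nano",
    "apiBase": "http://localhost:8985"     // replace 8985 here as well
  }
}
```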
---

## Available Quantization Variants

- `Kodify_Nano_q4_k_s.gguf` (balanced)
- `Kodify_Nano_q8_0.gguf` (high quality)
- `Kodify_Nano.gguf` (best quality, unquantized)

Download a variant using `huggingface_hub`:

```bash
pip install huggingface-hub
python -c "from huggingface_hub import hf_hub_download; hf_hub_download(repo_id='MTSAIR/Kodify-Nano-GGUF', filename='Kodify_Nano_q4_k_s.gguf', local_dir='./models')"
```

## Python Integration

Install the Ollama Python library:

```bash
pip install ollama
```

Example code:

```python
import ollama

response = ollama.generate(
    model="kodify_nano",
    prompt="Write a Python function to calculate factorial",
    options={
        "temperature": 0.4,
        "top_p": 0.8,
        "num_ctx": 8192
    }
)

print(response['response'])
```

## Usage Examples

### Code Generation

```python
response = ollama.generate(
    model="kodify_nano",
    prompt="""[INST] Write a Python function that:
1. Accepts a list of numbers
2. Returns the median value
[/INST]""",
    options={"num_predict": 512}
)
```

### Code Refactoring

```python
response = ollama.generate(
    model="kodify_nano",
    prompt="""[INST] Refactor this Python code:
def calc(a,b):
    s = a + b
    d = a - b
    p = a * b
    return s, d, p
[/INST]""",
    options={"temperature": 0.3}
)
```
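The examples above rely on the Ollama Python library's default connection settings. If you started the server on port 8985 as in the setup earlier in this card, point the client at that address explicitly. This is a minimal sketch; the host URL is an assumption based on the port used above and should match whatever port you actually chose.

```python
from ollama import Client

# The server in this guide listens on port 8985 (assumed here); adjust if you picked another port.
client = Client(host="http://localhost:8985")

response = client.generate(
    model="kodify_nano",
    prompt="Write a Python function to reverse a string",
    options={"temperature": 0.4, "num_ctx": 8192}
)
print(response['response'])
```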