Polushinm committed
Commit 4c5690c · verified · 1 Parent(s): 494a20a

Update README.md

Files changed (1)
  1. README.md +116 -23
README.md CHANGED
@@ -11,42 +11,137 @@ license_link: https://huggingface.co/MTSAIR/Kodify-Nano-GGUF/blob/main/Apache%20

# Kodify-Nano-GGUF 🤖

- Kodify-Nano-GGUF is a 4-bit quantized GGUF version of [MTSAIR/Kodify-Nano](https://huggingface.co/MTSAIR/Kodify-Nano), optimized for CPU inference. A lightweight LLM for code development tasks with minimal resource requirements.

- ## Download Models

- Available quantization variants:
- - Kodify_Nano_q4_k_s.gguf (balanced)
- - Kodify_Nano_q8_0.gguf (high quality)
- - Kodify_Nano.gguf (best quality, unquantized)

- Download using huggingface_hub:

  ```bash
- pip install huggingface-hub
- python -c "from huggingface_hub import hf_hub_download; hf_hub_download(repo_id='MTSAIR/Kodify-Nano-GGUF', filename='Kodify_Nano_q4_k_s.gguf', local_dir='./models')"
  ```

- ## Using with Ollama

- 1. Install Ollama:
  https://ollama.com/download

- 2. Create Modelfile:

  ```
- FROM ./models/Kodify_Nano_q4_k_s.gguf
- PARAMETER temperature 0.4
- PARAMETER top_p 0.8
- PARAMETER num_ctx 8192
- TEMPLATE """<s>[INST] {{ .System }} {{ .Prompt }} [/INST]"""
  ```

- 3. Create and run model:
- ollama create kodify-nano -f Modelfile
- ollama run kodify-nano "Write a Python function to check prime numbers"

  ## Python Integration

@@ -76,8 +171,6 @@ print(response['response'])

  ## Usage Examples

- ### Code Generation
-
  ```python
response = ollama.generate(
    model="kodify-nano",

@@ -11,42 +11,137 @@

  # Kodify-Nano-GGUF 🤖

+ Kodify-Nano-GGUF is a GGUF version of [MTSAIR/Kodify-Nano](https://huggingface.co/MTSAIR/Kodify-Nano), optimized for CPU/GPU inference with Ollama/llama.cpp. A lightweight LLM for code development tasks with minimal resource requirements.

+ ## Running the Model
+ You can run Kodify Nano on Ollama in two ways:
+
+ 1. **In Docker**
+ 2. **Locally** (typically gives faster responses than Docker)
+
+ ### Method 1: Running Kodify Nano on Ollama in Docker
+
+ #### Without an NVIDIA GPU:

 ```bash
+ docker run -e OLLAMA_HOST=0.0.0.0:8985 -p 8985:8985 --name ollama -d ollama/ollama
+ ```
+
+ #### With an NVIDIA GPU:
+
+ ```bash
+ docker run --runtime nvidia -e OLLAMA_HOST=0.0.0.0:8985 -p 8985:8985 --name ollama -d ollama/ollama
+ ```
+
+ > **Important:**
+ > - Ensure Docker is installed and running
+ > - If port 8985 is occupied, replace it with any available port and update the plugin configuration
+
+ #### Load the model:
+
+ ```bash
+ docker exec ollama ollama pull hf.co/MTSAIR/Kodify-Nano-GGUF
  ```

+ #### Rename the model:
+ ```bash
+ docker exec ollama ollama cp hf.co/MTSAIR/Kodify-Nano-GGUF kodify_nano
+ ```
+
+ #### Start the model:

+ ```bash
+ docker exec ollama ollama run kodify_nano
+ ```
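+
+ Once the model is running, a quick way to confirm the endpoint responds is to call Ollama's standard `/api/generate` API. A minimal sketch, assuming the port mapping `8985` from the commands above; the prompt text is only an example:
+
+ ```bash
+ # One-shot, non-streaming generation request against the container
+ curl http://localhost:8985/api/generate -d '{
+   "model": "kodify_nano",
+   "prompt": "Write a Python function to check prime numbers",
+   "stream": false
+ }'
+ ```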
+ ---
+
+ ### Method 2: Running Kodify Nano on Ollama Locally
+
+ 1. **Download Ollama:**
   https://ollama.com/download

+ 2. **Set the port:**

+ ```bash
+ export OLLAMA_HOST=0.0.0.0:8985
 ```
+
+ > **Note:** If port 8985 is occupied, replace it and update the plugin configuration
+
+ 3. **Start the Ollama server:**
+
+ ```bash
+ ollama serve &
+ ```
+
+ 4. **Download the model:**
+
+ ```bash
+ ollama pull hf.co/MTSAIR/Kodify-Nano-GGUF
+ ```
+
+ 5. **Rename the model:**
+
+ ```bash
+ ollama cp hf.co/MTSAIR/Kodify-Nano-GGUF kodify_nano
+ ```
+
+ 6. **Run the model:**
+
+ ```bash
+ ollama run kodify_nano
+ ```
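+
+ As with the Docker method, you can test generation straight from the shell. Instead of the interactive session that `ollama run kodify_nano` opens, a one-shot prompt looks like this; the prompt text is only an example:
+
+ ```bash
+ ollama run kodify_nano "Write a Python function to check prime numbers"
+ ```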
+
+ ## Plugin Installation
+
+ ### For Visual Studio Code
+
+ 1. Download the [latest Kodify plugin](https://mts.ai/ru/product/kodify/?utm_source=huggingface&utm_medium=pr&utm_campaign=post#models) for VS Code.
+ 2. Open the **Extensions** panel in the left sidebar.
+ 3. Click **Install from VSIX...** and select the downloaded plugin file.
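+
+ Alternatively, if the `code` command-line tool is on your PATH, the same install can be done from a terminal. A minimal sketch, with a hypothetical filename for the downloaded plugin:
+
+ ```bash
+ # Install the downloaded VSIX without opening the Extensions panel
+ code --install-extension kodify-plugin.vsix
+ ```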
+
+ ### For JetBrains IDEs
+
+ 1. Download the [latest Kodify plugin](https://mts.ai/ru/product/kodify/?utm_source=huggingface&utm_medium=pr&utm_campaign=post#models) for JetBrains.
+ 2. Open the IDE and go to **Settings > Plugins**.
+ 3. Click the gear icon (⚙️) and select **Install Plugin from Disk...**.
+ 4. Choose the downloaded plugin file.
+ 5. Restart the IDE when prompted.
+
+ ---
+
+ ### Changing the Port in Plugin Settings (Visual Studio Code and JetBrains)
+
+ If you changed the port from `8985` in either setup, update the plugin's `config.json`:
+
+ 1. Open any file in the IDE.
+ 2. Open the Kodify sidebar:
+    - **VS Code**: `Ctrl+L` (`Cmd+L` on Mac).
+    - **JetBrains**: `Ctrl+J` (`Cmd+J` on Mac).
+ 3. Access the `config.json` file:
+    - **Method 1**: Click **Open Settings** (VS Code) or **Kodify Config** (JetBrains), then navigate to **Configuration > Chat Settings > Open Config File**.
+    - **Method 2**: Click the gear icon (⚙️) in the Kodify sidebar.
+ 4. Modify the `apiBase` port under `tabAutocompleteModel` and `models`, as in the sketch below.
+ 5. Save the file (`Ctrl+S` or **File > Save**).
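+
+ For orientation, a hypothetical fragment of the relevant `config.json` entries is sketched below; the `title` and `provider` values are illustrative assumptions, and the `apiBase` port is the only part this section asks you to change:
+
+ ```json
+ {
+   "models": [
+     {
+       "title": "Kodify Nano",
+       "provider": "ollama",
+       "model": "kodify_nano",
+       "apiBase": "http://localhost:8985"
+     }
+   ],
+   "tabAutocompleteModel": {
+     "title": "Kodify Nano",
+     "provider": "ollama",
+     "model": "kodify_nano",
+     "apiBase": "http://localhost:8985"
+   }
+ }
+ ```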
+
+ ---
+
+ ## Available quantization variants:
+ - Kodify_Nano_q4_k_s.gguf (balanced)
+ - Kodify_Nano_q8_0.gguf (high quality)
+ - Kodify_Nano.gguf (best quality, unquantized)
+
+ Download using huggingface_hub:
+
+ ```bash
+ pip install huggingface-hub
+ python -c "from huggingface_hub import hf_hub_download; hf_hub_download(repo_id='MTSAIR/Kodify-Nano-GGUF', filename='Kodify_Nano_q4_k_s.gguf', local_dir='./models')"
  ```
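+
+ Since the model card names llama.cpp as an inference option alongside Ollama, a minimal sketch of running the downloaded file directly with llama.cpp follows; it assumes a recent build where the CLI binary is named `llama-cli` (older builds ship it as `main`):
+
+ ```bash
+ # One-shot prompt against the locally downloaded quantized model
+ llama-cli -m ./models/Kodify_Nano_q4_k_s.gguf \
+   -p "Write a Python function to check prime numbers" \
+   -n 256 --temp 0.4
+ ```
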
  ## Python Integration

@@ -76,8 +171,6 @@

 ## Usage Examples

  ```python
response = ollama.generate(
    model="kodify-nano",