---
library_name: ggml
language:
- ru
- en
pipeline_tag: text-generation
license: apache-2.0
license_name: apache-2.0
license_link: https://huggingface.co/MTSAIR/Kodify-Nano-GGUF/blob/main/Apache%20License%20MTS%20AI.docx
---

# Kodify-Nano-GGUF πŸ€–

Kodify-Nano-GGUF is the GGUF version of [MTSAIR/Kodify-Nano](https://huggingface.co/MTSAIR/Kodify-Nano), optimized for CPU/GPU inference with Ollama/llama.cpp. It is a lightweight LLM for code development tasks with minimal resource requirements.

## Running the Model

You can run Kodify Nano on Ollama in two ways:

1. **Using Docker**
2. **Locally** (provides faster responses than Docker)

### Method 1: Running Kodify Nano on Ollama in Docker

#### Without NVIDIA GPU:

```bash
docker run -e OLLAMA_HOST=0.0.0.0:8985 -p 8985:8985 --name ollama -d ollama/ollama
```

#### With NVIDIA GPU:

```bash
docker run --runtime nvidia -e OLLAMA_HOST=0.0.0.0:8985 -p 8985:8985 --name ollama -d ollama/ollama
```

> **Important:**
> - Ensure Docker is installed and running.
> - If port 8985 is occupied, replace it with any available port and update the plugin configuration.

#### Load the model:

```bash
docker exec ollama ollama pull hf.co/MTSAIR/Kodify-Nano-GGUF
```

#### Rename the model:

```bash
docker exec ollama ollama cp hf.co/MTSAIR/Kodify-Nano-GGUF kodify_nano
```

#### Start the model:

```bash
docker exec ollama ollama run kodify_nano
```

---

### Method 2: Running Kodify Nano on Ollama Locally

1. **Download Ollama:** https://ollama.com/download
2. **Set the port:**
   ```bash
   export OLLAMA_HOST=0.0.0.0:8985
   ```
   > **Note:** If port 8985 is occupied, replace it and update the plugin configuration.
3. **Start the Ollama server:**
   ```bash
   ollama serve &
   ```
4. **Download the model:**
   ```bash
   ollama pull hf.co/MTSAIR/Kodify-Nano-GGUF
   ```
5. **Rename the model:**
   ```bash
   ollama cp hf.co/MTSAIR/Kodify-Nano-GGUF kodify_nano
   ```
6. **Run the model:**
   ```bash
   ollama run kodify_nano
   ```

## Plugin Installation

### For Visual Studio Code

1. Download the [latest Kodify plugin](https://mts.ai/ru/product/kodify/?utm_source=huggingface&utm_medium=pr&utm_campaign=post#models) for VS Code.
2. Open the **Extensions** panel on the left sidebar.
3. Click **Install from VSIX...** and select the downloaded plugin file.

### For JetBrains IDEs

1. Download the [latest Kodify plugin](https://mts.ai/ru/product/kodify/?utm_source=huggingface&utm_medium=pr&utm_campaign=post#models) for JetBrains.
2. Open the IDE and go to **Settings > Plugins**.
3. Click the gear icon (βš™οΈ) and select **Install Plugin from Disk...**.
4. Choose the downloaded plugin file.
5. Restart the IDE when prompted.

---

### Changing the Port in Plugin Settings (for Visual Studio Code and JetBrains)

If you changed the Docker port from `8985`, update the plugin's `config.json` (see the example after these steps):

1. Open any file in the IDE.
2. Open the Kodify sidebar:
   - **VS Code**: `Ctrl+L` (`Cmd+L` on Mac).
   - **JetBrains**: `Ctrl+J` (`Cmd+J` on Mac).
3. Access the `config.json` file:
   - **Method 1**: Click **Open Settings** (VS Code) or **Kodify Config** (JetBrains), then navigate to **Configuration > Chat Settings > Open Config File**.
   - **Method 2**: Click the gear icon (βš™οΈ) in the Kodify sidebar.
4. Modify the `apiBase` port under `tabAutocompleteModel` and `models`.
5. Save the file (`Ctrl+S` or **File > Save**).
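For reference, the relevant part of `config.json` typically looks like the sketch below. This is only an illustration: apart from `models`, `tabAutocompleteModel`, and `apiBase` (named in step 4 above), the field names and values shown are assumptions, and your file may contain additional settings that should be left untouched.

```jsonc
{
  "models": [
    {
      "title": "Kodify Nano",              // illustrative; keep the value already in your file
      "provider": "ollama",                // illustrative; keep the value already in your file
      "model": "kodify_nano",
      "apiBase": "http://localhost:8985"   // replace 8985 with the port you chose
    }
  ],
  "tabAutocompleteModel": {
    "title": "Kodify Nano",
    "provider": "ollama",
    "model": "kodify_nano",
    "apiBase": "http://localhost:8985"     // replace 8985 here as well
  }
}
```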
---

## Available Quantization Variants

- `Kodify_Nano_q4_k_s.gguf` (balanced)
- `Kodify_Nano_q8_0.gguf` (high quality)
- `Kodify_Nano.gguf` (best quality, unquantized)

Download a variant using `huggingface_hub`:

```bash
pip install huggingface-hub
python -c "from huggingface_hub import hf_hub_download; hf_hub_download(repo_id='MTSAIR/Kodify-Nano-GGUF', filename='Kodify_Nano_q4_k_s.gguf', local_dir='./models')"
```

## Python Integration

Install the Ollama Python library:

```bash
pip install ollama
```

Example code:

```python
import ollama

response = ollama.generate(
    model="kodify_nano",
    prompt="Write a Python function to calculate factorial",
    options={
        "temperature": 0.4,
        "top_p": 0.8,
        "num_ctx": 8192
    }
)

print(response['response'])
```

## Usage Examples

### Code Generation

```python
response = ollama.generate(
    model="kodify_nano",
    prompt="""[INST] Write a Python function that:
1. Accepts a list of numbers
2. Returns the median value
[/INST]""",
    options={"num_predict": 512}
)
```

### Code Refactoring

```python
response = ollama.generate(
    model="kodify_nano",
    prompt="""[INST] Refactor this Python code:
def calc(a,b):
    s = a + b
    d = a - b
    p = a * b
    return s, d, p
[/INST]""",
    options={"temperature": 0.3}
)
```
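The examples above rely on the Ollama Python library's default connection settings. If you started the server on port 8985 as in the setup earlier in this card, point the client at that address explicitly. This is a minimal sketch; the host URL is an assumption based on the port used above and should match whatever port you actually chose.

```python
from ollama import Client

# The server in this guide listens on port 8985 (assumed here); adjust if you picked another port.
client = Client(host="http://localhost:8985")

response = client.generate(
    model="kodify_nano",
    prompt="Write a Python function to reverse a string",
    options={"temperature": 0.4, "num_ctx": 8192}
)
print(response['response'])
```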