---
license: apache-2.0
language:
- en
- zh
tags:
- vlm
- XinYuan
- multimodal
- Qwen2-VL
pipeline_tag: image-text-to-text
library_name: transformers
---
<div align="center"><img src="https://cdn-uploads.huggingface.co/production/uploads/6299c90ef1f2a097fcaa1293/XEfp5nnJOixkGAOyF8UtN.png"/></div>

## How to use it

1. Get the code:
```
git clone https://github.com/HimariO/llama.cpp.git
cd llama.cpp
git switch qwen2-vl
```

2. Edit the `Makefile` (e.g. with `nano Makefile`) to add a `llama-qwen2vl-cli` build target (an optional make invocation for this target follows the patch):
```
diff --git a/Makefile b/Makefile
index 8a903d7e..51403be2 100644
--- a/Makefile
+++ b/Makefile
@@ -1485,6 +1485,14 @@ libllava.a: examples/llava/llava.cpp \
 	$(OBJ_ALL)
 	$(CXX) $(CXXFLAGS) -static -fPIC -c $< -o $@ -Wno-cast-qual
 
+llama-qwen2vl-cli: examples/llava/qwen2vl-cli.cpp \
+	examples/llava/llava.cpp \
+	examples/llava/llava.h \
+	examples/llava/clip.cpp \
+	examples/llava/clip.h \
+	$(OBJ_ALL)
+	$(CXX) $(CXXFLAGS) $< $(filter-out %.h $<,$^) -o $@ $(LDFLAGS) -Wno-cast-qual
+
 llama-llava-cli: examples/llava/llava-cli.cpp \
 	examples/llava/llava.cpp \
 	examples/llava/llava.h \
```
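If you build with GNU make, the patch above lets you compile just the new tool; a minimal sketch (plain CPU build by default, and the CMake route in the next step is an independent alternative):

```
make -j llama-qwen2vl-cli
```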
3. Build (CUDA):
```
# Adjust CMAKE_CUDA_ARCHITECTURES to match your GPU's compute capability (61 = Pascal).
cmake . -DGGML_CUDA=ON -DCMAKE_CUDA_COMPILER=$(which nvcc) -DCMAKE_CUDA_ARCHITECTURES=61
make -j35
```
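Before running, fetch the GGUF weights and the vision projector into the paths used in the next step. A minimal sketch with `huggingface-cli` (assuming `huggingface_hub` is installed and the files live in the `Cylingo/XinYuan-VL-2B-GGUF` repository that those paths point at):

```
huggingface-cli download Cylingo/XinYuan-VL-2B-GGUF --local-dir ./Cylingo/XinYuan-VL-2B-GGUF
```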
4. Run inference:
```
./bin/llama-qwen2vl-cli \
  -m ./Cylingo/XinYuan-VL-2B-GGUF/XinYuan-VL-2B-GGUF-Q4_K_M.gguf \
  --mmproj ./Cylingo/XinYuan-VL-2B-GGUF/qwen2vl-vision.gguf \
  -p "Describe the image" \
  --image "./Cylingo/XinYuan-VL-2B-GGUF/1.png"
```
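To caption several images with the same prompt, the CLI can be driven from a small shell loop; a minimal sketch, assuming a hypothetical `./images` directory of PNG files and the model paths from step 4:

```
# Loop over every PNG in ./images (hypothetical directory) and describe each one.
for img in ./images/*.png; do
  echo "== ${img} =="
  ./bin/llama-qwen2vl-cli \
    -m ./Cylingo/XinYuan-VL-2B-GGUF/XinYuan-VL-2B-GGUF-Q4_K_M.gguf \
    --mmproj ./Cylingo/XinYuan-VL-2B-GGUF/qwen2vl-vision.gguf \
    -p "Describe the image" \
    --image "${img}"
done
```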