--- tags: - image-captioning - deep-learning - pytorch - encoder-decoder - vision --- # 🖼️ Image Captioning Model This is a deep learning-based **image captioning model** trained using a **CNN Encoder + LSTM Decoder** architecture. The model generates captions for input images based on visual features extracted by a Convolutional Neural Network (CNN). ## 📌 Model Details - **Model Type**: Image Captioning - **Architecture**: CNN Encoder + LSTM Decoder - **Framework**: PyTorch - **Input**: Image (`.jpg`, `.png`, etc.) - **Output**: Generated caption (text) - **Vocabulary**: Pre-trained vocabulary file ## 🚀 How to Use ### **1️⃣ Install Dependencies** ```bash pip install torch torchvision transformers huggingface_hub pickle5