---
title: Sentence Transformers
emoji: 🏢
colorFrom: green
colorTo: gray
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: false
---

# Sentence Transformers Demo

Interactive web application for semantic text similarity analysis using Sentence Transformers models.

## Features

### 1. Paraphrase Mining

- Find sentences with similar meaning in a text corpus
- Support for multiple language models
- Adjustable similarity threshold
- Export results in CSV format

### 2. Semantic Textual Similarity (STS)

- Calculate semantic similarity between two sets of sentences
- Uses Sentence Transformers embedding models
- Compare sentences in different languages
- Export results in CSV format

## Available Models

- [`Lajavaness/bilingual-embedding-large`](https://huggingface.co/Lajavaness/bilingual-embedding-large): Multilingual model optimized for multiple languages
- [`sentence-transformers/all-mpnet-base-v2`](https://huggingface.co/sentence-transformers/all-mpnet-base-v2): High-quality general-purpose model
- [`intfloat/multilingual-e5-large-instruct`](https://huggingface.co/intfloat/multilingual-e5-large-instruct): Multilingual model with instructions

## Requirements

- Python 3.8+
- Dependencies listed in `requirements.txt`

## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/sentence-transformers.git
   cd sentence-transformers
   ```

2. Create and activate a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # Linux/Mac
   # or
   .\venv\Scripts\activate  # Windows
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

## Usage

1. Start the application:

   ```bash
   python app.py
   ```

2. Open your browser at `http://localhost:7860`
3. Select the desired functionality:
   - Paraphrase Mining: Upload a CSV file with sentences to analyze
   - STS: Upload two CSV files with sentences to compare
4. Select the model and adjust the similarity threshold
5. Click "Process" to start the analysis
6. Download results in CSV format

## CSV File Format

CSV files must contain a column named "text" with the sentences to analyze:

```csv
text
"First sentence to analyze"
"Second sentence to analyze"
...
```

## Notes

- Temporary files are automatically cleaned up every 30 minutes
- Using complete sentences is recommended for better results
- Models may take time to load on first use

## License

MIT

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
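
## Example: Checking an Input CSV

The "CSV File Format" section requires a column named "text". The app's own loading code is not shown in this README, so the following is only a minimal sketch of how such a file could be read and validated with Python's standard `csv` module before uploading it. The helper name `load_sentences` is hypothetical and is not part of `app.py`.

```python
import csv
import io


def load_sentences(csv_file):
    """Return the non-empty values of the "text" column from an open CSV file.

    Hypothetical helper for illustration; not taken from app.py.
    """
    reader = csv.DictReader(csv_file)
    # fieldnames is None for an empty file, so check that case as well.
    if reader.fieldnames is None or "text" not in reader.fieldnames:
        raise ValueError('CSV must contain a column named "text"')

    sentences = []
    for row in reader:
        # Short rows yield None for missing columns; treat them as empty.
        value = (row["text"] or "").strip()
        if value:
            sentences.append(value)
    return sentences


# In-memory stand-in for an uploaded file, using the format shown above.
sample = io.StringIO(
    'text\n"First sentence to analyze"\n"Second sentence to analyze"\n'
)
sentences = load_sentences(sample)
print(sentences)
```

To check a file on disk, open it with `open(path, newline="", encoding="utf-8")` and pass the file object to `load_sentences`. A file without a "text" header raises `ValueError` rather than silently returning an empty list.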