---
title: 'Vanishing Voices: Language Atlas'
emoji: 🌍
colorFrom: indigo
colorTo: blue
sdk: streamlit
sdk_version: 1.44.1
app_file: rag_hf.py
pinned: false
---

# Vanishing Voices: South America's Endangered Language Atlas 🌍

This app explores three retrieval-augmented generation (RAG) methods to support the documentation of South America's endangered indigenous languages:

- **Standard Search**: Based on Wikipedia/Wikidata embeddings only.
- **Hybrid Search**: Combines embeddings with RDF cultural knowledge.
- **GraphSAGE Search**: Includes structural information from a graph neural network.

## 🧠 Powered by

- 🤗 [Hugging Face Inference Endpoints](https://huggingface.co/inference-endpoints)
- 🧱 SentenceTransformers for multilingual embeddings
- 🧮 NetworkX + RDFLib for cultural graphs
- 🔗 Glottolog, Wikidata, Wikipedia

## 📊 Features

- RAG with local numpy embeddings
- RDF triple inspection
- Comparison of methods in terms of relevance and hallucination
- Custom prompt injected into a Hugging Face endpoint

> Note: This app requires your own HF API token in `.streamlit/secrets.toml`.

## 📄 Instructions

1. Upload your own `.ttl`, `.pkl`, `.npy` files for graph and embeddings.
2. Set up `HF_ENDPOINT` and `HF_API_TOKEN` in `.streamlit/secrets.toml`.
3. Deploy via Streamlit or Hugging Face Spaces.

---
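
The "RAG with local numpy embeddings" feature comes down to ranking stored vectors against a query vector. The snippet below is a minimal sketch of that retrieval step using cosine similarity. It is not taken from `rag_hf.py`: the function name `top_k_cosine`, the toy vectors, and the file name `embeddings.npy` are illustrative assumptions, and the app may score or filter results differently.

```python
import numpy as np


def top_k_cosine(query_vec, doc_matrix, k=3):
    """Return (indices, scores) of the k rows of doc_matrix most similar to query_vec."""
    query = np.asarray(query_vec, dtype=np.float64)
    docs = np.asarray(doc_matrix, dtype=np.float64)

    # Normalise both sides so the dot product equals cosine similarity.
    # The small floor avoids division by zero for all-zero vectors.
    query_unit = query / max(np.linalg.norm(query), 1e-12)
    doc_norms = np.linalg.norm(docs, axis=1, keepdims=True)
    docs_unit = docs / np.maximum(doc_norms, 1e-12)

    scores = docs_unit @ query_unit
    k = min(k, len(scores))

    # argsort sorts ascending, so negate the scores to get the best matches first.
    top = np.argsort(-scores)[:k]
    return top.tolist(), scores[top].tolist()


# Toy vectors standing in for an array loaded with np.load("embeddings.npy")
# (hypothetical file name; real embeddings would have hundreds of dimensions).
toy_embeddings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
indices, scores = top_k_cosine([1.0, 0.1], toy_embeddings, k=2)
```

With these toy vectors, the first and third rows are returned, in that order, because they point closest to the query direction. In a real run the query vector would come from the same SentenceTransformers model that produced the stored embeddings.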
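
Step 2 of the instructions and the "custom prompt" feature can be illustrated together. The sketch below builds a prompt from retrieved passages and prepares an authenticated POST request without sending it. It rests on several assumptions: the helper names `build_prompt` and `build_request` are hypothetical, the prompt wording is invented for illustration, and the `{"inputs": ..., "parameters": ...}` payload follows the common text-generation convention, which may not match the model behind your endpoint.

```python
import json
import urllib.request

# Assumed layout of .streamlit/secrets.toml (both values are placeholders):
#   HF_ENDPOINT = "https://<your-endpoint-url>"
#   HF_API_TOKEN = "hf_..."
# Inside the Streamlit app these would be read with st.secrets["HF_ENDPOINT"]
# and st.secrets["HF_API_TOKEN"].


def build_prompt(question, passages):
    """Combine retrieved passages and the user question into one prompt string."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


def build_request(endpoint_url, api_token, prompt, max_new_tokens=256):
    """Prepare (but do not send) a POST request for a text-generation endpoint."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder URL and token; the passage is a toy example, not app data.
prompt = build_prompt("Where is Quechua spoken?", ["Quechua is spoken in the Andes."])
request = build_request("https://example.invalid/endpoint", "hf_placeholder", prompt)
```

Passing `request` to `urllib.request.urlopen` would send it. Keeping request construction separate from sending makes the prompt and headers easy to inspect before any tokens are spent.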