---
title: 'Vanishing Voices: Language Atlas'
emoji: ๐
colorFrom: indigo
colorTo: blue
sdk: streamlit
sdk_version: 1.44.1
app_file: rag_hf.py
pinned: false
---
# Vanishing Voices: South America's Endangered Language Atlas
This app explores three retrieval-augmented generation (RAG) methods to support the documentation of South America's endangered indigenous languages:
- **Standard Search**: Based on Wikipedia/Wikidata embeddings only.
- **Hybrid Search**: Combines embeddings with RDF cultural knowledge.
- **GraphSAGE Search**: Includes structural information from a graph neural network.
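
As a rough illustration of the embedding-only lookup behind a mode like Standard Search, the sketch below ranks documents by cosine similarity with NumPy. It is not taken from `rag_hf.py`: the function name `top_k_cosine` and the toy vectors are assumptions, and the Hybrid and GraphSAGE modes would add graph information that is not shown here.

```python
import numpy as np


def top_k_cosine(query_vec: np.ndarray, doc_matrix: np.ndarray, k: int = 3) -> list[tuple[int, float]]:
    """Return (row index, cosine similarity) for the k most similar rows."""
    # Normalise both sides so a dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    docs = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = docs @ q
    # argsort is ascending, so reverse it and keep the first k indices.
    best = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in best]


# Toy 2-D vectors standing in for a real `.npy` embedding matrix (hypothetical data).
toy_docs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
toy_query = np.array([1.0, 0.1])
results = top_k_cosine(toy_query, toy_docs, k=2)
```

In a real run, `doc_matrix` would presumably come from `np.load` on the uploaded `.npy` file and `query_vec` from the same SentenceTransformers model that produced it.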
## Powered by
- 🤗 [Hugging Face Inference Endpoints](https://huggingface.co/inference-endpoints)
- 🧱 SentenceTransformers for multilingual embeddings
- 🧮 NetworkX + RDFLib for cultural graphs
- Glottolog, Wikidata, Wikipedia
## Features
- RAG with local NumPy embeddings
- RDF triple inspection
- Comparison of methods in terms of relevance and hallucination
- Custom prompt sent to a Hugging Face Inference Endpoint
> Note: This app requires your own HF API token in `.streamlit/secrets.toml`.
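
The custom prompt and token usage can be pictured with the following minimal sketch. The prompt wording and the helper names `build_prompt` and `build_request` are hypothetical, and the `inputs`/`parameters` payload shape is an assumption based on what text-generation endpoints commonly accept; the app's actual prompt may differ.

```python
def build_prompt(question: str, passages: list[str]) -> str:
    """Join retrieved passages and the user question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def build_request(question: str, passages: list[str], api_token: str,
                  max_new_tokens: int = 256) -> tuple[dict, dict]:
    """Return (headers, JSON payload) for a POST to the endpoint."""
    headers = {
        "Authorization": f"Bearer {api_token}",
        "Content-Type": "application/json",
    }
    payload = {
        "inputs": build_prompt(question, passages),
        # Assumed generation parameter; adjust to what the endpoint supports.
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return headers, payload


# Placeholder inputs; no network call is made here.
headers, payload = build_request(
    "Placeholder question about a language?",
    ["Placeholder passage one.", "Placeholder passage two."],
    api_token="hf_placeholder",
)
```

The result could then be sent with something like `requests.post(endpoint_url, headers=headers, json=payload)`, where `endpoint_url` is the value of `HF_ENDPOINT`.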
## Instructions
1. Upload your own `.ttl`, `.pkl`, and `.npy` files for the graph and embeddings.
2. Set up `HF_ENDPOINT` and `HF_API_TOKEN` in `.streamlit/secrets.toml`.
3. Deploy via Streamlit or Hugging Face Spaces.
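
Before deploying, it may help to check that both secrets from step 2 are present. The sketch below assumes the two keys sit at the top level of `.streamlit/secrets.toml`; the helper `missing_secrets` is hypothetical and a plain dict stands in for `st.secrets`.

```python
from collections.abc import Mapping

# Assumed layout of .streamlit/secrets.toml (values are placeholders):
#   HF_ENDPOINT = "<your endpoint URL>"
#   HF_API_TOKEN = "<your token>"
REQUIRED_SECRETS = ("HF_ENDPOINT", "HF_API_TOKEN")


def missing_secrets(secrets: Mapping) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_SECRETS if not secrets.get(key)]


# In the app this would be called with `st.secrets`; a dict stands in here.
example = {"HF_ENDPOINT": "https://example.invalid/endpoint", "HF_API_TOKEN": ""}
problems = missing_secrets(example)
```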
---