metadata
title: 'Vanishing Voices: Language Atlas'
emoji: ๐
colorFrom: indigo
colorTo: blue
sdk: streamlit
sdk_version: 1.29.0
app_file: rag_hf.py
pinned: false
Vanishing Voices: South America's Endangered Language Atlas
This app explores three retrieval-augmented generation (RAG) methods to support the documentation of South America's endangered indigenous languages:
- Standard Search: Based on Wikipedia/Wikidata embeddings only.
- Hybrid Search: Combines embeddings with RDF cultural knowledge.
- GraphSAGE Search: Includes structural information from a graph neural network.
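As a rough illustration of the embedding-only retrieval step behind Standard Search, the sketch below ranks document vectors by cosine similarity to a query vector. The function name `top_k_cosine` and the toy vectors are hypothetical and are not taken from `rag_hf.py`; the app's actual scoring may differ.

```python
import numpy as np

def top_k_cosine(query_vec, doc_matrix, k=3):
    """Return (indices, scores) of the k rows of doc_matrix closest to query_vec."""
    # Normalise so that the dot product equals cosine similarity.
    # (Assumes no zero-length vectors; a real app should guard against that.)
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = d @ q
    # Sort by descending similarity and keep the k best rows.
    order = np.argsort(-scores)[:k]
    return order, scores[order]

# Toy 2-dimensional "document" embeddings (illustrative values only).
docs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query = np.array([1.0, 0.1])
indices, scores = top_k_cosine(query, docs, k=2)
```

The hybrid and GraphSAGE variants would presumably add RDF facts or graph-derived vectors on top of a ranking like this one.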
Powered by
- Hugging Face Inference Endpoints
- SentenceTransformers for multilingual embeddings
- NetworkX + RDFLib for cultural graphs
- Glottolog, Wikidata, Wikipedia
Features
- RAG with local numpy embeddings
- RDF triple inspection
- Comparison of methods in terms of relevance and hallucination
- Custom prompt injected into a Hugging Face endpoint
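One way the custom prompt could be assembled from retrieved passages and RDF triples is sketched below. The function `build_prompt`, the prompt wording, and the `ex:` placeholder triple are assumptions made for illustration; they do not reproduce the app's real template.

```python
def build_prompt(question, passages, triples):
    """Assemble a grounded prompt from retrieved text and RDF-style triples."""
    # One bullet per retrieved passage.
    context = "\n".join(f"- {p}" for p in passages)
    # One bullet per (subject, predicate, object) triple.
    facts = "\n".join(f"- {s} {p} {o}" for s, p, o in triples)
    return (
        "Answer using only the context and facts below. "
        "If they are insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Facts:\n{facts}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Placeholder inputs; real passages and triples would come from retrieval.
prompt = build_prompt(
    "Placeholder question?",
    ["Placeholder passage."],
    [("ex:LanguageA", "ex:spokenIn", "ex:RegionB")],
)
```

Telling the model to admit when the context is insufficient is a common way to reduce hallucination, which is one of the dimensions the app compares.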
Note: This app requires your own HF API token in `.streamlit/secrets.toml`.
Instructions
- Upload your own `.ttl`, `.pkl`, and `.npy` files for graph and embeddings.
- Set up `HF_ENDPOINT` and `HF_API_TOKEN` in `.streamlit/secrets.toml`.
- Deploy via Streamlit or Hugging Face Spaces.
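To show how `HF_ENDPOINT` and `HF_API_TOKEN` might be used once configured, here is a minimal sketch that builds (but does not send) a generation request. The helper `build_generation_request` is hypothetical, and the `inputs`/`parameters` payload assumes the endpoint serves a text-generation model; the URL and token are placeholders.

```python
import json
import urllib.request

def build_generation_request(endpoint, token, prompt, max_new_tokens=256):
    """Build a POST request for a text-generation endpoint (payload shape assumed)."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder values; inside the Streamlit app these would typically be read
# from st.secrets["HF_ENDPOINT"] and st.secrets["HF_API_TOKEN"].
request = build_generation_request(
    "https://example.invalid/generate", "hf_placeholder", "Hello"
)
# Sending is left out on purpose: urllib.request.urlopen(request) would make
# the network call.
```

Keeping the token in `.streamlit/secrets.toml` rather than in the source keeps it out of version control.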