metadata
title: 'Vanishing Voices: Language Atlas'
emoji: 🌍
colorFrom: indigo
colorTo: blue
sdk: streamlit
sdk_version: 1.29.0
app_file: rag_hf.py
pinned: false

Vanishing Voices: South America's Endangered Language Atlas 🌍

This app explores three retrieval-augmented generation (RAG) methods to support the documentation of South America's endangered indigenous languages:

  • Standard Search: Based on Wikipedia/Wikidata embeddings only.
  • Hybrid Search: Combines embeddings with RDF cultural knowledge.
  • GraphSAGE Search: Includes structural information from a graph neural network.
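As a rough illustration of the embedding-only "Standard Search" idea, the sketch below ranks stored vectors by cosine similarity to a query vector. It is not taken from the app: the function name `top_k_cosine` and the toy 3-dimensional vectors are hypothetical, and the real app may score or combine results differently. The hybrid and GraphSAGE variants would presumably add further signals on top of a step like this.

```python
import numpy as np


def top_k_cosine(query_vec, doc_matrix, k=3):
    """Return (indices, scores) of the k rows most similar to query_vec."""
    # Normalize both sides so a dot product equals cosine similarity.
    # Assumes no all-zero vectors, which would cause division by zero.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = d @ q
    # Sort by descending similarity and keep the first k positions.
    order = np.argsort(-scores)[:k]
    return order, scores[order]


# Toy stand-ins for document embeddings (hypothetical values).
doc_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])
query = np.array([1.0, 0.1, 0.0])

indices, scores = top_k_cosine(query, doc_embeddings, k=2)
print(indices, scores)
```

In practice the matrix would come from a saved `.npy` file (for example via `np.load`), and the query vector from the same embedding model used to build it.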

🧠 Powered by

  • 🤗 Hugging Face Inference Endpoints
  • 🧱 SentenceTransformers for multilingual embeddings
  • 🧮 NetworkX + RDFLib for cultural graphs
  • 🔗 Glottolog, Wikidata, Wikipedia

📊 Features

  • RAG with local numpy embeddings
  • RDF triple inspection
  • Comparison of methods in terms of relevance and hallucination
  • Custom prompt injected into a Hugging Face endpoint
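The prompt-injection feature could look roughly like the sketch below, which builds a prompt from retrieved passages and prepares (but does not send) a POST request. The helper names, the prompt wording, the placeholder URL, and the token are all hypothetical. The payload shape (`inputs` plus `parameters`) assumes a text-generation endpoint; other task types expect different bodies.

```python
import json
import urllib.request


def build_prompt(question, passages):
    """Join retrieved passages into a context block ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


def build_request(endpoint_url, api_token, prompt, max_new_tokens=256):
    """Prepare a POST request for a text-generation endpoint (assumed format)."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder inputs; real values would come from retrieval and secrets.
prompt = build_prompt("Example question?", ["Placeholder passage one."])
request = build_request("https://example.invalid/endpoint", "hf_placeholder", prompt)
# Sending is left out on purpose: urllib.request.urlopen(request) would make the call.
```

Keeping prompt construction separate from the HTTP step makes it easier to compare the three retrieval methods with an identical prompt template.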

Note: This app requires your own Hugging Face API token, stored in .streamlit/secrets.toml.

📄 Instructions

  1. Upload your own .ttl, .pkl, and .npy files for the graph and embeddings.
  2. Set up HF_ENDPOINT and HF_API_TOKEN in .streamlit/secrets.toml.
  3. Deploy via Streamlit or Hugging Face Spaces.
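For step 2, a small check like the following can catch missing secrets early. It is only a sketch: the function `read_hf_settings` is hypothetical, and the placeholder values are not real credentials. It accepts any mapping, so a plain dict works for testing, and inside a Streamlit app `st.secrets` could be passed instead, assuming it behaves like a read-only mapping.

```python
from collections.abc import Mapping

# Expected layout of .streamlit/secrets.toml (placeholder values):
#   HF_ENDPOINT = "https://example.invalid/endpoint"
#   HF_API_TOKEN = "hf_placeholder"
REQUIRED_KEYS = ("HF_ENDPOINT", "HF_API_TOKEN")


def read_hf_settings(secrets: Mapping) -> dict:
    """Return the required settings, or raise KeyError naming what is missing."""
    missing = [k for k in REQUIRED_KEYS if k not in secrets or not secrets[k]]
    if missing:
        raise KeyError(f"Missing secrets: {', '.join(missing)}")
    return {k: secrets[k] for k in REQUIRED_KEYS}


settings = read_hf_settings({
    "HF_ENDPOINT": "https://example.invalid/endpoint",
    "HF_API_TOKEN": "hf_placeholder",
})
print(sorted(settings))
```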