javiervzpucp committed on
Commit fbe8caf · verified · 1 Parent(s): 8c421fb

Update README.md

Files changed (1)
  1. README.md +41 -1
README.md CHANGED
@@ -1 +1,41 @@
- # RAG-glottolog
+ ---
+ title: "Vanishing Voices: Language Atlas"
+ emoji: "🌍"
+ colorFrom: "indigo"
+ colorTo: "blue"
+ sdk: "streamlit"
+ sdk_version: "1.29.0"
+ app_file: "rag_hf.py"
+ pinned: false
+ ---
+ # Vanishing Voices: South America's Endangered Language Atlas 🌍
+
+ This app explores three retrieval-augmented generation (RAG) methods to support the documentation of South America's endangered indigenous languages:
+
+ - **Standard Search**: Retrieves using Wikipedia/Wikidata embeddings only.
+ - **Hybrid Search**: Combines embeddings with RDF cultural knowledge.
+ - **GraphSAGE Search**: Adds structural information from a graph neural network.
+
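The embedding-only "Standard Search" described above can be pictured as a nearest-neighbour lookup by cosine similarity. The following is a minimal sketch of that general technique, not the code in `rag_hf.py`: the function names `cosine` and `top_k`, the document ids, and the toy three-dimensional vectors are all hypothetical, and plain Python lists stand in for the `.npy` embeddings the app loads.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    ranked = sorted(
        doc_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]

# Hypothetical toy vectors standing in for real sentence embeddings.
docs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.8, 0.3, 0.1],
    "doc_c": [0.0, 0.2, 0.9],
}
print(top_k([1.0, 0.0, 0.0], docs))  # ['doc_a', 'doc_b']
```

The Hybrid and GraphSAGE variants would presumably re-rank or extend this candidate list with graph-derived signals; how the app combines them is not stated in the README.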
+ ## 🧠 Powered by
+
+ - 🤗 [Hugging Face Inference Endpoints](https://huggingface.co/inference-endpoints)
+ - 🧱 SentenceTransformers for multilingual embeddings
+ - 🧮 NetworkX + RDFLib for cultural graphs
+ - 🔗 Glottolog, Wikidata, Wikipedia
+
+ ## 📊 Features
+
+ - RAG with local NumPy embeddings
+ - RDF triple inspection
+ - Comparison of the three search methods by relevance and hallucination
+ - Custom prompt sent to a Hugging Face endpoint
+
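To illustrate the "custom prompt" feature, here is a rough sketch of how retrieved passages might be folded into a prompt and serialized for a text-generation endpoint. The template wording and the helper names `build_prompt` and `build_payload` are assumptions made for illustration; the `inputs`/`parameters` body follows the common Hugging Face text-generation request shape, and the app's actual endpoint may expect something different.

```python
import json

def build_prompt(question, passages):
    """Assemble a grounded prompt from retrieved passages (hypothetical template)."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def build_payload(prompt, max_new_tokens=256):
    """Serialize the request body; nothing is sent over the network here."""
    body = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return json.dumps(body)

# Placeholder question and passages, not real retrieval output.
prompt = build_prompt("Where is the language spoken?", ["Passage one.", "Passage two."])
payload = build_payload(prompt)
```

Asking the model to admit when the context is insufficient is one simple way to make hallucination easier to spot when comparing the three methods.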
+ > Note: This app requires your own Hugging Face API token in `.streamlit/secrets.toml`.
+
+ ## 📄 Instructions
+
+ 1. Upload your own `.ttl`, `.pkl`, and `.npy` files for the graph and embeddings.
+ 2. Set `HF_ENDPOINT` and `HF_API_TOKEN` in `.streamlit/secrets.toml`.
+ 3. Deploy via Streamlit or Hugging Face Spaces.
+
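For step 2, a small check that both secrets are present can fail early with a clear message instead of at request time. This is a sketch under assumptions: `load_hf_config` and `REQUIRED_KEYS` are hypothetical names, and a plain dict with placeholder values stands in for Streamlit's `st.secrets` mapping, which is populated from `.streamlit/secrets.toml`.

```python
from typing import Mapping, Tuple

REQUIRED_KEYS = ("HF_ENDPOINT", "HF_API_TOKEN")

def load_hf_config(secrets: Mapping[str, str]) -> Tuple[str, str]:
    """Return (endpoint, token), raising KeyError if either is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not secrets.get(key)]
    if missing:
        raise KeyError(f"Missing secrets: {', '.join(missing)}")
    return secrets["HF_ENDPOINT"], secrets["HF_API_TOKEN"]

# In the app this mapping would be `st.secrets`; the values below are placeholders.
endpoint, token = load_hf_config(
    {"HF_ENDPOINT": "https://example.invalid/endpoint", "HF_API_TOKEN": "hf_placeholder"}
)
```

Keeping the token in `.streamlit/secrets.toml` rather than in the source means it stays out of the repository as long as that file is not committed.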
+ ---