[INST]
You are an expert in South American indigenous languages.
Use strictly and only the information below to answer the user question in **English**.
- Do not infer or assume facts that are not explicitly stated.
- If the answer is unknown or insufficient, say \"I cannot answer with the available data.\"
- Limit your answer to 100 words.
### CONTEXT:
{chr(10).join(context)}
### RDF RELATIONS:
{chr(10).join(rdf_facts)}
### QUESTION:
{user_question}
Answer:
[/INST]"""
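    try:
        res = requests.post(
            ENDPOINT_URL,
            headers={"Authorization": f"Bearer {HF_API_TOKEN}", "Content-Type": "application/json"},
            json={"inputs": prompt},
            timeout=60,
        )
        out = res.json()
        # Expected shape: a non-empty list whose first item carries "generated_text".
        # The model echoes the prompt, so strip it from the answer.
        if isinstance(out, list) and out and "generated_text" in out[0]:
            return out[0]["generated_text"].replace(prompt.strip(), "").strip(), ids, context, rdf_facts
        # Any other payload (e.g. an error message) is returned as text.
        return str(out), ids, context, rdf_facts
    except Exception as e:
        # Network or JSON decoding failures are surfaced as the response text.
        return str(e), ids, context, rdf_facts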
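# A minimal sketch, not part of the original app: the reply handling above could be
# factored into a small helper that is easy to test on its own. The helper name
# `extract_generated_text` is hypothetical, and the payload shape (a list of dicts
# with a "generated_text" key) is an assumption taken from the check above.
def extract_generated_text(out, prompt):
    """Return the answer with the echoed prompt removed, or str(out) as a fallback."""
    if isinstance(out, list) and out and isinstance(out[0], dict):
        text = out[0].get("generated_text")
        if isinstance(text, str):
            return text.replace(prompt.strip(), "").strip()
    # Empty lists, error dicts and any other payload fall through unchanged.
    return str(out)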
# === MAIN APP ===
def main():
    # Load components
    methods, embedder = load_all_components()
    # Main header
    st.markdown("""
    """, unsafe_allow_html=True)
    # Overview section
    with st.expander("Overview", expanded=True):
        st.markdown("""
        This app provides **AI-powered analysis** of endangered indigenous languages in South America,
        integrating knowledge graphs from **Glottolog, Wikipedia, and Wikidata**.
        """)
        cols = st.columns(2)
        with cols[0]:
            st.markdown("""
            🔹 Two AI Methods Available:

            - InfoMatch (Node2Vec + Textual Data)
            - LinkGraph (GraphSAGE + Structured Relations)

            🔹 Powered by Mistral-7B for contextual responses
            """, unsafe_allow_html=True)
        with cols[1]:
            st.markdown("""
            🛠️ Features

            - ✅ Multisource Knowledge Graph
            - ✅ Hybrid AI Analysis
            - ✅ Comparative Results
            - ✅ Structured & Unstructured Data
            """, unsafe_allow_html=True)
    # Sidebar
    with st.sidebar:
        # Logo and academic info
        st.markdown("### Departamento Académico de Humanidades")
        st.markdown("---")
        # Quick start guide
        st.markdown("### Quick Start")
        st.markdown("""
        1. **Type a question** in the input box
        2. **Click 'Analyze'** to compare methods
        3. **Explore results** with expandable details
        """)
        st.markdown("---")
        # Suggested questions: clicking one pre-fills the main input box
        st.markdown("### Example Queries")
        questions = [
            "What languages are endangered in Brazil?",
            "How many speakers does Aymara have?",
            "Which languages are related to Quechua?",
            "Where is Mapudungun spoken?"
        ]
        for q in questions:
            # st.button returns True on the rerun triggered by the click
            if st.button(q, key=f"example_{q}", use_container_width=True):
                st.session_state.query = q
        st.markdown("---")
        # Technical details
        st.markdown("### ⚙️ Technical Details")
        st.markdown("""
        - **Embeddings**: Node2Vec vs. GraphSAGE
        - **Language Model**: Mistral-7B-Instruct
        - **Knowledge Graph**: RDF-based integration
        """, unsafe_allow_html=True)
        st.markdown("---")
        # Data sources
        st.markdown("### Data Sources")
        st.markdown("""
        - **Glottolog** (Language classification)
        - **Wikipedia** (Textual summaries)
        - **Wikidata** (Structured facts)
        """)
        st.markdown("---")
        # Analysis parameters
        st.markdown("### Analysis Parameters")
        k = st.slider("Number of languages to analyze", 1, 10, 3)
        st.markdown("---")
        # Debug options
        st.markdown("### 🔧 Advanced Options")
        show_ctx = st.checkbox("Show context information", False)
        show_rdf = st.checkbox("Show structured facts", False)
    # Main query interface
    st.markdown("### Ask About Indigenous Languages")
    query = st.text_input(
        "Enter your question:",
        value=st.session_state.get("query", ""),
        label_visibility="collapsed",
        placeholder="e.g. What languages are spoken in Peru?"
    )
    if st.button("Analyze", type="primary", use_container_width=True):
        if not query:
            st.warning("Please enter a question")
            return
        # Run both methods side by side, one per column
        col1, col2 = st.columns(2)
        for col, (label, method) in zip([col1, col2], methods.items()):
            with col:
                st.markdown(f"#### {label} Method")
                st.caption({
                    "InfoMatch": "Node2Vec embeddings combining text and graph structure",
                    "LinkGraph": "GraphSAGE embeddings capturing network patterns"
                }[label])
                # Time the full retrieval + generation call
                start = datetime.datetime.now()
                response, lang_ids, context, rdf_data = generate_response(*method, query, k)
                duration = (datetime.datetime.now() - start).total_seconds()
                # Response display
                st.markdown(response)
                st.caption(f"⏱️ {duration:.2f}s | {len(lang_ids)} languages")
                # Additional information
                if show_ctx:
                    with st.expander(f"Context from {len(lang_ids)} languages"):
                        for lang_id, ctx in zip(lang_ids, context):
                            st.markdown(f"**{lang_id}**: {ctx}")
                if show_rdf:
                    with st.expander("Structured facts (RDF)"):
                        st.code("\n".join(rdf_data))
    # Footer note
    st.markdown("---")
    st.markdown("""
    Note: This tool is designed for researchers, linguists, and cultural preservationists.
    For best results, use specific questions about languages, families, or regions.
    """, unsafe_allow_html=True)
if __name__ == "__main__":
    main()