Model Details
Mistral-7B-v0.1 is a decoder-only language model with the following architectural choices:
* Sliding Window Attention - trained with an 8k context length and a fixed cache size, giving a theoretical attention span of 128K tokens.
* GQA (Grouped Query Attention) - allows faster inference and a smaller cache.
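
The two choices above can be made concrete with a little arithmetic. The sketch below is illustrative only: the layer count, head counts, head dimension, and window size are taken from the published Mistral-7B-v0.1 configuration rather than from this text, so treat them as assumptions, and the helper names are hypothetical. It shows how a per-layer window stacks into the 128K theoretical span, and how sharing key/value heads across query heads shrinks the cache.

```python
# Assumed values from the published Mistral-7B-v0.1 configuration
# (not stated in the text above).
N_LAYERS = 32          # transformer layers
N_QUERY_HEADS = 32     # attention (query) heads per layer
N_KV_HEADS = 8         # key/value heads per layer under GQA
HEAD_DIM = 128         # dimension of each head
SLIDING_WINDOW = 4096  # tokens each layer can look back over
BYTES_PER_VALUE = 2    # fp16 / bf16 storage


def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when query i may attend to key j.

    Attention is causal (j <= i) and limited to the last `window` keys.
    """
    return [[(j <= i) and (i - j < window) for j in range(seq_len)]
            for i in range(seq_len)]


def theoretical_span(window, n_layers):
    """Each layer extends the reach by one window, so spans add up."""
    return window * n_layers


def kv_cache_bytes(cached_tokens, n_kv_heads, n_layers=N_LAYERS,
                   head_dim=HEAD_DIM, bytes_per_value=BYTES_PER_VALUE):
    """Cache size for one sequence; the factor 2 covers keys and values."""
    return 2 * n_layers * n_kv_heads * head_dim * cached_tokens * bytes_per_value


def query_to_kv_head(query_head, n_query_heads=N_QUERY_HEADS,
                     n_kv_heads=N_KV_HEADS):
    """Index of the key/value head shared by a given query head."""
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size


if __name__ == "__main__":
    # 4096 tokens per layer * 32 layers = 131072 tokens (128K).
    print(theoretical_span(SLIDING_WINDOW, N_LAYERS))

    # The window bounds the cache at 4096 tokens per layer.
    gqa = kv_cache_bytes(SLIDING_WINDOW, N_KV_HEADS)
    mha = kv_cache_bytes(SLIDING_WINDOW, N_QUERY_HEADS)
    print(gqa // 2**20, "MiB with GQA vs", mha // 2**20, "MiB without")

    # Tiny mask for inspection: 5 tokens, window of 3.
    for row in sliding_window_mask(5, 3):
        print("".join("x" if allowed else "." for allowed in row))
```

Under these assumptions the cache for one sequence is four times smaller with 8 key/value heads than with 32, and it stops growing once the sequence exceeds the window, which is what "fixed cache size" refers to.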