Using Local self attention, the memory and time complexity of the query-key matmul operation, which usually
represents the memory and time bottleneck in a transformer model, can be reduced from
\(\mathcal{O}(n_s \times n_s)\) to \(\mathcal{O}(n_s \times l_{c})\), with \(n_s\) being the sequence length and
\(l_{c}\) being the fixed local chunk length, so the cost grows only linearly in \(n_s\).
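The following is a minimal sketch of this idea, not the library's actual implementation: the sequence length, head dimension, and chunk length below are illustrative assumptions, and each chunk only attends to itself (the real local attention can additionally attend to a configurable number of neighboring chunks, which keeps the cost linear in \(n_s\)).

```python
# Illustrative sketch of chunked (local) query-key scores -- not the
# library's implementation. Shapes and the chunk length are assumptions.
import torch

n_s, d, l_c = 1024, 64, 64           # sequence length, head dim, chunk length
query = torch.randn(n_s, d)
key = torch.randn(n_s, d)

# Full self attention: one (n_s, n_s) score matrix -> O(n_s * n_s)
full_scores = query @ key.T          # shape (1024, 1024)

# Local self attention: reshape into chunks and score each chunk only
# against itself -> n_s / l_c matrices of shape (l_c, l_c) -> O(n_s * l_c)
q_chunks = query.view(n_s // l_c, l_c, d)
k_chunks = key.view(n_s // l_c, l_c, d)
local_scores = torch.einsum("cqd,ckd->cqk", q_chunks, k_chunks)  # (16, 64, 64)

print(full_scores.numel(), local_scores.numel())  # 1048576 vs. 65536
```

For a fixed chunk length, the number of score entries scales with \(n_s \times l_{c}\) rather than \(n_s \times n_s\), which is exactly the reduction stated above.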