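	# Assumes `model`, `tokenizer`, and `prompt` are already defined.
	# Tokenize the prompt into a PyTorch tensor of input IDs.
	input_ids = tokenizer(prompt, return_tensors="pt").input_ids
	# Sample up to 100 tokens (prompt included) at temperature 0.9.
	gen_tokens = model.generate(
	    input_ids,
	    do_sample=True,
	    temperature=0.9,
	    max_length=100,
	)
	# Decode the first (and only) generated sequence back to text.
	gen_text = tokenizer.batch_decode(gen_tokens)[0]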

	Using Flash Attention 2
	Flash Attention 2 is a faster, optimized implementation of the model's attention computation.
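
As a rough illustration, the sketch below shows one way Flash Attention 2 might be enabled when loading a model with the `transformers` library, through the `attn_implementation` argument of `from_pretrained`. The helper names `pick_attn_implementation` and `load_model` and the `checkpoint` argument are hypothetical; the text above does not name a specific model. The sketch assumes the `flash-attn` package, a supported CUDA GPU, and half precision are needed, and it falls back to the default "eager" attention when `flash-attn` is not installed.

```python
import importlib.util


def pick_attn_implementation() -> str:
    """Return "flash_attention_2" if the flash_attn package is installed, else "eager"."""
    # Hypothetical helper: it only checks that the package can be found,
    # not that the GPU or the model architecture actually supports it.
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "eager"


def load_model(checkpoint: str):
    """Load a causal LM, requesting Flash Attention 2 when it is available."""
    # Imported lazily so the helper above can be used without torch installed.
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        checkpoint,  # hypothetical: any checkpoint whose architecture supports Flash Attention 2
        torch_dtype=torch.float16,  # Flash Attention 2 runs in fp16 or bf16 only
        attn_implementation=pick_attn_implementation(),
    )
```

After loading, the model would typically be moved to the GPU (for example with `model.to("cuda")`) before calling `generate` as in the snippet above.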