togethercomputer/LLaMA-2-7B-32K

Text Generation

text-generation-inference



5 contributors

History: 20 commits

Latest commit by Birchlabs:

Fix RuntimeError: pad attn scores back to original query sequence length, instead of unpadded sequence length (i.e. no change).

e6c58da over 1 year ago

.gitattributes

1.52 kB

initial commit almost 2 years ago
README.md

6.15 kB

Update README.md almost 2 years ago
added_tokens.json

21 Bytes

init almost 2 years ago
config.json

709 Bytes

init almost 2 years ago
generation_config.json

132 Bytes

init almost 2 years ago
modeling_flash_llama.py

45.3 kB

Fix RuntimeError: pad attn scores back to original query sequence length, instead of unpadded sequence length (i.e. no change). over 1 year ago
pytorch_model-00001-of-00002.bin
Detected Pickle imports (4)
- "torch._utils._rebuild_tensor_v2",
- "torch.FloatStorage",
- "torch.HalfStorage",
- "collections.OrderedDict"
9.98 GB

fix missing inv freq almost 2 years ago
pytorch_model-00002-of-00002.bin
Detected Pickle imports (4)
- "torch.FloatStorage",
- "torch._utils._rebuild_tensor_v2",
- "torch.HalfStorage",
- "collections.OrderedDict"
3.5 GB

fix missing inv freq almost 2 years ago
pytorch_model.bin.index.json

26.8 kB

fix missing inv freq almost 2 years ago
special_tokens_map.json

435 Bytes

init almost 2 years ago
tokenizer.model

500 kB

init almost 2 years ago
tokenizer_config.json

670 Bytes

Update tokenizer_config.json almost 2 years ago