The tokenizer is different from Cohere's, and the chat template is ChatML. Fully fine-tuned at 128K+ context on a synthetic dataset of ~30M entries (web-crawl inputs, GPT-4-32k / GPT-3.5-16k outputs), for 2 epochs.
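
A minimal sketch of building a prompt with the ChatML template, assuming the tokenizer config shipped in this repo carries that template (the message contents here are just placeholders):

```python
# Sketch: format a ChatML conversation with the repo's tokenizer.
# Assumes the tokenizer config in CausalLM/35b-beta2ep includes the
# ChatML chat template mentioned above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("CausalLM/35b-beta2ep")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain what a chat template does."},
]

# Render to a plain string; add_generation_prompt appends the opening
# assistant turn so the model continues from there.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```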

For another candidate version trained for 1 epoch, see https://huggingface.co/CausalLM/35b-beta - it somehow shows less overfitting.

No LoRAs, no quants, no tricks.

This one is not "very 128k" - use https://huggingface.co/CausalLM/35b-beta-long for long-context work. It is, however, better at general tasks, knowledge, coding and so on.
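
A minimal loading sketch for general-purpose use; swap the repo id for CausalLM/35b-beta-long if you need long context. The prompt and generation settings are placeholders, and device_map="auto" assumes accelerate plus enough GPU memory for a 35B BF16 checkpoint:

```python
# Sketch: load the 2-epoch checkpoint in BF16 and generate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "CausalLM/35b-beta2ep"  # or "CausalLM/35b-beta-long" for long context
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Write a short note on model merging.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```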

And, merge them if you want!
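
As a rough illustration of what a merge of the 1-epoch and 2-epoch checkpoints looks like, here is a naive linear weight-averaging sketch. It is not a recommended recipe: at 35B parameters it needs a lot of CPU RAM, and dedicated tooling such as mergekit is the more practical route. The blend factor alpha is a hypothetical choice:

```python
# Naive sketch: linear interpolation between the two checkpoints' weights.
import torch
from transformers import AutoModelForCausalLM

alpha = 0.5  # hypothetical weight given to the 2-epoch checkpoint

model_a = AutoModelForCausalLM.from_pretrained(
    "CausalLM/35b-beta2ep", torch_dtype=torch.bfloat16
)
model_b = AutoModelForCausalLM.from_pretrained(
    "CausalLM/35b-beta", torch_dtype=torch.bfloat16
)

# Blend every parameter tensor: alpha * A + (1 - alpha) * B.
state_b = model_b.state_dict()
merged_state = {
    name: alpha * tensor_a + (1.0 - alpha) * state_b[name]
    for name, tensor_a in model_a.state_dict().items()
}

model_a.load_state_dict(merged_state)
model_a.save_pretrained("35b-beta-linear-merge")
```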
