Andriy (Burkov) – Community Activity

New activity in google/gemma-2-9b 9 months ago

Issues with FSDP and DeepSpeed During Distributed Training for Gemma

2

5

#30 opened 9 months ago by

anandhperumal

New activity in mistralai/Mistral-7B-Instruct-v0.2 about 1 year ago

How does v0.2 manage to support 32k token context without Sliding Window Attention?

4

#85 opened about 1 year ago by

Andriy

What is the max. content length of Mistral-7B-Instruct-v0.2?

17

#43 opened about 1 year ago by

hanshupe

New activity in 1bitLLM/bitnet_b1_58-3B about 1 year ago

Longer inference time

2

#4 opened about 1 year ago by

dittops

New activity in 01-ai/Yi-34B-Chat about 1 year ago

What is the SFT data?

3

5

#7 opened over 1 year ago by

Ede-CH

New activity in abacaj/phi-2-super about 1 year ago

Dataset?

5

#1 opened about 1 year ago by

0xbitches

New activity in abacusai/Smaug-72B-v0.1 about 1 year ago

Questions about architecture (+ LoRA)

2

#16 opened about 1 year ago by

alex0dd

New activity in OpenPipe/mistral-ft-optimized-1218 over 1 year ago

Can you tell us the original models that you merged to create this model?

3

1

#3 opened over 1 year ago by

Bruce001
