Dmytro Dzhulgakov
dzhulgakov
5 followers · 10 following
AI & ML interests: None yet
dzhulgakov's activity
New activity in deepseek-ai/DeepSeek-V3.1, 2 days ago
Add tools to the end of the system prompt
#20 opened 2 days ago by dzhulgakov
New activity in moonshotai/Kimi-K2-Instruct, about 1 month ago
Adjust number of reserved tokens to match the model
#15 opened about 1 month ago by dzhulgakov
New activity in deepseek-ai/DeepSeek-V3, 8 months ago
Bug in fp8_cast_bf16.py • 1
#4 opened 8 months ago by dzhulgakov
Updated a model about 1 year ago
fireworks-ai/llama-68m-test • 0.1B • Updated Aug 9, 2024 • 5
New activity in deepseek-ai/DeepSeek-Coder-V2-Instruct, about 1 year ago
How important is the grouped_topk? • 👀 1
#6 opened about 1 year ago by dzhulgakov
New activity in google/gemma-2-9b, about 1 year ago
Can't repro MMLU: sliding window attention implementation seems broken • 3
#11 opened about 1 year ago by dzhulgakov
Updated a model about 1 year ago
fireworks-ai/mistral-7b-eagle-head-experimental
Text Generation • Updated Jun 1, 2024 • 6
New activity in google/gemma-7b-it, over 1 year ago
Running sample code gives me a shape error • 1
#22 opened over 1 year ago by dzhulgakov
New activity in DiscoResearch/mixtral-7b-8expert, over 1 year ago
Update modeling_moe_mistral.py • 2
#1 opened over 1 year ago by bjoernp
Commented on a paper almost 2 years ago
Mistral 7B
Paper • 2310.06825 • Published Oct 10, 2023 • 52 • 8