Rahim Khan

rahim-xelpmoc

AI & ML interests

None yet

Recent Activity

Organizations

Hugging Face Discord Community

rahim-xelpmoc's activity

upvoted an article 6 days ago

How to generate text: using different decoding methods for language generation with Transformers

• 193
New activity in jamesliu1217/EasyControl_Ghibli 13 days ago
published a Space 28 days ago
New activity in ds4sd/SmolDocling-256M-preview about 1 month ago

hallucinating a lot
#30 opened about 1 month ago by rahim-xelpmoc
reacted to AdinaY's post with 😎 2 months ago
reacted to nicolay-r's post with 🔥 3 months ago
📢 The LLaMA-3.1-8B distilled version of DeepSeek R1 is available, alongside the Qwen-based one.

📙 Notebook for using it in reasoning over a series of data 🧠:
https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_deep_seek_7b_distill_llama3.ipynb

Loading using the pipeline API of the transformers library:
https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/transformers_llama.py
🟡 GPU usage: 12.3 GB (FP16/FP32 mode), which is suitable for a T4 (1.5 GB less than the Qwen-distilled version).
🌍 Performance on a T4 instance: ~0.19 tokens/sec (FP32 mode) and ~0.22-0.30 tokens/sec (FP16 mode). Should it be that slow? 🤔
Model name: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
โญ Framework: https://github.com/nicolay-r/bulk-chain
๐ŸŒŒ Notebooks and models hub: https://github.com/nicolay-r/nlp-thirdgate
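The post above mentions loading the model via the pipeline API of the `transformers` library. A minimal sketch of how that loading step might look, assuming the model name from the post; the FP16 dtype and `device_map` settings are illustrative assumptions, not taken from the linked script:

```python
MODEL_NAME = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"


def build_generator(model_name: str = MODEL_NAME):
    """Build a text-generation pipeline for the distilled R1 model.

    FP16 keeps GPU memory around the ~12 GB noted in the post, which
    fits a T4 instance.
    """
    # Imported lazily so the module can be inspected without pulling in
    # the heavy dependencies.
    import torch
    from transformers import pipeline

    return pipeline(
        "text-generation",
        model=model_name,
        torch_dtype=torch.float16,  # FP16 mode, as benchmarked in the post
        device_map="auto",          # place weights on the available GPU
    )


if __name__ == "__main__":
    generator = build_generator()
    out = generator(
        "Explain chain-of-thought reasoning in one sentence.",
        max_new_tokens=64,
    )
    print(out[0]["generated_text"])
```

The first call downloads ~16 GB of weights from the Hugging Face Hub, so expect a long cold start before any tokens are generated.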