Nishith Jain

KingNish

AI & ML interests

AI is fun actually. Busy till June 2025.

Recent Activity

liked a Space about 7 hours ago
KingNish/Transformers.js-Playground
updated a Space about 7 hours ago
KingNish/Transformers.js-Playground
published a Space about 7 hours ago
KingNish/Transformers.js-Playground

Organizations

Wikimedia, OpenGVLab, Blog-explorers, Multi🤖Transformers, The Collectionists, HelpingAI, ZeroGPU Explorers, Project Fluently, Poscye, INNOVA AI, Narra, Social Post Explorers, Cognitive Computations, Dev Mode Explorers, Stable Diffusion Community (Unofficial, Non-profit), ONNX Community, Hugging Face Discord Community, Nerdy Face, None yet, Project R, Doge Face

KingNish's activity

reacted to stefan-french's post with 😎 3 days ago
reacted to as-cle-bert's post with 🔥 7 days ago
Llama-4 is out and I couldn't resist cooking something with it... So I came up with LlamaResearcher (https://llamaresearcher.com), your deep-research AI companion! 🔎

The workflow behind LlamaResearcher is simple (a sketch follows the list):
💬 You submit a query
🛡️ Your query is evaluated by a Llama 3 guard model, which deems it safe or unsafe
🧠 If your query is safe, it is routed to the Researcher Agent
⚙️ The Researcher Agent expands the query into three sub-queries with which to search the web
🌐 The web is searched for each of the sub-queries
📊 The retrieved information is evaluated for relevancy against your original query
✍️ The Researcher Agent produces an essay based on the information it gathered, paying attention to referencing its sources

The agent itself is also built with easy-to-use and intuitive blocks (a minimal rate-limiting sketch follows the list):
🦙 LlamaIndex provides the agentic architecture and the integrations with the language models
⚡ Groq makes Llama-4 available with its lightning-fast inference
🔎 Linkup allows the agent to deep-search the web and provides sourced answers
💪 FastAPI does the heavy lifting, wrapping everything within an elegant API interface
⏱️ Redis is used for API rate limiting
🎨 Gradio creates a simple but powerful user interface
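
On that rate-limiting point, a Redis counter in FastAPI fits in a few lines; a minimal sketch assuming a local Redis server (the 10-requests-per-minute limit and the endpoint are illustrative, not LlamaResearcher's actual code):

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
import redis

app = FastAPI()
r = redis.Redis()  # assumes Redis on localhost:6379

@app.middleware("http")
async def rate_limit(request: Request, call_next):
    key = f"rate:{request.client.host}"  # one counter per client IP
    count = r.incr(key)
    if count == 1:
        r.expire(key, 60)  # window resets after 60 seconds
    if count > 10:  # illustrative limit: 10 requests per minute
        return JSONResponse({"detail": "Too many requests"}, status_code=429)
    return await call_next(request)

@app.get("/query")
async def query(q: str):
    return {"status": f"researching {q!r}"}  # placeholder endpoint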

Special mention also to Lovable, which helped me build the first draft of the landing page for LlamaResearcher! 💖

If you're curious and want to try LlamaResearcher, you can do so completely free and without a subscription for the next 30 days ➡️ https://llamaresearcher.com
And if you're like me and like getting your hands in code and building stuff on your own machine, I have good news: this is all open-source, fully reproducible locally, and Docker-ready 🐋
Just go to the GitHub repo: https://github.com/AstraBert/llama-4-researcher and don't forget to star it if you find it useful! ⭐

As always, have fun and feel free to leave your feedback ✨
reacted to merterbak's post with 👀 10 days ago
reacted to abidlabs's post with ❤️ 15 days ago
JOURNEY TO 1 MILLION DEVELOPERS

5 years ago, we launched Gradio as a simple Python library to let researchers at Stanford easily demo computer vision models with a web interface.

Today, Gradio is used by >1 million developers each month to build and share AI web apps. This includes some of the most popular open-source projects of all time, like Automatic1111, Fooocus, Oobabooga's Text WebUI, Dall-E Mini, and LLaMA-Factory.

How did we get here? How did Gradio keep growing in the very crowded field of open-source Python libraries? I get this question a lot from folks who are building their own open-source libraries. This post distills some of the lessons that I have learned over the past few years:

1. Invest in good primitives, not high-level abstractions
2. Embed virality directly into your library
3. Focus on a (growing) niche
4. Your only roadmap should be rapid iteration
5. Maximize ways users can consume your library's outputs

1. Invest in good primitives, not high-level abstractions

When we first launched Gradio, we offered only one high-level class (gr.Interface), which created a complete web app from a single Python function. We quickly realized that developers wanted to create other kinds of apps (e.g. multi-step workflows, chatbots, streaming applications), but as we started listing out the apps users wanted to build, we realized what we needed to do:

Read the rest here: https://x.com/abidlabs/status/1907886
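
For context on that original primitive: gr.Interface still turns a single Python function into a complete web app in a few lines (a minimal sketch; the greet function is just an illustration):

import gradio as gr

def greet(name: str) -> str:
    return f"Hello, {name}!"

# gr.Interface wraps one function in a complete web UI
demo = gr.Interface(fn=greet, inputs="text", outputs="text")
demo.launch()
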
reacted to hexgrad's post with 👀 15 days ago
To Meta AI Research: I would like to fold ylacombe/expresso into the training mix of an Apache TTS model series. Can you relax the Expresso dataset license to CC-BY or more permissive?

Barring that, can I have an individual exception to train on the materials and distribute trained Apache models, without direct redistribution of the original files? Thanks!

CC (Expresso paper authors whose handles I could find on HF) @wnhsu @adavirro @bowenshi @itaigat @TalRemez @JadeCopet @hassid @felixkreuk @adiyoss @edupoux
reacted to clem's post with 🔥 17 days ago
Before 2020, most of the AI field was open and collaborative. For me, that was the key factor that accelerated scientific progress and made the impossible possible: just look at the "T" in ChatGPT, which comes from the Transformer architecture openly shared by Google.

Then came the myth that AI was too dangerous to share, and companies started optimizing for short-term revenue. That led many major AI labs and researchers to stop sharing and collaborating.

With OAI and sama now saying they're willing to share open weights again, we have a real chance to return to a golden age of AI progress and democratization, powered by openness and collaboration, in the US and around the world.

This is incredibly exciting. Let's go, open science and open-source AI!
reacted to burtenshaw's post with ❤️ 28 days ago
The Hugging Face Agents Course now includes three major agent frameworks!

🔗 agents-course

This includes LlamaIndex, LangChain, and our very own smolagents. We've worked to integrate the three frameworks in distinctive ways so that learners can reflect on when and where to use each.

This also means that you can follow the course if you're already familiar with one of these frameworks, and soak up some of the fundamental knowledge in earlier units.

Hopefully, this makes the agents course accessible to as many people as possible.
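
For a taste of the smolagents track, a minimal agent looks roughly like this (a sketch; HfApiModel was the default model class in early-2025 smolagents releases, and the question is illustrative):

from smolagents import CodeAgent, HfApiModel

# A code-writing agent backed by a Hugging Face Inference API model
agent = CodeAgent(tools=[], model=HfApiModel())
agent.run("How many seconds are there in a leap year?")
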
reacted to chansung's post with ❤️ 30 days ago
Mistral AI Small 3.1 24B is not only free for commercial use but also the best model for single-GPU deployment.

I packed up all the information you need to know in a single picture. Hope this helps! :)
reacted to fdaudens's post with 🔥 30 days ago
🔊 Meet Orpheus: a breakthrough open-source TTS model that matches human-level speech with empathy & emotion.
- Available in 4 sizes (150M–3B parameters)
- Ultra-fast streaming
- Zero-shot voice cloning
- Apache 2.0 license

canopylabs/orpheus-tts-67d9ea3f6c05a941c06ad9d2
reacted to mlabonne's post with 🚀 about 1 month ago
โœ‚๏ธ Gemma 3 Abliterated

I noticed that Gemma 3 was much more resilient to refusal removal than other models like Qwen 2.5.

I experimented with different recipes and improved the abliteration technique I wrote about last year.

It's still experimental, but the refusal rate is super low in my tests. Enjoy! (A conceptual sketch of the idea follows the model links.)

mlabonne/gemma-3-4b-it-abliterated
mlabonne/gemma-3-12b-it-abliterated
mlabonne/gemma-3-27b-it-abliterated
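
For the curious: the core of abliteration is estimating a "refusal direction" from paired harmful/harmless prompt activations and projecting it out of the weights. A conceptual PyTorch sketch on toy tensors (not the exact recipe used for these models):

import torch

hidden_dim = 64  # toy size; real models use hidden states from a chosen layer

# Mean activations over refused vs. accepted prompts (assumed precomputed)
harmful_mean = torch.randn(hidden_dim)
harmless_mean = torch.randn(hidden_dim)

# The refusal direction is the normalized difference of means
refusal_dir = harmful_mean - harmless_mean
refusal_dir /= refusal_dir.norm()

# Orthogonalize a weight matrix against that direction: W <- W - r r^T W,
# so the layer can no longer write along the refusal direction
W = torch.randn(hidden_dim, hidden_dim)
W_abliterated = W - torch.outer(refusal_dir, refusal_dir) @ W

x = torch.randn(hidden_dim)
print(torch.dot(W_abliterated @ x, refusal_dir))  # ~0: component removed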

reacted to KaiChen1998's post with 🔥 about 1 month ago
📢 Our EMOVA paper has been accepted by CVPR 2025, and we are glad to release all resources, including code (training & inference), datasets (training & evaluation), and checkpoints (EMOVA-3B/7B/72B)!

🤗 EMOVA is a novel end-to-end omni-modal LLM that can see, hear and speak. Given omni-modal (i.e., textual, visual and speech) inputs, EMOVA can generate both textual and speech responses with vivid emotional controls by utilizing the speech decoder and a style controller.

✨ EMOVA Highlights
✅ State-of-the-art omni-modality: EMOVA achieves SoTA-comparable results on both vision-language and speech benchmarks simultaneously.
✅ Device adaptation: our codebase supports training/inference on both NVIDIA GPUs (e.g., A800 & H20) and Ascend NPUs (e.g., 910B3)!
✅ Modular design: we integrate multiple implementations of vision encoder, vision projector, and language model, even including the most recent DeepSeekMoE-tiny!

🔥 You are all welcome to try and star!
- Project page: https://emova-ollm.github.io/
- Github: https://github.com/emova-ollm/EMOVA
- Demo: Emova-ollm/EMOVA-demo
reacted to m-ric's post with 🔥🤗 about 1 month ago
smolagents now supports vLLM! 🥳

As vLLM is one of the most popular local inference solutions, the community had been asking us to integrate it: after a heavy refactoring of our LLM classes, we've just released smolagents 1.11.0, with a brand new VLLMModel class.

Go try it and tell us what you think!

https://github.com/huggingface/smolagents/blob/45b2c86857b7f7657daaa74e4d17d347e9e2c4a4/src/smolagents/models.py#L497
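
Usage is roughly this (a sketch; the model_id and the question are illustrative, and vLLM must be installed):

from smolagents import CodeAgent, VLLMModel

# Runs the checkpoint locally through vLLM's inference engine
model = VLLMModel(model_id="Qwen/Qwen2.5-7B-Instruct")
agent = CodeAgent(tools=[], model=model)
agent.run("How many prime numbers are there below 100?")
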
reacted to clem's post with 🤗 about 1 month ago
We just crossed 1,500,000 public models on Hugging Face (and 500k spaces, 330k datasets, 50k papers). One new repository is created every 15 seconds. Congratulations all!
reacted to AdinaY's post with 🔥🤗 about 1 month ago
Open Sora 2.0 is out 🔥
hpcai-tech/open-sora-20-67cfb7efa80a73999ccfc2d5
✨ 11B with Apache 2.0
✨ Low training cost: $200k
✨ Open weights, code, and training workflow
reacted to burtenshaw's post with 🤗 about 1 month ago
everybody and their dog is fine-tuning Gemma 3 today, so I thought I'd do a longer post on the tips and sharp edges I find. let's go!

1. you have to install everything from main and nightly. this is what I'm working with to get unsloth and TRL running

git+https://github.com/huggingface/transformers@main
git+https://github.com/huggingface/trl.git@main
bitsandbytes
peft


plus this with --no-deps

git+https://github.com/unslothai/unsloth-zoo.git@nightly
git+https://github.com/unslothai/unsloth.git@nightly


2. will brown's code to turn GSM8k into a reasoning dataset is a nice toy experiment https://gist.github.com/willccbb/4676755236bb08cab5f4e54a0475d6fb

3. with a learning rate of 5e-6, rewards and loss stayed flat for the first 100 or so steps.

4. so far none of my runs have degraded the outputs after 1 epoch. therefore, I'm mainly experimenting with bigger LoRA adapters.

from trl import GRPOConfig

training_args = GRPOConfig(
    learning_rate = 5e-6,
    adam_beta1 = 0.9,
    adam_beta2 = 0.99,
    weight_decay = 0.1,
    warmup_ratio = 0.1,
    lr_scheduler_type = "cosine",
    optim = "adamw_8bit",            # 8-bit AdamW to save optimizer memory
    logging_steps = 1,
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 1,
    num_generations = 2,             # completions sampled per prompt for GRPO
    max_prompt_length = 256,
    max_completion_length = 1024 - 256,  # rest of the 1024 budget goes to completions
    num_train_epochs = 1,
    max_steps = 250,
    save_steps = 250,
    max_grad_norm = 0.1,
    report_to = "none",
)
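
to actually launch a run, you pass that config to GRPOTrainer with a reward function and a dataset. a sketch in the style of the TRL quickstart (the length-based reward and TLDR dataset are placeholders, not my exact setup):

from datasets import load_dataset
from trl import GRPOTrainer

# toy reward: prefer completions close to 512 characters (placeholder only)
def reward_len(completions, **kwargs):
    return [-abs(512 - len(completion)) for completion in completions]

trainer = GRPOTrainer(
    model="google/gemma-3-4b-it",
    reward_funcs=reward_len,
    args=training_args,  # the GRPOConfig above
    train_dataset=load_dataset("trl-lib/tldr", split="train"),
)
trainer.train()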


5. vision fine-tuning isn't available in TRL's GRPOTrainer, so stick to text datasets. but no need to load the model differently in transformers or Unsloth

from transformers import AutoModelForImageTextToText

# Gemma 3 is an image-text-to-text model, even when fine-tuning on text only
model = AutoModelForImageTextToText.from_pretrained("google/gemma-3-4b-it")


if you want an introduction to GRPO, check out the reasoning course; it walks you through the algorithm, theory, and implementation in a smooth way.

reasoning-course
reacted to thomwolf's post with 🔥 about 1 month ago
We've kept pushing our Open-R1 project, an open initiative to replicate and extend the techniques behind DeepSeek-R1.

And even we were mind-blown by the results we got with this latest model we're releasing: ⚡️OlympicCoder ( open-r1/OlympicCoder-7B and open-r1/OlympicCoder-32B)

It's beating Claude 3.7 on (competitive) programming, a domain Anthropic has historically been really strong at, and it's getting close to o1-mini/R1 on olympiad-level coding with just 7B parameters!

And the best part is that we're open-sourcing everything about it: the training dataset, the new IOI benchmark, and more in our Open-R1 progress report #3: https://huggingface.co/blog/open-r1/update-3

Datasets we are releasing:
- open-r1/codeforces
- open-r1/codeforces-cots
- open-r1/ioi
- open-r1/ioi-test-cases
- open-r1/ioi-sample-solutions
- open-r1/ioi-cots
- open-r1/ioi-2024-model-solutions
reacted to BrigitteTousi's post with 🤗 about 1 month ago
Regardless of X being down or not, so glad I can rely on HF Posts for AI news ❤️🤗
reacted to Smooke's post with 👍 about 1 month ago
Hallucinations Blog Research Reading List:

Hallucinations Are A Feature of AI, Humans Are The Bug https://hackernoon.com/hallucinations-are-a-feature-of-ai-humans-are-the-bug

Overcome LLM Hallucinations Using Knowledge Bases https://hackernoon.com/overcome-llm-hallucinations-using-knowledge-bases

How to Detect and Minimise Hallucinations in AI Models https://hackernoon.com/how-to-detect-and-minimise-hallucinations-in-ai-models

Predictive Coding, AI: Modeling Placebos in RCTs for Psychedelics and Antidepressants https://hackernoon.com/predictive-coding-ai-modeling-placebos-in-rcts-for-psychedelics-and-antidepressants

A Simple Method to Improving the Accuracy of Your RAG System https://hackernoon.com/say-goodbye-to-ai-hallucinations-a-simple-method-to-improving-the-accuracy-of-your-rag-system

Gen AI Hallucinations: The Good, the Bad, and the Costly https://hackernoon.com/gen-ai-hallucinations-the-good-the-bad-and-the-costly

Why Do LLMs Hallucinate? https://hackernoon.com/why-do-llms-hallucinate

Truth Serum For The AI Age: Factiverse To Fight Fake News And Hallucinations https://hackernoon.com/truth-serum-for-the-ai-age-factiverse-to-fight-fake-news-and-hallucinations

A Secret Technique To Sidestepping LLM Hallucinations https://hackernoon.com/a-secret-technique-to-sidestepping-llm-hallucinations

The Importance of Explainability in AI (XAI) https://hackernoon.com/tackling-ai-hallucinations-the-importance-of-explainability-in-ai-xai

What You Need to Know About Amazon Bedrock's RAG Evaluation and LLM-as-a-Judge for Advancing AI https://hackernoon.com/what-you-need-to-know-about-amazon-bedrocks-rag-evaluation-and-llm-as-a-judge-for-advancing-ai

I Over Relied on AI and Those Shortcuts Cost Me https://hackernoon.com/i-over-relied-on-ai-and-those-shortcuts-cost-me

AI's Non-Determinism, Hallucinations, And... Cats? https://hackernoon.com/ais-non-determinism-hallucinations-and-cats

More to read --> https://hackernoon.com/search?query=hallucinations