Ravi

ravi4198

AI & ML interests

None yet

Recent Activity

liked a model 17 days ago
meta-llama/Llama-4-Maverick-17B-128E
liked a model about 1 month ago
google/gemma-3-27b-it

Organizations

ONNX Community

ravi4198's activity

reacted to smirki's post with 🔥 2 months ago
Post
3404
UIGEN for Tailwind v4 is coming soon!
  • 2 replies
reacted to fdaudens's post with 👍 2 months ago
Post
2137
🔊 Meet Kokoro Web - Free, ML speech synthesis on your computer, that'll make you ditch paid services!

28 natural voices, unlimited generations, and WebGPU acceleration. Perfect for journalists and content creators.

Test it with full articles. It sounds amazingly human! 🎯🎙️

Xenova/kokoro-web
reacted to hexgrad's post with 🔥 2 months ago
Post
7064
Wanted: Peak Data. I'm collecting audio data to train another TTS model:
+ AVM data: ChatGPT Advanced Voice Mode audio & text from source
+ Professional audio: Permissive (CC0, Apache, MIT, CC-BY)

This audio should *impress* most native speakers, not just barely pass their audio Turing tests. Professional-caliber means S or A-tier, not your average bloke off the street. Traditional TTS may not make the cut. Absolutely no low-fi microphone recordings like Common Voice.

The bar is much higher than last time, so there are no timelines yet and I expect it may take longer to collect such mythical data. Raising the bar means evicting quite a bit of old data, and voice/language availability may decrease. The theme is *quality* over quantity. I would rather have 1 hour of A/S-tier than 100 hours of mid data.

I have nothing to offer but the north star of a future Apache 2.0 TTS model, so prefer data that you *already have* and costs you *nothing extra* to send. Additionally, *all* the new data may be used to construct public, Apache 2.0 voicepacks, and if that arrangement doesn't work for you, no need to send any audio.

Last time I asked for horses; now I'm asking for unicorns. As of writing this post, I've currently got a few English & Chinese unicorns, but there is plenty of room in the stable. Find me over on Discord at rzvzn: https://discord.gg/QuGxSWBfQy
reacted to Xenova's post with 🔥 2 months ago
Post
13156
We did it. Kokoro TTS (v1.0) can now run 100% locally in your browser w/ WebGPU acceleration. Real-time text-to-speech without a server. ⚡️

Generate 10 seconds of speech in ~1 second for $0.

What will you build? 🔥
webml-community/kokoro-webgpu

The most difficult part was getting the model running in the first place, but the next steps are simple:
โœ‚๏ธ Implement sentence splitting, allowing for streamed responses
๐ŸŒ Multilingual support (only phonemization left)

Who wants to help?
New activity in onnx-community/Kokoro-82M-v1.0-ONNX 2 months ago

Appreciation

2
#1 opened 2 months ago by ravi4198
reacted to m-ric's post with 👍 3 months ago
Post
3386
Today we make the biggest release in smolagents so far: we enable vision models, which allows building powerful web browsing agents! 🥳

Our agents can now casually open up a web browser, and navigate on it by scrolling, clicking elements on the webpage, going back, just like a user would.

The demo below shows Claude-3.5-Sonnet browsing GitHub for task: "Find how many commits the author of the current top trending repo did over last year."
Hi @mlabonne !

Go try it out, it's the most cracked agentic stuff I've seen in a while 🤯 (well, along with OpenAI's Operator who beat us by one day)

For more detail, read our announcement blog 👉 https://huggingface.co/blog/smolagents-can-see
The code for the web browser example is here 👉 https://github.com/huggingface/smolagents/blob/main/examples/vlm_web_browser.py
reacted to onekq's post with 🔥 3 months ago
Post
2734
This is historic. 🎉

DeepSeek 🐋R1🐋 surpassed OpenAI 🍓o1🍓 on the dual leaderboard. What a year for open source!

onekq-ai/WebApp1K-models-leaderboard
reacted to onekq's post with 🔥 3 months ago
Post
4813
๐Ÿ‹DeepSeek ๐Ÿ‹ is the real OpenAI ๐Ÿ˜ฏ