
This is unbelievable! Bravo team 👏

#2
by yunomioni - opened

That's all I have to say. I really appreciate your team's efforts. Thank you!

Agentica org

Happy to hear this!

Some updates: our LCB scores have just been verified by a third party; they will officially be on the leaderboard early this week ;)

@michaelzhiluo What about merging your model? Have you looked at this? (https://huggingface.co/spacematt/Qwen2.5-Recursive-Coder-14B-Instruct, via the transitive Qwen2.5-Channel-Coder-14B-Instruct)


So far this has been good. I'm surprised how well and how fast it works, and how useful it is for coding (on a 3060). Thank you for the release.

@michaelzhiluo Your comment about the LCB scores being on the leaderboard soon was made 9 days ago. Maybe I am missing something, but I am unable to find them. Could you point me in the right direction, please?

Agentica org

Recently, the LCB leaderboard was updated to v6. The scores should be there if the date marker is set to <=2/1/2025. We will contact the leaderboard's authors to update it with the new DeepCoder results!

I am still unable to find the model. What am I doing wrong? These are the steps I am taking:

  1. Go to https://livecodebench.github.io/leaderboard.html
  2. Drag the right slider to 2/1/2025 (I also tried 1/1/2025 with the same result)
  3. Control + F and search for DeepCoder -> No results found

[screenshot: leaderboard filtered to 2/1/2025]

I get 16 results (only 15 visible in the screenshot, the 16th is Claude-3-Haiku)
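For anyone who wants to script this check instead of clicking through the page, the filter-then-search steps above can be sketched in Python. Note this is a toy illustration: the entries, dates, and `release_date` field below are hypothetical placeholders, not the leaderboard's real schema or logic.

```python
from datetime import date

# Hypothetical leaderboard rows; the real LiveCodeBench page loads its own data.
entries = [
    {"model": "DeepCoder-14B-Preview", "release_date": date(2025, 4, 8)},
    {"model": "o3-mini", "release_date": date(2025, 1, 31)},
    {"model": "Claude-3-Haiku", "release_date": date(2024, 3, 4)},
]

def visible_models(entries, cutoff):
    """Mimic the date slider: keep models whose marker date is <= cutoff."""
    return [e["model"] for e in entries if e["release_date"] <= cutoff]

def find_model(entries, cutoff, query):
    """Mimic Ctrl+F on the filtered table (case-insensitive substring match)."""
    return [m for m in visible_models(entries, cutoff) if query.lower() in m.lower()]
```

With these placeholder dates, a model whose marker date falls after the cutoff simply never reaches the Ctrl+F step, which would explain a "No results found" even though the model exists on the site.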

There were formerly 40+ results on the leaderboard before it was updated to v6. I happen to have a screenshot of v5 from a week ago.

[screenshot: v5 leaderboard from a week ago]

Also a Tweet from one of our collaborators: https://x.com/AlpayAriyak/status/1912171348409061468

Thanks for that! Unfortunate that they removed so many models all of a sudden, really weird. Is there any chance your model could also be evaluated on Aider's benchmark? It's quite a popular benchmark for testing the coding capabilities of LLMs. The leaderboard can be found here: https://aider.chat/docs/leaderboards/

How to run the benchmark can be found here: https://github.com/Aider-AI/aider/blob/main/benchmark/README.md


Still doing my own testing, but I tend to agree.

@Mushoz We can try that later... We do note that it is trained on competitive-coding problems, and we're currently working on training it for more general coding settings ;)
