Mariusz Kurman PRO

mkurman

AI & ML interests

AI Tech Lead | MD

Organizations

MedIT Solutions, BigScience Biomedical Datasets, SOWA Project, On Device Medical Notes

Just released NVAMP Loss!

āœ”ļø modification of the cross-entropy loss function designed specifically for training LLMs.
āœ”ļø twist on the standard cross-entropy loss by emphasizing the importance of outlier prediction errors and dynamically normalizing token-level variance.
āœ”ļø more stable and efficient training, leading to models that generalize better.

Check it out, give it a spin, and let me know what you think!

Licensed under the Apache 2.0 license and ready to use. Happy training! 🔥🤖

https://github.com/mkurman/nvamp-loss