Context length

#4
by harley-pham - opened

Thanks for the good work!
BTW, can you explain why the context length is only 512 while the original model's seems much larger?

No problem. The model had to have a fixed context size to work on ANE. Longer contexts are slower even when you have a short sequence (the rest of the window is filled with padding). I chose 512 since it is usable and also decently fast. You can probably go a bit higher, but the speed scales worse than linearly with context length.
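As a rough sketch of what that fixed window means in practice (the pad token id and shapes here are hypothetical, not the actual conversion code), every prompt gets right-padded out to the full 512 positions before it is fed to the model:

```python
import numpy as np

CONTEXT_LENGTH = 512   # fixed window the ANE model was exported with
PAD_TOKEN_ID = 0       # hypothetical pad id; depends on the tokenizer

def pad_to_context(token_ids):
    """Right-pad a prompt to the fixed context and build an attention mask."""
    n = len(token_ids)
    if n > CONTEXT_LENGTH:
        raise ValueError(f"prompt of {n} tokens exceeds fixed context {CONTEXT_LENGTH}")
    input_ids = np.full((1, CONTEXT_LENGTH), PAD_TOKEN_ID, dtype=np.int32)
    input_ids[0, :n] = token_ids
    # 1 for real tokens, 0 for padding, so the model ignores the padded tail
    attention_mask = np.zeros((1, CONTEXT_LENGTH), dtype=np.int32)
    attention_mask[0, :n] = 1
    return input_ids, attention_mask
```

Because the whole fixed window is always processed, even a 20-token prompt pays the cost of the full 512 positions, which is why picking a much larger fixed context would slow every call down.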

It would be possible to use multifunction models to support different sized contexts but I haven’t gotten around to that.

The KV cache also gets to be quite substantial for longer contexts.
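For a back-of-the-envelope sense of that growth (the layer count, head count, and head dimension below are assumed for illustration, with a float16 cache), the KV cache scales linearly with context length:

```python
def kv_cache_bytes(context_len, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    """Approximate KV cache size: keys + values for every layer and position."""
    return 2 * n_layers * context_len * n_kv_heads * head_dim * bytes_per_elem

for ctx in (512, 2048, 8192):
    print(f"{ctx:5d} tokens -> {kv_cache_bytes(ctx) / 2**20:.0f} MiB")
# With these assumed dimensions: 512 -> 64 MiB, 2048 -> 256 MiB, 8192 -> 1024 MiB
```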
