The bug is Hugging Face tokenizers issue #2058 (github.com/huggingface/...): the library counts merge pairs in a hash map with i32 values, and once any pair's count exceeds 2^31 − 1 = 2,147,483,647, the counter wraps to a negative value and that pair is never selected as a merge.
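To make the failure mode concrete, here is a minimal sketch of the arithmetic. It is not the library's code: `Pair`, `bump_wrapping`, `bump_wide`, and `best_pair` are hypothetical names invented for illustration. The wraparound is written out with `wrapping_add` so it behaves the same in debug and release builds, and the 64-bit counter is only one possible remedy, assumed here for contrast.

```rust
use std::collections::HashMap;

/// Hypothetical pair type: two token ids.
type Pair = (u32, u32);

/// Add `delta` to a pair's count with explicit wrapping arithmetic,
/// mimicking what an unchecked i32 `+=` does in a release build.
fn bump_wrapping(counts: &mut HashMap<Pair, i32>, pair: Pair, delta: i32) {
    let c = counts.entry(pair).or_insert(0);
    *c = c.wrapping_add(delta);
}

/// Same update with a 64-bit counter (an assumed remedy, not the upstream fix).
fn bump_wide(counts: &mut HashMap<Pair, i64>, pair: Pair, delta: i64) {
    *counts.entry(pair).or_insert(0) += delta;
}

/// Pick the pair with the highest count, as a BPE trainer does at each merge step.
fn best_pair<C: Ord + Copy>(counts: &HashMap<Pair, C>) -> Option<Pair> {
    counts.iter().max_by_key(|&(_, &c)| c).map(|(&p, _)| p)
}

fn main() {
    let frequent: Pair = (1, 2);
    let rare: Pair = (3, 4);

    // 32-bit counters: the frequent pair reaches 2^31 - 1, then sees one more hit.
    let mut narrow: HashMap<Pair, i32> = HashMap::new();
    bump_wrapping(&mut narrow, frequent, i32::MAX);
    bump_wrapping(&mut narrow, frequent, 1);
    bump_wrapping(&mut narrow, rare, 10);
    println!("i32 count: {}", narrow[&frequent]); // prints -2147483648
    println!("i32 best:  {:?}", best_pair(&narrow)); // prints Some((3, 4))

    // 64-bit counters: the same updates keep the frequent pair on top.
    let mut wide: HashMap<Pair, i64> = HashMap::new();
    bump_wide(&mut wide, frequent, i64::from(i32::MAX));
    bump_wide(&mut wide, frequent, 1);
    bump_wide(&mut wide, rare, 10);
    println!("i64 count: {}", wide[&frequent]); // prints 2147483648
    println!("i64 best:  {:?}", best_pair(&wide)); // prints Some((1, 2))
}
```

With the 32-bit map, the most frequent pair ends up with the lowest count of all, so a pair seen only ten times beats it at selection time.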