schihei/llama2.c
285 Commits, 5 Branches, 0 Tags
4c6f0af9ff3671b0b8053c6a3a512a06bad5c676
Commit Graph
3 Commits
| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Andrej Karpathy | 4c6f0af9ff | add the ability to train a custom sentencepiece tokenizer with a given vocab_size, and pretok with it. some more changes still needed to merge this branch, in train.py and ofc run.c. did this in a sadly bit ugly, but fully backwards compatible way. basically when we use custom tokenizer we create a whole new directory structure for that | 2023-08-11 03:58:22 +00:00 |
| Milos Cubrilo | af3f3a7b31 | Speed up tinystories pretokenize command | 2023-07-29 03:08:33 +02:00 |
| Andrej Karpathy | 5b161abb9a | somewhere ~20 hours later | 2023-07-23 05:23:45 +00:00 |