327 Commits

Author SHA1 Message Date
Andrej 013e012b87 Merge pull request #286 from Nick-infinity/master
[Feat]: Add support for meta llama hf model conversion
2023-08-14 07:46:39 -07:00
Andrej 50f970d170 Merge pull request #289 from chenyangMl/update_readme
Update readme to introduce llama2.c-zh
2023-08-14 07:41:13 -07:00
chenyang 2a9a4c4e14 update readme with a simple line to introduce llama2.c-zh 2023-08-14 15:12:30 +08:00
chenyang 79900ff68e update readme with a simple line to introduce llama2.c-zh 2023-08-14 15:00:33 +08:00
Andrej Karpathy 82ad2ba34e remove tiktoken as dependency 2023-08-14 05:53:57 +00:00
Nikhil Gupta c39f19f1a9 [Feat]: Add support for meta llama hf model conversion
Description:
Llama 2 hf models have weights stored with different names

Signed-off-by: Nikhil Gupta <nikhilg.me@gmail.com>
2023-08-14 10:18:51 +05:30
Andrej bae0bcf484 Small tweaks to Readme intro 2023-08-13 20:03:00 -07:00
Andrej Karpathy 45afa91dca the accum function has been bothering me, there is no real need to add a function here, it does something trivial and is only used twice, scrap 2023-08-14 02:54:27 +00:00
Andrej Karpathy 854c97b660 turn topp 0.9 back on by default thanks to recent PR contributions truncating before quicksort 2023-08-14 00:12:45 +00:00
Andrej 4a2c375df9 Merge pull request #276 from jrudolph/improve-top-p
optimize sample_topp by filtering out small value elements up front
2023-08-13 17:05:38 -07:00
Andrej b3d6a9e6b5 Merge pull request #285 from karpathy/feature/civ2
Upgrading CI to run our new pytest
2023-08-13 16:55:01 -07:00
Andrej 091c799653 Merge branch 'master' into feature/civ2 2023-08-13 16:54:24 -07:00
Andrej Karpathy c970f69334 oops i should probably call this function lol 2023-08-13 23:48:01 +00:00
Andrej Karpathy 223a67048a add optional manual dispatch of actions 2023-08-13 23:39:37 +00:00
Andrej Karpathy 86325bf7e8 attempt to upgrade the CI to run our pytest 2023-08-13 23:35:29 +00:00
Andrej b51c63b9f2 Merge pull request #283 from wizzard0/wizzard0-mention-1
Add TypeScript port
2023-08-13 14:36:10 -07:00
Andrej Karpathy 8506036185 remove 'revive tests' as a todo from the readme 2023-08-13 21:23:27 +00:00
Andrej Karpathy f0024cfc88 revive tests. now that we have a tiny stories260K model this only requires a 2MB download. phew 2023-08-13 21:22:44 +00:00
Andrej 0805cb2c31 tiny whitespace fix to try to eliminate scrollbar 2023-08-13 13:40:09 -07:00
Andrej b2cce341e0 oops typo fix in readme 2023-08-13 13:39:12 -07:00
Andrej Karpathy 3e989e21f2 link to stories260K model 2023-08-13 20:38:05 +00:00
Andrej Karpathy 58075b5ac5 update API of sample.py to be better, small changes here 2023-08-13 20:31:32 +00:00
Andrej 1bcb2d18d6 Merge pull request #284 from karpathy/feature/customtokenizer
multiquery support add
2023-08-13 12:38:06 -07:00
Andrej Karpathy 38bfac90a8 bigchange: add multiquery support in run.c. we can now train and inference multiquery models (where n_kv_heads < n_heads). this also means that we, in principle, support Llama 2 34B and 70B models, which are multiquery 2023-08-13 19:34:05 +00:00
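The condition n_kv_heads < n_heads in the commit above means several query heads share one key/value head. A minimal sketch of that mapping follows, assuming n_heads is an exact multiple of n_kv_heads and that consecutive query heads are grouped together. The function name kv_head_for_query_head is hypothetical and is not taken from run.c.

```python
def kv_head_for_query_head(h, n_heads, n_kv_heads):
    """Return the index of the key/value head that query head h reads from.

    Hypothetical helper for illustration only. Assumes n_heads is a
    multiple of n_kv_heads and that consecutive query heads share a
    key/value head.
    """
    if n_heads % n_kv_heads != 0:
        raise ValueError("n_heads must be a multiple of n_kv_heads")
    group_size = n_heads // n_kv_heads  # query heads per key/value head
    return h // group_size


# With 8 query heads and 2 key/value heads, heads 0-3 share KV head 0
# and heads 4-7 share KV head 1.
mapping = [kv_head_for_query_head(h, 8, 2) for h in range(8)]
```

When n_kv_heads equals n_heads the mapping is the identity, which is ordinary multi-head attention.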
Andrej b28c1e26c5 Merge pull request #275 from icppWorld/webassembly-internet-computer
Notable fork section for WebAssembly
2023-08-13 10:14:39 -07:00
Andrej 5295cbb821 Merge pull request #281 from lintian06/original_llama2
Update README.md for a new rust port.
2023-08-13 10:14:00 -07:00
Andrej 12dec61fbf Merge pull request #282 from mihainadas/master-1
Fixes https://github.com/karpathy/llama2.c/issues/280
2023-08-13 10:13:08 -07:00
Oleksandr Nikitin 0e6213c6e0 Mention I can run the full 7B model 2023-08-13 20:02:34 +03:00
Oleksandr Nikitin 1d68a36d14 Add TypeScript port
I've never been so happy to have missed that the JS port already exists :D also it was nice to discover that the JS can reach 80% of the single-threaded C speed (10 tokens/s for TinyStories-110M)
2023-08-13 19:10:07 +03:00
Mihai Nadăș 570789aa04 Fixes https://github.com/karpathy/llama2.c/issues/280
There was a small bug in tinystories.py, described here: https://github.com/karpathy/llama2.c/issues/280

This commit simply passes vocab_size to get_tokenizer_model_path to avoid silent crash when processing shards (in process_shard)
2023-08-13 17:49:10 +03:00
Tian Lin 27adb082f1 Update README.md 2023-08-13 21:58:14 +08:00
Andrej 8b472ded1f Merge pull request #272 from karpathy/feature/customtokenizer
Big Change: Custom Tokenizer training: add the ability to train custom tokenizers instead of using the pretrained Llama 2 tokenizer. This is useful in custom, narrow-domain LLMs because smaller vocab sizes make much smaller, faster, and potentially more capable models. For example, in tinystories a vocab size 4096 custom tokenizer compresses the input text sequences about as well as the Llama 2 tokenizer with vocab size 32000. The result is also "safer" because a badly trained model can't accidentally e.g. output some random Chinese character and rapidly go "off the rails" in subsequent tokens.
2023-08-12 20:31:21 -07:00
Andrej Karpathy 9ff459b925 todo changes 2023-08-13 03:24:31 +00:00
Andrej Karpathy 1d14cb8dd8 add note about 4096 vs 32000 token size on tinystories 2023-08-13 03:19:35 +00:00
Andrej Karpathy fe49eb222c readme for custom tokenizers 2023-08-13 03:16:18 +00:00
Andrej Karpathy 9c3cfb46a3 make default be the llama2 tokenizer 2023-08-13 03:08:07 +00:00
Andrej Karpathy 00a61dc7f9 remove the tinyshakespeare dataset until i can bring it back later in a nicer form, otherwise right now we just have a ton of copy paste code here 2023-08-13 02:18:30 +00:00
Andrej Karpathy f5fc0c245f final piece: run.c support for new tokenizer, super ez 2023-08-13 02:12:13 +00:00
Andrej Karpathy ea4cedc588 add ability to export custom tokenizer to .bin format for run.c file 2023-08-13 02:00:19 +00:00
Johannes Rudolph d421a95b2b optimize sample_topp by filtering out small value elements up front
This works because we know that in worst case only 1 element will be selected
and therefore the remaining (n-1) elements have to split the remaining (1-topp)
probability. Probabilities smaller than that cannot be selected and can
be filtered out up front.
2023-08-12 20:31:19 +02:00
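The filtering argument in the commit message above can be sketched in Python. This is an illustration under stated assumptions, not the code from run.c: the function name, the argument order, and the argmax fallback for an empty candidate list are all choices made here. The cutoff (1 - topp) / (n - 1) comes directly from the commit message.

```python
def sample_topp(probs, topp, coin):
    """Top-p (nucleus) sampling with an up-front cutoff filter.

    probs: probabilities summing to 1; topp: nucleus mass in (0, 1);
    coin: a random number in [0, 1). Illustrative sketch only.
    """
    n = len(probs)
    if n == 1:
        return 0
    # Per the commit message: values below this cutoff cannot be selected,
    # so drop them before the (expensive) sort.
    cutoff = (1.0 - topp) / (n - 1)
    candidates = [(p, i) for i, p in enumerate(probs) if p >= cutoff]
    if not candidates:
        # Assumption made here: fall back to the most likely token
        # if a very small topp filters everything out.
        return max(range(n), key=probs.__getitem__)
    candidates.sort(key=lambda t: t[0], reverse=True)

    # Find the shortest prefix whose cumulative probability exceeds topp.
    cumulative = 0.0
    last = len(candidates) - 1
    for k, (p, _) in enumerate(candidates):
        cumulative += p
        if cumulative > topp:
            last = k
            break

    # Sample from that prefix, rescaling the coin to its total mass.
    r = coin * cumulative
    cdf = 0.0
    for p, i in candidates[: last + 1]:
        cdf += p
        if r < cdf:
            return i
    return candidates[last][1]  # guard against rounding error
```

For example, with probs = [0.5, 0.3, 0.15, 0.04, 0.01] and topp = 0.9 the cutoff is 0.025, so the 0.01 entry is dropped before sorting, and only the first three tokens can ever be returned.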
Andrej Karpathy b0cfa2458d ok i can train and sample a model with a custom tokenizer 2023-08-11 16:47:29 +00:00
icpp f96c7afb2d Notable fork section for WebAssembly
Added my repo `icpp-lmm` for running it on the Internet Computer
2023-08-11 10:11:32 -04:00
Andrej Karpathy 4c6f0af9ff add the ability to train a custom sentencepiece tokenizer with a given vocab_size, and pretok with it. some more changes still needed to merge this branch, in train.py and ofc run.c. did this in a sadly bit ugly, but fully backwards compatible way. basically when we use custom tokenizer we create a whole new directory structure for that 2023-08-11 03:58:22 +00:00
Andrej Karpathy c42641205f turn off topp sampling by default because it is a bit too slow to be the default. it is likely that turning it on, e.g. -p 0.9 is mildly higher quality and safer samples, but this comes at a cost of too much performance in double digit percent sometimes, for it to be on by default i think... 2023-08-10 15:23:05 +00:00
Andrej Karpathy 3f69c6cdc4 change the default to use runfast, which imo works just fine 2023-08-10 05:06:49 +00:00
Andrej 5f8068fd43 Merge pull request #260 from madroidmaq/master
Add Jupyter notebook for easier feel the magic
2023-08-09 22:03:36 -07:00
Andrej f60285ee78 Merge pull request #264 from trrahul/master
Added C# port information in readme
2023-08-09 22:00:23 -07:00
Andrej 04121d1b85 Merge pull request #256 from rdentato/patch-rng-seed
Patch rng seed
2023-08-09 21:56:07 -07:00
Rahul TR 256e7f885b Added C# port information in readme 2023-08-09 17:59:47 +05:30
Andrej Karpathy e36e3fb50d Merge branch 'master' of github.com:karpathy/llama2.c 2023-08-09 02:08:37 +00:00