443 Commits

Author SHA1 Message Date
Andrej 7325bab657 Merge pull request #365 from atamurad/patch-1
Update README.md - unclosed code block quotes
2023-08-26 20:11:04 -07:00
Atamurad Hezretkuliyev 37157bc0a3 Update README.md
Fixed unclosed code block quotes
2023-08-27 02:27:47 +03:00
Andrej Karpathy f4b8a81742 Merge branch 'master' of github.com:karpathy/llama2.c 2023-08-26 21:22:28 +00:00
Andrej Karpathy 91d57db925 add note on code llama being a bit wrong 2023-08-26 21:22:19 +00:00
Andrej f856539f41 Merge pull request #363 from byte-6174/patch-1
fix tinyllamas url
2023-08-26 14:13:20 -07:00
byte-6174 b5a0b65dbf fix tinyllamas url 2023-08-26 17:05:21 -04:00
Andrej 7b0017c6cd Merge pull request #362 from byte-6174/upmaster
freeing tokenizer in test.c
2023-08-26 14:03:31 -07:00
Andrej Karpathy 50832e3dff move script into the new docs folder 2023-08-26 21:02:23 +00:00
Andrej Karpathy 1386edfd90 add docs on stories260K 2023-08-26 20:52:49 +00:00
Aniket 32cecbfe4a freeing tokenizer in test.c 2023-08-26 16:35:50 -04:00
Andrej e47bacdc62 Merge pull request #355 from janimo/export-vocab-size
Export vocab size and Code Llama usage docs
2023-08-26 13:24:55 -07:00
Jani Monoses 604d3c59c0 Add Code Llama info 2023-08-26 22:36:09 +03:00
Jani Monoses 2c2b284988 Get vocab_size from token embeddings size 2023-08-26 22:35:55 +03:00
Andrej 49daf18f2f Merge pull request #343 from karpathy/feature/chat
Add interactive loop to enable nice chat with a Llama 2 Chat model
2023-08-25 08:00:11 -07:00
Andrej 4a7a62bd21 Merge branch 'master' into feature/chat 2023-08-25 07:58:33 -07:00
Andrej 5c6427e4d7 Merge pull request #352 from dmarcos/readmeTypo
Fix typo in README.md
2023-08-25 07:55:54 -07:00
Andrej cbc2488b82 Merge pull request #353 from photomz/master
Clearer WandB log step
2023-08-25 07:55:26 -07:00
Andrej Karpathy fbe324fc5a adjust things a bit 2023-08-25 14:54:05 +00:00
Markus Zhang 6def77d4ba Correct WandB log step 2023-08-25 17:12:29 +08:00
Diego Marcos Segura 19cfbeca71 Fix typo in README.md 2023-08-24 19:46:43 -07:00
Andrej d7cd98633d add todo item to add a PyTorch Engine 2023-08-24 09:04:52 -07:00
Andrej Karpathy 3d787b2463 ok getting closer, and manually verified correctness of the schema matching python. still some weirdness in the printing to chase down, and also have to tune the buffer lengths and make them sensible and such 2023-08-24 04:31:06 +00:00
Andrej Karpathy 40fb902cf0 fix chat format bug i think 2023-08-24 03:33:44 +00:00
Andrej Karpathy c7a26264a2 Merge branch 'master' of github.com:karpathy/llama2.c 2023-08-24 03:10:18 +00:00
Andrej Karpathy 446c1c0df3 Merge branch 'janimo-train-vocab-python' 2023-08-24 03:10:07 +00:00
Andrej Karpathy 096325b66c bring back num_threads 2023-08-24 03:09:55 +00:00
Andrej 90104db721 Merge pull request #348 from nehzata/clip_steps
Clip steps maximum value
2023-08-23 19:57:01 -07:00
Ali Nehzat 9bc72acab0 steps shouldn't exceed the model's seq_len either 2023-08-24 09:09:16 +10:00
Andrej Karpathy c5e0e7fce4 attempt at chat function, but it was 8AM and I didn't have coffee yet. Seems to work but it's probably subtly broken or too complex. version 1 only, lots of hard-coded non-sensical buffer sizes. Have to go to work now 2023-08-23 16:27:48 +00:00
Jani Monoses fe9b9f2f15 Train vocab in Python 2023-08-23 19:10:28 +03:00
Andrej Karpathy 7ac65cb2c2 make decode safer and fix issue with skipping bad byte tokens 2023-08-23 01:08:31 +00:00
Andrej Karpathy 4b3e66021a lol text 2023-08-23 00:26:47 +00:00
Andrej Karpathy d1eb18b8ec add BOS and EOS function to the Tokenizer as we start to converge closer to the Llama 2 code from Meta, and as we're about to add the Chat capability 2023-08-23 00:08:22 +00:00
Andrej Karpathy d26a499207 absorb our rng state into the Sampler. I feel that this is correct because it makes our use of entropy very explicit and localized, and the sampler is now well-contained without any global state. Code is increasingly more beautiful. 2023-08-22 03:22:56 +00:00
Andrej Karpathy ac6cf8d6e8 tweak todo list 2023-08-22 02:48:51 +00:00
Andrej Karpathy ad7a1ef525 clean up swiglu a little bit 2023-08-22 02:32:21 +00:00
Andrej Karpathy 0e362f735f and finally split off the generate function in run.c. alongside it will come a chat function. we are close 2023-08-22 02:22:36 +00:00
Andrej Karpathy d73b917d3b hide temperature and topp inside the sampler; it's a little bit less flexible but a little bit cleaner 2023-08-22 02:17:51 +00:00
Andrej Karpathy 379f083b85 make sorted vocab a buffer of Tokenizer 2023-08-22 01:56:51 +00:00
Andrej 5eaca535cd Merge pull request #335 from ozabluda/ozabluda-patch-5
Remove unneeded check of free(NULL)
2023-08-21 18:16:07 -07:00
Andrej Karpathy 83287ff254 fix steps=0 is max context 2023-08-22 01:15:00 +00:00
Oleg Zabluda c2834c8a1f Remove unneeded check of free(NULL)
Passing NULL to free() is totally allowed
2023-08-21 10:54:53 -07:00
Andrej ee95b1bf29 Merge pull request #315 from davidar/vocab_source
Fix vocab_source in sample.py
2023-08-21 08:26:28 -07:00
Andrej Karpathy d02e0c90d8 Merge branch 'rdentato-patch-check-params' 2023-08-21 15:17:37 +00:00
Andrej Karpathy 33d94f60a5 parameter validation cleanup 2023-08-21 15:17:14 +00:00
Remo Dentato 2d972f1763 Merge branch 'karpathy:master' into patch-check-params 2023-08-21 17:02:42 +02:00
Andrej 8a3ea7b433 Merge pull request #329 from atamurad/import_meta
Moved export_meta_llama_bin.py to new export.py
2023-08-21 07:34:32 -07:00
atamyrat 61c26d5392 Updated README to replace export_meta_llama_bin.py script with export.py 2023-08-21 14:24:01 +03:00
atamyrat 36a78af5e1 tested load_meta_model() in export.py, deleting old export_meta_llama_bin.py file 2023-08-21 14:19:56 +03:00
atamyrat de005474d3 Added load_meta_model() to export.py 2023-08-21 14:13:47 +03:00