Andrej | 7325bab657 | Merge pull request #365 from atamurad/patch-1: Update README.md - unclosed code block quotes | 2023-08-26 20:11:04 -07:00 |
| Atamurad Hezretkuliyev | 37157bc0a3 | Update README.md: Fixed unclosed code block quotes | 2023-08-27 02:27:47 +03:00 |
| Andrej Karpathy | f4b8a81742 | Merge branch 'master' of github.com:karpathy/llama2.c | 2023-08-26 21:22:28 +00:00 |
| Andrej Karpathy | 91d57db925 | add note on code llama being a bit wrong | 2023-08-26 21:22:19 +00:00 |
| Andrej | f856539f41 | Merge pull request #363 from byte-6174/patch-1: fix tinyllamas url | 2023-08-26 14:13:20 -07:00 |
| byte-6174 | b5a0b65dbf | fix tinyllamas url | 2023-08-26 17:05:21 -04:00 |
| Andrej | 7b0017c6cd | Merge pull request #362 from byte-6174/upmaster: freeing tokenizer in test.c | 2023-08-26 14:03:31 -07:00 |
| Andrej Karpathy | 50832e3dff | move script into the new docs folder | 2023-08-26 21:02:23 +00:00 |
| Andrej Karpathy | 1386edfd90 | add docs on stories260K | 2023-08-26 20:52:49 +00:00 |
| Aniket | 32cecbfe4a | freeing tokenizer in test.c | 2023-08-26 16:35:50 -04:00 |
| Andrej | e47bacdc62 | Merge pull request #355 from janimo/export-vocab-size: Export vocab size and Code Llama usage docs | 2023-08-26 13:24:55 -07:00 |
| Jani Monoses | 604d3c59c0 | Add Code Llama info | 2023-08-26 22:36:09 +03:00 |
| Jani Monoses | 2c2b284988 | Get vocab_size from token embeddings size | 2023-08-26 22:35:55 +03:00 |
| Andrej | 49daf18f2f | Merge pull request #343 from karpathy/feature/chat: Add interactive loop to enable nice chat with a Llama 2 Chat model | 2023-08-25 08:00:11 -07:00 |
| Andrej | 4a7a62bd21 | Merge branch 'master' into feature/chat | 2023-08-25 07:58:33 -07:00 |
| Andrej | 5c6427e4d7 | Merge pull request #352 from dmarcos/readmeTypo: Fix typo in README.md | 2023-08-25 07:55:54 -07:00 |
| Andrej | cbc2488b82 | Merge pull request #353 from photomz/master: Clearer WandB log step | 2023-08-25 07:55:26 -07:00 |
| Andrej Karpathy | fbe324fc5a | adjust things a bit | 2023-08-25 14:54:05 +00:00 |
| Markus Zhang | 6def77d4ba | Correct WandB log step | 2023-08-25 17:12:29 +08:00 |
| Diego Marcos Segura | 19cfbeca71 | Fix typo in README.md | 2023-08-24 19:46:43 -07:00 |
| Andrej | d7cd98633d | add todo item to add a PyTorch Engine | 2023-08-24 09:04:52 -07:00 |
| Andrej Karpathy | 3d787b2463 | ok getting closer, and manually verified correctness of the schema matching python. still some weirdness in the printing to chase down, and also have to tune the buffer lengths and make them sensible and such | 2023-08-24 04:31:06 +00:00 |
| Andrej Karpathy | 40fb902cf0 | fix chat format bug i think | 2023-08-24 03:33:44 +00:00 |
| Andrej Karpathy | c7a26264a2 | Merge branch 'master' of github.com:karpathy/llama2.c | 2023-08-24 03:10:18 +00:00 |
| Andrej Karpathy | 446c1c0df3 | Merge branch 'janimo-train-vocab-python' | 2023-08-24 03:10:07 +00:00 |
| Andrej Karpathy | 096325b66c | bring back num_threads | 2023-08-24 03:09:55 +00:00 |
| Andrej | 90104db721 | Merge pull request #348 from nehzata/clip_steps: Clip steps maximum value | 2023-08-23 19:57:01 -07:00 |
| Ali Nehzat | 9bc72acab0 | steps shouldn't exceed the model's seq_len either | 2023-08-24 09:09:16 +10:00 |
| Andrej Karpathy | c5e0e7fce4 | attempt at chat function, but it was 8AM and I didn't have coffee yet. Seems to work but it's probably subtly broken or too complex. version 1 only, lots of hard-coded non-sensical buffer sizes. Have to go to work now | 2023-08-23 16:27:48 +00:00 |
| Jani Monoses | fe9b9f2f15 | Train vocab in Python | 2023-08-23 19:10:28 +03:00 |
| Andrej Karpathy | 7ac65cb2c2 | make decode safer and fix issue with skipping bad byte tokens | 2023-08-23 01:08:31 +00:00 |
| Andrej Karpathy | 4b3e66021a | lol text | 2023-08-23 00:26:47 +00:00 |
| Andrej Karpathy | d1eb18b8ec | add BOS and EOS function to the Tokenizer as we start to converge closer to the Llama 2 code from Meta, and as we're about to add the Chat capability | 2023-08-23 00:08:22 +00:00 |
| Andrej Karpathy | d26a499207 | absorb our rng state into the Sampler. I feel that this is correct because it makes our use of entropy very explicit and localized, and the sampler is now well-contained without any global state. Code is increasingly more beautiful. | 2023-08-22 03:22:56 +00:00 |
| Andrej Karpathy | ac6cf8d6e8 | tweak todo list | 2023-08-22 02:48:51 +00:00 |
| Andrej Karpathy | ad7a1ef525 | clean up swiglu a little bit | 2023-08-22 02:32:21 +00:00 |
| Andrej Karpathy | 0e362f735f | and finallygit add run.c split off the generate function. alongside it will come a chat function. we are close | 2023-08-22 02:22:36 +00:00 |
| Andrej Karpathy | d73b917d3b | hide temperature and topp into the sampler, it's a little bit less flexible but a little bit more cleaner | 2023-08-22 02:17:51 +00:00 |
| Andrej Karpathy | 379f083b85 | make sorted vocab a buffer of Tokenizer | 2023-08-22 01:56:51 +00:00 |
| Andrej | 5eaca535cd | Merge pull request #335 from ozabluda/ozabluda-patch-5: Remove unneeded check of free(NULL) | 2023-08-21 18:16:07 -07:00 |
| Andrej Karpathy | 83287ff254 | fix steps=0 is max context | 2023-08-22 01:15:00 +00:00 |
| Oleg Zabluda | c2834c8a1f | Remove unneeded check of free(NULL): Passing NULL to free() is totally allowed | 2023-08-21 10:54:53 -07:00 |
| Andrej | ee95b1bf29 | Merge pull request #315 from davidar/vocab_source: Fix vocab_source in sample.py | 2023-08-21 08:26:28 -07:00 |
| Andrej Karpathy | d02e0c90d8 | Merge branch 'rdentato-patch-check-params' | 2023-08-21 15:17:37 +00:00 |
| Andrej Karpathy | 33d94f60a5 | parameter validation cleanup | 2023-08-21 15:17:14 +00:00 |
| Remo Dentato | 2d972f1763 | Merge branch 'karpathy:master' into patch-check-params | 2023-08-21 17:02:42 +02:00 |
| Andrej | 8a3ea7b433 | Merge pull request #329 from atamurad/import_meta: Moved export_meta_llama_bin.py to new export.py | 2023-08-21 07:34:32 -07:00 |
| atamyrat | 61c26d5392 | Updated README to replace export_meta_llama_bin.py script with export.py | 2023-08-21 14:24:01 +03:00 |
| atamyrat | 36a78af5e1 | tested load_meta_model() in export.py, deleting old export_meta_llama_bin.py file | 2023-08-21 14:19:56 +03:00 |
| atamyrat | de005474d3 | Added load_meta_model() to export.py | 2023-08-21 14:13:47 +03:00 |
|