llama2.c

schihei/llama2.c

Commit Graph

feature/avx2

feature/chat

feature/int8

feature/int8_try2

master

7325bab657 Merge pull request #365 from atamurad/patch-1 (master) Andrej 2023-08-26 20:11:04 -07:00
37157bc0a3 Update README.md Atamurad Hezretkuliyev 2023-08-27 02:27:47 +03:00
df80471914 draft of int8 attempt number two (feature/int8_try2) Andrej Karpathy 2023-08-26 22:28:08 +00:00
f4b8a81742 Merge branch 'master' of github.com:karpathy/llama2.c Andrej Karpathy 2023-08-26 21:22:28 +00:00
91d57db925 add note on code llama being a bit wrong Andrej Karpathy 2023-08-26 21:22:19 +00:00
f856539f41 Merge pull request #363 from byte-6174/patch-1 Andrej 2023-08-26 14:13:20 -07:00
b5a0b65dbf fix tinyllamas url byte-6174 2023-08-26 17:05:21 -04:00
7b0017c6cd Merge pull request #362 from byte-6174/upmaster Andrej 2023-08-26 14:03:31 -07:00
50832e3dff move script into the new docs folder Andrej Karpathy 2023-08-26 21:02:23 +00:00
1386edfd90 add docs on stories260K Andrej Karpathy 2023-08-26 20:52:49 +00:00
32cecbfe4a freeing tokenizer in test.c Aniket 2023-08-26 16:35:50 -04:00
e47bacdc62 Merge pull request #355 from janimo/export-vocab-size Andrej 2023-08-26 13:24:55 -07:00
604d3c59c0 Add Code Llama info Jani Monoses 2023-08-26 22:36:09 +03:00
2c2b284988 Get vocab_size from token embeddings size Jani Monoses 2023-08-26 22:35:55 +03:00
49daf18f2f Merge pull request #343 from karpathy/feature/chat Andrej 2023-08-25 08:00:11 -07:00
4a7a62bd21 Merge branch 'master' into feature/chat (feature/chat) Andrej 2023-08-25 07:58:33 -07:00
5c6427e4d7 Merge pull request #352 from dmarcos/readmeTypo Andrej 2023-08-25 07:55:54 -07:00
cbc2488b82 Merge pull request #353 from photomz/master Andrej 2023-08-25 07:55:26 -07:00
fbe324fc5a adjust things a bit Andrej Karpathy 2023-08-25 14:54:05 +00:00
6def77d4ba Correct WandB log step Markus Zhang 2023-08-25 17:12:29 +08:00
19cfbeca71 Fix typo in README.md Diego Marcos Segura 2023-08-24 19:45:23 -07:00
d7cd98633d add todo item to add a PyTorch Engine Andrej 2023-08-24 09:04:52 -07:00
3d787b2463 ok getting closer, and manually verified correctness of the schema matching python. still some weirdness in the printing to chase down, and also have to tune the buffer lengths and make them sensible and such Andrej Karpathy 2023-08-24 04:31:06 +00:00
40fb902cf0 fix chat format bug i think Andrej Karpathy 2023-08-24 03:33:44 +00:00
c7a26264a2 Merge branch 'master' of github.com:karpathy/llama2.c Andrej Karpathy 2023-08-24 03:10:18 +00:00
446c1c0df3 Merge branch 'janimo-train-vocab-python' Andrej Karpathy 2023-08-24 03:10:07 +00:00
096325b66c bring back num_threads Andrej Karpathy 2023-08-24 03:09:55 +00:00
90104db721 Merge pull request #348 from nehzata/clip_steps Andrej 2023-08-23 19:57:01 -07:00
9bc72acab0 steps shouldn't exceed the model's seq_len either Ali Nehzat 2023-08-24 09:09:16 +10:00
c5e0e7fce4 attempt at chat function, but it was 8AM and I didn't have coffee yet. Seems to work but it's probably subtly broken or too complex. version 1 only, lots of hard-coded non-sensical buffer sizes. Have to go to work now Andrej Karpathy 2023-08-23 16:27:48 +00:00
fe9b9f2f15 Train vocab in Python Jani Monoses 2023-08-23 17:28:14 +03:00
7ac65cb2c2 make decode safer and fix issue with skipping bad byte tokens Andrej Karpathy 2023-08-23 01:08:31 +00:00
4b3e66021a lol text Andrej Karpathy 2023-08-23 00:26:47 +00:00
d1eb18b8ec add BOS and EOS function to the Tokenizer as we start to converge closer to the Llama 2 code from Meta, and as we're about to add the Chat capability Andrej Karpathy 2023-08-23 00:08:22 +00:00
d26a499207 absorb our rng state into the Sampler. I feel that this is correct because it makes our use of entropy very explicit and localized, and the sampler is now well-contained without any global state. Code is increasingly more beautiful. Andrej Karpathy 2023-08-22 03:22:56 +00:00
ac6cf8d6e8 tweak todo list Andrej Karpathy 2023-08-22 02:48:51 +00:00
ad7a1ef525 clean up swiglu a little bit Andrej Karpathy 2023-08-22 02:32:21 +00:00
0e362f735f and finallygit add run.c split off the generate function. alongside it will come a chat function. we are close Andrej Karpathy 2023-08-22 02:22:36 +00:00
d73b917d3b hide temperature and topp into the sampler, it's a little bit less flexible but a little bit more cleaner Andrej Karpathy 2023-08-22 02:17:51 +00:00
379f083b85 make sorted vocab a buffer of Tokenizer Andrej Karpathy 2023-08-22 01:56:51 +00:00
5eaca535cd Merge pull request #335 from ozabluda/ozabluda-patch-5 Andrej 2023-08-21 18:16:07 -07:00
83287ff254 fix steps=0 is max context Andrej Karpathy 2023-08-22 01:15:00 +00:00
c2834c8a1f Remove unneeded check of free(NULL) Oleg Zabluda 2023-08-21 10:54:53 -07:00
ee95b1bf29 Merge pull request #315 from davidar/vocab_source Andrej 2023-08-21 08:26:28 -07:00
d02e0c90d8 Merge branch 'rdentato-patch-check-params' Andrej Karpathy 2023-08-21 15:17:37 +00:00
33d94f60a5 parameter validation cleanup Andrej Karpathy 2023-08-21 15:17:14 +00:00
2d972f1763 Merge branch 'karpathy:master' into patch-check-params Remo Dentato 2023-08-21 17:02:42 +02:00
8a3ea7b433 Merge pull request #329 from atamurad/import_meta Andrej 2023-08-21 07:34:32 -07:00
61c26d5392 Updated README to replace export_meta_llama_bin.py script with export.py atamyrat 2023-08-21 14:24:01 +03:00
36a78af5e1 tested load_meta_model() in export.py, deleting old export_meta_llama_bin.py file atamyrat 2023-08-21 14:19:56 +03:00
de005474d3 Added load_meta_model() to export.py atamyrat 2023-08-21 14:13:47 +03:00
4444575c4e Added check of generation parameters. rdentato 2023-08-21 06:43:39 +00:00
dd61b13e57 delete the save_torchscript export file, but copy its content to the new export.py for the future maybe Andrej Karpathy 2023-08-21 05:09:06 +00:00
ea44f53568 now that the export.py HF functionality is in master, we can delete this file, and update the readme Andrej Karpathy 2023-08-21 04:58:19 +00:00
801c68f5a1 Merge pull request #326 from atamurad/import_hf Andrej 2023-08-20 21:53:17 -07:00
74a68eeb35 Merge pull request #325 from HarryGifford/users/hegi/update-readme-threading Andrej 2023-08-20 21:50:26 -07:00
288b3cec09 remove dagger in the eyeball Andrej Karpathy 2023-08-21 04:47:49 +00:00
14275bd623 minor clean. i think a lot of chaos has been reduced for today. we shall now rest. Andrej Karpathy 2023-08-21 04:43:24 +00:00
3868f732a4 and finally refactor the Sampler. things are starting to look a lot cleaner I think Andrej Karpathy 2023-08-21 04:23:02 +00:00
8a377a1d31 refactor the Transformer (Config, Weights, RunState) into a single object, with build and free too Andrej Karpathy 2023-08-21 03:55:12 +00:00
ae2e4f8d88 name the tokenizer methods cleaner: encode and decode Andrej Karpathy 2023-08-21 03:11:54 +00:00
0dd82158f6 removed transformers from requirements.txt, added error message atamyrat 2023-08-21 06:07:29 +03:00
155475a523 Fix WQ and WK permutation in huggingface models atamyrat 2023-08-21 05:16:11 +03:00
d7704bdeaa mark ModelArgs.hidden_dim as optional and calculate as previously if not provided atamyrat 2023-08-21 03:40:34 +03:00
09db52c69e Added huggingface model loader to export.py atamyrat 2023-08-21 02:53:50 +03:00
a72b3b0206 Update readme with suggestion on number of threads to use Harry Gifford 2023-08-20 15:01:33 -07:00
c74456f3f0 refactor step 1. the tokenizer, and all the other abstractions, are a total mess, refactoring things a bit Andrej Karpathy 2023-08-20 18:18:23 +00:00
1e335a41cf remove freq_cis fields as they are not used anymore Andrej Karpathy 2023-08-20 17:26:43 +00:00
c0511de617 probindex should never have been part of RunState. i apologize for this failure of abstraction Andrej Karpathy 2023-08-20 17:18:06 +00:00
8c93c7a30e Merge pull request #322 from karpathy/feature/export Andrej 2023-08-20 10:08:32 -07:00
13dcee493a todos update Andrej Karpathy 2023-08-20 17:02:22 +00:00
f3db92a2dc use out_file.tell() instead of nbytes += arithmetic Andrej Karpathy 2023-08-20 16:51:35 +00:00
fa8dfd854e isolate read_checkpoint, because i'd like to now make it support both version 0 and version 1 Andrej Karpathy 2023-08-19 19:21:12 +00:00
4df5e2e939 make version 1 be the legacy export but with new header. version 2 will be Q8_0 export Andrej Karpathy 2023-08-19 18:51:32 +00:00
4212bd6d43 oops fix double indent on quantize def Andrej Karpathy 2023-08-19 18:34:49 +00:00
7f551dbfd7 new model export: versions 0 (legacy) and 1 Andrej Karpathy 2023-08-19 18:25:20 +00:00
6c5d78fa41 Merge pull request #317 from yiminghan/yhan/old Andrej 2023-08-19 10:01:08 -07:00
db1a722816 Merge pull request #318 from rahoua/master Andrej 2023-08-19 10:00:56 -07:00
d2a546c577 Merge pull request #319 from RahulSChand/warning Andrej 2023-08-19 10:00:27 -07:00
fbefeec1b1 add assert message to give better warning rahulschand 2023-08-19 13:05:26 +05:30
978c311b30 Add pecca-rs to README.md rahoua 2023-08-18 14:58:21 -07:00
882e480bc0 update read me YiMing Han 2023-08-18 15:18:29 -04:00
d09ebbb32b Revert "working one" YiMing Han 2023-08-18 15:14:08 -04:00
bc7cb7d0e8 Revert "only dart" YiMing Han 2023-08-18 15:13:59 -04:00
01df3731d6 only dart YiMing Han 2023-08-18 15:09:24 -04:00
8607b11ea1 working one YiMing Han 2023-08-18 15:07:41 -04:00
039a9713c2 ok this first version works but i don't think is ready to merge, have to think on more (feature/int8) Andrej Karpathy 2023-08-18 15:44:02 +00:00
52fe3653e5 Fix vocab_source in sample.py David A Roberts 2023-08-18 18:40:25 +10:00
591f1353c7 ok this works but is super slow because we are doing all the work in fp32 still Andrej Karpathy 2023-08-18 03:40:18 +00:00
e9cbe3e84f small improvements to comments and warnings and increase header size during model export Andrej Karpathy 2023-08-17 14:32:22 +00:00
5e2e5b28f4 re-write the model export to do int8 quantization in groups, with group size fallback, and also change the header to be much better Andrej Karpathy 2023-08-17 05:56:20 +00:00
bd182289c5 calculate the freq_cis online, no need to write/read them to/from checkpoints Andrej Karpathy 2023-08-17 04:13:13 +00:00
b68a6d2ab5 Merge pull request #307 from madroidmaq/master Andrej 2023-08-16 20:09:32 -07:00
57bf0e9ee4 Merge pull request #306 from rdentato/patch-utf8-no-validation Andrej 2023-08-16 09:51:11 -07:00
9fbe96fc2e Jupter Notebook: Add run Meta's Llama 2 models madroid 2023-08-16 20:23:27 +08:00
55e60740f5 Added space to str_buffer in case max_token_length is 1. rdentato 2023-08-16 07:58:07 +00:00
befe4867b3 minimal protection against invalid UTF8 encoding. rdentato 2023-08-16 07:42:53 +00:00
df6557a10d Merge pull request #267 from krrishnarraj/master Andrej 2023-08-15 19:26:34 -07:00
65c899314c Merge branch 'Majdoddin-ci-tiny-model' Andrej Karpathy 2023-08-16 02:22:26 +00:00
62a6d69d86 style changes and remove spurious runc test call at the bottom Andrej Karpathy 2023-08-16 02:22:13 +00:00