llama2.c

Author	SHA1	Message	Date
Andrej	7325bab657	Merge pull request #365 from atamurad/patch-1 Update README.md - unclosed code block quotes	2023-08-26 20:11:04 -07:00
Atamurad Hezretkuliyev	37157bc0a3	Update README.md Fixed unclosed code block quotes	2023-08-27 02:27:47 +03:00
Andrej Karpathy	f4b8a81742	Merge branch 'master' of github.com:karpathy/llama2.c	2023-08-26 21:22:28 +00:00
Andrej Karpathy	91d57db925	add note on code llama being a bit wrong	2023-08-26 21:22:19 +00:00
Andrej	f856539f41	Merge pull request #363 from byte-6174/patch-1 fix tinyllamas url	2023-08-26 14:13:20 -07:00
byte-6174	b5a0b65dbf	fix tinyllamas url	2023-08-26 17:05:21 -04:00
Andrej	7b0017c6cd	Merge pull request #362 from byte-6174/upmaster freeing tokenizer in test.c	2023-08-26 14:03:31 -07:00
Andrej Karpathy	50832e3dff	move script into the new docs folder	2023-08-26 21:02:23 +00:00
Andrej Karpathy	1386edfd90	add docs on stories260K	2023-08-26 20:52:49 +00:00
Aniket	32cecbfe4a	freeing tokenizer in test.c	2023-08-26 16:35:50 -04:00
Andrej	e47bacdc62	Merge pull request #355 from janimo/export-vocab-size Export vocab size and Code Llama usage docs	2023-08-26 13:24:55 -07:00
Jani Monoses	604d3c59c0	Add Code Llama info	2023-08-26 22:36:09 +03:00
Jani Monoses	2c2b284988	Get vocab_size from token embeddings size	2023-08-26 22:35:55 +03:00
Andrej	49daf18f2f	Merge pull request #343 from karpathy/feature/chat Add interactive loop to enable nice chat with a Llama 2 Chat model	2023-08-25 08:00:11 -07:00
Andrej	4a7a62bd21	Merge branch 'master' into feature/chat	2023-08-25 07:58:33 -07:00
Andrej	5c6427e4d7	Merge pull request #352 from dmarcos/readmeTypo Fix typo in README.md	2023-08-25 07:55:54 -07:00
Andrej	cbc2488b82	Merge pull request #353 from photomz/master Clearer WandB log step	2023-08-25 07:55:26 -07:00
Andrej Karpathy	fbe324fc5a	adjust things a bit	2023-08-25 14:54:05 +00:00
Markus Zhang	6def77d4ba	Correct WandB log step	2023-08-25 17:12:29 +08:00
Diego Marcos Segura	19cfbeca71	Fix typo in README.md	2023-08-24 19:46:43 -07:00
Andrej	d7cd98633d	add todo item to add a PyTorch Engine	2023-08-24 09:04:52 -07:00
Andrej Karpathy	3d787b2463	ok getting closer, and manually verified correctness of the schema matching python. still some weirdness in the printing to chase down, and also have to tune the buffer lengths and make them sensible and such	2023-08-24 04:31:06 +00:00
Andrej Karpathy	40fb902cf0	fix chat format bug i think	2023-08-24 03:33:44 +00:00
Andrej Karpathy	c7a26264a2	Merge branch 'master' of github.com:karpathy/llama2.c	2023-08-24 03:10:18 +00:00
Andrej Karpathy	446c1c0df3	Merge branch 'janimo-train-vocab-python'	2023-08-24 03:10:07 +00:00
Andrej Karpathy	096325b66c	bring back num_threads	2023-08-24 03:09:55 +00:00
Andrej	90104db721	Merge pull request #348 from nehzata/clip_steps Clip steps maximum value	2023-08-23 19:57:01 -07:00
Ali Nehzat	9bc72acab0	steps shouldn't exceed the model's seq_len either	2023-08-24 09:09:16 +10:00
Andrej Karpathy	c5e0e7fce4	attempt at chat function, but it was 8AM and I didn't have coffee yet. Seems to work but it's probably subtly broken or too complex. version 1 only, lots of hard-coded non-sensical buffer sizes. Have to go to work now	2023-08-23 16:27:48 +00:00
Jani Monoses	fe9b9f2f15	Train vocab in Python	2023-08-23 19:10:28 +03:00
Andrej Karpathy	7ac65cb2c2	make decode safer and fix issue with skipping bad byte tokens	2023-08-23 01:08:31 +00:00
Andrej Karpathy	4b3e66021a	lol text	2023-08-23 00:26:47 +00:00
Andrej Karpathy	d1eb18b8ec	add BOS and EOS function to the Tokenizer as we start to converge closer to the Llama 2 code from Meta, and as we're about to add the Chat capability	2023-08-23 00:08:22 +00:00
Andrej Karpathy	d26a499207	absorb our rng state into the Sampler. I feel that this is correct because it makes our use of entropy very explicit and localized, and the sampler is now well-contained without any global state. Code is increasingly more beautiful.	2023-08-22 03:22:56 +00:00
Andrej Karpathy	ac6cf8d6e8	tweak todo list	2023-08-22 02:48:51 +00:00
Andrej Karpathy	ad7a1ef525	clean up swiglu a little bit	2023-08-22 02:32:21 +00:00
Andrej Karpathy	0e362f735f	and finallygit add run.c split off the generate function. alongside it will come a chat function. we are close	2023-08-22 02:22:36 +00:00
Andrej Karpathy	d73b917d3b	hide temperature and topp into the sampler, it's a little bit less flexible but a little bit more cleaner	2023-08-22 02:17:51 +00:00
Andrej Karpathy	379f083b85	make sorted vocab a buffer of Tokenizer	2023-08-22 01:56:51 +00:00
Andrej	5eaca535cd	Merge pull request #335 from ozabluda/ozabluda-patch-5 Remove unneeded check of free(NULL)	2023-08-21 18:16:07 -07:00
Andrej Karpathy	83287ff254	fix steps=0 is max context	2023-08-22 01:15:00 +00:00
Oleg Zabluda	c2834c8a1f	Remove unneeded check of free(NULL) Passing NULL to free() is totally allowed	2023-08-21 10:54:53 -07:00
Andrej	ee95b1bf29	Merge pull request #315 from davidar/vocab_source Fix vocab_source in sample.py	2023-08-21 08:26:28 -07:00
Andrej Karpathy	d02e0c90d8	Merge branch 'rdentato-patch-check-params'	2023-08-21 15:17:37 +00:00
Andrej Karpathy	33d94f60a5	parameter validation cleanup	2023-08-21 15:17:14 +00:00
Remo Dentato	2d972f1763	Merge branch 'karpathy:master' into patch-check-params	2023-08-21 17:02:42 +02:00
Andrej	8a3ea7b433	Merge pull request #329 from atamurad/import_meta Moved export_meta_llama_bin.py to new export.py	2023-08-21 07:34:32 -07:00
atamyrat	61c26d5392	Updated README to replace export_meta_llama_bin.py script with export.py	2023-08-21 14:24:01 +03:00
atamyrat	36a78af5e1	tested load_meta_model() in export.py, deleting old export_meta_llama_bin.py file	2023-08-21 14:19:56 +03:00
atamyrat	de005474d3	Added load_meta_model() to export.py	2023-08-21 14:13:47 +03:00

443 Commits