llama2.c

Author	SHA1	Message	Date
Aniket	32cecbfe4a	freeing tokenizer in test.c	2023-08-26 16:35:50 -04:00
Andrej	e47bacdc62	Merge pull request #355 from janimo/export-vocab-size Export vocab size and Code Llama usage docs	2023-08-26 13:24:55 -07:00
Jani Monoses	604d3c59c0	Add Code Llama info	2023-08-26 22:36:09 +03:00
Jani Monoses	2c2b284988	Get vocab_size from token embeddings size	2023-08-26 22:35:55 +03:00
Andrej	49daf18f2f	Merge pull request #343 from karpathy/feature/chat Add interactive loop to enable nice chat with a Llama 2 Chat model	2023-08-25 08:00:11 -07:00
Andrej	4a7a62bd21	Merge branch 'master' into feature/chat	2023-08-25 07:58:33 -07:00
Andrej	5c6427e4d7	Merge pull request #352 from dmarcos/readmeTypo Fix typo in README.md	2023-08-25 07:55:54 -07:00
Andrej	cbc2488b82	Merge pull request #353 from photomz/master Clearer WandB log step	2023-08-25 07:55:26 -07:00
Andrej Karpathy	fbe324fc5a	adjust things a bit	2023-08-25 14:54:05 +00:00
Markus Zhang	6def77d4ba	Correct WandB log step	2023-08-25 17:12:29 +08:00
Diego Marcos Segura	19cfbeca71	Fix typo in README.md	2023-08-24 19:46:43 -07:00
Andrej	d7cd98633d	add todo item to add a PyTorch Engine	2023-08-24 09:04:52 -07:00
Andrej Karpathy	3d787b2463	ok getting closer, and manually verified correctness of the schema matching python. still some weirdness in the printing to chase down, and also have to tune the buffer lengths and make them sensible and such	2023-08-24 04:31:06 +00:00
Andrej Karpathy	40fb902cf0	fix chat format bug i think	2023-08-24 03:33:44 +00:00
Andrej Karpathy	c7a26264a2	Merge branch 'master' of github.com:karpathy/llama2.c	2023-08-24 03:10:18 +00:00
Andrej Karpathy	446c1c0df3	Merge branch 'janimo-train-vocab-python'	2023-08-24 03:10:07 +00:00
Andrej Karpathy	096325b66c	bring back num_threads	2023-08-24 03:09:55 +00:00
Andrej	90104db721	Merge pull request #348 from nehzata/clip_steps Clip steps maximum value	2023-08-23 19:57:01 -07:00
Ali Nehzat	9bc72acab0	steps shouldn't exceed the model's seq_len either	2023-08-24 09:09:16 +10:00
Andrej Karpathy	c5e0e7fce4	attempt at chat function, but it was 8AM and I didn't have coffee yet. Seems to work but it's probably subtly broken or too complex. version 1 only, lots of hard-coded non-sensical buffer sizes. Have to go to work now	2023-08-23 16:27:48 +00:00
Jani Monoses	fe9b9f2f15	Train vocab in Python	2023-08-23 19:10:28 +03:00
Andrej Karpathy	7ac65cb2c2	make decode safer and fix issue with skipping bad byte tokens	2023-08-23 01:08:31 +00:00
Andrej Karpathy	4b3e66021a	lol text	2023-08-23 00:26:47 +00:00
Andrej Karpathy	d1eb18b8ec	add BOS and EOS function to the Tokenizer as we start to converge closer to the Llama 2 code from Meta, and as we're about to add the Chat capability	2023-08-23 00:08:22 +00:00
Andrej Karpathy	d26a499207	absorb our rng state into the Sampler. I feel that this is correct because it makes our use of entropy very explicit and localized, and the sampler is now well-contained without any global state. Code is increasingly more beautiful.	2023-08-22 03:22:56 +00:00
Andrej Karpathy	ac6cf8d6e8	tweak todo list	2023-08-22 02:48:51 +00:00
Andrej Karpathy	ad7a1ef525	clean up swiglu a little bit	2023-08-22 02:32:21 +00:00
Andrej Karpathy	0e362f735f	and finallygit add run.c split off the generate function. alongside it will come a chat function. we are close	2023-08-22 02:22:36 +00:00
Andrej Karpathy	d73b917d3b	hide temperature and topp into the sampler, it's a little bit less flexible but a little bit more cleaner	2023-08-22 02:17:51 +00:00
Andrej Karpathy	379f083b85	make sorted vocab a buffer of Tokenizer	2023-08-22 01:56:51 +00:00
Andrej	5eaca535cd	Merge pull request #335 from ozabluda/ozabluda-patch-5 Remove unneeded check of free(NULL)	2023-08-21 18:16:07 -07:00
Andrej Karpathy	83287ff254	fix steps=0 is max context	2023-08-22 01:15:00 +00:00
Oleg Zabluda	c2834c8a1f	Remove unneeded check of free(NULL) Passing NULL to free() is totally allowed	2023-08-21 10:54:53 -07:00
Andrej	ee95b1bf29	Merge pull request #315 from davidar/vocab_source Fix vocab_source in sample.py	2023-08-21 08:26:28 -07:00
Andrej Karpathy	d02e0c90d8	Merge branch 'rdentato-patch-check-params'	2023-08-21 15:17:37 +00:00
Andrej Karpathy	33d94f60a5	parameter validation cleanup	2023-08-21 15:17:14 +00:00
Remo Dentato	2d972f1763	Merge branch 'karpathy:master' into patch-check-params	2023-08-21 17:02:42 +02:00
Andrej	8a3ea7b433	Merge pull request #329 from atamurad/import_meta Moved export_meta_llama_bin.py to new export.py	2023-08-21 07:34:32 -07:00
atamyrat	61c26d5392	Updated README to replace export_meta_llama_bin.py script with export.py	2023-08-21 14:24:01 +03:00
atamyrat	36a78af5e1	tested load_meta_model() in export.py, deleting old export_meta_llama_bin.py file	2023-08-21 14:19:56 +03:00
atamyrat	de005474d3	Added load_meta_model() to export.py	2023-08-21 14:13:47 +03:00
rdentato	4444575c4e	Added check of generation parameters.	2023-08-21 06:43:39 +00:00
Andrej Karpathy	dd61b13e57	delete the save_torchscript export file, but copy its content to the new export.py for the future maybe	2023-08-21 05:09:06 +00:00
Andrej Karpathy	ea44f53568	now that the export.py HF functionality is in master, we can delete this file, and update the readme	2023-08-21 04:58:19 +00:00
Andrej	801c68f5a1	Merge pull request #326 from atamurad/import_hf Added huggingface model loader/importer to export.py	2023-08-20 21:53:17 -07:00
Andrej	74a68eeb35	Merge pull request #325 from HarryGifford/users/hegi/update-readme-threading Update readme with suggestion on number of threads to use	2023-08-20 21:50:26 -07:00
Andrej Karpathy	288b3cec09	remove dagger in the eyeball	2023-08-21 04:47:49 +00:00
Andrej Karpathy	14275bd623	minor clean. i think a lot of chaos has been reduced for today. we shall now rest.	2023-08-21 04:43:24 +00:00
Andrej Karpathy	3868f732a4	and finally refactor the Sampler. things are starting to look a lot cleaner I think	2023-08-21 04:23:02 +00:00
Andrej Karpathy	8a377a1d31	refactor the Transformer (Config, Weights, RunState) into a single object, with build and free too	2023-08-21 03:55:12 +00:00

434 Commits