llama2.c

Author	SHA1	Message	Date
Andrej	4a7a62bd21	Merge branch 'master' into feature/chat	2023-08-25 07:58:33 -07:00
Andrej Karpathy	fbe324fc5a	adjust things a bit	2023-08-25 14:54:05 +00:00
Andrej Karpathy	3d787b2463	ok getting closer, and manually verified correctness of the schema matching python. still some weirdness in the printing to chase down, and also have to tune the buffer lengths and make them sensible and such	2023-08-24 04:31:06 +00:00
Andrej Karpathy	40fb902cf0	fix chat format bug i think	2023-08-24 03:33:44 +00:00
Ali Nehzat	9bc72acab0	steps shouldn't exceed the model's seq_len either	2023-08-24 09:09:16 +10:00
Andrej Karpathy	c5e0e7fce4	attempt at chat function, but it was 8AM and I didn't have coffee yet. Seems to work but it's probably subtly broken or too complex. version 1 only, lots of hard-coded non-sensical buffer sizes. Have to go to work now	2023-08-23 16:27:48 +00:00
Andrej Karpathy	7ac65cb2c2	make decode safer and fix issue with skipping bad byte tokens	2023-08-23 01:08:31 +00:00
Andrej Karpathy	d1eb18b8ec	add BOS and EOS function to the Tokenizer as we start to converge closer to the Llama 2 code from Meta, and as we're about to add the Chat capability	2023-08-23 00:08:22 +00:00
Andrej Karpathy	d26a499207	absorb our rng state into the Sampler. I feel that this is correct because it makes our use of entropy very explicit and localized, and the sampler is now well-contained without any global state. Code is increasingly more beautiful.	2023-08-22 03:22:56 +00:00
Andrej Karpathy	ad7a1ef525	clean up swiglu a little bit	2023-08-22 02:32:21 +00:00
Andrej Karpathy	0e362f735f	and finallygit add run.c split off the generate function. alongside it will come a chat function. we are close	2023-08-22 02:22:36 +00:00
Andrej Karpathy	d73b917d3b	hide temperature and topp into the sampler, it's a little bit less flexible but a little bit more cleaner	2023-08-22 02:17:51 +00:00
Andrej Karpathy	379f083b85	make sorted vocab a buffer of Tokenizer	2023-08-22 01:56:51 +00:00
Andrej	5eaca535cd	Merge pull request #335 from ozabluda/ozabluda-patch-5 Remove unneeded check of free(NULL)	2023-08-21 18:16:07 -07:00
Andrej Karpathy	83287ff254	fix steps=0 is max context	2023-08-22 01:15:00 +00:00
Oleg Zabluda	c2834c8a1f	Remove unneeded check of free(NULL) Passing NULL to free() is totally allowed	2023-08-21 10:54:53 -07:00
Andrej Karpathy	33d94f60a5	parameter validation cleanup	2023-08-21 15:17:14 +00:00
rdentato	4444575c4e	Added check of generation parameters.	2023-08-21 06:43:39 +00:00
Andrej Karpathy	288b3cec09	remove dagger in the eyeball	2023-08-21 04:47:49 +00:00
Andrej Karpathy	14275bd623	minor clean. i think a lot of chaos has been reduced for today. we shall now rest.	2023-08-21 04:43:24 +00:00
Andrej Karpathy	3868f732a4	and finally refactor the Sampler. things are starting to look a lot cleaner I think	2023-08-21 04:23:02 +00:00
Andrej Karpathy	8a377a1d31	refactor the Transformer (Config, Weights, RunState) into a single object, with build and free too	2023-08-21 03:55:12 +00:00
Andrej Karpathy	ae2e4f8d88	name the tokenizer methods cleaner: encode and decode	2023-08-21 03:11:54 +00:00
Andrej Karpathy	c74456f3f0	refactor step 1. the tokenizer, and all the other abstractions, are a total mess, refactoring things a bit	2023-08-20 18:18:23 +00:00
Andrej Karpathy	1e335a41cf	remove freq_cis fields as they are not used anymore	2023-08-20 17:26:43 +00:00
Andrej Karpathy	c0511de617	probindex should never have been part of RunState. i apologize for this failure of abstraction	2023-08-20 17:18:06 +00:00
Andrej Karpathy	fa8dfd854e	isolate read_checkpoint, because i'd like to now make it support both version 0 and version 1	2023-08-19 19:21:12 +00:00
Andrej Karpathy	bd182289c5	calculate the freq_cis online, no need to write/read them to/from checkpoints	2023-08-17 04:13:13 +00:00
rdentato	55e60740f5	Added space to str_buffer in case max_token_length is 1.	2023-08-16 07:58:07 +00:00
rdentato	befe4867b3	minimal protection against invalid UTF8 encoding.	2023-08-16 07:42:53 +00:00
Andrej Karpathy	ca67253f28	smallfix: not sure what the point of this indirection was	2023-08-15 16:09:33 +00:00
Andrej Karpathy	4c63c5608d	shorten top comment on run.c file	2023-08-15 16:07:48 +00:00
Andrej Karpathy	a47f9b3969	collapsing copy paste code because it's driving my ocd crazy	2023-08-15 16:03:11 +00:00
Andrej Karpathy	a9a0628c92	thoroughly commented the UTF-8 byte reading code	2023-08-15 02:18:49 +00:00
Andrej Karpathy	d459fd4243	add back careful processing of the byte tokens	2023-08-15 01:42:33 +00:00
Andrej Karpathy	4bf36ecc17	get rid of the special byte decoding logic	2023-08-15 01:04:10 +00:00
Andrej Karpathy	8417cb438d	Merge branch 'utf8' of https://github.com/atamurad/llama2.c into feature/utf8	2023-08-15 00:18:53 +00:00
Andrej Karpathy	32c1ff97fb	missed p->dim to kv_dim for k,v vectors. we're not doing anything wrong we're just being wasteful with memory. thanks @xefoci7612 for pointing out	2023-08-14 14:52:07 +00:00
Andrej Karpathy	45afa91dca	the accum function has been bothering me, there is no real need to add a function here, it does something trivial and is only used twice, scrap	2023-08-14 02:54:27 +00:00
Andrej Karpathy	854c97b660	turn topp 0.9 back on by default thanks to recent PR contributions truncating before quicksort	2023-08-14 00:12:45 +00:00
Andrej	4a2c375df9	Merge pull request #276 from jrudolph/improve-top-p optimize sample_topp by filtering out small value elements up front	2023-08-13 17:05:38 -07:00
atamyrat	36b54321e5	bugfix: allocate +1 in tokens buffer for dummy whitespace	2023-08-13 23:23:32 +03:00
Andrej Karpathy	38bfac90a8	bigchange: add multiquery support in run.c. we can now train and inference multiquery models (where n_kv_heads < n_heads). this also means that we, in principle, support Llama 2 34B and 70B models, which are multiquery	2023-08-13 19:34:05 +00:00
atamyrat	daa9fd9b8a	sort vocabulary for faster lookup with bsearch()	2023-08-13 15:02:11 +03:00
Andrej Karpathy	f5fc0c245f	final piece: run.c support for new tokenizer, super ez	2023-08-13 02:12:13 +00:00
Johannes Rudolph	d421a95b2b	optimize sample_topp by filtering out small value elements up front This works because we know that in worst case only 1 element will be selected and therefore the remaining (n-1) elements have to split the remaining (1-topp) probability. Probabilities smaller than that cannot be selected and can be filtered out up front.	2023-08-12 20:31:19 +02:00
Andrej Karpathy	c42641205f	turn off topp sampling by default because it is a bit too slow to be the default. it is likely that turning it on, e.g. -p 0.9 is midlly higher quality and safer samples, but this comes at a cost of too much performance in double digit percent sometimes, for it to be on by default i think...	2023-08-10 15:23:05 +00:00
atamyrat	c02865df30	prompt tokenizer improvements: utf8 support, add_dummy_prefix and byte_fallback options to match sentencepiece	2023-08-07 13:12:44 +03:00
rdentato	ff6a2f0a7a	Reset the #include <omp.h>	2023-08-07 07:28:03 +00:00
rdentato	e49c16caa5	Changed how rng_seed is handled. Now 0 is treated as time(NULL).	2023-08-07 06:51:57 +00:00

108 Commits