llama2.c

Author	SHA1	Message	Date
Andrej Karpathy	32c1ff97fb	missed p->dim to kv_dim for k,v vectors. we're not doing anything wrong we're just being wasteful with memory. thanks @xefoci7612 for pointing out	2023-08-14 14:52:07 +00:00
Andrej Karpathy	82ad2ba34e	remove tiktoken as dependency	2023-08-14 05:53:57 +00:00
Andrej	bae0bcf484	Small tweaks to Readme intro	2023-08-13 20:03:00 -07:00
Andrej Karpathy	45afa91dca	the accum function has been bothering me, there is no real need to add a function here, it does something trivial and is only used twice, scrap	2023-08-14 02:54:27 +00:00
Andrej Karpathy	854c97b660	turn topp 0.9 back on by default thanks to recent PR contributions truncating before quicksort	2023-08-14 00:12:45 +00:00
Andrej	4a2c375df9	Merge pull request #276 from jrudolph/improve-top-p optimize sample_topp by filtering out small value elements up front	2023-08-13 17:05:38 -07:00
Andrej	b3d6a9e6b5	Merge pull request #285 from karpathy/feature/civ2 Upgrading CI to run our new pytest	2023-08-13 16:55:01 -07:00
Andrej	091c799653	Merge branch 'master' into feature/civ2	2023-08-13 16:54:24 -07:00
Andrej Karpathy	c970f69334	oops i should probably call this function lol	2023-08-13 23:48:01 +00:00
Andrej Karpathy	223a67048a	add optional manual dispatch of actions	2023-08-13 23:39:37 +00:00
Andrej Karpathy	86325bf7e8	attempt to upgrade the CI to run our pytest	2023-08-13 23:35:29 +00:00
Andrej	b51c63b9f2	Merge pull request #283 from wizzard0/wizzard0-mention-1 Add TypeScript port	2023-08-13 14:36:10 -07:00
Andrej Karpathy	8506036185	remove 'revive tests' as a todo from the readme	2023-08-13 21:23:27 +00:00
Andrej Karpathy	f0024cfc88	revive tests. now that we have a tiny stories260K model this only requires a 2MB download. phew	2023-08-13 21:22:44 +00:00
Andrej	0805cb2c31	tiny whitespace fix to try to eliminate scrollbar	2023-08-13 13:40:09 -07:00
Andrej	b2cce341e0	oops typo fix in readme	2023-08-13 13:39:12 -07:00
Andrej Karpathy	3e989e21f2	link to stories260K model	2023-08-13 20:38:05 +00:00
Andrej Karpathy	58075b5ac5	update API of sample.py to be better, small changes here	2023-08-13 20:31:32 +00:00
Andrej	1bcb2d18d6	Merge pull request #284 from karpathy/feature/customtokenizer multiquery support add	2023-08-13 12:38:06 -07:00
Andrej Karpathy	38bfac90a8	bigchange: add multiquery support in run.c. we can now train and inference multiquery models (where n_kv_heads < n_heads). this also means that we, in principle, support Llama 2 34B and 70B models, which are multiquery	2023-08-13 19:34:05 +00:00
Andrej	b28c1e26c5	Merge pull request #275 from icppWorld/webassembly-internet-computer Notable fork section for WebAssembly	2023-08-13 10:14:39 -07:00
Andrej	5295cbb821	Merge pull request #281 from lintian06/original_llama2 Update README.md for a new rust port.	2023-08-13 10:14:00 -07:00
Andrej	12dec61fbf	Merge pull request #282 from mihainadas/master-1 Fixes https://github.com/karpathy/llama2.c/issues/280	2023-08-13 10:13:08 -07:00
Oleksandr Nikitin	0e6213c6e0	Mention I can run the full 7B model	2023-08-13 20:02:34 +03:00
Oleksandr Nikitin	1d68a36d14	Add TypeScript port I've never been so happy to have missed that the JS port already exists :D also it was nice to discover that the JS can reach 80% of the single-threaded C speed (10 tokens/s for TinyStories-110M)	2023-08-13 19:10:07 +03:00
Mihai Nadăș	570789aa04	Fixes https://github.com/karpathy/llama2.c/issues/280 There was a small bug in tinystories.py, described here: https://github.com/karpathy/llama2.c/issues/280 This commit simply passes vocab_size to get_tokenizer_model_path to avoid silent crash when processing shards (in process_shard)	2023-08-13 17:49:10 +03:00
Tian Lin	27adb082f1	Update README.md	2023-08-13 21:58:14 +08:00
Andrej	8b472ded1f	Merge pull request #272 from karpathy/feature/customtokenizer Big Change: Custom Tokenizer training: add the ability to train custom tokenizers instead of using the pretrained Llama 2 tokenizer. This is useful in custom, narrow-domain LLMs because smaller vocab sizes make much smaller, faster, and potentially more capable models. For example, in tinystories a vocab size 4096 custom tokenizer compresses the input text sequences about as well as the Llama 2 tokenizer with vocab size 32000. The result is also "safer" because a badly trained model can't accidentally e.g. output some random chinese character and rapidly go "off the rails" in subsequent tokens.	2023-08-12 20:31:21 -07:00
Andrej Karpathy	9ff459b925	todo changes	2023-08-13 03:24:31 +00:00
Andrej Karpathy	1d14cb8dd8	add note about 4096 vs 32000 token size on tinystories	2023-08-13 03:19:35 +00:00
Andrej Karpathy	fe49eb222c	readme for custom tokenizers	2023-08-13 03:16:18 +00:00
Andrej Karpathy	9c3cfb46a3	make default be the llama2 tokenizer	2023-08-13 03:08:07 +00:00
Andrej Karpathy	00a61dc7f9	remove the tinyshakespeare dataset until i can bring it back later in a nicer form, otherwise right now we just have a ton of copy paste code here	2023-08-13 02:18:30 +00:00
Andrej Karpathy	f5fc0c245f	final piece: run.c support for new tokenizer, super ez	2023-08-13 02:12:13 +00:00
Andrej Karpathy	ea4cedc588	add ability to export custom tokenizer to .bin format for run.c file	2023-08-13 02:00:19 +00:00
Johannes Rudolph	d421a95b2b	optimize sample_topp by filtering out small value elements up front This works because we know that in worst case only 1 element will be selected and therefore the remaining (n-1) elements have to split the remaining (1-topp) probability. Probabilities smaller than that cannot be selected and can be filtered out up front.	2023-08-12 20:31:19 +02:00
Andrej Karpathy	b0cfa2458d	ok i can train and sample a model with a custom tokenizer	2023-08-11 16:47:29 +00:00
icpp	f96c7afb2d	Notable fork section for WebAssembly Added my repo `icpp-lmm` for running it on the Internet Computer	2023-08-11 10:11:32 -04:00
Andrej Karpathy	4c6f0af9ff	add the ability to train a custom sentencepiece tokenizer with a given vocab_size, and pretok with it. some more changes still needed to merge this branch, in train.py and ofc run.c. did this in a sadly bit ugly, but fully backwards compatible way. basically when we use custom tokenizer we create a whole new directory structure for that	2023-08-11 03:58:22 +00:00
Andrej Karpathy	c42641205f	turn off topp sampling by default because it is a bit too slow to be the default. it is likely that turning it on, e.g. -p 0.9 is mildly higher quality and safer samples, but this comes at a cost of too much performance in double digit percent sometimes, for it to be on by default i think...	2023-08-10 15:23:05 +00:00
Andrej Karpathy	3f69c6cdc4	change the default to use runfast, which imo works just fine	2023-08-10 05:06:49 +00:00
Andrej	5f8068fd43	Merge pull request #260 from madroidmaq/master Add Jupyter notebook for easier feel the magic	2023-08-09 22:03:36 -07:00
Andrej	f60285ee78	Merge pull request #264 from trrahul/master Added C# port information in readme	2023-08-09 22:00:23 -07:00
Andrej	04121d1b85	Merge pull request #256 from rdentato/patch-rng-seed Patch rng seed	2023-08-09 21:56:07 -07:00
Rahul TR	256e7f885b	Added C# port information in readme	2023-08-09 17:59:47 +05:30
Andrej Karpathy	e36e3fb50d	Merge branch 'master' of github.com:karpathy/llama2.c	2023-08-09 02:08:37 +00:00
Andrej Karpathy	96873b0274	refine todos section make more concrete and sort	2023-08-09 02:08:33 +00:00
madroid	9713609023	Add Colab GUI: select model/temperature/prompt/etc	2023-08-08 20:29:53 +08:00
madroid	27c5fc76b1	Add Google Colab button	2023-08-08 01:50:19 +08:00
madroid	57ca3c0401	Add run.ipynb for easier feel the magic	2023-08-08 01:32:51 +08:00
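
Note on 38bfac90a8 and 32c1ff97fb: the multiquery commits describe models where n_kv_heads < n_heads, and the follow-up fix sizes the k,v vectors by kv_dim instead of dim. The sketch below shows one way that arithmetic can work. It is an illustration only: the AttnConfig struct and both helper functions are hypothetical names chosen here, and the sketch assumes n_heads is a multiple of n_kv_heads.

```c
#include <assert.h>

/* Hypothetical config struct; field names follow the terms used in the log. */
typedef struct {
    int dim;        /* model (embedding) dimension */
    int n_heads;    /* number of query heads */
    int n_kv_heads; /* number of key/value heads, at most n_heads */
} AttnConfig;

/* Width of one key or value vector when KV heads are shared.
   Assumption: n_heads is a multiple of n_kv_heads. */
int kv_dim(const AttnConfig *c) {
    assert(c->n_kv_heads > 0 && c->n_heads % c->n_kv_heads == 0);
    int head_size = c->dim / c->n_heads;
    return head_size * c->n_kv_heads;
}

/* Index of the KV head that query head h reads from:
   consecutive groups of query heads share one KV head. */
int kv_head_for(const AttnConfig *c, int h) {
    int kv_mul = c->n_heads / c->n_kv_heads; /* query heads per KV head */
    return h / kv_mul;
}
```

With illustrative values dim = 512, n_heads = 8, n_kv_heads = 2, kv_dim comes out as 128, so a k or v buffer sized by dim would be four times larger than needed. That matches the "wasteful with memory" remark in 32c1ff97fb.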

323 Commits
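
Note on d421a95b2b: the commit message explains that probabilities below (1-topp)/(n-1) cannot be selected by top-p sampling and can be dropped before sorting. A minimal sketch of that pre-filter follows. The function name topp_prefilter and its signature are hypothetical and not taken from the repository, and the sketch covers only the filtering step, not the sort or the sampling that come after it.

```c
#include <assert.h>

/* Hypothetical helper: writes the indices of all probabilities that can
   still matter for top-p sampling into `kept` (which must have room for
   n entries) and returns how many were kept. */
int topp_prefilter(const float *probs, int n, float topp, int *kept) {
    assert(n > 1 && topp > 0.0f && topp < 1.0f);
    /* Per the commit message: in the worst case a single element is
       selected, so the other n-1 elements split the remaining (1 - topp).
       Values below that share cannot be selected. */
    const float cutoff = (1.0f - topp) / (float)(n - 1);
    int n_kept = 0;
    for (int i = 0; i < n; i++) {
        if (probs[i] >= cutoff) {
            kept[n_kept++] = i; /* candidate survives, remember its index */
        }
    }
    return n_kept;
}
```

Only the surviving candidates then need to be sorted, which is presumably why 854c97b660 could turn topp 0.9 back on by default: with a large vocabulary most entries tend to fall below the cutoff, so the sort handles far fewer elements.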