Commit Graph

289 Commits

Author SHA1 Message Date
Andrej Karpathy 00a61dc7f9 remove the tinyshakespeare dataset until i can bring it back later in a nicer form, otherwise right now we just have a ton of copy paste code here 2023-08-13 02:18:30 +00:00
Andrej Karpathy f5fc0c245f final piece: run.c support for new tokenizer, super ez 2023-08-13 02:12:13 +00:00
Andrej Karpathy ea4cedc588 add ability to export custom tokenizer to .bin format for run.c file 2023-08-13 02:00:19 +00:00
Andrej Karpathy b0cfa2458d ok i can train and sample a model with a custom tokenizer 2023-08-11 16:47:29 +00:00
Andrej Karpathy 4c6f0af9ff add the ability to train a custom sentencepiece tokenizer with a given vocab_size, and pretok with it. some more changes still needed to merge this branch, in train.py and ofc run.c. did this in a sadly bit ugly, but fully backwards compatible way. basically when we use custom tokenizer we create a whole new directory structure for that 2023-08-11 03:58:22 +00:00
Andrej Karpathy c42641205f turn off topp sampling by default because it is a bit too slow to be the default. it is likely that turning it on, e.g. -p 0.9 is mildly higher quality and safer samples, but this comes at a cost of too much performance in double digit percent sometimes, for it to be on by default i think... 2023-08-10 15:23:05 +00:00
Andrej Karpathy 3f69c6cdc4 change the default to use runfast, which imo works just fine 2023-08-10 05:06:49 +00:00
Andrej 5f8068fd43 Merge pull request #260 from madroidmaq/master: Add Jupyter notebook for easier feel the magic 2023-08-09 22:03:36 -07:00
Andrej f60285ee78 Merge pull request #264 from trrahul/master: Added C# port information in readme 2023-08-09 22:00:23 -07:00
Andrej 04121d1b85 Merge pull request #256 from rdentato/patch-rng-seed: Patch rng seed 2023-08-09 21:56:07 -07:00
Rahul TR 256e7f885b Added C# port information in readme 2023-08-09 17:59:47 +05:30
Andrej Karpathy e36e3fb50d Merge branch 'master' of github.com:karpathy/llama2.c 2023-08-09 02:08:37 +00:00
Andrej Karpathy 96873b0274 refine todos section make more concrete and sort 2023-08-09 02:08:33 +00:00
madroid 9713609023 Add Colab GUI: select model/temperature/prompt/etc 2023-08-08 20:29:53 +08:00
madroid 27c5fc76b1 Add Google Colab button 2023-08-08 01:50:19 +08:00
madroid 57ca3c0401 Add run.ipynb for easier feel the magic 2023-08-08 01:32:51 +08:00
rdentato ff6a2f0a7a Reset the #include <omp.h> 2023-08-07 07:28:03 +00:00
rdentato e49c16caa5 Changed how rng_seed is handled. Now 0 is treated as time(NULL). 2023-08-07 06:51:57 +00:00
Remo Dentato 2e5fad83da Merge branch 'karpathy:master' into master 2023-08-07 07:57:42 +02:00
Andrej 3c3b19b14c Merge pull request #242 from tairov/llama2-py: Add a link to simple one file pure Python port 2023-08-06 19:51:30 -07:00
Andrej f4f4cae4cb Merge pull request #241 from danielgrittner/master: add a Rust port 2023-08-06 19:51:13 -07:00
Andrej 09de2cc4ca Merge pull request #250 from npinto/master-1: FIX: model.generate(); forward() only returns logits now. 2023-08-06 18:43:01 -07:00
Nicolas Pinto 98b515e44d FIX: model.generate(). This patch fixes a simple bug in `generate()` due to model's `forward()` only returning logits and not losses since `f2e34e6b0ac55accd6ba930a04c6f683f5158b29`. 2023-08-06 14:48:47 -07:00
rdentato 999b1bf776 Added conditional include of the OpenMP header. 2023-08-06 21:07:09 +00:00
Aydyn Tairov 2297d158e3 Fix link to a github profile 2023-08-06 21:47:05 +01:00
Daniel Grittner 512f039d5d Merge branch 'master' into master 2023-08-06 19:55:43 +02:00
Aydyn Tairov 6734eaeff5 Rebase changes to master 2023-08-06 18:47:05 +01:00
Aydyn Tairov 7178facb75 Rebase changes to master 2023-08-06 18:45:47 +01:00
Andrej Karpathy a7a3aa09b8 Merge branch 'master' of github.com:karpathy/llama2.c 2023-08-06 16:33:36 +00:00
Andrej Karpathy 79791f39b4 let's start respecting the BOS token. Don't print it explicitly, and terminate sequence if it appears. This makes sense especially after the recent addition of prompting. Also be careful with timings and making sure they come out right if we exit early in this data-dependent manner 2023-08-06 16:33:23 +00:00
Andrej Karpathy 4e8a3e8d5d fix style issue space with stderr printing 2023-08-06 15:51:58 +00:00
Andrej 7af81ded7e Merge pull request #244 from madroidmaq/master: Update README.md: format notable forks 2023-08-06 08:43:24 -07:00
Andrej a25958fd45 Merge pull request #245 from rdentato/patch-stderr: Errors and info on stderr 2023-08-06 08:42:09 -07:00
Madroid Ma 1f53735d12 Merge branch 'karpathy:master' into master 2023-08-06 18:18:36 +08:00
rdentato 9cfb7efb85 Changed all the printf() for error/info messages so that they print on stderr. 2023-08-06 09:53:02 +00:00
madroid baefaaaf76 Update README.md: add notable forks author's link 2023-08-06 17:42:31 +08:00
Daniel Grittner fcb4cdef8b add a Rust port 2023-08-06 10:44:48 +02:00
Andrej Karpathy 623894f5da fix bug, have to use raw_model not model to access the loss 2023-08-06 07:55:46 +00:00
Andrej Karpathy 65b0846637 error on seed=0 2023-08-06 07:31:21 +00:00
Andrej Karpathy 8931d5092e add nucleus sampling. it costs lines of code, but i think this is the default best way to sample, so it is important to have 2023-08-06 07:22:39 +00:00
madroid 8c1f1b280f Update README.md: format notable forks 2023-08-06 14:23:57 +08:00
Andrej Karpathy 49e3ff6d08 update makefile to use correct arg call after our argparse update 2023-08-05 23:11:11 +00:00
Andrej Karpathy a1037d79ee turned on trimTrailingWhitespace in my vscode sorry about that 2023-08-05 22:46:35 +00:00
Andrej a2962b9a0c Merge pull request #195 from clebert/prompt-tokens-size: Adjust `malloc` size for `prompt_tokens` 2023-08-05 15:29:33 -07:00
Andrej bdf3a6c22c Merge pull request #167 from mzcu/pretokenize-speedup: Speed up tinystories pretokenize command 2023-08-05 15:14:51 -07:00
Andrej Karpathy d1a59a9ca8 use EXIT_FAILURE instead of 1 2023-08-05 19:19:49 +00:00
Andrej f3e7710763 Merge pull request #215 from Majdoddin/windows: replaced __int64 with int64_t and DWORD with uint_32 2023-08-05 12:14:39 -07:00
Andrej Karpathy 0447b06e7c simplify rope 2023-08-05 18:41:31 +00:00
Andrej Karpathy 8719d5f5a8 Merge branch 'mpcusack-mpcusack/jitsave' 2023-08-05 18:13:07 +00:00
Andrej Karpathy e03d7ecf12 Merge branch 'mpcusack/jitsave' of https://github.com/mpcusack/llama2.c into mpcusack-mpcusack/jitsave 2023-08-05 18:11:21 +00:00