Commit Graph

78 Commits

Author SHA1 Message Date
Andrej Karpathy ca67253f28 smallfix: not sure what the point of this indirection was 2023-08-15 16:09:33 +00:00
Andrej Karpathy 4c63c5608d shorten top comment on run.c file 2023-08-15 16:07:48 +00:00
Andrej Karpathy a47f9b3969 collapsing copy paste code because it's driving my ocd crazy 2023-08-15 16:03:11 +00:00
Andrej Karpathy a9a0628c92 thoroughly commented the UTF-8 byte reading code 2023-08-15 02:18:49 +00:00
Andrej Karpathy d459fd4243 add back careful processing of the byte tokens 2023-08-15 01:42:33 +00:00
Andrej Karpathy 4bf36ecc17 get rid of the special byte decoding logic 2023-08-15 01:04:10 +00:00
Andrej Karpathy 8417cb438d Merge branch 'utf8' of https://github.com/atamurad/llama2.c into feature/utf8 2023-08-15 00:18:53 +00:00
Andrej Karpathy 32c1ff97fb missed p->dim to kv_dim for k,v vectors. we're not doing anything wrong we're just being wasteful with memory. thanks @xefoci7612 for pointing out 2023-08-14 14:52:07 +00:00
Andrej Karpathy 45afa91dca the accum function has been bothering me, there is no real need to add a function here, it does something trivial and is only used twice, scrap 2023-08-14 02:54:27 +00:00
Andrej Karpathy 854c97b660 turn topp 0.9 back on by default thanks to recent PR contributions truncating before quicksort 2023-08-14 00:12:45 +00:00
Andrej 4a2c375df9 Merge pull request #276 from jrudolph/improve-top-p
optimize sample_topp by filtering out small value elements up front
2023-08-13 17:05:38 -07:00
atamyrat 36b54321e5 bugfix: allocate +1 in tokens buffer for dummy whitespace 2023-08-13 23:23:32 +03:00
Andrej Karpathy 38bfac90a8 bigchange: add multiquery support in run.c. we can now train and inference multiquery models (where n_kv_heads < n_heads). this also means that we, in principle, support Llama 2 34B and 70B models, which are multiquery 2023-08-13 19:34:05 +00:00
atamyrat daa9fd9b8a sort vocabulary for faster lookup with bsearch() 2023-08-13 15:02:11 +03:00
Andrej Karpathy f5fc0c245f final piece: run.c support for new tokenizer, super ez 2023-08-13 02:12:13 +00:00
Johannes Rudolph d421a95b2b optimize sample_topp by filtering out small value elements up front
This works because we know that in worst case only 1 element will be selected
and therefore the remaining (n-1) elements have to split the remaining (1-topp)
probability. Probabilities smaller than that cannot be selected and can
be filtered out up front.
2023-08-12 20:31:19 +02:00
Andrej Karpathy c42641205f turn off topp sampling by default because it is a bit too slow to be the default. it is likely that turning it on, e.g. -p 0.9 is mildly higher quality and safer samples, but this comes at a cost of too much performance in double digit percent sometimes, for it to be on by default i think... 2023-08-10 15:23:05 +00:00
atamyrat c02865df30 prompt tokenizer improvements: utf8 support, add_dummy_prefix and byte_fallback options to match sentencepiece 2023-08-07 13:12:44 +03:00
rdentato ff6a2f0a7a Reset the #include <omp.h> 2023-08-07 07:28:03 +00:00
rdentato e49c16caa5 Changed how rng_seed is handled. Now 0 is treated as time(NULL). 2023-08-07 06:51:57 +00:00
rdentato 999b1bf776 Added conditional include of the OpenMP header. 2023-08-06 21:07:09 +00:00
Andrej Karpathy 79791f39b4 let's start respecting the BOS token. Don't print it explicitly, and terminate sequence if it appears. This makes sense especially after the recent addition of prompting. Also be careful with timings and making sure they come out right if we exit early in this data-dependent manner 2023-08-06 16:33:23 +00:00
Andrej Karpathy 4e8a3e8d5d fix style issue space with stderr printing 2023-08-06 15:51:58 +00:00
rdentato 9cfb7efb85 Changed all the printf() for error/info messages so that they print on stderr. 2023-08-06 09:53:02 +00:00
Andrej Karpathy 65b0846637 error on seed=0 2023-08-06 07:31:21 +00:00
Andrej Karpathy 8931d5092e add nucleus sampling. it costs lines of code, but i think this is the default best way to sample, so it is important to have 2023-08-06 07:22:39 +00:00
Andrej Karpathy a1037d79ee turned on trimTrailingWhitespace in my vscode sorry about that 2023-08-05 22:46:35 +00:00
Andrej a2962b9a0c Merge pull request #195 from clebert/prompt-tokens-size
Adjust `malloc` size for `prompt_tokens`
2023-08-05 15:29:33 -07:00
Andrej Karpathy d1a59a9ca8 use EXIT_FAILURE instead of 1 2023-08-05 19:19:49 +00:00
Andrej Karpathy 0447b06e7c simplify rope 2023-08-05 18:41:31 +00:00
Andrej Karpathy dcef5ff7c7 add a bit less embarrassing argparse that uses keyword arguments instead of positional arguments 2023-08-05 17:08:11 +00:00
rahulschand 02cf3c7311 Small changes to ROPE & comments 2023-08-03 20:13:50 +05:30
Clemens Akens 2c0134f669 adjust malloc size for prompt_tokens 2023-07-31 14:11:15 +02:00
richinseattle cddb05d5b2 use ssize_t/int64 and 64bit version of ftell on windows 2023-07-29 22:54:27 -07:00
Andrej Karpathy fd68dd222f reshuffle blocks of code a bit 2023-07-28 05:37:43 +00:00
aegkmq 6ce28fb388 Merge branch 'master' into better-rng 2023-07-28 13:52:34 +09:00
Andrej Karpathy b4bb47bb7b big change: adding prompting. many LOC, but critical. ty @atamurad for the first draft, i ended up tuning it quite a bit. 2023-07-28 04:12:54 +00:00
Andrej Karpathy e5752e1fc9 strip leading whitespace 2023-07-27 22:59:19 +00:00
Andrej Karpathy 25b50ee0e2 small stylistic fixes and adjustments, fix bug in Makefile, and change the timing code to skip the first (slow) iteration 2023-07-27 22:42:08 +00:00
aegkmq 71200f3092 Fix random_f32 2023-07-28 00:35:59 +09:00
Aydyn Tairov acf1e18e8f remove second ifdefs for windows timing by introducing ported version of clock_gettime 2023-07-27 16:33:21 +01:00
aegkmq 1bdf5af743 Replace the rand() with a portable PRNG 2023-07-27 20:14:08 +09:00
Andrej Karpathy f19f50a744 stylistic changes for the windows support ifdefs 2023-07-27 06:08:40 +00:00
richinseattle 4a6b7a471d Include windows support header (for mmap) 2023-07-26 22:40:01 -07:00
Andrej Karpathy 0d18fa7780 Merge branch 'patch-2' of https://github.com/richinseattle/llama2.c into richinseattle-patch-2 2023-07-27 05:23:05 +00:00
richinseattle 37e8c20f4f Windows compat: Use GetTickCount for delta timer
Intentionally not including a windows header here to avoid merge conflict on include with mmap support. cl.exe doesn't complain, mingw warns.
2023-07-26 22:19:49 -07:00
richinseattle 539dc73196 fix whitespace 2023-07-26 22:12:32 -07:00
richinseattle 7f7a3b2d56 update openmp pragmas for MSVC compatibility
This has no negative impact on Linux and is in preparation for windows support. Windows compiles will not work without additional timer and mmap compatibility patches
2023-07-26 22:06:23 -07:00
Bernardo Ramos 57034480b6 add some code comments 2023-07-26 19:48:14 -03:00
aegkmq 8986005f23 Minor cleanup 2023-07-26 16:57:08 +09:00
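
Note on commit d421a95b2b (optimize sample_topp): the message argues that any probability below (1 - topp) / (n - 1) can never be part of the nucleus, so those elements can be dropped before sorting. The following is a minimal sketch of that idea, not the code from the commit. The names ProbIndex, filter_candidates and sample_topp_sketch, the coin parameter (a uniform random number in [0, 1) supplied by the caller), and the argmax fallback for an empty candidate set are all assumptions made for illustration. The bound holds when topp >= 1/n, which is why the fallback is included.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper type: a probability paired with its token index. */
typedef struct {
    float prob;
    int index;
} ProbIndex;

/* qsort comparator: sort by probability, largest first. */
static int compare_desc(const void *a, const void *b) {
    const ProbIndex *x = a;
    const ProbIndex *y = b;
    if (x->prob > y->prob) return -1;
    if (x->prob < y->prob) return 1;
    return 0;
}

/* Copy into `out` only the entries that could still be selected.
 * In the worst case one element is selected, so the other (n - 1)
 * elements share at most (1 - topp) of the mass; anything smaller
 * than (1 - topp) / (n - 1) cannot be in the nucleus.
 * Returns the number of candidates kept. */
static int filter_candidates(const float *probs, int n, float topp, ProbIndex *out) {
    assert(n > 0);
    const float cutoff = (n > 1) ? (1.0f - topp) / (float)(n - 1) : 0.0f;
    int kept = 0;
    for (int i = 0; i < n; i++) {
        if (probs[i] >= cutoff) {
            out[kept].index = i;
            out[kept].prob = probs[i];
            kept++;
        }
    }
    return kept;
}

/* Top-p (nucleus) sampling over `probs`, which should sum to 1.
 * `scratch` must have room for n entries. */
static int sample_topp_sketch(const float *probs, int n, float topp,
                              ProbIndex *scratch, float coin) {
    int kept = filter_candidates(probs, n, topp, scratch);
    if (kept == 0) {
        /* Assumed fallback (only reachable when topp < 1/n): take the argmax. */
        int best = 0;
        for (int i = 1; i < n; i++) {
            if (probs[i] > probs[best]) best = i;
        }
        return best;
    }

    /* Only the surviving candidates are sorted, which is the point of the filter. */
    qsort(scratch, (size_t)kept, sizeof(ProbIndex), compare_desc);

    /* Find the smallest prefix whose cumulative probability exceeds topp. */
    float cumulative = 0.0f;
    int last = kept - 1;
    for (int i = 0; i < kept; i++) {
        cumulative += scratch[i].prob;
        if (cumulative > topp) {
            last = i;
            break;
        }
    }

    /* Sample from the truncated list, rescaling coin to its total mass. */
    const float r = coin * cumulative;
    float cdf = 0.0f;
    for (int i = 0; i <= last; i++) {
        cdf += scratch[i].prob;
        if (r < cdf) return scratch[i].index;
    }
    return scratch[last].index; /* guard against rounding error */
}
```

With a realistic vocabulary size most probabilities fall below the cutoff, so the sort runs over far fewer elements than n. That matches the motivation given in commit 854c97b660 for turning topp 0.9 back on by default.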