Commit Graph

62 Commits

Author SHA1 Message Date
Andrej Karpathy f5fc0c245f final piece: run.c support for new tokenizer, super ez 2023-08-13 02:12:13 +00:00
Andrej Karpathy c42641205f turn off topp sampling by default because it is a bit too slow to be the default. it is likely that turning it on, e.g. -p 0.9 is mildly higher quality and safer samples, but this comes at a cost of too much performance in double digit percent sometimes, for it to be on by default i think... 2023-08-10 15:23:05 +00:00
rdentato ff6a2f0a7a Reset the #include <omp.h> 2023-08-07 07:28:03 +00:00
rdentato e49c16caa5 Changed how rng_seed is handled. Now 0 is treated as time(NULL). 2023-08-07 06:51:57 +00:00
rdentato 999b1bf776 Added conditional include of the OpenMP header. 2023-08-06 21:07:09 +00:00
Andrej Karpathy 79791f39b4 let's start respecting the BOS token. Don't print it explicitly, and terminate sequence if it appears. This makes sense especially after the recent addition of prompting. Also be careful with timings and making sure they come out right if we exit early in this data-dependent manner 2023-08-06 16:33:23 +00:00
Andrej Karpathy 4e8a3e8d5d fix style issue space with stderr printing 2023-08-06 15:51:58 +00:00
rdentato 9cfb7efb85 Changed all the printf() for error/info messages so that they print on stderr. 2023-08-06 09:53:02 +00:00
Andrej Karpathy 65b0846637 error on seed=0 2023-08-06 07:31:21 +00:00
Andrej Karpathy 8931d5092e add nucleus sampling. it costs lines of code, but i think this is the default best way to sample, so it is important to have 2023-08-06 07:22:39 +00:00
Andrej Karpathy a1037d79ee turned on trimTrailingWhitespace in my vscode sorry about that 2023-08-05 22:46:35 +00:00
Andrej a2962b9a0c Merge pull request #195 from clebert/prompt-tokens-size
Adjust `malloc` size for `prompt_tokens`
2023-08-05 15:29:33 -07:00
Andrej Karpathy d1a59a9ca8 use EXIT_FAILURE instead of 1 2023-08-05 19:19:49 +00:00
Andrej Karpathy 0447b06e7c simplify rope 2023-08-05 18:41:31 +00:00
Andrej Karpathy dcef5ff7c7 add a bit less embarrassing argparse that uses keyword arguments instead of positional arguments 2023-08-05 17:08:11 +00:00
rahulschand 02cf3c7311 Small changes to ROPE & comments 2023-08-03 20:13:50 +05:30
Clemens Akens 2c0134f669 adjust malloc size for prompt_tokens 2023-07-31 14:11:15 +02:00
richinseattle cddb05d5b2 use ssize_t/int64 and 64bit version of ftell on windows 2023-07-29 22:54:27 -07:00
Andrej Karpathy fd68dd222f reshuffle blocks of code a bit 2023-07-28 05:37:43 +00:00
aegkmq 6ce28fb388 Merge branch 'master' into better-rng 2023-07-28 13:52:34 +09:00
Andrej Karpathy b4bb47bb7b big change: adding prompting. many LOC, but critical. ty @atamurad for the first draft, i ended up tuning it quite a bit. 2023-07-28 04:12:54 +00:00
Andrej Karpathy e5752e1fc9 strip leading whitespace 2023-07-27 22:59:19 +00:00
Andrej Karpathy 25b50ee0e2 small stylistic fixes and adjustments, fix bug in Makefile, and change the timing code to skip the first (slow) iteration 2023-07-27 22:42:08 +00:00
aegkmq 71200f3092 Fix random_f32 2023-07-28 00:35:59 +09:00
Aydyn Tairov acf1e18e8f remove second ifdefs for windows timing by introducing ported version of clock_gettime 2023-07-27 16:33:21 +01:00
aegkmq 1bdf5af743 Replace the rand() with a portable PRNG 2023-07-27 20:14:08 +09:00
Andrej Karpathy f19f50a744 stylistic changes for the windows support ifdefs 2023-07-27 06:08:40 +00:00
richinseattle 4a6b7a471d Include windows support header (for mmap) 2023-07-26 22:40:01 -07:00
Andrej Karpathy 0d18fa7780 Merge branch 'patch-2' of https://github.com/richinseattle/llama2.c into richinseattle-patch-2 2023-07-27 05:23:05 +00:00
richinseattle 37e8c20f4f Windows compat: Use GetTickCount for delta timer
Intentionally not including a windows header here to avoid merge conflict on include with mmap support. cl.exe doesn't complain, mingw warns.
2023-07-26 22:19:49 -07:00
richinseattle 539dc73196 fix whitespace 2023-07-26 22:12:32 -07:00
richinseattle 7f7a3b2d56 update openmp pragmas for MSVC compatibility
This has no negative impact on Linux and is in preparation for windows support. Windows compiles will not work without additional timer and mmap compatibility patches
2023-07-26 22:06:23 -07:00
Bernardo Ramos 57034480b6 add some code comments 2023-07-26 19:48:14 -03:00
aegkmq 8986005f23 Minor cleanup 2023-07-26 16:57:08 +09:00
aegkmq 36c522a0d8 Improve locality 2023-07-26 13:24:27 +09:00
Andrej 614bf91e5d Merge pull request #60 from emma-eva/patch-1
Fixed time_in_ms() compile time error (termux and neoterm)
2023-07-25 16:06:41 -07:00
Andrej Karpathy 05ee4cbf38 fix bug in timing - use steps not max seq len doh 2023-07-25 14:21:37 +00:00
Emma Eva 6ce91b1b3b Fixed time_in_ms() compile time error (termux and neoterm)
clang version 16.0.4
2023-07-25 12:12:40 +06:00
Andrej Karpathy c3e0d73bd2 we can inference Meta's Llama 2 7B, yay 2023-07-25 04:21:07 +00:00
Andrej Karpathy a1f6b4653e merge conflict resolve with imports 2023-07-25 01:58:46 +00:00
richinseattle b2857c6af2 Switch to using timespec_get() for cross OS compatibility 2023-07-24 16:31:38 -07:00
Andrej Karpathy e6e3f1322b candidate memmap implementation 2023-07-24 22:54:49 +00:00
richinseattle 2be7d7887b MSVC Compatibility fix for timer
use clock() instead of gettimeofday() for cross-platform compatibility
2023-07-24 15:22:20 -07:00
Andrej 669b75ddc8 Merge pull request #43 from krzysztof-jusiak/rmsnorm
Speed up rmsnorm by using sqrtf/expf
2023-07-24 14:13:49 -07:00
Andrej Karpathy 791be9d991 tweak argparse. fix steps=256, even if some models may support longer maximum seq_len. get rid of seed option for now, use temp=0.0 for deterministic behavior 2023-07-24 20:59:32 +00:00
Kris Jusiak c9b1f10124 Speed up rmsnorm by using sqrtf/expf
Problem:
- exp and sqrt are using double precision for operations which is not
  required.

Solution:
- Use expf and sqrtf instead.

Notes:
- Although it's using single precision, that doesn't seem to affect the
  result.

Results: ~ 10% improvement
  - before:  940 tok/s
  - after:  1020 tok/s
2023-07-24 13:06:27 -05:00
Franz Louis Cesista c9ad067c5d parallelize multi-head attention 2023-07-25 01:10:12 +08:00
Andrej d0ddf94cc3 Merge pull request #36 from hu-po/patch-1
typo
2023-07-24 07:27:36 -07:00
Andrej 228c4ea3ea Merge pull request #28 from SlyEcho/master
Fix tokenizer reading on Windows
2023-07-24 07:23:07 -07:00
hu-po d95c7617c6 typo 2023-07-24 07:35:12 -05:00