Commit Graph

262 Commits

Author SHA1 Message Date
Andrej 09de2cc4ca Merge pull request #250 from npinto/master-1
FIX: model.generate(); forward() only returns logits now.
2023-08-06 18:43:01 -07:00
Nicolas Pinto 98b515e44d FIX: model.generate()
This patch fixes a simple bug in `generate()` due to model's `forward()` only returning logits and not losses since `f2e34e6b0ac55accd6ba930a04c6f683f5158b29`.
2023-08-06 14:48:47 -07:00
Andrej Karpathy a7a3aa09b8 Merge branch 'master' of github.com:karpathy/llama2.c 2023-08-06 16:33:36 +00:00
Andrej Karpathy 79791f39b4 let's start respecting the BOS token. Don't print it explicitly, and terminate sequence if it appears. This makes sense especially after the recent addition of prompting. Also be careful with timings and making sure they come out right if we exit early in this data-dependent manner 2023-08-06 16:33:23 +00:00
Andrej Karpathy 4e8a3e8d5d fix style issue space with stderr printing 2023-08-06 15:51:58 +00:00
Andrej 7af81ded7e Merge pull request #244 from madroidmaq/master
Update README.md: format notable forks
2023-08-06 08:43:24 -07:00
Andrej a25958fd45 Merge pull request #245 from rdentato/patch-stderr
Errors and info on stderr
2023-08-06 08:42:09 -07:00
Madroid Ma 1f53735d12 Merge branch 'karpathy:master' into master 2023-08-06 18:18:36 +08:00
rdentato 9cfb7efb85 Changed all the printf() for error/info messages so that they print on stderr. 2023-08-06 09:53:02 +00:00
madroid baefaaaf76 Update README.md: add notable forks author's link 2023-08-06 17:42:31 +08:00
Andrej Karpathy 623894f5da fix bug, have to use raw_model not model to access the loss 2023-08-06 07:55:46 +00:00
Andrej Karpathy 65b0846637 error on seed=0 2023-08-06 07:31:21 +00:00
Andrej Karpathy 8931d5092e add nucleus sampling. it costs lines of code, but i think this is the default best way to sample, so it is important to have 2023-08-06 07:22:39 +00:00
madroid 8c1f1b280f Update README.md: format notable forks 2023-08-06 14:23:57 +08:00
Andrej Karpathy 49e3ff6d08 update makefile to use correct arg call after our argparse update 2023-08-05 23:11:11 +00:00
Andrej Karpathy a1037d79ee turned on trimTrailingWhitespace in my vscode sorry about that 2023-08-05 22:46:35 +00:00
Andrej a2962b9a0c Merge pull request #195 from clebert/prompt-tokens-size
Adjust `malloc` size for `prompt_tokens`
2023-08-05 15:29:33 -07:00
Andrej bdf3a6c22c Merge pull request #167 from mzcu/pretokenize-speedup
Speed up tinystories pretokenize command
2023-08-05 15:14:51 -07:00
Andrej Karpathy d1a59a9ca8 use EXIT_FAILURE instead of 1 2023-08-05 19:19:49 +00:00
Andrej f3e7710763 Merge pull request #215 from Majdoddin/windows
replaced __int64 with int64_t and DWORD with uint_32
2023-08-05 12:14:39 -07:00
Andrej Karpathy 0447b06e7c simplify rope 2023-08-05 18:41:31 +00:00
Andrej Karpathy 8719d5f5a8 Merge branch 'mpcusack-mpcusack/jitsave' 2023-08-05 18:13:07 +00:00
Andrej Karpathy e03d7ecf12 Merge branch 'mpcusack/jitsave' of https://github.com/mpcusack/llama2.c into mpcusack-mpcusack/jitsave 2023-08-05 18:11:21 +00:00
Andrej Karpathy 0609eb6601 slightly tune todos 2023-08-05 17:13:35 +00:00
Andrej Karpathy dcef5ff7c7 add a bit less embarrassing argparse that uses keyword arguments instead of positional arguments 2023-08-05 17:08:11 +00:00
Andrej Karpathy 837796e0b7 get rid of unneeded comment now 2023-08-05 16:19:27 +00:00
Andrej db4ad580f3 Merge pull request #225 from RahulSChand/rope_changes
Fixing max_seq_len passed to RoPE implementation. Minor comment changes
2023-08-05 09:18:47 -07:00
Andrej 9d001c6249 Merge pull request #223 from LexiestLeszek/master-1
Updated README.md with added steps for junior devs
2023-08-05 09:13:12 -07:00
Andrej f93e7b5626 Merge pull request #228 from aiwizzard/master
Fixed typo in README.md
2023-08-05 09:09:19 -07:00
Andrej 2abd77a57f Merge pull request #231 from madroidmaq/master
Update README.md: add a Kotlin port of this project
2023-08-05 09:08:59 -07:00
Andrej ba036696b7 Merge branch 'master' into master 2023-08-05 09:08:51 -07:00
Andrej 4b1e5d57a1 Merge pull request #232 from clebert/zig-port
Add Zig port to README
2023-08-05 09:08:31 -07:00
Michael Cusack 13f342af9e docs typo 2023-08-04 23:12:06 +07:00
Michael Cusack f4c96b7339 Add options to save_torchscript 2023-08-04 23:11:33 +07:00
Michael Cusack 4b3a41b8fc Add options to save_torchscript 2023-08-04 23:10:14 +07:00
Clemens Akens a4e961f378 Add Zig port to README 2023-08-04 18:00:04 +02:00
Michael Cusack 113c675bc9 Rename save_model.py 2023-08-04 20:31:44 +07:00
madroid ec65aac182 Update README.md: add a Kotlin port of this project 2023-08-04 18:50:06 +08:00
Michael Cusack 305d920862 Zero'ing params docs 2023-08-04 17:33:23 +07:00
Michael Cusack dfff7812db Zero'ing params docs 2023-08-04 17:31:31 +07:00
Michael Cusack 34f0402501 Zero'ing params docs 2023-08-04 17:31:11 +07:00
Michael Cusack d4cdd6259e Zero'ing params docs 2023-08-04 17:30:05 +07:00
Michael Cusack 9f8e0857ee Typo 2023-08-04 17:22:27 +07:00
Michael Cusack f8d45f180d Reinline loss function 2023-08-04 17:21:29 +07:00
Michael Cusack f67185958b Model args in save script 2023-08-04 17:07:41 +07:00
Michael Cusack fd5e2cc7bc Updating training code for loss result 2023-08-04 17:03:11 +07:00
Michael Cusack ac2b435151 docs 2023-08-04 16:55:26 +07:00
Michael Cusack 11a8348dfc extra line 2023-08-04 16:52:04 +07:00
Michael Cusack f2e34e6b0a Resolve jit.save errors 2023-08-04 16:49:26 +07:00
Ajmal K b9f303f3b8 Fixed typo in README.md
Fixed typo
2023-08-04 10:30:11 +05:30