Andrej
09de2cc4ca
Merge pull request #250 from npinto/master-1
FIX: model.generate(); forward() only returns logits now.
2023-08-06 18:43:01 -07:00
Nicolas Pinto
98b515e44d
FIX: model.generate()
This patch fixes a simple bug in `generate()` due to model's `forward()` only returning logits and not losses since `f2e34e6b0ac55accd6ba930a04c6f683f5158b29`.
2023-08-06 14:48:47 -07:00
Andrej Karpathy
a7a3aa09b8
Merge branch 'master' of github.com:karpathy/llama2.c
2023-08-06 16:33:36 +00:00
Andrej Karpathy
79791f39b4
let's start respecting the BOS token. Don't print it explicitly, and terminate sequence if it appears. This makes sense especially after the recent addition of prompting. Also be careful with timings and making sure they come out right if we exit early in this data-dependent manner
2023-08-06 16:33:23 +00:00
Andrej Karpathy
4e8a3e8d5d
fix style issue space with stderr printing
2023-08-06 15:51:58 +00:00
Andrej
7af81ded7e
Merge pull request #244 from madroidmaq/master
Update README.md: format notable forks
2023-08-06 08:43:24 -07:00
Andrej
a25958fd45
Merge pull request #245 from rdentato/patch-stderr
Errors and info on stderr
2023-08-06 08:42:09 -07:00
Madroid Ma
1f53735d12
Merge branch 'karpathy:master' into master
2023-08-06 18:18:36 +08:00
rdentato
9cfb7efb85
Changed all the printf() for error/info messages so that they print on stderr.
2023-08-06 09:53:02 +00:00
madroid
baefaaaf76
Update README.md: add notable forks author's link
2023-08-06 17:42:31 +08:00
Andrej Karpathy
623894f5da
fix bug, have to use raw_model not model to access the loss
2023-08-06 07:55:46 +00:00
Andrej Karpathy
65b0846637
error on seed=0
2023-08-06 07:31:21 +00:00
Andrej Karpathy
8931d5092e
add nucleus sampling. it costs lines of code, but i think this is the default best way to sample, so it is important to have
2023-08-06 07:22:39 +00:00
madroid
8c1f1b280f
Update README.md: format notable forks
2023-08-06 14:23:57 +08:00
Andrej Karpathy
49e3ff6d08
update makefile to use correct arg call after our argparse update
2023-08-05 23:11:11 +00:00
Andrej Karpathy
a1037d79ee
turned on trimTrailingWhitespace in my vscode sorry about that
2023-08-05 22:46:35 +00:00
Andrej
a2962b9a0c
Merge pull request #195 from clebert/prompt-tokens-size
Adjust `malloc` size for `prompt_tokens`
2023-08-05 15:29:33 -07:00
Andrej
bdf3a6c22c
Merge pull request #167 from mzcu/pretokenize-speedup
Speed up tinystories pretokenize command
2023-08-05 15:14:51 -07:00
Andrej Karpathy
d1a59a9ca8
use EXIT_FAILURE instead of 1
2023-08-05 19:19:49 +00:00
Andrej
f3e7710763
Merge pull request #215 from Majdoddin/windows
replaced __int64 with int64_t and DWORD with uint32_t
2023-08-05 12:14:39 -07:00
Andrej Karpathy
0447b06e7c
simplify rope
2023-08-05 18:41:31 +00:00
Andrej Karpathy
8719d5f5a8
Merge branch 'mpcusack-mpcusack/jitsave'
2023-08-05 18:13:07 +00:00
Andrej Karpathy
e03d7ecf12
Merge branch 'mpcusack/jitsave' of https://github.com/mpcusack/llama2.c into mpcusack-mpcusack/jitsave
2023-08-05 18:11:21 +00:00
Andrej Karpathy
0609eb6601
slightly tune todos
2023-08-05 17:13:35 +00:00
Andrej Karpathy
dcef5ff7c7
add a bit less embarrassing argparse that uses keyword arguments instead of positional arguments
2023-08-05 17:08:11 +00:00
Andrej Karpathy
837796e0b7
get rid of unneeded comment now
2023-08-05 16:19:27 +00:00
Andrej
db4ad580f3
Merge pull request #225 from RahulSChand/rope_changes
Fixing max_seq_len passed to RoPE implementation. Minor comment changes
2023-08-05 09:18:47 -07:00
Andrej
9d001c6249
Merge pull request #223 from LexiestLeszek/master-1
Updated README.md with added steps for junior devs
2023-08-05 09:13:12 -07:00
Andrej
f93e7b5626
Merge pull request #228 from aiwizzard/master
Fixed typo in README.md
2023-08-05 09:09:19 -07:00
Andrej
2abd77a57f
Merge pull request #231 from madroidmaq/master
Update README.md: add a Kotlin port of this project
2023-08-05 09:08:59 -07:00
Andrej
ba036696b7
Merge branch 'master' into master
2023-08-05 09:08:51 -07:00
Andrej
4b1e5d57a1
Merge pull request #232 from clebert/zig-port
Add Zig port to README
2023-08-05 09:08:31 -07:00
Michael Cusack
13f342af9e
docs typo
2023-08-04 23:12:06 +07:00
Michael Cusack
f4c96b7339
Add options to save_torchscript
2023-08-04 23:11:33 +07:00
Michael Cusack
4b3a41b8fc
Add options to save_torchscript
2023-08-04 23:10:14 +07:00
Clemens Akens
a4e961f378
Add Zig port to README
2023-08-04 18:00:04 +02:00
Michael Cusack
113c675bc9
Rename save_model.py
2023-08-04 20:31:44 +07:00
madroid
ec65aac182
Update README.md: add a Kotlin port of this project
2023-08-04 18:50:06 +08:00
Michael Cusack
305d920862
Zero'ing params docs
2023-08-04 17:33:23 +07:00
Michael Cusack
dfff7812db
Zero'ing params docs
2023-08-04 17:31:31 +07:00
Michael Cusack
34f0402501
Zero'ing params docs
2023-08-04 17:31:11 +07:00
Michael Cusack
d4cdd6259e
Zero'ing params docs
2023-08-04 17:30:05 +07:00
Michael Cusack
9f8e0857ee
Typo
2023-08-04 17:22:27 +07:00
Michael Cusack
f8d45f180d
Reinline loss function
2023-08-04 17:21:29 +07:00
Michael Cusack
f67185958b
Model args in save script
2023-08-04 17:07:41 +07:00
Michael Cusack
fd5e2cc7bc
Updating training code for loss result
2023-08-04 17:03:11 +07:00
Michael Cusack
ac2b435151
docs
2023-08-04 16:55:26 +07:00
Michael Cusack
11a8348dfc
extra line
2023-08-04 16:52:04 +07:00
Michael Cusack
f2e34e6b0a
Resolve jit.save errors
2023-08-04 16:49:26 +07:00
Ajmal K
b9f303f3b8
Fixed typo in README.md
Fixed typo
2023-08-04 10:30:11 +05:30