atamyrat
36b54321e5
bugfix: allocate +1 in tokens buffer for dummy whitespace
2023-08-13 23:23:32 +03:00
atamyrat
daa9fd9b8a
sort vocabulary for faster lookup with bsearch()
2023-08-13 15:02:11 +03:00
atamyrat
c02865df30
prompt tokenizer improvements: utf8 support, add_dummy_prefix and byte_fallback options to match sentencepiece
2023-08-07 13:12:44 +03:00
Andrej
3c3b19b14c
Merge pull request #242 from tairov/llama2-py
...
Add a link to simple one file pure Python port
2023-08-06 19:51:30 -07:00
Andrej
f4f4cae4cb
Merge pull request #241 from danielgrittner/master
...
add a Rust port
2023-08-06 19:51:13 -07:00
Andrej
09de2cc4ca
Merge pull request #250 from npinto/master-1
...
FIX: model.generate(); forward() only returns logits now.
2023-08-06 18:43:01 -07:00
Nicolas Pinto
98b515e44d
FIX: model.generate()
...
This patch fixes a simple bug in `generate()` due to model's `forward()` only returning logits and not losses since `f2e34e6b0ac55accd6ba930a04c6f683f5158b29`.
2023-08-06 14:48:47 -07:00
Aydyn Tairov
2297d158e3
Fix link to a github profile
2023-08-06 21:47:05 +01:00
Daniel Grittner
512f039d5d
Merge branch 'master' into master
2023-08-06 19:55:43 +02:00
Aydyn Tairov
6734eaeff5
Rebase changes to master
2023-08-06 18:47:05 +01:00
Aydyn Tairov
7178facb75
Rebase changes to master
2023-08-06 18:45:47 +01:00
Andrej Karpathy
a7a3aa09b8
Merge branch 'master' of github.com:karpathy/llama2.c
2023-08-06 16:33:36 +00:00
Andrej Karpathy
79791f39b4
let's start respecting the BOS token. Don't print it explicitly, and terminate sequence if it appears. This makes sense especially after the recent addition of prompting. Also be careful with timings and making sure they come out right if we exit early in this data-dependent manner
2023-08-06 16:33:23 +00:00
Andrej Karpathy
4e8a3e8d5d
fix style issue space with stderr printing
2023-08-06 15:51:58 +00:00
Andrej
7af81ded7e
Merge pull request #244 from madroidmaq/master
...
Update README.md: format notable forks
2023-08-06 08:43:24 -07:00
Andrej
a25958fd45
Merge pull request #245 from rdentato/patch-stderr
...
Errors and info on stderr
2023-08-06 08:42:09 -07:00
Madroid Ma
1f53735d12
Merge branch 'karpathy:master' into master
2023-08-06 18:18:36 +08:00
rdentato
9cfb7efb85
Changed all the printf() for error/info messages so that they print on stderr.
2023-08-06 09:53:02 +00:00
madroid
baefaaaf76
Update README.md: add notable forks author's link
2023-08-06 17:42:31 +08:00
Daniel Grittner
fcb4cdef8b
add a Rust port
2023-08-06 10:44:48 +02:00
Andrej Karpathy
623894f5da
fix bug, have to use raw_model not model to access the loss
2023-08-06 07:55:46 +00:00
Andrej Karpathy
65b0846637
error on seed=0
2023-08-06 07:31:21 +00:00
Andrej Karpathy
8931d5092e
add nucleus sampling. it costs lines of code, but i think this is the default best way to sample, so it is important to have
2023-08-06 07:22:39 +00:00
madroid
8c1f1b280f
Update README.md: format notable forks
2023-08-06 14:23:57 +08:00
Andrej Karpathy
49e3ff6d08
update makefile to use correct arg call after our argparse update
2023-08-05 23:11:11 +00:00
Andrej Karpathy
a1037d79ee
turned on trimTrailingWhitespace in my vscode sorry about that
2023-08-05 22:46:35 +00:00
Andrej
a2962b9a0c
Merge pull request #195 from clebert/prompt-tokens-size
...
Adjust `malloc` size for `prompt_tokens`
2023-08-05 15:29:33 -07:00
Andrej
bdf3a6c22c
Merge pull request #167 from mzcu/pretokenize-speedup
...
Speed up tinystories pretokenize command
2023-08-05 15:14:51 -07:00
Andrej Karpathy
d1a59a9ca8
use EXIT_FAILURE instead of 1
2023-08-05 19:19:49 +00:00
Andrej
f3e7710763
Merge pull request #215 from Majdoddin/windows
...
replaced __int64 with int64_t and DWORD with uint32_t
2023-08-05 12:14:39 -07:00
Andrej Karpathy
0447b06e7c
simplify rope
2023-08-05 18:41:31 +00:00
Andrej Karpathy
8719d5f5a8
Merge branch 'mpcusack-mpcusack/jitsave'
2023-08-05 18:13:07 +00:00
Andrej Karpathy
e03d7ecf12
Merge branch 'mpcusack/jitsave' of https://github.com/mpcusack/llama2.c into mpcusack-mpcusack/jitsave
2023-08-05 18:11:21 +00:00
Andrej Karpathy
0609eb6601
slightly tune todos
2023-08-05 17:13:35 +00:00
Andrej Karpathy
dcef5ff7c7
add a bit less embarrassing argparse that uses keyword arguments instead of positional arguments
2023-08-05 17:08:11 +00:00
Andrej Karpathy
837796e0b7
get rid of unneeded comment now
2023-08-05 16:19:27 +00:00
Andrej
db4ad580f3
Merge pull request #225 from RahulSChand/rope_changes
...
Fixing max_seq_len passed to RoPE implementation. Minor comment changes
2023-08-05 09:18:47 -07:00
Andrej
9d001c6249
Merge pull request #223 from LexiestLeszek/master-1
...
Updated README.md with added steps for junior devs
2023-08-05 09:13:12 -07:00
Andrej
f93e7b5626
Merge pull request #228 from aiwizzard/master
...
Fixed typo in README.md
2023-08-05 09:09:19 -07:00
Andrej
2abd77a57f
Merge pull request #231 from madroidmaq/master
...
Update README.md: add a Kotlin port of this project
2023-08-05 09:08:59 -07:00
Andrej
ba036696b7
Merge branch 'master' into master
2023-08-05 09:08:51 -07:00
Andrej
4b1e5d57a1
Merge pull request #232 from clebert/zig-port
...
Add Zig port to README
2023-08-05 09:08:31 -07:00
Michael Cusack
13f342af9e
docs typo
2023-08-04 23:12:06 +07:00
Michael Cusack
f4c96b7339
Add options to save_torchscript
2023-08-04 23:11:33 +07:00
Michael Cusack
4b3a41b8fc
Add options to save_torchscript
2023-08-04 23:10:14 +07:00
Clemens Akens
a4e961f378
Add Zig port to README
2023-08-04 18:00:04 +02:00
Michael Cusack
113c675bc9
Rename save_model.py
2023-08-04 20:31:44 +07:00
madroid
ec65aac182
Update README.md: add a Kotlin port of this project
2023-08-04 18:50:06 +08:00
Michael Cusack
305d920862
Zero'ing params docs
2023-08-04 17:33:23 +07:00
Michael Cusack
dfff7812db
Zero'ing params docs
2023-08-04 17:31:31 +07:00