Andrej Karpathy
288b3cec09
remove dagger in the eyeball
2023-08-21 04:47:49 +00:00
Andrej Karpathy
14275bd623
minor clean. i think a lot of chaos has been reduced for today. we shall now rest.
2023-08-21 04:43:24 +00:00
Andrej Karpathy
3868f732a4
and finally refactor the Sampler. things are starting to look a lot cleaner I think
2023-08-21 04:23:02 +00:00
Andrej Karpathy
8a377a1d31
refactor the Transformer (Config, Weights, RunState) into a single object, with build and free too
2023-08-21 03:55:12 +00:00
Andrej Karpathy
ae2e4f8d88
name the tokenizer methods cleaner: encode and decode
2023-08-21 03:11:54 +00:00
Andrej Karpathy
c74456f3f0
refactor step 1. the tokenizer, and all the other abstractions, are a total mess, refactoring things a bit
2023-08-20 18:18:23 +00:00
Andrej Karpathy
1e335a41cf
remove freq_cis fields as they are not used anymore
2023-08-20 17:26:43 +00:00
Andrej Karpathy
c0511de617
probindex should never have been part of RunState. i apologize for this failure of abstraction
2023-08-20 17:18:06 +00:00
Andrej
8c93c7a30e
Merge pull request #322 from karpathy/feature/export
New model export (the code remains "dead" and legacy version is still the default behavior, so no breaking changes are introduced). The major benefit is a new export.py file, which we can use to centralize work on formatting: both imports and exports.
2023-08-20 10:08:32 -07:00
Andrej Karpathy
13dcee493a
todos update
2023-08-20 17:02:22 +00:00
Andrej Karpathy
f3db92a2dc
use out_file.tell() instead of nbytes += arithmetic
2023-08-20 16:51:35 +00:00
Andrej Karpathy
fa8dfd854e
isolate read_checkpoint, because i'd like to now make it support both version 0 and version 1
2023-08-19 19:21:12 +00:00
Andrej Karpathy
4df5e2e939
make version 1 be the legacy export but with new header. version 2 will be Q8_0 export
2023-08-19 18:51:32 +00:00
Andrej Karpathy
4212bd6d43
oops fix double indent on quantize def
2023-08-19 18:34:49 +00:00
Andrej Karpathy
7f551dbfd7
new model export: versions 0 (legacy) and 1
2023-08-19 18:25:20 +00:00
Andrej
6c5d78fa41
Merge pull request #317 from yiminghan/yhan/old
Add a link to Dart port in README
2023-08-19 10:01:08 -07:00
Andrej
db1a722816
Merge pull request #318 from rahoua/master
YARP - Yet Another Rust Port in README.md
2023-08-19 10:00:56 -07:00
Andrej
d2a546c577
Merge pull request #319 from RahulSChand/warning
Give better error message in Tinystories data loader
2023-08-19 10:00:27 -07:00
rahulschand
fbefeec1b1
add assert message to give better warning
2023-08-19 13:05:26 +05:30
rahoua
978c311b30
Add pecca-rs to README.md
2023-08-18 14:58:21 -07:00
YiMing Han
882e480bc0
update read me
2023-08-18 15:18:29 -04:00
YiMing Han
d09ebbb32b
Revert "working one"
This reverts commit 8607b11ea1.
2023-08-18 15:14:08 -04:00
YiMing Han
bc7cb7d0e8
Revert "only dart"
This reverts commit 01df3731d6.
2023-08-18 15:13:59 -04:00
YiMing Han
01df3731d6
only dart
2023-08-18 15:09:24 -04:00
YiMing Han
8607b11ea1
working one
2023-08-18 15:07:41 -04:00
Andrej Karpathy
bd182289c5
calculate the freq_cis online, no need to write/read them to/from checkpoints
2023-08-17 04:13:13 +00:00
Andrej
b68a6d2ab5
Merge pull request #307 from madroidmaq/master
Jupyter Notebook: Add run Meta's Llama 2 models
2023-08-16 20:09:32 -07:00
Andrej
57bf0e9ee4
Merge pull request #306 from rdentato/patch-utf8-no-validation
minimal protection against invalid UTF8 encoding.
2023-08-16 09:51:11 -07:00
madroid
9fbe96fc2e
Jupyter Notebook: Add run Meta's Llama 2 models
2023-08-16 20:27:28 +08:00
rdentato
55e60740f5
Added space to str_buffer in case max_token_length is 1.
2023-08-16 07:58:07 +00:00
rdentato
befe4867b3
minimal protection against invalid UTF8 encoding.
2023-08-16 07:42:53 +00:00
Andrej
df6557a10d
Merge pull request #267 from krrishnarraj/master
Update readme for openmp on mac
2023-08-15 19:26:34 -07:00
Andrej Karpathy
65c899314c
Merge branch 'Majdoddin-ci-tiny-model'
2023-08-16 02:22:26 +00:00
Andrej Karpathy
62a6d69d86
style changes and remove spurious runc test call at the bottom
2023-08-16 02:22:13 +00:00
Andrej Karpathy
d47fc41b6a
Merge branch 'ci-tiny-model' of https://github.com/Majdoddin/llama2.c into Majdoddin-ci-tiny-model
2023-08-16 02:20:34 +00:00
Andrej Karpathy
ca67253f28
smallfix: not sure what the point of this indirection was
2023-08-15 16:09:33 +00:00
Andrej Karpathy
4c63c5608d
shorten top comment on run.c file
2023-08-15 16:07:48 +00:00
Andrej Karpathy
a47f9b3969
collapsing copy paste code because it's driving my ocd crazy
2023-08-15 16:03:11 +00:00
Ruhollah Majdoddin
87b11edf27
modifying test_all so it can safely run on windows
2023-08-15 16:01:53 +00:00
Ruhollah Majdoddin
66c9f5e6c8
Adding pytest with the tiny model to macOS and windows (except amd64_arm64) runners
2023-08-15 15:58:04 +00:00
Andrej Karpathy
88eb238255
add tests into Makefile convenience
2023-08-15 15:57:27 +00:00
Andrej
600cedb33d
Merge pull request #297 from karpathy/feature/utf8
Add UTF-8 support to prompts
2023-08-14 19:54:49 -07:00
Andrej Karpathy
fe2de68688
fix sample.py from tokenizer changes before
2023-08-15 02:33:01 +00:00
Andrej Karpathy
a9a0628c92
thoroughly commented the UTF-8 byte reading code
2023-08-15 02:18:49 +00:00
Andrej Karpathy
d459fd4243
add back careful processing of the byte tokens
2023-08-15 01:42:33 +00:00
Andrej Karpathy
4bf36ecc17
get rid of the special byte decoding logic
2023-08-15 01:04:10 +00:00
Andrej Karpathy
8417cb438d
Merge branch 'utf8' of https://github.com/atamurad/llama2.c into feature/utf8
2023-08-15 00:18:53 +00:00
Andrej Karpathy
94a3a5e0a5
Merge branch 'master' of github.com:karpathy/llama2.c
2023-08-14 14:52:15 +00:00
Andrej Karpathy
32c1ff97fb
missed p->dim to kv_dim for k,v vectors. we're not doing anything wrong we're just being wasteful with memory. thanks @xefoci7612 for pointing out
2023-08-14 14:52:07 +00:00
Andrej
013e012b87
Merge pull request #286 from Nick-infinity/master
[Feat]: Add support for meta llama hf model conversion
2023-08-14 07:46:39 -07:00