Commit Graph

382 Commits

Author SHA1 Message Date
Andrej Karpathy 288b3cec09 remove dagger in the eyeball 2023-08-21 04:47:49 +00:00
Andrej Karpathy 14275bd623 minor clean. i think a lot of chaos has been reduced for today. we shall now rest. 2023-08-21 04:43:24 +00:00
Andrej Karpathy 3868f732a4 and finally refactor the Sampler. things are starting to look a lot cleaner I think 2023-08-21 04:23:02 +00:00
Andrej Karpathy 8a377a1d31 refactor the Transformer (Config, Weights, RunState) into a single object, with build and free too 2023-08-21 03:55:12 +00:00
Andrej Karpathy ae2e4f8d88 name the tokenizer methods cleaner: encode and decode 2023-08-21 03:11:54 +00:00
Andrej Karpathy c74456f3f0 refactor step 1. the tokenizer, and all the other abstractions, are a total mess, refactoring things a bit 2023-08-20 18:18:23 +00:00
Andrej Karpathy 1e335a41cf remove freq_cis fields as they are not used anymore 2023-08-20 17:26:43 +00:00
Andrej Karpathy c0511de617 probindex should never have been part of RunState. i apologize for this failure of abstraction 2023-08-20 17:18:06 +00:00
Andrej 8c93c7a30e Merge pull request #322 from karpathy/feature/export
New model export (the code remains "dead" and legacy version is still the default behavior, so no breaking changes are introduced). The major benefit is a new export.py file, which we can use to centralize work on formatting: both imports and exports.
2023-08-20 10:08:32 -07:00
Andrej Karpathy 13dcee493a todos update 2023-08-20 17:02:22 +00:00
Andrej Karpathy f3db92a2dc use out_file.tell() instead of nbytes += arithmetic 2023-08-20 16:51:35 +00:00
Andrej Karpathy fa8dfd854e isolate read_checkpoint, because i'd like to now make it support both version 0 and version 1 2023-08-19 19:21:12 +00:00
Andrej Karpathy 4df5e2e939 make version 1 be the legacy export but with new header. version 2 will be Q8_0 export 2023-08-19 18:51:32 +00:00
Andrej Karpathy 4212bd6d43 oops fix double indent on quantize def 2023-08-19 18:34:49 +00:00
Andrej Karpathy 7f551dbfd7 new model export: versions 0 (legacy) and 1 2023-08-19 18:25:20 +00:00
Andrej 6c5d78fa41 Merge pull request #317 from yiminghan/yhan/old
Add a link to Dart port in README
2023-08-19 10:01:08 -07:00
Andrej db1a722816 Merge pull request #318 from rahoua/master
YARP - Yet Another Rust Port in README.md
2023-08-19 10:00:56 -07:00
Andrej d2a546c577 Merge pull request #319 from RahulSChand/warning
Give better error message in Tinystories data loader
2023-08-19 10:00:27 -07:00
rahulschand fbefeec1b1 add assert message to give better warning 2023-08-19 13:05:26 +05:30
rahoua 978c311b30 Add pecca-rs to README.md 2023-08-18 14:58:21 -07:00
YiMing Han 882e480bc0 update read me 2023-08-18 15:18:29 -04:00
YiMing Han d09ebbb32b Revert "working one"
This reverts commit 8607b11ea1.
2023-08-18 15:14:08 -04:00
YiMing Han bc7cb7d0e8 Revert "only dart"
This reverts commit 01df3731d6.
2023-08-18 15:13:59 -04:00
YiMing Han 01df3731d6 only dart 2023-08-18 15:09:24 -04:00
YiMing Han 8607b11ea1 working one 2023-08-18 15:07:41 -04:00
Andrej Karpathy bd182289c5 calculate the freq_cis online, no need to write/read them to/from checkpoints 2023-08-17 04:13:13 +00:00
Andrej b68a6d2ab5 Merge pull request #307 from madroidmaq/master
Jupter Notebook: Add run Meta's Llama 2 models
2023-08-16 20:09:32 -07:00
Andrej 57bf0e9ee4 Merge pull request #306 from rdentato/patch-utf8-no-validation
minimal protection against invalid UTF8 encoding.
2023-08-16 09:51:11 -07:00
madroid 9fbe96fc2e Jupter Notebook: Add run Meta's Llama 2 models 2023-08-16 20:27:28 +08:00
rdentato 55e60740f5 Added space to str_buffer in case max_token_length is 1. 2023-08-16 07:58:07 +00:00
rdentato befe4867b3 minimal protection against invalid UTF8 encoding. 2023-08-16 07:42:53 +00:00
Andrej df6557a10d Merge pull request #267 from krrishnarraj/master
Update readme for openmp on mac
2023-08-15 19:26:34 -07:00
Andrej Karpathy 65c899314c Merge branch 'Majdoddin-ci-tiny-model' 2023-08-16 02:22:26 +00:00
Andrej Karpathy 62a6d69d86 style changes and remove spurious runc test call at the bottom 2023-08-16 02:22:13 +00:00
Andrej Karpathy d47fc41b6a Merge branch 'ci-tiny-model' of https://github.com/Majdoddin/llama2.c into Majdoddin-ci-tiny-model 2023-08-16 02:20:34 +00:00
Andrej Karpathy ca67253f28 smallfix: not sure what the point of this indirection was 2023-08-15 16:09:33 +00:00
Andrej Karpathy 4c63c5608d shorten top comment on run.c file 2023-08-15 16:07:48 +00:00
Andrej Karpathy a47f9b3969 collapsing copy paste code because it's driving my ocd crazy 2023-08-15 16:03:11 +00:00
Ruhollah Majdoddin 87b11edf27 modifiying test_all so it can safely run on windows 2023-08-15 16:01:53 +00:00
Ruhollah Majdoddin 66c9f5e6c8 Adding pytest with the tiny model to macOS and windows (except amd64_arm64) runners 2023-08-15 15:58:04 +00:00
Andrej Karpathy 88eb238255 add tests into Makefile convenience 2023-08-15 15:57:27 +00:00
Andrej 600cedb33d Merge pull request #297 from karpathy/feature/utf8
Add UTF-8 support to prompts
2023-08-14 19:54:49 -07:00
Andrej Karpathy fe2de68688 fix sample.py from tokenizer changes before 2023-08-15 02:33:01 +00:00
Andrej Karpathy a9a0628c92 thoroughly commented the UTF-8 byte reading code 2023-08-15 02:18:49 +00:00
Andrej Karpathy d459fd4243 add back careful processing of the byte tokens 2023-08-15 01:42:33 +00:00
Andrej Karpathy 4bf36ecc17 get rid of the special byte decoding logic 2023-08-15 01:04:10 +00:00
Andrej Karpathy 8417cb438d Merge branch 'utf8' of https://github.com/atamurad/llama2.c into feature/utf8 2023-08-15 00:18:53 +00:00
Andrej Karpathy 94a3a5e0a5 Merge branch 'master' of github.com:karpathy/llama2.c 2023-08-14 14:52:15 +00:00
Andrej Karpathy 32c1ff97fb missed p->dim to kv_dim for k,v vectors. we're not doing anything wrong we're just being wasteful with memory. thanks @xefoci7612 for pointing out 2023-08-14 14:52:07 +00:00
Andrej 013e012b87 Merge pull request #286 from Nick-infinity/master
[Feat]: Add support for meta llama hf model conversion
2023-08-14 07:46:39 -07:00