443 Commits

Author SHA1 Message Date
rdentato 4444575c4e Added check of generation parameters. 2023-08-21 06:43:39 +00:00
Andrej Karpathy dd61b13e57 delete the save_torchscript export file, but copy its content to the new export.py for the future maybe 2023-08-21 05:09:06 +00:00
Andrej Karpathy ea44f53568 now that the export.py HF functionality is in master, we can delete this file, and update the readme 2023-08-21 04:58:19 +00:00
Andrej 801c68f5a1 Merge pull request #326 from atamurad/import_hf
Added huggingface model loader/importer to export.py
2023-08-20 21:53:17 -07:00
Andrej 74a68eeb35 Merge pull request #325 from HarryGifford/users/hegi/update-readme-threading
Update readme with suggestion on number of threads to use
2023-08-20 21:50:26 -07:00
Andrej Karpathy 288b3cec09 remove dagger in the eyeball 2023-08-21 04:47:49 +00:00
Andrej Karpathy 14275bd623 minor clean. i think a lot of chaos has been reduced for today. we shall now rest. 2023-08-21 04:43:24 +00:00
Andrej Karpathy 3868f732a4 and finally refactor the Sampler. things are starting to look a lot cleaner I think 2023-08-21 04:23:02 +00:00
Andrej Karpathy 8a377a1d31 refactor the Transformer (Config, Weights, RunState) into a single object, with build and free too 2023-08-21 03:55:12 +00:00
Andrej Karpathy ae2e4f8d88 name the tokenizer methods cleaner: encode and decode 2023-08-21 03:11:54 +00:00
atamyrat 0dd82158f6 removed transformers from requirements.txt, added error message 2023-08-21 06:07:29 +03:00
atamyrat 155475a523 Fix WQ and WK permutation in huggingface models 2023-08-21 05:16:11 +03:00
atamyrat d7704bdeaa mark ModelArgs.hidden_dim as optional and calculate as previously if not provided 2023-08-21 03:40:34 +03:00
atamyrat 09db52c69e Added huggingface model loader to export.py 2023-08-21 02:59:12 +03:00
Harry Gifford a72b3b0206 Update readme with suggestion on number of threads to use
Update the documentation to make suggestions on the number of threads. The performance difference can be very large. Also linked to the PyTorch docs which are relevant here.
2023-08-20 15:01:33 -07:00
Andrej Karpathy c74456f3f0 refactor step 1. the tokenizer, and all the other abstractions, are a total mess, refactoring things a bit 2023-08-20 18:18:23 +00:00
Andrej Karpathy 1e335a41cf remove freq_cis fields as they are not used anymore 2023-08-20 17:26:43 +00:00
Andrej Karpathy c0511de617 probindex should never have been part of RunState. i apologize for this failure of abstraction 2023-08-20 17:18:06 +00:00
Andrej 8c93c7a30e Merge pull request #322 from karpathy/feature/export
New model export (the code remains "dead" and legacy version is still the default behavior, so no breaking changes are introduced). The major benefit is a new export.py file, which we can use to centralize work on formatting: both imports and exports.
2023-08-20 10:08:32 -07:00
Andrej Karpathy 13dcee493a todos update 2023-08-20 17:02:22 +00:00
Andrej Karpathy f3db92a2dc use out_file.tell() instead of nbytes += arithmetic 2023-08-20 16:51:35 +00:00
Andrej Karpathy fa8dfd854e isolate read_checkpoint, because i'd like to now make it support both version 0 and version 1 2023-08-19 19:21:12 +00:00
Andrej Karpathy 4df5e2e939 make version 1 be the legacy export but with new header. version 2 will be Q8_0 export 2023-08-19 18:51:32 +00:00
Andrej Karpathy 4212bd6d43 oops fix double indent on quantize def 2023-08-19 18:34:49 +00:00
Andrej Karpathy 7f551dbfd7 new model export: versions 0 (legacy) and 1 2023-08-19 18:25:20 +00:00
Andrej 6c5d78fa41 Merge pull request #317 from yiminghan/yhan/old
Add a link to Dart port in README
2023-08-19 10:01:08 -07:00
Andrej db1a722816 Merge pull request #318 from rahoua/master
YARP - Yet Another Rust Port in README.md
2023-08-19 10:00:56 -07:00
Andrej d2a546c577 Merge pull request #319 from RahulSChand/warning
Give better error message in Tinystories data loader
2023-08-19 10:00:27 -07:00
rahulschand fbefeec1b1 add assert message to give better warning 2023-08-19 13:05:26 +05:30
rahoua 978c311b30 Add pecca-rs to README.md 2023-08-18 14:58:21 -07:00
YiMing Han 882e480bc0 update read me 2023-08-18 15:18:29 -04:00
YiMing Han d09ebbb32b Revert "working one"
This reverts commit 8607b11ea1.
2023-08-18 15:14:08 -04:00
YiMing Han bc7cb7d0e8 Revert "only dart"
This reverts commit 01df3731d6.
2023-08-18 15:13:59 -04:00
YiMing Han 01df3731d6 only dart 2023-08-18 15:09:24 -04:00
YiMing Han 8607b11ea1 working one 2023-08-18 15:07:41 -04:00
David A Roberts 52fe3653e5 Fix vocab_source in sample.py 2023-08-18 18:40:25 +10:00
Andrej Karpathy bd182289c5 calculate the freq_cis online, no need to write/read them to/from checkpoints 2023-08-17 04:13:13 +00:00
Andrej b68a6d2ab5 Merge pull request #307 from madroidmaq/master
Jupter Notebook: Add run Meta's Llama 2 models
2023-08-16 20:09:32 -07:00
Andrej 57bf0e9ee4 Merge pull request #306 from rdentato/patch-utf8-no-validation
minimal protection against invalid UTF8 encoding.
2023-08-16 09:51:11 -07:00
madroid 9fbe96fc2e Jupter Notebook: Add run Meta's Llama 2 models 2023-08-16 20:27:28 +08:00
rdentato 55e60740f5 Added space to str_buffer in case max_token_length is 1. 2023-08-16 07:58:07 +00:00
rdentato befe4867b3 minimal protection against invalid UTF8 encoding. 2023-08-16 07:42:53 +00:00
Andrej df6557a10d Merge pull request #267 from krrishnarraj/master
Update readme for openmp on mac
2023-08-15 19:26:34 -07:00
Andrej Karpathy 65c899314c Merge branch 'Majdoddin-ci-tiny-model' 2023-08-16 02:22:26 +00:00
Andrej Karpathy 62a6d69d86 style changes and remove spurious runc test call at the bottom 2023-08-16 02:22:13 +00:00
Andrej Karpathy d47fc41b6a Merge branch 'ci-tiny-model' of https://github.com/Majdoddin/llama2.c into Majdoddin-ci-tiny-model 2023-08-16 02:20:34 +00:00
Andrej Karpathy ca67253f28 smallfix: not sure what the point of this indirection was 2023-08-15 16:09:33 +00:00
Andrej Karpathy 4c63c5608d shorten top comment on run.c file 2023-08-15 16:07:48 +00:00
Andrej Karpathy a47f9b3969 collapsing copy paste code because it's driving my ocd crazy 2023-08-15 16:03:11 +00:00
Ruhollah Majdoddin 87b11edf27 modifiying test_all so it can safely run on windows 2023-08-15 16:01:53 +00:00