rdentato
4444575c4e
Added check of generation parameters.
2023-08-21 06:43:39 +00:00
Andrej Karpathy
dd61b13e57
delete the save_torchscript export file, but copy its content to the new export.py for the future maybe
2023-08-21 05:09:06 +00:00
Andrej Karpathy
ea44f53568
now that the export.py HF functionality is in master, we can delete this file, and update the readme
2023-08-21 04:58:19 +00:00
Andrej
801c68f5a1
Merge pull request #326 from atamurad/import_hf
Added huggingface model loader/importer to export.py
2023-08-20 21:53:17 -07:00
Andrej
74a68eeb35
Merge pull request #325 from HarryGifford/users/hegi/update-readme-threading
Update readme with suggestion on number of threads to use
2023-08-20 21:50:26 -07:00
Andrej Karpathy
288b3cec09
remove dagger in the eyeball
2023-08-21 04:47:49 +00:00
Andrej Karpathy
14275bd623
minor clean. i think a lot of chaos has been reduced for today. we shall now rest.
2023-08-21 04:43:24 +00:00
Andrej Karpathy
3868f732a4
and finally refactor the Sampler. things are starting to look a lot cleaner I think
2023-08-21 04:23:02 +00:00
Andrej Karpathy
8a377a1d31
refactor the Transformer (Config, Weights, RunState) into a single object, with build and free too
2023-08-21 03:55:12 +00:00
Andrej Karpathy
ae2e4f8d88
name the tokenizer methods cleaner: encode and decode
2023-08-21 03:11:54 +00:00
atamyrat
0dd82158f6
removed transformers from requirements.txt, added error message
2023-08-21 06:07:29 +03:00
atamyrat
155475a523
Fix WQ and WK permutation in huggingface models
2023-08-21 05:16:11 +03:00
atamyrat
d7704bdeaa
mark ModelArgs.hidden_dim as optional and calculate as previously if not provided
2023-08-21 03:40:34 +03:00
atamyrat
09db52c69e
Added huggingface model loader to export.py
2023-08-21 02:59:12 +03:00
Harry Gifford
a72b3b0206
Update readme with suggestion on number of threads to use
Update the documentation to make suggestions on the number of threads. The performance difference can be very large. Also linked to the PyTorch docs which are relevant here.
2023-08-20 15:01:33 -07:00
Andrej Karpathy
c74456f3f0
refactor step 1. the tokenizer, and all the other abstractions, are a total mess, refactoring things a bit
2023-08-20 18:18:23 +00:00
Andrej Karpathy
1e335a41cf
remove freq_cis fields as they are not used anymore
2023-08-20 17:26:43 +00:00
Andrej Karpathy
c0511de617
probindex should never have been part of RunState. i apologize for this failure of abstraction
2023-08-20 17:18:06 +00:00
Andrej
8c93c7a30e
Merge pull request #322 from karpathy/feature/export
New model export (the code remains "dead" and legacy version is still the default behavior, so no breaking changes are introduced). The major benefit is a new export.py file, which we can use to centralize work on formatting: both imports and exports.
2023-08-20 10:08:32 -07:00
Andrej Karpathy
13dcee493a
todos update
2023-08-20 17:02:22 +00:00
Andrej Karpathy
f3db92a2dc
use out_file.tell() instead of nbytes += arithmetic
2023-08-20 16:51:35 +00:00
Andrej Karpathy
fa8dfd854e
isolate read_checkpoint, because i'd like to now make it support both version 0 and version 1
2023-08-19 19:21:12 +00:00
Andrej Karpathy
4df5e2e939
make version 1 be the legacy export but with new header. version 2 will be Q8_0 export
2023-08-19 18:51:32 +00:00
Andrej Karpathy
4212bd6d43
oops fix double indent on quantize def
2023-08-19 18:34:49 +00:00
Andrej Karpathy
7f551dbfd7
new model export: versions 0 (legacy) and 1
2023-08-19 18:25:20 +00:00
Andrej
6c5d78fa41
Merge pull request #317 from yiminghan/yhan/old
Add a link to Dart port in README
2023-08-19 10:01:08 -07:00
Andrej
db1a722816
Merge pull request #318 from rahoua/master
YARP - Yet Another Rust Port in README.md
2023-08-19 10:00:56 -07:00
Andrej
d2a546c577
Merge pull request #319 from RahulSChand/warning
Give better error message in Tinystories data loader
2023-08-19 10:00:27 -07:00
rahulschand
fbefeec1b1
add assert message to give better warning
2023-08-19 13:05:26 +05:30
rahoua
978c311b30
Add pecca-rs to README.md
2023-08-18 14:58:21 -07:00
YiMing Han
882e480bc0
update read me
2023-08-18 15:18:29 -04:00
YiMing Han
d09ebbb32b
Revert "working one"
This reverts commit 8607b11ea1.
2023-08-18 15:14:08 -04:00
YiMing Han
bc7cb7d0e8
Revert "only dart"
This reverts commit 01df3731d6.
2023-08-18 15:13:59 -04:00
YiMing Han
01df3731d6
only dart
2023-08-18 15:09:24 -04:00
YiMing Han
8607b11ea1
working one
2023-08-18 15:07:41 -04:00
David A Roberts
52fe3653e5
Fix vocab_source in sample.py
2023-08-18 18:40:25 +10:00
Andrej Karpathy
bd182289c5
calculate the freq_cis online, no need to write/read them to/from checkpoints
2023-08-17 04:13:13 +00:00
Andrej
b68a6d2ab5
Merge pull request #307 from madroidmaq/master
Jupyter Notebook: Add run Meta's Llama 2 models
2023-08-16 20:09:32 -07:00
Andrej
57bf0e9ee4
Merge pull request #306 from rdentato/patch-utf8-no-validation
minimal protection against invalid UTF8 encoding.
2023-08-16 09:51:11 -07:00
madroid
9fbe96fc2e
Jupyter Notebook: Add run Meta's Llama 2 models
2023-08-16 20:27:28 +08:00
rdentato
55e60740f5
Added space to str_buffer in case max_token_length is 1.
2023-08-16 07:58:07 +00:00
rdentato
befe4867b3
minimal protection against invalid UTF8 encoding.
2023-08-16 07:42:53 +00:00
Andrej
df6557a10d
Merge pull request #267 from krrishnarraj/master
Update readme for openmp on mac
2023-08-15 19:26:34 -07:00
Andrej Karpathy
65c899314c
Merge branch 'Majdoddin-ci-tiny-model'
2023-08-16 02:22:26 +00:00
Andrej Karpathy
62a6d69d86
style changes and remove spurious runc test call at the bottom
2023-08-16 02:22:13 +00:00
Andrej Karpathy
d47fc41b6a
Merge branch 'ci-tiny-model' of https://github.com/Majdoddin/llama2.c into Majdoddin-ci-tiny-model
2023-08-16 02:20:34 +00:00
Andrej Karpathy
ca67253f28
smallfix: not sure what the point of this indirection was
2023-08-15 16:09:33 +00:00
Andrej Karpathy
4c63c5608d
shorten top comment on run.c file
2023-08-15 16:07:48 +00:00
Andrej Karpathy
a47f9b3969
collapsing copy paste code because it's driving my ocd crazy
2023-08-15 16:03:11 +00:00
Ruhollah Majdoddin
87b11edf27
modifying test_all so it can safely run on windows
2023-08-15 16:01:53 +00:00