517763346d
HF checkpoints: I removed the optimizer to save space; init Adam without the first/second moments is ok
Andrej Karpathy
2023-07-27 22:20:07 +00:00
747db60562
Merge pull request #133 from nikolaydubina/patch-1
Andrej
2023-07-27 15:08:21 -07:00
6b3a689d96
Merge pull request #146 from admu-progvar/master
Andrej
2023-07-27 15:07:58 -07:00
b63cb91303
Add llama2.cpp to notable forks section
Franz Louis Cesista
2023-07-28 05:06:37 +08:00
459b9c8561
Merge branch 'master' into patch-1
Nikolay Dubina
2023-07-28 01:19:10 +08:00
cc66a2037e
Merge pull request #86 from tairov/master
Andrej
2023-07-27 08:59:00 -07:00
b6d63a973e
Merge branch 'tairov-win-timing'
Andrej Karpathy
2023-07-27 15:43:58 +00:00
6ce91b1b3b
Fixed time_in_ms() compile time error (termux and neoterm)
Emma Eva
2023-07-25 12:12:40 +06:00
98ec4ba23d
Update README.md
Andrej
2023-07-24 22:54:54 -07:00
81c90bfcb7
Update README.md: small tweaks
Andrej
2023-07-24 22:51:39 -07:00
cf625ecd7e
Update README.md
Andrej
2023-07-24 21:25:31 -07:00
c3e0d73bd2
we can inference Meta's Llama 2 7B, yay
Andrej Karpathy
2023-07-25 04:21:07 +00:00
133ad3ffff
Merge pull request #50 from karpathy/memmap
Andrej
2023-07-24 18:59:29 -07:00
a1f6b4653e
merge conflict resolve with imports
Andrej Karpathy
2023-07-25 01:58:46 +00:00
d18e9efd77
Merge pull request #48 from richinseattle/richinseattle-patch-1
Andrej
2023-07-24 16:37:37 -07:00
b2857c6af2
Switch to using timespec_get() for cross OS compatibility
richinseattle
2023-07-24 16:31:38 -07:00
f121f5f0c5
Merge branch 'karpathy:master' into richinseattle-patch-1
richinseattle
2023-07-24 16:30:07 -07:00
cae88dfbab
tune readme around timings etc
Andrej Karpathy
2023-07-24 23:27:48 +00:00
496466f78f
add rundebug to makefile, useful for spotting issues and such
Andrej Karpathy
2023-07-24 23:13:59 +00:00
e6e3f1322b
candidate memmap implementation
Andrej Karpathy
2023-07-24 22:54:49 +00:00
2be7d7887b
MSVC Compatibility fix for timer
richinseattle
2023-07-24 15:22:20 -07:00
16edfe6364
add a simple makefile
Andrej Karpathy
2023-07-24 21:50:04 +00:00
bf9f6f2ece
Add discord link to Readme
Andrej
2023-07-24 14:22:29 -07:00
669b75ddc8
Merge pull request #43 from krzysztof-jusiak/rmsnorm
Andrej
2023-07-24 14:13:49 -07:00
687473c009
Update README.md with TinyStories model series
Andrej
2023-07-24 14:11:27 -07:00
791be9d991
tweak argparse. fix steps=256, even if some models may support longer maximum seq_len. get rid of seed option for now, use temp=0.0 for deterministic behavior
Andrej Karpathy
2023-07-24 20:59:32 +00:00