richinseattle
539dc73196
fix whitespace
2023-07-26 22:12:32 -07:00
richinseattle
7f7a3b2d56
update openmp pragmas for MSVC compatibility
...
This has no negative impact on Linux and is in preparation for Windows support. Windows builds will not work without additional timer and mmap compatibility patches.
2023-07-26 22:06:23 -07:00
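The commit above does not show the pragma change itself. A minimal sketch of one common MSVC-friendly form follows, assuming the fix is to declare the loop index as a signed int before the pragma and mark it private. MSVC's OpenMP support has historically targeted OpenMP 2.0, which wants a signed integer loop index. The function name and signature here are illustrative, not copied from the repository.

```c
/* xout (d) = W (d x n, row-major) times x (n).
   Illustrative sketch only; the real change in the commit may differ. */
void matmul(float *xout, const float *x, const float *w, int n, int d) {
    int i; /* signed and declared outside the loop: the assumed MSVC-friendly form */
    #pragma omp parallel for private(i)
    for (i = 0; i < d; i++) {
        float val = 0.0f;
        int j; /* declared inside the parallel loop body, so private per thread */
        for (j = 0; j < n; j++) {
            val += w[i * n + j] * x[j];
        }
        xout[i] = val;
    }
}
```

Built without OpenMP enabled, the pragma is ignored and the loop runs serially, which is consistent with the note that Linux builds are unaffected.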
Andrej Karpathy
5f681b64b1
oops missed a section somehow, updating readme
2023-07-27 03:01:48 +00:00
Andrej Karpathy
c2bbe9c6fb
link to the huggingface hub models instead
2023-07-27 00:14:23 +00:00
Andrej Karpathy
7a4ca4a98b
add contributing section to readme, and also notable forks section
2023-07-26 23:58:49 +00:00
Andrej
4085e8971f
Merge pull request #119 from kroggen/code-comments
...
add some code comments
2023-07-26 15:50:01 -07:00
Bernardo Ramos
57034480b6
add some code comments
2023-07-26 19:48:14 -03:00
Andrej Karpathy
f0f43b7288
small note on training times
2023-07-26 22:12:50 +00:00
Andrej Karpathy
2711ae8c32
make compiler tunable in Makefile, i think potentially nice and useful
2023-07-26 16:40:40 +00:00
Andrej
7059d7dba9
Update README.md
2023-07-26 09:06:08 -07:00
Andrej
7496ea8108
Update README.md
2023-07-26 08:59:42 -07:00
Andrej
f5d8797af2
Update README.md
2023-07-26 08:59:12 -07:00
Andrej Karpathy
3aedfe59f1
Merge branch 'aegkmq-master'
2023-07-26 15:43:06 +00:00
aegkmq
8986005f23
Minor cleanup
2023-07-26 16:57:08 +09:00
aegkmq
36c522a0d8
Improve locality
2023-07-26 13:24:27 +09:00
Andrej Karpathy
f5650891d5
honestly at this point this is a lot more my nanogpt code than llama code
2023-07-25 23:57:03 +00:00
Andrej
7f9f5ca853
Update README.md: new llama model export
2023-07-25 16:30:28 -07:00
Andrej
5bcd19a204
Merge pull request #85 from python273/export-llama-without-llama
...
Export llama without llama
2023-07-25 16:23:56 -07:00
Andrej
614bf91e5d
Merge pull request #60 from emma-eva/patch-1
...
Fixed time_in_ms() compile time error (termux and neoterm)
2023-07-25 16:06:41 -07:00
Andrej
366711acf8
Merge pull request #77 from madroidmaq/master
...
Update README.md: format output samples
2023-07-25 16:01:55 -07:00
python273
4d1fa2f2c6
Export llama without llama
2023-07-26 01:32:00 +04:00
madroid
ac22fbce7e
Update README.md: format output samples
2023-07-26 00:46:14 +08:00
Andrej
6cf34d610a
Update README.md
2023-07-25 08:14:48 -07:00
Andrej Karpathy
34ccb64ed8
fix typo in readme after adding the 110m model
2023-07-25 15:02:11 +00:00
Andrej Karpathy
94730f1766
add the 110m model, as it finished training
2023-07-25 15:00:57 +00:00
Andrej Karpathy
05ee4cbf38
fix bug in timing - use steps not max seq len doh
2023-07-25 14:21:37 +00:00
Andrej
d359fae505
Merge pull request #69 from RichardScottOZ/patch-1
...
intimately
2023-07-25 07:04:17 -07:00
RichardScottOZ
f3a1e227fe
intimately
2023-07-25 21:26:30 +09:30
Emma Eva
6ce91b1b3b
Fixed time_in_ms() compile time error (termux and neoterm)
...
clang version 16.0.4
2023-07-25 12:12:40 +06:00
Andrej
98ec4ba23d
Update README.md
2023-07-24 22:54:54 -07:00
Andrej
81c90bfcb7
Update README.md: small tweaks
2023-07-24 22:51:39 -07:00
Andrej
cf625ecd7e
Update README.md
2023-07-24 21:25:31 -07:00
Andrej Karpathy
c3e0d73bd2
we can inference Meta's Llama 2 7B, yay
2023-07-25 04:21:07 +00:00
Andrej
133ad3ffff
Merge pull request #50 from karpathy/memmap
...
candidate memmap implementation
2023-07-24 18:59:29 -07:00
Andrej Karpathy
a1f6b4653e
merge conflict resolve with imports
2023-07-25 01:58:46 +00:00
Andrej
d18e9efd77
Merge pull request #48 from richinseattle/richinseattle-patch-1
...
MSVC Compatibility fix for timer
2023-07-24 16:37:37 -07:00
richinseattle
b2857c6af2
Switch to using timespec_get() for cross OS compatibility
2023-07-24 16:31:38 -07:00
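A minimal sketch of a millisecond timer built on the C11 function timespec_get(), as the commit above describes. The helper name time_in_ms matches the one mentioned elsewhere in this log, but the body and the long long return type are assumptions for illustration, not the committed code.

```c
#include <time.h>

/* Wall-clock time in milliseconds since the epoch, using C11 timespec_get().
   Sketch under stated assumptions; the repository's version may differ. */
long long time_in_ms(void) {
    struct timespec ts;
    if (timespec_get(&ts, TIME_UTC) != TIME_UTC) {
        return -1; /* timespec_get returns 0 on failure */
    }
    return (long long)ts.tv_sec * 1000LL + (long long)ts.tv_nsec / 1000000LL;
}
```

Taking one reading before generation and one after, then subtracting, gives elapsed wall-clock time without relying on the POSIX-only gettimeofday().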
richinseattle
f121f5f0c5
Merge branch 'karpathy:master' into richinseattle-patch-1
2023-07-24 16:30:07 -07:00
Andrej Karpathy
cae88dfbab
tune readme around timings etc
2023-07-24 23:27:48 +00:00
Andrej Karpathy
496466f78f
add rundebug to makefile, useful for spotting issues and such
2023-07-24 23:13:59 +00:00
Andrej Karpathy
e6e3f1322b
candidate memmap implementation
2023-07-24 22:54:49 +00:00
richinseattle
2be7d7887b
MSVC Compatibility fix for timer
...
use clock() instead of gettimeofday() for cross-platform compatibility
2023-07-24 15:22:20 -07:00
Andrej Karpathy
16edfe6364
add a simple makefile
2023-07-24 21:50:04 +00:00
Andrej
bf9f6f2ece
Add discord link to Readme
2023-07-24 14:22:29 -07:00
Andrej
669b75ddc8
Merge pull request #43 from krzysztof-jusiak/rmsnorm
...
Speed up rmsnorm by using sqrtf/expf
2023-07-24 14:13:49 -07:00
Andrej
687473c009
Update README.md with TinyStories model series
2023-07-24 14:11:27 -07:00
Andrej Karpathy
791be9d991
tweak argparse. fix steps=256, even if some models may support longer maximum seq_len. get rid of seed option for now, use temp=0.0 for deterministic behavior
2023-07-24 20:59:32 +00:00
Andrej Karpathy
90ae37c3e6
Merge branch 'admu-progvar-master'
2023-07-24 20:39:40 +00:00
Kris Jusiak
c9b1f10124
Speed up rmsnorm by using sqrtf/expf
...
Problem:
- exp and sqrt use double precision, which is not required here.
Solution:
- Use expf and sqrtf instead.
Notes:
- Switching to single precision does not seem to affect the result.
Results: ~10% improvement
- before: 940 tok/s
- after: 1020 tok/s
2023-07-24 13:06:27 -05:00
Franz Louis Cesista
c9ad067c5d
parallelize multi-head attention
2023-07-25 01:10:12 +08:00