Andrej
7496ea8108
Update README.md
2023-07-26 08:59:42 -07:00
Andrej
f5d8797af2
Update README.md
2023-07-26 08:59:12 -07:00
Andrej Karpathy
3aedfe59f1
Merge branch 'aegkmq-master'
2023-07-26 15:43:06 +00:00
aegkmq
8986005f23
Minor cleanup
2023-07-26 16:57:08 +09:00
aegkmq
36c522a0d8
Improve locality
2023-07-26 13:24:27 +09:00
Andrej Karpathy
f5650891d5
honestly at this point this is a lot more my nanogpt code than llama code
2023-07-25 23:57:03 +00:00
Andrej
7f9f5ca853
Update README.md: new llama model export
2023-07-25 16:30:28 -07:00
Andrej
5bcd19a204
Merge pull request #85 from python273/export-llama-without-llama
Export llama without llama
2023-07-25 16:23:56 -07:00
Andrej
614bf91e5d
Merge pull request #60 from emma-eva/patch-1
Fixed time_in_ms() compile time error (termux and neoterm)
2023-07-25 16:06:41 -07:00
Andrej
366711acf8
Merge pull request #77 from madroidmaq/master
Update README.md: formate output samples
2023-07-25 16:01:55 -07:00
python273
4d1fa2f2c6
Export llama without llama
2023-07-26 01:32:00 +04:00
madroid
ac22fbce7e
Update README.md: formate output samples
2023-07-26 00:46:14 +08:00
Andrej
6cf34d610a
Update README.md
2023-07-25 08:14:48 -07:00
Andrej Karpathy
34ccb64ed8
fix typo in readme after adding the 110m model
2023-07-25 15:02:11 +00:00
Andrej Karpathy
94730f1766
add the 110m model, as it finished training
2023-07-25 15:00:57 +00:00
Andrej Karpathy
05ee4cbf38
fix bug in timing - use steps not max seq len doh
2023-07-25 14:21:37 +00:00
Andrej
d359fae505
Merge pull request #69 from RichardScottOZ/patch-1
intimately
2023-07-25 07:04:17 -07:00
RichardScottOZ
f3a1e227fe
intimately
2023-07-25 21:26:30 +09:30
Emma Eva
6ce91b1b3b
Fixed time_in_ms() compile time error (termux and neoterm)
clang version 16.0.4
2023-07-25 12:12:40 +06:00
Andrej
98ec4ba23d
Update README.md
2023-07-24 22:54:54 -07:00
Andrej
81c90bfcb7
Update README.md: small tweaks
2023-07-24 22:51:39 -07:00
Andrej
cf625ecd7e
Update README.md
2023-07-24 21:25:31 -07:00
Andrej Karpathy
c3e0d73bd2
we can inference Meta's Llama 2 7B, yay
2023-07-25 04:21:07 +00:00
Andrej
133ad3ffff
Merge pull request #50 from karpathy/memmap
candidate memmap implementation
2023-07-24 18:59:29 -07:00
Andrej Karpathy
a1f6b4653e
merge conflict resolve with imports
2023-07-25 01:58:46 +00:00
Andrej
d18e9efd77
Merge pull request #48 from richinseattle/richinseattle-patch-1
MSVC Compatibility fix for timer
2023-07-24 16:37:37 -07:00
richinseattle
b2857c6af2
Switch to using timespec_get() for cross OS compatibility
2023-07-24 16:31:38 -07:00
richinseattle
f121f5f0c5
Merge branch 'karpathy:master' into richinseattle-patch-1
2023-07-24 16:30:07 -07:00
Andrej Karpathy
cae88dfbab
tune readme around timings etc
2023-07-24 23:27:48 +00:00
Andrej Karpathy
496466f78f
add rundebug to makefile, useful for spotting issues and such
2023-07-24 23:13:59 +00:00
Andrej Karpathy
e6e3f1322b
candidate memmap implementation
2023-07-24 22:54:49 +00:00
richinseattle
2be7d7887b
MSVC Compatibility fix for timer
use clock() instead of gettimeofday() for cross-platform compatibility
2023-07-24 15:22:20 -07:00
Andrej Karpathy
16edfe6364
add a simple makefile
2023-07-24 21:50:04 +00:00
Andrej
bf9f6f2ece
Add discord link to Readme
2023-07-24 14:22:29 -07:00
Andrej
669b75ddc8
Merge pull request #43 from krzysztof-jusiak/rmsnorm
Speed up rmsnorm by using sqrtf/expf
2023-07-24 14:13:49 -07:00
Andrej
687473c009
Update README.md with TinyStories model series
2023-07-24 14:11:27 -07:00
Andrej Karpathy
791be9d991
tweak argparse. fix steps=256, even if some models may support longer maximum seq_len. get rid of seed option for now, use temp=0.0 for deterministic behavior
2023-07-24 20:59:32 +00:00
Andrej Karpathy
90ae37c3e6
git push origin masterMerge branch 'admu-progvar-master'
2023-07-24 20:39:40 +00:00
Kris Jusiak
c9b1f10124
Speed up rmsnorm by using sqrtf/expf
Problem:
- exp and sqrt use double precision for operations, which is not required.
Solution:
- Use expf and sqrtf instead.
Notes:
- Although this uses single precision, it doesn't seem to affect the result.
Results: ~10% improvement
- before: 940 tok/s
- after: 1020 tok/s
2023-07-24 13:06:27 -05:00
Franz Louis Cesista
c9ad067c5d
parallelize multi-head attention
2023-07-25 01:10:12 +08:00
Andrej Karpathy
50a086edde
add warning about fastmath
2023-07-24 15:18:04 +00:00
Andrej Karpathy
fff00ffd07
ack to lambda
2023-07-24 14:31:52 +00:00
Andrej
d0ddf94cc3
Merge pull request #36 from hu-po/patch-1
typo
2023-07-24 07:27:36 -07:00
Andrej
228c4ea3ea
Merge pull request #28 from SlyEcho/master
Fix tokenizer reading on Windows
2023-07-24 07:23:07 -07:00
Andrej Karpathy
624cdfc76a
add dropout support to model
2023-07-24 14:18:50 +00:00
Andrej
cdfb49208a
Merge pull request #37 from awgu/pt2
Have DDP ignore `freqs_cis` to avoid broadcast
2023-07-24 07:15:40 -07:00
Andrej Karpathy
9055766cf6
docs on how to run with openmp
2023-07-24 14:08:06 +00:00
Andrej Karpathy
cbbe4301b0
Merge branch 'krzysztof-jusiak-openmp'
2023-07-24 14:02:28 +00:00
Andrew Gu
25494f9cbc
Have DDP ignore freqs_cis to avoid broadcast
2023-07-24 13:58:09 +00:00
hu-po
d95c7617c6
typo
2023-07-24 07:35:12 -05:00