Commit Graph

107 Commits

Author SHA1 Message Date
richinseattle 539dc73196 fix whitespace 2023-07-26 22:12:32 -07:00
richinseattle 7f7a3b2d56 update openmp pragmas for MSVC compatibility
This has no negative impact on Linux and is in preparation for windows support. Windows compiles will not work without additional timer and mmap compatibility patches
2023-07-26 22:06:23 -07:00
Andrej Karpathy 5f681b64b1 oops missed a section somehow, updating readme 2023-07-27 03:01:48 +00:00
Andrej Karpathy c2bbe9c6fb link to the huggingface hub models instead 2023-07-27 00:14:23 +00:00
Andrej Karpathy 7a4ca4a98b add contributing section to readme, and also notable forks section 2023-07-26 23:58:49 +00:00
Andrej 4085e8971f Merge pull request #119 from kroggen/code-comments
add some code comments
2023-07-26 15:50:01 -07:00
Bernardo Ramos 57034480b6 add some code comments 2023-07-26 19:48:14 -03:00
Andrej Karpathy f0f43b7288 small note on training times 2023-07-26 22:12:50 +00:00
Andrej Karpathy 2711ae8c32 make compiler tunable in Makefile, i think potentially nice and useful 2023-07-26 16:40:40 +00:00
Andrej 7059d7dba9 Update README.md 2023-07-26 09:06:08 -07:00
Andrej 7496ea8108 Update README.md 2023-07-26 08:59:42 -07:00
Andrej f5d8797af2 Update README.md 2023-07-26 08:59:12 -07:00
Andrej Karpathy 3aedfe59f1 Merge branch 'aegkmq-master' 2023-07-26 15:43:06 +00:00
aegkmq 8986005f23 Minor cleanup 2023-07-26 16:57:08 +09:00
aegkmq 36c522a0d8 Improve locality 2023-07-26 13:24:27 +09:00
Andrej Karpathy f5650891d5 honestly at this point this is a lot more my nanogpt code than llama code 2023-07-25 23:57:03 +00:00
Andrej 7f9f5ca853 Update README.md: new llama model export 2023-07-25 16:30:28 -07:00
Andrej 5bcd19a204 Merge pull request #85 from python273/export-llama-without-llama
Export llama without llama
2023-07-25 16:23:56 -07:00
Andrej 614bf91e5d Merge pull request #60 from emma-eva/patch-1
Fixed time_in_ms() compile time error (termux and neoterm)
2023-07-25 16:06:41 -07:00
Andrej 366711acf8 Merge pull request #77 from madroidmaq/master
Update README.md: formate output samples
2023-07-25 16:01:55 -07:00
python273 4d1fa2f2c6 Export llama without llama 2023-07-26 01:32:00 +04:00
madroid ac22fbce7e Update README.md: formate output samples 2023-07-26 00:46:14 +08:00
Andrej 6cf34d610a Update README.md 2023-07-25 08:14:48 -07:00
Andrej Karpathy 34ccb64ed8 fix typo in readme after adding the 110m model 2023-07-25 15:02:11 +00:00
Andrej Karpathy 94730f1766 add the 110m model, as it finished training 2023-07-25 15:00:57 +00:00
Andrej Karpathy 05ee4cbf38 fix bug in timing - use steps not max seq len doh 2023-07-25 14:21:37 +00:00
Andrej d359fae505 Merge pull request #69 from RichardScottOZ/patch-1
intimately
2023-07-25 07:04:17 -07:00
RichardScottOZ f3a1e227fe intimately 2023-07-25 21:26:30 +09:30
Emma Eva 6ce91b1b3b Fixed time_in_ms() compile time error (termux and neoterm)
clang version 16.0.4
2023-07-25 12:12:40 +06:00
Andrej 98ec4ba23d Update README.md 2023-07-24 22:54:54 -07:00
Andrej 81c90bfcb7 Update README.md: small tweaks 2023-07-24 22:51:39 -07:00
Andrej cf625ecd7e Update README.md 2023-07-24 21:25:31 -07:00
Andrej Karpathy c3e0d73bd2 we can inference Meta's Llama 2 7B, yay 2023-07-25 04:21:07 +00:00
Andrej 133ad3ffff Merge pull request #50 from karpathy/memmap
candidate memmap implementation
2023-07-24 18:59:29 -07:00
Andrej Karpathy a1f6b4653e merge conflict resolve with imports 2023-07-25 01:58:46 +00:00
Andrej d18e9efd77 Merge pull request #48 from richinseattle/richinseattle-patch-1
MSVC Compatibility fix for timer
2023-07-24 16:37:37 -07:00
richinseattle b2857c6af2 Switch to using timespec_get() for cross OS compatibility 2023-07-24 16:31:38 -07:00
richinseattle f121f5f0c5 Merge branch 'karpathy:master' into richinseattle-patch-1 2023-07-24 16:30:07 -07:00
Andrej Karpathy cae88dfbab tune readme around timings etc 2023-07-24 23:27:48 +00:00
Andrej Karpathy 496466f78f add rundebug to makefile, useful for spotting issues and such 2023-07-24 23:13:59 +00:00
Andrej Karpathy e6e3f1322b candidate memmap implementation 2023-07-24 22:54:49 +00:00
richinseattle 2be7d7887b MSVC Compatibility fix for timer
use clock() instead of gettimeofday() for cross-platform compatibility
2023-07-24 15:22:20 -07:00
Andrej Karpathy 16edfe6364 add a simple makefile 2023-07-24 21:50:04 +00:00
Andrej bf9f6f2ece Add discord link to Readme 2023-07-24 14:22:29 -07:00
Andrej 669b75ddc8 Merge pull request #43 from krzysztof-jusiak/rmsnorm
Speed up rmsnorm by using sqrtf/expf
2023-07-24 14:13:49 -07:00
Andrej 687473c009 Update README.md with TinyStories model series 2023-07-24 14:11:27 -07:00
Andrej Karpathy 791be9d991 tweak argparse. fix steps=256, even if some models may support longer maximum seq_len. get rid of seed option for now, use temp=0.0 for deterministic behavior 2023-07-24 20:59:32 +00:00
Andrej Karpathy 90ae37c3e6 git push origin masterMerge branch 'admu-progvar-master' 2023-07-24 20:39:40 +00:00
Kris Jusiak c9b1f10124 Speed up rmsnorm by using sqrtf/expf
Problem:
- exp and sqrt are using double precision for operations which is not
  required.

Solution:
- Use expf and sqrtf instead.

Notes:
- Although it's using single precision doesn't seem to affect the
  result.

Results: ~ 10% improvement
  - before:  940 tok/s
  - after:  1020 tok/s
2023-07-24 13:06:27 -05:00
Franz Louis Cesista c9ad067c5d parallelize multi-head attention 2023-07-25 01:10:12 +08:00