Andrej Karpathy | e03d7ecf12 | Merge branch 'mpcusack/jitsave' of https://github.com/mpcusack/llama2.c into mpcusack-mpcusack/jitsave | 2023-08-05 18:11:21 +00:00 |
Andrej Karpathy | 837796e0b7 | get rid of unneeded comment now | 2023-08-05 16:19:27 +00:00 |
Michael Cusack | f8d45f180d | Reinline loss function | 2023-08-04 17:21:29 +07:00 |
Michael Cusack | 11a8348dfc | extra line | 2023-08-04 16:52:04 +07:00 |
Michael Cusack | f2e34e6b0a | Resolve jit.save errors | 2023-08-04 16:49:26 +07:00 |
rahulschand | 02cf3c7311 | Small changes to ROPE & comments | 2023-08-03 20:13:50 +05:30 |
aidoge | 883cda1a2c | fix freq_cos, freq_sin serialize | 2023-08-01 16:31:43 +08:00 |
aidoge | 36bf904c18 | Refactor freqs_cis into freqs_cos and freqs_sin, and remove complex64 for ONNX export compatibility | 2023-07-26 14:23:25 +08:00 |
Andrej Karpathy | f5650891d5 | honestly at this point this is a lot more my nanogpt code than llama code | 2023-07-25 23:57:03 +00:00 |
Andrej Karpathy | 624cdfc76a | add dropout support to model | 2023-07-24 14:18:50 +00:00 |
Andrew Gu | af3b5c0364 | Register freqs_cis as non-persistent buffer | 2023-07-24 03:18:20 +00:00 |
Andrej Karpathy | 9414e7a45e | tweaks and add a simple test | 2023-07-23 14:52:08 +00:00 |
Andrej Karpathy | 5b161abb9a | somewhere ~20 hours later | 2023-07-23 05:23:45 +00:00 |