Will Lamond | e592ed5d64 | Add tinyshakespeare dataset | 2023-08-01 15:26:47 -07:00
Andrej Karpathy | 78952fb0b4 | propagate the dropout flag | 2023-07-27 22:20:31 +00:00
Andrej Karpathy | 517763346d | HF checkpoints i removed the optimizer to save space, init Adam without the first/second moments is ok | 2023-07-27 22:20:07 +00:00
Andrew Gu | 25494f9cbc | Have DDP ignore freqs_cis to avoid broadcast | 2023-07-24 13:58:09 +00:00
Andrej Karpathy | 9414e7a45e | tweaks and add a simple test | 2023-07-23 14:52:08 +00:00
Andrej Karpathy | f499d9d2b5 | delete debug line | 2023-07-23 05:37:44 +00:00
Andrej Karpathy | 5b161abb9a | somewhere ~20 hours later | 2023-07-23 05:23:45 +00:00