From f0f43b72883b6c3d1052ad0a64433c2476fbd49d Mon Sep 17 00:00:00 2001
From: Andrej Karpathy <andrej.karpathy@gmail.com>
Date: Wed, 26 Jul 2023 22:12:50 +0000
Subject: [PATCH] small note on training times

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 6193d19..712786e 100644
--- a/README.md
+++ b/README.md
@@ -63,7 +63,7 @@ base models... ¯\\_(ツ)_/¯. Since we can inference the base model, it should
 
 ## models
 
-For the sake of examples of smaller, from-scratch models, I trained multiple models on TinyStories and catalogue them here:
+For the sake of examples of smaller, from-scratch models, I trained multiple models on TinyStories and catalogue them below. Most of these trained in a few hours on my training setup (4X A100 40GB GPUs); the 110M model took around 24 hours.
 
 | model | dim | n_layers | n_heads | max context length | parameters | val loss | download
 | --- | --- | --- | --- | --- | --- | --- | --- |