move lines around

2023-07-23 05:25:07 +00:00
parent 5b161abb9a
commit 60d32cf13a
1 changed file with 2 additions and 2 deletions
@@ -1,10 +1,10 @@

 ## llama2.c

-![llama2c](assets/llama_cute.jpg)
-
 Have you ever wanted to inference a baby [Llama 2](https://ai.meta.com/llama/) model in pure C? No? Well, now you can!

+![llama2c](assets/llama_cute.jpg)
+
 Code in this repo first lets you train the Llama 2 architecture from scratch in PyTorch, then save the weights to a raw binary file, then load that into one ~simple 500-line C file that inferences the model, simply in fp32 for now.

 Of course, this is not super fast, but it's not too bad either. E.g. on my cloud Linux devbox a dim 288 6-layer 6-head model (~15M params) inferences at ~18 tok/s in fp32, and about the same on my M1 MacBook Air.