diff --git a/README.md b/README.md
index ad0efb1..209ffc5 100644
--- a/README.md
+++ b/README.md
@@ -19,10 +19,10 @@ Let's just run a baby Llama 2 model in C. You need a model checkpoint. Download
 wget https://karpathy.ai/llama2c/model.bin -P out
 ```
 
-(if that doesn't work try [google drive](https://drive.google.com/file/d/1aTimLdx3JktDXxcHySNrZJOOk8Vb1qBR/view?usp=share_link)). Compile and run the C code:
+(if that doesn't work try [google drive](https://drive.google.com/file/d/1aTimLdx3JktDXxcHySNrZJOOk8Vb1qBR/view?usp=share_link)). Compile and run the C code (see [howto](#howto) for optimization flags that produce a faster binary):
 
 ```bash
-gcc -O3 -funsafe-math-optimizations -o run run.c -lm
+gcc -O3 -o run run.c -lm
 ./run out/model.bin
 ```
 
@@ -64,6 +64,12 @@ wget https://karpathy.ai/llama2c/model.bin -P out
 
 Once we have the model.bin file, we can inference in C. Compile the C code first:
 
+```bash
+gcc -O3 -o run run.c -lm
+```
+
+Alternatively, for faster inference you can compile with the `-funsafe-math-optimizations` flag. It lets the compiler relax strict IEEE floating-point rules, which is probably fine for this application:
+
 ```bash
 gcc -O3 -funsafe-math-optimizations -o run run.c -lm
 ```