Update readme for openmp on mac

Krishnaraj Bhat
2023-08-10 10:59:39 +05:30
parent e36e3fb50d
commit d45a36cdd2
+3 -1
@@ -160,7 +160,7 @@ If compiling with gcc, try experimenting with `-funroll-all-loops`, see PR [#183
### OpenMP
Big improvements can also be achieved by compiling with OpenMP, which "activates" the `#pragma omp parallel for` inside the matmul and attention, allowing the work in the loops to be split up over multiple processors.
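As a rough illustration of what the pragma does, here is a minimal sketch of a matrix-vector product with the outer loop marked for OpenMP. The function name `matvec` and its signature are assumptions made for this example, not code taken from `run.c`. Each output row is computed independently, which is what makes it safe to hand different rows to different threads.

```c
// Hypothetical helper for illustration: xout = W * x,
// where W is a row-major (d x n) matrix and x has n entries.
void matvec(float *xout, const float *x, const float *w, int n, int d) {
    int i;
    // Rows are independent, so OpenMP may split the iterations across threads.
    // When compiled without -fopenmp the pragma is ignored and the loop runs serially.
    #pragma omp parallel for private(i)
    for (i = 0; i < d; i++) {
        float val = 0.0f;  // declared inside the loop, so each iteration has its own copy
        for (int j = 0; j < n; j++) {
            val += w[i * n + j] * x[j];
        }
        xout[i] = val;
    }
}
```

The result is the same with or without `-fopenmp`; only the way the loop iterations are scheduled changes.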
-You'll need to install the OpenMP library and the clang compiler first (e.g. `apt install clang libomp-dev` on ubuntu). I was not able to get improvements from OpenMP on my MacBook, though. Then you can compile with `make runomp`, which does:
+You'll need to install the OpenMP library and the clang compiler first (e.g. `apt install clang libomp-dev` on ubuntu). Then you can compile with `make runomp`, which does:
```bash
clang -Ofast -fopenmp -march=native run.c -lm -o run
@@ -180,6 +180,8 @@ On **Windows**, use `build_msvc.bat` in a Visual Studio Command Prompt to build
On **Centos 7**, **Amazon Linux 2018** use `rungnu` Makefile target: `make rungnu` or `make runompgnu` to use openmp.
+On **Mac**, use clang from brew for openmp build. Install clang as `brew install llvm` and use the installed clang binary to compile with openmp: `make runomp CC=/opt/homebrew/opt/llvm/bin/clang`
## ack
I trained the llama2.c storyteller models on a 4X A100 40GB box graciously provided by the excellent [Lambda labs](https://lambdalabs.com/service/gpu-cloud), thank you.