Merge pull request #267 from krrishnarraj/master
Update readme for openmp on mac
@@ -205,7 +205,7 @@ If compiling with gcc, try experimenting with `-funroll-all-loops`, see PR [#183
### OpenMP
Big improvements can also be achieved by compiling with OpenMP, which "activates" the `#pragma omp parallel for` inside the matmul and attention, allowing the work in the loops to be split up over multiple processors.
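As a rough illustration of what "activating" the pragma means, here is a minimal sketch of a matrix-vector multiply whose outer loop carries `#pragma omp parallel for`. It is modeled on the description above rather than copied from `run.c`, so the exact signature is an assumption, and `openmp_max_threads` is a hypothetical helper added only to show whether the compiler actually enabled OpenMP. Without `-fopenmp` the pragma is ignored and the loop runs serially.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>   /* only available when the compiler enables OpenMP */
#endif

/* Sketch: W (d,n) @ x (n,) -> xout (d,). Each output row is independent,
 * so the outer loop can be split across threads. */
void matmul(float *xout, const float *x, const float *w, int n, int d) {
    assert(n > 0 && d > 0);
    #pragma omp parallel for
    for (int i = 0; i < d; i++) {
        float val = 0.0f;
        for (int j = 0; j < n; j++) {
            val += w[i * n + j] * x[j];
        }
        xout[i] = val;
    }
}

/* Hypothetical helper: returns the maximum number of OpenMP threads,
 * or 0 if the program was compiled without OpenMP support. */
int openmp_max_threads(void) {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 0;
#endif
}
```

Calling `openmp_max_threads()` from `main` and printing the result is a quick way to confirm that a given build really picked up OpenMP. When it did, the number of threads can be limited at run time with the standard `OMP_NUM_THREADS` environment variable.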
-You'll need to install the OpenMP library and the clang compiler first (e.g. `apt install clang libomp-dev` on ubuntu). I was not able to get improvements from OpenMP on my MacBook, though. Then you can compile with `make runomp`, which does:
+You'll need to install the OpenMP library and the clang compiler first (e.g. `apt install clang libomp-dev` on ubuntu). Then you can compile with `make runomp`, which does:
```bash
clang -Ofast -fopenmp -march=native run.c -lm -o run
@@ -225,6 +225,8 @@ On **Windows**, use `build_msvc.bat` in a Visual Studio Command Prompt to build
On **Centos 7**, **Amazon Linux 2018** use `rungnu` Makefile target: `make rungnu` or `make runompgnu` to use openmp.
+On **Mac**, use clang from brew for openmp build. Install clang as `brew install llvm` and use the installed clang binary to compile with openmp: `make runomp CC=/opt/homebrew/opt/llvm/bin/clang`
## tests
You can run tests simply with pytest: