diff --git a/README.md b/README.md
index 2fd355d..7d3393b 100644
--- a/README.md
+++ b/README.md
@@ -217,7 +217,8 @@ When you run inference make sure to use OpenMP flags to set the number of thread
 OMP_NUM_THREADS=4 ./run out/model.bin
 ```
 
-Depending on your system resources you may want to tweak these hyperparameters and use more threads. But more is not always better, usually this is a bit U shaped.
+Depending on your system resources you may want to tweak these hyperparameters and use more threads. But more is not always better, usually this is a bit U shaped. In particular, if your CPU has SMT (multithreading), try setting the number of threads to the number of physical cores rather than logical cores. The performance difference can be large due to cache thrashing and communication overhead. The PyTorch documentation [CPU specific optimizations
+](https://pytorch.org/tutorials/recipes/recipes/tuning_guide.html#cpu-specific-optimizations) has some good information that applies here too.
 
 ## platforms
 
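Review note: the added text recommends matching the thread count to physical rather than logical cores, so a rough way to find that number may be useful. The sketch below is not part of the patch. It assumes a Linux system whose `/proc/cpuinfo` exposes `physical id` and `core id` fields (some architectures do not), and the helper names `count_physical_cores` and `suggested_threads` are hypothetical. When those fields are missing it falls back to the logical CPU count.

```python
import os


def count_physical_cores(cpuinfo_text: str) -> int:
    """Count unique (physical id, core id) pairs in /proc/cpuinfo-style text."""
    cores = set()
    physical_id = core_id = None
    # Each logical CPU is a blank-line-separated block of "key : value" lines.
    # The extra "" flushes the last block if the text lacks a trailing blank line.
    for line in cpuinfo_text.splitlines() + [""]:
        if not line.strip():
            if core_id is not None:
                cores.add((physical_id, core_id))
            physical_id = core_id = None
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "physical id":
            physical_id = value.strip()
        elif key == "core id":
            core_id = value.strip()
    return len(cores)


def suggested_threads(cpuinfo_path: str = "/proc/cpuinfo") -> int:
    """Physical core count if it can be determined, else the logical CPU count."""
    try:
        with open(cpuinfo_path) as f:
            n = count_physical_cores(f.read())
    except OSError:
        n = 0  # file missing or unreadable (e.g. not Linux)
    return n or os.cpu_count() or 1


if __name__ == "__main__":
    # Print a command line in the style of the README example.
    print(f"OMP_NUM_THREADS={suggested_threads()} ./run out/model.bin")
```

The result is only a starting point. Since the README describes the thread-count curve as U shaped, timing a few values around the suggested number is still the more reliable way to pick one.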