revive tests. now that we have a tiny stories260K model this only requires a 2MB download. phew

2023-08-13 21:22:44 +00:00
parent 0805cb2c31
commit f0024cfc88
2 changed files with 70 additions and 33 deletions
@@ -136,11 +136,7 @@ wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.pt -P ou
 python sample.py --checkpoint=out15M/stories15M.pt
 ```

-Which gives the same results. More detailed testing will be done in `test_all.py`. Currently you will need two files to test or sample: both the .bin file, and the .ckpt file inside a directory (see `test_all.py` for details). Sorry this is a bit janky right now, I have to think through running the tests without having to download 200MB of data. But run the tests with pytest:
-
-```bash
-$ pytest
-```
+Which gives the same results.

 ## custom tokenizers

@@ -227,6 +223,17 @@ On **Windows**, use `build_msvc.bat` in a Visual Studio Command Prompt to build

 On **Centos 7**, **Amazon Linux 2018** use `rungnu` Makefile target: `make rungnu` or `make runompgnu` to use openmp.

+## tests
+
+You can run tests simply with pytest:
+
+```bash
+$ pip install pytest
+$ pytest
+```
+
+This currently runs two tests in `test_all.py`, which run the model forward in both C and Python for 200 steps and compare the output against a known-good reference. The tests take only a few seconds, but they have to download and cache the stories260K models in a temporary `test` directory (only a ~2MB download).
+
 ## ack

 I trained the llama2.c storyteller models on a 4X A100 40GB box graciously provided by the excellent [Lambda labs](https://lambdalabs.com/service/gpu-cloud), thank you.