From 9ff459b9258c20a5fcf6539e988f003e6e31f255 Mon Sep 17 00:00:00 2001
From: Andrej Karpathy
Date: Sun, 13 Aug 2023 03:24:31 +0000
Subject: [PATCH] todo changes

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 331bb7a..2c3614e 100644
--- a/README.md
+++ b/README.md
@@ -292,12 +292,12 @@ If your candidate PRs have elements of these it doesn't mean they won't get merg
 
 ## unsorted todos
 
+- revive tests; train a tiny Llama test model (committed to repo) and use it as reference in unit tests
+- make it easier to add a new dataset with not too much pain
 - add multiquery support into run.c
-- add custom bpe training code and the ability to train a smaller vocabulary (32K is to much)
 - should calculate freq_cis online in the script run.c instead of loading them
 - int4/8 quantization
 - export the model in a more sensible output format with a proper header, etc.
-- train a tiny Llama test model (committed to repo) and use it as reference in unit tests
 - support Llama 2 7B Chat models and tune run.c to Chat UI/UX
 - llama2.cu investigate and merge
 - (LoRA) finetuning and export of Llama 2 models