From 8dd9baddaa18aef2d52ccbd6799d9a07ad4387f3 Mon Sep 17 00:00:00 2001
From: Gottfried Haider
Date: Wed, 2 Aug 2023 18:09:06 +0800
Subject: [PATCH] Update README.md

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index ef34318..968b03a 100644
--- a/README.md
+++ b/README.md
@@ -211,12 +211,13 @@ If your candidate PRs have elements of these it doesn't mean they won't get merg
 - [llama2.c - Llama 2 Everywhere](https://github.com/trholding/llama2.c) by @trholding: Standalone, Bootable & Portable Binary Llama 2
 - [llama2.rs](https://github.com/leo-du/llama2.rs) by @leo-du: A Rust port of this project
 - [llama2.scala](https://github.com/jrudolph/llama2.scala) by @jrudolph: a Scala port of this project
-- [llama2.c-emscripten](https://github.com/gohai/llama2.c-emscripten) by @gohai: Emscripten (JavaScript) port, based on @ggerganov initial prototype
+- [llama2.c-emscripten](https://github.com/gohai/llama2.c-emscripten) by @gohai: Emscripten (JavaScript) port, based on @ggerganov's initial prototype
 
 ## unsorted todos
 
 - support Llama 2 7B Chat model and tune run.c to Chat UI/UX
 - speed up 7B Llama 2 models sufficiently to work at interactive rates on Apple Silicon MacBooks
+- possibly include emscripten / web backend (as seen in @gg PR)
 - currently the project only runs in fp32, how easy would it be to different precisions?
 - look into quantization and what would be involved
 - todo multiquery support? doesn't seem as useful for smaller models that run on CPU (?)