From fcb4cdef8b38e6bfa5f620a8630ede7d324529eb Mon Sep 17 00:00:00 2001
From: Daniel Grittner
Date: Sun, 6 Aug 2023 10:44:48 +0200
Subject: [PATCH 1/5] add a Rust port

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index c9926fc..ad0103c 100644
--- a/README.md
+++ b/README.md
@@ -227,6 +227,7 @@ If your candidate PRs have elements of these it doesn't mean they won't get merg
 - [llama2.java](https://github.com/mukel/llama2.java) by @mukel: a Java port of this project
 - [llama2.kt](https://github.com/madroidmaq/llama2.kt) by @madroidmaq: a Kotlin port of this project
 - [llama2.zig](https://github.com/clebert/llama2.zig) by @clebert: a Zig port of this project
+- [llama2-rs](https://github.com/danielgrittner/llama2-rs) by @danielgrittner: a Rust port of this project
 
 ## unsorted todos
 

From 7178facb751a7b33083938bc3b915af535d278b2 Mon Sep 17 00:00:00 2001
From: Aydyn Tairov
Date: Sun, 6 Aug 2023 18:45:47 +0100
Subject: [PATCH 2/5] Rebase changes to master

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index ccd77c5..63bb04e 100644
--- a/README.md
+++ b/README.md
@@ -234,11 +234,12 @@ If your candidate PRs have elements of these it doesn't mean they won't get merg
   - [llama2.scala](https://github.com/jrudolph/llama2.scala) by @[jrudolph](https://github.com/jrudolph): a Scala port of this project
 - Java
   - [llama2.java](https://github.com/mukel/llama2.java) by @[mukel](https://github.com/mukel): a Java port of this project
+- Python
+  - [llama2.py](https://github.com/tairov/llama2.py) by @tairov: a simple one file pure Python port of this project with zero dependencies
 - Kotlin
   - [llama2.kt](https://github.com/madroidmaq/llama2.kt) by @[madroidmaq](https://github.com/madroidmaq): a Kotlin port of this project
 - [llama2.c - Llama 2 Everywhere](https://github.com/trholding/llama2.c) by @[trholding](https://github.com/trholding): Standalone, Bootable & Portable Binary Llama 2
 
-
 ## unsorted todos
 
 - should calculate freq_cis online in the script run.c instead of loading them

From 6734eaeff54da0394ffab7788a0b48d7365e5746 Mon Sep 17 00:00:00 2001
From: Aydyn Tairov
Date: Sun, 6 Aug 2023 18:47:05 +0100
Subject: [PATCH 3/5] Rebase chanes to master

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 63bb04e..e860a23 100644
--- a/README.md
+++ b/README.md
@@ -234,10 +234,10 @@ If your candidate PRs have elements of these it doesn't mean they won't get merg
   - [llama2.scala](https://github.com/jrudolph/llama2.scala) by @[jrudolph](https://github.com/jrudolph): a Scala port of this project
 - Java
   - [llama2.java](https://github.com/mukel/llama2.java) by @[mukel](https://github.com/mukel): a Java port of this project
-- Python
-  - [llama2.py](https://github.com/tairov/llama2.py) by @tairov: a simple one file pure Python port of this project with zero dependencies
 - Kotlin
   - [llama2.kt](https://github.com/madroidmaq/llama2.kt) by @[madroidmaq](https://github.com/madroidmaq): a Kotlin port of this project
+- Python
+  - [llama2.py](https://github.com/tairov/llama2.py) by @tairov: a simple one file pure Python port of this project with zero dependencies
 - [llama2.c - Llama 2 Everywhere](https://github.com/trholding/llama2.c) by @[trholding](https://github.com/trholding): Standalone, Bootable & Portable Binary Llama 2
 
 ## unsorted todos

From 2297d158e3b1d31a9d382e7611eabd965c3e1b68 Mon Sep 17 00:00:00 2001
From: Aydyn Tairov
Date: Sun, 6 Aug 2023 21:47:05 +0100
Subject: [PATCH 4/5] Fix link to a github profile

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index e860a23..72e426c 100644
--- a/README.md
+++ b/README.md
@@ -237,7 +237,7 @@ If your candidate PRs have elements of these it doesn't mean they won't get merg
 - Kotlin
   - [llama2.kt](https://github.com/madroidmaq/llama2.kt) by @[madroidmaq](https://github.com/madroidmaq): a Kotlin port of this project
 - Python
-  - [llama2.py](https://github.com/tairov/llama2.py) by @tairov: a simple one file pure Python port of this project with zero dependencies
+  - [llama2.py](https://github.com/tairov/llama2.py) by @[tairov](https://github.com/tairov): a simple one file pure Python port of this project with zero dependencies
 - [llama2.c - Llama 2 Everywhere](https://github.com/trholding/llama2.c) by @[trholding](https://github.com/trholding): Standalone, Bootable & Portable Binary Llama 2
 
 ## unsorted todos

From 98b515e44d23687258c08ec19e1e2458b57aa5ae Mon Sep 17 00:00:00 2001
From: Nicolas Pinto
Date: Sun, 6 Aug 2023 14:48:47 -0700
Subject: [PATCH 5/5] FIX: model.generate()

This patch fixes a simple bug in `generate()` due to model's `forward()`
only returning logits and not losses since
`f2e34e6b0ac55accd6ba930a04c6f683f5158b29`.
---
 model.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/model.py b/model.py
index 66304e7..f7edbb6 100644
--- a/model.py
+++ b/model.py
@@ -317,7 +317,7 @@ class Transformer(nn.Module):
             # if the sequence context is growing too long we must crop it at block_size
             idx_cond = idx if idx.size(1) <= self.params.max_seq_len else idx[:, -self.params.max_seq_len:]
             # forward the model to get the logits for the index in the sequence
-            logits, _ = self(idx_cond)
+            logits = self(idx_cond)
             logits = logits[:, -1, :] # crop to just the final time step
             if temperature == 0.0:
                 # "sample" the single most likely index
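
Note on PATCH 5/5: the one-line change may be easier to follow with a small illustration. The sketch below is not the real `model.py`. It uses nested Python lists in place of tensors, and the `forward`, `last_step_logits`, and `old_call_fails` helpers, the vocabulary size of 4, and the `max_seq_len` default of 8 are all hypothetical names and values chosen for the example. It assumes only what the commit message states: that `forward()` now returns logits alone.

```python
# Toy stand-in for the patched call site; NOT the real model.py.
# Assumption: forward() returns only logits, laid out as [batch][time][vocab].

def forward(idx):
    """Hypothetical forward pass: one fake logit row per input position."""
    vocab_size = 4  # illustrative value, not taken from the patch
    return [[[float((tok + v) % vocab_size) for v in range(vocab_size)]
             for tok in seq] for seq in idx]

def last_step_logits(idx, max_seq_len=8):
    """Mirror the patched lines: crop the context, call forward, keep the final step."""
    idx_cond = [seq if len(seq) <= max_seq_len else seq[-max_seq_len:] for seq in idx]
    logits = forward(idx_cond)          # patched form: a single return value
    return [seq[-1] for seq in logits]  # plays the role of logits[:, -1, :]

def old_call_fails(idx):
    """Check whether the pre-patch two-value unpacking raises for this batch."""
    try:
        logits, _ = forward(idx)  # old form: expects a (logits, loss) pair
    except ValueError:
        return True
    return False

batch = [[1, 2, 3]]                              # a batch holding one sequence
final = last_step_logits(batch)
greedy = [row.index(max(row)) for row in final]  # argmax, as in the temperature == 0.0 branch
print(final, greedy, old_call_fails(batch))
```

In this toy version the old two-value unpacking raises for a batch of one sequence. With a batch of exactly two sequences it would instead bind the two batch rows to `logits` and `_` without raising, and tensor unpacking along the first dimension should behave the same way, which is one reason to drop the stale unpacking rather than guard it.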