From fa872540ba415a88aaca6b066411fcc500f00cb1 Mon Sep 17 00:00:00 2001
From: Andrej Karpathy
Date: Sun, 23 Jul 2023 17:11:35 +0000
Subject: [PATCH] fix comments in readme about spaces

---
 README.md   | 4 ++--
 run_wrap.py | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index dc5e9a7..ff3f7d5 100644
--- a/README.md
+++ b/README.md
@@ -33,7 +33,7 @@ pip install sentencepiece
 python run_wrap.py
 ```
 
-You'll see text stream, but with weird spaces in it (sorry). And after that the whole sample will be properly printed. (Call for help: help me fix sentencepiece streaming decoding and even better delete this wrapper.) On my M1 MacBook Air this runs at ~18 tokens/s, not bad for super naive fp32 single-threaded C code. Sample output:
+You'll see text stream. On my M1 MacBook Air this runs at ~18 tokens/s, not bad for super naive fp32 single-threaded C code. Sample output:
 
 *Once upon a time, there was a boy named Timmy. Timmy loved to play sports with his friends. He was very good at throwing and catching balls. One day, Timmy's mom gave him a new shirt to wear to a party. Timmy thought it was impressive and asked his mom to explain what a shirt could be for. "A shirt is like a special suit for a basketball game," his mom said. Timmy was happy to hear that and put on his new shirt. He felt like a soldier going to the army and shouting. From that day on, Timmy wore his new shirt every time he played sports with his friends at the party.
 Once upon a time, there was a little girl named Lily. She loved to play outside with her friends. One day, Lily and her friend Emma were playing with a ball. Emma threw the ball too hard and it hit Lily's face. Lily felt embarrassed and didn't want to play anymore. Emma asked Lily what was wrong, and Lily told her about her memory. Emma told Lily that she was embarrassed because she had thrown the ball too hard. Lily felt bad
@@ -80,7 +80,7 @@ But note that this only emits the SentencePiece tokens. To decode the tokens int
 python run_wrap.py
 ```
 
-Watch the tokens stream by, fun! Help me fix the weird spaces. We can also run the PyTorch inference script for comparison:
+Watch the tokens stream by, fun! We can also run the PyTorch inference script for comparison:
 
 ```bash
 python sample.py
diff --git a/run_wrap.py b/run_wrap.py
index e2ddc39..c1b7a72 100644
--- a/run_wrap.py
+++ b/run_wrap.py
@@ -26,6 +26,8 @@ for line in proc.stdout:
     tokens.append(token)
     last = dec
 t1 = time.time()
+# seeking help: how can we do streaming inference in sentencepiece properly?
+# or even delete sentencepiece entirely?
 
 print(f"\nachieved tok/s: {len(tokens) / (t1 - t0)}")
 proc.wait()