Initial commit

2025-05-21 21:03:52 +02:00
commit c47d3205a0
8 changed files with 328 additions and 0 deletions
@@ -0,0 +1,64 @@
+# mknotes
+
+A command-line tool to transcribe all MP3 and M4A audio files in a directory using Faster Whisper, then enhance the transcriptions into comprehensive notes using OpenAI's GPT-4.1 model.
+
+## Features
+
+- Batch transcribes all `.mp3` and `.m4a` files in a specified directory
+- Saves transcriptions as `.txt` files
+- Enhances notes using GPT-4.1 with a custom prompt
+- Outputs enhanced notes in markdown format
+- Configurable input and output directories
+
+## Installation
+
+```bash
+# Clone the repository
+git clone https://github.com/yourusername/mknotes.git
+cd mknotes
+
+# Create a virtual environment
+python -m venv venv
+source venv/bin/activate  # On Windows: venv\Scripts\activate
+
+# Install dependencies
+pip install -r requirements.txt
+```
+
+## Usage
+
+```bash
+export OPENAI_API_KEY="your-api-key-here"
+python main.py --input-dir /path/to/audio/files --output-dir /path/to/output [--turbo] [--force]
+```
+
+- `--input-dir`: Directory containing audio files (required)
+- `--output-dir`: Directory for output files (default: "output")
+- `--turbo`: Enable turbo mode for faster inference (uses the `int8_float16` compute type)
+- `--force`: Force re-processing of files even if output files already exist
+
+### Turbo Mode Hardware Requirements
+
+The `--turbo` flag enables faster inference using the `int8_float16` compute type, which can significantly speed up transcription. It requires one of the following:
+
+- CUDA-compatible GPU with Tensor Cores (NVIDIA Turing, Ampere, or newer architecture)
+- CPU with AVX2 support
+
+If your hardware does not support this optimization, the program will automatically fall back to the next most compatible compute type and print a warning.
+
+#### Compute Type Fallback
+
+The program will attempt to use the most efficient compute type supported by your hardware, in the following order:
+
+- `int8_float16` (if `--turbo` is enabled)
+- `float16`
+- `int8`
+- `float32` (most compatible, works on virtually all hardware)
+
+If a compute type is not supported, the program will try the next one in the list until successful.
+
+## Requirements
+
+- Python 3.8+
+- [Faster Whisper](https://github.com/SYSTRAN/faster-whisper)
+- [OpenAI Python SDK](https://github.com/openai/openai-python)
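
The compute type fallback described in the README can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the helper names `candidate_compute_types` and `load_with_fallback` are hypothetical, and it assumes the model constructor signals an unsupported compute type by raising `ValueError`.

```python
import sys
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def candidate_compute_types(turbo: bool) -> List[str]:
    """Return compute types in the order the README lists them."""
    order = ["float16", "int8", "float32"]
    # int8_float16 is only attempted when --turbo is enabled.
    return ["int8_float16"] + order if turbo else order


def load_with_fallback(
    make_model: Callable[[str], T], turbo: bool = False
) -> Tuple[T, str]:
    """Try each compute type in turn and return (model, compute_type).

    make_model is a hypothetical factory that builds a model for the given
    compute type and raises ValueError if the hardware cannot run it
    (an assumption about the underlying library's behaviour).
    """
    last_error = None
    for compute_type in candidate_compute_types(turbo):
        try:
            return make_model(compute_type), compute_type
        except ValueError as exc:
            # Warn and move on to the next, more compatible compute type.
            print(
                f"Warning: compute type {compute_type!r} is not supported "
                f"({exc}); trying the next one",
                file=sys.stderr,
            )
            last_error = exc
    raise RuntimeError("no supported compute type found") from last_error
```

With Faster Whisper, the factory could be something like `lambda ct: WhisperModel("small", compute_type=ct)`, where the model size `"small"` is only a placeholder. Passing the factory in as a callable keeps the fallback order testable without a GPU or a downloaded model.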