Move prompts to separate files and add prompt types

- Created directory structure for prompts (system and user prompts)
- Added specialized prompts for lectures, meetings, and interviews
- Updated enhancer.py to load prompts from files
- Added --prompt-type CLI parameter to select prompt type
- Updated documentation and enhancement proposals
2025-05-22 21:28:36 +02:00
parent c47089aa0d
commit f00f29ab6b
9 changed files with 147 additions and 56 deletions
+1
@@ -49,6 +49,7 @@ python main.py --input-dir /path/to/audio/files --output-dir /path/to/output [--
 - `--output-dir`: Directory for output files (default: "output")
 - `--turbo`: Enable turbo mode for faster inference (uses int8_float16 compute type)
 - `--force`: Force re-processing of files even if output files already exist
+- `--prompt-type`: Type of content to enhance (choices: "lecture", "meeting", "interview", default: "lecture")
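The option list above can be illustrated with a minimal sketch, assuming `--prompt-type` is registered with `argparse` as a choice-restricted string. The `build_parser` helper is a hypothetical name used only for this illustration and registers nothing but that one option.

```python
import argparse


def build_parser():
    """Hypothetical helper: registers only the --prompt-type option."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument(
        "--prompt-type",
        type=str,
        default="lecture",
        choices=["lecture", "meeting", "interview"],
        help="Type of content to enhance (default: lecture)",
    )
    return parser


if __name__ == "__main__":
    parser = build_parser()
    # argparse exposes --prompt-type as the attribute `prompt_type`.
    print(parser.parse_args([]).prompt_type)
    print(parser.parse_args(["--prompt-type", "meeting"]).prompt_type)
```

A value outside the three choices is rejected by `argparse` at parse time with a usage error.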
### Turbo Mode Hardware Requirements
+7 -5
@@ -9,9 +9,11 @@ This document outlines proposed improvements for the mknotes software, grouped b
   - Extend support to include additional formats like FLAC, OGG, etc.
   - ✅ Updated the `find_audio_files` function in `utils.py` to recognize WAV extension
 
-- **Customizable Enhancement Prompts**
-  - Allow users to provide their own prompts via CLI arguments or configuration files.
-  - Increase flexibility for different use cases (e.g., meeting notes, lectures, interviews).
+- **Customizable Enhancement Prompts** ✅ (Implemented)
+  - ✅ Added CLI argument to select different prompt types (lecture, meeting, interview)
+  - ✅ Moved prompts to separate files for easier customization
+  - ✅ Created specialized prompts for different use cases (lectures, meeting minutes, interviews)
+  - Allow users to provide their own custom prompts via configuration files.
 
 - **Batch Processing Controls**
   - Add the ability to limit the number of files processed in one run.
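The remaining open item under "Customizable Enhancement Prompts" (user-supplied prompts) is not implemented in this commit. One possible shape for it is sketched below. Everything here is hypothetical: the `resolve_user_prompt` name, the `custom_dir` parameter, and the rule that a custom file overrides the built-in one are assumptions for illustration only.

```python
from pathlib import Path

# Prompt types that ship with the project in this commit.
BUILT_IN_TYPES = ("lecture", "meeting", "interview")


def resolve_user_prompt(prompt_type, built_in_dir, custom_dir=None):
    """Return the path of the prompt file to use (hypothetical helper).

    A file named `<prompt_type>.txt` inside `custom_dir` (for example a
    directory named in a configuration file) takes precedence over the
    built-in prompt of the same type.
    """
    if custom_dir is not None:
        candidate = Path(custom_dir) / f"{prompt_type}.txt"
        if candidate.is_file():
            return candidate
    if prompt_type not in BUILT_IN_TYPES:
        raise ValueError(f"Unknown prompt type: {prompt_type!r}")
    return Path(built_in_dir) / f"{prompt_type}.txt"
```

With this rule a user could add a new type simply by dropping a file into the custom directory, while the built-in types keep working when no custom directory is configured.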
@@ -51,8 +53,8 @@ This document outlines proposed improvements for the mknotes software, grouped b
 ## Code Structure Improvements
 
-- **Separation of Concerns**
-  - Move prompts to separate configuration files.
+- **Separation of Concerns** ✅ (Partially implemented)
+  - Moved prompts to separate files in a dedicated prompts directory.
   - Create a more abstract API client layer.
 
 - **Progress Tracking and Reporting**
+4 -1
@@ -61,7 +61,10 @@ def main():
         # Enhance note (only if md file doesn't exist or force flag is set)
         if not os.path.exists(md_path) or args.force:
             try:
-                enhanced_note = enhance_note(transcription)
+                enhanced_note = enhance_note(
+                    transcription,
+                    prompt_type=args.prompt_type
+                )
                 with open(md_path, "w", encoding="utf-8") as f:
                     f.write(enhanced_note)
             except Exception as e:
+33
@@ -0,0 +1,33 @@
You are Edison, an expert executive assistant to the CTO of an IT technology firm with over 22 years of experience in technology. Your task is to provide a deep-dive consultation tailored to the client's issue. Ensure your responses make the user feel understood, guided, and satisfied. The name of the CTO is Heiko.
The consultation is deemed successful when the user explicitly communicates their satisfaction with the solution.
**Instructions:**
- Write clearly and straight to the point.
- Use professional business English.
- Use always British English, not American English.
- Format titles, main sections and subsections:
- Capitalise only the first word of each title, section, and subsection.
- Keep all subsequent words in lowercase except for acronyms, abbreviations and proper nouns, which should remain in their proper uppercase form.
- Do not use emojis.
- Format dates appropriately based on context:
- Use the ISO format (`yyyy-MM-dd`) for technical content, such as code, specifications, tables, deadlines, or numbered/bulleted lists (e.g., 2024-02-12).
- Use the British standard date format `<day> <month> <year>` in general, conversational, or non-technical text (e.g., 12 February 2024). Use the current year (2025) if no year is provided.
- Use the 24-hour time format (HH:mm) consistently throughout.
- Introduce abbreviations with the full term followed by the abbreviation in parentheses on their first mention, only when the context is provided.
- Do not introduce abbreviations for AI, CPU, and HPC.
- Use only the metric system and automatically convert imperial measurements (like Fahrenheit, inches, or feet) to metric units.
- Ensure that all phone numbers are formatted in the international format starting with a '+' followed by the country code, area code, and local number (e.g., +49-111-22223333).
**Guidelines for British English:**
British English is the form of English used in the United Kingdom, characterised by distinct spelling, vocabulary, grammar, and punctuation.
1. **Spelling Differences:**
- Use "-our" instead of "-or" (e.g., "colour" not "color", "honour" not "honor").
- Use "-re" instead of "-er" (e.g., "centre" not "center", "metre" not "meter").
- Prefer "-ise" over "-ize" (e.g., "realise" instead of "realize").
2. **Grammar Differences:**
- Use the present perfect tense with "just," "yet," and "already" (e.g., "I have just eaten").
- Treat collective nouns as singular or plural depending on context (e.g., "The team is winning" or "The team are playing well").
3. **Punctuation Usage:**
- Use single quotation marks for initial quotes and double quotation marks for quotes within quotes (e.g., 'He said, "Hello."').
- Place commas and periods outside quotation marks when they are not part of the quoted material (e.g., 'He said "hello", and then left.').
+26
@@ -0,0 +1,26 @@
Enhance interview transcriptions into well-structured, comprehensive notes. Your task is to transform the raw interview dialogue into an organized document that captures the key insights, questions, and responses.
Focus on:
- Identifying the interviewer and interviewee
- Structuring questions and answers clearly
- Highlighting key insights, quotes, and important points
- Maintaining the context and flow of the conversation
- Summarizing main themes and takeaways
# Output Format
Provide the enhanced interview notes in markdown format. Use markdown syntax for headings, lists, and emphasis to improve clarity and presentation.
IMPORTANT: Start with heading level 2 (##) instead of heading level 1 (#) for all top-level headings. Use heading level 3 (###) for subheadings, and so on.
Structure the document with:
- Interview Information (date, participants, context if available)
- Executive Summary (brief overview of key points)
- Main Discussion (organized by topics or questions)
- Key Insights (important takeaways)
- Notable Quotes (formatted with blockquotes)
- Conclusion/Next Steps (if mentioned)
Use blockquotes (>) for direct quotes when appropriate. Format the Q&A sections clearly to distinguish between questions and responses.
Ensure that all integrated content from the context is accurately reflected. Return only the markdown-formatted note. Do not wrap the output in code blocks or use triple backticks.
+11
@@ -0,0 +1,11 @@
Enhance existing notes using additional context provided from audio transcription or uploaded file content. Your task is to make the notes more useful and comprehensive by incorporating relevant information from the provided context.
Input will be provided within this context, providing a structure for the existing notes and context.
# Output Format
Provide the enhanced notes in markdown format. Use markdown syntax for headings, lists, and emphasis to improve clarity and presentation.
IMPORTANT: Start with heading level 2 (##) instead of heading level 1 (#) for all top-level headings. Use heading level 3 (###) for subheadings, and so on.
Ensure that all integrated content from the context is accurately reflected. Return only the markdown-formatted note. Do not wrap the output in code blocks or use triple backticks.
+25
@@ -0,0 +1,25 @@
Enhance meeting minutes from the provided audio transcription. Your task is to structure the content into a clear, professional meeting summary that captures all key information.
Focus on extracting and organizing:
- Meeting details (date, time, attendees)
- Agenda items discussed
- Key decisions made
- Action items (with assignees and deadlines if mentioned)
- Important discussion points and outcomes
- Next steps and follow-ups
# Output Format
Provide the enhanced meeting minutes in markdown format. Use markdown syntax for headings, lists, and emphasis to improve clarity and presentation.
IMPORTANT: Start with heading level 2 (##) instead of heading level 1 (#) for all top-level headings. Use heading level 3 (###) for subheadings, and so on.
Structure the document with clear sections for:
- Meeting Information (date, time, participants)
- Agenda/Topics
- Discussion (organized by topic)
- Decisions
- Action Items (formatted as a checklist with assignees)
- Next Meeting (if mentioned)
Ensure that all integrated content from the context is accurately reflected. Return only the markdown-formatted note. Do not wrap the output in code blocks or use triple backticks.
+7
@@ -35,4 +35,11 @@ def parse_args():
         action="store_true",
         help="Force re-processing of files even if output files already exist"
     )
+    parser.add_argument(
+        "--prompt-type",
+        type=str,
+        default="lecture",
+        choices=["lecture", "meeting", "interview"],
+        help="Type of content to enhance (default: lecture)"
+    )
     return parser.parse_args()
+33 -50
@@ -2,53 +2,7 @@
 import openai
 import os
-SYSTEM_PROMPT = """You are Edison, an expert executive assistant to the CTO of an IT technology firm with over 22 years of experience in technology. Your task is to provide a deep-dive consultation tailored to the client's issue. Ensure your responses make the user feel understood, guided, and satisfied. The name of the CTO is Heiko.
-The consultation is deemed successful when the user explicitly communicates their satisfaction with the solution.
-**Instructions:**
-- Write clearly and straight to the point.
-- Use professional business English.
-- Use always British English, not American English.
-- Format titles, main sections and subsections:
-- Capitalise only the first word of each title, section, and subsection.
-- Keep all subsequent words in lowercase except for acronyms, abbreviations and proper nouns, which should remain in their proper uppercase form.
-- Do not use emojis.
-- Format dates appropriately based on context:
-- Use the ISO format (`yyyy-MM-dd`) for technical content, such as code, specifications, tables, deadlines, or numbered/bulleted lists (e.g., 2024-02-12).
-- Use the British standard date format `<day> <month> <year>` in general, conversational, or non-technical text (e.g., 12 February 2024). Use the current year (2025) if no year is provided.
-- Use the 24-hour time format (HH:mm) consistently throughout.
-- Introduce abbreviations with the full term followed by the abbreviation in parentheses on their first mention, only when the context is provided.
-- Do not introduce abbreviations for AI, CPU, and HPC.
-- Use only the metric system and automatically convert imperial measurements (like Fahrenheit, inches, or feet) to metric units.
-- Ensure that all phone numbers are formatted in the international format starting with a '+' followed by the country code, area code, and local number (e.g., +49-111-22223333).
-**Guidelines for British English:**
-British English is the form of English used in the United Kingdom, characterised by distinct spelling, vocabulary, grammar, and punctuation.
-1. **Spelling Differences:**
-- Use "-our" instead of "-or" (e.g., "colour" not "color", "honour" not "honor").
-- Use "-re" instead of "-er" (e.g., "centre" not "center", "metre" not "meter").
-- Prefer "-ise" over "-ize" (e.g., "realise" instead of "realize").
-2. **Grammar Differences:**
-- Use the present perfect tense with "just," "yet," and "already" (e.g., "I have just eaten").
-- Treat collective nouns as singular or plural depending on context (e.g., "The team is winning" or "The team are playing well").
-3. **Punctuation Usage:**
-- Use single quotation marks for initial quotes and double quotation marks for quotes within quotes (e.g., 'He said, "Hello."').
-- Place commas and periods outside quotation marks when they are not part of the quoted material (e.g., 'He said "hello", and then left.').
-"""
-PROMPT = """Enhance existing notes using additional context provided from audio transcription or uploaded file content. Your task is to make the notes more useful and comprehensive by incorporating relevant information from the provided context.
-Input will be provided within this context, providing a structure for the existing notes and context.
-# Output Format
-Provide the enhanced notes in markdown format. Use markdown syntax for headings, lists, and emphasis to improve clarity and presentation.
-IMPORTANT: Start with heading level 2 (##) instead of heading level 1 (#) for all top-level headings. Use heading level 3 (###) for subheadings, and so on.
-Ensure that all integrated content from the context is accurately reflected. Return only the markdown-formatted note. Do not wrap the output in code blocks or use triple backticks."""
+import pathlib
 def get_model_max_tokens(client, model_name):
     """
@@ -71,7 +25,20 @@ def get_model_max_tokens(client, model_name):
         max_tokens = 4096
     return max_tokens
 
-def enhance_note(transcription_text, api_key=None):
+def load_prompt(prompt_path):
+    """
+    Load a prompt from a file.
+
+    Args:
+        prompt_path: Path to the prompt file
+
+    Returns:
+        The prompt text
+    """
+    with open(prompt_path, 'r', encoding='utf-8') as f:
+        return f.read().strip()
+
+def enhance_note(transcription_text, prompt_type="lecture", api_key=None):
     """
     Enhance the transcription using OpenAI GPT-4.1 and return markdown-formatted notes.
     Dynamically retrieves the model's max token limit from the OpenAI API.
@@ -81,6 +48,22 @@ def enhance_note(transcription_text, api_key=None):
     if not api_key:
         raise ValueError("OpenAI API key not provided or set in environment variable OPENAI_API_KEY.")
 
+    # Get the base directory
+    base_dir = pathlib.Path(__file__).parent.parent
+
+    # Load system prompt
+    system_prompt_path = base_dir / "prompts" / "system" / "default.txt"
+    system_prompt = load_prompt(system_prompt_path)
+
+    # Load user prompt based on type
+    valid_prompt_types = ["lecture", "meeting", "interview"]
+    if prompt_type not in valid_prompt_types:
+        print(f"Warning: Invalid prompt type '{prompt_type}'. Using 'lecture' instead.")
+        prompt_type = "lecture"
+
+    user_prompt_path = base_dir / "prompts" / "user" / f"{prompt_type}.txt"
+    user_prompt = load_prompt(user_prompt_path)
+
     client = openai.OpenAI(api_key=api_key)
     model_name = "gpt-4.1"
     max_tokens = int(get_model_max_tokens(client, model_name) * 0.9)  # Use 90% of max to allow for prompt/context
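The 90% rule in the last line of this hunk is plain arithmetic and can be checked in isolation. The sketch below uses a hypothetical `completion_budget` helper (not part of the commit) and feeds it 4096, which appears to be the fallback value returned by `get_model_max_tokens`.

```python
def completion_budget(model_max_tokens, fraction=0.9):
    """Hypothetical helper: cap completion tokens at a fraction of the model limit.

    int() truncates towards zero, matching the expression used in enhance_note.
    """
    return int(model_max_tokens * fraction)


if __name__ == "__main__":
    # 4096 * 0.9 = 3686.4, truncated to 3686
    print(completion_budget(4096))
```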
@@ -88,8 +71,8 @@ def enhance_note(transcription_text, api_key=None):
     response = client.chat.completions.create(
         model=model_name,
         messages=[
-            {"role": "system", "content": SYSTEM_PROMPT},
-            {"role": "user", "content": PROMPT},
+            {"role": "system", "content": system_prompt},
+            {"role": "user", "content": user_prompt},
             {"role": "user", "content": transcription_text}
         ],
         max_tokens=max_tokens,
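The prompt-loading path introduced by this commit can be exercised without an API key. The sketch below is a rough approximation and not the project's code: `build_messages` is a hypothetical helper that mirrors the file layout (`prompts/system/default.txt`, `prompts/user/<type>.txt`), the fallback to `lecture`, and the three-message structure shown in the diff, but it takes the base directory as a parameter and makes no OpenAI call.

```python
import pathlib
import tempfile

VALID_PROMPT_TYPES = ("lecture", "meeting", "interview")


def load_prompt(prompt_path):
    """Read a prompt file and strip surrounding whitespace."""
    with open(prompt_path, "r", encoding="utf-8") as f:
        return f.read().strip()


def build_messages(base_dir, transcription_text, prompt_type="lecture"):
    """Hypothetical helper: assemble the chat messages; no API call is made."""
    if prompt_type not in VALID_PROMPT_TYPES:
        prompt_type = "lecture"  # same fallback as in the diff
    base_dir = pathlib.Path(base_dir)
    system_prompt = load_prompt(base_dir / "prompts" / "system" / "default.txt")
    user_prompt = load_prompt(base_dir / "prompts" / "user" / f"{prompt_type}.txt")
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
        {"role": "user", "content": transcription_text},
    ]


if __name__ == "__main__":
    # Build a throwaway prompts directory so the example is self-contained.
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        (root / "prompts" / "system").mkdir(parents=True)
        (root / "prompts" / "user").mkdir(parents=True)
        (root / "prompts" / "system" / "default.txt").write_text("System prompt\n", encoding="utf-8")
        (root / "prompts" / "user" / "meeting.txt").write_text("Meeting prompt\n", encoding="utf-8")
        for message in build_messages(root, "raw transcript", "meeting"):
            print(message["role"], "->", message["content"])
```

Passing the base directory explicitly, rather than deriving it from `__file__` as the commit does, is a choice made here only so the sketch can run against a temporary directory.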