This tool splits a Markdown file at H2 headings and summarizes each section using an LLM.
The tool provides three separate commands for a flexible workflow:
The load command splits a Markdown file at H2 headings and saves the sections to a JSON file:
# Basic usage
python summarize.py load input.md
# Specify output JSON file
python summarize.py load input.md --output_file=sections.json
# Enable verbose output
python summarize.py load input.md --verbose
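As a rough illustration of what splitting at H2 headings involves, here is a minimal sketch. The function name `split_at_h2` and the `heading`/`content`/`summary` keys are assumptions made for this example, not the script's actual code or JSON schema.

```python
import json
import re


def split_at_h2(markdown_text):
    """Split Markdown text into sections, one per H2 heading.

    Hypothetical helper: the real script's function names and JSON keys
    may differ.
    """
    sections = []
    current = None
    in_fence = False
    for line in markdown_text.splitlines():
        # Track fenced code blocks so a "## " line inside code is not a heading.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else re.match(r"^## +(.*\S)\s*$", line)
        if match:
            current = {"heading": match.group(1), "content": [], "summary": None}
            sections.append(current)
        elif current is not None:
            current["content"].append(line)
        # Lines before the first H2 (such as the H1 title) are skipped here.
    for section in sections:
        section["content"] = "\n".join(section["content"]).strip()
    return sections


if __name__ == "__main__":
    sample = "# My Document\n## Introduction\nSome text.\n## Details\nMore text.\n"
    print(json.dumps(split_at_h2(sample), indent=2))
```

Dropping the text before the first H2 is a simplification in this sketch; the actual tool may keep it.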
The process command reads the JSON file, summarizes each section, and saves progress after each section:
# Basic usage
python summarize.py process sections.json
# Use a specific LLM model
python summarize.py process sections.json --model=gpt-4o-mini
# Enable verbose output
python summarize.py process sections.json --verbose
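Saving progress after each section could look roughly like the sketch below. It assumes the JSON file holds a list of objects with `content` and `summary` keys, and it takes the summarizer as a plain callable so that no particular LLM API is implied. Skipping sections that already have a summary is also an assumption about how a rerun would behave.

```python
import json
from pathlib import Path


def process_sections(json_path, summarize):
    """Summarize each section, writing the JSON file back after every one.

    Hypothetical sketch: `summarize` is any callable mapping text to a summary.
    """
    path = Path(json_path)
    sections = json.loads(path.read_text(encoding="utf-8"))
    for section in sections:
        if section.get("summary"):
            continue  # assumed behavior: keep summaries from an earlier run
        section["summary"] = summarize(section["content"])
        # Persist after every section so an interrupted run loses at most
        # the section that was in flight.
        path.write_text(json.dumps(sections, indent=2), encoding="utf-8")
    return sections
```

Rewriting the whole file on each step is simple and adequate for a handful of sections; a very large document might call for a different storage format.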
The save command converts the processed JSON file back to a Markdown file with summaries:
# Basic usage
python summarize.py save sections.json
# Specify output Markdown file
python summarize.py save sections.json --output_file=summary.md
# Enable verbose output
python summarize.py save sections.json --verbose
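Converting the processed sections back to Markdown can be sketched as follows. The `heading`, `content`, and `summary` keys are assumed for illustration, and falling back to the original content when a summary is missing is a choice made for this example rather than documented behavior.

```python
def sections_to_markdown(sections):
    """Render sections as Markdown, one H2 heading per section.

    Hypothetical sketch: key names and the fallback rule are assumptions.
    """
    parts = []
    for section in sections:
        # Use the summary when present, otherwise the original text.
        body = section.get("summary") or section["content"]
        parts.append(f"## {section['heading']}\n\n{body}\n")
    return "\n".join(parts)
```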
The all command runs the entire workflow in one command:
# Basic usage
python summarize.py all input.md
# Specify output file and model
python summarize.py all input.md --output_file=summary.md --model=gpt-4o-mini
# Enable verbose output
python summarize.py all input.md --verbose
The script includes a uv run header, so you can also run it directly with uv:
uv run summarize.py load input.md
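For readers unfamiliar with this mechanism, uv reads inline script metadata from a comment block at the top of a Python file. The block below is only a sketch of that format: the Python version and the dependency list are assumptions, not copied from summarize.py.

```python
# Assumed values for illustration; the real header in summarize.py may
# list a different Python version and different dependencies.
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "llm",
# ]
# ///

print("uv installs the listed dependencies before running this script")
```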
This three-step workflow provides several advantages:
The script uses the llm library to interact with various LLM providers. It will try the following models in order:
1. gpt-4o-mini (OpenAI)
2. openrouter/google/gemini-flash-1.5
3. openrouter/openai/gpt-4o-mini
4. haiku (Claude 3 Haiku)
You can specify a different model using the --model parameter.
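Trying models in order can be sketched as a simple fallback loop. `pick_model` and its `load_model` argument are hypothetical names; with the llm library, `llm.get_model` would presumably play the role of `load_model`, but the script's actual logic is not shown here. Treating an explicit `--model` value as the only candidate is likewise an assumption.

```python
DEFAULT_MODELS = [
    "gpt-4o-mini",
    "openrouter/google/gemini-flash-1.5",
    "openrouter/openai/gpt-4o-mini",
    "haiku",
]


def pick_model(load_model, preferred=None, candidates=DEFAULT_MODELS):
    """Return (name, model) for the first model that loads successfully.

    Hypothetical sketch: `load_model` is any callable that returns a model
    object or raises when the model is unavailable.
    """
    # Assumption: an explicitly requested model replaces the default list.
    names = [preferred] if preferred else list(candidates)
    errors = []
    for name in names:
        try:
            return name, load_model(name)
        except Exception as exc:  # any load failure moves on to the next name
            errors.append(f"{name}: {exc}")
    raise RuntimeError("No usable model found:\n" + "\n".join(errors))
```

Collecting the individual errors makes the final message more useful when every candidate fails, for example because no API key is configured.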
If you have a Markdown file like:
# My Document
## 7. Introduction
This is an introduction to my document.
## 8. Section 1
This is the first section with important details.
## 9. Section 2
This is the second section with more information.
Running the summarizer will create a new file with summaries of each section, preserving the H2 headings.