How it's built
TheCutline is a local-first Python pipeline built around DSPy — a framework that treats AI behavior as an optimizable program rather than a hardcoded prompt. The result is a triage system that gets smarter the more you use it.
The pipeline
One command, one file in, one JSON report out. The pipeline runs on every push, or locally on demand: no server, no cron job.
```mermaid
flowchart TD
A["links/YYYY-MM-DD.txt"] --> B[read_links]
B --> C["for each URL"]
C --> D["scrape\nhttpx + BeautifulSoup"]
D --> E["triage\nDSPy TriageSignature"]
E -->|Must Read / Skim| F["summarize\nDSPy SummarizeSignature"]
E -->|Bankruptcy| G[skip]
F --> H[sort by priority]
G --> H
H --> I["synthesize_lead\n1-2 sentence theme"]
I --> J["reports/YYYY-MM-DD.json"]
J --> K["update manifest.json"]
```
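In Python, the driver is roughly this shape. A sketch, not the real main.py: the triage, summarize, and synthesize_lead callables stand in for the DSPy predictors covered below, and the report fields are illustrative.

```python
import json
from pathlib import Path

import httpx
from bs4 import BeautifulSoup

PRIORITY_ORDER = {"Must Read": 0, "Skim": 1}  # assumed label set from the diagram

def scrape(url: str) -> str:
    """Fetch a page and reduce it to readable text (httpx + BeautifulSoup)."""
    html = httpx.get(url, follow_redirects=True, timeout=30).text
    return BeautifulSoup(html, "html.parser").get_text(" ", strip=True)

def run(day: str, triage, summarize, synthesize_lead) -> None:
    """links/DAY.txt in, reports/DAY.json out, manifest.json updated."""
    urls = [u.strip() for u in Path(f"links/{day}.txt").read_text().splitlines() if u.strip()]
    items = []
    for url in urls:
        content = scrape(url)
        call = triage(url=url, content=content)  # triage predictor; profile.md bound at load time (see below)
        if call.priority == "Bankruptcy":
            continue  # skipped: no summary call, no report entry
        bullets = summarize(url=url, content=content, priority=call.priority)
        items.append({
            "url": url, "title": call.title, "priority": call.priority,
            "rationale": call.rationale, "tags": call.tags,
            "summary_bullets": bullets.summary_bullets,
        })
    items.sort(key=lambda it: PRIORITY_ORDER.get(it["priority"], 99))
    report = {"date": day, "lead": synthesize_lead(items), "items": items}
    Path(f"reports/{day}.json").write_text(json.dumps(report, indent=2))

    # manifest.json is the index the site reads to list available reports.
    manifest_path = Path("manifest.json")
    manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {"reports": []}
    if day not in manifest["reports"]:
        manifest["reports"].append(day)
    manifest_path.write_text(json.dumps(manifest, indent=2))
```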
DSPy — behavior as a program
Instead of writing prompts, DSPy lets you define signatures: typed contracts that specify what goes in and what comes out. The framework handles prompt construction, retries, and output parsing. The key advantage: signatures are optimizable — you can run a feedback loop that compiles a better version of the program from examples.
TheCutline uses two signatures. TriageSignature classifies and rates each article. SummarizeSignature extracts actionable bullets for anything that cleared the Must Read or Skim bar.
```mermaid
flowchart LR
subgraph TriageSignature
T1["url, content, profile"] --> T2["DSPy Predict"]
T2 --> T3["title, priority, rationale, tags"]
end
subgraph SummarizeSignature
S1["url, content, priority"] --> S2["DSPy Predict"]
S2 --> S3["summary_bullets"]
end
TriageSignature -->|priority != Bankruptcy| SummarizeSignature
```
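In code, the two signatures stay small. A minimal sketch: the field names follow the diagram, but the types and docstrings are my guess at the contract.

```python
from typing import Literal

import dspy

class TriageSignature(dspy.Signature):
    """Rate an article against the reader's profile."""
    url: str = dspy.InputField()
    content: str = dspy.InputField()
    profile: str = dspy.InputField(desc="profile.md: context, interests, signal framework")
    title: str = dspy.OutputField()
    priority: Literal["Must Read", "Skim", "Bankruptcy"] = dspy.OutputField()
    rationale: str = dspy.OutputField()
    tags: list[str] = dspy.OutputField(desc="1-3 tags from the fixed vocabulary")

class SummarizeSignature(dspy.Signature):
    """Extract actionable bullets from an article worth reading."""
    url: str = dspy.InputField()
    content: str = dspy.InputField()
    priority: str = dspy.InputField()
    summary_bullets: list[str] = dspy.OutputField()

triage = dspy.Predict(TriageSignature)
summarize = dspy.Predict(SummarizeSignature)
```

Those two Predict instances are what the pipeline sketch above calls triage and summarize; DSPy turns each into a prompt, call, and parsed result.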
profile.md is injected into the triage signature at load time — a plain markdown file that describes the user's context, interests, and signal framework. Changing it changes what the system considers worth reading.
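The injection itself can be a couple of lines. A sketch (triage_article is a hypothetical wrapper, not a name from the codebase):

```python
from pathlib import Path

# Read once at load time; every triage call sees the same profile text.
PROFILE = Path("profile.md").read_text()

def triage_article(url: str, content: str):
    # Edit profile.md and the very next run re-ranks accordingly.
    return triage(url=url, content=content, profile=PROFILE)
```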
Tags
The triage step tags each article from a fixed vocabulary of 22 topics. A closed set keeps the tag cloud coherent: the LLM can't invent new tags, so every tag that appears is one you can actually browse across reports.
Tags are emitted by TriageSignature alongside priority and rationale — no extra LLM call. Each article gets 1–3 tags. Browse the full tag cloud →
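Enforcing the closed set is a one-line guard after the triage call. A sketch with a stand-in vocabulary (the real list has 22 topics; see the tag cloud):

```python
# Stand-in subset; the real vocabulary has 22 topics.
TAG_VOCAB = {"llm-engineering", "devtools", "infra", "career"}

def clean_tags(raw_tags: list[str]) -> list[str]:
    """Drop anything the model invented and cap at 3 tags per article."""
    return [t for t in raw_tags if t in TAG_VOCAB][:3]
```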
The feedback loop
After reading a report, you can vote on the AI's calls. Those votes accumulate as training examples. Running just optimize passes them to DSPy's optimizer, which compiles a new version of the triage program and saves it to optimized/triage.json. On the next run, the pipeline loads it automatically.
```mermaid
flowchart LR
A[Read report] --> B["Vote on each item"]
B -->|agree| C[spot_on]
B -->|too low| D[higher_priority]
B -->|too high| E[lower_priority]
C --> F["feedback/YYYY-MM-DD.json"]
D --> F
E --> F
F --> G["just optimize"]
G --> H["DSPy MIPROv2 optimizer"]
H --> I["optimized/triage.json"]
I --> J[Pipeline gets smarter]
```
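A sketch of what just optimize might run, reusing TriageSignature from above. The feedback schema and the MIPROv2 settings are assumptions; only the shape of the loop is fixed.

```python
import json
from pathlib import Path

import dspy
from dspy.teleprompt import MIPROv2

def load_trainset() -> list[dspy.Example]:
    """Turn accumulated votes into labeled triage examples (assumed schema)."""
    examples = []
    for path in Path("feedback").glob("*.json"):
        for vote in json.loads(path.read_text()):
            examples.append(
                dspy.Example(
                    url=vote["url"],
                    content=vote["content"],
                    profile=vote["profile"],
                    priority=vote["corrected_priority"],  # assumed field name
                ).with_inputs("url", "content", "profile")
            )
    return examples

def priority_match(example, prediction, trace=None) -> bool:
    """Reward calls that agree with the human-corrected priority."""
    return prediction.priority == example.priority

# Assumes dspy.configure(lm=...) has already run.
optimizer = MIPROv2(metric=priority_match, auto="light")
optimized = optimizer.compile(dspy.Predict(TriageSignature), trainset=load_trainset())
optimized.save("optimized/triage.json")
```

Loading it back on the next run is a single call: triage.load("optimized/triage.json").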
v1 deployment
The v1 stack keeps infrastructure surface area at zero. Git is the queue, GitHub Actions is the compute, the repo is the database, and Cloudflare Pages is the CDN. No servers to manage, no database to maintain.
```mermaid
flowchart TD
A["Commit links/YYYY-MM-DD.txt"] --> B[GitHub Actions]
B --> C["uv run main.py"]
C --> D["reports/YYYY-MM-DD.json\nmanifest.json updated"]
D --> E["GHA commits reports/ to main"]
E --> F[Cloudflare Pages]
F --> G["Astro build"]
G --> H["site live"]
```
- links/YYYY-MM-DD.txt is committed to the repo: full link history in git, readable diffs, nothing to maintain.
- reports/*.json are committed artifacts: no S3, no R2, no database. Every report is auditable via git history.
What comes next
- Cloudflare Worker + D1 for a real feedback API (no more copy-paste JSON)
- Automated optimize loop: a weekly GHA job pulls feedback and recompiles the DSPy program
- Cron trigger alongside push, so reports generate even without a new commit
- Google Sheets as input: zero-friction link capture from mobile