TheCutline

How it's built

TheCutline is a local-first Python pipeline built around DSPy — a framework that treats AI behavior as an optimizable program rather than a hardcoded prompt. The result is a triage system that gets smarter the more you use it.

The pipeline

One command, one file in, one JSON report out. The pipeline runs locally on push — no server, no cron job.

flowchart TD
    A["links/YYYY-MM-DD.txt"] --> B[read_links]
    B --> C["for each URL"]
    C --> D["scrape\nhttpx + BeautifulSoup"]
    D --> E["triage\nDSPy TriageSignature"]
    E -->|Must Read / Skim| F["summarize\nDSPy SummarizeSignature"]
    E -->|Bankruptcy| G[skip]
    F --> H[sort by priority]
    G --> H
    H --> I["synthesize_lead\n1-2 sentence theme"]
    I --> J["reports/YYYY-MM-DD.json"]
    J --> K["update manifest.json"]
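
The scrape step is ordinary Python. A minimal sketch, assuming httpx and BeautifulSoup as in the diagram (the real step likely adds retries and content-type checks):

import httpx
from bs4 import BeautifulSoup

def scrape(url: str) -> str:
    resp = httpx.get(url, follow_redirects=True, timeout=20.0)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()                       # drop non-article chrome
    return soup.get_text(" ", strip=True)     # readable text for the LLM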
      

DSPy — behavior as a program

Instead of writing prompts, DSPy lets you define signatures: typed contracts that specify what goes in and what comes out. The framework handles prompt construction, retries, and output parsing. The key advantage: signatures are optimizable — you can run a feedback loop that compiles a better version of the program from examples.

TheCutline uses two signatures. TriageSignature classifies and rates each article. SummarizeSignature extracts actionable bullets for anything that cleared the Must Read or Skim bar.

flowchart LR
    subgraph TriageSignature
        T1["url, content, profile"] --> T2["DSPy Predict"]
        T2 --> T3["title, priority, rationale, tags"]
    end
    subgraph SummarizeSignature
        S1["url, content, priority"] --> S2["DSPy Predict"]
        S2 --> S3["summary_bullets"]
    end
    TriageSignature -->|priority != Bankruptcy| SummarizeSignature
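
A sketch of what the two signatures might look like in DSPy's class-based syntax; the input and output names follow the diagram, but the docstrings and field descriptions here are assumptions:

import dspy

class TriageSignature(dspy.Signature):
    """Rate an article's priority for this reader and explain the call."""
    url: str = dspy.InputField()
    content: str = dspy.InputField(desc="scraped article text")
    profile: str = dspy.InputField(desc="contents of profile.md")
    title: str = dspy.OutputField()
    priority: str = dspy.OutputField(desc="Must Read, Skim, or Bankruptcy")
    rationale: str = dspy.OutputField()
    tags: list[str] = dspy.OutputField(desc="1-3 tags from the fixed vocabulary")

class SummarizeSignature(dspy.Signature):
    """Extract actionable bullets from an article."""
    url: str = dspy.InputField()
    content: str = dspy.InputField()
    priority: str = dspy.InputField()
    summary_bullets: list[str] = dspy.OutputField()

triage = dspy.Predict(TriageSignature)
summarize = dspy.Predict(SummarizeSignature)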
      

profile.md, a plain markdown file describing the user's context, interests, and signal framework, is injected into the triage signature at load time. Changing it changes what the system considers worth reading.
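
In code, the injection can be as simple as reading the file once and passing it on every call (a sketch reusing the triage predictor above; the path and call shape are assumptions):

from pathlib import Path

PROFILE = Path("profile.md").read_text()

def triage_article(url: str, content: str):
    # Every triage call sees the same profile text; edit profile.md and
    # the next run judges articles against the new criteria.
    return triage(url=url, content=content, profile=PROFILE)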

Tags

The triage step tags each article from a fixed vocabulary of 22 topics. Using a closed set keeps the tag cloud coherent — the LLM can't invent new tags, so every tag that appears is one you can actually browse across reports.

Tags are emitted by TriageSignature alongside priority and rationale — no extra LLM call. Each article gets 1–3 tags.
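
One way to keep the set closed (a hypothetical post-call filter; the real mechanism may differ, for instance a typed Literal output) is to validate the model's tags against the vocabulary before they reach a report:

# The fixed 22-topic vocabulary lives in one place; anything the LLM
# emits outside it is dropped, so a hallucinated tag never surfaces.
ALLOWED_TAGS: frozenset[str] = frozenset(
    # ...the 22 topics go here...
)

def clean_tags(raw: list[str]) -> list[str]:
    kept = [tag for tag in raw if tag in ALLOWED_TAGS]
    return kept[:3]   # each article carries 1-3 tags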

The feedback loop

After reading a report, you can vote on the AI's calls. Those votes accumulate as training examples. Running just optimize passes them to DSPy's optimizer, which compiles a new version of the triage program and saves it to optimized/triage.json. On the next run, the pipeline loads it automatically.

flowchart LR
    A[Read report] --> B["Vote on each item"]
    B -->|agree| C[spot_on]
    B -->|too low| D[higher_priority]
    B -->|too high| E[lower_priority]
    C --> F["feedback/YYYY-MM-DD.json"]
    D --> F
    E --> F
    F --> G["just optimize"]
    G --> H["DSPy MIPROv2 optimizer"]
    H --> I["optimized/triage.json"]
    I --> J[Pipeline gets smarter]
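
A sketch of what just optimize might run underneath, assuming DSPy's MIPROv2 API; the feedback file schema and the metric are assumptions:

import json
from pathlib import Path

import dspy
from dspy.teleprompt import MIPROv2

def vote_metric(example, prediction, trace=None) -> bool:
    # A spot_on vote means the triage call matched the reader's judgment.
    # How higher/lower_priority votes are scored here is an assumption.
    return prediction.priority == example.priority

trainset = [
    dspy.Example(**vote).with_inputs("url", "content", "profile")
    for path in Path("feedback").glob("*.json")
    for vote in json.loads(path.read_text())
]

optimizer = MIPROv2(metric=vote_metric, auto="light")
compiled = optimizer.compile(dspy.Predict(TriageSignature), trainset=trainset)
compiled.save("optimized/triage.json")   # picked up automatically on the next run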
      

v1 deployment

The v1 stack keeps infrastructure surface area at zero. Git is the queue, GitHub Actions is the compute, the repo is the database, and Cloudflare Pages is the CDN. No servers to manage, no database to maintain.

flowchart TD
    A["Commit links/YYYY-MM-DD.txt"] --> B[GitHub Actions]
    B --> C["uv run main.py"]
    C --> D["reports/YYYY-MM-DD.json\nmanifest.json updated"]
    D --> E["GHA commits reports/ to main"]
    E --> F[Cloudflare Pages]
    F --> G["Astro build"]
    G --> H["site live"]
      
  • Input: Git as the queue — links/YYYY-MM-DD.txt committed to the repo. Full link history in git, diffs readable, nothing to maintain.
  • Storage: the repo itself — reports/*.json are committed artifacts. No S3, no R2, no database. Every report is auditable via git history.
  • Compute: GitHub Actions on push. Secrets live in GHA repo secrets. Free tier is ample — ~2 min per run, well within 2000 min/month.
  • Frontend: Cloudflare Pages — static files served globally from the edge. No build step complexity. Astro reads JSON at build time.

What comes next

  • Cloudflare Worker + D1 for a real feedback API (no more copy-paste JSON)
  • Automated optimize loop — weekly GHA job pulls feedback, recompiles the DSPy program
  • Cron trigger alongside push — reports generate even without a new commit
  • Google Sheets as input — zero-friction link capture from mobile