I stopped writing sentences to LLMs. Now I write something closer to YAML. Turns out they parse structure better than prose anyway.


Before and After

Research

Prose:

I’m trying to understand the FSRS spaced repetition algorithm and how it compares to the older SM-2 algorithm. I’m building a learning application backend in Python and need to integrate one of these. Can you explain how FSRS works, what makes it better than SM-2, and show me how I might implement the core scheduling logic? Please include any gotchas or edge cases I should watch out for.

Notation:

[RESEARCH]
topic: FSRS vs SM-2 spaced repetition
depth: implementation-focused
stack: Python
output: how it works, comparison, code patterns, gotchas
sources: yes

(FSRS and SM-2 are algorithms for scheduling flashcard reviews. If you’ve used Anki, you’ve used SM-2.)
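
For a sense of what the prompt is asking for: SM-2’s core update fits in a dozen lines, which is why it’s the baseline everyone compares against. A simplified sketch, not a production implementation (FSRS replaces the easiness factor with learned stability and difficulty parameters, so its version is larger):

def sm2_update(quality: int, repetitions: int, ef: float, interval: int):
    # quality: 0-5 self-rating; ef: easiness factor; interval in days.
    if quality < 3:
        return 0, ef, 1  # failed review: reset repetitions, see it tomorrow
    ef = max(1.3, ef + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    if repetitions == 0:
        interval = 1
    elif repetitions == 1:
        interval = 6
    else:
        interval = round(interval * ef)
    return repetitions + 1, ef, interval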

Code Review

Prose:

Can you review this Python function for me? I’m specifically worried about error handling and whether there are any edge cases I’m missing. The function processes user uploads and saves them to S3. It’s meant to be production code, so please point out anything that could cause issues at scale.

Notation:

[REVIEW]
focus: error handling, edge cases, security
context: production, user uploads to S3

async def upload_file(user_id: str, file: UploadFile) -> str:
    path = f"uploads/{user_id}/{file.filename}"
    await s3.upload(path, file.file)
    return path
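
For contrast, the issues a review like this should flag are fairly predictable: the filename is user-controlled (path traversal, silent overwrites) and nothing handles a failed upload. A sketch of the hardened version, keeping the same placeholder s3 client from the snippet above:

import os
import uuid

from fastapi import HTTPException, UploadFile

async def upload_file(user_id: str, file: UploadFile) -> str:
    # Never trust file.filename: strip any directory components and
    # prefix a UUID so one upload can't overwrite another.
    safe_name = os.path.basename(file.filename or "upload")
    path = f"uploads/{user_id}/{uuid.uuid4()}-{safe_name}"
    try:
        await s3.upload(path, file.file)  # same placeholder client as above
    except Exception as exc:
        raise HTTPException(status_code=502, detail="upload failed") from exc
    return path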

Architecture Decision

Notation:

[ANALYZE]
question: relational tables vs JSONB for learner progress
context: AI tutoring system, FastAPI/PostgreSQL, spaced repetition
query patterns: "items due for review", progress by concept
constraints: scale to 10k learners
output: tradeoffs + recommendation
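
The query patterns line is what usually decides this one. A sketch of what "items due for review" might look like under each design (table and column names are hypothetical):

# Relational: one row per (learner, item), indexable on (learner_id, due_at).
DUE_RELATIONAL = """
    SELECT item_id
    FROM review_state
    WHERE learner_id = %(learner)s AND due_at <= now()
    ORDER BY due_at
"""

# JSONB: one progress document per learner; every query has to unpack it.
DUE_JSONB = """
    SELECT items.key AS item_id
    FROM learner_progress, jsonb_each(progress) AS items(key, value)
    WHERE learner_id = %(learner)s
      AND (items.value->>'due_at')::timestamptz <= now()
"""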

Debugging

Notation:

[DEBUG]
error: "connection pool exhausted" under concurrent load
stack: FastAPI, SQLAlchemy async, asyncpg, PostgreSQL
behavior: intermittent, only under load
tried: increased pool size (hit max_connections)
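
For what it’s worth, this particular symptom usually traces back to session lifecycle rather than pool size: sessions opened per request but never closed leak connections until the pool runs dry. A sketch of the usual fix, assuming SQLAlchemy 2.x:

from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine

engine = create_async_engine("postgresql+asyncpg://...", pool_size=10)  # size for illustration
SessionLocal = async_sessionmaker(engine, expire_on_commit=False)

async def get_session():
    # FastAPI dependency: the connection goes back to the pool when the
    # request finishes, even if the handler raises.
    async with SessionLocal() as session:
        yield session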

Sprint Planning

Notation:

[PLAN]
goal: ship Slack bot MVP this week
available: 15 hours across 5 days
blocked by: OAuth flow not working
dependencies: Redis running, API stable
output: daily tasks, 2-3 hours each, ordered by dependency

Explaining to Non-Technical People

Notation:

[EXPLAIN]
topic: why the AI tutoring system needs a knowledge graph
audience: non-technical investor
avoid: jargon, implementation details
goal: justify the engineering investment
length: 2 paragraphs

Core Patterns

[RESEARCH]  topic | depth | context | sources
[CODE]      task | stack | constraints
[REVIEW]    focus | context | <code>
[ANALYZE]   question | context | output
[DEBUG]     error | stack | behavior | tried
[PLAN]      goal | available | constraints | output
[EDIT]      text | goal | keep
[WRITE]     type | audience | tone | length
[EXPLAIN]   topic | audience | avoid | length

Mixing Notation with Prose

Pure notation can feel too terse. A header plus normal writing works better:

[RESEARCH]
topic: constitutional AI
depth: deep

Specifically curious about:
- How it's actually implemented in practice
- Any open-source examples
- Whether it makes sense for smaller models

Why It Works

The main benefit isn’t that LLMs magically understand structure better. It’s that the format forces you to know what you want.

When you write prose, you can be vague and still feel like you’ve communicated something. Notation won’t let you. You have to fill in the fields.

That said, LLMs do handle structured input well. They’re trained on JSON, YAML, markdown, code. And putting the task at the top makes the request hard for the model to miss.


When It Doesn’t Work

This approach is best for task-oriented prompts: research, code, review, decisions.

It’s less useful for exploratory conversations where you’re thinking out loud, or creative work where ambiguity is productive. If you want the model to surprise you, prose is probably better.


Full Pattern Library

For reference, here’s every pattern I use:

[RESEARCH]   topic | depth: quick/deep | context | sources: yes/no
[CODE]       task | stack | constraints | style: minimal/production
[REVIEW]     focus | context | <code>
[ANALYZE]    question | context | output: tradeoffs/recommendation/both
[DEBUG]      error | stack | behavior | tried
[PLAN]       goal | available | blocked by | dependencies | output
[EDIT]       text | goal: shorter/clearer/formal | keep
[WRITE]      type: docs/email/blog | audience | tone | length
[EXPLAIN]    topic | audience | avoid | length
[LEARN]      topic | background | goal | depth: practical/academic
[SUMMARIZE]  content | extract | format: bullets/prose/table
[REFACTOR]   code | goal | constraints | style
[COMMIT]     changes | style: conventional | scope
[REFLECT]    context | framework | tone: no-guilt/honest
[PREP]       role | company | focus areas | my background | output
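
These are plain text, so if you find yourself reusing them from scripts, they’re trivial to generate. A throwaway helper (not from any library, just string formatting):

def notation(tag: str, **fields: str) -> str:
    # Render a pattern into the format above: [TAG], then field: value lines.
    lines = [f"[{tag.upper()}]"]
    lines += [f"{key.replace('_', ' ')}: {value}" for key, value in fields.items()]
    return "\n".join(lines)

print(notation("debug",
               error='"connection pool exhausted" under concurrent load',
               stack="FastAPI, SQLAlchemy async, asyncpg, PostgreSQL",
               tried="increased pool size (hit max_connections)"))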
