Docs @ Work¶

Warning

Write

minimum correct document

at the earliest useful time

for the widest necessary audience.

When, Why, Who, What?¶

A. Before Building - Alignment & Planning¶

[verb:own] Design Docs: clarify problem, goals, trade-offs, proposed solution.
[verb:review] PRDs: product-side framing of why/what to build.
[verb:co-own] RFCs: socialize big or cross-team changes, gather feedback.

B. During Development - Execution & Interfaces¶

[verb:own] Tech Specs / API Docs: define contracts, data schemas, interfaces.
[verb:co-own] Meeting/Decision Notes: record ongoing decisions, unblockers, next steps.

C. After Experimentation - Evaluation & Iteration¶

[verb:own] Experiment Docs: capture hypotheses, metrics, results, learnings.

D. In Production - Operations & Maintenance¶

[verb:co-own] Runbooks / Playbooks: step-by-step guides for incidents & debugging.
[verb:own] Onboarding/Knowledge Docs: long-term references to ramp up engineers and avoid institutional memory loss.

How?¶

1. Design Docs (System/ML/Experiment)¶

Important

Write when the problem or trade-offs are unclear.
Capture decisions and rejected alternatives explicitly.

Best Practice: Keep decision rationale explicit; diagrams > text walls.
Structure: Title: [System/Model Name] Design Doc
- Context / Background
- Problem Statement
- Goals & Non-Goals
- Proposed Solution (with diagram)
- Alternatives Considered
- Rollout Plan / Monitoring
- Risks & Open Questions

2. PRDs (Product Requirement Docs)¶

Important

Problem and success metrics must be measurable.
If metrics are vague, stop and fix before execution.

Best Practice: Clarify “why now” and measurable success upfront. Bother only if you partner with PMs.
Structure: Title: [Feature/Project Name] PRD
- Context / Motivation
- Problem / User Need
- Target Users & Use Cases
- Requirements (Must / Nice to Have)
- Success Metrics / KPIs
- Out of Scope
- Timeline & Dependencies

3. RFCs (Request for Comments)¶

Important

Use when cross-team impact or approval is required.
Publish early, before commitment or implementation.

Best Practice: Keep decision logs so future engineers see why something was chosen.
Structure: Title: RFC: [Proposed Change]
- Summary
- Motivation / Why Now
- Detailed Proposal
- Alternatives Considered
- Impact (Teams, APIs, Infra)
- Backward Compatibility / Migration
- Open Questions

4. Tech Specs / API Docs¶

Important

Write once interfaces or schemas must be stable.
Treat it as a contract; update when compatibility changes.

Best Practice: Keep concise, machine-readable schemas alongside human docs.
Structure: Title: [Service / API Spec]
- Overview / Purpose
- Inputs & Outputs
- Data Schema / Contracts
- Endpoints / Interfaces
- Example Requests & Responses
- Edge Cases / Error Handling

5. Meeting/Decision Notes¶

Important

Write when decisions are made verbally.
Decisions and owners must be explicit and timestamped.

Best Practice: Send out within 24h; decisions bolded.
Structure: Meeting: [Name, Date]
- Attendees
- Agenda
- Discussion Highlights
- Decisions (bolded)
- Action Items (with owners & deadlines)

6. Experiment Docs¶

Important

Every experiment must have a hypothesis and metrics.
Always record outcomes, even if decision is already made.

Best Practice: Include links to dashboards, queries, PRDs.
Structure: Experiment: [Name]
- Background / Hypothesis
- Experiment Design (groups, traffic split, duration)
- Metrics (Primary / Guardrail)
- Results & Analysis
- Learnings
- Next Steps
- Links (dashboards, queries, PRs)

7. Runbooks / Playbooks¶

Important

Optimize for 3 AM usage by someone without context.
No design rationale; only symptoms and actions.

Best Practice: Test them via dry-runs; keep “one-page quick actions” upfront.
Structure: Service / Model: [Name]
- Symptoms / Alerts
- Diagnosis Steps
- Known Issues
- Mitigation / Recovery Steps
- Escalation Paths
- Quick Reference (one-liner commands, dashboards)

8. Onboarding/Knowledge Docs¶

Important

Answer “how does this system actually work end-to-end.”
Update when tribal knowledge causes repeated questions.

Best Practice: Keep “getting started in < 1hr” flow at top.
Structure: Topic: [System / Team Name]
- Introduction / High-level Overview
- Architecture Diagram
- How to Get Started (setup in <1hr)
- Common Workflows
- Pitfalls / Gotchas
- Resources (dashboards, repos, contacts)

Anything Else?¶

Supporting Details Doc - [H1 20XX / H2 20XY]¶

Career Alignment
- Long-term aspiration: [e.g. Senior -> Staff -> Senior Staff, or technical leadership in XYZ domain]
- Key themes this cycle: [Execution excellence, Technical depth, Mentorship, etc.]
- Connection to company/organization goals: [short bullets]
Weekly Rhythm
- Rituals: [standup, weekly reflection, learning block, 1:1 prep]
- Weekly focus areas: [delivery, design reviews, infra deep dive, mentoring]
- Status log: [table with week/date | wins | blockers | next focus]
Active Projects
- Project 1: [title, owner, status, link to doc]
  Key metrics: […]
  
  Next milestones: […]
  
  Commands/snippets/config notes: (inline or linked)
- Project 2: …
Work Artifacts
- Owned Docs: [links]
- Co-owned Docs: [links]
- Reviews / RFCs: [links]
- Code repos: [links]
Knowledge Base
- Domain knowledge: [links to wikis, internal docs, papers]
- Tech stack - inside company: [infra, APIs, pipelines]
- Tech stack - outside company: [OSS libraries, blogs, references]
Retros & Checkpoints
- Mid-cycle reflection: [notes]
- End-cycle reflection: [impact vs goals, lessons learned]