Token Efficiency

When an AI coding session runs without external orchestration, the AI spends tokens on things that aren’t artifact generation:

  • “What stage am I in?” — re-reading prior conversation to reconstruct state
  • “What has been decided so far?” — re-summarizing prior specs and decisions
  • “What command should I run next?” — reasoning about workflow position
  • “Did that artifact get recorded?” — uncertainty about whether prior work persists

In a typical unguided session for a feature that goes through PM → Engineering, a large fraction of the AI’s token budget goes to orientation and workflow reasoning rather than to the PRD or TechSpec itself.

What s2s handles in the binary (zero tokens)

Every operation that s2s performs in the CLI binary costs zero AI tokens:

| Operation | Token cost |
| --- | --- |
| Intent classification | 0 |
| Stage route planning | 0 |
| Stage context package construction | 0 |
| .s2s/live.md state update | 0 |
| Artifact quality assessment | 0 |
| Ledger advancement | 0 |
| Gate creation and lifecycle | 0 |
| Worktree setup and isolation | 0 |
| Git delivery (branch, push, PR) | 0 |

Each stage gives the AI a context package with exactly what it needs. The AI’s job in each stage is narrow and well-defined:

| Stage | AI’s task | Token efficiency |
| --- | --- | --- |
| pm | Write PRD.md from the context package | High: no state reconstruction needed |
| research | Write Research.md with focused investigation | High: prior decisions are in the package |
| design | Write PrototypeSpec.md | High: PRD and research are summarized |
| engineering | Write TechSpec.md + Backlog.md | High: full prior context delivered |

The AI doesn’t need to remember where it is in the workflow. It reads the context package, generates the artifact, and submits. live.md holds the state between stages.

For a medium-complexity feature going through pm → engineering:

Without s2s (unguided session):

  • Tokens to orient at session start: ~500–1000
  • Tokens per stage to reconstruct prior decisions: ~300–800
  • Tokens on workflow reasoning: ~200–500 per stage
  • Total overhead: ~1500–3600 tokens for two stages (one-time orientation plus two stages of reconstruction and workflow reasoning)

With s2s (chat-native):

  • Orientation: read live.md (~150 tokens)
  • Per stage: receive focused context package, generate artifact
  • Total overhead: ~200 tokens across both stages

The difference scales with project complexity. Long-running projects with many prior decisions benefit most — the AI never needs to re-read the full conversation history because live.md and the context package contain exactly what’s needed.
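The overhead comparison above can be reproduced as simple arithmetic. The sketch below only restates the rough ranges quoted in the two lists; the function name is hypothetical and not part of s2s. Summing the listed ranges directly gives roughly 1500 to 3600 tokens for two stages, so the totals are best read as order-of-magnitude estimates.

```python
from typing import Tuple

# Illustrative arithmetic only. The ranges are the rough estimates quoted
# above; unguided_overhead is a hypothetical helper, not an s2s function.

def unguided_overhead(stages: int) -> Tuple[int, int]:
    """Return (low, high) overhead tokens for an unguided session."""
    orient = (500, 1000)      # one-time orientation at session start
    reconstruct = (300, 800)  # per stage: reconstructing prior decisions
    workflow = (200, 500)     # per stage: workflow reasoning
    low = orient[0] + stages * (reconstruct[0] + workflow[0])
    high = orient[1] + stages * (reconstruct[1] + workflow[1])
    return low, high

S2S_OVERHEAD = 200  # approximate total quoted above for two stages

low, high = unguided_overhead(stages=2)
print(f"unguided: ~{low}-{high} tokens, with s2s: ~{S2S_OVERHEAD} tokens")
```

Because two of the three unguided costs are paid per stage, the gap widens as more stages are added.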

Quality checks run on --submit. The threshold controls when auto-approve fires vs. when a review gate is created:

{
  "quality": {
    "enabled": true,
    "minAutoApproveScore": 0.85,
    "blockOnFailure": false
  }
}
  • minAutoApproveScore: 0.0–1.0. Default 0.85. Scores below this trigger a quality failure message; the AI must fix and re-submit.
  • blockOnFailure: if true, quality failure exits with a non-zero code (useful in CI). Default false.
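To make the interaction between the two settings concrete, here is a minimal sketch of the decision as described in the bullets above. It is a model, not s2s source code: the function name, the outcome labels, the exit code value 1, and the behavior when "enabled" is false are all assumptions made for illustration.

```python
import json
from typing import Tuple

# Hypothetical model of the threshold logic described above.
# Names and outcome labels are illustrative, not s2s internals.

CONFIG = json.loads("""
{
  "quality": {
    "enabled": true,
    "minAutoApproveScore": 0.85,
    "blockOnFailure": false
  }
}
""")

def evaluate_submission(score: float, config: dict) -> Tuple[str, int]:
    """Return (outcome, exit_code) for a quality score in the range 0.0-1.0."""
    quality = config.get("quality", {})
    # Assumption: when checks are disabled, nothing is scored or blocked.
    if not quality.get("enabled", True):
        return "skipped", 0
    threshold = quality.get("minAutoApproveScore", 0.85)
    # Only scores below the threshold count as a quality failure.
    if score >= threshold:
        return "auto-approved", 0
    # Assumption: 1 stands in for "a non-zero code".
    exit_code = 1 if quality.get("blockOnFailure", False) else 0
    return "quality-failure", exit_code

print(evaluate_submission(0.91, CONFIG))
print(evaluate_submission(0.60, CONFIG))
```

With blockOnFailure set to true, the second call would return a non-zero exit code, which is the behavior a CI job would key on.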

Update with s2s config edit.

By default, s2s prints [s2s] prefix lines before and after stage output. To suppress them, set the following in runtime.json:

{ "verbose": false }

Alternatively, run s2s stage <stage> --no-verbose. Prefix lines are informational and do not affect the context package content.