Workflow + Superpowers Integration: Applying AI Agent Quality Harness

Chloe Kim

Backend engineer at Quandri

TL;DR: We integrated the Superpowers plugin's quality enforcement patterns and the Harness framework's "lessons become rules, rules become automation" feedback loop into Quandri's workflow plugin, so that the AI agent automatically goes through quality gates at every stage of the development lifecycle.

Background: Two Sources of Inspiration


1. Superpowers - Engineering Skills Library

Superpowers is a comprehensive skills library for Claude Code. It enforces proven engineering practices like TDD, systematic debugging, brainstorming, and code review so that the AI agent performs them automatically.

The core principle is "if a skill exists, it must be invoked." When the agent writes code, TDD auto-triggers. When it encounters a bug, systematic-debugging activates. Before declaring work complete, verification-before-completion demands evidence.

Key skills:

| Category | Skill | Role |
| --- | --- | --- |
| Testing | test-driven-development | Enforce RED-GREEN-REFACTOR cycle |
| Debugging | systematic-debugging | 4-phase root cause analysis |
| Collaboration | brainstorming | Socratic dialogue for design exploration |
| Collaboration | verification-before-completion | No completion claims without evidence |
| Collaboration | receiving-code-review | Technical evaluation of review feedback (no performative agreement) |
| Development | writing-plans | Bite-sized implementation plans (2-5 min per step) |


2. Harness - AI Agent Quality Enforcement Structure

Harness is a structure that wraps AI agents to prevent them from losing direction when working in a codebase. It pre-structures rules, verification mechanisms, and session memory so that agents face "cannot proceed if violated" level quality enforcement.

Two concepts from Harness stood out:

Verify Lane (Development area, 76% maturity)

The verify lane, hooks, and arch lint are built into the development loop itself, so verification runs at every commit during coding, not just at PR time. This is why the Development area has the highest maturity rating in the Harness framework.

Memory Feedback Loop (Memory area)

"Lessons become rules, rules become automation." A memory layer that lets the next session pick up where the previous one left off, even when sessions are interrupted.


Problem: Good Tools Operating in Isolation

Quandri's workflow plugin automates the Linear-GitHub-Notion development lifecycle with 7 skills:

/workflow:starting -> /workflow:branching -> implementation -> /workflow:submitting -> /workflow:reviewing -> /workflow:ending

Superpowers enforces engineering quality with 14 skills.

Both plugins are installed side by side, but they operate independently. Nothing guarantees that Superpowers' verification runs when workflow creates a PR, or that the receiving-code-review technical evaluation pattern is applied when review feedback is processed.


Solution: Inserting Superpowers Quality Gates into Workflow Skills

Approach

We insert superpowers skill invocation directives directly into each workflow SKILL.md file. Combined with superpowers' existing enforcement mechanism (using-superpowers mandates skill invocation at session start), this creates explicit and predictable quality gates.

Integration Mapping

| Workflow Skill | Added Superpowers | Effect |
| --- | --- | --- |
| /workflow:starting | brainstorming (conditional), writing-plans, verify lane | Complex issues get design exploration first; verification at every commit during implementation |
| /workflow:submitting | verification-before-completion, requesting-code-review | Test/build/lint enforced before PR; AI self-review after PR creation |
| /workflow:reviewing | receiving-code-review | Technical evaluation of review feedback (performative agreement forbidden) |
| /workflow:ending | verification-before-completion, memory feedback loop | Final verification before merge; lessons saved to memory |

Unchanged skills: branching (pure git operation), creating-issue (metadata collection), creating-doc (document creation)
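
The mapping above can also be read as a lookup table. The following is a minimal sketch in Python, assuming a hypothetical `QUALITY_GATES` dictionary and `gates_for` helper. Neither exists in the workflow or Superpowers plugins, where the gates are written as directives inside each SKILL.md file.

```python
# Hypothetical illustration of the integration mapping. The real plugin
# expresses these gates as directives inside each workflow SKILL.md file.
QUALITY_GATES: dict[str, list[str]] = {
    "starting": ["brainstorming (conditional)", "writing-plans", "verify lane"],
    "submitting": ["verification-before-completion", "requesting-code-review"],
    "reviewing": ["receiving-code-review"],
    "ending": ["verification-before-completion", "memory feedback loop"],
}

# Skills the article lists as unchanged.
UNCHANGED = {"branching", "creating-issue", "creating-doc"}


def gates_for(skill: str) -> list[str]:
    """Return the quality gates attached to a workflow skill (empty if unchanged)."""
    name = skill.removeprefix("/workflow:")
    if name in UNCHANGED:
        return []
    if name not in QUALITY_GATES:
        raise KeyError(f"unknown workflow skill: {skill}")
    return QUALITY_GATES[name]
```

For example, `gates_for("/workflow:reviewing")` returns `["receiving-code-review"]`, while `gates_for("/workflow:branching")` returns an empty list.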


Detailed Changes

1. /workflow:starting - 3 Additions

Step 2b: Conditional Brainstorming

After issue analysis, complexity is evaluated. If design decisions are needed, changes span multiple modules, or the implementation approach is unclear, superpowers:brainstorming is invoked to explore the design.

Simple bug fixes or 1-2 file changes with obvious implementation skip brainstorming and proceed directly to planning.
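
The branching rule in Step 2b can be sketched as a small predicate. This is only an illustration: in the plugin the agent makes this judgment from the SKILL.md directive, and the `IssueAnalysis` fields below are hypothetical signals that the article does not define.

```python
from dataclasses import dataclass


@dataclass
class IssueAnalysis:
    # Hypothetical signals; how the agent actually measures them is not
    # specified in the article.
    needs_design_decision: bool
    modules_touched: int
    approach_is_clear: bool


def needs_brainstorming(issue: IssueAnalysis) -> bool:
    """True when any of the three triggers named in Step 2b applies."""
    return (
        issue.needs_design_decision
        or issue.modules_touched > 1
        or not issue.approach_is_clear
    )
```

A simple bug fix such as `IssueAnalysis(False, 1, True)` skips brainstorming, while a change spanning three modules does not.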

Step 3 Enhancement: writing-plans Pattern

Augments existing planning with superpowers' writing-plans pattern. Each step becomes a single 2-5 minute action, RED-GREEN-REFACTOR cycles are explicitly stated, and file paths with specific changes are documented.
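
To make the three properties of a plan step concrete, here is a rough sketch of a checker. The `PlanStep` type and `plan_problems` function are hypothetical, and tagging every step with a single TDD phase is a simplifying assumption; the writing-plans skill itself is a prompt pattern, not code.

```python
from dataclasses import dataclass, field

TDD_PHASES = ("RED", "GREEN", "REFACTOR")


@dataclass
class PlanStep:
    action: str
    estimated_minutes: float
    tdd_phase: str  # assumption: each step is tagged with one TDD phase
    file_paths: list[str] = field(default_factory=list)


def plan_problems(steps: list[PlanStep]) -> list[str]:
    """List every way the plan deviates from the pattern described above."""
    problems = []
    for number, step in enumerate(steps, start=1):
        if not 2 <= step.estimated_minutes <= 5:
            problems.append(f"step {number}: not a 2-5 minute action")
        if step.tdd_phase not in TDD_PHASES:
            problems.append(f"step {number}: TDD phase not stated")
        if not step.file_paths:
            problems.append(f"step {number}: no file paths documented")
    return problems
```

An empty result means every step is bite-sized, names its TDD phase, and points at concrete files.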

Step 5 Enhancement: Verify Lane (Adopted from Harness)

This is the key idea adopted from Harness. Instead of catching problems at PR time, tests and linter run before every commit during implementation. If verification fails, superpowers:systematic-debugging must find the root cause before proceeding.
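
A verify lane of this kind could look roughly like the sketch below. The function names are hypothetical, and `PASSING` and `FAILING` are stand-in commands so that the example runs anywhere; a real project would supply its own test and lint commands.

```python
import subprocess
import sys

# Stand-in commands so the sketch is runnable without project tooling.
# Replace them with the project's actual test and lint commands.
PASSING = [sys.executable, "-c", "raise SystemExit(0)"]
FAILING = [sys.executable, "-c", "raise SystemExit(1)"]


def run_verify_lane(checks: dict[str, list[str]]) -> list[str]:
    """Run each check command and return the names of the checks that failed."""
    failed = []
    for name, command in checks.items():
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(name)
    return failed


def may_commit(checks: dict[str, list[str]]) -> bool:
    """Allow the commit only when every check passes.

    On failure the article's rule is to find the root cause with
    superpowers:systematic-debugging before proceeding.
    """
    return not run_verify_lane(checks)
```

With `{"tests": PASSING, "lint": FAILING}` the lane reports `["lint"]` and the commit is held back.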



2. /workflow:submitting - 3 Additions

Step 3c: Pre-Submit Verification

Invokes superpowers:verification-before-completion before PR creation. Tests, linter, and build must all pass. If anything fails, push/PR creation is blocked. Trivial changes (typo fixes, config updates) get reduced verification: linter + build only.

This verification applies to both initial submission (before Step 4) and follow-up submission (before Step 5 push).
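
The full versus reduced verification rule can be sketched as follows. The names are hypothetical, and how a change is classified as trivial is left to the agent in the actual skill; here it is simply passed in as a flag.

```python
FULL_CHECKS = ("tests", "linter", "build")
REDUCED_CHECKS = ("linter", "build")  # trivial changes: typo fixes, config updates


def required_checks(is_trivial: bool) -> tuple[str, ...]:
    """Pick the check set for a change."""
    return REDUCED_CHECKS if is_trivial else FULL_CHECKS


def may_push(results: dict[str, bool], is_trivial: bool = False) -> bool:
    """Block push/PR creation unless every required check has passed.

    A check that is missing from `results` counts as not passed.
    """
    return all(results.get(check, False) for check in required_checks(is_trivial))
```

Treating a missing result as a failure keeps the gate closed when a check was never run, which matches the "no completion claims without evidence" stance of verification-before-completion.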


Step 4b: Post-Submit Self-Review

After initial PR creation, superpowers:requesting-code-review dispatches a code-reviewer subagent. This adds an AI self-review layer before human reviewers see the PR. Critical issues are fixed and pushed immediately.


3. /workflow:reviewing - 1 Addition

Step 3b: Review Feedback Evaluation

Before applying review feedback, the superpowers:receiving-code-review evaluation pattern is applied:

  1. READ - Read the complete feedback without reacting
  2. UNDERSTAND - Restate the requirement in your own words
  3. VERIFY - Check against actual codebase: is the reviewer's assumption correct?
  4. EVALUATE - Is this technically sound for THIS codebase?
  5. RESPOND - Technical acknowledgment or reasoned pushback

Responses like "You're absolutely right!" (performative agreement) are forbidden.
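
As a rough illustration, the five steps can be modeled as a record that must be filled in before a reply is produced. Everything here is hypothetical: the real skill is a prompt pattern, and the forbidden-phrase list contains only the one example quoted above.

```python
from dataclasses import dataclass

# Only the phrase quoted in the article; a real list would be longer.
FORBIDDEN_PHRASES = ("you're absolutely right",)


@dataclass
class FeedbackEvaluation:
    feedback: str               # 1. READ
    restated_requirement: str   # 2. UNDERSTAND
    assumption_holds: bool      # 3. VERIFY against the actual codebase
    sound_for_codebase: bool    # 4. EVALUATE for THIS codebase


def respond(evaluation: FeedbackEvaluation) -> str:
    """5. RESPOND with a technical acknowledgment or reasoned pushback."""
    if not evaluation.assumption_holds:
        return f"Pushback: the premise does not match the codebase ({evaluation.restated_requirement})"
    if not evaluation.sound_for_codebase:
        return f"Pushback: not a fit for this codebase ({evaluation.restated_requirement})"
    return f"Acknowledged, will change: {evaluation.restated_requirement}"


def is_performative(reply: str) -> bool:
    """Detect performative agreement, tolerating curly apostrophes."""
    normalized = reply.lower().replace("\u2019", "'")
    return any(phrase in normalized for phrase in FORBIDDEN_PHRASES)
```

The point of the structure is that a reply can only be derived from the verification and evaluation results, never from the feedback text alone.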


4. /workflow:ending - 2 Additions

Step 1b: Pre-Merge Verification

Invokes superpowers:verification-before-completion before merge. Pulls latest base branch, runs test suite, checks CI status, and confirms review approval.

Step 5: Memory Feedback Loop (Adopted from Harness)

This adopts Harness's "lessons become rules, rules become automation" feedback loop. After the merge completes, the agent reflects on the work cycle:

  • Were there recurring review comments indicating a pattern to remember?
  • Did unexpected build/test failures reveal a knowledge gap?
  • Were there workflow friction points worth noting for future sessions?

Lessons are saved to Claude's memory system so the same mistakes are not repeated in the next session.
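
One way to picture "lessons become rules" is a store that counts how often a lesson recurs and promotes it once it repeats. This sketch is an assumption throughout: the JSON file, the function names, and the promotion threshold are illustrative and do not describe how Claude's memory system works.

```python
import json
from pathlib import Path


def record_lesson(store: Path, lesson: str) -> int:
    """Save a lesson to a JSON file and return how often it has been seen."""
    lessons = json.loads(store.read_text()) if store.exists() else {}
    lessons[lesson] = lessons.get(lesson, 0) + 1
    store.write_text(json.dumps(lessons, indent=2))
    return lessons[lesson]


def rules(store: Path, threshold: int = 2) -> list[str]:
    """Lessons seen at least `threshold` times are promoted to rules.

    The threshold is an illustrative assumption, not part of Harness.
    """
    if not store.exists():
        return []
    lessons = json.loads(store.read_text())
    return sorted(text for text, count in lessons.items() if count >= threshold)
```

Because the store lives on disk, a later session can call `rules(store)` and start with what earlier sessions learned.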


End-to-End Flow

flowchart TD
   A["/workflow:starting INT-123"] --> B{Evaluate complexity}
   B -->|Complex| C["superpowers:brainstorming"]
   B -->|Simple| D["Enter plan mode"]
   C --> D
   D --> E["Plan with writing-plans pattern"]
   E --> F["/workflow:branching"]
   F --> G["Implementation + Verify Lane<br/>(test+lint before each commit)"]
   G --> H["/workflow:submitting"]
   H --> I["verification-before-completion<br/>(test/build/lint)"]
   I --> J["git push + Create PR"]
   J --> K["requesting-code-review<br/>(AI self-review)"]
   K --> L["Await reviewer feedback"]
   L --> M["/workflow:reviewing"]
   M --> N["receiving-code-review pattern<br/>(technical evaluation before applying)"]
   N --> O["/workflow:ending"]
   O --> P["verification-before-completion<br/>(final check)"]
   P --> Q["gh pr merge --squash"]
   Q --> R["Memory Feedback Loop<br/>(save lessons learned)"]



Why We Did It This Way


From Superpowers: Quality Enforcement Mechanism

The core value of Superpowers is mandatory enforcement: "if a skill exists, it must be invoked." Not optional, mandatory. By explicitly connecting this to each workflow step, quality gates fire automatically throughout the entire development lifecycle.


From Harness: In-Loop Verification and Learning

Two ideas adopted from Harness differentiate this integration:

Verify Lane - Instead of discovering problems at PR time in bulk, verify at every commit during development. This is why the Development area achieved 76% maturity, the highest in the Harness framework.

Memory Feedback Loop - Enables cross-session learning. The "lessons become rules, rules become automation" feedback loop creates a structure where workflow quality improves cumulatively over time.


What We Did Not Adopt

| Idea | Reason |
| --- | --- |
| Harness file-based task state management (todo/ -> done/) | Claude Code's TaskCreate/TaskUpdate is sufficient |
| Harness Runtime (Docker/simulators) | Outside plugin scope, infrastructure concern |
| Harness Evaluation (LLM-as-Judge) | Code-reviewer subagent is sufficient at this stage |
| Hook-based auto-triggering | SKILL.md internal directives are more explicit and easier to debug |

Chloe Kim

Chloe is a backend engineer at Quandri. She's interested in AI workflows and agent-native engineering, and how AI is changing the way software gets built.