
Quality Toolkit for OpenClaw

Skill Quality Assurance

Two skills that work together: challenge your assumptions before building, then hunt for bugs after. A quality assurance system for AI agent workflows.

View on GitHub

⚠️ Before installing or using this skill, review its full contents, including all scripts, to ensure they meet your security and quality standards. You take ultimate responsibility for any skill you choose to use. Skills are community-sourced and may be updated at any time.

Skill 1: Premise Check

A 7-step reasoning checklist that forces the agent to genuinely challenge its assumptions before committing to an approach.

  • State your core assumption
  • Design the best solution that doesn't rely on it
  • Compare honestly
  • Steel-man the opposite
  • Find the hidden cost
  • Ask "reactive or proactive?"
  • Explain it to a skeptic in 2 sentences
Download premise-check.skill

Skill 2: Stress Test

A 4-layer testing checklist covering immediate validation, adversarial testing, deployment verification, and type-specific checks.

  • Run with real data, in the real execution context
  • Bad input, missing dependencies, timing failures
  • Silent failure detection
  • Logging, monitoring, credential lifecycle
  • Type-specific checks: APIs, cron jobs, services, static sites, GitHub Actions
Download stress-test.skill
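
Silent failure detection is the layer that is easiest to get wrong. By default, curl exits 0 even when the server returns an HTTP error such as 401, so an expired token fails silently. Here is a minimal sketch of the guard (the endpoint and function name are hypothetical, not part of the toolkit's scripts):

```shell
# Sketch only: by default curl exits 0 even on HTTP 401/500, so auth
# failures are silent. --fail turns HTTP >= 400 into a nonzero exit.
fetch_or_fail() {
  url="$1"
  if ! curl --fail --silent --show-error --output /dev/null "$url"; then
    echo "fetch failed for $url" >&2
    return 1
  fi
}

# Usage with a hypothetical endpoint:
# fetch_or_fail "https://api.example.com/archive" || exit 1
```

The `--show-error` flag keeps the error message visible even though `--silent` suppresses the progress meter, so the failure is both detectable (exit code) and diagnosable (stderr).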

The Problem

AI agents are fast builders. They'll propose a solution and start implementing it before you've finished reading the proposal. That speed is great — until the first idea wasn't the best idea, or the implementation has silent bugs that won't surface until something breaks at 2 AM.

These two skills add friction in the right places: before you commit to an approach, and after you've built it.

How They Work Together

The toolkit follows a simple cycle:

  1. You describe what you want to build
  2. Premise Check runs — your agent challenges its own assumptions, designs the opposite approach, and compares honestly before presenting a recommendation
  3. You build it
  4. Stress Test runs — your agent systematically tries to break what it just built, checking for silent failures, missing error handling, timing issues, and deployment gaps

The skills are independent — you can use either one alone — but they're designed as a pair. Premise Check prevents building the wrong thing. Stress Test prevents shipping a broken thing.

Origin Story

These skills were born from a real conversation. We were building an event archive automation system and asked our agent to verify its proposal was the best approach by challenging itself five times. It generated five questions, answered them in one pass, and defended its original idea.

Then a less technical user asked one question — "why poll at all when you could just schedule it?" — and the agent immediately recognized this was a fundamentally better approach. The agent had challenged its alternatives but never its premise.

Premise Check exists because of that moment. Stress Test followed naturally: once we had the right approach, we needed a systematic way to find the bugs the agent would otherwise ship silently.

When to Use Each

Premise Check — Before Building

Stress Test — After Building

Example: Event Archive Scheduler

Here's how both skills were used on the system that inspired them:

Premise Check found:

Stress Test found 3 bugs:

  1. Cleanup logic deleted future scheduled triggers (not just past ones)
  2. curl exited 0 on authentication failure — silent failure on expired tokens
  3. GitHub token was hardcoded instead of reading from single source of truth
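
The first bug is easy to reproduce in any cleanup routine that matches all triggers instead of only expired ones. A hedged sketch of the fix, assuming triggers are stored as files whose contents are a Unix timestamp (the real system's storage format may differ):

```shell
# Sketch only: assumes each trigger is a file containing a Unix
# timestamp. Delete triggers in the past; leave future ones alone.
# The original bug deleted both.
cleanup_past_triggers() {
  dir="$1"
  now=$(date +%s)
  for f in "$dir"/*.trigger; do
    [ -e "$f" ] || continue   # no matches: glob stays literal, skip
    ts=$(cat "$f")
    if [ "$ts" -lt "$now" ]; then
      rm "$f"                 # past trigger: safe to remove
    fi
  done
}
```

The comparison against `now` is the whole fix: the buggy version removed every matching file regardless of timestamp, which is exactly the kind of defect the adversarial-testing layer is meant to surface.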