Quality Check Before Launch
Scenario
You've built your SKILL and need to verify that it works correctly before going live. This tutorial covers static validation with mp-skills validate and end-to-end evaluation with mp-skills eval, which together confirm that the AI can recognize and invoke your SKILL correctly.
Prerequisites
- Completed Write Your First AI SKILL or have existing SKILLs to test
- Project opens normally in WeChat DevTools
- Base library ≥ 3.16.1
Steps
Step 1: Static validation
npx mp-skills validate
Checks performed:
| Check | Description |
|---|---|
| Directory structure | SKILL.md, mcp.json, index.js exist in correct locations |
| mcp.json schema | API names, inputSchema, outputSchema format compliance |
| Component completeness | componentPath target has all 4 files (js/json/wxml/wxss) |
| API consistency | API names match between mcp.json and index.js |
Expected output:
[OK] Directory structure check passed
[OK] mcp.json schema check passed
[OK] Component completeness check passed
[OK] API consistency check passed
[OK] Project config check passed
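In a script, the expected output above can be checked mechanically rather than by eye. The following is a minimal sketch, not part of mp-skills: all_checks_ok is a hypothetical helper name, and it assumes that every successful check prints a line starting with [OK] and that any other non-empty line signals a problem. The failure format is not documented in this tutorial, so adjust the pattern to what your version prints.

```shell
# Hypothetical helper: reads validate output on stdin and succeeds only if
# there is at least one non-empty line and every such line starts with "[OK]".
# Assumption: failed checks are printed as lines that do not start with "[OK]".
all_checks_ok() {
  awk '
    NF == 0   { next }      # ignore blank lines
              { total++ }   # count every remaining line
    /^\[OK\]/ { ok++ }      # count lines reporting a passed check
    END { exit (total > 0 && ok + 0 == total ? 0 : 1) }
  '
}

# Usage (not run here):
#   npx mp-skills validate | all_checks_ok && echo "ready for eval"
```

Requiring at least one line guards against treating empty output, for example from a command that failed to start, as a pass.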
Step 2: Configure evaluation environment
mp-skills eval needs LLM credentials to generate test cases:
# Method 1: CloudBase token mode (recommended)
npx mp-skills login
npx mp-skills eval --provider cloudbase -c 3
# Method 2: Manual config
export WXA_SKILL_EVAL_LLM_BASE_URL=https://api.deepseek.com/v1
export WXA_SKILL_EVAL_LLM_API_KEY=sk-your-key
export WXA_SKILL_EVAL_LLM_MODEL=deepseek-chat
# Method 3: Interactive wizard (first run)
# Just run eval; it will prompt you to select an LLM provider
npx mp-skills eval -c 3
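For Method 2, a typo in a variable name is easy to miss. The sketch below reports which of the three variables from the manual config are unset or empty. missing_llm_env is a hypothetical helper name, and the sketch assumes manual config needs all three variables; the tutorial does not say whether any of them has a default.

```shell
# Hypothetical helper: prints the name of each Method 2 variable that is
# unset or empty, and returns non-zero if any of them is missing.
# Assumption: manual config requires all three variables.
missing_llm_env() {
  missing=0
  for name in WXA_SKILL_EVAL_LLM_BASE_URL \
              WXA_SKILL_EVAL_LLM_API_KEY \
              WXA_SKILL_EVAL_LLM_MODEL; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (not run here):
#   missing_llm_env || echo "set the variables listed above before running eval"
```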
Step 3: Run evaluation
# Generate 3 test cases for all SKILLs
npx mp-skills eval -c 3
# Evaluate a specific SKILL only
npx mp-skills eval -s order-skill -c 5
# Headless mode (for CI)
npx mp-skills eval --headless -c 3
Evaluation dimensions:
| Dimension | Description |
|---|---|
| Intent recognition | Does the AI select your SKILL when given its trigger phrases? |
| Parameter extraction | Does the AI extract the correct parameters from natural language? |
| API invocation | Does the atomic API return data correctly? |
| Card rendering | Is the result displayed through the atomic component? |
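For the headless CI mode shown above, validation and evaluation can be chained so that the slower eval step only runs once static checks pass. This is a sketch under an assumption the tutorial does not confirm: that validate and eval exit with a non-zero status on failure. run_quality_gate and the MP_SKILLS variable are hypothetical names introduced here.

```shell
# Hypothetical CI gate. MP_SKILLS is the command prefix; it defaults to the
# "npx mp-skills" invocation used throughout this tutorial.
# Assumption: validate and eval return a non-zero exit status on failure.
run_quality_gate() {
  cases="${1:-3}"
  cmd="${MP_SKILLS:-npx mp-skills}"
  # $cmd is deliberately unquoted so "npx mp-skills" splits into two words.
  $cmd validate || { echo "validate failed" >&2; return 1; }
  $cmd eval --headless -c "$cases" || { echo "eval failed" >&2; return 2; }
  echo "quality gate passed"
}

# Usage (not run here):
#   run_quality_gate 3
```

Distinct return codes (1 for validate, 2 for eval) let a CI job tell a structural problem apart from an evaluation failure.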
Step 4: Review evaluation results
=== Evaluation Report ===
SKILL: order-skill
Test cases: 3
Passed: 2
Failed: 1
Failed details:
- Case "search for Sichuan restaurant": Intent correct, but keyword parameter
extracted as "Sichuan restaurant" (expected: "Sichuan")
Suggestion: Clarify parameter extraction rules in mcp.json inputSchema
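When the report is captured to a file or a pipe, the counts can be pulled out with a few lines of awk. The sketch below assumes the report contains "Test cases: N", "Passed: N" and "Failed: N" lines exactly as in the sample above; report_summary is a hypothetical helper name, and summing across several SKILL sections is an assumption about multi-SKILL reports.

```shell
# Hypothetical helper: reads an evaluation report on stdin, prints
# "<passed>/<total>", and returns non-zero if any case failed.
# Assumption: the report uses the "Test cases:", "Passed:" and "Failed:"
# labels shown in the sample report.
report_summary() {
  awk -F': *' '
    $1 == "Test cases" { total  += $2 }
    $1 == "Passed"     { passed += $2 }
    $1 == "Failed"     { failed += $2 }   # "Failed details" does not match
    END {
      printf "%d/%d\n", passed, total
      exit (failed > 0 ? 1 : 0)
    }
  '
}

# Usage (not run here):
#   npx mp-skills eval -s order-skill -c 5 | report_summary
```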
Step 5: Fix and iterate
Common issues:
| Issue | Cause | Fix |
|---|---|---|
| AI didn't pick your SKILL | Vague description in SKILL.md or mcp.json | Clarify trigger scenarios |
| Wrong parameter extraction | inputSchema.description doesn't specify value source | Explain where to extract from user's message |
| API returns empty | Bug in code or backend unavailable | Check API code and cloud functions |
| Card rendering broken | Field name mismatch in structuredContent | Check component's on(Result) field names |
After fixing, re-run validation and evaluation:
npx mp-skills validate
npx mp-skills eval -c 3
Verification Checklist
- npx mp-skills validate passes all checks
- npx mp-skills eval -c 3 launches successfully
- No critical errors in evaluation report
- Fixed SKILL passes validation again
Common Issues
npx mp-skills eval reports LLM credential error
- Use CloudBase token mode: npx mp-skills login, then --provider cloudbase
- Or set the WXA_SKILL_EVAL_LLM_API_KEY environment variable manually
Evaluation is slow
- Each test case takes 1-3 minutes (calls LLM for test generation and verification)
- Start with -c 1 for quick validation, then increase
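To budget time before a run, the 1-3 minutes per test case quoted above can be turned into a rough range. This sketch assumes cases run one after another and that the per-case figure holds for your provider; estimate_eval_minutes is a hypothetical helper name.

```shell
# Hypothetical helper: prints the expected duration range for N test cases,
# using the 1-3 minutes per case figure from this section.
# Assumption: test cases run sequentially.
estimate_eval_minutes() {
  cases="$1"
  echo "$((cases * 1))-$((cases * 3)) minutes"
}

# Usage:
#   estimate_eval_minutes 3    # prints "3-9 minutes"
```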
validate reports errors you can't find
- Check mcp.json JSON syntax (commas, brackets)
- Check component directories have all 4 files (js/json/wxml/wxss)
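Both checks above can be run by hand when the validate message is hard to trace. The helpers below are a sketch with hypothetical names. missing_component_files assumes the four component files share one base path (for example a componentPath ending in index maps to index.js, index.json, index.wxml and index.wxss), and json_syntax_ok relies on Node.js, which is already needed to run npx.

```shell
# Hypothetical helper: given a component base path without extension,
# prints each of the four required files that does not exist.
# Returns non-zero if any file is missing.
missing_component_files() {
  base="$1"
  rc=0
  for ext in js json wxml wxss; do
    if [ ! -f "$base.$ext" ]; then
      echo "$base.$ext"
      rc=1
    fi
  done
  return "$rc"
}

# Hypothetical helper: succeeds if the given file parses as JSON.
# On a syntax error, Node.js prints the parser message and exits non-zero.
json_syntax_ok() {
  node -e 'JSON.parse(require("fs").readFileSync(process.argv[1], "utf8"))' "$1"
}

# Usage (paths are placeholders, not run here):
#   missing_component_files path/to/component/index
#   json_syntax_ok path/to/mcp.json
```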