Recipe 4: Wrapping Image Generation
Scenario
Allow the AI to generate images based on user descriptions. When a user says "draw a kitten" or "generate a sunset beach image," the AI outputs an image instead of replying with text.
Prerequisites
- Completed Recipe 0, so you have a runnable mini program project
- AI development mode is enabled for the project
- Familiarity with the SKILL directory structure
- Debug base library ≥ 3.16.1
Steps
Step 1: Scaffold the SKILL
npx mp-skills create image-gen
Expected output:
* Creating Skill: image-gen
ok miniprogram/skills/image-gen-skill/
ok Scaffold created
Directory structure:
miniprogram/skills/image-gen-skill/
├── SKILL.md Business description file
├── mcp.json Tool declaration
├── apis/
│ └── generateImage.js Text-to-image implementation
├── components/
│ └── image-result-card/ Image result card component
├── seed/ Preview mock data
└── index.js SKILL entry point
Step 2: Write SKILL.md
SKILL.md tells the AI engine when to invoke this SKILL. Edit miniprogram/skills/image-gen-skill/SKILL.md:
# Image Generation
Generates an image based on the user's description.
## Trigger Conditions
When the user wants to generate or create an image, for example:
- "Draw a kitten"
- "Generate a sunset beach image"
- "Paint a landscape"
- "Create a starry sky image"
## Tools
| Tool Name | Description |
|-----------|-------------|
| generateImage | Generate an image from text description, supports style selection |
Step 3: Configure mcp.json
mcp.json declares the tools the AI can invoke and their parameters. Edit miniprogram/skills/image-gen-skill/mcp.json:
{
"tools": [
{
"name": "generateImage",
"description": "Generate an image from a text description",
"input": {
"type": "object",
"properties": {
"prompt": {
"type": "string",
"description": "Image description prompt"
},
"style": {
"type": "string",
"description": "Image style",
"enum": ["Realistic", "Anime", "Ink Wash", "Oil Painting", "Watercolor", "Cyberpunk", "3D Render"],
"default": "Realistic"
}
},
"required": ["prompt"]
},
"output": {
"type": "object",
"properties": {
"imageUrl": {
"type": "string",
"description": "Generated image URL",
"format": "image"
},
"prompt": {
"type": "string",
"description": "Actual prompt used for generation"
}
}
}
}
]
}
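The output schema above marks imageUrl with "format": "image". As a rough illustration of how a render layer could use that annotation, the sketch below lists the output fields flagged as images. The helper name findImageFields is hypothetical and is not part of the AI engine; the engine's real logic is not documented in this recipe.

```javascript
// Hypothetical helper (not an engine API): returns the names of output
// fields whose schema carries "format": "image".
function findImageFields(outputSchema) {
  const properties = (outputSchema && outputSchema.properties) || {};
  return Object.keys(properties).filter(
    (name) => properties[name].format === 'image'
  );
}

// The "output" schema copied from mcp.json above.
const outputSchema = {
  type: 'object',
  properties: {
    imageUrl: { type: 'string', description: 'Generated image URL', format: 'image' },
    prompt: { type: 'string', description: 'Actual prompt used for generation' },
  },
};

console.log(findImageFields(outputSchema)); // [ 'imageUrl' ]
```

Fields without the annotation, such as prompt here, would fall through to plain text rendering under this assumption.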
Note the "format": "image" on the imageUrl field in output. This tells the AI engine that this field is an image URL, so the render layer will display it as an image instead of text. This is one of the key differences between an image generation SKILL and a text generation SKILL.
Step 4: Implement the Tool
Create miniprogram/skills/image-gen-skill/apis/generateImage.js:
const { cloud } = require('../_shared');
/**
* Generate an image from text description
* @param {Object} params
* @param {string} params.prompt - Image description prompt
* @param {string} [params.style] - Image style
* @returns {Promise<{imageUrl: string, prompt: string}>}
*/
async function generateImage({ prompt, style = 'Realistic' }) {
if (!prompt || typeof prompt !== 'string') {
throw new Error('prompt is required and must be a string');
}
const result = await cloud.callFunction({
name: 'image-gen',
data: { action: 'generateImage', prompt, style },
});
if (!result || !result.imageUrl) {
throw new Error('Image generation failed: no image URL returned');
}
// Fall back to the caller's prompt if the cloud function does not echo one back
return { imageUrl: result.imageUrl, prompt: result.prompt || prompt };
}
module.exports = { generateImage };
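The tool above validates prompt but passes style through unchecked. Since mcp.json restricts style to an enum, one option is to normalize the value before calling the cloud function. The following is a minimal sketch under that assumption; normalizeStyle is a hypothetical helper, not something the scaffold generates.

```javascript
// Style values copied from the "enum" in mcp.json.
const STYLES = ['Realistic', 'Anime', 'Ink Wash', 'Oil Painting', 'Watercolor', 'Cyberpunk', '3D Render'];
const DEFAULT_STYLE = 'Realistic';

// Hypothetical helper: returns the canonical enum value for a style,
// matching case-insensitively, or the default when the value is unknown.
function normalizeStyle(style) {
  if (typeof style !== 'string') return DEFAULT_STYLE;
  const wanted = style.trim().toLowerCase();
  const match = STYLES.find((s) => s.toLowerCase() === wanted);
  return match || DEFAULT_STYLE;
}

console.log(normalizeStyle('anime'));  // Anime
console.log(normalizeStyle('Sketch')); // Realistic
```

Falling back to the default keeps the conversation moving when the model sends an unexpected style. Throwing an error instead is equally reasonable if you would rather surface the mismatch.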
Deploying the image-gen cloud function is beyond the scope of this recipe. If you use the CloudBase Console's AI Gateway, you can directly integrate text-to-image models like Tencent HunYuan in your cloud function.
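Although deployment is out of scope, it may help to see the request and response shape that apis/generateImage.js relies on. The sketch below is only an illustration of that contract: createImageGenHandler and the generate callback are hypothetical names, and the stub returns a placeholder URL instead of calling a real model. It also assumes the shared cloud wrapper hands back the function's return value directly; if your wrapper passes through the raw wx.cloud.callFunction response, the payload would sit under a result property instead.

```javascript
// Hypothetical factory: `generate` stands in for whatever text-to-image
// call your cloud function makes. It is injected so the contract can be
// exercised without a real model.
function createImageGenHandler(generate) {
  return async function main(event) {
    const { action, prompt, style } = event || {};
    if (action !== 'generateImage') {
      throw new Error(`Unsupported action: ${action}`);
    }
    const imageUrl = await generate(prompt, style);
    // Return the fields that apis/generateImage.js reads.
    return { imageUrl, prompt };
  };
}

// Usage with a stub generator (placeholder URL, no model call).
const main = createImageGenHandler(async () => 'https://example.com/stub.png');

main({ action: 'generateImage', prompt: 'A cute orange cat', style: 'Anime' })
  .then((res) => console.log(res.imageUrl));
```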
Step 5: Build the Component
The component is responsible for displaying the image. Create miniprogram/skills/image-gen-skill/components/image-result-card/image-result-card.wxml:
<view class="image-result-card">
<image
class="result-image"
src="{{imageUrl}}"
mode="widthFix"
binderror="onImageError"
bindload="onImageLoad"
/>
<view class="prompt-text" wx:if="{{prompt}}">
{{prompt}}
</view>
<view class="loading-mask" wx:if="{{loading}}">
<text class="loading-text">Generating image…</text>
</view>
</view>
image-result-card.wxss:
.image-result-card {
position: relative;
width: 100%;
border-radius: 16rpx;
overflow: hidden;
background: #f5f5f5;
}
.result-image {
width: 100%;
display: block;
}
.prompt-text {
padding: 16rpx 20rpx;
font-size: 26rpx;
color: #666;
background: #fafafa;
}
.loading-mask {
position: absolute;
inset: 0;
display: flex;
align-items: center;
justify-content: center;
background: rgba(255, 255, 255, 0.8);
}
.loading-text {
font-size: 28rpx;
color: #333;
}
image-result-card.js:
Component({
properties: {
imageUrl: {
type: String,
value: '',
},
prompt: {
type: String,
value: '',
},
},
data: {
loading: true,
error: false,
},
methods: {
onImageLoad() {
this.setData({ loading: false });
},
onImageError() {
this.setData({ loading: false, error: true });
console.error('Image load failed:', this.properties.imageUrl);
},
},
});
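The component must handle loading and error states. Image generation and loading are asynchronous, so you must give the user clear feedback. This is a key difference from text SKILLs: text appears almost instantly, while an image needs a visible loading state. Note that the WXML above renders only the loading mask; the error flag is set in onImageError but is not yet displayed, so add a fallback view bound to it if you want users to see load failures.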
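To make the loading and error handling easier to reason about, the state transitions can be written as a small pure function. This is a sketch, not part of the recipe's component: nextCardState and the event names are hypothetical. In a real component the 'urlChanged' transition could be triggered from an observer on imageUrl, so the mask reappears when a new image replaces an old one.

```javascript
// Hypothetical pure helper describing the card's state transitions.
// Events: 'urlChanged' (new imageUrl set), 'load' (bindload), 'error' (binderror).
function nextCardState(state, event) {
  switch (event) {
    case 'urlChanged':
      return { loading: true, error: false };
    case 'load':
      return { loading: false, error: false };
    case 'error':
      return { loading: false, error: true };
    default:
      return state; // Unknown events leave the state untouched
  }
}

let state = { loading: true, error: false };
state = nextCardState(state, 'error');      // { loading: false, error: true }
state = nextCardState(state, 'urlChanged'); // { loading: true, error: false }
console.log(state);
```

Inside the component, each handler would pass the result to this.setData, which keeps the transition logic testable outside the developer tools.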
Step 6: Register the SKILL
Edit miniprogram/skills/image-gen-skill/index.js:
const { generateImage } = require('./apis/generateImage');
module.exports = {
name: 'image-gen',
description: 'Image Generation',
apis: [
{
name: 'generateImage',
handler: generateImage,
},
],
components: {
'image-result-card': './components/image-result-card/image-result-card',
},
};
Step 7: Register in app.json
mp-skills create has already added the SKILL to agent.skills in app.json. Verify that it contains the entry below. The // line only marks omitted entries; the real app.json must not contain comments.
{
"agent": {
"skills": [
// ... other SKILLs
{
"name": "image-gen",
"description": "Image Generation: generate images from text descriptions",
"path": "skills/image-gen-skill"
}
]
}
}
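If you prefer to check the registration from a script instead of by eye, a lookup like the following would do. This is a sketch only: findSkill is a hypothetical helper, and the config object is inlined here so the example runs on its own. In a project you would load the parsed contents of your app.json instead.

```javascript
// Hypothetical helper: returns the agent.skills entry with the given name,
// or null when it is missing.
function findSkill(appConfig, name) {
  const skills = (appConfig && appConfig.agent && appConfig.agent.skills) || [];
  return skills.find((skill) => skill.name === name) || null;
}

// Inlined stand-in for the parsed app.json.
const appConfig = {
  agent: {
    skills: [
      {
        name: 'image-gen',
        description: 'Image Generation: generate images from text descriptions',
        path: 'skills/image-gen-skill',
      },
    ],
  },
};

const entry = findSkill(appConfig, 'image-gen');
console.log(entry ? entry.path : 'image-gen is not registered'); // skills/image-gen-skill
```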
Verification
In the developer tools, switch to "Mini Program AI Compile" mode:
- The SKILL list should show image-gen
- Select the generateImage tool, enter the parameter {"prompt": "A cute orange cat basking in the sun"} and execute
- The result should include imageUrl (the image URL)
- The image card renders correctly, showing the loading state and the final image
- In dialogue mode, type "draw a kitten" and the AI should automatically call generateImage
Complete Code
apis/generateImage.js
const { cloud } = require('../_shared');
async function generateImage({ prompt, style = 'Realistic' }) {
if (!prompt || typeof prompt !== 'string') {
throw new Error('prompt is required and must be a string');
}
const result = await cloud.callFunction({
name: 'image-gen',
data: { action: 'generateImage', prompt, style },
});
if (!result || !result.imageUrl) {
throw new Error('Image generation failed: no image URL returned');
}
return { imageUrl: result.imageUrl, prompt: result.prompt || prompt };
}
module.exports = { generateImage };
components/image-result-card/image-result-card.wxml
<view class="image-result-card">
<image class="result-image" src="{{imageUrl}}" mode="widthFix" binderror="onImageError" bindload="onImageLoad" />
<view class="prompt-text" wx:if="{{prompt}}">{{prompt}}</view>
<view class="loading-mask" wx:if="{{loading}}">
<text class="loading-text">Generating image…</text>
</view>
</view>
components/image-result-card/image-result-card.wxss
.image-result-card {
position: relative; width: 100%; border-radius: 16rpx; overflow: hidden; background: #f5f5f5;
}
.result-image { width: 100%; display: block; }
.prompt-text { padding: 16rpx 20rpx; font-size: 26rpx; color: #666; background: #fafafa; }
.loading-mask {
position: absolute; inset: 0; display: flex; align-items: center;
justify-content: center; background: rgba(255,255,255,0.8);
}
.loading-text { font-size: 28rpx; color: #333; }
components/image-result-card/image-result-card.js
Component({
properties: { imageUrl: { type: String, value: '' }, prompt: { type: String, value: '' } },
data: { loading: true, error: false },
methods: {
onImageLoad() { this.setData({ loading: false }); },
onImageError() { this.setData({ loading: false, error: true }); },
},
});
index.js
const { generateImage } = require('./apis/generateImage');
module.exports = {
name: 'image-gen', description: 'Image Generation',
apis: [{ name: 'generateImage', handler: generateImage }],
components: { 'image-result-card': './components/image-result-card/image-result-card' },
};
Key Differences from Text Generation SKILLs
| Aspect | Text Generation | Image Generation |
|---|---|---|
| Output type | text string | image URL (format: image) |
| Core component | None or text display | Image card (with loading/error) |
| Loading wait | Near-instant | Loading state required |
| Async handling | Simple | Must handle image load failures |
| Style control | Prompt-based | Enum style parameter |
| Data size | A few KB | Image URL (remote load) |
FAQ
Image shows blank
- Check that imageUrl starts with https://
- Check the mini program's downloadFile domain whitelist
- Check that the URL returned by the cloud function is publicly accessible
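The first check in the list above can be automated before the URL ever reaches the component. The sketch below is a hypothetical helper (diagnoseImageUrl and its return labels are made up for illustration); it only inspects the URL string and cannot tell whether the domain is whitelisted or the resource is reachable.

```javascript
// Hypothetical helper: classifies an image URL by its scheme.
function diagnoseImageUrl(url) {
  if (typeof url !== 'string' || url.trim() === '') return 'empty';
  if (url.startsWith('https://')) return 'ok';
  if (url.startsWith('http://')) return 'insecure';
  return 'unsupported-scheme';
}

console.log(diagnoseImageUrl('https://example.com/cat.png')); // ok
console.log(diagnoseImageUrl('http://example.com/cat.png'));  // insecure
console.log(diagnoseImageUrl(''));                            // empty
```

Logging the label next to the URL in generateImage.js makes a blank card much quicker to trace back to its cause.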
Loading state never disappears
- Check the bindload event binding
- Check if the image URL is actually accessible
- Check the image request status in the developer tools Network panel
AI doesn't trigger image generation
- Check if common user expressions are included in SKILL.md's trigger conditions
- Test the tool directly in developer tools
- Check the agent.skills configuration in app.json