Recipe 4: Wrapping Image Generation

Scenario

Allow the AI to generate images based on user descriptions. When a user says "draw a kitten" or "generate a sunset beach image," the AI outputs an image instead of replying with text.

Prerequisites

  • Recipe 0 completed, so you have a runnable mini program project
  • AI development mode enabled for the project
  • Familiarity with the SKILL directory structure
  • Debug base library ≥ 3.16.1

Steps

Step 1: Scaffold the SKILL

npx mp-skills create image-gen

Expected output:

* Creating Skill: image-gen
ok miniprogram/skills/image-gen-skill/
ok Scaffold created

Directory structure:

miniprogram/skills/image-gen-skill/
├── SKILL.md                  Business description file
├── mcp.json                  Tool declaration
├── apis/
│   └── generateImage.js      Text-to-image implementation
├── components/
│   └── image-result-card/    Image result card component
├── seed/                     Preview mock data
└── index.js                  SKILL entry point

Step 2: Write SKILL.md

SKILL.md tells the AI engine when to invoke this SKILL. Edit miniprogram/skills/image-gen-skill/SKILL.md:

# Image Generation

Generates an image based on the user's description.

## Trigger Conditions

When the user wants to generate or create an image, for example:
- "Draw a kitten"
- "Generate a sunset beach image"
- "Paint a landscape"
- "Create a starry sky image"

## Tools

| Tool Name | Description |
|-----------|-------------|
| generateImage | Generate an image from text description, supports style selection |

Step 3: Configure mcp.json

mcp.json declares the tools the AI can invoke and their parameters. Edit miniprogram/skills/image-gen-skill/mcp.json:

{
  "tools": [
    {
      "name": "generateImage",
      "description": "Generate an image from a text description",
      "input": {
        "type": "object",
        "properties": {
          "prompt": {
            "type": "string",
            "description": "Image description prompt"
          },
          "style": {
            "type": "string",
            "description": "Image style",
            "enum": ["Realistic", "Anime", "Ink Wash", "Oil Painting", "Watercolor", "Cyberpunk", "3D Render"],
            "default": "Realistic"
          }
        },
        "required": ["prompt"]
      },
      "output": {
        "type": "object",
        "properties": {
          "imageUrl": {
            "type": "string",
            "description": "Generated image URL",
            "format": "image"
          },
          "prompt": {
            "type": "string",
            "description": "Actual prompt used for generation"
          }
        }
      }
    }
  ]
}

Note the "format": "image" on the imageUrl field in output. This tells the AI engine that this field is an image URL, so the render layer will display it as an image instead of text. This is one of the key differences between an image generation SKILL and a text generation SKILL.
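To make the role of "format" concrete, the sketch below walks a tool's output schema and collects the fields marked "format": "image". How the AI engine's render layer actually performs this lookup is not described in this recipe; findImageFields is a hypothetical helper written only for illustration.

```javascript
// Hypothetical helper, not part of the mp-skills runtime.
// Returns the names of output fields declared with "format": "image".
function findImageFields(outputSchema) {
  const properties = (outputSchema && outputSchema.properties) || {};
  return Object.keys(properties).filter(
    (key) => properties[key] && properties[key].format === 'image'
  );
}

// The output schema declared in mcp.json for generateImage.
const outputSchema = {
  type: 'object',
  properties: {
    imageUrl: { type: 'string', description: 'Generated image URL', format: 'image' },
    prompt: { type: 'string', description: 'Actual prompt used for generation' },
  },
};

console.log(findImageFields(outputSchema)); // [ 'imageUrl' ]
```

A field without the marker, such as prompt here, would be treated as ordinary text.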

Step 4: Implement the Tool

Create miniprogram/skills/image-gen-skill/apis/generateImage.js:

const { cloud } = require('../_shared');

/**
 * Generate an image from a text description.
 * @param {Object} params
 * @param {string} params.prompt - Image description prompt
 * @param {string} [params.style] - Image style
 * @returns {Promise<{imageUrl: string, prompt: string}>}
 */
async function generateImage({ prompt, style = 'Realistic' }) {
  if (!prompt || typeof prompt !== 'string') {
    throw new Error('prompt is required and must be a string');
  }

  // Assumes the shared cloud wrapper resolves to the cloud function's return
  // value. With raw wx.cloud.callFunction, the payload sits under `.result`.
  const result = await cloud.callFunction({
    name: 'image-gen',
    data: { action: 'generateImage', prompt, style },
  });

  if (!result || !result.imageUrl) {
    throw new Error('Image generation failed: no image URL returned');
  }

  // Fall back to the input prompt if the cloud function does not echo one.
  return { imageUrl: result.imageUrl, prompt: result.prompt || prompt };
}

module.exports = { generateImage };

Deploying the image-gen cloud function is beyond the scope of this recipe. If you use the CloudBase Console's AI Gateway, you can directly integrate text-to-image models like Tencent HunYuan in your cloud function.
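For orientation only, here is a minimal sketch of what the entry file of such a cloud function might look like. It assumes the event carries the same { action, prompt, style } payload that generateImage.js sends and that the function returns { imageUrl, prompt }. callTextToImageModel, buildPrompt, and createMain are hypothetical names, and the model call itself is deliberately left as a placeholder rather than guessed.

```javascript
// Sketch of a cloud function entry file. Names other than `main` are hypothetical.

// Styles declared in mcp.json; anything else falls back to the default.
const STYLES = ['Realistic', 'Anime', 'Ink Wash', 'Oil Painting', 'Watercolor', 'Cyberpunk', '3D Render'];
const DEFAULT_STYLE = 'Realistic';

// Placeholder for the real model call (for example via an AI gateway).
// It must resolve to a publicly reachable https:// image URL.
async function callTextToImageModel(fullPrompt) {
  throw new Error('callTextToImageModel is not implemented yet');
}

// Fold the style into the prompt text sent to the model.
function buildPrompt(prompt, style) {
  const safeStyle = STYLES.includes(style) ? style : DEFAULT_STYLE;
  return `${prompt}, ${safeStyle} style`;
}

// Build the handler around an injectable generator so it can be tested offline.
function createMain(generate) {
  return async function main(event) {
    const { action, prompt, style } = event || {};
    if (action !== 'generateImage') {
      throw new Error(`Unsupported action: ${action}`);
    }
    if (!prompt || typeof prompt !== 'string') {
      throw new Error('prompt is required and must be a string');
    }
    const fullPrompt = buildPrompt(prompt, style);
    const imageUrl = await generate(fullPrompt);
    return { imageUrl, prompt: fullPrompt };
  };
}

const main = createMain(callTextToImageModel);
// In the cloud function file, export it as the entry point: exports.main = main;
```

Injecting the generator keeps the request handling testable before any model is wired in: createMain(async () => 'https://example.com/cat.png') yields a handler that can be exercised locally.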

Step 5: Build the Component

The component is responsible for displaying the image. Create miniprogram/skills/image-gen-skill/components/image-result-card/image-result-card.wxml:

<view class="image-result-card">
  <image
    class="result-image"
    src="{{imageUrl}}"
    mode="widthFix"
    binderror="onImageError"
    bindload="onImageLoad"
  />
  <view class="prompt-text" wx:if="{{prompt}}">
    {{prompt}}
  </view>
  <view class="loading-mask" wx:if="{{loading}}">
    <text class="loading-text">Generating image…</text>
  </view>
  <!-- Shown when onImageError sets error to true -->
  <view class="error-text" wx:if="{{error}}">Image failed to load</view>
</view>

image-result-card.wxss:

.image-result-card {
  position: relative;
  width: 100%;
  border-radius: 16rpx;
  overflow: hidden;
  background: #f5f5f5;
}

.result-image {
  width: 100%;
  display: block;
}

.prompt-text {
  padding: 16rpx 20rpx;
  font-size: 26rpx;
  color: #666;
  background: #fafafa;
}

.loading-mask {
  position: absolute;
  inset: 0;
  display: flex;
  align-items: center;
  justify-content: center;
  background: rgba(255, 255, 255, 0.8);
}

.loading-text {
  font-size: 28rpx;
  color: #333;
}

/* Styles the message displayed when the error flag is set */
.error-text {
  padding: 16rpx 20rpx;
  font-size: 26rpx;
  color: #c00;
}

image-result-card.js:

Component({
  properties: {
    imageUrl: {
      type: String,
      value: '',
    },
    prompt: {
      type: String,
      value: '',
    },
  },

  data: {
    loading: true,
    error: false,
  },

  methods: {
    onImageLoad() {
      this.setData({ loading: false });
    },

    onImageError() {
      this.setData({ loading: false, error: true });
      console.error('Image load failed:', this.properties.imageUrl);
    },
  },
});

The component must handle both loading and error states. Generating and loading an image is asynchronous, so the user needs clear feedback while waiting. This is a key difference from text SKILLs: text appears almost instantly, whereas an image takes time to generate and download.
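The state transitions involved can be written down as a small pure function, which makes them easy to reason about separately from the component. This is only a sketch: nextCardState and its event names (urlChanged, loaded, failed) are made up for illustration and are not part of any mini program API.

```javascript
// Illustrative state reducer for the image card; event names are hypothetical.
function nextCardState(state, event) {
  switch (event.type) {
    case 'urlChanged':
      // A new URL restarts the cycle; an empty URL has nothing to load.
      return { loading: Boolean(event.imageUrl), error: false };
    case 'loaded':
      return { loading: false, error: false };
    case 'failed':
      return { loading: false, error: true };
    default:
      // Unknown events leave the state untouched.
      return state;
  }
}

let state = { loading: true, error: false };
state = nextCardState(state, { type: 'failed' });
console.log(state); // { loading: false, error: true }
state = nextCardState(state, { type: 'urlChanged', imageUrl: 'https://example.com/cat.png' });
console.log(state); // { loading: true, error: false }
```

In the component, onImageLoad corresponds to loaded and onImageError to failed. The urlChanged case could be wired through the Component observers option on imageUrl, so that a card reused for a second image shows the loading state again.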

Step 6: Register the SKILL

Edit miniprogram/skills/image-gen-skill/index.js:

const { generateImage } = require('./apis/generateImage');

module.exports = {
  name: 'image-gen',
  description: 'Image Generation',
  apis: [
    {
      name: 'generateImage',
      handler: generateImage,
    },
  ],
  components: {
    'image-result-card': './components/image-result-card/image-result-card',
  },
};

Step 7: Register in app.json

The mp-skills create command from Step 1 should already have added the SKILL to agent.skills in app.json. Verify that it contains this entry:

{
  "agent": {
    "skills": [
      // ... other SKILLs
      {
        "name": "image-gen",
        "description": "Image Generation: generate images from text descriptions",
        "path": "skills/image-gen-skill"
      }
    ]
  }
}
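If you prefer to check the entry with a script rather than by eye, a lookup along the following lines would do. It is a sketch under the assumption that the configuration has the agent.skills shape shown above; findSkillEntry is a hypothetical helper, and the inline appConfig object stands in for the parsed contents of app.json.

```javascript
// Hypothetical helper: look up a SKILL entry by name in a parsed app.json object.
function findSkillEntry(appConfig, skillName) {
  const skills = (appConfig && appConfig.agent && appConfig.agent.skills) || [];
  return skills.find((skill) => skill.name === skillName) || null;
}

// Stand-in for the parsed contents of app.json.
const appConfig = {
  agent: {
    skills: [
      {
        name: 'image-gen',
        description: 'Image Generation: generate images from text descriptions',
        path: 'skills/image-gen-skill',
      },
    ],
  },
};

const entry = findSkillEntry(appConfig, 'image-gen');
console.log(entry ? entry.path : 'image-gen is not registered'); // skills/image-gen-skill
```

To run it against the real file from Node, the object could be loaded with JSON.parse(fs.readFileSync(pathToAppJson, 'utf8')). Note that JSON.parse rejects // comments, so a placeholder line such as the one in the listing above must not appear in the actual file.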

Verification

In the developer tools, switch to "Mini Program AI Compile" mode:

  1. The SKILL list should show image-gen
  2. Select the generateImage tool, enter the parameter {"prompt": "A cute orange cat basking in the sun"} and execute
  3. The result should include imageUrl (the image URL)
  4. The image card should render correctly, showing the loading state first and then the final image
  5. In dialogue mode, type "draw a kitten" and the AI should automatically call generateImage

Complete Code

apis/generateImage.js

const { cloud } = require('../_shared');

async function generateImage({ prompt, style = 'Realistic' }) {
  if (!prompt || typeof prompt !== 'string') {
    throw new Error('prompt is required and must be a string');
  }

  // Assumes the shared cloud wrapper resolves to the cloud function's return
  // value. With raw wx.cloud.callFunction, the payload sits under `.result`.
  const result = await cloud.callFunction({
    name: 'image-gen',
    data: { action: 'generateImage', prompt, style },
  });

  if (!result || !result.imageUrl) {
    throw new Error('Image generation failed: no image URL returned');
  }

  // Fall back to the input prompt if the cloud function does not echo one.
  return { imageUrl: result.imageUrl, prompt: result.prompt || prompt };
}

module.exports = { generateImage };

components/image-result-card/image-result-card.wxml

<view class="image-result-card">
  <image class="result-image" src="{{imageUrl}}" mode="widthFix" binderror="onImageError" bindload="onImageLoad" />
  <view class="prompt-text" wx:if="{{prompt}}">{{prompt}}</view>
  <view class="loading-mask" wx:if="{{loading}}">
    <text class="loading-text">Generating image…</text>
  </view>
  <!-- Shown when onImageError sets error to true -->
  <view class="error-text" wx:if="{{error}}">Image failed to load</view>
</view>

components/image-result-card/image-result-card.wxss

.image-result-card {
  position: relative; width: 100%; border-radius: 16rpx; overflow: hidden; background: #f5f5f5;
}
.result-image { width: 100%; display: block; }
.prompt-text { padding: 16rpx 20rpx; font-size: 26rpx; color: #666; background: #fafafa; }
.loading-mask {
  position: absolute; inset: 0; display: flex; align-items: center;
  justify-content: center; background: rgba(255,255,255,0.8);
}
.loading-text { font-size: 28rpx; color: #333; }
/* Styles the message displayed when the error flag is set */
.error-text { padding: 16rpx 20rpx; font-size: 26rpx; color: #c00; }

components/image-result-card/image-result-card.js

Component({
  properties: { imageUrl: { type: String, value: '' }, prompt: { type: String, value: '' } },
  data: { loading: true, error: false },
  methods: {
    onImageLoad() { this.setData({ loading: false }); },
    onImageError() { this.setData({ loading: false, error: true }); },
  },
});

index.js

const { generateImage } = require('./apis/generateImage');
module.exports = {
  name: 'image-gen', description: 'Image Generation',
  apis: [{ name: 'generateImage', handler: generateImage }],
  components: { 'image-result-card': './components/image-result-card/image-result-card' },
};

Key Differences from Text Generation SKILLs

| Aspect | Text Generation | Image Generation |
|--------|-----------------|------------------|
| Output type | Text string | Image URL (format: image) |
| Core component | None or text display | Image card (with loading/error) |
| Loading wait | Near-instant | Loading state required |
| Async handling | Simple | Must handle image load failures |
| Style control | Prompt-based | Enum style parameter |
| Data size | A few KB | Image URL (remote load) |

FAQ

Image shows blank

  • Check that imageUrl starts with https://
  • Check that the image's domain is in the mini program's downloadFile domain whitelist
  • Check whether the URL returned by the cloud function is publicly accessible
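The first two checks can be automated while debugging. The sketch below is a hypothetical helper, diagnoseImageUrl, in which allowedHosts stands in for your downloadFile domain whitelist. It extracts the host by hand rather than relying on the URL global, which may not be available in every mini program runtime. It does not test whether the URL is publicly reachable, since that requires an actual network request.

```javascript
// Hypothetical diagnostic helper; allowedHosts stands in for the downloadFile whitelist.
// Returns a list of human-readable problems (empty when both checks pass).
function diagnoseImageUrl(imageUrl, allowedHosts) {
  const problems = [];
  if (typeof imageUrl !== 'string' || !imageUrl.startsWith('https://')) {
    problems.push('URL does not start with https://');
    return problems;
  }
  // Host = text between "https://" and the next "/", "?" or "#", minus any port.
  const host = imageUrl.slice('https://'.length).split(/[/?#]/)[0].split(':')[0];
  if (!allowedHosts.includes(host)) {
    problems.push(`Host ${host} is not in the downloadFile whitelist`);
  }
  return problems;
}

// Reports the missing https:// prefix.
console.log(diagnoseImageUrl('http://example.com/cat.png', ['example.com']));
// Reports that cdn.example.com is not whitelisted.
console.log(diagnoseImageUrl('https://cdn.example.com/cat.png', ['example.com']));
// Prints an empty list: both checks pass.
console.log(diagnoseImageUrl('https://example.com/cat.png', ['example.com']));
```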

Loading state never disappears

  • Check the bindload event binding
  • Check if the image URL is actually accessible
  • Check the image request status in the developer tools Network panel
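If neither bindload nor binderror ever fires, for example because the request hangs, the loading mask stays up indefinitely. One possible safeguard, sketched here as an assumption rather than something the recipe prescribes, is a timeout that flips the card into the error state. startLoadWatchdog is a hypothetical helper, and the timeout value is yours to choose.

```javascript
// Hypothetical helper: call onTimeout after timeoutMs unless cancel() runs first.
function startLoadWatchdog(onTimeout, timeoutMs) {
  let settled = false;
  const timer = setTimeout(() => {
    if (!settled) {
      settled = true;
      onTimeout();
    }
  }, timeoutMs);
  // The returned function stops the watchdog; calling it more than once is harmless.
  return function cancel() {
    settled = true;
    clearTimeout(timer);
  };
}

// Example: the image "loads" in time, so the timeout callback never runs.
const cancel = startLoadWatchdog(() => console.log('Image load timed out'), 15000);
cancel();
```

In the component, the watchdog could be started when imageUrl is set, with onTimeout calling this.setData({ loading: false, error: true }), and cancelled from both onImageLoad and onImageError.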

AI doesn't trigger image generation

  • Check that the trigger conditions in SKILL.md cover the phrasings users commonly use
  • Test the tool directly in developer tools
  • Check the agent.skills configuration in app.json
