One doc tagged with "multimodal"

Multimodal Image Understanding with DeepSeek V4-Pro in CloudBase AI

A Next.js Route Handler receives user-uploaded images, converts them to base64, and calls @cloudbase/node-sdk's app.ai().createModel('cloudbase').generateText with model: 'deepseek-v4-pro' and multimodal messages to obtain image descriptions, OCR results, and content analysis. The guide covers single-image, multi-image, and image-plus-text prompts.
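The base64 conversion and message assembly described above can be sketched roughly as follows. This is a minimal illustration, not the guide's own code: the helper names (toDataUrl, buildMessages), the UploadedImage type, and the OpenAI-style content-part layout (type: "text" / type: "image_url") are assumptions, and the exact message shape that generateText accepts should be checked against the CloudBase documentation.

```typescript
import { Buffer } from "node:buffer";

// ASSUMPTION: an OpenAI-style content-part layout. The shape CloudBase
// actually expects for multimodal messages may differ.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type UserMessage = { role: "user"; content: Array<TextPart | ImagePart> };

// Hypothetical container for one uploaded file's raw bytes and MIME type.
interface UploadedImage {
  bytes: Uint8Array;
  mimeType: string;
}

// Encode the raw image bytes as a base64 data URL.
function toDataUrl(image: UploadedImage): string {
  const base64 = Buffer.from(image.bytes).toString("base64");
  return `data:${image.mimeType};base64,${base64}`;
}

// Build a single user message: the text prompt first, then one image part
// per uploaded image. Passing one image gives the single-image case;
// passing several gives the multi-image case.
function buildMessages(prompt: string, images: UploadedImage[]): UserMessage[] {
  const parts: Array<TextPart | ImagePart> = [{ type: "text", text: prompt }];
  for (const image of images) {
    parts.push({ type: "image_url", image_url: { url: toDataUrl(image) } });
  }
  return [{ role: "user", content: parts }];
}
```

In a Route Handler, the bytes would typically come from await file.arrayBuffer() on a File read from request.formData(), wrapped in new Uint8Array(...). The array returned by buildMessages would then be supplied as the messages argument alongside model: 'deepseek-v4-pro', assuming the SDK accepts this layout.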