
Add CloudBase AI (DeepSeek / Hunyuan) to Next.js

In one sentence: In a Next.js App Router Route Handler, use @cloudbase/node-sdk to obtain the textStream (AsyncIterable) returned by app.ai().createModel().streamText(), wrap it in a native ReadableStream and pipe it to the frontend, where a Client Component consumes it with fetch + reader.read() — no OpenAI key required on either side, and no Vercel AI SDK needed.

Estimated time: 30 minutes | Difficulty: Intermediate

Applicable Scenarios

  • You want to add an AI chat, summarization, or search box to a Next.js web app, without buying an OpenAI key or self-hosting an LLM gateway
  • Domestic (China) business that requires all model calls to route through Tencent Cloud, with no direct frontend connection to overseas APIs
  • You have already used CloudBase AI in a Mini Program via add-ai-wechat-miniprogram and want to bring the same capability to the Web
  • You want the simplest possible streaming UI (text appearing character by character) without introducing the Vercel AI SDK or configuring SSE

Not Applicable

  • Overseas business where users are primarily abroad — use OpenAI / Anthropic directly or via the Vercel AI SDK instead; this recipe does not apply
  • You only need user login / session management, not AI — add-auth-web-with-cloudbase-sdk is sufficient
  • You want to use the Vercel AI SDK's useChat hook with built-in message state management — this recipe uses bare fetch. If you specifically need useChat, you will have to additionally wrap the stream in SSE data: protocol, which is not covered here
  • Next.js Pages Router (pages/api/*.ts) — this recipe is entirely based on App Router (app/*). Pages Router requires a different approach using res.write() streaming, which is not covered here

Prerequisites

Dependency               Version
Next.js                  14+ (App Router, stable Route Handler)
@cloudbase/node-sdk      3.16.0 or higher (required by the AI module)
Node.js                  18.17+ (required by Next.js 14)
Route Handler runtime    Must be nodejs; edge is not supported (the SDK depends on Node APIs that are unavailable in the Edge Runtime)
CloudBase environment    Provisioned, with "AI+" capability enabled in the Console

The server side must use @cloudbase/node-sdk. Do not reuse the Web-only @cloudbase/js-sdk + signInAnonymously() pattern: anonymous Web SDK calls are aggressively rate-limited by default (see Web SDK Security Policy) and are only suitable for demos. Production must go through a Node SDK + environment-level credential backend proxy.

Step 1: Enable AI Capability in the Console and Select a Model

This is identical to Step 1 in add-ai-wechat-miniprogram:

  1. Open the CloudBase Console → select your environment → AI+ → Quick Setup
  2. On first visit you will see an "Enable Now" button; clicking it automatically injects AI invocation permissions into the environment. Enabling is free; calls are billed per token
  3. Under "Model Management" you can see the list of models available in the current environment. CloudBase provides unified access to DeepSeek, MiniMax, Hunyuan, Kimi, GLM and other mainstream models via Token Resource Packages, with deepseek-v4-flash as the official recommended default (cost-effective, general-purpose). See the full list at Model Access.

All examples below use deepseek-v4-flash. To switch models, only the model: line in the code needs to change; everything else stays the same.

Step 2: Install the SDK and Configure Environment Variables

npm install @cloudbase/node-sdk
# or pnpm add / yarn add

.env.local:

CLOUDBASE_ENV=your-env-id
TENCENTCLOUD_SECRETID=your-secret-id
TENCENTCLOUD_SECRETKEY=your-secret-key

None of these have a NEXT_PUBLIC_ prefix — the SDK call runs inside the server-side Route Handler, and neither the Env ID nor the credentials should be exposed in the client bundle.

SECRETID/SECRETKEY are issued from Tencent Cloud Console → API Keys. For production, prefer a sub-account key with a CAM policy scoped to the current CloudBase environment. If your Next.js app is deployed to CloudBase Cloud Run / Cloud Functions, these two variables are auto-injected and can be omitted.
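
If your deployment platform cannot inject variables under exactly those names, tcb.init() also accepts the credentials explicitly. A minimal sketch using the same .env.local values as above:

import tcb from '@cloudbase/node-sdk';

// Explicit credentials; equivalent to letting node-sdk read
// TENCENTCLOUD_SECRETID / TENCENTCLOUD_SECRETKEY on its own
const app = tcb.init({
  env: process.env.CLOUDBASE_ENV!,
  secretId: process.env.TENCENTCLOUD_SECRETID,
  secretKey: process.env.TENCENTCLOUD_SECRETKEY,
});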

If your team works in CodeBuddy / Cursor / VS Code with CloudBase MCP installed, the IDE can drive everything in this recipe through the built-in coding plan skill (decompose requirements → generate code → deploy). This recipe documents the equivalent hand-written flow for reference.

Step 3: Write the Route Handler — Convert AsyncIterable to ReadableStream

Create app/api/chat/route.ts:

import tcb from '@cloudbase/node-sdk';

export const runtime = 'nodejs'; // Required: edge is not supported — the SDK depends on Node APIs

let app: ReturnType<typeof tcb.init> | null = null;

function getAi() {
  if (!app) {
    // node-sdk auto-reads credentials from TENCENTCLOUD_SECRETID/SECRETKEY — no need to pass them explicitly
    // timeout 60s: AI generation is slow; the default 15s will be exceeded by long streamText outputs
    app = tcb.init({ env: process.env.CLOUDBASE_ENV!, timeout: 60000 });
  }
  return app.ai();
}

export async function POST(req: Request) {
  const { messages } = await req.json();
  const ai = getAi();
  const model = ai.createModel('cloudbase');

  const result = await model.streamText({
    model: 'deepseek-v4-flash',
    messages,
  });

  // Convert AsyncIterable<string> to a native ReadableStream<Uint8Array>
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      try {
        for await (const chunk of result.textStream) {
          controller.enqueue(encoder.encode(chunk));
        }
        controller.close();
      } catch (err) {
        controller.error(err);
      }
    },
  });

  return new Response(stream, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  });
}

A few key points:

  • getAi() runs on every request (every POST invocation), but the module-level app variable caches the instance, so tcb.init() executes only once; subsequent requests reuse it
  • The server SDK carries an environment-level identity, so signInAnonymously() is not needed. Treat the Route Handler as the backend proxy layer: the Client Component talks only to /api/chat, and the credentials live on the server, never in the browser
  • result.textStream is an async iterator that yields only text increments; result.dataStream yields full chunk metadata — you can only consume one of them
  • Do not omit controller.error(err) — without it, the frontend reader will hang indefinitely
  • The Content-Type is text/plain rather than text/event-stream because the frontend in this recipe uses bare fetch, not SSE. If you want to switch to SSE for use with EventSource, change this to text/event-stream and wrap each chunk as data: xxx\n\n; see the sketch after this list
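
For reference, switching to SSE only changes the enqueue loop in Step 3. A minimal sketch, not used elsewhere in this recipe; the data: [DONE] sentinel is a common convention, not something CloudBase or EventSource requires:

// SSE-framed variant of the enqueue loop from Step 3.
// Pair it with 'Content-Type': 'text/event-stream' on the Response.
for await (const chunk of result.textStream) {
  // JSON.stringify guards against raw newlines inside a chunk,
  // which would otherwise break the data: framing
  controller.enqueue(encoder.encode(`data: ${JSON.stringify(chunk)}\n\n`));
}
controller.enqueue(encoder.encode('data: [DONE]\n\n'));
controller.close();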

Step 4: Client Component — Consume the Stream with fetch + getReader

Create app/chat/page.tsx:

'use client';

import { useState } from 'react';

type Message = { role: 'user' | 'assistant'; content: string };

export default function Chat() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState('');
  const [loading, setLoading] = useState(false);

  async function send() {
    const text = input.trim();
    if (!text || loading) return;

    const userMsg: Message = { role: 'user', content: text };
    const aiMsg: Message = { role: 'assistant', content: '' };
    const next = [...messages, userMsg, aiMsg];
    setMessages(next);
    setInput('');
    setLoading(true);

    try {
      const res = await fetch('/api/chat', {
        method: 'POST',
        body: JSON.stringify({ messages: [...messages, userMsg] }),
        headers: { 'Content-Type': 'application/json' },
      });

      if (!res.ok || !res.body) {
        throw new Error(`HTTP ${res.status}`);
      }

      const reader = res.body.getReader();
      const decoder = new TextDecoder();
      let acc = '';

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        // stream: true is required — without it, multi-byte characters split across chunk boundaries produce garbled output
        acc += decoder.decode(value, { stream: true });
        setMessages((prev) => {
          const copy = [...prev];
          copy[copy.length - 1] = { role: 'assistant', content: acc };
          return copy;
        });
      }
    } catch (err) {
      console.error('[chat] fetch failed', err);
      setMessages((prev) => {
        const copy = [...prev];
        copy[copy.length - 1] = {
          role: 'assistant',
          content: `[Error] ${err instanceof Error ? err.message : String(err)}`,
        };
        return copy;
      });
    } finally {
      setLoading(false);
    }
  }

  return (
    <div style={{ maxWidth: 720, margin: '40px auto', padding: 16 }}>
      <div style={{ minHeight: 400, marginBottom: 16 }}>
        {messages.map((m, i) => (
          <div
            key={i}
            style={{
              padding: 8,
              margin: '8px 0',
              background: m.role === 'user' ? '#eef' : '#f5f5f5',
              whiteSpace: 'pre-wrap',
            }}
          >
            <strong>{m.role}:</strong>
            {m.content}
          </div>
        ))}
      </div>
      <div style={{ display: 'flex', gap: 8 }}>
        <input
          style={{ flex: 1, padding: 8 }}
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask something"
          disabled={loading}
          onKeyDown={(e) => e.key === 'Enter' && send()}
        />
        <button onClick={send} disabled={loading || !input.trim()}>
          {loading ? 'Generating...' : 'Send'}
        </button>
      </div>
    </div>
  );
}

Implementation details worth noting:

  • decoder.decode(value, { stream: true }) — the stream: true option is essential. A Chinese character typically occupies 3 bytes, and a streaming chunk boundary may split a character in half; without stream: true, the partial bytes are decoded as the replacement character \uFFFD
  • Pre-push an assistant message with empty content, then replace the last entry on every chunk — the UI appears to grow in place rather than appearing blank and then popping in all at once
  • No throttling is needed here — the browser's setState + React re-render cycle is far faster than the Mini Program setData bridge, and updates arriving dozens of milliseconds apart are fine. If the model emits text extremely fast (e.g. deepseek-v4-flash and similar Flash-tier models can return hundreds of characters at once), you can add an 80 ms throttle using the same approach as Step 3 in add-ai-wechat-miniprogram; see the sketch after this list
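
If you do add that throttle, here is a minimal sketch; the 80 ms interval and the flush helper are illustrative, not taken from the Mini Program recipe:

// Inside send(): replace the direct setMessages call in the read loop
// with flush(acc), and call flush(acc, true) once after the loop so
// the final text always lands.
let lastFlush = 0;

function flush(content: string, force = false) {
  const now = Date.now();
  if (!force && now - lastFlush < 80) return; // drop intermediate updates
  lastFlush = now;
  setMessages((prev) => {
    const copy = [...prev];
    copy[copy.length - 1] = { role: 'assistant', content };
    return copy;
  });
}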

Step 5: Add a System Prompt and Multi-Turn Conversation

In the Route Handler, prepend a system message to the messages argument passed to streamText:

const result = await model.streamText({
  model: 'deepseek-v4-flash',
  messages: [
    {
      role: 'system',
      content:
        'You are a CloudBase product assistant. Keep answers concise; use TypeScript for code examples.',
    },
    ...messages,
  ],
});

Multi-turn conversation requires no special handling — the frontend messages state already preserves the full history, and the complete history is sent with every POST request. The model interprets the context using OpenAI-style role: user / assistant / system. If the conversation grows very long (dozens of turns) and you are concerned about token limits, you will need to implement truncation or summarization in the Route Handler, which is not covered here.

To add search augmentation (letting the AI search real-time web pages before answering), call connect-tavily-search-cloud-function first from the backend to retrieve search results, then splice the matched snippets into the system message or user message before calling streamText.
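
A sketch of that splice, assuming a hypothetical searchTavily helper that wraps the cloud function from that recipe and returns title/snippet pairs:

// searchTavily is a hypothetical wrapper around the Tavily cloud function
// from connect-tavily-search-cloud-function; adapt to your actual helper
const lastUserMessage = messages[messages.length - 1];
const results = await searchTavily(lastUserMessage.content);
const context = results
  .map((r: { title: string; snippet: string }) => `- ${r.title}: ${r.snippet}`)
  .join('\n');

const result = await model.streamText({
  model: 'deepseek-v4-flash',
  messages: [
    {
      role: 'system',
      content: `Use the search results below when they are relevant:\n${context}`,
    },
    ...messages,
  ],
});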

Verification

  1. Start the development server: npm run dev
  2. Open http://localhost:3000/chat in a browser
  3. Type "Describe CloudBase in one sentence" and click Send
  4. The UI should show the reply appearing incrementally, not a long pause followed by the full text at once
  5. Open browser DevTools → Network panel, find the /api/chat request; the Response tab should show plain text accumulating progressively (not a single JSON object)
  6. The server terminal should not show cloudbase.init is not a function / model not found / permission denied
  7. CloudBase Console → AI+ → Call Records should show the token count for that call

Common Errors

  • Symptom: After deploying to Vercel / CloudBase Run, cannot find module '@cloudbase/node-sdk' or XMLHttpRequest is not defined
    Cause: The Route Handler uses export const runtime = 'edge'; the Edge Runtime lacks the full Node.js API.
    Fix: Change to export const runtime = 'nodejs'; the SDK must run on the Node Runtime.

  • Symptom: The deployed environment errors with secretId or secretKey not found / getCredential failed
    Cause: Server-side credentials were not injected. Vercel and self-managed servers require explicit TENCENTCLOUD_SECRETID + TENCENTCLOUD_SECRETKEY (Cloud Run / Cloud Functions inject them automatically).
    Fix: Configure both variables in the deployment platform's environment settings; values come from Tencent Cloud Console → API Keys. For best security, use a sub-account key scoped via CAM to the current CloudBase environment.

  • Symptom: The streaming response aborts mid-stream with controller is closed, or the frontend reader hangs indefinitely
    Cause: An exception thrown by streamText inside the for await loop is not forwarded to controller.error(err), leaving the controller in an inconsistent state.
    Fix: The try/catch in the Route Handler must convert exceptions to controller.error(err) — do not throw (once the Response has been sent, a throw cannot reach the client).

  • Symptom: Garbled Chinese characters (���) in the frontend stream
    Cause: TextDecoder.decode(value) is called without { stream: true }, so multi-byte characters split across chunk boundaries are corrupted.
    Fix: Change to decoder.decode(value, { stream: true }); a final decoder.decode() with no arguments flushes any remaining bytes, though it can be omitted when the reader reads to completion and breaks.

  • Symptom: model not found / model xxx is not supported
    Cause: The model ID is misspelled, or the model is not available in your environment.
    Fix: Check "Model Management" in the Console for the exact name; do not copy OpenAI / Anthropic naming conventions. CloudBase's current recommended default is deepseek-v4-flash; see Model Access for the full list.

  • Symptom: timeout errors / the request hangs around 60s
    Cause: The Node SDK default timeout: 15000 is too short and is always exceeded in streamText long-output scenarios.
    Fix: Explicitly raise it to 60s+ with tcb.init({ env, timeout: 60000 }), as shown in the Route Handler above.

For the complete error code reference, see https://docs.cloudbase.net/error-code/.

Billing Notes

  • Newly provisioned environments receive 1 million free token credits for the first month (see the Console billing page for the current quota; this document does not lock in unit prices)
  • Billing is calculated separately for "input tokens + output tokens"; unit prices vary by model. Streaming is only a different transport mode — token billing is identical to non-streaming
  • The Route Handler is itself the backend proxy layer, so credentials never reach the browser, but /api/chat is still exposed to the public internet. For production, verify the caller in the POST handler: validate your own session cookie (see add-auth-web-with-cloudbase-sdk for real user identity), apply per-IP / per-UID rate limiting (a minimal rate-limit sketch follows this list), or combine with CloudBase Security Controls for domain whitelisting
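
As a starting point, here is a naive in-memory per-IP limiter for the POST handler. A sketch only: a module-level Map resets on every redeploy and is not shared across replicas, so use Redis or a database in real deployments:

// Naive fixed-window rate limit keyed by client IP
const hits = new Map<string, { count: number; reset: number }>();

function rateLimited(ip: string, limit = 20, windowMs = 60_000): boolean {
  const now = Date.now();
  const entry = hits.get(ip);
  if (!entry || now > entry.reset) {
    hits.set(ip, { count: 1, reset: now + windowMs });
    return false;
  }
  entry.count += 1;
  return entry.count > limit;
}

export async function POST(req: Request) {
  const ip = req.headers.get('x-forwarded-for') ?? 'unknown';
  if (rateLimited(ip)) {
    return new Response('Too Many Requests', { status: 429 });
  }
  // ... streamText flow from Step 3 continues here
}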

Next Steps

After getting basic conversation working, the most practical follow-on is splicing search results into the prompt — use connect-tavily-search-cloud-function to call Tavily from the Route Handler, retrieve real-time web summaries, and insert them into the system message or the last user message before calling streamText; this produces a search-augmented chatbot capable of answering questions about recent events. If you need the AI to answer questions about your own product documentation or knowledge base, use add-rag-with-pgvector-cloudbase to splice retrieved snippets into messages.