# Proxy Overseas LLM APIs (OpenAI / Anthropic) via Cloud Function
**In one sentence:** Write a thin, OpenAI-compatible reverse proxy in a CloudBase Web Cloud Function (HTTP trigger) so the frontend never sees the real API key, and SSE streaming responses are piped straight through to the browser.
**Estimated time:** 30 minutes | **Difficulty:** Advanced
## Applicable Scenarios
- You have a CloudBase Environment and want to add LLM capabilities to a frontend / WeChat Mini Program / CloudBase Run service, but the upstream server is overseas and the key cannot be exposed to the frontend.
- You want to switch the underlying model (OpenAI / Anthropic / DeepSeek / Tongyi / Hunyuan) without touching frontend code — the frontend talks to one internal endpoint only.
- You want to use the OpenAI protocol as a standard integration layer, making it easy to later connect Vercel AI SDK or similar frameworks.
Not applicable:
- Calling domestic-only models (Hunyuan, Tongyi): use their SDKs directly, no proxy needed.
- Doing prompt orchestration or retrieval logic: that belongs in the upper-layer chatbot. This recipe only handles "passthrough + authentication".
## Prerequisites
| Dependency | Version |
|---|---|
| Node.js (Cloud Function runtime) | ≥ 18 (built-in `fetch`; older versions need `node-fetch`) |
| `@cloudbase/cli` | latest |
| Cloud Function type | Web Cloud Function (HTTP trigger) — event-driven functions cannot do SSE streaming |
| Public network egress | Cloud Functions have no fixed public IP by default; enable "Network → Public Access" in the Console to call overseas APIs |
You will need:
- An API key for the upstream LLM (OpenAI, Anthropic, or DeepSeek). Providers that natively speak the OpenAI-compatible protocol, such as DeepSeek / Tongyi / Hunyuan, are the simplest to start with.
- A token for authenticating callers (the simplest option is a random 32-byte string, referred to below as `PROXY_ACCESS_TOKEN`).
## Step 1: Initialize the Web Cloud Function project
```bash
mkdir llm-proxy && cd llm-proxy
npm init -y
npm install --save express
```
Set `main` to `index.js` in `package.json` and ensure a `start` script exists:
```json
{
  "name": "llm-proxy",
  "main": "index.js",
  "scripts": {
    "start": "node index.js"
  },
  "dependencies": {
    "express": "^4.18.2"
  }
}
```
A Web Cloud Function is essentially an HTTP service listening on port 9000. The CloudBase platform runs `npm start` at startup. Express is used here because it gives the most direct control over response headers and incremental writes via `res.write`.
## Step 2: Write the proxy code (OpenAI-compatible protocol + SSE passthrough)
`index.js`:

```js
const express = require('express');
const app = express();
app.use(express.json({ limit: '10mb' }));

// All upstream config comes from environment variables — no keys in code
const UPSTREAM_BASE_URL = process.env.UPSTREAM_BASE_URL || 'https://api.openai.com/v1';
const UPSTREAM_API_KEY = process.env.UPSTREAM_API_KEY;
const PROXY_ACCESS_TOKEN = process.env.PROXY_ACCESS_TOKEN;

if (!UPSTREAM_API_KEY || !PROXY_ACCESS_TOKEN) {
  console.error('Missing UPSTREAM_API_KEY or PROXY_ACCESS_TOKEN env');
  process.exit(1);
}

// Simple auth: validate Authorization: Bearer <PROXY_ACCESS_TOKEN>
function requireAuth(req, res, next) {
  const auth = req.headers.authorization || '';
  const token = auth.startsWith('Bearer ') ? auth.slice(7) : '';
  if (token !== PROXY_ACCESS_TOKEN) {
    return res.status(401).json({ error: 'unauthorized' });
  }
  next();
}

// Proxy /v1/chat/completions to upstream
app.post('/v1/chat/completions', requireAuth, async (req, res) => {
  const isStream = req.body && req.body.stream === true;

  let upstream;
  try {
    upstream = await fetch(`${UPSTREAM_BASE_URL}/chat/completions`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${UPSTREAM_API_KEY}`,
      },
      body: JSON.stringify(req.body),
    });
  } catch (err) {
    console.error('upstream fetch failed', err);
    return res.status(502).json({ error: 'upstream_unreachable', message: err.message });
  }

  // Upstream returned non-2xx: pass through status code and body for easier debugging
  if (!upstream.ok) {
    const text = await upstream.text();
    return res
      .status(upstream.status)
      .type(upstream.headers.get('content-type') || 'application/json')
      .send(text);
  }

  if (isStream) {
    // SSE streaming: pipe upstream byte stream directly to the client
    res.setHeader('Content-Type', 'text/event-stream');
    res.setHeader('Cache-Control', 'no-cache');
    res.setHeader('Connection', 'keep-alive');
    res.flushHeaders?.();

    const reader = upstream.body.getReader();
    req.on('close', () => {
      reader.cancel().catch(() => {});
    });

    try {
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        res.write(value);
      }
    } catch (err) {
      console.error('stream pipe error', err);
    } finally {
      res.end();
    }
    return;
  }

  // Non-streaming: read the full response and return JSON
  const json = await upstream.json();
  res.json(json);
});

app.get('/health', (_req, res) => res.json({ ok: true }));

const PORT = process.env.PORT || 9000;
app.listen(PORT, () => {
  console.log(`llm-proxy listening on ${PORT}`);
});
```
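Because the function is just an Express server, you can smoke-test it locally before deploying (the environment variable values below are dev placeholders, not real secrets):

```bash
UPSTREAM_API_KEY=sk-placeholder PROXY_ACCESS_TOKEN=dev-token node index.js
# in another terminal:
curl -s http://localhost:9000/health
# → {"ok":true}
```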
A few common pitfalls:
- You must use a Web Cloud Function (HTTP trigger). Event-driven functions cannot maintain long-lived SSE connections.
- `res.flushHeaders()` exists in Express 4 but is optional; calling it once ensures the browser receives response headers early and starts parsing the SSE stream.
- `upstream.body` is a Web Stream (`ReadableStream`); use `getReader()` to pull from it. If you use axios instead of `fetch`, you'll need a different approach, since axios buffers the entire response by default. (A browser-side consumer is sketched after this list.)
- Listen for `req.on('close')` to detect when the client disconnects (e.g., the user closes the tab) and call `reader.cancel()`; otherwise the upstream connection hangs until it times out.
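As referenced above, a minimal browser-side consumer of the proxied SSE stream might look like the following sketch (`PROXY_URL` and the token are placeholders for the values from Step 3):

```js
// Placeholders: fill in the access URL and token from Step 3
const PROXY_URL = 'https://your-env.service.tcloudbase.com/llm-proxy';
const PROXY_ACCESS_TOKEN = 'YOUR_PROXY_ACCESS_TOKEN';

async function streamChat(messages, onDelta) {
  const resp = await fetch(`${PROXY_URL}/v1/chat/completions`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${PROXY_ACCESS_TOKEN}`,
    },
    body: JSON.stringify({ model: 'gpt-4o-mini', stream: true, messages }),
  });
  if (!resp.ok) throw new Error(`proxy error: ${resp.status}`);

  const reader = resp.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // Each complete SSE line is "data: <json chunk>"; keep the trailing partial line
    const lines = buffer.split('\n');
    buffer = lines.pop();
    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      const data = line.slice(6).trim();
      if (!data || data === '[DONE]') continue;
      const delta = JSON.parse(data).choices?.[0]?.delta?.content;
      if (delta) onDelta(delta);
    }
  }
}

// Usage: streamChat([{ role: 'user', content: 'Count to 5.' }], (t) => console.log(t));
```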
## Step 3: Deploy to CloudBase and configure environment variables
Deploy:
```bash
tcb login
tcb fn deploy llm-proxy --httpFn -e your-env-id
```
After deployment, go to the Console under "Cloud Functions → llm-proxy" and do three things:
- Add Environment Variables:
  - `UPSTREAM_BASE_URL`: for OpenAI use `https://api.openai.com/v1`; for DeepSeek use `https://api.deepseek.com/v1`; for Anthropic via the OpenAI-compatible protocol use `https://api.anthropic.com/v1` (note: Anthropic's native API is not fully OpenAI-compatible; in most cases use an OpenAI-compatible gateway or the `@anthropic-ai/sdk`).
  - `UPSTREAM_API_KEY`: your real key (never put this in `package.json` or code).
  - `PROXY_ACCESS_TOKEN`: generate a random 32-byte string, e.g. `node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"`.
- Network configuration: enable "Public Access" (Cloud Functions can reach the internet by default, but this varies by environment and region — check the actual option in the Console).
- Trigger type: confirm "HTTP Access Service" is enabled and note the access URL, e.g. `https://your-env.service.tcloudbase.com/llm-proxy`.
## Step 4: Local verification
Non-streaming request:
```bash
curl -X POST 'https://your-env.service.tcloudbase.com/llm-proxy/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_PROXY_ACCESS_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{ "role": "user", "content": "Describe CloudBase in one sentence." }]
  }'
```
Expected: a standard OpenAI-format JSON response containing `choices[0].message.content`.
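For reference, an abridged non-streaming response (field values are illustrative):

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "CloudBase is ..." },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 15, "completion_tokens": 12, "total_tokens": 27 }
}
```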
Streaming request:
```bash
curl -N -X POST 'https://your-env.service.tcloudbase.com/llm-proxy/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_PROXY_ACCESS_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4o-mini",
    "stream": true,
    "messages": [{ "role": "user", "content": "Count to 5." }]
  }'
```
Expected: a line like `data: {"id":...,"choices":[{"delta":{"content":"..."}}]}` printed every few tens of milliseconds, ending with `data: [DONE]`. The `-N` flag disables curl's buffering; without it you'll see all data appear at once, which is a curl display issue, not a function bug.
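Since the proxy speaks the OpenAI protocol end to end, off-the-shelf clients can point straight at it. A minimal sketch using the official `openai` npm package (v4+), with the deployment URL from Step 3 as `baseURL` and `PROXY_ACCESS_TOKEN` standing in for the API key:

```js
// Sketch: the OpenAI SDK treats the proxy as if it were api.openai.com.
// Install with: npm install openai
const OpenAI = require('openai');

const client = new OpenAI({
  baseURL: 'https://your-env.service.tcloudbase.com/llm-proxy/v1', // proxy URL from Step 3
  apiKey: process.env.PROXY_ACCESS_TOKEN, // the proxy token, not the real upstream key
});

async function main() {
  // Streaming: each chunk carries a small delta, same shape as the curl output above
  const stream = await client.chat.completions.create({
    model: 'gpt-4o-mini',
    stream: true,
    messages: [{ role: 'user', content: 'Count to 5.' }],
  });
  for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content || '');
  }
}

main();
```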
## Authentication Strategy Comparison
This recipe uses the simplest "shared token" approach, which covers low-traffic scenarios. Upgrade as needed for production:
| Approach | Best for | Downside |
|---|---|---|
| Shared token (this recipe) | Internal apps, allowlisted frontends | A leaked token requires a full rotation; hardcoding the token in frontend code is effectively no protection |
| CloudBase Authentication: frontend sends `access_token`, Cloud Function validates with `@cloudbase/node-sdk` (see the sketch after this table) | Apps with real user accounts | Frontend must be logged in first; Cloud Function must parse the ID token |
| API Gateway + signing | Public-facing APIs requiring per-caller metering | High complexity, outside the scope of this recipe |
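For the CloudBase Authentication row, the middleware shape stays the same as `requireAuth`; only the validation step changes. A sketch, where `verifyAccessToken` is a hypothetical placeholder, not a real `@cloudbase/node-sdk` API; consult the SDK docs for the actual verification call:

```js
// Sketch: per-user auth middleware. verifyAccessToken is a hypothetical
// helper; substitute the real token-verification call your identity layer
// (e.g. @cloudbase/node-sdk) provides.
async function requireUserAuth(req, res, next) {
  const auth = req.headers.authorization || '';
  const token = auth.startsWith('Bearer ') ? auth.slice(7) : '';
  try {
    // Hypothetical: resolves to the user's identity if the token is valid
    req.user = await verifyAccessToken(token);
    next();
  } catch (err) {
    res.status(401).json({ error: 'unauthorized' });
  }
}
```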
## Common Errors
| Error | Cause | Fix |
|---|---|---|
| `getaddrinfo EAI_AGAIN api.openai.com` | Cloud Function has no public network egress, or a region restriction applies | Enable Public Access in the Console, or attach an egress NAT in network settings |
| Streaming request returns no data for 10 seconds, then everything at once | Some intermediate layer is buffering (CDN, proxy, or `curl` without `-N`) | Use `-N` on the client side; call `res.flushHeaders()` in code; confirm the request body has `stream: true` |
| `401 unauthorized` | Frontend is missing the `Authorization: Bearer` header, or the token is wrong | Compare `PROXY_ACCESS_TOKEN` on both sides; note that after changing an environment variable in the Console, the function must be redeployed or its instance restarted |
| `502 upstream_unreachable` | Upstream API domain resolution failed / TLS handshake failed | Ensure `UPSTREAM_BASE_URL` has no trailing `/` and includes `/v1` (not just `https://api.openai.com`) |
| Stream cuts off after 30 seconds | Default function execution timeout is 30 seconds | Console → Function Config → Timeout, increase to 60–900 seconds as needed; for long conversations, consider splitting into multiple calls |
| Anthropic native API returns `unknown field "messages.role.user"` | Anthropic's native protocol is not the OpenAI protocol | Either switch to an Anthropic-compatible OpenAI gateway, or replace the proxy code with Anthropic SDK calls and change the path to `/v1/messages` (a sketch follows below) |
Errors with full stack traces are visible in the Cloud Function logs; the Console's "Logs" panel supports filtering by `requestId`.
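On the Anthropic row above: if you take the native-SDK route instead of an OpenAI-compatible gateway, the handler swaps the `fetch` passthrough for the Anthropic client. A minimal non-streaming sketch using `@anthropic-ai/sdk` (the model name and defaults are illustrative):

```js
// Sketch: native Anthropic handler replacing the OpenAI passthrough route.
// Install with: npm install @anthropic-ai/sdk
const Anthropic = require('@anthropic-ai/sdk');

const anthropic = new Anthropic({ apiKey: process.env.UPSTREAM_API_KEY });

app.post('/v1/messages', requireAuth, async (req, res) => {
  try {
    const message = await anthropic.messages.create({
      model: req.body.model || 'claude-3-5-sonnet-latest', // illustrative default
      max_tokens: req.body.max_tokens || 1024, // required by the Anthropic API
      messages: req.body.messages,
    });
    res.json(message);
  } catch (err) {
    res.status(502).json({ error: 'upstream_unreachable', message: err.message });
  }
});
```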
## Related Documentation
- SSE Protocol Support — SSE streaming principles for Web Cloud Functions
- HTTP Cloud Functions — Quick start for Web Cloud Functions (HTTP trigger)
- Function Environment Variables — Configuring and reading injected variables
- Function Timeout — Configuration for long-lived connections
- Calling Cloud Functions — Cloud Function invocation methods and HTTP access service overview
## Next Steps
Once this proxy is running, consider:
- `add-vercel-ai-sdk-streaming-chatbot`: build a frontend chatbot with the Vercel AI SDK, pointing `baseURL` at this proxy URL.
- `add-rag-with-pgvector-cloudbase`: add Retrieval-Augmented Generation on top of the proxy so the model answers from your own documents.
- `secure-secrets-in-cloud-function`: properly layer `UPSTREAM_API_KEY` and other sensitive values across local dev / CI / production to keep them out of git.